Type ExponentialMovingAverage
Namespace tensorflow.train
Parent PythonObjectContainer
Interfaces IExponentialMovingAverage
Maintains moving averages of variables by employing an exponential decay.
When training a model, it is often beneficial to maintain moving averages of the trained parameters. Evaluations that use averaged parameters sometimes produce significantly better results than the final trained values.
The `apply()` method adds shadow copies of trained variables and adds ops that maintain a moving average of the trained variables in their shadow copies. It is used when building the training model. The ops that maintain moving averages are typically run after each training step.
The `average()` and `average_name()` methods give access to the shadow variables and their names. They are useful when building an evaluation model, or when restoring a model from a checkpoint file. They help use the moving averages in place of the last trained values for evaluations.
The moving averages are computed using exponential decay. You specify the decay value when creating the `ExponentialMovingAverage` object. The shadow variables are initialized with the same initial values as the trained variables. When you run the ops to maintain the moving averages, each shadow variable is updated with the formula:
`shadow_variable -= (1 - decay) * (shadow_variable - variable)`
This is mathematically equivalent to the classic formula below, but the use of an `assign_sub` op (the `"-="` in the formula) allows concurrent lockless updates to the variables:
`shadow_variable = decay * shadow_variable + (1 - decay) * variable`
Reasonable values for `decay` are close to 1.0, typically in the multiple-nines range: 0.999, 0.9999, etc.
Example usage when creating a training model:
```
# Create variables.
var0 = tf.Variable(...)
var1 = tf.Variable(...)
# ... use the variables to build a training model...
...
# Create an op that applies the optimizer. This is what we usually
# would use as a training op.
opt_op = opt.minimize(my_loss, [var0, var1])

# Create an ExponentialMovingAverage object
ema = tf.train.ExponentialMovingAverage(decay=0.9999)

with tf.control_dependencies([opt_op]):
    # Create the shadow variables, and add ops to maintain moving averages
    # of var0 and var1. This also creates an op that will update the moving
    # averages after each training step. This is what we will use in place
    # of the usual training op.
    training_op = ema.apply([var0, var1])

# ... train the model by running training_op ...
```
There are two ways to use the moving averages for evaluations:
* Build a model that uses the shadow variables instead of the variables. For this, use the `average()` method, which returns the shadow variable for a given variable.
* Build a model normally but load the checkpoint files to evaluate by using the shadow variable names. For this, use the `average_name()` method. See `tf.compat.v1.train.Saver` for more information on restoring saved variables.
For an example of restoring the shadow variable values, see `average_name` below.
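The equivalence of the two update formulas can be checked numerically. The following is a minimal plain-Python sketch, not TensorFlow code; the helper names `update_assign_sub` and `update_classic` and the sample values are hypothetical and exist only for this illustration.

```python
import math


def update_assign_sub(shadow, variable, decay):
    """Update in the assign_sub form: shadow -= (1 - decay) * (shadow - variable)."""
    shadow -= (1.0 - decay) * (shadow - variable)
    return shadow


def update_classic(shadow, variable, decay):
    """Update in the classic form: shadow = decay * shadow + (1 - decay) * variable."""
    return decay * shadow + (1.0 - decay) * variable


decay = 0.999
values = [1.0, 2.0, 4.0, 8.0]  # hypothetical values of one variable after each step

# Both shadows start at the variable's initial value.
shadow_a = shadow_b = values[0]
for v in values[1:]:
    shadow_a = update_assign_sub(shadow_a, v, decay)
    shadow_b = update_classic(shadow_b, v, decay)

# The two forms agree up to floating-point rounding.
print(math.isclose(shadow_a, shadow_b, rel_tol=1e-12))
```

With a decay this close to 1.0 the shadow value moves only slightly per step, which is why the averaged parameters change smoothly over training.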
Methods
- apply
- apply_dyn
- average
- average_dyn
- average_name
- average_name_dyn
- variables_to_restore
- variables_to_restore_dyn
Properties
- name
- name_dyn
Public instance methods
object apply(IEnumerable<Variable> var_list)
Maintains moving averages of variables.
`var_list` must be a list of `Variable` or `Tensor` objects. This method creates shadow variables for all elements of `var_list`. Shadow variables for `Variable` objects are initialized to the variable's initial value. They will be added to the `GraphKeys.MOVING_AVERAGE_VARIABLES` collection. For `Tensor` objects, the shadow variables are initialized to 0 and zero debiased (see the docstring of `assign_moving_average` for more details).
Shadow variables are created with `trainable=False` and added to the `GraphKeys.ALL_VARIABLES` collection. They will be returned by calls to `tf.compat.v1.global_variables()`.
Returns an op that updates all shadow variables from the current value of their associated variables.
Note that `apply()` can be called multiple times. When eager execution is enabled, each call to `apply()` updates the variables once, so it needs to be called in a loop.
Parameters
-
IEnumerable<Variable>
var_list - A list of Variable or Tensor objects. The variables and Tensors must be of types bfloat16, float16, float32, or float64.
Returns
-
object
- An Operation that updates the moving averages.
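Zero debiasing compensates for a shadow value that starts at 0 instead of at the tracked value. The sketch below uses the common correction of dividing the biased average by `1 - decay**steps`; this is an assumption about the general technique, not a copy of what `assign_moving_average` does, and the function name `zero_debiased_average` is hypothetical.

```python
def zero_debiased_average(values, decay):
    """Return (biased, debiased) moving averages of a non-empty list of values."""
    biased = 0.0  # the shadow value starts at 0, as described for Tensor inputs
    for value in values:
        # Same update rule as for variables.
        biased -= (1.0 - decay) * (biased - value)
    steps = len(values)
    # After `steps` updates from 0, the biased average is scaled by
    # (1 - decay**steps); dividing by that factor removes the bias.
    debiased = biased / (1.0 - decay ** steps)
    return biased, debiased


biased, debiased = zero_debiased_average([5.0, 5.0, 5.0], decay=0.99)
# The biased average is still far below 5.0 after three steps, while the
# debiased one recovers 5.0 (up to rounding) because every input was 5.0.
print(biased, debiased)
```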
object apply_dyn(object var_list)
Maintains moving averages of variables.
`var_list` must be a list of `Variable` or `Tensor` objects. This method creates shadow variables for all elements of `var_list`. Shadow variables for `Variable` objects are initialized to the variable's initial value. They will be added to the `GraphKeys.MOVING_AVERAGE_VARIABLES` collection. For `Tensor` objects, the shadow variables are initialized to 0 and zero debiased (see the docstring of `assign_moving_average` for more details).
Shadow variables are created with `trainable=False` and added to the `GraphKeys.ALL_VARIABLES` collection. They will be returned by calls to `tf.compat.v1.global_variables()`.
Returns an op that updates all shadow variables from the current value of their associated variables.
Note that `apply()` can be called multiple times. When eager execution is enabled, each call to `apply()` updates the variables once, so it needs to be called in a loop.
Parameters
-
object
var_list - A list of Variable or Tensor objects. The variables and Tensors must be of types bfloat16, float16, float32, or float64.
Returns
-
object
- An Operation that updates the moving averages.
object average(Variable var)
Returns the `Variable` holding the average of `var`.
Parameters
-
Variable
var - A `Variable` object.
Returns
-
object
- A `Variable` object or `None` if the moving average of `var` is not maintained.
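The contract of `average()` (a shadow variable for tracked variables, `None` otherwise) can be illustrated without TensorFlow. `ToyEMA` below is a hypothetical stand-in that tracks plain floats by name; it is only a sketch of the behavior described here, not the library's implementation.

```python
class ToyEMA:
    """Hypothetical stand-in for ExponentialMovingAverage over named floats."""

    def __init__(self, decay):
        self.decay = decay
        self._shadows = {}  # variable name -> shadow value

    def apply(self, variables):
        """Create shadows on first sight, then move them toward the current values."""
        for name, value in variables.items():
            if name not in self._shadows:
                # A new shadow starts at the variable's value.
                self._shadows[name] = value
            else:
                shadow = self._shadows[name]
                self._shadows[name] = shadow - (1.0 - self.decay) * (shadow - value)

    def average(self, name):
        """Return the shadow value, or None if `name` is not maintained."""
        return self._shadows.get(name)


ema = ToyEMA(decay=0.9)
ema.apply({"w": 1.0})  # creates the shadow for "w"
ema.apply({"w": 2.0})  # one update: 1.0 - 0.1 * (1.0 - 2.0), roughly 1.1

print(ema.average("w"))
print(ema.average("b"))  # "b" was never passed to apply()
```

Checking the result against `None` before use is the natural guard when a variable may not have been passed to `apply()`.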
object average_dyn(object var)
Returns the `Variable` holding the average of `var`.
Parameters
-
object
var - A `Variable` object.
Returns
-
object
- A `Variable` object or `None` if the moving average of `var` is not maintained.
string average_name(Variable var)
Returns the name of the `Variable` holding the average for `var`.
The typical scenario for `ExponentialMovingAverage` is to compute moving averages of variables during training, and restore the variables from the computed moving averages during evaluations.
To restore variables, you have to know the name of the shadow variables. That name and the original variable can then be passed to a `Saver()` object to restore the variable from the moving average value with:
`saver = tf.compat.v1.train.Saver({ema.average_name(var): var})`
`average_name()` can be called whether or not `apply()` has been called.
Parameters
-
Variable
var - A `Variable` object.
Returns
-
string
- A string: The name of the variable that will be used or was used by the `ExponentialMovingAverage` class to hold the moving average of `var`.
object average_name_dyn(object var)
Returns the name of the `Variable` holding the average for `var`.
The typical scenario for `ExponentialMovingAverage` is to compute moving averages of variables during training, and restore the variables from the computed moving averages during evaluations.
To restore variables, you have to know the name of the shadow variables. That name and the original variable can then be passed to a `Saver()` object to restore the variable from the moving average value with:
`saver = tf.compat.v1.train.Saver({ema.average_name(var): var})`
`average_name()` can be called whether or not `apply()` has been called.
Parameters
-
object
var - A `Variable` object.
Returns
-
object
- A string: The name of the variable that will be used or was used by the `ExponentialMovingAverage` class to hold the moving average of `var`.
IDictionary<object, Variable> variables_to_restore(IEnumerable<Variable> moving_avg_variables)
Returns a map of names to `Variables` to restore.
If a variable has a moving average, use the moving average variable name as the restore name; otherwise, use the variable name. Below is an example of such a mapping:
```
conv/batchnorm/gamma/ExponentialMovingAverage: conv/batchnorm/gamma,
conv_4/conv2d_params/ExponentialMovingAverage: conv_4/conv2d_params,
global_step: global_step
```
Parameters
-
IEnumerable<Variable>
moving_avg_variables - A list of variables that must be restored using the moving average variable name. If None, it defaults to `variables.moving_average_variables() + variables.trainable_variables()`.
Returns
-
IDictionary<object, Variable>
- A map from restore_names to variables. The restore_name is either the original or the moving average version of the variable name, depending on whether the variable name is in the `moving_avg_variables`.
Example:
```
variables_to_restore = ema.variables_to_restore()
saver = tf.compat.v1.train.Saver(variables_to_restore)
```
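The naming rule behind this map can be sketched in plain Python. The function `restore_map` below is a hypothetical illustration that works on variable names rather than `Variable` objects, and it assumes the `/ExponentialMovingAverage` suffix seen in the example mapping above.

```python
def restore_map(all_names, moving_avg_names, suffix="ExponentialMovingAverage"):
    """Map each restore name to the name of the variable it is loaded into."""
    averaged = set(moving_avg_names)
    mapping = {}
    for name in all_names:
        # Averaged variables are restored from their shadow variable's name;
        # every other variable is restored from its own name.
        restore_name = f"{name}/{suffix}" if name in averaged else name
        mapping[restore_name] = name
    return mapping


mapping = restore_map(
    all_names=["conv/batchnorm/gamma", "conv_4/conv2d_params", "global_step"],
    moving_avg_names=["conv/batchnorm/gamma", "conv_4/conv2d_params"],
)
for restore_name, name in mapping.items():
    print(f"{restore_name}: {name}")
```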
object variables_to_restore_dyn(object moving_avg_variables)
Returns a map of names to `Variables` to restore.
If a variable has a moving average, use the moving average variable name as the restore name; otherwise, use the variable name. Below is an example of such a mapping:
```
conv/batchnorm/gamma/ExponentialMovingAverage: conv/batchnorm/gamma,
conv_4/conv2d_params/ExponentialMovingAverage: conv_4/conv2d_params,
global_step: global_step
```
Parameters
-
object
moving_avg_variables - A list of variables that must be restored using the moving average variable name. If None, it defaults to `variables.moving_average_variables() + variables.trainable_variables()`.
Returns
-
object
- A map from restore_names to variables. The restore_name is either the original or the moving average version of the variable name, depending on whether the variable name is in the `moving_avg_variables`.
Example:
```
variables_to_restore = ema.variables_to_restore()
saver = tf.compat.v1.train.Saver(variables_to_restore)
```
Public properties
string name get;
The name of this ExponentialMovingAverage object.
object name_dyn get;
The name of this ExponentialMovingAverage object.