Type Adadelta
Namespace tensorflow.keras.optimizers
Parent Optimizer
Interfaces IAdadelta
Optimizer that implements the Adadelta algorithm. Adadelta optimization is a stochastic gradient descent method that adapts the learning rate per dimension to address two drawbacks:
1) the continual decay of learning rates throughout training,
2) the need for a manually selected global learning rate.
Two accumulation steps are required:
1) the accumulation of squared gradients,
2) the accumulation of squared updates.
Initialization:
$$E[g^2]_0 := 0 \text{ (initialize the accumulator of squared gradients)}$$
$$E[\Delta x^2]_0 := 0 \text{ (initialize the accumulator of squared updates)}$$
Update rule at each step, where $RMS[y]_t := \sqrt{E[y^2]_t + \epsilon}$:
$$t := t + 1$$
$$E[g^2]_t := \rho * E[g^2]_{t-1} + (1 - \rho) * g_t^2$$
$$\Delta x_t := -RMS[\Delta x]_{t-1} * g_t / RMS[g]_t$$
$$E[\Delta x^2]_t := \rho * E[\Delta x^2]_{t-1} + (1 - \rho) * \Delta x_t^2$$
$$x_t := x_{t-1} + \Delta x_t$$
References
See [M. D. Zeiler, "ADADELTA: An Adaptive Learning Rate Method"](http://arxiv.org/abs/1212.5701)
([pdf](http://arxiv.org/pdf/1212.5701v1.pdf))
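The update rule above can be illustrated with a minimal, self-contained sketch in plain Python (this is an illustration of the math, not the library's implementation; `rho` and `epsilon` correspond to the constructor hyperparameters):

```python
import math

def adadelta_step(x, grad, accum_grad, accum_update, rho=0.95, epsilon=1e-7):
    """One Adadelta update for a scalar parameter x.

    accum_grad   tracks E[g^2];  accum_update tracks E[dx^2].
    Returns the new parameter value and the updated accumulators.
    """
    accum_grad = rho * accum_grad + (1 - rho) * grad * grad
    rms_update = math.sqrt(accum_update + epsilon)   # RMS[dx]_{t-1}
    rms_grad = math.sqrt(accum_grad + epsilon)       # RMS[g]_t
    delta = -rms_update * grad / rms_grad
    accum_update = rho * accum_update + (1 - rho) * delta * delta
    return x + delta, accum_grad, accum_update

# Minimize f(x) = x^2 from x = 1.0; the gradient of f is 2x.
x, eg, edx = 1.0, 0.0, 0.0
for _ in range(1000):
    x, eg, edx = adadelta_step(x, 2 * x, eg, edx)
```

Note how the step size is driven by the ratio of the two accumulators rather than by a fixed global learning rate, which is exactly the point of the algorithm.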
Methods
- add_slot (4 overloads)
- add_slot_dyn
- add_weight (20 overloads)
- add_weight_dyn
- apply_gradients (12 overloads)
- apply_gradients_dyn
- get_gradients (2 overloads)
- get_gradients_dyn
- get_updates
- get_updates_dyn
- minimize
- minimize_dyn
- NewDyn
Properties
Public instance methods
Variable add_slot(object var, string slot_name, constant_initializer initializer)
Add a new slot variable for `var`.
Variable add_slot(object var, string slot_name, CheckpointInitialValue initializer)
Add a new slot variable for `var`.
Variable add_slot(object var, string slot_name, object initializer)
Add a new slot variable for `var`.
Variable add_slot(object var, string slot_name, string initializer)
Add a new slot variable for `var`.
object add_slot_dyn(object var, object slot_name, ImplicitContainer<T> initializer)
Add a new slot variable for `var`.
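Slot variables hold per-variable optimizer state; for Adadelta these are the two accumulators. The following plain-Python analogue sketches the slot bookkeeping (the class, its method names, and the slot names shown are illustrative assumptions, not the binding's internals):

```python
class SlotStore:
    """Minimal analogue of optimizer slot bookkeeping: one named
    state value per (variable, slot_name) pair, zero-initialized."""
    def __init__(self):
        self._slots = {}

    def add_slot(self, var_id, slot_name, initializer=0.0):
        # Create the slot only on first request, so repeated calls
        # return the existing state rather than resetting it.
        key = (var_id, slot_name)
        if key not in self._slots:
            self._slots[key] = initializer
        return self._slots[key]

    def get_slot(self, var_id, slot_name):
        return self._slots[(var_id, slot_name)]

store = SlotStore()
# An Adadelta-style optimizer would keep two slots per variable:
# one for the squared-gradient accumulator, one for the squared-update accumulator.
store.add_slot("w", "accum_grad")
store.add_slot("w", "accum_var")
```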
object add_weight(string name, int shape, DType dtype, PythonFunctionContainer initializer, object trainable, VariableSynchronization synchronization, ImplicitContainer<T> aggregation)
object add_weight(string name, int shape, DType dtype, PythonFunctionContainer initializer, object trainable, Nullable<bool> synchronization, ImplicitContainer<T> aggregation)
object add_weight(string name, int shape, DType dtype, ImplicitContainer<T> initializer, object trainable, VariableSynchronization synchronization, ImplicitContainer<T> aggregation)
object add_weight(string name, int shape, DType dtype, ImplicitContainer<T> initializer, object trainable, Nullable<bool> synchronization, ImplicitContainer<T> aggregation)
object add_weight(string name, TensorShape shape, DType dtype, PythonFunctionContainer initializer, object trainable, VariableSynchronization synchronization, ImplicitContainer<T> aggregation)
object add_weight(string name, TensorShape shape, DType dtype, PythonFunctionContainer initializer, object trainable, Nullable<bool> synchronization, ImplicitContainer<T> aggregation)
object add_weight(string name, TensorShape shape, DType dtype, ImplicitContainer<T> initializer, object trainable, VariableSynchronization synchronization, ImplicitContainer<T> aggregation)
object add_weight(string name, TensorShape shape, DType dtype, ImplicitContainer<T> initializer, object trainable, Nullable<bool> synchronization, ImplicitContainer<T> aggregation)
object add_weight(string name, Dimension shape, DType dtype, PythonFunctionContainer initializer, object trainable, Nullable<bool> synchronization, ImplicitContainer<T> aggregation)
object add_weight(string name, Dimension shape, DType dtype, ImplicitContainer<T> initializer, object trainable, VariableSynchronization synchronization, ImplicitContainer<T> aggregation)
object add_weight(string name, Dimension shape, DType dtype, ImplicitContainer<T> initializer, object trainable, Nullable<bool> synchronization, ImplicitContainer<T> aggregation)
object add_weight(string name, Dimension shape, DType dtype, PythonFunctionContainer initializer, object trainable, VariableSynchronization synchronization, ImplicitContainer<T> aggregation)
object add_weight(string name, IEnumerable<object> shape, DType dtype, PythonFunctionContainer initializer, object trainable, Nullable<bool> synchronization, ImplicitContainer<T> aggregation)
object add_weight(string name, IEnumerable<object> shape, DType dtype, ImplicitContainer<T> initializer, object trainable, VariableSynchronization synchronization, ImplicitContainer<T> aggregation)
object add_weight(string name, ValueTuple shape, DType dtype, PythonFunctionContainer initializer, object trainable, VariableSynchronization synchronization, ImplicitContainer<T> aggregation)
object add_weight(string name, ValueTuple shape, DType dtype, PythonFunctionContainer initializer, object trainable, Nullable<bool> synchronization, ImplicitContainer<T> aggregation)
object add_weight(string name, IEnumerable<object> shape, DType dtype, PythonFunctionContainer initializer, object trainable, VariableSynchronization synchronization, ImplicitContainer<T> aggregation)
object add_weight(string name, IEnumerable<object> shape, DType dtype, ImplicitContainer<T> initializer, object trainable, Nullable<bool> synchronization, ImplicitContainer<T> aggregation)
object add_weight(string name, ValueTuple shape, DType dtype, ImplicitContainer<T> initializer, object trainable, Nullable<bool> synchronization, ImplicitContainer<T> aggregation)
object add_weight(string name, ValueTuple shape, DType dtype, ImplicitContainer<T> initializer, object trainable, VariableSynchronization synchronization, ImplicitContainer<T> aggregation)
object add_weight_dyn(object name, object shape, object dtype, ImplicitContainer<T> initializer, object trainable, ImplicitContainer<T> synchronization, ImplicitContainer<T> aggregation)
object apply_gradients(object grads_and_vars, int name)
Apply gradients to variables. This is the second part of `minimize()`. It returns an `Operation` that
applies gradients.
Parameters
-
object
grads_and_vars - List of (gradient, variable) pairs.
-
int
name - Optional name for the returned operation. Defaults to the name passed to the `Optimizer` constructor.
Returns
-
object
- An `Operation` that applies the specified gradients. The `iterations` will be automatically increased by 1.
object apply_gradients(IEnumerable<object> grads_and_vars, BaseResourceVariable name)
Apply gradients to variables. This is the second part of `minimize()`. It returns an `Operation` that
applies gradients.
Parameters
-
IEnumerable<object>
grads_and_vars - List of (gradient, variable) pairs.
-
BaseResourceVariable
name - Optional name for the returned operation. Defaults to the name passed to the `Optimizer` constructor.
Returns
-
object
- An `Operation` that applies the specified gradients. The `iterations` will be automatically increased by 1.
object apply_gradients(IEnumerable<object> grads_and_vars, IGraphNodeBase name)
Apply gradients to variables. This is the second part of `minimize()`. It returns an `Operation` that
applies gradients.
Parameters
-
IEnumerable<object>
grads_and_vars - List of (gradient, variable) pairs.
-
IGraphNodeBase
name - Optional name for the returned operation. Defaults to the name passed to the `Optimizer` constructor.
Returns
-
object
- An `Operation` that applies the specified gradients. The `iterations` will be automatically increased by 1.
object apply_gradients(object grads_and_vars, IGraphNodeBase name)
Apply gradients to variables. This is the second part of `minimize()`. It returns an `Operation` that
applies gradients.
Parameters
-
object
grads_and_vars - List of (gradient, variable) pairs.
-
IGraphNodeBase
name - Optional name for the returned operation. Defaults to the name passed to the `Optimizer` constructor.
Returns
-
object
- An `Operation` that applies the specified gradients. The `iterations` will be automatically increased by 1.
object apply_gradients(IEnumerable<object> grads_and_vars, string name)
Apply gradients to variables. This is the second part of `minimize()`. It returns an `Operation` that
applies gradients.
Parameters
-
IEnumerable<object>
grads_and_vars - List of (gradient, variable) pairs.
-
string
name - Optional name for the returned operation. Defaults to the name passed to the `Optimizer` constructor.
Returns
-
object
- An `Operation` that applies the specified gradients. The `iterations` will be automatically increased by 1.
object apply_gradients(object grads_and_vars, string name)
Apply gradients to variables. This is the second part of `minimize()`. It returns an `Operation` that
applies gradients.
Parameters
-
object
grads_and_vars - List of (gradient, variable) pairs.
-
string
name - Optional name for the returned operation. Defaults to the name passed to the `Optimizer` constructor.
Returns
-
object
- An `Operation` that applies the specified gradients. The `iterations` will be automatically increased by 1.
object apply_gradients(ValueTuple<IEnumerable<object>, object> grads_and_vars, string name)
Apply gradients to variables. This is the second part of `minimize()`. It returns an `Operation` that
applies gradients.
Parameters
-
ValueTuple<IEnumerable<object>, object>
grads_and_vars - List of (gradient, variable) pairs.
-
string
name - Optional name for the returned operation. Defaults to the name passed to the `Optimizer` constructor.
Returns
-
object
- An `Operation` that applies the specified gradients. The `iterations` will be automatically increased by 1.
object apply_gradients(ValueTuple<IEnumerable<object>, object> grads_and_vars, BaseResourceVariable name)
Apply gradients to variables. This is the second part of `minimize()`. It returns an `Operation` that
applies gradients.
Parameters
-
ValueTuple<IEnumerable<object>, object>
grads_and_vars - List of (gradient, variable) pairs.
-
BaseResourceVariable
name - Optional name for the returned operation. Defaults to the name passed to the `Optimizer` constructor.
Returns
-
object
- An `Operation` that applies the specified gradients. The `iterations` will be automatically increased by 1.
object apply_gradients(ValueTuple<IEnumerable<object>, object> grads_and_vars, int name)
Apply gradients to variables. This is the second part of `minimize()`. It returns an `Operation` that
applies gradients.
Parameters
-
ValueTuple<IEnumerable<object>, object>
grads_and_vars - List of (gradient, variable) pairs.
-
int
name - Optional name for the returned operation. Defaults to the name passed to the `Optimizer` constructor.
Returns
-
object
- An `Operation` that applies the specified gradients. The `iterations` will be automatically increased by 1.
object apply_gradients(ValueTuple<IEnumerable<object>, object> grads_and_vars, IGraphNodeBase name)
Apply gradients to variables. This is the second part of `minimize()`. It returns an `Operation` that
applies gradients.
Parameters
-
ValueTuple<IEnumerable<object>, object>
grads_and_vars - List of (gradient, variable) pairs.
-
IGraphNodeBase
name - Optional name for the returned operation. Defaults to the name passed to the `Optimizer` constructor.
Returns
-
object
- An `Operation` that applies the specified gradients. The `iterations` will be automatically increased by 1.
object apply_gradients(IEnumerable<object> grads_and_vars, int name)
Apply gradients to variables. This is the second part of `minimize()`. It returns an `Operation` that
applies gradients.
Parameters
-
IEnumerable<object>
grads_and_vars - List of (gradient, variable) pairs.
-
int
name - Optional name for the returned operation. Defaults to the name passed to the `Optimizer` constructor.
Returns
-
object
- An `Operation` that applies the specified gradients. The `iterations` will be automatically increased by 1.
object apply_gradients(object grads_and_vars, BaseResourceVariable name)
Apply gradients to variables. This is the second part of `minimize()`. It returns an `Operation` that
applies gradients.
Parameters
-
object
grads_and_vars - List of (gradient, variable) pairs.
-
BaseResourceVariable
name - Optional name for the returned operation. Defaults to the name passed to the `Optimizer` constructor.
Returns
-
object
- An `Operation` that applies the specified gradients. The `iterations` will be automatically increased by 1.
object apply_gradients_dyn(object grads_and_vars, object name)
Apply gradients to variables. This is the second part of `minimize()`. It returns an `Operation` that
applies gradients.
Parameters
-
object
grads_and_vars - List of (gradient, variable) pairs.
-
object
name - Optional name for the returned operation. Defaults to the name passed to the `Optimizer` constructor.
Returns
-
object
- An `Operation` that applies the specified gradients. The `iterations` will be automatically increased by 1.
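All of the overloads above share one contract: `grads_and_vars` is a sequence of (gradient, variable) pairs, and each call also bumps the optimizer's `iterations` counter. A plain-Python sketch of that contract (hypothetical and for illustration only; a fixed-rate step is used here so the pairing logic stays visible, rather than the Adadelta update):

```python
class TinyOptimizer:
    """Illustrates the apply_gradients contract: consume
    (gradient, variable-key) pairs, update state, bump iterations."""
    def __init__(self, lr=0.1):
        self.lr = lr
        self.iterations = 0
        self.variables = {}

    def apply_gradients(self, grads_and_vars):
        for grad, var in grads_and_vars:
            # Each pair maps one gradient onto one variable.
            self.variables[var] = self.variables.get(var, 0.0) - self.lr * grad
        self.iterations += 1  # incremented once per call, as documented

opt = TinyOptimizer()
opt.variables = {"w": 1.0, "b": 0.5}
opt.apply_gradients([(0.2, "w"), (-0.1, "b")])
```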
IList<object> get_gradients(object loss, IEnumerable<Variable> params)
Returns gradients of `loss` with respect to `params`.
Parameters
-
object
loss - Loss tensor.
-
IEnumerable<Variable>
params - List of variables.
Returns
-
IList<object>
- List of gradient tensors.
IList<object> get_gradients(double loss, IEnumerable<Variable> params)
Returns gradients of `loss` with respect to `params`.
Parameters
-
double
loss - Loss tensor.
-
IEnumerable<Variable>
params - List of variables.
Returns
-
IList<object>
- List of gradient tensors.
object get_gradients_dyn(object loss, object params)
Returns gradients of `loss` with respect to `params`.
Parameters
-
object
loss - Loss tensor.
-
object
params - List of variables.
Returns
-
object
- List of gradient tensors.
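`get_gradients` returns one gradient of `loss` per entry of `params`, in order. As a rough illustration of what that returned list contains, here is a central-difference approximation (a numerical stand-in, not the symbolic gradients the library actually computes):

```python
def numeric_gradients(loss_fn, params, h=1e-6):
    """Approximate d(loss)/d(p) for each entry of params with a
    central difference; mirrors get_gradients' output shape:
    one gradient per parameter, in the same order."""
    grads = []
    for i in range(len(params)):
        bumped_up = params[:i] + [params[i] + h] + params[i + 1:]
        bumped_dn = params[:i] + [params[i] - h] + params[i + 1:]
        grads.append((loss_fn(bumped_up) - loss_fn(bumped_dn)) / (2 * h))
    return grads

# loss = w^2 + 3b, so dloss/dw = 2w and dloss/db = 3.
grads = numeric_gradients(lambda p: p[0] ** 2 + 3 * p[1], [2.0, 1.0])
```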
IList<object> get_updates(Nullable<double> loss, IEnumerable<Variable> params)
object get_updates_dyn(object loss, object params)
object minimize(IGraphNodeBase loss, IEnumerable<object> var_list, IGraphNodeBase grad_loss, string name)
Minimize `loss` by updating `var_list`. This method simply computes the gradients using `tf.GradientTape` and calls `apply_gradients()`. If you want to process the gradients before applying them, call `tf.GradientTape` and `apply_gradients()` explicitly instead of using this function.
Parameters
-
IGraphNodeBase
loss - A callable taking no arguments which returns the value to minimize.
-
IEnumerable<object>
var_list - list or tuple of `Variable` objects to update to minimize `loss`, or a callable returning the list or tuple of `Variable` objects. Use callable when the variable list would otherwise be incomplete before `minimize` since the variables are created at the first time `loss` is called.
-
IGraphNodeBase
grad_loss - Optional. A `Tensor` holding the gradient computed for `loss`.
-
string
name - Optional name for the returned operation.
Returns
-
object
- An `Operation` that updates the variables in `var_list` and automatically increments the optimizer's `iterations` counter.
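As described above, `minimize` is the two previous pieces chained together: compute the gradients of the loss, then apply them to the variables. A self-contained sketch of that composition (a finite-difference gradient stands in for `tf.GradientTape`, and a plain fixed-rate step stands in for the Adadelta update, so only the control flow is faithful):

```python
def minimize(loss_fn, var_list, lr=0.1, h=1e-6):
    """One minimize step: numeric gradients + gradient application.
    Returns the updated variable list."""
    # Step 1: compute d(loss)/d(v) for every variable.
    grads = []
    for i in range(len(var_list)):
        up = var_list[:i] + [var_list[i] + h] + var_list[i + 1:]
        dn = var_list[:i] + [var_list[i] - h] + var_list[i + 1:]
        grads.append((loss_fn(up) - loss_fn(dn)) / (2 * h))
    # Step 2: "apply_gradients" — pair each gradient with its variable.
    return [v - lr * g for g, v in zip(grads, var_list)]

vars_ = [1.0]
for _ in range(50):
    vars_ = minimize(lambda p: p[0] ** 2, vars_)
```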
object minimize_dyn(object loss, object var_list, object grad_loss, object name)
Minimize `loss` by updating `var_list`. This method simply computes the gradients using `tf.GradientTape` and calls `apply_gradients()`. If you want to process the gradients before applying them, call `tf.GradientTape` and `apply_gradients()` explicitly instead of using this function.
Parameters
-
object
loss - A callable taking no arguments which returns the value to minimize.
-
object
var_list - list or tuple of `Variable` objects to update to minimize `loss`, or a callable returning the list or tuple of `Variable` objects. Use callable when the variable list would otherwise be incomplete before `minimize` since the variables are created at the first time `loss` is called.
-
object
grad_loss - Optional. A `Tensor` holding the gradient computed for `loss`.
-
object
name - Optional name for the returned operation.
Returns
-
object
- An `Operation` that updates the variables in `var_list` and automatically increments the optimizer's `iterations` counter.
Public static methods
Adadelta NewDyn(ImplicitContainer<T> learning_rate, ImplicitContainer<T> rho, ImplicitContainer<T> epsilon, ImplicitContainer<T> name, IDictionary<string, object> kwargs)
Construct a new Adadelta optimizer.
Parameters
-
ImplicitContainer<T>
learning_rate - A float hyperparameter >= 0. The learning rate.
-
ImplicitContainer<T>
rho - A float hyperparameter. The decay rate of the accumulators.
-
ImplicitContainer<T>
epsilon - A small float constant added to maintain numerical stability.
-
ImplicitContainer<T>
name - Optional name prefix for the operations created when applying gradients. Defaults to 'Adadelta'.
-
IDictionary<string, object>
kwargs - Keyword arguments. Allowed keys are {`clipnorm`, `clipvalue`, `lr`, `decay`}. `clipnorm` clips gradients by norm; `clipvalue` clips gradients by value; `decay` is included for backward compatibility to allow time inverse decay of the learning rate; `lr` is included for backward compatibility, and `learning_rate` is recommended instead.