Type AdagradDAOptimizer
Namespace tensorflow.train
Parent Optimizer
Interfaces IAdagradDAOptimizer
Adagrad Dual Averaging algorithm for sparse linear models. See this [paper](http://www.jmlr.org/papers/volume12/duchi11a/duchi11a.pdf).

This optimizer takes care of regularizing unseen features in a mini-batch by updating them, when they are next seen, with a closed-form update rule that is equivalent to having updated them on every mini-batch.

AdagradDA is typically used when a high degree of sparsity is needed in the trained model. The optimizer only guarantees sparsity for linear models. Be careful when using AdagradDA for deep networks, as it requires careful initialization of the gradient accumulators in order to train.
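As a rough sketch of that closed-form rule (paraphrased from the linked paper; the symbols below are illustrative and the exact kernel formula may differ slightly), each weight is recomputed directly from its accumulated gradient statistics instead of being updated incrementally:

```latex
% Per-coordinate AdagradDA update (sketch). \eta is learning_rate,
% \lambda_1 and \lambda_2 are the l1/l2 regularization strengths,
% and t is the global step.
\bar{g}_{t,i} = \sum_{s=1}^{t} g_{s,i}, \qquad
G_{t,i} = \sum_{s=1}^{t} g_{s,i}^{2}, \qquad
w_{t+1,i} = -\operatorname{sign}(\bar{g}_{t,i})\,
            \frac{\eta\,\bigl[\lvert\bar{g}_{t,i}\rvert - t\,\lambda_1\bigr]_{+}}
                 {\lambda_2\, t\, \eta + \sqrt{G_{t,i}}}
```

A feature that is absent from a batch leaves its accumulators unchanged, so evaluating this expression the next time the feature appears is equivalent to having applied the regularized update at every step; the thresholding driven by the l1 term is what produces exact zeros in the trained weights.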
Public static methods
AdagradDAOptimizer NewDyn(object learning_rate, object global_step, ImplicitContainer<T> initial_gradient_squared_accumulator_value, ImplicitContainer<T> l1_regularization_strength, ImplicitContainer<T> l2_regularization_strength, ImplicitContainer<T> use_locking, ImplicitContainer<T> name)
Construct a new AdagradDA optimizer.
Parameters
- learning_rate (object): A `Tensor` or a floating point value. The learning rate.
- global_step (object): A `Tensor` containing the current training step number.
- initial_gradient_squared_accumulator_value (ImplicitContainer<T>): A floating point value. Starting value for the accumulators; must be positive.
- l1_regularization_strength (ImplicitContainer<T>): A float value; must be greater than or equal to zero.
- l2_regularization_strength (ImplicitContainer<T>): A float value; must be greater than or equal to zero.
- use_locking (ImplicitContainer<T>): If `True`, use locks for update operations.
- name (ImplicitContainer<T>): Optional name prefix for the operations created when applying gradients. Defaults to "AdagradDA".
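A minimal usage sketch follows. Only the `NewDyn` signature above is taken from this page; the surrounding setup (`tf.Variable`, `minimize`) mirrors the Python `tf.train` API and is an assumption about this binding's surface, so it may need adjusting.

```csharp
using tensorflow;
using tensorflow.train;

// A non-trainable step counter; AdagradDA scales its regularization by this value.
// (tf.Variable usage here is assumed, not taken from this page.)
dynamic globalStep = tf.Variable(0, trainable: false, name: "global_step");

AdagradDAOptimizer optimizer = AdagradDAOptimizer.NewDyn(
    learning_rate: 0.01,
    global_step: globalStep,
    initial_gradient_squared_accumulator_value: 0.1,  // must be positive
    l1_regularization_strength: 0.01,                 // >= 0; a nonzero value drives sparsity
    l2_regularization_strength: 0.0,                  // >= 0
    use_locking: false,
    name: "AdagradDA");

// `loss` would be the scalar loss Tensor of a (typically linear) model defined elsewhere:
// dynamic trainOp = optimizer.minimize(loss, global_step: globalStep);
```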