Nadam - LostTech.TensorFlow Documentation

Type Nadam

Namespace tensorflow.keras.optimizers

Interfaces INadam

Optimizer that implements the NAdam algorithm.

Much like Adam is essentially RMSprop with momentum, Nadam is Adam with Nesterov momentum.

Initialization:

$$m_0 := 0 \quad \text{(Initialize 1st moment vector)}$$
$$v_0 := 0 \quad \text{(Initialize 2nd moment vector)}$$
$$\mu_0 := 1$$
$$t := 0 \quad \text{(Initialize timestep)}$$

Computes:

$$t := t + 1$$
$$\mu_t := \beta_1 \cdot (1 - 0.5 \cdot 0.96^{0.004 \cdot t})$$
$$g' := g / (1 - \prod_{i=1}^{t}{\mu_i})$$
$$m_t := \beta_1 \cdot m_{t-1} + (1 - \beta_1) \cdot g$$
$$m' := m_t / (1 - \prod_{i=1}^{t+1}{\mu_i})$$
$$v_t := \beta_2 \cdot v_{t-1} + (1 - \beta_2) \cdot g \cdot g$$
$$v' := v_t / (1 - \beta_2^t)$$
$$\bar{m} := (1 - \mu_t) \cdot g' + \mu_{t+1} \cdot m'$$
$$\theta_t := \theta_{t-1} - \text{lr} \cdot \bar{m} / (\sqrt{v'} + \epsilon)$$

The gradient is evaluated at theta(t) + momentum * v(t), and the variables always store theta + beta_1 * m / sqrt(v) instead of theta.
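The update rule above can be sketched for a single scalar parameter as follows. This is a minimal illustration in plain Python that follows the listed equations literally; it is not the library's implementation and does not use the LostTech.TensorFlow API. The names `init_state`, `momentum_schedule`, and `nadam_step` are hypothetical, and the default values for `lr`, `beta_1`, `beta_2`, and `epsilon` are assumptions chosen for illustration, not values taken from this page.

```python
import math


def init_state():
    # m_0 = 0, v_0 = 0, t = 0; mu_prod starts at 1 (the empty product, mu_0 := 1).
    return {"t": 0, "m": 0.0, "v": 0.0, "mu_prod": 1.0}


def momentum_schedule(t, beta_1):
    # mu_t := beta_1 * (1 - 0.5 * 0.96^(0.004 * t))
    return beta_1 * (1.0 - 0.5 * 0.96 ** (0.004 * t))


def nadam_step(theta, g, state, lr=0.002, beta_1=0.9, beta_2=0.999, epsilon=1e-7):
    """Apply one update to scalar `theta` given gradient `g` (defaults are assumed)."""
    state["t"] += 1
    t = state["t"]

    mu_t = momentum_schedule(t, beta_1)
    mu_next = momentum_schedule(t + 1, beta_1)

    # Running product of mu_1..mu_t, and the same product extended to mu_{t+1}.
    state["mu_prod"] *= mu_t
    mu_prod_next = state["mu_prod"] * mu_next

    g_prime = g / (1.0 - state["mu_prod"])

    state["m"] = beta_1 * state["m"] + (1.0 - beta_1) * g
    m_prime = state["m"] / (1.0 - mu_prod_next)

    state["v"] = beta_2 * state["v"] + (1.0 - beta_2) * g * g
    v_prime = state["v"] / (1.0 - beta_2 ** t)

    m_bar = (1.0 - mu_t) * g_prime + mu_next * m_prime
    return theta - lr * m_bar / (math.sqrt(v_prime) + epsilon)


# Usage: minimise f(x) = x^2, whose gradient is 2x.
state = init_state()
x = 1.0
for _ in range(500):
    x = nadam_step(x, 2.0 * x, state, lr=0.01)
print(x)
```

Keeping the running product of the momentum schedule in the state avoids recomputing it from scratch at every step. For tensors, the same arithmetic would apply element-wise.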

References

See [Dozat, T., 2015](http://cs229.stanford.edu/proj2015/054_report.pdf).

Properties

Public properties

object clipnorm get; set;

object clipvalue get; set;

double epsilon get; set;

object iterations get; set;

object iterations_dyn get; set;

object PythonObject get;

IList<object> weights get;

object weights_dyn get;