Type GradientTape
Namespace tensorflow
Parent PythonObjectContainer
Interfaces IGradientTape, IContextManager<T>
Record operations for automatic differentiation. Operations are recorded if they are executed within this context manager and
at least one of their inputs is being "watched". Trainable variables (created by `tf.Variable` or `tf.compat.v1.get_variable`,
where `trainable=True` is the default in both cases) are automatically watched.
Tensors can be manually watched by invoking the `watch` method on this context
manager. For example, consider the function `y = x * x`. The gradient at `x = 3.0` is
computed in the example that closes this overview.
GradientTapes can be nested to compute higher-order derivatives.
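A minimal sketch of nesting, assuming TensorFlow 2.x with eager execution. The inner tape yields the first derivative of `y = x * x`; because that gradient call runs while the outer tape is still recording, the outer tape can differentiate the result again. The names `outer`, `inner`, and `d2y_dx2` are illustrative only.

```python
import tensorflow as tf

x = tf.constant(3.0)
with tf.GradientTape() as outer:
    outer.watch(x)
    with tf.GradientTape() as inner:
        inner.watch(x)
        y = x * x
    # Computed inside the outer context, so the outer tape records it.
    dy_dx = inner.gradient(y, x)    # 2 * x = 6.0
d2y_dx2 = outer.gradient(dy_dx, x)  # d/dx (2 * x) = 2.0
```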
By default, the resources held by a GradientTape are released as soon as the
`GradientTape.gradient()` method is called. To compute multiple gradients over
the same computation, create a persistent gradient tape. This allows multiple
calls to the `gradient()` method, because resources are then released only when the tape object
is garbage collected.
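The following sketch, assuming TensorFlow 2.x, shows a persistent tape answering two gradient queries over the same recorded computation. The variable names are illustrative.

```python
import tensorflow as tf

x = tf.constant(3.0)
with tf.GradientTape(persistent=True) as g:
    g.watch(x)
    y = x * x
    z = y * y
dz_dx = g.gradient(z, x)  # 4 * x**3 = 108.0
dy_dx = g.gradient(y, x)  # 2 * x = 6.0
del g  # drop the reference so the tape's resources can be reclaimed
```

Deleting the tape explicitly is optional, but it makes the point at which the recorded operations become collectable easy to see.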
By default, GradientTape will automatically watch any trainable variables that
are accessed inside the context. If you want fine-grained control over which
variables are watched, you can disable automatic tracking by passing
`watch_accessed_variables=False` to the tape constructor.
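As an illustration, and assuming TensorFlow 2.x, the sketch below disables automatic tracking and watches only one of two variables. The names `variable_a` and `variable_b` are placeholders.

```python
import tensorflow as tf

variable_a = tf.Variable(2.0)
variable_b = tf.Variable(5.0)

with tf.GradientTape(watch_accessed_variables=False) as tape:
    tape.watch(variable_a)  # only variable_a is tracked
    y = variable_a * variable_a + variable_b

grad_a, grad_b = tape.gradient(y, [variable_a, variable_b])
# grad_a is 2 * variable_a = 4.0; grad_b is None because variable_b
# was never watched.
```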
Note that when using models with `watch_accessed_variables=False`, you should ensure that
your variables already exist before you ask the tape to watch them. Otherwise it is easy to
end up with a first iteration that has no gradients.
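The pitfall can be reproduced with a small stand-in for a lazily built layer. `LazyScale` below is a hypothetical class, not part of TensorFlow: it creates its weight on the first call, the way many model layers do. This is a sketch assuming TensorFlow 2.x.

```python
import tensorflow as tf

class LazyScale(tf.Module):
    """Hypothetical layer that creates its weight on first call."""

    def __init__(self):
        super().__init__()
        self.w = None

    def __call__(self, x):
        if self.w is None:
            self.w = tf.Variable(2.0)  # variable is created lazily
        return self.w * x

layer = LazyScale()
x = tf.constant(3.0)

# First pass: layer.variables is still empty, so nothing gets watched.
with tf.GradientTape(watch_accessed_variables=False) as tape:
    tape.watch(layer.variables)
    y = layer(x)
grads_first = tape.gradient(y, layer.variables)  # the only entry is None

# Second pass: the variable exists now, so watching it works.
with tf.GradientTape(watch_accessed_variables=False) as tape:
    tape.watch(layer.variables)
    y = layer(x)
grads_second = tape.gradient(y, layer.variables)  # dy/dw = x = 3.0
```

Building the model (or calling it once) before entering the tape avoids the empty first pass.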
Note that only tensors with real or complex dtypes are differentiable.
Example:
    x = tf.constant(3.0)
    with tf.GradientTape() as g:
        g.watch(x)
        y = x * x
    dy_dx = g.gradient(y, x)  # Will compute to 6.0
Methods
- batch_jacobian
- batch_jacobian_dyn
- gradient
- gradient_dyn
- jacobian
- jacobian_dyn
- reset
- reset_dyn
- stop_recording
- stop_recording_dyn
- watch
- watch_dyn
- watched_variables
- watched_variables_dyn
Public instance methods
Tensor batch_jacobian(IGraphNodeBase target, IGraphNodeBase source, ImplicitContainer<T> unconnected_gradients, Nullable<int> parallel_iterations, bool experimental_use_pfor)
Computes and stacks per-example Jacobians. See the [Wikipedia article](http://en.wikipedia.org/wiki/jacobian_matrix_and_determinant) for the
definition of a Jacobian. This function is essentially an efficient
implementation of the following: `tf.stack([self.jacobian(y[i], x[i]) for i in range(x.shape[0])])`. Whereas `GradientTape.jacobian` computes the gradient of
each output value w.r.t. each input value, this function is useful when
`target[i,...]` is independent of `source[j,...]` for `j != i`. This
assumption allows more efficient computation than
`GradientTape.jacobian`: the output, as well as intermediate activations,
are lower dimensional and avoid the redundant zeros that the full
Jacobian computation would produce under the independence assumption.
Parameters
-
IGraphNodeBase
target - A tensor with rank 2 or higher and with shape `[b, y_1, ..., y_n]`. `target[i,...]` should only depend on `source[i,...]`.
-
IGraphNodeBase
source - A tensor with rank 2 or higher and with shape `[b, x_1, ..., x_m]`.
-
ImplicitContainer<T>
unconnected_gradients - a value which can either hold 'none' or 'zero' and alters the value which will be returned if the target and sources are unconnected. The possible values and effects are detailed in 'UnconnectedGradients' and it defaults to 'none'.
-
Nullable<int>
parallel_iterations - A knob to control how many iterations are dispatched in parallel. This knob can be used to control the total memory usage.
-
bool
experimental_use_pfor - If true, uses pfor for computing the Jacobian. Else uses a tf.while_loop.
Returns
-
Tensor
- A tensor `t` with shape `[b, y_1, ..., y_n, x_1, ..., x_m]` where `t[i,...]` is the Jacobian of `target[i,...]` w.r.t. `source[i,...]`, i.e. stacked per-example Jacobians.
Example:
    with tf.GradientTape() as g:
        x = tf.constant([[1., 2.], [3., 4.]], dtype=tf.float32)
        g.watch(x)
        y = x * x
    batch_jacobian = g.batch_jacobian(y, x)
    # batch_jacobian is [[[2, 0], [0, 4]], [[6, 0], [0, 8]]]
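To make the "stack of per-example Jacobians" reading concrete, the sketch below compares `batch_jacobian` with an explicit loop that computes one Jacobian per batch row. The loop is an illustrative expansion written for this comparison, not how the library computes the result; TensorFlow 2.x is assumed.

```python
import tensorflow as tf

x = tf.constant([[1., 2.], [3., 4.]])

# Batched computation: one call over the whole batch.
with tf.GradientTape() as g:
    g.watch(x)
    y = x * x
batched = g.batch_jacobian(y, x)  # shape [2, 2, 2]

# Reference computation: one full Jacobian per example, then stack.
per_example = []
for i in range(x.shape[0]):
    xi = x[i]
    with tf.GradientTape() as t:
        t.watch(xi)
        yi = xi * xi
    per_example.append(t.jacobian(yi, xi))  # shape [2, 2]
stacked = tf.stack(per_example)
```

Both tensors hold the same values here because each row of `y` depends only on the matching row of `x`, which is exactly the independence assumption the method relies on.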
object batch_jacobian_dyn(object target, object source, ImplicitContainer<T> unconnected_gradients, object parallel_iterations, ImplicitContainer<T> experimental_use_pfor)
Computes and stacks per-example Jacobians. See the [Wikipedia article](http://en.wikipedia.org/wiki/jacobian_matrix_and_determinant) for the
definition of a Jacobian. This function is essentially an efficient
implementation of the following: `tf.stack([self.jacobian(y[i], x[i]) for i in range(x.shape[0])])`. Whereas `GradientTape.jacobian` computes the gradient of
each output value w.r.t. each input value, this function is useful when
`target[i,...]` is independent of `source[j,...]` for `j != i`. This
assumption allows more efficient computation than
`GradientTape.jacobian`: the output, as well as intermediate activations,
are lower dimensional and avoid the redundant zeros that the full
Jacobian computation would produce under the independence assumption.
Parameters
-
object
target - A tensor with rank 2 or higher and with shape `[b, y_1, ..., y_n]`. `target[i,...]` should only depend on `source[i,...]`.
-
object
source - A tensor with rank 2 or higher and with shape `[b, x_1, ..., x_m]`.
-
ImplicitContainer<T>
unconnected_gradients - a value which can either hold 'none' or 'zero' and alters the value which will be returned if the target and sources are unconnected. The possible values and effects are detailed in 'UnconnectedGradients' and it defaults to 'none'.
-
object
parallel_iterations - A knob to control how many iterations are dispatched in parallel. This knob can be used to control the total memory usage.
-
ImplicitContainer<T>
experimental_use_pfor - If true, uses pfor for computing the Jacobian. Else uses a tf.while_loop.
Returns
-
object
- A tensor `t` with shape `[b, y_1, ..., y_n, x_1, ..., x_m]` where `t[i,...]` is the Jacobian of `target[i,...]` w.r.t. `source[i,...]`, i.e. stacked per-example Jacobians.
Example:
    with tf.GradientTape() as g:
        x = tf.constant([[1., 2.], [3., 4.]], dtype=tf.float32)
        g.watch(x)
        y = x * x
    batch_jacobian = g.batch_jacobian(y, x)
    # batch_jacobian is [[[2, 0], [0, 4]], [[6, 0], [0, 8]]]
object gradient(object target, object sources, object output_gradients, ImplicitContainer<T> unconnected_gradients)
Computes the gradient using operations recorded in context of this tape.
Parameters
-
object
target - a list or nested structure of Tensors or Variables to be differentiated.
-
object
sources - a list or nested structure of Tensors or Variables. `target` will be differentiated against elements in `sources`.
-
object
output_gradients - a list of gradients, one for each element of target. Defaults to None.
-
ImplicitContainer<T>
unconnected_gradients - a value which can either hold 'none' or 'zero' and alters the value which will be returned if the target and sources are unconnected. The possible values and effects are detailed in 'UnconnectedGradients' and it defaults to 'none'.
Returns
-
object
- a list or nested structure of Tensors (or IndexedSlices, or None), one for each element in `sources`. Returned structure is the same as the structure of `sources`.
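A short sketch of how the parameters interact, written against the TensorFlow 2.x Python API that this binding wraps. It shows that the result mirrors the structure of `sources`, how `unconnected_gradients` changes the value returned for an unconnected source, and how `output_gradients` scales the result. The variable names are illustrative.

```python
import tensorflow as tf

x = tf.Variable(3.0)
unused = tf.Variable(1.0)  # never touched inside the tape

with tf.GradientTape(persistent=True) as tape:
    y = x * x

# The returned structure matches `sources`: a dict in, a dict out.
grads = tape.gradient(y, {"x": x, "unused": unused})

# Ask for zeros instead of None for sources unconnected to the target.
zeros = tape.gradient(
    y, [x, unused], unconnected_gradients=tf.UnconnectedGradients.ZERO
)

# Supply an upstream gradient of 2.0 for the single target.
scaled = tape.gradient(y, x, output_gradients=[tf.constant(2.0)])
```

Here `grads["x"]` is 6.0 and `grads["unused"]` is `None`, `zeros[1]` is 0.0, and `scaled` is 12.0. The tape is persistent only because `gradient` is called more than once.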
object gradient(object target, PythonFunctionContainer sources, object output_gradients, ImplicitContainer<T> unconnected_gradients)
Computes the gradient using operations recorded in context of this tape.
Parameters
-
object
target - a list or nested structure of Tensors or Variables to be differentiated.
-
PythonFunctionContainer
sources - a list or nested structure of Tensors or Variables. `target` will be differentiated against elements in `sources`.
-
object
output_gradients - a list of gradients, one for each element of target. Defaults to None.
-
ImplicitContainer<T>
unconnected_gradients - a value which can either hold 'none' or 'zero' and alters the value which will be returned if the target and sources are unconnected. The possible values and effects are detailed in 'UnconnectedGradients' and it defaults to 'none'.
Returns
-
object
- a list or nested structure of Tensors (or IndexedSlices, or None), one for each element in `sources`. Returned structure is the same as the structure of `sources`.
object gradient(object target, IEnumerable<object> sources, object output_gradients, ImplicitContainer<T> unconnected_gradients)
Computes the gradient using operations recorded in context of this tape.
Parameters
-
object
target - a list or nested structure of Tensors or Variables to be differentiated.
-
IEnumerable<object>
sources - a list or nested structure of Tensors or Variables. `target` will be differentiated against elements in `sources`.
-
object
output_gradients - a list of gradients, one for each element of target. Defaults to None.
-
ImplicitContainer<T>
unconnected_gradients - a value which can either hold 'none' or 'zero' and alters the value which will be returned if the target and sources are unconnected. The possible values and effects are detailed in 'UnconnectedGradients' and it defaults to 'none'.
Returns
-
object
- a list or nested structure of Tensors (or IndexedSlices, or None), one for each element in `sources`. Returned structure is the same as the structure of `sources`.
object gradient(PythonFunctionContainer target, object sources, object output_gradients, ImplicitContainer<T> unconnected_gradients)
Computes the gradient using operations recorded in context of this tape.
Parameters
-
PythonFunctionContainer
target - a list or nested structure of Tensors or Variables to be differentiated.
-
object
sources - a list or nested structure of Tensors or Variables. `target` will be differentiated against elements in `sources`.
-
object
output_gradients - a list of gradients, one for each element of target. Defaults to None.
-
ImplicitContainer<T>
unconnected_gradients - a value which can either hold 'none' or 'zero' and alters the value which will be returned if the target and sources are unconnected. The possible values and effects are detailed in 'UnconnectedGradients' and it defaults to 'none'.
Returns
-
object
- a list or nested structure of Tensors (or IndexedSlices, or None), one for each element in `sources`. Returned structure is the same as the structure of `sources`.
object gradient(PythonFunctionContainer target, IEnumerable<object> sources, object output_gradients, ImplicitContainer<T> unconnected_gradients)
Computes the gradient using operations recorded in context of this tape.
Parameters
-
PythonFunctionContainer
target - a list or nested structure of Tensors or Variables to be differentiated.
-
IEnumerable<object>
sources - a list or nested structure of Tensors or Variables. `target` will be differentiated against elements in `sources`.
-
object
output_gradients - a list of gradients, one for each element of target. Defaults to None.
-
ImplicitContainer<T>
unconnected_gradients - a value which can either hold 'none' or 'zero' and alters the value which will be returned if the target and sources are unconnected. The possible values and effects are detailed in 'UnconnectedGradients' and it defaults to 'none'.
Returns
-
object
- a list or nested structure of Tensors (or IndexedSlices, or None), one for each element in `sources`. Returned structure is the same as the structure of `sources`.
object gradient(IEnumerable<object> target, IEnumerable<object> sources, object output_gradients, ImplicitContainer<T> unconnected_gradients)
Computes the gradient using operations recorded in context of this tape.
Parameters
-
IEnumerable<object>
target - a list or nested structure of Tensors or Variables to be differentiated.
-
IEnumerable<object>
sources - a list or nested structure of Tensors or Variables. `target` will be differentiated against elements in `sources`.
-
object
output_gradients - a list of gradients, one for each element of target. Defaults to None.
-
ImplicitContainer<T>
unconnected_gradients - a value which can either hold 'none' or 'zero' and alters the value which will be returned if the target and sources are unconnected. The possible values and effects are detailed in 'UnconnectedGradients' and it defaults to 'none'.
Returns
-
object
- a list or nested structure of Tensors (or IndexedSlices, or None), one for each element in `sources`. Returned structure is the same as the structure of `sources`.
object gradient(PythonFunctionContainer target, PythonFunctionContainer sources, object output_gradients, ImplicitContainer<T> unconnected_gradients)
Computes the gradient using operations recorded in context of this tape.
Parameters
-
PythonFunctionContainer
target - a list or nested structure of Tensors or Variables to be differentiated.
-
PythonFunctionContainer
sources - a list or nested structure of Tensors or Variables. `target` will be differentiated against elements in `sources`.
-
object
output_gradients - a list of gradients, one for each element of target. Defaults to None.
-
ImplicitContainer<T>
unconnected_gradients - a value which can either hold 'none' or 'zero' and alters the value which will be returned if the target and sources are unconnected. The possible values and effects are detailed in 'UnconnectedGradients' and it defaults to 'none'.
Returns
-
object
- a list or nested structure of Tensors (or IndexedSlices, or None), one for each element in `sources`. Returned structure is the same as the structure of `sources`.
object gradient(IEnumerable<IGraphNodeBase> target, object sources, object output_gradients, ImplicitContainer<T> unconnected_gradients)
Computes the gradient using operations recorded in context of this tape.
Parameters
-
IEnumerable<IGraphNodeBase>
target - a list or nested structure of Tensors or Variables to be differentiated.
-
object
sources - a list or nested structure of Tensors or Variables. `target` will be differentiated against elements in `sources`.
-
object
output_gradients - a list of gradients, one for each element of target. Defaults to None.
-
ImplicitContainer<T>
unconnected_gradients - a value which can either hold 'none' or 'zero' and alters the value which will be returned if the target and sources are unconnected. The possible values and effects are detailed in 'UnconnectedGradients' and it defaults to 'none'.
Returns
-
object
- a list or nested structure of Tensors (or IndexedSlices, or None), one for each element in `sources`. Returned structure is the same as the structure of `sources`.
object gradient(IEnumerable<IGraphNodeBase> target, PythonFunctionContainer sources, object output_gradients, ImplicitContainer<T> unconnected_gradients)
Computes the gradient using operations recorded in context of this tape.
Parameters
-
IEnumerable<IGraphNodeBase>
target - a list or nested structure of Tensors or Variables to be differentiated.
-
PythonFunctionContainer
sources - a list or nested structure of Tensors or Variables. `target` will be differentiated against elements in `sources`.
-
object
output_gradients - a list of gradients, one for each element of target. Defaults to None.
-
ImplicitContainer<T>
unconnected_gradients - a value which can either hold 'none' or 'zero' and alters the value which will be returned if the target and sources are unconnected. The possible values and effects are detailed in 'UnconnectedGradients' and it defaults to 'none'.
Returns
-
object
- a list or nested structure of Tensors (or IndexedSlices, or None), one for each element in `sources`. Returned structure is the same as the structure of `sources`.
object gradient_dyn(object target, object sources, object output_gradients, ImplicitContainer<T> unconnected_gradients)
Computes the gradient using operations recorded in context of this tape.
Parameters
-
object
target - a list or nested structure of Tensors or Variables to be differentiated.
-
object
sources - a list or nested structure of Tensors or Variables. `target` will be differentiated against elements in `sources`.
-
object
output_gradients - a list of gradients, one for each element of target. Defaults to None.
-
ImplicitContainer<T>
unconnected_gradients - a value which can either hold 'none' or 'zero' and alters the value which will be returned if the target and sources are unconnected. The possible values and effects are detailed in 'UnconnectedGradients' and it defaults to 'none'.
Returns
-
object
- a list or nested structure of Tensors (or IndexedSlices, or None), one for each element in `sources`. Returned structure is the same as the structure of `sources`.
object jacobian(IGraphNodeBase target, IEnumerable<IGraphNodeBase> sources, ImplicitContainer<T> unconnected_gradients, Nullable<int> parallel_iterations, bool experimental_use_pfor)
Computes the jacobian using operations recorded in context of this tape. See [wikipedia article](http://en.wikipedia.org/wiki/jacobian_matrix_and_determinant) for the
definition of a Jacobian. Example usage:
Parameters
-
IGraphNodeBase
target - Tensor to be differentiated.
-
IEnumerable<IGraphNodeBase>
sources - a list or nested structure of Tensors or Variables. `target` will be differentiated against elements in `sources`.
-
ImplicitContainer<T>
unconnected_gradients - a value which can either hold 'none' or 'zero' and alters the value which will be returned if the target and sources are unconnected. The possible values and effects are detailed in 'UnconnectedGradients' and it defaults to 'none'.
-
Nullable<int>
parallel_iterations - A knob to control how many iterations are dispatched in parallel. This knob can be used to control the total memory usage.
-
bool
experimental_use_pfor - If true, vectorizes the jacobian computation. Else falls back to a sequential while_loop. Vectorization can sometimes fail or lead to excessive memory usage. This option can be used to disable vectorization in such cases.
Returns
-
object
- A list or nested structure of Tensors (or None), one for each element in `sources`. Returned structure is the same as the structure of `sources`. Note if any gradient is sparse (IndexedSlices), jacobian function currently makes it dense and returns a Tensor instead. This may change in the future.
Example:
    with tf.GradientTape() as g:
        x = tf.constant([1.0, 2.0])
        g.watch(x)
        y = x * x
    jacobian = g.jacobian(y, x)
    # jacobian value is [[2., 0.], [0., 4.]]
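When `sources` is a list, the result is a list with one Jacobian per source, each shaped as the target shape followed by that source's shape. The sketch below assumes TensorFlow 2.x; the scalar `w` is an illustrative extra source.

```python
import tensorflow as tf

x = tf.constant([1.0, 2.0])
w = tf.constant(3.0)

with tf.GradientTape() as g:
    g.watch([x, w])
    y = w * x * x  # shape [2]

jac_x, jac_w = g.jacobian(y, [x, w])
# jac_x has shape [2, 2]: d y[i] / d x[j] = 2 * w * x[i] when i == j, else 0
# jac_w has shape [2]:    d y[i] / d w    = x[i] ** 2
```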
object jacobian(IGraphNodeBase target, IGraphNodeBase sources, ImplicitContainer<T> unconnected_gradients, Nullable<int> parallel_iterations, bool experimental_use_pfor)
Computes the jacobian using operations recorded in context of this tape. See [wikipedia article](http://en.wikipedia.org/wiki/jacobian_matrix_and_determinant) for the
definition of a Jacobian. Example usage:
Parameters
-
IGraphNodeBase
target - Tensor to be differentiated.
-
IGraphNodeBase
sources - a list or nested structure of Tensors or Variables. `target` will be differentiated against elements in `sources`.
-
ImplicitContainer<T>
unconnected_gradients - a value which can either hold 'none' or 'zero' and alters the value which will be returned if the target and sources are unconnected. The possible values and effects are detailed in 'UnconnectedGradients' and it defaults to 'none'.
-
Nullable<int>
parallel_iterations - A knob to control how many iterations are dispatched in parallel. This knob can be used to control the total memory usage.
-
bool
experimental_use_pfor - If true, vectorizes the jacobian computation. Else falls back to a sequential while_loop. Vectorization can sometimes fail or lead to excessive memory usage. This option can be used to disable vectorization in such cases.
Returns
-
object
- A list or nested structure of Tensors (or None), one for each element in `sources`. Returned structure is the same as the structure of `sources`. Note if any gradient is sparse (IndexedSlices), jacobian function currently makes it dense and returns a Tensor instead. This may change in the future.
Example:
    with tf.GradientTape() as g:
        x = tf.constant([1.0, 2.0])
        g.watch(x)
        y = x * x
    jacobian = g.jacobian(y, x)
    # jacobian value is [[2., 0.], [0., 4.]]
object jacobian_dyn(object target, object sources, ImplicitContainer<T> unconnected_gradients, object parallel_iterations, ImplicitContainer<T> experimental_use_pfor)
Computes the jacobian using operations recorded in context of this tape. See [wikipedia article](http://en.wikipedia.org/wiki/jacobian_matrix_and_determinant) for the
definition of a Jacobian. Example usage:
Parameters
-
object
target - Tensor to be differentiated.
-
object
sources - a list or nested structure of Tensors or Variables. `target` will be differentiated against elements in `sources`.
-
ImplicitContainer<T>
unconnected_gradients - a value which can either hold 'none' or 'zero' and alters the value which will be returned if the target and sources are unconnected. The possible values and effects are detailed in 'UnconnectedGradients' and it defaults to 'none'.
-
object
parallel_iterations - A knob to control how many iterations are dispatched in parallel. This knob can be used to control the total memory usage.
-
ImplicitContainer<T>
experimental_use_pfor - If true, vectorizes the jacobian computation. Else falls back to a sequential while_loop. Vectorization can sometimes fail or lead to excessive memory usage. This option can be used to disable vectorization in such cases.
Returns
-
object
- A list or nested structure of Tensors (or None), one for each element in `sources`. Returned structure is the same as the structure of `sources`. Note if any gradient is sparse (IndexedSlices), jacobian function currently makes it dense and returns a Tensor instead. This may change in the future.
Example:
    with tf.GradientTape() as g:
        x = tf.constant([1.0, 2.0])
        g.watch(x)
        y = x * x
    jacobian = g.jacobian(y, x)
    # jacobian value is [[2., 0.], [0., 4.]]
void reset()
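Clears all information stored in this tape. Equivalent to exiting and reentering the tape context manager with a new tape.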
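In the TensorFlow Python API that this type wraps, `reset` discards everything the tape has recorded so far and continues recording from that point. A minimal sketch, assuming TensorFlow 2.x, with illustrative names:

```python
import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape() as t:
    loss = x * x      # recorded, then discarded by reset()
    t.reset()
    loss += 2.0 * x   # only this operation remains on the tape
grad = t.gradient(loss, x)
# grad is 2.0: the x * x term is treated as a constant after the reset.
```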
object reset_dyn()
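Clears all information stored in this tape. Equivalent to exiting and reentering the tape context manager with a new tape.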
IContextManager<T> stop_recording()
Temporarily stops recording operations on this tape. Operations executed while this context manager is active will not be
recorded on the tape. This is useful for reducing the memory used by tracing
all computations. For example:
```
with tf.GradientTape(persistent=True) as t:
    loss = compute_loss(model)
    with t.stop_recording():
        # The gradient computation below is not traced, saving memory.
        grads = t.gradient(loss, model.variables)
```
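A self-contained variant of the same idea, assuming TensorFlow 2.x: an operation executed under `stop_recording` leaves no trace on the tape, so differentiating through it yields `None`. The names are illustrative.

```python
import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape(persistent=True) as t:
    y = x * x              # recorded
    with t.stop_recording():
        z = y * 2.0        # not recorded
dy_dx = t.gradient(y, x)   # 6.0
dz_dx = t.gradient(z, x)   # None, since the tape never saw how z was made
```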
object stop_recording_dyn()
Temporarily stops recording operations on this tape. Operations executed while this context manager is active will not be
recorded on the tape. This is useful for reducing the memory used by tracing
all computations. For example:
```
with tf.GradientTape(persistent=True) as t:
    loss = compute_loss(model)
    with t.stop_recording():
        # The gradient computation below is not traced, saving memory.
        grads = t.gradient(loss, model.variables)
```
void watch(IEnumerable<object> tensor)
Ensures that `tensor` is being traced by this tape.
Parameters
-
IEnumerable<object>
tensor - a Tensor or list of Tensors.
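For instance, constants are not watched automatically, so they must be passed to `watch` before they are used. The sketch below, assuming TensorFlow 2.x, watches a list of two tensors in one call.

```python
import tensorflow as tf

a = tf.constant(2.0)
b = tf.constant(5.0)
with tf.GradientTape() as t:
    t.watch([a, b])  # a list of tensors is accepted
    y = a * b
da, db = t.gradient(y, [a, b])
# da is b = 5.0 and db is a = 2.0
```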
void watch(PythonFunctionContainer tensor)
Ensures that `tensor` is being traced by this tape.
Parameters
-
PythonFunctionContainer
tensor - a Tensor or list of Tensors.
object watch_dyn(object tensor)
Ensures that `tensor` is being traced by this tape.
Parameters
-
object
tensor - a Tensor or list of Tensors.
object watched_variables()
Returns variables watched by this tape in order of construction.
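A small sketch, assuming TensorFlow 2.x, of inspecting what a tape picked up automatically. Only trainable variables accessed inside the context are expected to appear; `frozen` is an illustrative non-trainable variable.

```python
import tensorflow as tf

v1 = tf.Variable(1.0)
v2 = tf.Variable(2.0)
frozen = tf.Variable(3.0, trainable=False)

with tf.GradientTape() as t:
    y = v2 * v1 + frozen

watched = t.watched_variables()
# Compare by identity: `==` on variables is an element-wise tensor op.
has_v1 = any(v is v1 for v in watched)
has_v2 = any(v is v2 for v in watched)
has_frozen = any(v is frozen for v in watched)
```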
object watched_variables_dyn()
Returns variables watched by this tape in order of construction.