Type MirroredStrategy
Namespace tensorflow.distribute
Parent Strategy
Interfaces IMirroredStrategy
Mirrors variables to distribute across multiple devices and machines. This strategy uses one replica per device and synchronous replication for its multi-GPU version. To use `MirroredStrategy` with multiple workers, please refer to `tf.distribute.MultiWorkerMirroredStrategy`.
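For example, variables created inside the strategy's `scope` are mirrored on every device. The sketch below uses the Python TensorFlow API that this binding wraps; the Keras model is an illustrative placeholder, not part of this reference:
```
import tensorflow as tf

# By default the strategy mirrors variables across all visible GPUs
# (or falls back to a single replica on CPU).
strategy = tf.distribute.MirroredStrategy()

# Variables created inside the scope are created once per replica and
# kept in sync by the strategy.
with strategy.scope():
    model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(10,))])
    model.compile(optimizer="sgd", loss="mse")
```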
Methods
- colocate_vars_with
- colocate_vars_with_dyn
- configure
- configure_dyn
- experimental_distribute_dataset
- experimental_distribute_dataset_dyn
- experimental_distribute_datasets_from_function
- experimental_distribute_datasets_from_function_dyn
- experimental_local_results
- experimental_local_results_dyn
- experimental_make_numpy_dataset
- experimental_make_numpy_dataset_dyn
- experimental_run
- experimental_run_dyn
- experimental_run_v2
- experimental_run_v2_dyn
- group
- group_dyn
- reduce
- scope
- scope_dyn
- unwrap
- unwrap_dyn
- update_config_proto
- update_config_proto_dyn
Properties
Public instance methods
object colocate_vars_with(object colocate_with_variable)
object colocate_vars_with_dyn(object colocate_with_variable)
object configure(object session_config, IDictionary<string, IEnumerable<string>> cluster_spec, string task_type, Nullable<int> task_id)
object configure_dyn(object session_config, object cluster_spec, object task_type, object task_id)
object experimental_distribute_dataset(object dataset)
object experimental_distribute_dataset_dyn(object dataset)
object experimental_distribute_datasets_from_function(PythonFunctionContainer dataset_fn)
object experimental_distribute_datasets_from_function_dyn(object dataset_fn)
object experimental_local_results(object value)
object experimental_local_results(PythonClassContainer value)
object experimental_local_results(IGraphNodeBase value)
object experimental_local_results(int value)
object experimental_local_results(ValueTuple<int, object, double> value)
object experimental_local_results(IEnumerable<IGraphNodeBase> value)
object experimental_local_results_dyn(object value)
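A per-replica value produced by `experimental_run_v2` can be unpacked into its per-device components with `experimental_local_results`. A minimal sketch in the Python TensorFlow API that this binding wraps; the step function is an illustrative placeholder:
```
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()

def step_fn(x):
    # Placeholder per-replica computation.
    return x * 2.0

# experimental_run_v2 returns a PerReplica value when there is more than one
# replica; experimental_local_results unpacks it into one tensor per replica.
per_replica = strategy.experimental_run_v2(step_fn, args=(tf.constant(1.0),))
for i, value in enumerate(strategy.experimental_local_results(per_replica)):
    print("replica", i, value)
```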
object experimental_make_numpy_dataset(ndarray numpy_input)
object experimental_make_numpy_dataset(ndarray numpy_input, object session)
Makes a `tf.data.Dataset` for input provided via a NumPy array. This avoids adding `numpy_input` as a large constant in the graph, and copies the data to the machine or machines that will be processing the input. Note that you will likely need to use `tf.distribute.Strategy.experimental_distribute_dataset` with the returned dataset to further distribute it with the strategy. Example:
```
import numpy as np
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
numpy_input = np.ones([10], dtype=np.float32)
dataset = strategy.experimental_make_numpy_dataset(numpy_input)
dist_dataset = strategy.experimental_distribute_dataset(dataset)
```
Parameters
- ndarray numpy_input - A nest of NumPy input arrays that will be converted into a dataset. Note that lists of NumPy arrays are stacked, as that is normal `tf.data.Dataset` behavior.
- object session - (TensorFlow v1.x graph execution only) A session used for initialization.
Returns
- object - A `tf.data.Dataset` representing `numpy_input`.
object experimental_make_numpy_dataset_dyn(object numpy_input)
object experimental_run(object fn, IEnumerable<double> input_iterator)
Runs ops in `fn` on each replica, with inputs from `input_iterator`. DEPRECATED: This method is not available in TF 2.x. Please switch to using `experimental_run_v2` instead. When eager execution is enabled, executes ops specified by `fn` on each replica. Otherwise, builds a graph to execute the ops on each replica. Each replica will take a single, different input from the inputs provided by one `get_next` call on the input iterator. `fn` may call `tf.distribute.get_replica_context()` to access members such as `replica_id_in_sync_group`. IMPORTANT: Depending on the `tf.distribute.Strategy` implementation being used, and whether eager execution is enabled, `fn` may be called one or more times (once for each replica).
Parameters
- object fn - The function to run. The inputs to the function must match the outputs of `input_iterator.get_next()`. The output must be a `tf.nest` of `Tensor`s.
- IEnumerable<double> input_iterator - (Optional) input iterator from which the inputs are taken.
Returns
- object - Merged return value of `fn` across replicas. The structure of the return value is the same as the return value from `fn`. Each element in the structure can either be `PerReplica` (if the values are unsynchronized), `Mirrored` (if the values are kept in sync), or `Tensor` (if running on a single replica).
object experimental_run(object fn, ValueTuple<double, IEnumerable<object>> input_iterator)
object experimental_run(object fn, _DefaultDistributionExtended.DefaultInputIterator input_iterator)
object experimental_run(object fn, DatasetIterator input_iterator)
object experimental_run_dyn(object fn, object input_iterator)
These overloads behave identically to the `experimental_run` overload documented above; they differ only in the declared type of `input_iterator`.
object experimental_run_v2(Template fn, ImplicitContainer<T> args, IDictionary<string, object> kwargs)
See base class.
object experimental_run_v2(object fn, IEnumerable<object> args, IDictionary<string, object> kwargs)
See base class.
object experimental_run_v2(Template fn, IEnumerable<object> args, IDictionary<string, object> kwargs)
See base class.
object experimental_run_v2_dyn(object fn, ImplicitContainer<T> args, object kwargs)
See base class.
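A typical training loop distributes a dataset and then drives the per-replica step with `experimental_run_v2`. The sketch below, in the Python TensorFlow API that this binding wraps, assumes an illustrative dataset and layer:
```
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()

# Distribute a dataset so each replica receives its share of every batch.
dataset = tf.data.Dataset.from_tensor_slices(tf.ones([8, 10])).batch(8)
dist_dataset = strategy.experimental_distribute_dataset(dataset)

with strategy.scope():
    layer = tf.keras.layers.Dense(1)

def step_fn(x):
    # Runs once per replica on that replica's slice of the batch.
    return tf.reduce_sum(layer(x))

for batch in dist_dataset:
    per_replica = strategy.experimental_run_v2(step_fn, args=(batch,))
```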
object group(object value, string name)
object group_dyn(object value, object name)
Tensor reduce(ReduceOp reduce_op, IDictionary<object, object> value, object axis)
Reduce `value` across replicas. Given a per-replica value returned by `experimental_run_v2`, say a per-example loss, the batch will be divided across all the replicas. This function allows you to aggregate across replicas and optionally also across batch elements. For example, if you have a global batch size of 8 and 2 replicas, values for examples `[0, 1, 2, 3]` will be on replica 0 and `[4, 5, 6, 7]` will be on replica 1. By default, `reduce` will just aggregate across replicas, returning `[0+4, 1+5, 2+6, 3+7]`. This is useful when each replica is computing a scalar or some other value that doesn't have a "batch" dimension (like a gradient). More often you will want to aggregate across the global batch, which you can get by specifying the batch dimension as the `axis`, typically `axis=0`. In this case it would return a scalar `0+1+2+3+4+5+6+7`. If there is a last partial batch, you will need to specify an axis so that the resulting shape is consistent across replicas. So if the last batch has size 6 and it is divided into `[0, 1, 2, 3]` and `[4, 5]`, you would get a shape mismatch unless you specify `axis=0`. If you specify `tf.distribute.ReduceOp.MEAN`, using `axis=0` will use the correct denominator of 6. Contrast this with computing `reduce_mean` to get a scalar value on each replica and using this function to average those means, which will weigh some values `1/8` and others `1/4`.
Parameters
- ReduceOp reduce_op - A `tf.distribute.ReduceOp` value specifying how values should be combined.
- IDictionary<object, object> value - A "per replica" value, e.g. returned by `experimental_run_v2`, to be combined into a single tensor.
- object axis - Specifies the dimension to reduce along within each replica's tensor. Should typically be set to the batch dimension, or `None` to only reduce across replicas (e.g. if the tensor has no batch dimension).
Returns
- Tensor - A `Tensor`.
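For example, averaging a per-example loss over the whole global batch combines `tf.distribute.ReduceOp.MEAN` with `axis=0`. A minimal sketch in the Python TensorFlow API that this binding wraps; the step function and inputs are illustrative placeholders:
```
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()

def step_fn(x):
    # Placeholder per-example "loss" vector computed on each replica.
    return x * x

per_example = tf.constant([1.0, 2.0, 3.0, 4.0])
per_replica_losses = strategy.experimental_run_v2(step_fn, args=(per_example,))

# MEAN with axis=0 averages over every example in the global batch,
# using the correct denominator even for a final partial batch.
mean_loss = strategy.reduce(
    tf.distribute.ReduceOp.MEAN, per_replica_losses, axis=0)

# axis=None only combines across replicas, keeping the per-example dimension.
summed = strategy.reduce(
    tf.distribute.ReduceOp.SUM, per_replica_losses, axis=None)
```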
The following overloads behave identically to the `reduce` overload documented above; they differ only in the declared types of `reduce_op` and `value`.
Tensor reduce(ReduceOp reduce_op, object value, object axis)
Tensor reduce(ReduceOp reduce_op, IEnumerable<object> value, object axis)
Tensor reduce(ReduceOp reduce_op, PerReplica value, object axis)
Tensor reduce(ReduceOp reduce_op, IGraphNodeBase value, object axis)
Tensor reduce(string reduce_op, object value, object axis)
Tensor reduce(object reduce_op, IDictionary<object, object> value, object axis)
Tensor reduce(object reduce_op, IEnumerable<object> value, object axis)
Tensor reduce(object reduce_op, IGraphNodeBase value, object axis)
Tensor reduce(string reduce_op, IDictionary<object, object> value, object axis)
Tensor reduce(string reduce_op, IEnumerable<object> value, object axis)
Tensor reduce(string reduce_op, PerReplica value, object axis)
Tensor reduce(string reduce_op, IGraphNodeBase value, object axis)
Tensor reduce(object reduce_op, PerReplica value, object axis)
object scope()
object scope_dyn()
object unwrap(object value)
object unwrap_dyn(object value)
object update_config_proto(object config_proto)
Returns a copy of `config_proto` modified for use with this strategy. DEPRECATED: This method is not available in TF 2.x. The updated config has something needed to run a strategy, e.g. configuration to run collective ops, or device filters to improve distributed training performance.
Parameters
- object config_proto - a `tf.ConfigProto` object.
Returns
- object - The updated copy of the `config_proto`.
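In TF 1.x graph execution the returned proto is typically passed straight to the session. A minimal sketch in the Python TensorFlow API that this binding wraps:
```
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()

# Start from an ordinary session config; the strategy adds what it needs,
# e.g. device filters or settings for collective ops.
base_config = tf.compat.v1.ConfigProto(allow_soft_placement=True)
session_config = strategy.update_config_proto(base_config)

with tf.compat.v1.Session(config=session_config) as sess:
    pass  # build and run the TF 1.x graph here
```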
object update_config_proto_dyn(object config_proto)
This overload behaves identically to `update_config_proto` above; it differs only in its dynamically-typed signature.