Type ICrossDeviceOps
Namespace tensorflow.distribute
Interfaces IPythonObjectContainer
Public instance methods
object broadcast(object tensor, object destinations)
object broadcast_implementation(object tensor, object destinations)
object reduce(object reduce_op, object per_replica_value, object destinations)
Reduce `per_replica_value` across replicas.

Given a per-replica value returned by `experimental_run_v2`, say a per-example loss, the batch will be divided across all the replicas. This function allows you to aggregate across replicas and optionally also across batch elements. For example, if you have a global batch size of 8 and 2 replicas, values for examples `[0, 1, 2, 3]` will be on replica 0 and `[4, 5, 6, 7]` will be on replica 1. By default, `reduce` will just aggregate across replicas, returning `[0+4, 1+5, 2+6, 3+7]`. This is useful when each replica is computing a scalar or some other value that doesn't have a "batch" dimension (like a gradient). More often you will want to aggregate across the global batch, which you can get by specifying the batch dimension as the `axis`, typically `axis=0`. In this case it would return a scalar `0+1+2+3+4+5+6+7`.

If there is a last partial batch, you will need to specify an axis so that the resulting shape is consistent across replicas. So if the last batch has size 6 and it is divided into `[0, 1, 2, 3]` and `[4, 5]`, you would get a shape mismatch unless you specify `axis=0`. If you specify `tf.distribute.ReduceOp.MEAN`, using `axis=0` will use the correct denominator of 6. Contrast this with computing `reduce_mean` to get a scalar value on each replica and then using this function to average those means, which weighs some values `1/8` and others `1/4`: the per-replica means would be `1.5` and `4.5`, averaging to `3.0`, whereas the true mean of the six values is `2.5`.
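A minimal Python sketch of the reduce semantics described above, written against the `tf.distribute` API these bindings wrap. The two-device `MirroredStrategy` and the `step` helper are illustrative assumptions, and recent TensorFlow releases use `strategy.run` in place of `experimental_run_v2`:

```python
import tensorflow as tf

# Assumption: two visible devices; substitute whatever devices you have.
strategy = tf.distribute.MirroredStrategy(["GPU:0", "GPU:1"])  # 2 replicas

@tf.function
def step():
    # Each replica produces its shard of a global batch of 8 examples:
    # replica 0 yields [0, 1, 2, 3], replica 1 yields [4, 5, 6, 7].
    replica_id = tf.distribute.get_replica_context().replica_id_in_sync_group
    offset = tf.cast(replica_id * 4, tf.float32)
    return tf.range(offset, offset + 4.0)

per_replica = strategy.run(step)  # `experimental_run_v2` on older releases

# axis=None: aggregate across replicas only -> [0+4, 1+5, 2+6, 3+7]
print(strategy.reduce(tf.distribute.ReduceOp.SUM, per_replica, axis=None))

# axis=0: also sum over the batch dimension -> scalar 0+1+...+7 = 28
print(strategy.reduce(tf.distribute.ReduceOp.SUM, per_replica, axis=0))

# MEAN with axis=0 divides by the global batch size (8) -> scalar 3.5
print(strategy.reduce(tf.distribute.ReduceOp.MEAN, per_replica, axis=0))
```

Note that this sketch calls the strategy-level `reduce`, which exposes the `axis` argument discussed above; the `ICrossDeviceOps.reduce` method documented here takes explicit `destinations` instead.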
Parameters
- object reduce_op - A `tf.distribute.ReduceOp` value specifying how values should be combined.
- object per_replica_value
- object destinations
Returns
- object - A `Tensor`.