Type ICrossDeviceOps
Namespace tensorflow.distribute
Interfaces IPythonObjectContainer
Public instance methods
object broadcast(object tensor, object destinations)
object broadcast_implementation(object tensor, object destinations)
object reduce(object reduce_op, object per_replica_value, object destinations)
Reduce `per_replica_value` across replicas.

Given a per-replica value returned by `experimental_run_v2`, say a per-example loss, the batch will be divided across all the replicas. This function allows you to aggregate across replicas and optionally also across batch elements. For example, if you have a global batch size of 8 and 2 replicas, values for examples `[0, 1, 2, 3]` will be on replica 0 and `[4, 5, 6, 7]` will be on replica 1. By default, `reduce` will just aggregate across replicas, returning `[0+4, 1+5, 2+6, 3+7]`. This is useful when each replica is computing a scalar or some other value that doesn't have a "batch" dimension (like a gradient). More often you will want to aggregate across the global batch, which you can get by specifying the batch dimension as the `axis`, typically `axis=0`. In this case it would return a scalar `0+1+2+3+4+5+6+7`.

If there is a last partial batch, you will need to specify an axis so that the resulting shape is consistent across replicas. So if the last batch has size 6 and it is divided into `[0, 1, 2, 3]` and `[4, 5]`, you would get a shape mismatch unless you specify `axis=0`. If you specify `tf.distribute.ReduceOp.MEAN`, using `axis=0` will use the correct denominator of 6. Contrast this with computing `reduce_mean` to get a scalar value on each replica and then using this function to average those means, which weighs some values `1/8` and others `1/4`: the per-replica means would be `1.5` and `4.5`, averaging to `3.0`, whereas the true mean of the six values is `2.5`.
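A minimal Python sketch of the reduce semantics described above, written against the `tf.distribute` API these bindings wrap. The two-device `MirroredStrategy` and the `step` helper are illustrative assumptions, and recent TensorFlow releases use `strategy.run` in place of `experimental_run_v2`:

```python
import tensorflow as tf

# Assumption: two visible devices; substitute whatever devices you have.
strategy = tf.distribute.MirroredStrategy(["GPU:0", "GPU:1"])  # 2 replicas

@tf.function
def step():
    # Each replica produces its shard of a global batch of 8 examples:
    # replica 0 yields [0, 1, 2, 3], replica 1 yields [4, 5, 6, 7].
    replica_id = tf.distribute.get_replica_context().replica_id_in_sync_group
    offset = tf.cast(replica_id * 4, tf.float32)
    return tf.range(offset, offset + 4.0)

per_replica = strategy.run(step)  # `experimental_run_v2` on older releases

# axis=None: aggregate across replicas only -> [0+4, 1+5, 2+6, 3+7]
print(strategy.reduce(tf.distribute.ReduceOp.SUM, per_replica, axis=None))

# axis=0: also sum over the batch dimension -> scalar 0+1+...+7 = 28
print(strategy.reduce(tf.distribute.ReduceOp.SUM, per_replica, axis=0))

# MEAN with axis=0 divides by the global batch size (8) -> scalar 3.5
print(strategy.reduce(tf.distribute.ReduceOp.MEAN, per_replica, axis=0))
```

Note that this sketch calls the strategy-level `reduce`, which exposes the `axis` argument discussed above; the `ICrossDeviceOps.reduce` method documented here takes explicit `destinations` instead.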
Parameters
- object reduce_op - A `tf.distribute.ReduceOp` value specifying how values should be combined.
- object per_replica_value
- object destinations
Returns
- object - A `Tensor`.