Type HierarchicalCopyAllReduce
Namespace tensorflow.distribute
Parent AllReduceCrossDeviceOps
Interfaces IHierarchicalCopyAllReduce
Reduction using hierarchical copy all-reduce. It reduces to one GPU along edges in some hierarchy and broadcasts back to
each GPU along the same path. Before performing the all-reduce, tensors are
repacked or aggregated for more efficient cross-device transport. This reduction was created for the Nvidia DGX-1 and assumes the GPUs are connected
as they are on a DGX-1 machine. If your GPUs are interconnected differently, it is
likely to be slower than tf.distribute.ReductionToOneDevice.
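
The reduce-then-broadcast pattern described above can be illustrated with a small pure-Python model. This is only a sketch of the idea, not the library's implementation: the function name `hierarchical_all_reduce`, the device names, and the tree in `parent` are hypothetical and do not reflect the real DGX-1 topology, and the tensor repacking step is left out.

```python
def hierarchical_all_reduce(values, parent):
    """Sum per-device lists up a tree, then copy the total back down.

    values: dict mapping device name -> list of numbers (equal lengths).
    parent: dict mapping device name -> parent device, or None for the root.
    Both arguments are illustrative; the hierarchy is an assumption.
    """
    def depth(device):
        # Number of edges between this device and the root.
        steps = 0
        while parent[device] is not None:
            device = parent[device]
            steps += 1
        return steps

    partial = {device: list(v) for device, v in values.items()}
    deepest_first = sorted(parent, key=depth, reverse=True)

    # Reduce phase: each device adds its partial sum into its parent,
    # deepest devices first, so the root ends up with the full sum.
    for device in deepest_first:
        up = parent[device]
        if up is not None:
            partial[up] = [a + b for a, b in zip(partial[up], partial[device])]

    # Broadcast phase: walk the same edges in the opposite direction,
    # copying the parent's result down to each child.
    for device in reversed(deepest_first):
        up = parent[device]
        if up is not None:
            partial[device] = list(partial[up])

    return partial


# Hypothetical four-device hierarchy rooted at gpu0.
parent = {"gpu0": None, "gpu1": "gpu0", "gpu2": "gpu0", "gpu3": "gpu2"}
values = {"gpu0": [1, 2], "gpu1": [3, 4], "gpu2": [5, 6], "gpu3": [7, 8]}
result = hierarchical_all_reduce(values, parent)
```

After the call, every device holds the same element-wise sum. In the TensorFlow Python API, an instance of this class is usually passed as the `cross_device_ops` argument of `tf.distribute.MirroredStrategy`; whether this binding exposes the same parameter is an assumption to verify.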