# LostTech.TensorFlow : API Documentation

Type: `Autoregressive`

Namespace: `tensorflow.contrib.distributions`

Parent: `Distribution`

Interfaces: `IAutoregressive`

Autoregressive distributions.

The Autoregressive distribution enables learning (often) richer multivariate distributions by repeatedly applying a [diffeomorphic](https://en.wikipedia.org/wiki/Diffeomorphism) transformation (such as implemented by `Bijector`s). Regarding terminology,

"Autoregressive models decompose the joint density as a product of conditionals, and model each conditional in turn. Normalizing flows transform a base density (e.g. a standard Gaussian) into the target density by an invertible transformation with tractable Jacobian." [(Papamakarios et al., 2016)][1]

In other words, the "autoregressive property" is equivalent to the decomposition, `p(x) = prod{ p(x[i] | x[0:i]) : i=0,..., d }`. The provided `shift_and_log_scale_fn`, `masked_autoregressive_default_template`, achieves this property by zeroing out weights in its `masked_dense` layers.
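As an illustrative sketch (plain NumPy, not the library's `masked_dense` implementation), the autoregressive property can be enforced by masking a weight matrix so that output coordinate `i` only sees inputs `x[0:i]`:

```python
import numpy as np

d = 4
rng = np.random.default_rng(0)
w = rng.normal(size=(d, d))

# Strictly lower-triangular mask: row i keeps only columns j < i, so
# output i depends only on preceding coordinates -- the same idea
# masked_dense uses when it zeroes out weights.
mask = np.tril(np.ones((d, d)), k=-1)
w_masked = w * mask

x = rng.normal(size=d)
y = w_masked @ x

# Perturbing x[2] cannot change y[0], y[1], or y[2], since those rows
# have zero weight on column 2; only later outputs may change.
x2 = x.copy()
x2[2] += 10.0
y2 = w_masked @ x2
print(np.allclose(y[:3], y2[:3]))  # True
```

The triangular mask is exactly the matrix form of the decomposition `p(x) = prod{ p(x[i] | x[0:i]) }`: each conditional sees only earlier coordinates.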

Practically speaking, the autoregressive property means that there exists a permutation of the event coordinates such that each coordinate is a diffeomorphic function of only preceding coordinates [(van den Oord et al., 2016)][2].

#### Mathematical Details

The probability function is

```none
prob(x; fn, n) = fn(x).prob(x)
```

And a sample is generated by

```none
x = fn(...fn(fn(x0).sample()).sample()).sample()
```

where the ellipses (`...`) represent `n-2` composed calls to `fn`, `fn` constructs a `tfp.distributions.Distribution`-like instance, and `x0` is a fixed initializing `Tensor`.
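A minimal sketch of this sampling recursion, using a hypothetical stand-in `fn` (a closure returning a distribution-like object; not the library's implementation):

```python
import numpy as np

def fn(x):
    # Hypothetical stand-in for the user-supplied `fn`: constructs a
    # distribution-like object whose parameters depend on x.
    class Dist:
        def sample(self, rng):
            # Coordinate-wise Normal(loc=tanh(x), scale=1) sample.
            return np.tanh(x) + rng.normal(size=x.shape)
        def prob(self, y):
            # Corresponding Normal density, evaluated pointwise.
            return np.exp(-0.5 * (y - np.tanh(x)) ** 2) / np.sqrt(2 * np.pi)
    return Dist()

rng = np.random.default_rng(0)
x0 = np.zeros(4)   # fixed initializing tensor
n = 3              # total number of composed calls to fn

# x = fn(...fn(fn(x0).sample()).sample()).sample()
x = x0
for _ in range(n):
    x = fn(x).sample(rng)

# prob(x; fn, n) = fn(x).prob(x)
prob_x = fn(x).prob(x)
```

Note that the density only ever evaluates `fn` once, at the final `x`; the `n`-fold composition matters only for sampling.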

#### References

[1]: George Papamakarios, Theo Pavlakou, and Iain Murray. Masked Autoregressive Flow for Density Estimation. In _Neural Information Processing Systems_, 2017. https://arxiv.org/abs/1705.07057

[2]: Aaron van den Oord, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, and Koray Kavukcuoglu. Conditional Image Generation with PixelCNN Decoders. In _Neural Information Processing Systems_, 2016. https://arxiv.org/abs/1606.05328
#### Examples

```python
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions
tfb = tfp.bijectors

def normal_fn(event_size):
    n = event_size * (event_size + 1) // 2
    p = tf.Variable(tfd.Normal(loc=0., scale=1.).sample(n))
    affine = tfb.Affine(scale_tril=tfp.math.fill_triangular(0.25 * p))
    def _fn(samples):
        scale = tf.exp(affine.forward(samples))
        return tfd.Independent(
            tfd.Normal(loc=0., scale=scale, validate_args=True),
            reinterpreted_batch_ndims=1)
    return _fn

batch_and_event_shape = [3, 2, 4]
sample0 = tf.zeros(batch_and_event_shape)
ar = tfd.Autoregressive(
    normal_fn(batch_and_event_shape[-1]), sample0)
x = ar.sample([6, 5])
# ==> x.shape = [6, 5, 3, 2, 4]
prob_x = ar.prob(x)
# ==> prob_x.shape = [6, 5, 3, 2]
```