Type MaskedAutoregressiveFlow
Namespace tensorflow.contrib.distributions.bijectors
Parent Bijector
Interfaces IMaskedAutoregressiveFlow
Affine MaskedAutoregressiveFlow bijector for vector-valued events.

The affine autoregressive flow [(Papamakarios et al., 2016)][3] provides a relatively simple framework for user-specified (deep) architectures to learn a distribution over vector-valued events.

Regarding terminology, "Autoregressive models decompose the joint density as a product of conditionals, and model each conditional in turn. Normalizing flows transform a base density (e.g. a standard Gaussian) into the target density by an invertible transformation with tractable Jacobian." [(Papamakarios et al., 2016)][3]

In other words, the "autoregressive property" is equivalent to the decomposition, `p(x) = prod{ p(x[i] | x[0:i]) : i=0, ..., d }`. The provided `shift_and_log_scale_fn`, `masked_autoregressive_default_template`, achieves this property by zeroing out weights in its `masked_dense` layers.
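For intuition, here is a minimal, hypothetical sketch (not the library's actual `masked_dense` implementation) of how zeroed weights enforce that decomposition for a 3-dimensional event:

```python
import numpy as np

d = 3
rng = np.random.default_rng(0)
w = rng.normal(size=(d, d))             # unconstrained dense weights
mask = np.tril(np.ones((d, d)), k=-1)   # mask[i, j] = 1 only when j < i
x = rng.normal(size=d)

# With the mask applied, output i is a function of x[0:i] alone,
# which is exactly the autoregressive decomposition above.
h = (w * mask) @ x
```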
In the `tfp` framework, a "normalizing flow" is implemented as a `tfp.bijectors.Bijector`. The `forward` "autoregression" is implemented using a `tf.while_loop` and a deep neural network (DNN) with masked weights such that the autoregressive property is automatically met in the `inverse`.

A `TransformedDistribution` using `MaskedAutoregressiveFlow(...)` uses the
(expensive) forward-mode calculation to draw samples and the (cheap)
reverse-mode calculation to compute log-probabilities. Conversely, a
`TransformedDistribution` using `Invert(MaskedAutoregressiveFlow(...))` uses
the (expensive) forward-mode calculation to compute log-probabilities and the
(cheap) reverse-mode calculation to compute samples. See "Example Use"
[below] for more details. Given a `shift_and_log_scale_fn`, the forward and inverse transformations are
(a sequence of) affine transformations. A "valid" `shift_and_log_scale_fn`
must compute each `shift` (aka `loc` or "mu" in [Germain et al. (2015)][1])
and `log(scale)` (aka "alpha" in [Germain et al. (2015)][1]) such that each
are broadcastable with the arguments to `forward` and `inverse`, i.e., such
that the calculations in `forward`, `inverse` [below] are possible.
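As an illustration of this shape contract only, consider a hypothetical, trivially "valid" function: returning zero `shift` and `log(scale)` makes the flow an identity map, and depending on no inputs satisfies the autoregressive property vacuously.

```python
import tensorflow as tf

def identity_shift_and_log_scale_fn(y):
  # Both outputs have the same shape as `y`, so they broadcast with the
  # arguments to `forward` and `inverse` as required.
  return tf.zeros_like(y), tf.zeros_like(y)
```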
For convenience, `masked_autoregressive_default_template` is offered as a possible `shift_and_log_scale_fn` function. It implements the MADE architecture [(Germain et al., 2015)][1]. MADE is a feed-forward network that computes a `shift` and `log(scale)` using `masked_dense` layers in a deep neural network. Weights are masked to ensure the autoregressive property. It is possible that this architecture is suboptimal for your task. To build alternative networks, either change the arguments to `masked_autoregressive_default_template`, use the `masked_dense` function to roll out your own, or use some other architecture, e.g., using `tf.layers`.
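For example, a sketch of swapping in non-default template arguments (the argument names follow the `tfp`-style Python API referenced above; the specific values are illustrative, not recommendations):

```python
import tensorflow as tf
import tensorflow_probability as tfp
tfb = tfp.bijectors

# A deeper, narrower MADE than the default, with ELU activations.
shift_and_log_scale_fn = tfb.masked_autoregressive_default_template(
    hidden_layers=[256, 256, 256],
    activation=tf.nn.elu)

# A shift-only variant: log_scale is None, so the scale is 1 and the
# resulting flow is volume-preserving (unit Jacobian determinant).
shift_only_fn = tfb.masked_autoregressive_default_template(
    hidden_layers=[512, 512],
    shift_only=True)
```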
enforces the "autoregressive property". Assuming `shift_and_log_scale_fn` has valid shape and autoregressive
semantics, the forward transformation is:
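```python
def forward(x):
  y = zeros_like(x)
  event_size = x.shape[-1]
  # Each pass fixes one more event dimension; after `event_size` passes
  # every y[i] is consistent with shift/log_scale computed from y[0:i].
  for _ in range(event_size):
    shift, log_scale = shift_and_log_scale_fn(y)
    y = x * math_ops.exp(log_scale) + shift
  return y
```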
and the inverse transformation is:
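```python
def inverse(y):
  # `y` is fully known here, so a single evaluation suffices to invert
  # the affine map above: x = (y - shift) / exp(log_scale).
  shift, log_scale = shift_and_log_scale_fn(y)
  return (y - shift) / math_ops.exp(log_scale)
```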
Notice that the `inverse` does not need a for-loop. This is because in the forward pass each calculation of `shift` and `log_scale` is based on the `y` calculated so far (not `x`). In the `inverse`, `y` is fully known from the start, so a single evaluation of `shift_and_log_scale_fn(y)` yields the same scaling that `forward` reaches only after `event_size` passes, i.e., via the "last" `y` used to compute `shift`, `log_scale`. (Roughly speaking, this also proves the transform is bijective.)

#### Examples
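A minimal usage sketch, assuming the `tfp`-style Python API referenced above (the `tfd`/`tfb` aliases, the 2-dimensional event size, and the `[512, 512]` hidden layers are illustrative choices):

```python
import tensorflow as tf
import tensorflow_probability as tfp
tfd = tfp.distributions
tfb = tfp.bijectors

dims = 2

# MaskedAutoregressiveFlow: cheap inverse, hence cheap log_prob
# (suited to density estimation).
maf = tfd.TransformedDistribution(
    distribution=tfd.Normal(loc=0., scale=1.),
    bijector=tfb.MaskedAutoregressiveFlow(
        shift_and_log_scale_fn=tfb.masked_autoregressive_default_template(
            hidden_layers=[512, 512])),
    event_shape=[dims])

x = maf.sample()            # expensive: event_size passes of the forward loop
log_prob_x = maf.log_prob(x)  # cheap: a single inverse pass

# Invert(MaskedAutoregressiveFlow): cheap sampling, expensive log_prob,
# i.e., the Inverse Autoregressive Flow of [Kingma et al. (2016)][2].
iaf = tfd.TransformedDistribution(
    distribution=tfd.Normal(loc=0., scale=1.),
    bijector=tfb.Invert(tfb.MaskedAutoregressiveFlow(
        shift_and_log_scale_fn=tfb.masked_autoregressive_default_template(
            hidden_layers=[512, 512]))),
    event_shape=[dims])

y = iaf.sample()            # cheap: one pass of the inverted bijector
log_prob_y = iaf.log_prob(y)  # expensive: the while_loop runs in inverse
```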
#### References

[1]: Mathieu Germain, Karol Gregor, Iain Murray, and Hugo Larochelle. MADE: Masked Autoencoder for Distribution Estimation. In _International Conference on Machine Learning_, 2015. https://arxiv.org/abs/1502.03509

[2]: Diederik P. Kingma, Tim Salimans, Rafal Jozefowicz, Xi Chen, Ilya Sutskever, and Max Welling. Improving Variational Inference with Inverse Autoregressive Flow. In _Neural Information Processing Systems_, 2016. https://arxiv.org/abs/1606.04934

[3]: George Papamakarios, Theo Pavlakou, and Iain Murray. Masked Autoregressive Flow for Density Estimation. In _Neural Information Processing Systems_, 2017. https://arxiv.org/abs/1705.07057
Show Example

```python
def forward(x):
  y = zeros_like(x)
  event_size = x.shape[-1]
  for _ in range(event_size):
    shift, log_scale = shift_and_log_scale_fn(y)
    y = x * math_ops.exp(log_scale) + shift
  return y
```
Methods
Properties
Public static methods
MaskedAutoregressiveFlow NewDyn(object shift_and_log_scale_fn, ImplicitContainer<T> is_constant_jacobian, ImplicitContainer<T> validate_args, ImplicitContainer<T> unroll_loop, object name)
Creates the MaskedAutoregressiveFlow bijector. (deprecated)

Warning: THIS FUNCTION IS DEPRECATED. It will be removed after 2018-10-01.
Instructions for updating:
The TensorFlow Distributions library has moved to TensorFlow Probability (https://github.com/tensorflow/probability). You should update all references to use `tfp.distributions` instead of `tf.contrib.distributions`.
Parameters
- `shift_and_log_scale_fn` (object): Python `callable` which computes `shift` and `log_scale` from both the forward domain (`x`) and the inverse domain (`y`). Calculation must respect the "autoregressive property" (see class docstring). Suggested default: `masked_autoregressive_default_template(hidden_layers=...)`. Typically the function contains `tf.Variables` and is wrapped using `tf.compat.v1.make_template`. Returning `None` for either (or both of) `shift` and `log_scale` is equivalent to (but more efficient than) returning zero.
- `is_constant_jacobian` (ImplicitContainer&lt;T&gt;): Python `bool`. Default: `False`. When `True` the implementation assumes `log_scale` does not depend on the forward domain (`x`) or inverse domain (`y`) values. (No validation is made; `is_constant_jacobian=False` is always safe but possibly computationally inefficient.)
- `validate_args` (ImplicitContainer&lt;T&gt;): Python `bool` indicating whether arguments should be checked for correctness.
- `unroll_loop` (ImplicitContainer&lt;T&gt;): Python `bool` indicating whether the `tf.while_loop` in `_forward` should be replaced with a static for loop. Requires that the final dimension of `x` be known at graph construction time. Defaults to `False`.
- `name` (object): Python `str`, name given to ops managed by this object.