LostTech.TensorFlow : API Documentation

Type VectorDiffeomixture

Namespace tensorflow.contrib.distributions

Parent Distribution

Interfaces IVectorDiffeomixture

VectorDiffeomixture distribution.

A vector diffeomixture (VDM) is a distribution parameterized by a convex combination of `K` component `loc` vectors, `loc[k], k = 0,...,K-1`, and `K` `scale` matrices `scale[k], k = 0,..., K-1`. It approximates the following [compound distribution] (https://en.wikipedia.org/wiki/Compound_probability_distribution)

```none p(x) = int p(x | z) p(z) dz, where z is in the K-simplex, and p(x | z) := p(x | loc=sum_k z[k] loc[k], scale=sum_k z[k] scale[k]) ```

The integral `int p(x | z) p(z) dz` is approximated with a quadrature scheme adapted to the mixture density `p(z)`. The `N` quadrature points `z_{N, n}` and weights `w_{N, n}` (which are non-negative and sum to 1) are chosen such that

```q_N(x) := sum_{n=1}^N w_{n, N} p(x | z_{N, n}) --> p(x)```

as `N --> infinity`.

Since `q_N(x)` is in fact a mixture (of `N` points), we may sample from `q_N` exactly. It is important to note that the VDM is *defined* as `q_N` above, and *not* `p(x)`. Therefore, sampling and pdf may be implemented as exact (up to floating point error) methods.

A common choice for the conditional `p(x | z)` is a multivariate Normal.

The implemented marginal `p(z)` is the `SoftmaxNormal`, which is a `K-1` dimensional Normal transformed by a `SoftmaxCentered` bijector, making it a density on the `K`-simplex. That is,

``` Z = SoftmaxCentered(X), X = Normal(mix_loc / temperature, 1 / temperature) ```

The default quadrature scheme chooses `z_{N, n}` as `N` midpoints of the quantiles of `p(z)` (generalized quantiles if `K > 2`).

See [Dillon and Langmore (2018)][1] for more details.

#### About `Vector` distributions in TensorFlow.

The `VectorDiffeomixture` is a non-standard distribution that has properties particularly useful in [variational Bayesian methods](https://en.wikipedia.org/wiki/Variational_Bayesian_methods).

Conditioned on a draw from the SoftmaxNormal, `X|z` is a vector whose components are linear combinations of affine transformations, thus is itself an affine transformation.

Note: The marginals `X_1|v,..., X_d|v` are *not* generally identical to some parameterization of `distribution`. This is due to the fact that the sum of draws from `distribution` are not generally itself the same `distribution`.

#### About `Diffeomixture`s and reparameterization.

The `VectorDiffeomixture` is designed to be reparameterized, i.e., its parameters are only used to transform samples from a distribution which has no trainable parameters. This property is important because backprop stops at sources of stochasticity. That is, as long as the parameters are used *after* the underlying source of stochasticity, the computed gradient is accurate.

Reparametrization means that we can use gradient-descent (via backprop) to optimize Monte-Carlo objectives. Such objectives are a finite-sample approximation of an expectation and arise throughout scientific computing.

WARNING: If you backprop through a VectorDiffeomixture sample and the "base" distribution is both: not `FULLY_REPARAMETERIZED` and a function of trainable variables, then the gradient is not guaranteed correct!

#### Examples #### References

[1]: Joshua Dillon and Ian Langmore. Quadrature Compound: An approximating family of distributions. _arXiv preprint arXiv:1801.03080_, 2018. https://arxiv.org/abs/1801.03080
Show Example
import tensorflow_probability as tfp
            tfd = tfp.distributions 

# Create two batches of VectorDiffeomixtures, one with mix_loc=[0.], # another with mix_loc=[1]. In both cases, `K=2` and the affine # transformations involve: # k=0: loc=zeros(dims) scale=LinearOperatorScaledIdentity # k=1: loc=[2.]*dims scale=LinOpDiag dims = 5 vdm = tfd.VectorDiffeomixture( mix_loc=[[0.], [1]], temperature=[1.], distribution=tfd.Normal(loc=0., scale=1.), loc=[ None, # Equivalent to `np.zeros(dims, dtype=np.float32)`. np.float32([2.]*dims), ], scale=[ tf.linalg.LinearOperatorScaledIdentity( num_rows=dims, multiplier=np.float32(1.1), is_positive_definite=True), tf.linalg.LinearOperatorDiag( diag=np.linspace(2.5, 3.5, dims, dtype=np.float32), is_positive_definite=True), ], validate_args=True)

Methods

Properties

Public static methods

VectorDiffeomixture NewDyn(object mix_loc, object temperature, object distribution, object loc, object scale, ImplicitContainer<T> quadrature_size, ImplicitContainer<T> quadrature_fn, ImplicitContainer<T> validate_args, ImplicitContainer<T> allow_nan_stats, ImplicitContainer<T> name)

Constructs the VectorDiffeomixture on `R^d`. (deprecated)

Warning: THIS FUNCTION IS DEPRECATED. It will be removed after 2018-10-01. Instructions for updating: The TensorFlow Distributions library has moved to TensorFlow Probability (https://github.com/tensorflow/probability). You should update all references to use `tfp.distributions` instead of tf.contrib.distributions.

The vector diffeomixture (VDM) approximates the compound distribution

```none p(x) = int p(x | z) p(z) dz, where z is in the K-simplex, and p(x | z) := p(x | loc=sum_k z[k] loc[k], scale=sum_k z[k] scale[k]) ```
Parameters
object mix_loc
`float`-like `Tensor` with shape `[b1,..., bB, K-1]`. In terms of samples, larger `mix_loc[..., k]` ==> `Z` is more likely to put more weight on its `kth` component.
object temperature
`float`-like `Tensor`. Broadcastable with `mix_loc`. In terms of samples, smaller `temperature` means one component is more likely to dominate. I.e., smaller `temperature` makes the VDM look more like a standard mixture of `K` components.
object distribution
`tf.Distribution`-like instance. Distribution from which `d` iid samples are used as input to the selected affine transformation. Must be a scalar-batch, scalar-event distribution. Typically `distribution.reparameterization_type = FULLY_REPARAMETERIZED` or it is a function of non-trainable parameters. WARNING: If you backprop through a VectorDiffeomixture sample and the `distribution` is not `FULLY_REPARAMETERIZED` yet is a function of trainable variables, then the gradient will be incorrect!
object loc
Length-`K` list of `float`-type `Tensor`s. The `k`-th element represents the `shift` used for the `k`-th affine transformation. If the `k`-th item is `None`, `loc` is implicitly `0`. When specified, must have shape `[B1,..., Bb, d]` where `b >= 0` and `d` is the event size.
object scale
Length-`K` list of `LinearOperator`s. Each should be positive-definite and operate on a `d`-dimensional vector space. The `k`-th element represents the `scale` used for the `k`-th affine transformation. `LinearOperator`s must have shape `[B1,..., Bb, d, d]`, `b >= 0`, i.e., characterizes `b`-batches of `d x d` matrices
ImplicitContainer<T> quadrature_size
Python `int` scalar representing number of quadrature points. Larger `quadrature_size` means `q_N(x)` better approximates `p(x)`.
ImplicitContainer<T> quadrature_fn
Python callable taking `normal_loc`, `normal_scale`, `quadrature_size`, `validate_args` and returning `tuple(grid, probs)` representing the SoftmaxNormal grid and corresponding normalized weight. normalized) weight. Default value: `quadrature_scheme_softmaxnormal_quantiles`.
ImplicitContainer<T> validate_args
Python `bool`, default `False`. When `True` distribution parameters are checked for validity despite possibly degrading runtime performance. When `False` invalid inputs may silently render incorrect outputs.
ImplicitContainer<T> allow_nan_stats
Python `bool`, default `True`. When `True`, statistics (e.g., mean, mode, variance) use the value "`NaN`" to indicate the result is undefined. When `False`, an exception is raised if one or more of the statistic's batch members are undefined.
ImplicitContainer<T> name
Python `str` name prefixed to Ops created by this class.

Public properties

object allow_nan_stats get;

object allow_nan_stats_dyn get;

TensorShape batch_shape get;

object batch_shape_dyn get;

object distribution get;

Base scalar-event, scalar-batch distribution.

object distribution_dyn get;

Base scalar-event, scalar-batch distribution.

object dtype get;

object dtype_dyn get;

IList<AffineLinearOperator> endpoint_affine get;

Affine transformation for each of `K` components.

object endpoint_affine_dyn get;

Affine transformation for each of `K` components.

TensorShape event_shape get;

object event_shape_dyn get;

object grid get;

Grid of mixing probabilities, one for each grid point.

object grid_dyn get;

Grid of mixing probabilities, one for each grid point.

IList<AffineLinearOperator> interpolated_affine get;

Affine transformation for each convex combination of `K` components.

object interpolated_affine_dyn get;

Affine transformation for each convex combination of `K` components.

Categorical mixture_distribution get;

Distribution used to select a convex combination of affine transforms.

object mixture_distribution_dyn get;

Distribution used to select a convex combination of affine transforms.

string name get;

object name_dyn get;

IDictionary<object, object> parameters get;

object parameters_dyn get;

object PythonObject get;

object reparameterization_type get;

object reparameterization_type_dyn get;

object validate_args get;

object validate_args_dyn get;