Type VectorDiffeomixture
Namespace tensorflow.contrib.distributions
Parent Distribution
Interfaces IVectorDiffeomixture
VectorDiffeomixture distribution.

A vector diffeomixture (VDM) is a distribution parameterized by a convex
combination of `K` component `loc` vectors, `loc[k], k = 0,...,K-1`, and `K`
`scale` matrices `scale[k], k = 0,..., K-1`. It approximates the following
[compound distribution](https://en.wikipedia.org/wiki/Compound_probability_distribution)

```none
p(x) = int p(x | z) p(z) dz,
where z is in the K-simplex, and
p(x | z) := p(x | loc=sum_k z[k] loc[k], scale=sum_k z[k] scale[k])
```

The integral `int p(x | z) p(z) dz` is approximated with a quadrature scheme
adapted to the mixture density `p(z)`. The `N` quadrature points `z_{N, n}`
and weights `w_{N, n}` (which are non-negative and sum to 1) are chosen
such that

```q_N(x) := sum_{n=1}^N w_{N, n} p(x | z_{N, n}) --> p(x)```

as `N --> infinity`.

Since `q_N(x)` is in fact a mixture (of `N` points), we may sample from
`q_N` exactly. It is important to note that the VDM is *defined* as `q_N`
above, and *not* `p(x)`. Therefore, sampling and pdf may be implemented as
exact (up to floating point error) methods.

A common choice for the conditional `p(x | z)` is a multivariate Normal.

The implemented marginal `p(z)` is the `SoftmaxNormal`, which is a
`K-1` dimensional Normal transformed by a `SoftmaxCentered` bijector, making
it a density on the `K`-simplex. That is,

```
Z = SoftmaxCentered(X),
X = Normal(mix_loc / temperature, 1 / temperature)
```

The default quadrature scheme chooses `z_{N, n}` as `N` midpoints of
the quantiles of `p(z)` (generalized quantiles if `K > 2`).

See [Dillon and Langmore (2018)][1] for more details.

#### About `Vector` distributions in TensorFlow.

The `VectorDiffeomixture` is a non-standard distribution that has properties
particularly useful in [variational Bayesian
methods](https://en.wikipedia.org/wiki/Variational_Bayesian_methods).

Conditioned on a draw from the SoftmaxNormal, `X|z` is a vector whose
components are linear combinations of affine transformations, and is thus
itself an affine transformation.

Note: The marginals `X_1|z,..., X_d|z` are *not* generally identical to some
parameterization of `distribution`. This is because a sum of draws from
`distribution` does not, in general, follow the same family as `distribution`.

#### About `Diffeomixture`s and reparameterization.

The `VectorDiffeomixture` is designed to be reparameterized, i.e., its
parameters are only used to transform samples from a distribution which has no
trainable parameters. This property is important because backprop stops at
sources of stochasticity. That is, as long as the parameters are used *after*
the underlying source of stochasticity, the computed gradient is accurate.

Reparameterization means that we can use gradient descent (via backprop) to
optimize Monte-Carlo objectives. Such objectives are a finite-sample
approximation of an expectation and arise throughout scientific computing.

WARNING: If you backprop through a VectorDiffeomixture sample and the "base"
distribution is both: not `FULLY_REPARAMETERIZED` and a function of trainable
variables, then the gradient is not guaranteed correct!

#### References

[1]: Joshua Dillon and Ian Langmore. Quadrature Compound: An approximating
family of distributions. _arXiv preprint arXiv:1801.03080_, 2018.
https://arxiv.org/abs/1801.03080

#### Examples

```python
# `np` and `tf` are used below, so they are imported alongside TFP.
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions

# Create two batches of VectorDiffeomixtures, one with mix_loc=[0.],
# another with mix_loc=[1]. In both cases, `K=2` and the affine
# transformations involve:
# k=0: loc=zeros(dims)  scale=LinearOperatorScaledIdentity
# k=1: loc=[2.]*dims    scale=LinOpDiag
dims = 5
vdm = tfd.VectorDiffeomixture(
    mix_loc=[[0.], [1]],
    temperature=[1.],
    distribution=tfd.Normal(loc=0., scale=1.),
    loc=[
        None,  # Equivalent to `np.zeros(dims, dtype=np.float32)`.
        np.float32([2.]*dims),
    ],
    scale=[
        tf.linalg.LinearOperatorScaledIdentity(
            num_rows=dims,
            multiplier=np.float32(1.1),
            is_positive_definite=True),
        tf.linalg.LinearOperatorDiag(
            diag=np.linspace(2.5, 3.5, dims, dtype=np.float32),
            is_positive_definite=True),
    ],
    validate_args=True)
```
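To make the quadrature definition of `q_N` more concrete, the following is a minimal NumPy sketch for the scalar case with `K = 2` and Normal conditionals. It is an illustration only, not the library's implementation: the helper names (`softmax_centered`, `quantile_midpoint_grid`, `vdm_pdf`) are hypothetical, and the choice of quantile edges and uniform weights is an assumption made for this example.

```python
import numpy as np
from statistics import NormalDist


def softmax_centered(x):
    """Map x in R^(K-1) to the K-simplex: append a zero, then apply softmax."""
    x = np.atleast_1d(np.asarray(x, dtype=np.float64))
    padded = np.concatenate([x, np.zeros(1)])
    padded = padded - padded.max()  # stabilise the exponentials
    e = np.exp(padded)
    return e / e.sum()


def quantile_midpoint_grid(mix_loc, temperature, n):
    """Assumed scheme for K=2: grid has shape [2, n], weights has shape [n]."""
    normal = NormalDist(mu=mix_loc / temperature, sigma=1.0 / temperature)
    # n + 1 interior quantile levels, avoiding the probabilities 0 and 1.
    edges = np.linspace(0.0, 1.0, n + 3)[1:-1]
    z_edges = np.stack(
        [softmax_centered(normal.inv_cdf(float(p))) for p in edges], axis=1)
    # Midpoints of adjacent simplex points remain on the simplex.
    grid = 0.5 * (z_edges[:, :-1] + z_edges[:, 1:])
    weights = np.full(n, 1.0 / n)
    return grid, weights


def normal_pdf(x, loc, scale):
    return np.exp(-0.5 * ((x - loc) / scale) ** 2) / (scale * np.sqrt(2.0 * np.pi))


def vdm_pdf(x, grid, weights, locs, scales):
    """q_N(x) = sum_n w_n p(x | z_n) for scalar x."""
    loc_n = locs @ grid      # sum_k z_n[k] * loc[k], shape [n]
    scale_n = scales @ grid  # sum_k z_n[k] * scale[k], shape [n]
    return float(np.sum(weights * normal_pdf(x, loc_n, scale_n)))


grid, weights = quantile_midpoint_grid(mix_loc=0.0, temperature=1.0, n=8)
density_at_zero = vdm_pdf(
    0.0, grid, weights, locs=np.array([0.0, 2.0]), scales=np.array([1.1, 3.0]))
```

Because `q_N` is a finite mixture with weights that sum to 1, the density above is evaluated exactly rather than estimated.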
Methods
Properties
- allow_nan_stats
- allow_nan_stats_dyn
- batch_shape
- batch_shape_dyn
- distribution
- distribution_dyn
- dtype
- dtype_dyn
- endpoint_affine
- endpoint_affine_dyn
- event_shape
- event_shape_dyn
- grid
- grid_dyn
- interpolated_affine
- interpolated_affine_dyn
- mixture_distribution
- mixture_distribution_dyn
- name
- name_dyn
- parameters
- parameters_dyn
- PythonObject
- reparameterization_type
- reparameterization_type_dyn
- validate_args
- validate_args_dyn
Public static methods
VectorDiffeomixture NewDyn(object mix_loc, object temperature, object distribution, object loc, object scale, ImplicitContainer<T> quadrature_size, ImplicitContainer<T> quadrature_fn, ImplicitContainer<T> validate_args, ImplicitContainer<T> allow_nan_stats, ImplicitContainer<T> name)
Constructs the VectorDiffeomixture on `R^d`. (deprecated)

Warning: THIS FUNCTION IS DEPRECATED. It will be removed after 2018-10-01.
Instructions for updating:
The TensorFlow Distributions library has moved to TensorFlow Probability (https://github.com/tensorflow/probability). You should update all references to use `tfp.distributions` instead of `tf.contrib.distributions`.

The vector diffeomixture (VDM) approximates the compound distribution

```none
p(x) = int p(x | z) p(z) dz,
where z is in the K-simplex, and
p(x | z) := p(x | loc=sum_k z[k] loc[k], scale=sum_k z[k] scale[k])
```
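As a rough illustration of the conditional `p(x | z)` above, the sketch below interpolates the `K` component `loc` vectors and `scale` matrices with a fixed simplex point `z` and pushes parameter-free base noise through the resulting affine map. It uses plain NumPy arrays in place of `LinearOperator`s and assumes a standard Normal base distribution; the function names are hypothetical.

```python
import numpy as np


def interpolate_affine(z, locs, scales):
    """Return (loc, scale) for the convex combination selected by z.

    z: shape [K], non-negative and summing to 1.
    locs: shape [K, d].  scales: shape [K, d, d].
    """
    loc = np.tensordot(z, locs, axes=1)      # sum_k z[k] * loc[k]   -> [d]
    scale = np.tensordot(z, scales, axes=1)  # sum_k z[k] * scale[k] -> [d, d]
    return loc, scale


def sample_conditional(z, locs, scales, rng, num_samples):
    """Draw from p(x | z), assuming a standard Normal base distribution."""
    loc, scale = interpolate_affine(z, locs, scales)
    # The base noise has no trainable parameters; loc and scale only transform it.
    eps = rng.standard_normal((num_samples, loc.shape[0]))
    return loc + eps @ scale.T


dims = 3
locs = np.stack([np.zeros(dims), np.full(dims, 2.0)])
scales = np.stack([1.1 * np.eye(dims), np.diag(np.linspace(2.5, 3.5, dims))])
z = np.array([0.25, 0.75])
samples = sample_conditional(z, locs, scales, np.random.default_rng(0), num_samples=4)
```

Since the parameters enter only after the noise is drawn, each sample is a deterministic, differentiable function of `loc` and `scale`, which is the reparameterization property the class relies on.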
Parameters
-
object
mix_loc - `float`-like `Tensor` with shape `[b1,..., bB, K-1]`. In terms of samples, larger `mix_loc[..., k]` ==> `Z` is more likely to put more weight on its `kth` component.
-
object
temperature - `float`-like `Tensor`. Broadcastable with `mix_loc`. In terms of samples, smaller `temperature` means one component is more likely to dominate. I.e., smaller `temperature` makes the VDM look more like a standard mixture of `K` components.
-
object
distribution - `tf.Distribution`-like instance. Distribution from which `d` iid samples are used as input to the selected affine transformation. Must be a scalar-batch, scalar-event distribution. Typically `distribution.reparameterization_type = FULLY_REPARAMETERIZED` or it is a function of non-trainable parameters. WARNING: If you backprop through a VectorDiffeomixture sample and the `distribution` is not `FULLY_REPARAMETERIZED` yet is a function of trainable variables, then the gradient will be incorrect!
-
object
loc - Length-`K` list of `float`-type `Tensor`s. The `k`-th element represents the `shift` used for the `k`-th affine transformation. If the `k`-th item is `None`, `loc` is implicitly `0`. When specified, must have shape `[B1,..., Bb, d]` where `b >= 0` and `d` is the event size.
-
object
scale - Length-`K` list of `LinearOperator`s. Each should be positive-definite and operate on a `d`-dimensional vector space. The `k`-th element represents the `scale` used for the `k`-th affine transformation. `LinearOperator`s must have shape `[B1,..., Bb, d, d]`, `b >= 0`, i.e., each characterizes `b`-batches of `d x d` matrices.
-
ImplicitContainer<T>
quadrature_size - Python `int` scalar representing number of quadrature points. Larger `quadrature_size` means `q_N(x)` better approximates `p(x)`.
-
ImplicitContainer<T>
quadrature_fn - Python callable taking `normal_loc`, `normal_scale`, `quadrature_size`, `validate_args` and returning `tuple(grid, probs)` representing the SoftmaxNormal grid and corresponding normalized weights. Default value: `quadrature_scheme_softmaxnormal_quantiles`.
-
ImplicitContainer<T>
validate_args - Python `bool`, default `False`. When `True` distribution parameters are checked for validity despite possibly degrading runtime performance. When `False` invalid inputs may silently render incorrect outputs.
-
ImplicitContainer<T>
allow_nan_stats - Python `bool`, default `True`. When `True`, statistics (e.g., mean, mode, variance) use the value "`NaN`" to indicate the result is undefined. When `False`, an exception is raised if one or more of the statistic's batch members are undefined.
-
ImplicitContainer<T>
name - Python `str` name prefixed to Ops created by this class.
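The effect of `mix_loc` and `temperature` on the mixing variable `Z` can be checked empirically. The sketch below is a NumPy approximation that assumes the `SoftmaxNormal` construction `Z = SoftmaxCentered(X)`, `X = Normal(mix_loc / temperature, 1 / temperature)`, with `SoftmaxCentered` taken to mean "append a zero, then softmax". The helper names are hypothetical.

```python
import numpy as np


def sample_softmax_normal(mix_loc, temperature, num_samples, rng):
    """Draw Z on the K-simplex; mix_loc has K-1 entries."""
    mix_loc = np.atleast_1d(np.asarray(mix_loc, dtype=np.float64))
    x = rng.normal(loc=mix_loc / temperature,
                   scale=1.0 / temperature,
                   size=(num_samples, mix_loc.size))
    padded = np.concatenate([x, np.zeros((num_samples, 1))], axis=1)
    padded -= padded.max(axis=1, keepdims=True)  # stabilise the exponentials
    e = np.exp(padded)
    return e / e.sum(axis=1, keepdims=True)


def mean_dominance(mix_loc, temperature, rng, num_samples=10_000):
    """Average weight of the largest component; values near 1 mean one component dominates."""
    z = sample_softmax_normal(mix_loc, temperature, num_samples, rng)
    return float(z.max(axis=1).mean())


rng = np.random.default_rng(0)
dominance_by_temperature = {
    t: mean_dominance([0.0], t, rng) for t in (0.1, 1.0, 10.0)
}
```

Under these assumptions, smaller temperatures push the draws toward the vertices of the simplex, so the VDM behaves more like a standard mixture of `K` components, while a larger `mix_loc[..., k]` shifts weight toward component `k`.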
Public properties
object allow_nan_stats get;
object allow_nan_stats_dyn get;
TensorShape batch_shape get;
object batch_shape_dyn get;
object distribution get;
Base scalar-event, scalar-batch distribution.
object distribution_dyn get;
Base scalar-event, scalar-batch distribution.
object dtype get;
object dtype_dyn get;
IList<AffineLinearOperator> endpoint_affine get;
Affine transformation for each of `K` components.
object endpoint_affine_dyn get;
Affine transformation for each of `K` components.
TensorShape event_shape get;
object event_shape_dyn get;
object grid get;
Grid of mixing probabilities, one for each grid point.
object grid_dyn get;
Grid of mixing probabilities, one for each grid point.
IList<AffineLinearOperator> interpolated_affine get;
Affine transformation for each convex combination of `K` components.
object interpolated_affine_dyn get;
Affine transformation for each convex combination of `K` components.
Categorical mixture_distribution get;
Distribution used to select a convex combination of affine transforms.
object mixture_distribution_dyn get;
Distribution used to select a convex combination of affine transforms.
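The properties `grid`, `interpolated_affine`, and `mixture_distribution` together describe how a sample can be drawn: select one grid point with a categorical draw, then apply the affine transformation interpolated at that point to base noise. The NumPy sketch below mimics that flow; the array shapes (`grid` as `[K, N]`, dense `[K, d, d]` scales) and the standard Normal base are assumptions for illustration, and `sample_mixture` is a hypothetical helper rather than part of the API.

```python
import numpy as np


def sample_mixture(grid, weights, locs, scales, num_samples, rng):
    """Draw from the N-point mixture defined by a simplex grid and its weights.

    grid: [K, N] columns on the simplex.  weights: [N], summing to 1.
    locs: [K, d].  scales: [K, d, d].
    """
    num_points = grid.shape[1]
    d = locs.shape[1]
    # One interpolated (loc, scale) pair per grid point.
    interp_locs = np.einsum("kn,kd->nd", grid, locs)        # [N, d]
    interp_scales = np.einsum("kn,kde->nde", grid, scales)  # [N, d, d]
    # Categorical selection of a grid point for every sample.
    idx = rng.choice(num_points, size=num_samples, p=weights)
    eps = rng.standard_normal((num_samples, d))
    x = interp_locs[idx] + np.einsum("sde,se->sd", interp_scales[idx], eps)
    return x, idx


grid = np.array([[0.9, 0.5, 0.1],
                 [0.1, 0.5, 0.9]])
weights = np.full(3, 1.0 / 3.0)
locs = np.array([[0.0, 0.0], [2.0, 2.0]])
scales = np.stack([np.eye(2), 3.0 * np.eye(2)])
x, idx = sample_mixture(grid, weights, locs, scales, 20_000, np.random.default_rng(0))
```

Drawing the index first and transforming afterwards is what allows sampling from the mixture to be exact rather than approximate.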