Type BahdanauMonotonicAttention
Namespace tensorflow.contrib.seq2seq
Parent _BaseMonotonicAttentionMechanism
Interfaces IBahdanauMonotonicAttention
Monotonic attention mechanism with Bahdanau-style energy function. This type of attention enforces a monotonic constraint on the attention
distributions; that is, once the model attends to a given point in the memory,
it can't attend to any prior points at subsequent output timesteps. It
achieves this by using the _monotonic_probability_fn instead of softmax to
construct its attention distributions. Since the attention scores are passed
through a sigmoid, a learnable scalar bias parameter is applied after the
score function and before the sigmoid. Otherwise, it is equivalent to
BahdanauAttention. This approach is proposed in Colin Raffel, Minh-Thang Luong, Peter J. Liu, Ron J. Weiss, Douglas Eck,
"Online and Linear-Time Attention by Enforcing Monotonic Alignments."
ICML 2017. https://arxiv.org/abs/1704.00784
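To make the sigmoid-based construction concrete, below is a minimal NumPy sketch of the attention recurrence this mechanism induces (the 'recursive' mode described under the `mode` parameter below). The function name and the single-sequence, 1-D simplification are illustrative assumptions, not part of this API.

```python
import numpy as np

def monotonic_attention_sketch(p_choose, previous_attention):
    """Simplified single-sequence form of the monotonic attention rule.

    p_choose[j]           -- sigmoid(score[j] + score_bias): probability of
                             attending to memory index j once it is reached
    previous_attention[j] -- attention distribution from the previous output step
    """
    attention = np.zeros_like(p_choose)
    q = 0.0  # probability mass that has survived up to index j
    for j in range(len(p_choose)):
        # q[j] = (1 - p_choose[j-1]) * q[j-1] + previous_attention[j];
        # the carry factor is 1 at j == 0, since no index precedes it.
        carry = 1.0 if j == 0 else 1.0 - p_choose[j - 1]
        q = carry * q + previous_attention[j]
        attention[j] = p_choose[j] * q
    return attention

# If the model attended to index 1 last step, no mass can return to index 0:
p = np.array([0.1, 0.8, 0.6, 0.3])
prev = np.array([0.0, 1.0, 0.0, 0.0])
print(monotonic_attention_sketch(p, prev))  # approx. [0., 0.8, 0.12, 0.024]
```

Mass can only be carried rightward through the (1 - p_choose) factor, which is exactly the monotonic constraint; any leftover mass corresponds to not selecting any memory entry at this output step.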
Public static methods
BahdanauMonotonicAttention NewDyn(object num_units, object memory, object memory_sequence_length, ImplicitContainer<T> normalize, object score_mask_value, ImplicitContainer<T> sigmoid_noise, object sigmoid_noise_seed, ImplicitContainer<T> score_bias_init, ImplicitContainer<T> mode, object dtype, ImplicitContainer<T> name)
Construct the Attention mechanism.
Parameters
-
object
num_units - The depth of the query mechanism.
-
object
memory - The memory to query; usually the output of an RNN encoder. This tensor should be shaped `[batch_size, max_time, ...]`.
-
object
memory_sequence_length - (optional) Sequence lengths for the batch entries in memory. If provided, the memory tensor rows are masked with zeros for values past the respective sequence lengths.
-
ImplicitContainer<T>
normalize - Python boolean. Whether to normalize the energy term.
-
object
score_mask_value - (optional) The mask value for score before passing into `probability_fn`. The default is -inf. Only used if `memory_sequence_length` is not None.
-
ImplicitContainer<T>
sigmoid_noise - Standard deviation of pre-sigmoid noise. See the docstring for `_monotonic_probability_fn` for more information.
-
object
sigmoid_noise_seed - (optional) Random seed for pre-sigmoid noise.
-
ImplicitContainer<T>
score_bias_init - Initial value for score bias scalar. It's recommended to initialize this to a negative value when the length of the memory is large.
-
ImplicitContainer<T>
mode - How to compute the attention distribution. Must be one of 'recursive', 'parallel', or 'hard'. See the docstring for tf.contrib.seq2seq.monotonic_attention for more information.
-
object
dtype - The data type for the query and memory layers of the attention mechanism.
-
ImplicitContainer<T>
name - Name to use when creating ops.
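Since this page documents a binding over TensorFlow's tf.contrib.seq2seq, a usage sketch in the underlying Python API may help orient the parameters above. The placeholder shapes, tensor names (encoder_outputs, source_lengths), and hyperparameter values are illustrative assumptions, not defaults.

```python
import tensorflow as tf  # TensorFlow 1.x, where tf.contrib is available

# Illustrative memory: [batch_size, max_time, encoder_dim], plus true lengths.
encoder_outputs = tf.placeholder(tf.float32, [None, None, 256])
source_lengths = tf.placeholder(tf.int32, [None])

attention_mechanism = tf.contrib.seq2seq.BahdanauMonotonicAttention(
    num_units=128,                          # depth of the query mechanism
    memory=encoder_outputs,
    memory_sequence_length=source_lengths,  # masks scores past each length
    sigmoid_noise=1.0,                      # pre-sigmoid noise while training
    score_bias_init=-4.0,                   # negative init for long memories
    mode='parallel')                        # differentiable "soft" mode

# Wrap a decoder cell so each decode step emits a monotonic attention context.
decoder_cell = tf.contrib.seq2seq.AttentionWrapper(
    tf.nn.rnn_cell.LSTMCell(256),
    attention_mechanism,
    attention_layer_size=128)
```

The paper trains with the soft modes plus sigmoid noise and switches to hard monotonic attention at test time; in this API that corresponds to constructing the inference-time mechanism with mode='hard', which thresholds the score rather than passing it through a soft sigmoid.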