Type AdditiveAttention
Namespace tensorflow.keras.layers
Parent BaseDenseAttention
Interfaces IAdditiveAttention
Additive attention layer, a.k.a. Bahdanau-style attention. Inputs are a `query` tensor of shape `[batch_size, Tq, dim]`, a `value` tensor of shape `[batch_size, Tv, dim]`, and a `key` tensor of shape `[batch_size, Tv, dim]`. The calculation follows these steps:
1. Reshape `query` and `key` into shapes `[batch_size, Tq, 1, dim]` and `[batch_size, 1, Tv, dim]` respectively.
2. Calculate scores with shape `[batch_size, Tq, Tv]` as a non-linear sum: `scores = tf.reduce_sum(tf.tanh(query + key), axis=-1)`.
3. Use `scores` to calculate a distribution with shape `[batch_size, Tq, Tv]`: `distribution = tf.nn.softmax(scores)`.
4. Use `distribution` to create a linear combination of `value` with shape `[batch_size, Tq, dim]`: `return tf.matmul(distribution, value)`.
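As a sanity check of the four steps above, here is a minimal sketch that reproduces the computation with plain TensorFlow ops and compares it against the layer's output. The shapes (`batch_size=2`, `Tq=3`, `Tv=4`, `dim=8`) are illustrative assumptions, and `use_scale=False` is passed so the layer carries no learned scale weight that the manual computation would miss.

```python
import tensorflow as tf

# Illustrative shapes (assumptions, not part of the API).
batch_size, Tq, Tv, dim = 2, 3, 4, 8
query = tf.random.normal([batch_size, Tq, dim])
value = tf.random.normal([batch_size, Tv, dim])
key = tf.random.normal([batch_size, Tv, dim])

# 1. Reshape `query` and `key` so they broadcast against each other.
q = tf.reshape(query, [batch_size, Tq, 1, dim])
k = tf.reshape(key, [batch_size, 1, Tv, dim])

# 2. Non-linear additive scores, shape [batch_size, Tq, Tv].
scores = tf.reduce_sum(tf.tanh(q + k), axis=-1)

# 3. Softmax over the Tv axis gives the attention distribution.
distribution = tf.nn.softmax(scores)

# 4. Linear combination of `value`, shape [batch_size, Tq, dim].
output = tf.matmul(distribution, value)

# With use_scale=False the layer has no trainable weights, so its
# output should match the manual computation above.
layer = tf.keras.layers.AdditiveAttention(use_scale=False)
layer_output = layer([query, value, key])
tf.debugging.assert_near(output, layer_output)
```

With the default `use_scale=True`, step 2 instead multiplies `tf.tanh(q + k)` by a learned scale vector before the reduction, so the manual result would differ until that weight is accounted for.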
Properties
- activity_regularizer
- activity_regularizer_dyn
- built
- causal
- dtype
- dtype_dyn
- dynamic
- dynamic_dyn
- inbound_nodes
- inbound_nodes_dyn
- input
- input_dyn
- input_mask
- input_mask_dyn
- input_shape
- input_shape_dyn
- input_spec
- input_spec_dyn
- losses
- losses_dyn
- metrics
- metrics_dyn
- name
- name_dyn
- name_scope
- name_scope_dyn
- non_trainable_variables
- non_trainable_variables_dyn
- non_trainable_weights
- non_trainable_weights_dyn
- outbound_nodes
- outbound_nodes_dyn
- output
- output_dyn
- output_mask
- output_mask_dyn
- output_shape
- output_shape_dyn
- PythonObject
- scale
- stateful
- submodules
- submodules_dyn
- supports_masking
- trainable
- trainable_dyn
- trainable_variables
- trainable_variables_dyn
- trainable_weights
- trainable_weights_dyn
- updates
- updates_dyn
- use_scale
- variables
- variables_dyn
- weights
- weights_dyn