Type tf.signal
Namespace tensorflow
Methods
- dct
- dct_dyn
- fftshift
- fftshift_dyn
- frame
- frame_dyn
- hamming_window
- hamming_window_dyn
- hann_window
- hann_window_dyn
- idct
- idct_dyn
- ifftshift
- ifftshift_dyn
- inverse_stft
- inverse_stft_dyn
- inverse_stft_window_fn
- inverse_stft_window_fn_dyn
- linear_to_mel_weight_matrix
- linear_to_mel_weight_matrix_dyn
- mfccs_from_log_mel_spectrograms
- mfccs_from_log_mel_spectrograms_dyn
- overlap_and_add
- overlap_and_add_dyn
- stft
- stft_dyn
Properties
Public static methods
Tensor dct(IEnumerable<object> input, int type, Nullable<int> n, int axis, string norm, string name)
Computes the 1D Discrete Cosine Transform (DCT) of `input`. Currently only Types I, II and III are supported.
Type I is implemented using a length `2N` padded `tf.signal.rfft`.
Type II is implemented using a length `2N` padded `tf.signal.rfft`, as described here: [Type 2 DCT using 2N FFT padded (Makhoul)](https://dsp.stackexchange.com/a/10606).
Type III is a fairly straightforward inverse of Type II (i.e. using a length `2N` padded `tf.signal.irfft`).
Parameters
-
IEnumerable<object>
input - A `[..., samples]` `float32` `Tensor` containing the signals to take the DCT of.
-
int
type - The DCT type to perform. Must be 1, 2 or 3.
-
Nullable<int>
n - The length of the transform. If `n` is less than the sequence length, only the first `n` elements of the sequence are used for the DCT. If `n` is greater than the sequence length, the sequence is zero-padded to length `n` before the DCT is computed.
-
int
axis - For future expansion. The axis to compute the DCT along. Must be `-1`.
-
string
norm - The normalization to apply. `None` for no normalization or `'ortho'` for orthonormal normalization.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `[..., samples]` `float32` `Tensor` containing the DCT of `input`.
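The effect of `n` and `norm` can be sketched in plain Python. This is a naive O(N^2) evaluation of the Type II definition, not the padded-FFT method described above. The helper name `dct2` is hypothetical, and the scaling (a factor of 2 when unnormalized, and the `'ortho'` factors) is an assumption that follows the `scipy.fftpack.dct` convention, which this page does not state.

```python
import math


def dct2(x, n=None, norm=None):
    """Naive Type II DCT of a 1-D list (illustrative only)."""
    if n is not None:
        # Keep the first n samples, or zero-pad the signal up to length n.
        x = list(x[:n]) + [0.0] * max(0, n - len(x))
    size = len(x)
    out = []
    for k in range(size):
        acc = sum(
            x[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * size))
            for i in range(size)
        )
        acc *= 2.0  # assumed unnormalized scaling (scipy.fftpack convention)
        if norm == "ortho":
            # Assumed orthonormal scaling: the k == 0 term is scaled differently.
            if k == 0:
                acc *= math.sqrt(1.0 / (4 * size))
            else:
                acc *= math.sqrt(1.0 / (2 * size))
        out.append(acc)
    return out
```

Under these assumptions, `dct2(signal, n=2)` gives the same result as transforming only the first two samples, and `dct2(signal, n=6)` on a four-sample signal returns six coefficients.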
Tensor dct(object input, int type, Nullable<int> n, int axis, string norm, string name)
Computes the 1D Discrete Cosine Transform (DCT) of `input`. Currently only Types I, II and III are supported.
Type I is implemented using a length `2N` padded `tf.signal.rfft`.
Type II is implemented using a length `2N` padded `tf.signal.rfft`, as described here: [Type 2 DCT using 2N FFT padded (Makhoul)](https://dsp.stackexchange.com/a/10606).
Type III is a fairly straightforward inverse of Type II (i.e. using a length `2N` padded `tf.signal.irfft`).
Parameters
-
object
input - A `[..., samples]` `float32` `Tensor` containing the signals to take the DCT of.
-
int
type - The DCT type to perform. Must be 1, 2 or 3.
-
Nullable<int>
n - The length of the transform. If `n` is less than the sequence length, only the first `n` elements of the sequence are used for the DCT. If `n` is greater than the sequence length, the sequence is zero-padded to length `n` before the DCT is computed.
-
int
axis - For future expansion. The axis to compute the DCT along. Must be `-1`.
-
string
norm - The normalization to apply. `None` for no normalization or `'ortho'` for orthonormal normalization.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `[..., samples]` `float32` `Tensor` containing the DCT of `input`.
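The statement that Type III inverts Type II can be checked numerically with a small plain-Python sketch. The helper names `dct2_ortho` and `dct3_ortho` are hypothetical, and both use orthonormal scaling, which is an assumption: with other scalings the round trip would generally differ from the input by a constant factor.

```python
import math


def dct2_ortho(x):
    """Naive orthonormal Type II DCT of a 1-D list (illustrative only)."""
    size = len(x)
    out = []
    for k in range(size):
        scale = math.sqrt(1.0 / size) if k == 0 else math.sqrt(2.0 / size)
        out.append(scale * sum(
            x[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * size))
            for i in range(size)
        ))
    return out


def dct3_ortho(y):
    """Naive orthonormal Type III DCT, the transpose of dct2_ortho."""
    size = len(y)
    out = []
    for i in range(size):
        acc = y[0] * math.sqrt(1.0 / size)
        acc += sum(
            math.sqrt(2.0 / size) * y[k]
            * math.cos(math.pi * k * (2 * i + 1) / (2 * size))
            for k in range(1, size)
        )
        out.append(acc)
    return out


signal = [1.0, 2.0, 3.0, 4.0]
restored = dct3_ortho(dct2_ortho(signal))
# restored matches signal up to floating-point rounding.
```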
Tensor dct(IGraphNodeBase input, int type, Nullable<int> n, int axis, string norm, string name)
Computes the 1D Discrete Cosine Transform (DCT) of `input`. Currently only Types I, II and III are supported.
Type I is implemented using a length `2N` padded `tf.signal.rfft`.
Type II is implemented using a length `2N` padded `tf.signal.rfft`, as described here: [Type 2 DCT using 2N FFT padded (Makhoul)](https://dsp.stackexchange.com/a/10606).
Type III is a fairly straightforward inverse of Type II (i.e. using a length `2N` padded `tf.signal.irfft`).
Parameters
-
IGraphNodeBase
input - A `[..., samples]` `float32` `Tensor` containing the signals to take the DCT of.
-
int
type - The DCT type to perform. Must be 1, 2 or 3.
-
Nullable<int>
n - The length of the transform. If `n` is less than the sequence length, only the first `n` elements of the sequence are used for the DCT. If `n` is greater than the sequence length, the sequence is zero-padded to length `n` before the DCT is computed.
-
int
axis - For future expansion. The axis to compute the DCT along. Must be `-1`.
-
string
norm - The normalization to apply. `None` for no normalization or `'ortho'` for orthonormal normalization.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `[..., samples]` `float32` `Tensor` containing the DCT of `input`.
object dct_dyn(object input, ImplicitContainer<T> type, object n, ImplicitContainer<T> axis, object norm, object name)
Computes the 1D Discrete Cosine Transform (DCT) of `input`. Currently only Types I, II and III are supported.
Type I is implemented using a length `2N` padded `tf.signal.rfft`.
Type II is implemented using a length `2N` padded `tf.signal.rfft`, as described here: [Type 2 DCT using 2N FFT padded (Makhoul)](https://dsp.stackexchange.com/a/10606).
Type III is a fairly straightforward inverse of Type II (i.e. using a length `2N` padded `tf.signal.irfft`).
Parameters
-
object
input - A `[..., samples]` `float32` `Tensor` containing the signals to take the DCT of.
-
ImplicitContainer<T>
type - The DCT type to perform. Must be 1, 2 or 3.
-
object
n - The length of the transform. If `n` is less than the sequence length, only the first `n` elements of the sequence are used for the DCT. If `n` is greater than the sequence length, the sequence is zero-padded to length `n` before the DCT is computed.
-
ImplicitContainer<T>
axis - For future expansion. The axis to compute the DCT along. Must be `-1`.
-
object
norm - The normalization to apply. `None` for no normalization or `'ortho'` for orthonormal normalization.
-
object
name - An optional name for the operation.
Returns
-
object
- A `[..., samples]` `float32` `Tensor` containing the DCT of `input`.
Tensor fftshift(IEnumerable<int> x, Nullable<ValueTuple<int, int>> axes, string name)
Shift the zero-frequency component to the center of the spectrum. This function swaps half-spaces for all axes listed (defaults to all).
Note that `y[0]` is the Nyquist component only if `len(x)` is even.
Parameters
-
IEnumerable<int>
x - `Tensor`, input tensor.
-
Nullable<ValueTuple<int, int>>
axes - `int` or shape `tuple`, optional. Axes over which to shift. Default is `None`, which shifts all axes.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor`: the shifted tensor.
Example
x = tf.signal.fftshift([0., 1., 2., 3., 4., -5., -4., -3., -2., -1.])
x.numpy()  # array([-5., -4., -3., -2., -1., 0., 1., 2., 3., 4.])
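For a single axis, the half-space swap amounts to rotating the sequence by half its length. The following plain-Python sketch shows that idea under the assumption that the shift is `len(x) // 2` positions. The helper `fftshift_1d` is hypothetical and is not part of the API.

```python
def fftshift_1d(values):
    """Rotate a 1-D sequence right by len // 2 so the zero-frequency bin is centered."""
    values = list(values)
    shift = len(values) // 2
    if shift == 0:
        return values
    # The last `shift` elements move to the front; the rest follow.
    return values[-shift:] + values[:-shift]


shifted = fftshift_1d([0., 1., 2., 3., 4., -5., -4., -3., -2., -1.])
# shifted == [-5., -4., -3., -2., -1., 0., 1., 2., 3., 4.]
```

For the even-length input above, the first output element is the `-5.` bin, which is the Nyquist component mentioned in the note. For an odd-length input such as `[0., 1., 2., -2., -1.]` there is no Nyquist bin and the result is `[-2., -1., 0., 1., 2.]`.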
Tensor fftshift(IEnumerable<int> x, int axes, string name)
Shift the zero-frequency component to the center of the spectrum. This function swaps half-spaces for all axes listed (defaults to all).
Note that `y[0]` is the Nyquist component only if `len(x)` is even.
Parameters
-
IEnumerable<int>
x - `Tensor`, input tensor.
-
int
axes - `int` or shape `tuple`, optional. Axes over which to shift. Default is `None`, which shifts all axes.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor`: the shifted tensor.
Example
x = tf.signal.fftshift([0., 1., 2., 3., 4., -5., -4., -3., -2., -1.])
x.numpy()  # array([-5., -4., -3., -2., -1., 0., 1., 2., 3., 4.])
object fftshift_dyn(object x, object axes, object name)
Shift the zero-frequency component to the center of the spectrum. This function swaps half-spaces for all axes listed (defaults to all).
Note that `y[0]` is the Nyquist component only if `len(x)` is even.
Parameters
-
object
x - `Tensor`, input tensor.
-
object
axes - `int` or shape `tuple`, optional. Axes over which to shift. Default is `None`, which shifts all axes.
-
object
name - An optional name for the operation.
Returns
-
object
- A `Tensor`: the shifted tensor.
Example
x = tf.signal.fftshift([0., 1., 2., 3., 4., -5., -4., -3., -2., -1.])
x.numpy()  # array([-5., -4., -3., -2., -1., 0., 1., 2., 3., 4.])
Tensor frame(int signal, IEnumerable<int> frame_length, IEnumerable<int> frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
int
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
IEnumerable<int>
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
IEnumerable<int>
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
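The windowing and `pad_end` behavior described above can be sketched for a 1-D signal in plain Python. The helper `frame_1d` is hypothetical. The frame-count formulas are an assumption derived from the description: without padding, only windows that fit entirely are kept, and with padding there is one window per hop position that starts inside the signal.

```python
def frame_1d(signal, frame_length, frame_step, pad_end=False, pad_value=0):
    """Split a 1-D sequence into overlapping frames (illustrative only)."""
    signal = list(signal)
    if pad_end:
        # Ceiling division: every window start that lies inside the signal.
        num_frames = -(-len(signal) // frame_step)
    else:
        # Only windows that fully overlap the signal.
        num_frames = max(0, 1 + (len(signal) - frame_length) // frame_step)
    frames = []
    for i in range(num_frames):
        start = i * frame_step
        window = signal[start:start + frame_length]
        # Fill the tail of a partial window with pad_value.
        window += [pad_value] * (frame_length - len(window))
        frames.append(window)
    return frames


frames = frame_1d([1, 2, 3, 4, 5, 6, 7], frame_length=3, frame_step=2)
# frames == [[1, 2, 3], [3, 4, 5], [5, 6, 7]]
padded = frame_1d([1, 2, 3, 4, 5, 6, 7], frame_length=3, frame_step=2, pad_end=True)
# padded == [[1, 2, 3], [3, 4, 5], [5, 6, 7], [7, 0, 0]]
```

Under the same counting rule, the 9152-sample signal in the example above, framed with length 512 and step 180, yields 1 + (9152 - 512) // 180 = 49 frames per signal.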
Tensor frame(IEnumerable<int> signal, IEnumerable<int> frame_length, IEnumerable<int> frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
IEnumerable<int>
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
IEnumerable<int>
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
IEnumerable<int>
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(IEnumerable<int> signal, IEnumerable<int> frame_length, int frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
IEnumerable<int>
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
IEnumerable<int>
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
int
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(IEnumerable<int> signal, int frame_length, IEnumerable<int> frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
IEnumerable<int>
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
int
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
IEnumerable<int>
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(IEnumerable<int> signal, int frame_length, int frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
IEnumerable<int>
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
int
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
int
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(IEnumerable<int> signal, int frame_length, IGraphNodeBase frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
IEnumerable<int>
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
int
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
IGraphNodeBase
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(IEnumerable<int> signal, IGraphNodeBase frame_length, IEnumerable<int> frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
IEnumerable<int>
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
IGraphNodeBase
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
IEnumerable<int>
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(IEnumerable<int> signal, IGraphNodeBase frame_length, int frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
IEnumerable<int>
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
IGraphNodeBase
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
int
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(IEnumerable<int> signal, IGraphNodeBase frame_length, IGraphNodeBase frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
IEnumerable<int>
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
IGraphNodeBase
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
IGraphNodeBase
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(IEnumerable<int> signal, IEnumerable<int> frame_length, IGraphNodeBase frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
IEnumerable<int>
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
IEnumerable<int>
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
IGraphNodeBase
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(int signal, IEnumerable<int> frame_length, IGraphNodeBase frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
int
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
IEnumerable<int>
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
IGraphNodeBase
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(int signal, IEnumerable<int> frame_length, int frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
int
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
IEnumerable<int>
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
int
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(IGraphNodeBase signal, int frame_length, int frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
IGraphNodeBase
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
int
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
int
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(int signal, IGraphNodeBase frame_length, IEnumerable<int> frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
int
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
IGraphNodeBase
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
IEnumerable<int>
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(int signal, int frame_length, IGraphNodeBase frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
int
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
int
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
IGraphNodeBase
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Show Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(int signal, int frame_length, int frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
int
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
int
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
int
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Show Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(IGraphNodeBase signal, int frame_length, IEnumerable<int> frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
IGraphNodeBase
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
int
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
IEnumerable<int>
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Show Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(IGraphNodeBase signal, IEnumerable<int> frame_length, IEnumerable<int> frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
IGraphNodeBase
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
IEnumerable<int>
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
IEnumerable<int>
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Show Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(IGraphNodeBase signal, int frame_length, IGraphNodeBase frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
IGraphNodeBase
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
int
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
IGraphNodeBase
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Show Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(IGraphNodeBase signal, IGraphNodeBase frame_length, IEnumerable<int> frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
IGraphNodeBase
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
IGraphNodeBase
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
IEnumerable<int>
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Show Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(int signal, IGraphNodeBase frame_length, int frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
int
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
IGraphNodeBase
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
int
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Show Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(int signal, IGraphNodeBase frame_length, IGraphNodeBase frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
int
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
IGraphNodeBase
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
IGraphNodeBase
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Show Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(IGraphNodeBase signal, IGraphNodeBase frame_length, IGraphNodeBase frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
IGraphNodeBase
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
IGraphNodeBase
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
IGraphNodeBase
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Show Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(IGraphNodeBase signal, IEnumerable<int> frame_length, IGraphNodeBase frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
IGraphNodeBase
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
IEnumerable<int>
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
IGraphNodeBase
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Show Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(IGraphNodeBase signal, IEnumerable<int> frame_length, int frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
IGraphNodeBase
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
IEnumerable<int>
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
int
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Show Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(int signal, int frame_length, IEnumerable<int> frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
int
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
int
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
IEnumerable<int>
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Show Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor frame(IGraphNodeBase signal, IGraphNodeBase frame_length, int frame_step, bool pad_end, ImplicitContainer<T> pad_value, int axis, string name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
IGraphNodeBase
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
IGraphNodeBase
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
int
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
bool
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
int
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Show Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
object frame_dyn(object signal, object frame_length, object frame_step, ImplicitContainer<T> pad_end, ImplicitContainer<T> pad_value, ImplicitContainer<T> axis, object name)
Expands `signal`'s `axis` dimension into frames of `frame_length`. Slides a window of size `frame_length` over `signal`'s `axis` dimension
with a stride of `frame_step`, replacing the `axis` dimension with
`[frames, frame_length]` frames. If `pad_end` is True, window positions that are past the end of the `axis`
dimension are padded with `pad_value` until the window moves fully past the
end of the dimension. Otherwise, only window positions that fully overlap the
`axis` dimension are produced.
Parameters
-
object
signal - A `[..., samples,...]` `Tensor`. The rank and dimensions may be unknown. Rank must be at least 1.
-
object
frame_length - The frame length in samples. An integer or scalar `Tensor`.
-
object
frame_step - The frame hop size in samples. An integer or scalar `Tensor`.
-
ImplicitContainer<T>
pad_end - Whether to pad the end of `signal` with `pad_value`.
-
ImplicitContainer<T>
pad_value - An optional scalar `Tensor` to use where the input signal does not exist when `pad_end` is True.
-
ImplicitContainer<T>
axis - A scalar integer `Tensor` indicating the axis to frame. Defaults to the last axis. Supports negative values for indexing from the end.
-
object
name - An optional name for the operation.
Returns
-
object
- A `Tensor` of frames with shape `[..., frames, frame_length,...]`.
Show Example
pcm = tf.compat.v1.placeholder(tf.float32, [None, 9152])
frames = tf.signal.frame(pcm, 512, 180)
magspec = tf.abs(tf.signal.rfft(frames, [512]))
image = tf.expand_dims(magspec, 3)
Tensor hamming_window(int window_length, bool periodic, ImplicitContainer<T> dtype, string name)
Generate a [Hamming][hamming] window.
Parameters
-
int
window_length - A scalar `Tensor` indicating the window length to generate.
-
bool
periodic - A bool `Tensor` indicating whether to generate a periodic or symmetric window. Periodic windows are typically used for spectral analysis while symmetric windows are typically used for digital filter design.
-
ImplicitContainer<T>
dtype - The data type to produce. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[window_length]` of type `dtype`.
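The difference between periodic and symmetric windows can be illustrated with the textbook Hamming formula in plain Python. This is a sketch under assumptions: the helper name `hamming` is hypothetical, the coefficients 0.54 and 0.46 are the conventional Hamming values, and TensorFlow's own handling of edge cases (for example odd lengths) may differ from this definition.

```python
import math


def hamming(window_length, periodic=True):
    """Textbook Hamming window (hypothetical helper, not the TF op)."""
    if window_length == 1:
        return [1.0]
    # A symmetric window spans N - 1 intervals so both endpoints match;
    # a periodic window spans N intervals and omits the repeated endpoint.
    denom = window_length if periodic else window_length - 1
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / denom)
            for n in range(window_length)]


symmetric = hamming(8, periodic=False)
periodic = hamming(8, periodic=True)

# The symmetric window starts and ends on the same value.
print(abs(symmetric[0] - symmetric[-1]) < 1e-12)

# The periodic window of length N equals the first N samples of the
# symmetric window of length N + 1.
longer = hamming(9, periodic=False)
print(all(abs(p - s) < 1e-12 for p, s in zip(periodic, longer[:8])))
```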
object hamming_window_dyn(object window_length, ImplicitContainer<T> periodic, ImplicitContainer<T> dtype, object name)
Generate a [Hamming][hamming] window.
Parameters
-
object
window_length - A scalar `Tensor` indicating the window length to generate.
-
ImplicitContainer<T>
periodic - A bool `Tensor` indicating whether to generate a periodic or symmetric window. Periodic windows are typically used for spectral analysis while symmetric windows are typically used for digital filter design.
-
ImplicitContainer<T>
dtype - The data type to produce. Must be a floating point type.
-
object
name - An optional name for the operation.
Returns
-
object
- A `Tensor` of shape `[window_length]` of type `dtype`.
Tensor hann_window(int window_length, bool periodic, ImplicitContainer<T> dtype, string name)
Generate a [Hann window][hann].
Parameters
-
int
window_length - A scalar `Tensor` indicating the window length to generate.
-
bool
periodic - A bool `Tensor` indicating whether to generate a periodic or symmetric window. Periodic windows are typically used for spectral analysis while symmetric windows are typically used for digital filter design.
-
ImplicitContainer<T>
dtype - The data type to produce. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[window_length]` of type `dtype`.
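One reason periodic windows suit spectral analysis is that, under the textbook definition, copies of a periodic Hann window overlapped at half the window length sum to exactly 1, whereas the symmetric variant does not. The sketch below checks this in plain Python; the helpers `hann` and `overlap_sum` are hypothetical names, and TensorFlow's treatment of edge cases may differ from this definition.

```python
import math


def hann(window_length, periodic=True):
    """Textbook Hann window (hypothetical helper, not the TF op)."""
    if window_length == 1:
        return [1.0]
    denom = window_length if periodic else window_length - 1
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / denom)
            for n in range(window_length)]


def overlap_sum(window, hop):
    """Sum of all window copies shifted by multiples of hop, over one hop."""
    return [sum(window[i] for i in range(k, len(window), hop))
            for k in range(hop)]


# Periodic Hann, 50% overlap: every position sums to 1 (up to rounding).
print(overlap_sum(hann(8, periodic=True), 4))

# Symmetric Hann, same overlap: the sums deviate from 1.
print(overlap_sum(hann(8, periodic=False), 4))
```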
Tensor hann_window(IGraphNodeBase window_length, bool periodic, ImplicitContainer<T> dtype, string name)
Generate a [Hann window][hann].
Parameters
-
IGraphNodeBase
window_length - A scalar `Tensor` indicating the window length to generate.
-
bool
periodic - A bool `Tensor` indicating whether to generate a periodic or symmetric window. Periodic windows are typically used for spectral analysis while symmetric windows are typically used for digital filter design.
-
ImplicitContainer<T>
dtype - The data type to produce. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[window_length]` of type `dtype`.
object hann_window_dyn(object window_length, ImplicitContainer<T> periodic, ImplicitContainer<T> dtype, object name)
Generate a [Hann window][hann].
Parameters
-
object
window_length - A scalar `Tensor` indicating the window length to generate.
-
ImplicitContainer<T>
periodic - A bool `Tensor` indicating whether to generate a periodic or symmetric window. Periodic windows are typically used for spectral analysis while symmetric windows are typically used for digital filter design.
-
ImplicitContainer<T>
dtype - The data type to produce. Must be a floating point type.
-
object
name - An optional name for the operation.
Returns
-
object
- A `Tensor` of shape `[window_length]` of type `dtype`.
Tensor idct(IEnumerable<object> input, int type, object n, int axis, string norm, string name)
Computes the 1D [Inverse Discrete Cosine Transform (IDCT)][idct] of `input`. Currently only Types I, II and III are supported. Type III is the inverse of
Type II, and vice versa. Note that if `norm` is not `'ortho'`, you must re-normalize by 1/(2N), where N is the number of samples along the last axis, to obtain an inverse. That is:
`signal == idct(dct(signal)) * 0.5 / signal.shape[-1]`.
When `norm='ortho'`, we have:
`signal == idct(dct(signal, norm='ortho'), norm='ortho')`.
Parameters
-
IEnumerable<object>
input - A `[..., samples]` `float32` `Tensor` containing the signals to take the IDCT of.
-
int
type - The IDCT type to perform. Must be 1, 2 or 3.
-
object
n - For future expansion. The length of the transform. Must be `None`.
-
int
axis - For future expansion. The axis to compute the IDCT along. Must be `-1`.
-
string
norm - The normalization to apply. `None` for no normalization or `'ortho'` for orthonormal normalization.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `[..., samples]` `float32` `Tensor` containing the IDCT of `input`.
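The 1/(2N) re-normalization can be checked with a direct, unoptimized evaluation of the Type II and Type III sums. This is a sketch under an assumption: it uses the common unnormalized scaling conventions (a factor of 2 on the cosine sums) under which the stated factor holds, and the helper names `dct2` and `dct3` are hypothetical rather than part of this API.

```python
import math


def dct2(x):
    """Unnormalized DCT-II evaluated directly from its definition (assumed scaling)."""
    n = len(x)
    return [2.0 * sum(x[j] * math.cos(math.pi * k * (2 * j + 1) / (2 * n))
                      for j in range(n))
            for k in range(n)]


def dct3(x):
    """Unnormalized DCT-III, which serves as the inverse of DCT-II up to scale."""
    n = len(x)
    return [x[0] + 2.0 * sum(x[j] * math.cos(math.pi * j * (2 * k + 1) / (2 * n))
                             for j in range(1, n))
            for k in range(n)]


signal = [1.0, 2.0, 3.0, 4.0]
# Applying DCT-III to the DCT-II output scales the signal by 2N,
# so multiplying by 0.5 / N recovers the original samples.
recovered = [v * 0.5 / len(signal) for v in dct3(dct2(signal))]
print(all(abs(a - b) < 1e-9 for a, b in zip(signal, recovered)))
```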
Tensor idct(object input, int type, object n, int axis, string norm, string name)
Computes the 1D [Inverse Discrete Cosine Transform (IDCT)][idct] of `input`. Currently only Types I, II and III are supported. Type III is the inverse of
Type II, and vice versa. Note that if `norm` is not `'ortho'`, you must re-normalize by 1/(2N), where N is the number of samples along the last axis, to obtain an inverse. That is:
`signal == idct(dct(signal)) * 0.5 / signal.shape[-1]`.
When `norm='ortho'`, we have:
`signal == idct(dct(signal, norm='ortho'), norm='ortho')`.
Parameters
-
object
input - A `[..., samples]` `float32` `Tensor` containing the signals to take the IDCT of.
-
int
type - The IDCT type to perform. Must be 1, 2 or 3.
-
object
n - For future expansion. The length of the transform. Must be `None`.
-
int
axis - For future expansion. The axis to compute the IDCT along. Must be `-1`.
-
string
norm - The normalization to apply. `None` for no normalization or `'ortho'` for orthonormal normalization.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `[..., samples]` `float32` `Tensor` containing the IDCT of `input`.
object idct_dyn(object input, ImplicitContainer<T> type, object n, ImplicitContainer<T> axis, object norm, object name)
Computes the 1D [Inverse Discrete Cosine Transform (IDCT)][idct] of `input`. Currently only Types I, II and III are supported. Type III is the inverse of
Type II, and vice versa. Note that if `norm` is not `'ortho'`, you must re-normalize by 1/(2N), where N is the number of samples along the last axis, to obtain an inverse. That is:
`signal == idct(dct(signal)) * 0.5 / signal.shape[-1]`.
When `norm='ortho'`, we have:
`signal == idct(dct(signal, norm='ortho'), norm='ortho')`.
Parameters
-
object
input - A `[..., samples]` `float32` `Tensor` containing the signals to take the IDCT of.
-
ImplicitContainer<T>
type - The IDCT type to perform. Must be 1, 2 or 3.
-
object
n - For future expansion. The length of the transform. Must be `None`.
-
ImplicitContainer<T>
axis - For future expansion. The axis to compute the IDCT along. Must be `-1`.
-
object
norm - The normalization to apply. `None` for no normalization or `'ortho'` for orthonormal normalization.
-
object
name - An optional name for the operation.
Returns
-
object
- A `[..., samples]` `float32` `Tensor` containing the IDCT of `input`.
Tensor ifftshift(IEnumerable<int> x, int axes, string name)
The inverse of fftshift. Although identical for even-length x,
the functions differ by one sample for odd-length x.
Parameters
-
IEnumerable<int>
x - `Tensor`, input tensor.
-
int
axes - `int` or shape `tuple`. The axes over which to calculate. Defaults to None, which shifts all axes.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor`, the shifted tensor.
Show Example
x = tf.signal.ifftshift([[ 0., 1., 2.],[ 3., 4., -4.],[-3., -2., -1.]])
x.numpy() # array([[ 4., -4., 3.],[-2., -1., -3.],[ 1., 2., 0.]])
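The statement that `fftshift` and `ifftshift` agree for even lengths but differ by one sample for odd lengths can be seen with a 1-D sketch in plain Python. The helpers below are hypothetical list-based stand-ins that assume the usual definition (a circular shift by `n // 2` in opposite directions); they are not the TensorFlow ops.

```python
def roll(x, shift):
    """Circularly shift a list to the right by shift positions."""
    n = len(x)
    if n == 0:
        return []
    shift %= n
    return x[-shift:] + x[:-shift] if shift else list(x)


def fftshift_1d(x):
    # Assumed definition: shift right by n // 2.
    return roll(x, len(x) // 2)


def ifftshift_1d(x):
    # Assumed definition: shift left by n // 2, undoing fftshift_1d.
    return roll(x, -(len(x) // 2))


even = [0, 1, -2, -1]
odd = [0, 1, 2, -2, -1]

print(fftshift_1d(even), ifftshift_1d(even))
# [-2, -1, 0, 1] [-2, -1, 0, 1]
print(fftshift_1d(odd), ifftshift_1d(odd))
# [-2, -1, 0, 1, 2] [2, -2, -1, 0, 1]
print(ifftshift_1d(fftshift_1d(odd)) == odd)
# True
```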
Tensor ifftshift(IEnumerable<int> x, Nullable<ValueTuple<int, int>> axes, string name)
The inverse of fftshift. Although identical for even-length x,
the functions differ by one sample for odd-length x.
Parameters
-
IEnumerable<int>
x - `Tensor`, input tensor.
-
Nullable<ValueTuple<int, int>>
axes - `int` or shape `tuple`. The axes over which to calculate. Defaults to None, which shifts all axes.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor`, the shifted tensor.
Show Example
x = tf.signal.ifftshift([[ 0., 1., 2.],[ 3., 4., -4.],[-3., -2., -1.]])
x.numpy() # array([[ 4., -4., 3.],[-2., -1., -3.],[ 1., 2., 0.]])
object ifftshift_dyn(object x, object axes, object name)
The inverse of fftshift. Although identical for even-length x,
the functions differ by one sample for odd-length x.
Parameters
-
object
x - `Tensor`, input tensor.
-
object
axes - `int` or shape `tuple`. The axes over which to calculate. Defaults to None, which shifts all axes.
-
object
name - An optional name for the operation.
Returns
-
object
- A `Tensor`, the shifted tensor.
Show Example
x = tf.signal.ifftshift([[ 0., 1., 2.],[ 3., 4., -4.],[-3., -2., -1.]])
x.numpy() # array([[ 4., -4., 3.],[-2., -1., -3.],[ 1., 2., 0.]])
Tensor inverse_stft(IGraphNodeBase stfts, int frame_length, int frame_step, Nullable<int> fft_length, ImplicitContainer<T> window_fn, string name)
Computes the inverse [Short-time Fourier Transform][stft] of `stfts`. To reconstruct an original waveform, a complementary window function should
be used in inverse_stft. Such a window function can be constructed with
tf.signal.inverse_stft_window_fn.
If a custom window_fn is used in stft, it must be passed to
inverse_stft_window_fn.
Implemented with GPU-compatible ops and supports gradients.
Parameters
-
IGraphNodeBase
stfts - A `complex64` `[..., frames, fft_unique_bins]` `Tensor` of STFT bins representing a batch of `fft_length`-point STFTs where `fft_unique_bins` is `fft_length // 2 + 1`
-
int
frame_length - An integer scalar `Tensor`. The window length in samples.
-
int
frame_step - An integer scalar `Tensor`. The number of samples to step.
-
Nullable<int>
fft_length - An integer scalar `Tensor`. The size of the FFT that produced `stfts`. If not provided, uses the smallest power of 2 enclosing `frame_length`.
-
ImplicitContainer<T>
window_fn - A callable that takes a window length and a `dtype` keyword argument and returns a `[window_length]` `Tensor` of samples in the provided datatype. If set to `None`, no windowing is used.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `[..., samples]` `Tensor` of `float32` signals representing the inverse STFT for each input STFT in `stfts`.
Show Example
frame_length = 400
frame_step = 160
waveform = tf.compat.v1.placeholder(dtype=tf.float32, shape=[1000])
stft = tf.signal.stft(waveform, frame_length, frame_step)
inverse_stft = tf.signal.inverse_stft(
    stft, frame_length, frame_step,
    window_fn=tf.signal.inverse_stft_window_fn(frame_step))
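The `fft_length` default and the `fft_unique_bins` relation from the parameter list can be worked through in plain Python. This is only an arithmetic sketch: `enclosing_power_of_two` and `stft_bin_count` are hypothetical helper names, not part of the library.

```python
def enclosing_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return 1 << (n - 1).bit_length()


def stft_bin_count(frame_length, fft_length=None):
    # When fft_length is omitted, fall back to the documented default.
    if fft_length is None:
        fft_length = enclosing_power_of_two(frame_length)
    # Only the non-redundant half of the spectrum is kept.
    return fft_length, fft_length // 2 + 1


print(stft_bin_count(400))       # (512, 257)
print(stft_bin_count(400, 400))  # (400, 201)
```

With the example's `frame_length = 400` and no explicit `fft_length`, each frame of `stfts` would therefore be expected to hold 257 bins.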
object inverse_stft_dyn(object stfts, object frame_length, object frame_step, object fft_length, ImplicitContainer<T> window_fn, object name)
Computes the inverse [Short-time Fourier Transform][stft] of `stfts`. To reconstruct an original waveform, a complementary window function should
be used in inverse_stft. Such a window function can be constructed with
tf.signal.inverse_stft_window_fn, as shown in the example for this method.
If a custom window_fn is used in stft, it must also be passed to
inverse_stft_window_fn as `forward_window_fn`.
Implemented with GPU-compatible ops and supports gradients.
Parameters
-
object
stfts - A `complex64` `[..., frames, fft_unique_bins]` `Tensor` of STFT bins representing a batch of `fft_length`-point STFTs where `fft_unique_bins` is `fft_length // 2 + 1`
-
object
frame_length - An integer scalar `Tensor`. The window length in samples.
-
object
frame_step - An integer scalar `Tensor`. The number of samples to step.
-
object
fft_length - An integer scalar `Tensor`. The size of the FFT that produced `stfts`. If not provided, uses the smallest power of 2 enclosing `frame_length`.
-
ImplicitContainer<T>
window_fn - A callable that takes a window length and a `dtype` keyword argument and returns a `[window_length]` `Tensor` of samples in the provided datatype. If set to `None`, no windowing is used.
-
object
name - An optional name for the operation.
Returns
-
object
- A `[..., samples]` `Tensor` of `float32` signals representing the inverse STFT for each input STFT in `stfts`.
Show Example
frame_length = 400
frame_step = 160
waveform = tf.compat.v1.placeholder(dtype=tf.float32, shape=[1000])
stft = tf.signal.stft(waveform, frame_length, frame_step)
inverse_stft = tf.signal.inverse_stft(
    stft, frame_length, frame_step,
    window_fn=tf.signal.inverse_stft_window_fn(frame_step))
object inverse_stft_window_fn(int frame_step, ImplicitContainer<T> forward_window_fn, string name)
Generates a window function that can be used in `inverse_stft`. Constructs a window that is equal to the forward window with a further
pointwise amplitude correction. `inverse_stft_window_fn` is equivalent to
`forward_window_fn` in the case where it would produce an exact inverse. See examples in `inverse_stft` documentation for usage.
Parameters
-
int
frame_step - An integer scalar `Tensor`. The number of samples to step.
-
ImplicitContainer<T>
forward_window_fn - window_fn used in the forward transform, `stft`.
-
string
name - An optional name for the operation.
Returns
-
object
- A callable that takes a window length and a `dtype` keyword argument and returns a `[window_length]` `Tensor` of samples in the provided datatype. The returned window is suitable for reconstructing the original waveform in inverse_stft.
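The "pointwise amplitude correction" is not spelled out on this page. One common form, shown here as an assumption rather than as the library's implementation, divides each window sample by the sum of squared forward-window values that overlap at the same output position. The helpers `periodic_hann` and `corrected_inverse_window` are hypothetical and use only the Python standard library.

```python
import math


def periodic_hann(length):
    # Periodic Hann window, used here only as an example forward window.
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / length) for n in range(length)]


def corrected_inverse_window(forward_window, frame_step):
    """Divide each sample by the summed squared window over overlapping frames."""
    denom = [0.0] * frame_step
    # Samples whose indices agree modulo frame_step land on the same output
    # position when frames are overlap-added, so they share one denominator.
    for n, w in enumerate(forward_window):
        denom[n % frame_step] += w * w
    return [w / denom[n % frame_step] for n, w in enumerate(forward_window)]


frame_length, frame_step = 8, 2
forward = periodic_hann(frame_length)
inverse = corrected_inverse_window(forward, frame_step)

# In the fully overlapped region, forward * inverse sums to 1 at every phase.
for phase in range(frame_step):
    total = sum(forward[n] * inverse[n] for n in range(phase, frame_length, frame_step))
    print(phase, round(total, 6))
```

Under this assumption the corrected window equals the forward window whenever every denominator is already 1, which matches the statement that the two are equivalent when the forward window alone would give an exact inverse.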
object inverse_stft_window_fn(IGraphNodeBase frame_step, ImplicitContainer<T> forward_window_fn, string name)
Generates a window function that can be used in `inverse_stft`. Constructs a window that is equal to the forward window with a further
pointwise amplitude correction. `inverse_stft_window_fn` is equivalent to
`forward_window_fn` in the case where it would produce an exact inverse. See examples in `inverse_stft` documentation for usage.
Parameters
-
IGraphNodeBase
frame_step - An integer scalar `Tensor`. The number of samples to step.
-
ImplicitContainer<T>
forward_window_fn - window_fn used in the forward transform, `stft`.
-
string
name - An optional name for the operation.
Returns
-
object
- A callable that takes a window length and a `dtype` keyword argument and returns a `[window_length]` `Tensor` of samples in the provided datatype. The returned window is suitable for reconstructing the original waveform in inverse_stft.
object inverse_stft_window_fn_dyn(object frame_step, ImplicitContainer<T> forward_window_fn, object name)
Generates a window function that can be used in `inverse_stft`. Constructs a window that is equal to the forward window with a further
pointwise amplitude correction. `inverse_stft_window_fn` is equivalent to
`forward_window_fn` in the case where it would produce an exact inverse. See examples in `inverse_stft` documentation for usage.
Parameters
-
object
frame_step - An integer scalar `Tensor`. The number of samples to step.
-
ImplicitContainer<T>
forward_window_fn - window_fn used in the forward transform, `stft`.
-
object
name - An optional name for the operation.
Returns
-
object
- A callable that takes a window length and a `dtype` keyword argument and returns a `[window_length]` `Tensor` of samples in the provided datatype. The returned window is suitable for reconstructing the original waveform in inverse_stft.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, IGraphNodeBase num_spectrogram_bins, IGraphNodeBase sample_rate, int lower_edge_hertz, int upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
IGraphNodeBase
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
IGraphNodeBase
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
int
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
int
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
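The shape relationship between `S`, `A` and `M` can be illustrated without TensorFlow. The sketch below builds a triangular weight matrix in plain Python and right-multiplies a made-up spectrogram by it. It assumes the HTK-style mel formula `1127 * ln(1 + f / 700)` and evenly spaced triangular bands, neither of which is stated on this page; `hertz_to_mel`, `mel_weight_matrix` and `matmul` are hypothetical helpers, and the numeric values are not claimed to match the library's output.

```python
import math


def hertz_to_mel(hz):
    # HTK-style mel formula; an assumption, this page does not state the formula.
    return 1127.0 * math.log(1.0 + hz / 700.0)


def mel_weight_matrix(num_mel_bins, num_spectrogram_bins, sample_rate,
                      lower_edge_hertz, upper_edge_hertz):
    """Return a [num_spectrogram_bins][num_mel_bins] list of triangular weights."""
    nyquist = sample_rate / 2.0
    # Centre frequency of each linear bin, from 0 Hz up to Nyquist.
    bin_hz = [nyquist * i / (num_spectrogram_bins - 1)
              for i in range(num_spectrogram_bins)]
    # num_mel_bins triangles need num_mel_bins + 2 edges, evenly spaced in mel.
    lo, hi = hertz_to_mel(lower_edge_hertz), hertz_to_mel(upper_edge_hertz)
    edges = [lo + (hi - lo) * k / (num_mel_bins + 1)
             for k in range(num_mel_bins + 2)]

    matrix = []
    for hz in bin_hz:
        mel = hertz_to_mel(hz)
        row = []
        for m in range(num_mel_bins):
            left, center, right = edges[m], edges[m + 1], edges[m + 2]
            rising = (mel - left) / (center - left)
            falling = (right - mel) / (right - center)
            row.append(max(0.0, min(rising, falling)))
        matrix.append(row)
    return matrix


def matmul(S, A):
    # Plain-Python equivalent of M = S @ A for nested lists.
    return [[sum(s * a for s, a in zip(row, col)) for col in zip(*A)] for row in S]


A = mel_weight_matrix(num_mel_bins=4, num_spectrogram_bins=9, sample_rate=16000,
                      lower_edge_hertz=125.0, upper_edge_hertz=3800.0)
S = [[1.0] * 9, [0.5] * 9]  # two frames of made-up magnitudes
M = matmul(S, A)
print(len(A), len(A[0]))  # 9 4
print(len(M), len(M[0]))  # 2 4
```

Here `num_spectrogram_bins=9` corresponds to a 16-point FFT (`16 // 2 + 1`), and the result `M` has one row per frame and one column per mel band.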
Tensor linear_to_mel_weight_matrix(int num_mel_bins, IGraphNodeBase num_spectrogram_bins, IGraphNodeBase sample_rate, int lower_edge_hertz, double upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
IGraphNodeBase
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
IGraphNodeBase
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
int
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
double
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, IGraphNodeBase num_spectrogram_bins, IGraphNodeBase sample_rate, double lower_edge_hertz, int upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
IGraphNodeBase
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
IGraphNodeBase
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
double
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
int
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, IGraphNodeBase num_spectrogram_bins, IGraphNodeBase sample_rate, double lower_edge_hertz, double upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
IGraphNodeBase
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
IGraphNodeBase
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
double
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
double
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, IGraphNodeBase num_spectrogram_bins, int sample_rate, int lower_edge_hertz, int upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
IGraphNodeBase
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
int
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
int
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
int
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, IGraphNodeBase num_spectrogram_bins, int sample_rate, int lower_edge_hertz, double upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
IGraphNodeBase
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
int
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
int
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
double
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, IGraphNodeBase num_spectrogram_bins, int sample_rate, double lower_edge_hertz, int upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
IGraphNodeBase
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
int
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
double
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
int
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, int num_spectrogram_bins, IGraphNodeBase sample_rate, double lower_edge_hertz, int upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
int
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
IGraphNodeBase
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
double
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
int
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, IGraphNodeBase num_spectrogram_bins, double sample_rate, int lower_edge_hertz, int upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
IGraphNodeBase
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
double
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
int
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
int
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, int num_spectrogram_bins, double sample_rate, double lower_edge_hertz, int upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
int
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
double
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
double
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
int
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, int num_spectrogram_bins, double sample_rate, int lower_edge_hertz, double upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
int
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
double
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
int
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
double
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, int num_spectrogram_bins, double sample_rate, int lower_edge_hertz, int upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
int
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
double
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
int
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
int
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
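To make the idea of triangular mel bands more concrete, here is a rough pure-Python sketch of how such a weight matrix could be built. It is not the library's implementation: the function names are hypothetical, the HTK-style formula `1127 * ln(1 + f / 700)` is assumed for the Hz-to-mel conversion, and details such as special handling of the DC bin are ignored.

```python
import math


def hertz_to_mel(frequency_hertz):
    # Assumption: HTK-style mel formula.
    return 1127.0 * math.log(1.0 + frequency_hertz / 700.0)


def mel_weight_matrix(num_mel_bins, num_spectrogram_bins, sample_rate,
                      lower_edge_hertz, upper_edge_hertz):
    """Hypothetical sketch: returns a [num_spectrogram_bins][num_mel_bins] list."""
    nyquist = sample_rate / 2.0
    # Mel value of each linearly spaced spectrogram bin.
    bin_mels = [hertz_to_mel(nyquist * i / (num_spectrogram_bins - 1))
                for i in range(num_spectrogram_bins)]
    lo = hertz_to_mel(lower_edge_hertz)
    hi = hertz_to_mel(upper_edge_hertz)
    # num_mel_bins + 2 band edges, equally spaced on the mel scale.
    edges = [lo + (hi - lo) * j / (num_mel_bins + 1)
             for j in range(num_mel_bins + 2)]
    weights = [[0.0] * num_mel_bins for _ in range(num_spectrogram_bins)]
    for i, mel in enumerate(bin_mels):
        for j in range(num_mel_bins):
            left, center, right = edges[j], edges[j + 1], edges[j + 2]
            rising = (mel - left) / (center - left)
            falling = (right - mel) / (right - center)
            # Triangle: rises from the left edge to the center, then falls.
            weights[i][j] = max(0.0, min(rising, falling))
    return weights


A = mel_weight_matrix(4, 9, 16000.0, 80.0, 7600.0)
print(len(A), len(A[0]))
```

The result has one row per spectrogram bin and one column per mel band, matching the `[num_spectrogram_bins, num_mel_bins]` shape documented above.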
Tensor linear_to_mel_weight_matrix(int num_mel_bins, int num_spectrogram_bins, int sample_rate, double lower_edge_hertz, double upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
int
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
int
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
double
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
double
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
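The `tf.tensordot(S, A, 1)` call above contracts the last axis of `S` with the first axis of `A`, leaving all leading axes untouched. The following pure-Python sketch mimics that contraction on nested lists; `apply_mel_matrix` is a hypothetical name used only for illustration.

```python
def apply_mel_matrix(S, A):
    """Contract the last axis of nested-list S with the first axis of A."""
    if isinstance(S[0], (int, float)):
        # S is a single spectrum of length num_spectrogram_bins.
        num_mel_bins = len(A[0])
        return [sum(s * A[k][j] for k, s in enumerate(S))
                for j in range(num_mel_bins)]
    # Otherwise recurse over the leading axis.
    return [apply_mel_matrix(sub, A) for sub in S]


# A has shape [3, 2]: 3 spectrogram bins, 2 mel bins.
A = [[1.0, 0.0],
     [0.5, 0.5],
     [0.0, 1.0]]
# S has shape [1, 2, 3]: batch, frames, spectrogram bins.
S = [[[1.0, 2.0, 3.0],
      [0.0, 0.0, 1.0]]]
M = apply_mel_matrix(S, A)
print(M)  # M has shape [1, 2, 2]
```

For rank-2 input this reduces to the ordinary matrix product shown with `tf.matmul`.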
Tensor linear_to_mel_weight_matrix(int num_mel_bins, int num_spectrogram_bins, int sample_rate, double lower_edge_hertz, int upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
int
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
int
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
double
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
int
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, int num_spectrogram_bins, int sample_rate, int lower_edge_hertz, double upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
int
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
int
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
int
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
double
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, int num_spectrogram_bins, int sample_rate, int lower_edge_hertz, int upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
int
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
int
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
int
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
int
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, int num_spectrogram_bins, IGraphNodeBase sample_rate, double lower_edge_hertz, double upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
int
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
IGraphNodeBase
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
double
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
double
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, IGraphNodeBase num_spectrogram_bins, int sample_rate, double lower_edge_hertz, double upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
IGraphNodeBase
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
int
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
double
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
double
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, int num_spectrogram_bins, double sample_rate, double lower_edge_hertz, double upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
int
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
double
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
double
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
double
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, int num_spectrogram_bins, IGraphNodeBase sample_rate, int lower_edge_hertz, double upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
int
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
IGraphNodeBase
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
int
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
double
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, int num_spectrogram_bins, IGraphNodeBase sample_rate, int lower_edge_hertz, int upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
int
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
IGraphNodeBase
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
int
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
int
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, IGraphNodeBase num_spectrogram_bins, double sample_rate, double lower_edge_hertz, double upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
IGraphNodeBase
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
double
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
double
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
double
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, IGraphNodeBase num_spectrogram_bins, double sample_rate, double lower_edge_hertz, int upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
IGraphNodeBase
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
double
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
double
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
int
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
Tensor linear_to_mel_weight_matrix(int num_mel_bins, IGraphNodeBase num_spectrogram_bins, double sample_rate, int lower_edge_hertz, double upper_edge_hertz, ImplicitContainer<T> dtype, string name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
int
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
IGraphNodeBase
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
double
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
int
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
double
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
object linear_to_mel_weight_matrix_dyn(ImplicitContainer<T> num_mel_bins, ImplicitContainer<T> num_spectrogram_bins, ImplicitContainer<T> sample_rate, ImplicitContainer<T> lower_edge_hertz, ImplicitContainer<T> upper_edge_hertz, ImplicitContainer<T> dtype, object name)
Returns a matrix to warp linear scale spectrograms to the [mel scale][mel]. Returns a weight matrix that can be used to re-weight a `Tensor` containing
`num_spectrogram_bins` linearly sampled frequency information from
`[0, sample_rate / 2]` into `num_mel_bins` frequency information from
`[lower_edge_hertz, upper_edge_hertz]` on the [mel scale][mel]. For example, the returned matrix `A` can be used to right-multiply a
spectrogram `S` of shape `[frames, num_spectrogram_bins]` of linear
scale spectrum values (e.g. STFT magnitudes) to generate a "mel spectrogram"
`M` of shape `[frames, num_mel_bins]`.
# `S` has shape [frames, num_spectrogram_bins]
# `M` has shape [frames, num_mel_bins]
M = tf.matmul(S, A)
The matrix can be used with tf.tensordot to convert an arbitrary rank
`Tensor` of linear-scale spectral bins into the mel scale.
# S has shape [..., num_spectrogram_bins].
# M has shape [..., num_mel_bins].
M = tf.tensordot(S, A, 1)
# tf.tensordot does not support shape inference for this case yet.
M.set_shape(S.shape[:-1].concatenate(A.shape[-1:]))
Parameters
-
ImplicitContainer<T>
num_mel_bins - Python int. How many bands in the resulting mel spectrum.
-
ImplicitContainer<T>
num_spectrogram_bins - An integer `Tensor`. How many bins there are in the source spectrogram data, which is understood to be `fft_size // 2 + 1`, i.e. the spectrogram only contains the nonredundant FFT bins.
-
ImplicitContainer<T>
sample_rate - Python float. Samples per second of the input signal used to create the spectrogram. We need this to figure out the actual frequencies for each spectrogram bin, which dictates how they are mapped into the mel scale.
-
ImplicitContainer<T>
lower_edge_hertz - Python float. Lower bound on the frequencies to be included in the mel spectrum. This corresponds to the lower edge of the lowest triangular band.
-
ImplicitContainer<T>
upper_edge_hertz - Python float. The desired top edge of the highest frequency band.
-
ImplicitContainer<T>
dtype - The `DType` of the result matrix. Must be a floating point type.
-
object
name - An optional name for the operation.
Returns
-
object
- A `Tensor` of shape `[num_spectrogram_bins, num_mel_bins]`.
object mfccs_from_log_mel_spectrograms(IGraphNodeBase log_mel_spectrograms, string name)
Computes [MFCCs][mfcc] of `log_mel_spectrograms`. Implemented with GPU-compatible ops and supports gradients.
[Mel-Frequency Cepstral Coefficient (MFCC)][mfcc] calculation consists of taking the DCT-II of a log-magnitude mel-scale spectrogram. [HTK][htk]'s MFCCs use a particular scaling of the DCT-II which is almost orthogonal normalization. We follow this convention.
All `num_mel_bins` MFCCs are returned and it is up to the caller to select a subset of the MFCCs based on their application. For example, it is typical to only use the first few for speech recognition, as this results in an approximately pitch-invariant representation of the signal.
Parameters
-
IGraphNodeBase
log_mel_spectrograms - A `[..., num_mel_bins]` `float32` `Tensor` of log-magnitude mel-scale spectrograms.
-
string
name - An optional name for the operation.
Returns
-
object
- A `[..., num_mel_bins]` `float32` `Tensor` of the MFCCs of `log_mel_spectrograms`.
Show Example
sample_rate = 16000.0
# A Tensor of [batch_size, num_samples] mono PCM samples in the range [-1, 1].
pcm = tf.compat.v1.placeholder(tf.float32, [None, None])
# A 1024-point STFT with frames of 64 ms and 75% overlap.
stfts = tf.signal.stft(pcm, frame_length=1024, frame_step=256,
                       fft_length=1024)
spectrograms = tf.abs(stfts)
# Warp the linear scale spectrograms into the mel-scale.
num_spectrogram_bins = stfts.shape[-1].value
lower_edge_hertz, upper_edge_hertz, num_mel_bins = 80.0, 7600.0, 80
linear_to_mel_weight_matrix = tf.signal.linear_to_mel_weight_matrix(
    num_mel_bins, num_spectrogram_bins, sample_rate, lower_edge_hertz,
    upper_edge_hertz)
mel_spectrograms = tf.tensordot(
    spectrograms, linear_to_mel_weight_matrix, 1)
mel_spectrograms.set_shape(spectrograms.shape[:-1].concatenate(
    linear_to_mel_weight_matrix.shape[-1:]))
# Compute a stabilized log to get log-magnitude mel-scale spectrograms.
log_mel_spectrograms = tf.math.log(mel_spectrograms + 1e-6)
# Compute MFCCs from log_mel_spectrograms and take the first 13.
mfccs = tf.signal.mfccs_from_log_mel_spectrograms(
    log_mel_spectrograms)[..., :13]
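For intuition about what the final step computes, the following plain-Python sketch applies a DCT-II to a single log-mel frame. It is not the library's implementation: the function name is hypothetical, and the `sqrt(2 / N)` scaling is an assumption about the HTK-style convention mentioned above (a fully orthonormal DCT-II would additionally scale coefficient 0 by `1 / sqrt(2)`).

```python
import math

def mfccs_sketch(log_mel_frame):
    """DCT-II of one frame, scaled by sqrt(2 / N) (assumed HTK-style scaling)."""
    n_bins = len(log_mel_frame)
    scale = math.sqrt(2.0 / n_bins)
    return [
        scale * sum(
            x * math.cos(math.pi * k * (2 * n + 1) / (2.0 * n_bins))
            for n, x in enumerate(log_mel_frame)
        )
        for k in range(n_bins)
    ]

# A flat log-mel frame puts all of its energy in coefficient 0.
coeffs = mfccs_sketch([1.0] * 8)
first_few = coeffs[:3]  # the caller picks a subset of the coefficients
```

The sketch returns as many coefficients as there are mel bins, mirroring the note that selecting a subset is left to the caller.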
object mfccs_from_log_mel_spectrograms_dyn(object log_mel_spectrograms, object name)
Computes [MFCCs][mfcc] of `log_mel_spectrograms`. Implemented with GPU-compatible ops and supports gradients.
[Mel-Frequency Cepstral Coefficient (MFCC)][mfcc] calculation consists of taking the DCT-II of a log-magnitude mel-scale spectrogram. [HTK][htk]'s MFCCs use a particular scaling of the DCT-II which is almost orthogonal normalization. We follow this convention.
All `num_mel_bins` MFCCs are returned and it is up to the caller to select a subset of the MFCCs based on their application. For example, it is typical to only use the first few for speech recognition, as this results in an approximately pitch-invariant representation of the signal.
Parameters
-
object
log_mel_spectrograms - A `[..., num_mel_bins]` `float32` `Tensor` of log-magnitude mel-scale spectrograms.
-
object
name - An optional name for the operation.
Returns
-
object
- A `[..., num_mel_bins]` `float32` `Tensor` of the MFCCs of `log_mel_spectrograms`.
Show Example
sample_rate = 16000.0
# A Tensor of [batch_size, num_samples] mono PCM samples in the range [-1, 1].
pcm = tf.compat.v1.placeholder(tf.float32, [None, None])
# A 1024-point STFT with frames of 64 ms and 75% overlap.
stfts = tf.signal.stft(pcm, frame_length=1024, frame_step=256,
                       fft_length=1024)
spectrograms = tf.abs(stfts)
# Warp the linear scale spectrograms into the mel-scale.
num_spectrogram_bins = stfts.shape[-1].value
lower_edge_hertz, upper_edge_hertz, num_mel_bins = 80.0, 7600.0, 80
linear_to_mel_weight_matrix = tf.signal.linear_to_mel_weight_matrix(
    num_mel_bins, num_spectrogram_bins, sample_rate, lower_edge_hertz,
    upper_edge_hertz)
mel_spectrograms = tf.tensordot(
    spectrograms, linear_to_mel_weight_matrix, 1)
mel_spectrograms.set_shape(spectrograms.shape[:-1].concatenate(
    linear_to_mel_weight_matrix.shape[-1:]))
# Compute a stabilized log to get log-magnitude mel-scale spectrograms.
log_mel_spectrograms = tf.math.log(mel_spectrograms + 1e-6)
# Compute MFCCs from log_mel_spectrograms and take the first 13.
mfccs = tf.signal.mfccs_from_log_mel_spectrograms(
    log_mel_spectrograms)[..., :13]
Tensor overlap_and_add(ndarray signal, IGraphNodeBase frame_step, string name)
Reconstructs a signal from a framed representation.
Adds potentially overlapping frames of a signal with shape `[..., frames, frame_length]`, offsetting subsequent frames by `frame_step`. The resulting tensor has shape `[..., output_size]` where `output_size = (frames - 1) * frame_step + frame_length`.
Parameters
-
ndarray
signal - A `[..., frames, frame_length]` `Tensor`. All dimensions may be unknown, and rank must be at least 2.
-
IGraphNodeBase
frame_step - An integer or scalar `Tensor` denoting overlap offsets. Must be less than or equal to `frame_length`.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` with shape `[..., output_size]` containing the overlap-added frames of `signal`'s inner-most two dimensions.
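The overlap-add operation described above can be sketched for a single 2-D input in plain Python. This is a minimal illustration of the arithmetic under the stated shape formula, not the library's implementation; the function name and sample values are hypothetical.

```python
def overlap_and_add_sketch(frames, frame_step):
    """Sum a [frames, frame_length] nested list into one 1-D signal."""
    num_frames = len(frames)
    frame_length = len(frames[0])
    output_size = (num_frames - 1) * frame_step + frame_length
    output = [0.0] * output_size
    for i, frame in enumerate(frames):
        offset = i * frame_step  # each frame starts frame_step samples later
        for j, sample in enumerate(frame):
            output[offset + j] += sample
    return output

frames = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]
signal = overlap_and_add_sketch(frames, frame_step=2)
```

With `frame_length = 3` and `frame_step = 2`, the last sample of the first frame and the first sample of the second frame land on the same output index and are summed.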
Tensor overlap_and_add(IGraphNodeBase signal, IGraphNodeBase frame_step, string name)
Reconstructs a signal from a framed representation.
Adds potentially overlapping frames of a signal with shape `[..., frames, frame_length]`, offsetting subsequent frames by `frame_step`. The resulting tensor has shape `[..., output_size]` where `output_size = (frames - 1) * frame_step + frame_length`.
Parameters
-
IGraphNodeBase
signal - A `[..., frames, frame_length]` `Tensor`. All dimensions may be unknown, and rank must be at least 2.
-
IGraphNodeBase
frame_step - An integer or scalar `Tensor` denoting overlap offsets. Must be less than or equal to `frame_length`.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` with shape `[..., output_size]` containing the overlap-added frames of `signal`'s inner-most two dimensions.
Tensor overlap_and_add(ndarray signal, int frame_step, string name)
Reconstructs a signal from a framed representation.
Adds potentially overlapping frames of a signal with shape `[..., frames, frame_length]`, offsetting subsequent frames by `frame_step`. The resulting tensor has shape `[..., output_size]` where `output_size = (frames - 1) * frame_step + frame_length`.
Parameters
-
ndarray
signal - A `[..., frames, frame_length]` `Tensor`. All dimensions may be unknown, and rank must be at least 2.
-
int
frame_step - An integer or scalar `Tensor` denoting overlap offsets. Must be less than or equal to `frame_length`.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` with shape `[..., output_size]` containing the overlap-added frames of `signal`'s inner-most two dimensions.
Tensor overlap_and_add(IGraphNodeBase signal, int frame_step, string name)
Reconstructs a signal from a framed representation.
Adds potentially overlapping frames of a signal with shape `[..., frames, frame_length]`, offsetting subsequent frames by `frame_step`. The resulting tensor has shape `[..., output_size]` where `output_size = (frames - 1) * frame_step + frame_length`.
Parameters
-
IGraphNodeBase
signal - A `[..., frames, frame_length]` `Tensor`. All dimensions may be unknown, and rank must be at least 2.
-
int
frame_step - An integer or scalar `Tensor` denoting overlap offsets. Must be less than or equal to `frame_length`.
-
string
name - An optional name for the operation.
Returns
-
Tensor
- A `Tensor` with shape `[..., output_size]` containing the overlap-added frames of `signal`'s inner-most two dimensions.
object overlap_and_add_dyn(object signal, object frame_step, object name)
Reconstructs a signal from a framed representation.
Adds potentially overlapping frames of a signal with shape `[..., frames, frame_length]`, offsetting subsequent frames by `frame_step`. The resulting tensor has shape `[..., output_size]` where `output_size = (frames - 1) * frame_step + frame_length`.
Parameters
-
object
signal - A `[..., frames, frame_length]` `Tensor`. All dimensions may be unknown, and rank must be at least 2.
-
object
frame_step - An integer or scalar `Tensor` denoting overlap offsets. Must be less than or equal to `frame_length`.
-
object
name - An optional name for the operation.
Returns
-
object
- A `Tensor` with shape `[..., output_size]` containing the overlap-added frames of `signal`'s inner-most two dimensions.
object stft(IGraphNodeBase signals, int frame_length, int frame_step, Nullable<int> fft_length, ImplicitContainer<T> window_fn, bool pad_end, string name)
Computes the [Short-time Fourier Transform][stft] of `signals`. Implemented with GPU-compatible ops and supports gradients.
Parameters
-
IGraphNodeBase
signals - A `[..., samples]` `float32` `Tensor` of real-valued signals.
-
int
frame_length - An integer scalar `Tensor`. The window length in samples.
-
int
frame_step - An integer scalar `Tensor`. The number of samples to step.
-
Nullable<int>
fft_length - An integer scalar `Tensor`. The size of the FFT to apply. If not provided, uses the smallest power of 2 enclosing `frame_length`.
-
ImplicitContainer<T>
window_fn - A callable that takes a window length and a `dtype` keyword argument and returns a `[window_length]` `Tensor` of samples in the provided datatype. If set to `None`, no windowing is used.
-
bool
pad_end - Whether to pad the end of `signals` with zeros when the provided frame length and step produce a frame that lies partially past its end.
-
string
name - An optional name for the operation.
Returns
-
object
- A `[..., frames, fft_unique_bins]` `Tensor` of `complex64` STFT values where `fft_unique_bins` is `fft_length // 2 + 1` (the unique components of the FFT).
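The output shape can be worked out by hand from the parameters. The plain-Python sketch below does that arithmetic for one 1-D signal; the helper names are hypothetical, and the frame-count rules for `pad_end` are assumptions based on the parameter description above rather than a statement of the library's exact behavior.

```python
def enclosing_power_of_two(n):
    """Smallest power of 2 that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()

def stft_output_shape(num_samples, frame_length, frame_step,
                      fft_length=None, pad_end=False):
    """Return (frames, fft_unique_bins) for a single 1-D signal."""
    if fft_length is None:
        fft_length = enclosing_power_of_two(frame_length)
    if pad_end:
        # Assumed: every step that starts inside the signal yields a frame.
        frames = -(-num_samples // frame_step)  # ceiling division
    else:
        # Assumed: only frames that fit entirely inside the signal are kept.
        frames = max(0, 1 + (num_samples - frame_length) // frame_step)
    return frames, fft_length // 2 + 1

# One second of 16 kHz audio, 1024-sample frames, 256-sample step.
shape_no_pad = stft_output_shape(16000, 1024, 256)
shape_padded = stft_output_shape(16000, 1024, 256, pad_end=True)
```

Omitting `fft_length` with a `frame_length` of 400 would, under these assumptions, select a 512-point FFT and therefore 257 unique bins.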
object stft_dyn(object signals, object frame_length, object frame_step, object fft_length, ImplicitContainer<T> window_fn, ImplicitContainer<T> pad_end, object name)
Computes the [Short-time Fourier Transform][stft] of `signals`. Implemented with GPU-compatible ops and supports gradients.
Parameters
-
object
signals - A `[..., samples]` `float32` `Tensor` of real-valued signals.
-
object
frame_length - An integer scalar `Tensor`. The window length in samples.
-
object
frame_step - An integer scalar `Tensor`. The number of samples to step.
-
object
fft_length - An integer scalar `Tensor`. The size of the FFT to apply. If not provided, uses the smallest power of 2 enclosing `frame_length`.
-
ImplicitContainer<T>
window_fn - A callable that takes a window length and a `dtype` keyword argument and returns a `[window_length]` `Tensor` of samples in the provided datatype. If set to `None`, no windowing is used.
-
ImplicitContainer<T>
pad_end - Whether to pad the end of `signals` with zeros when the provided frame length and step produce a frame that lies partially past its end.
-
object
name - An optional name for the operation.
Returns
-
object
- A `[..., frames, fft_unique_bins]` `Tensor` of `complex64` STFT values where `fft_unique_bins` is `fft_length // 2 + 1` (the unique components of the FFT).
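The calling convention described for `window_fn` (a window length plus a `dtype` keyword) can be illustrated with a plain-Python stand-in. This sketch returns a Python list rather than a `Tensor`, so it shows only the shape of the contract and cannot be passed to `stft` as is; the periodic Hann formula and the function name are chosen here purely as an example.

```python
import math

def periodic_hann(window_length, dtype=float):
    """Return window_length samples of a periodic Hann window as a list."""
    return [
        dtype(0.5 - 0.5 * math.cos(2.0 * math.pi * n / window_length))
        for n in range(window_length)
    ]

window = periodic_hann(4)
frame = [1.0, 1.0, 1.0, 1.0]
# Windowing multiplies each frame sample by the matching window sample.
windowed = [w * x for w, x in zip(window, frame)]
```

A real `window_fn` would build the same values with tensor ops and honor the requested `dtype`.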