Type tf.audio
Namespace tensorflow
Methods
Properties
Public static methods
object decode_wav(IGraphNodeBase contents, int desired_channels, int desired_samples, string name)
Decodes a 16-bit PCM WAV file to a float tensor. The signed 16-bit values from -32768 to 32767 are scaled to the range -1.0 to 1.0.
When desired_channels is set and the input contains fewer channels than requested, the last channel is duplicated to give the requested number; if the input has more channels than requested, the additional channels are ignored.
If desired_samples is set, the audio is cropped or padded with zeroes to the requested length.
The first output is a Tensor with the content of the audio samples. Its last (innermost) dimension is the number of channels, and the dimension before it is the number of samples. For example, a ten-sample-long stereo WAV file gives an output shape of [10, 2].
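The scaling, channel and length rules described above can be illustrated with a small standard-library Python sketch. It is not the TensorFlow implementation: `decode_wav_sketch` and `make_wav_bytes` are hypothetical names introduced here, audio is represented as nested lists rather than tensors, and dividing by 32768 is an assumption about the exact scale factor, which the text above does not state.

```python
import io
import struct
import wave


def make_wav_bytes(frames, sample_rate=16000):
    """Hypothetical helper: build 16-bit PCM WAV bytes from per-sample channel tuples."""
    channels = len(frames[0])
    payload = b"".join(struct.pack("<%dh" % channels, *frame) for frame in frames)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        w.setframerate(sample_rate)
        w.writeframes(payload)
    return buf.getvalue()


def decode_wav_sketch(contents, desired_channels=-1, desired_samples=-1):
    """Illustrative stand-in for decode_wav; returns (audio, sample_rate)."""
    with wave.open(io.BytesIO(contents), "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("only 16-bit PCM is handled in this sketch")
        channels = w.getnchannels()
        sample_rate = w.getframerate()
        count = w.getnframes()
        raw = w.readframes(count)

    ints = struct.unpack("<%dh" % (count * channels), raw)
    # Assumption: scale by 1/32768 so that -32768 maps to exactly -1.0.
    audio = [[ints[i * channels + c] / 32768.0 for c in range(channels)]
             for i in range(count)]

    if desired_channels > 0:
        if channels < desired_channels:
            # Too few channels: duplicate the last channel.
            extra = desired_channels - channels
            audio = [row + [row[-1]] * extra for row in audio]
        else:
            # Too many channels: ignore the additional ones.
            audio = [row[:desired_channels] for row in audio]
        channels = desired_channels

    if desired_samples > 0:
        # Crop, then pad with zero rows up to the requested length.
        audio = audio[:desired_samples]
        audio += [[0.0] * channels for _ in range(desired_samples - len(audio))]

    return audio, sample_rate


# A ten-sample stereo file gives a [10, 2] layout (samples first, channels last).
audio, rate = decode_wav_sketch(make_wav_bytes([(0, 0)] * 10))
print(len(audio), len(audio[0]))  # 10 2
```

With the default of -1 for both optional arguments, the channel count and length of the input are kept unchanged.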
Parameters
-
IGraphNodeBase
contents - A `Tensor` of type `string`. The WAV-encoded audio, usually from a file.
-
int
desired_channels - An optional `int`. Defaults to `-1`. Number of sample channels wanted.
-
int
desired_samples - An optional `int`. Defaults to `-1`. Length of audio requested.
-
string
name - A name for the operation (optional).
Returns
-
object
- A tuple of `Tensor` objects (audio, sample_rate).
object decode_wav(IGraphNodeBase contents, int desired_channels, string desired_samples, string name)
Decodes a 16-bit PCM WAV file to a float tensor. The signed 16-bit values from -32768 to 32767 are scaled to the range -1.0 to 1.0.
When desired_channels is set and the input contains fewer channels than requested, the last channel is duplicated to give the requested number; if the input has more channels than requested, the additional channels are ignored.
If desired_samples is set, the audio is cropped or padded with zeroes to the requested length.
The first output is a Tensor with the content of the audio samples. Its last (innermost) dimension is the number of channels, and the dimension before it is the number of samples. For example, a ten-sample-long stereo WAV file gives an output shape of [10, 2].
Parameters
-
IGraphNodeBase
contents - A `Tensor` of type `string`. The WAV-encoded audio, usually from a file.
-
int
desired_channels - An optional `int`. Defaults to `-1`. Number of sample channels wanted.
-
string
desired_samples - An optional `int`. Defaults to `-1`. Length of audio requested.
-
string
name - A name for the operation (optional).
Returns
-
object
- A tuple of `Tensor` objects (audio, sample_rate).
object decode_wav_dyn(object contents, ImplicitContainer<T> desired_channels, ImplicitContainer<T> desired_samples, object name)
Decodes a 16-bit PCM WAV file to a float tensor. The signed 16-bit values from -32768 to 32767 are scaled to the range -1.0 to 1.0.
When desired_channels is set and the input contains fewer channels than requested, the last channel is duplicated to give the requested number; if the input has more channels than requested, the additional channels are ignored.
If desired_samples is set, the audio is cropped or padded with zeroes to the requested length.
The first output is a Tensor with the content of the audio samples. Its last (innermost) dimension is the number of channels, and the dimension before it is the number of samples. For example, a ten-sample-long stereo WAV file gives an output shape of [10, 2].
Parameters
-
object
contents - A `Tensor` of type `string`. The WAV-encoded audio, usually from a file.
-
ImplicitContainer<T>
desired_channels - An optional `int`. Defaults to `-1`. Number of sample channels wanted.
-
ImplicitContainer<T>
desired_samples - An optional `int`. Defaults to `-1`. Length of audio requested.
-
object
name - A name for the operation (optional).
Returns
-
object
- A tuple of `Tensor` objects (audio, sample_rate).
Tensor encode_wav(IGraphNodeBase audio, IGraphNodeBase sample_rate, string name)
Encodes audio data using the WAV file format. This operation generates a string suitable for saving as a .wav audio file, encoded in the 16-bit PCM format. It takes float values in the range -1.0f to 1.0f; any values outside that range are clamped to it.
`audio` is a 2-D float Tensor of shape `[length, channels]`.
`sample_rate` is a scalar Tensor holding the rate to use (e.g. 44100).
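The clamping and 16-bit PCM conversion described above can be sketched in plain Python with the standard `wave` module. This is only an illustration of the arithmetic, not the operation itself: `encode_wav_sketch` is a hypothetical name, audio is a non-empty nested list in `[length, channels]` layout, and multiplying by 32768 before rounding and clamping to the int16 range is an assumption about the conversion.

```python
import io
import struct
import wave


def encode_wav_sketch(audio, sample_rate):
    """Illustrative stand-in for encode_wav; returns WAV bytes.

    audio: non-empty list of rows, each row a list of per-channel floats.
    """
    channels = len(audio[0])
    ints = []
    for row in audio:
        for value in row:
            clamped = max(-1.0, min(1.0, value))  # clamp to [-1.0, 1.0]
            # Assumption: scale by 32768, round, then keep within int16 limits.
            scaled = int(round(clamped * 32768.0))
            ints.append(max(-32768, min(32767, scaled)))

    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(channels)
        w.setsampwidth(2)  # 16-bit PCM
        w.setframerate(sample_rate)
        w.writeframes(struct.pack("<%dh" % len(ints), *ints))
    return buf.getvalue()


# Two stereo samples; 2.0 and -3.0 fall outside the range and are clamped.
data = encode_wav_sketch([[0.0, 2.0], [-3.0, 0.5]], 44100)
with wave.open(io.BytesIO(data), "rb") as w:
    print(w.getnchannels(), w.getframerate(), w.getnframes())  # 2 44100 2
```

The returned bytes can be written to disk as a .wav file as-is.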
Parameters
-
IGraphNodeBase
audio - A `Tensor` of type `float32`. 2-D with shape `[length, channels]`.
-
IGraphNodeBase
sample_rate - A `Tensor` of type `int32`. Scalar containing the sample frequency.
-
string
name - A name for the operation (optional).
Returns
-
Tensor
- A `Tensor` of type `string`.
object encode_wav_dyn(object audio, object sample_rate, object name)
Encodes audio data using the WAV file format. This operation generates a string suitable for saving as a .wav audio file, encoded in the 16-bit PCM format. It takes float values in the range -1.0f to 1.0f; any values outside that range are clamped to it.
`audio` is a 2-D float Tensor of shape `[length, channels]`.
`sample_rate` is a scalar Tensor holding the rate to use (e.g. 44100).
Parameters
-
object
audio - A `Tensor` of type `float32`. 2-D with shape `[length, channels]`.
-
object
sample_rate - A `Tensor` of type `int32`. Scalar containing the sample frequency.
-
object
name - A name for the operation (optional).
Returns
-
object
- A `Tensor` of type `string`.