LostTech.TensorFlow : API Documentation

Type tf.audio

Namespace tensorflow

Public static methods

object decode_wav(IGraphNodeBase contents, int desired_channels, int desired_samples, string name)

Decode a 16-bit PCM WAV file to a float tensor.

The -32768 to 32767 signed 16-bit values will be scaled to -1.0 to 1.0 in float.

When desired_channels is set and the input contains fewer channels than requested, the last channel is duplicated to reach the requested number; if the input has more channels than requested, the additional channels are ignored.

If desired_samples is set, then the audio will be cropped or padded with zeroes to the requested length.

The first output is a Tensor containing the audio samples. Its innermost (last) dimension is the number of channels, and its first dimension is the number of samples. For example, a ten-sample-long stereo WAV file should give an output shape of [10, 2].
Parameters
IGraphNodeBase contents
A `Tensor` of type `string`. The WAV-encoded audio, usually from a file.
int desired_channels
An optional `int`. Defaults to `-1`. Number of sample channels wanted.
int desired_samples
An optional `int`. Defaults to `-1`. Length of audio requested.
string name
A name for the operation (optional).
Returns
object
A tuple of `Tensor` objects (audio, sample_rate).
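
The decoding rules described above (scaling, channel adjustment, cropping and padding) can be sketched in plain Python. This is only an illustration of the documented behavior, not the library's implementation: the function name `decode_pcm16` and its arguments are hypothetical, it works on raw interleaved little-endian PCM frames rather than a full WAV file, and dividing by 32768 is an assumption about how the scaling to the -1.0 to 1.0 range is done.

```python
import struct


def decode_pcm16(frames, channels, desired_channels=-1, desired_samples=-1):
    """Turn interleaved 16-bit PCM bytes into rows of floats, one row per sample.

    Hypothetical helper: the result is a nested list shaped [samples][channels].
    """
    count = len(frames) // 2
    values = struct.unpack("<%dh" % count, frames[:count * 2])
    usable = count - count % channels  # ignore a trailing partial sample

    # Scale signed 16-bit integers to floats (assumed divisor: 32768).
    rows = [[v / 32768.0 for v in values[i:i + channels]]
            for i in range(0, usable, channels)]

    if desired_channels > 0:
        # Too few channels: repeat the last one. Too many: drop the extras.
        rows = [(row + [row[-1]] * (desired_channels - len(row)))[:desired_channels]
                for row in rows]
        channels = desired_channels

    if desired_samples >= 0:
        # Crop to the requested length, then pad with zero rows if too short.
        rows = rows[:desired_samples]
        rows += [[0.0] * channels for _ in range(desired_samples - len(rows))]

    return rows


# Ten stereo samples (20 interleaved values) give a [10, 2] result.
stereo = struct.pack("<20h", *range(20))
audio = decode_pcm16(stereo, channels=2)
print(len(audio), len(audio[0]))  # prints: 10 2
```

With `desired_channels=1` the same input would yield rows of length 1, and with `desired_samples=12` two rows of zeros would be appended.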

object decode_wav(IGraphNodeBase contents, int desired_channels, string desired_samples, string name)

Decode a 16-bit PCM WAV file to a float tensor.

The -32768 to 32767 signed 16-bit values will be scaled to -1.0 to 1.0 in float.

When desired_channels is set and the input contains fewer channels than requested, the last channel is duplicated to reach the requested number; if the input has more channels than requested, the additional channels are ignored.

If desired_samples is set, then the audio will be cropped or padded with zeroes to the requested length.

The first output is a Tensor containing the audio samples. Its innermost (last) dimension is the number of channels, and its first dimension is the number of samples. For example, a ten-sample-long stereo WAV file should give an output shape of [10, 2].
Parameters
IGraphNodeBase contents
A `Tensor` of type `string`. The WAV-encoded audio, usually from a file.
int desired_channels
An optional `int`. Defaults to `-1`. Number of sample channels wanted.
string desired_samples
An optional `int`. Defaults to `-1`. Length of audio requested.
string name
A name for the operation (optional).
Returns
object
A tuple of `Tensor` objects (audio, sample_rate).

object decode_wav_dyn(object contents, ImplicitContainer<T> desired_channels, ImplicitContainer<T> desired_samples, object name)

Decode a 16-bit PCM WAV file to a float tensor.

The -32768 to 32767 signed 16-bit values will be scaled to -1.0 to 1.0 in float.

When desired_channels is set and the input contains fewer channels than requested, the last channel is duplicated to reach the requested number; if the input has more channels than requested, the additional channels are ignored.

If desired_samples is set, then the audio will be cropped or padded with zeroes to the requested length.

The first output is a Tensor containing the audio samples. Its innermost (last) dimension is the number of channels, and its first dimension is the number of samples. For example, a ten-sample-long stereo WAV file should give an output shape of [10, 2].
Parameters
object contents
A `Tensor` of type `string`. The WAV-encoded audio, usually from a file.
ImplicitContainer<T> desired_channels
An optional `int`. Defaults to `-1`. Number of sample channels wanted.
ImplicitContainer<T> desired_samples
An optional `int`. Defaults to `-1`. Length of audio requested.
object name
A name for the operation (optional).
Returns
object
A tuple of `Tensor` objects (audio, sample_rate).

Tensor encode_wav(IGraphNodeBase audio, IGraphNodeBase sample_rate, string name)

Encode audio data using the WAV file format.

This operation generates a string suitable for saving as a .wav audio file. The audio is encoded in 16-bit PCM format. Input values are floats in the range -1.0f to 1.0f; any values outside that range are clamped to it.

`audio` is a 2-D float Tensor of shape `[length, channels]`. `sample_rate` is a scalar Tensor holding the rate to use (e.g. 44100).
Parameters
IGraphNodeBase audio
A `Tensor` of type `float32`. 2-D with shape `[length, channels]`.
IGraphNodeBase sample_rate
A `Tensor` of type `int32`. Scalar containing the sample frequency.
string name
A name for the operation (optional).
Returns
Tensor
A `Tensor` of type `string`.
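
The clamping and 16-bit PCM encoding described above can be sketched with Python's standard `wave` module. This is an illustration of the documented behavior under stated assumptions, not the library's implementation: the function name `encode_pcm16_wav` is hypothetical, `audio` is a non-empty nested list shaped `[length][channels]`, and multiplying by 32768 before rounding and clamping is an assumed scaling rule.

```python
import io
import struct
import wave


def encode_pcm16_wav(audio, sample_rate):
    """Encode rows of floats ([length][channels]) as the bytes of a WAV file.

    Hypothetical helper: values outside -1.0..1.0 end up clamped to the
    16-bit integer range.
    """
    channels = len(audio[0])
    samples = []
    for row in audio:
        for value in row:
            scaled = round(value * 32768.0)  # assumed scaling rule
            samples.append(max(-32768, min(32767, scaled)))

    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(channels)
        writer.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        writer.setframerate(sample_rate)
        writer.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buffer.getvalue()


# Two stereo samples; 2.0 and -2.0 fall outside the range and get clamped.
wav_bytes = encode_pcm16_wav([[0.0, 2.0], [-2.0, 0.5]], 44100)
```

The returned bytes can be written directly to a file with a .wav extension, which mirrors how the string Tensor produced by encode_wav is meant to be used.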

object encode_wav_dyn(object audio, object sample_rate, object name)

Encode audio data using the WAV file format.

This operation generates a string suitable for saving as a .wav audio file. The audio is encoded in 16-bit PCM format. Input values are floats in the range -1.0f to 1.0f; any values outside that range are clamped to it.

`audio` is a 2-D float Tensor of shape `[length, channels]`. `sample_rate` is a scalar Tensor holding the rate to use (e.g. 44100).
Parameters
object audio
A `Tensor` of type `float32`. 2-D with shape `[length, channels]`.
object sample_rate
A `Tensor` of type `int32`. Scalar containing the sample frequency.
object name
A name for the operation (optional).
Returns
object
A `Tensor` of type `string`.

Public properties

PythonFunctionContainer decode_wav_fn get;

PythonFunctionContainer encode_wav_fn get;