Type RandomDataset
Namespace tensorflow.data.experimental
Parent DatasetV1Adapter
Interfaces IRandomDataset
A `Dataset` of pseudorandom values.
Methods
- apply
- filter
- filter_dyn
- filter_with_legacy_function_dyn
- flat_map_dyn
- interleave
- interleave_dyn
- reduce
- reduce
- reduce
- reduce
- reduce_dyn
- unbatch
- unbatch_dyn
- window
- window_dyn
Properties
Public instance methods
object apply(PythonFunctionContainer transformation_func)
Applies a transformation function to this dataset. `apply` enables chaining of custom `Dataset` transformations, which are
represented as functions that take one `Dataset` argument and return a
transformed `Dataset`. For example:
```
dataset = (dataset.map(lambda x: x ** 2)
           .apply(group_by_window(key_func, reduce_func, window_size))
           .map(lambda x: x ** 3))
```
Parameters
-
PythonFunctionContainer
transformation_func - A function that takes one `Dataset` argument and returns a `Dataset`.
Returns
-
object
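For a runnable illustration, here is a minimal sketch against the underlying Python `tf.data` API that this binding wraps; the `key_func` and `reduce_func` below are placeholder choices (bucket by parity, then batch each bucket) used only to make the snippet self-contained.
```
import tensorflow as tf

# Placeholder grouping functions for group_by_window (illustrative only).
key_func = lambda x: x % 2                          # bucket elements by parity
reduce_func = lambda key, window: window.batch(4)   # batch each bucket

dataset = tf.data.Dataset.range(10)
dataset = (dataset.map(lambda x: x ** 2)
           .apply(tf.data.experimental.group_by_window(
               key_func, reduce_func, window_size=4))
           .map(lambda x: x ** 3))
```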
object filter(PythonFunctionContainer predicate)
Filters this dataset according to `predicate`.
Parameters
-
PythonFunctionContainer
predicate - A function mapping a dataset element to a boolean.
Returns
-
object
Show Example
d = tf.data.Dataset.from_tensor_slices([1, 2, 3])
d = d.filter(lambda x: x < 3)  # ==> [1, 2]

# `tf.math.equal(x, y)` is required for equality comparison
def filter_fn(x):
  return tf.math.equal(x, 1)

d = d.filter(filter_fn)  # ==> [1]
object filter_dyn(object predicate)
Filters this dataset according to `predicate`.
Parameters
-
object
predicate - A function mapping a dataset element to a boolean.
Returns
-
object
Show Example
d = tf.data.Dataset.from_tensor_slices([1, 2, 3])
d = d.filter(lambda x: x < 3)  # ==> [1, 2]

# `tf.math.equal(x, y)` is required for equality comparison
def filter_fn(x):
  return tf.math.equal(x, 1)

d = d.filter(filter_fn)  # ==> [1]
object filter_with_legacy_function_dyn(object predicate)
Filters this dataset according to `predicate`. (deprecated) Warning: THIS FUNCTION IS DEPRECATED. It will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.filter()`. NOTE: This is an escape hatch for existing uses of `filter` that do not work
with V2 functions. New uses are strongly discouraged and existing uses
should migrate to `filter` as this method will be removed in V2.
Parameters
-
object
predicate - A function mapping a nested structure of tensors (having shapes
and types defined by `self.output_shapes` and `self.output_types`) to a
scalar `tf.bool` tensor.
Returns
-
object
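Since the deprecation notice points to `filter`, a minimal migration sketch in the underlying Python `tf.data` API (illustrative, not this binding's exact surface) would be:
```
import tensorflow as tf

d = tf.data.Dataset.from_tensor_slices([1, 2, 3])

# The predicate still has to evaluate to a scalar `tf.bool` tensor,
# exactly as with the legacy-function variant.
d = d.filter(lambda x: tf.math.equal(x, 1))  # ==> [1]
```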
object flat_map_dyn(object map_func)
Maps `map_func` across this dataset and flattens the result. Use `flat_map` if you want to make sure that the order of your dataset
stays the same. For example, to flatten a dataset of batches into a
dataset of their elements, see the example below.
`tf.data.Dataset.interleave()` is a generalization of `flat_map`, since
`flat_map` produces the same output as
`tf.data.Dataset.interleave(cycle_length=1)`.
Parameters
-
object
map_func - A function mapping a dataset element to a dataset.
Returns
-
object
Show Example
a = Dataset.from_tensor_slices([ [1, 2, 3], [4, 5, 6], [7, 8, 9] ])

a.flat_map(lambda x: Dataset.from_tensor_slices(x + 1))  # ==>
# [ 2, 3, 4, 5, 6, 7, 8, 9, 10 ]
object interleave(PythonFunctionContainer map_func, ImplicitContainer<T> cycle_length, int block_length, Nullable<int> num_parallel_calls)
Maps `map_func` across this dataset, and interleaves the results. For example, you can use `Dataset.interleave()` to process many input files
concurrently; see the example below.
The `cycle_length` and `block_length` arguments control the order in which
elements are produced. `cycle_length` controls the number of input elements
that are processed concurrently. If you set `cycle_length` to 1, this
transformation will handle one input element at a time, and will produce
identical results to `tf.data.Dataset.flat_map`. In general,
this transformation will apply `map_func` to `cycle_length` input elements,
open iterators on the returned `Dataset` objects, and cycle through them
producing `block_length` consecutive elements from each iterator, and
consuming the next input element each time it reaches the end of an
iterator.
NOTE: The order of elements yielded by this transformation is
deterministic, as long as `map_func` is a pure function. If
`map_func` contains any stateful operations, the order in which
that state is accessed is undefined.
Parameters
-
PythonFunctionContainer
map_func - A function mapping a dataset element to a dataset.
-
ImplicitContainer<T>
cycle_length - (Optional.) The number of input elements that will be
processed concurrently. If not specified, the value will be derived from
the number of available CPU cores. If the `num_parallel_calls` argument
is set to `tf.data.experimental.AUTOTUNE`, the `cycle_length` argument also
identifies the maximum degree of parallelism.
-
int
block_length - (Optional.) The number of consecutive elements to produce from each input element before cycling to another input element.
-
Nullable<int>
num_parallel_calls - (Optional.) If specified, the implementation creates a
threadpool, which is used to fetch inputs from cycle elements
asynchronously and in parallel. The default behavior is to fetch inputs
from cycle elements synchronously with no parallelism. If the value
`tf.data.experimental.AUTOTUNE` is used, then the number of parallel calls is
set dynamically based on available CPU.
Returns
-
object
Show Example
# Preprocess 4 files concurrently, and interleave blocks of 16 records from
# each file.
filenames = ["/var/data/file1.txt", "/var/data/file2.txt",...]
dataset = (Dataset.from_tensor_slices(filenames)
           .interleave(lambda x:
               TextLineDataset(x).map(parse_fn, num_parallel_calls=1),
               cycle_length=4, block_length=16))
object interleave_dyn(object map_func, ImplicitContainer<T> cycle_length, ImplicitContainer<T> block_length, object num_parallel_calls)
Maps `map_func` across this dataset, and interleaves the results. For example, you can use `Dataset.interleave()` to process many input files
concurrently; see the example below.
The `cycle_length` and `block_length` arguments control the order in which
elements are produced. `cycle_length` controls the number of input elements
that are processed concurrently. If you set `cycle_length` to 1, this
transformation will handle one input element at a time, and will produce
identical results to `tf.data.Dataset.flat_map`. In general,
this transformation will apply `map_func` to `cycle_length` input elements,
open iterators on the returned `Dataset` objects, and cycle through them
producing `block_length` consecutive elements from each iterator, and
consuming the next input element each time it reaches the end of an
iterator.
NOTE: The order of elements yielded by this transformation is
deterministic, as long as `map_func` is a pure function. If
`map_func` contains any stateful operations, the order in which
that state is accessed is undefined.
Parameters
-
object
map_func - A function mapping a dataset element to a dataset.
-
ImplicitContainer<T>
cycle_length - (Optional.) The number of input elements that will be
processed concurrently. If not specified, the value will be derived from
the number of available CPU cores. If the `num_parallel_calls` argument
is set to `tf.data.experimental.AUTOTUNE`, the `cycle_length` argument also
identifies the maximum degree of parallelism.
-
ImplicitContainer<T>
block_length - (Optional.) The number of consecutive elements to produce from each input element before cycling to another input element.
-
object
num_parallel_calls - (Optional.) If specified, the implementation creates a
threadpool, which is used to fetch inputs from cycle elements
asynchronously and in parallel. The default behavior is to fetch inputs
from cycle elements synchronously with no parallelism. If the value
`tf.data.experimental.AUTOTUNE` is used, then the number of parallel calls is
set dynamically based on available CPU.
Returns
-
object
Show Example
# Preprocess 4 files concurrently, and interleave blocks of 16 records from
# each file.
filenames = ["/var/data/file1.txt", "/var/data/file2.txt",...]
dataset = (Dataset.from_tensor_slices(filenames)
           .interleave(lambda x:
               TextLineDataset(x).map(parse_fn, num_parallel_calls=1),
               cycle_length=4, block_length=16))
object reduce(int64 initial_state, PythonFunctionContainer reduce_func)
object reduce(ValueTuple<IGraphNodeBase, object> initial_state, PythonFunctionContainer reduce_func)
object reduce(int initial_state, PythonFunctionContainer reduce_func)
object reduce(IGraphNodeBase initial_state, PythonFunctionContainer reduce_func)
object reduce_dyn(object initial_state, object reduce_func)
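These overloads differ only in the accepted type of `initial_state`. In the underlying Python `tf.data` API, `reduce` folds the whole dataset into a single element by repeatedly applying `reduce_func` to the accumulated state and the next element; a minimal sketch, assuming that standard behavior:
```
import tensorflow as tf

d = tf.data.Dataset.range(5)

# Count the elements: start from 0 and add 1 per element.
count = d.reduce(0, lambda state, _: state + 1)  # ==> 5

# Sum the elements: start from an int64 zero so dtypes match range(5).
total = d.reduce(tf.constant(0, tf.int64), lambda state, x: state + x)  # ==> 10
```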
_UnbatchDataset unbatch()
Splits elements of a dataset into multiple elements on the batch dimension. (deprecated) Warning: THIS FUNCTION IS DEPRECATED. It will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.unbatch()`. For example, if elements of the dataset are shaped `[B, a0, a1,...]`,
where `B` may vary for each input element, then for each element in the
dataset, the unbatched dataset will contain `B` consecutive elements
of shape `[a0, a1,...]`.
Returns
-
_UnbatchDataset
- A `Dataset` transformation function, which can be passed to
`tf.data.Dataset.apply`.
Show Example
# NOTE: The following example uses `{... }` to represent the contents
# of a dataset.
a = { ['a', 'b', 'c'], ['a', 'b'], ['a', 'b', 'c', 'd'] }

a.apply(tf.data.experimental.unbatch()) == {
    'a', 'b', 'c', 'a', 'b', 'a', 'b', 'c', 'd'}
object unbatch_dyn()
Splits elements of a dataset into multiple elements on the batch dimension. (deprecated) Warning: THIS FUNCTION IS DEPRECATED. It will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.unbatch()`. For example, if elements of the dataset are shaped `[B, a0, a1,...]`,
where `B` may vary for each input element, then for each element in the
dataset, the unbatched dataset will contain `B` consecutive elements
of shape `[a0, a1,...]`.
Returns
-
object
- A `Dataset` transformation function, which can be passed to
`tf.data.Dataset.apply`.
Show Example
# NOTE: The following example uses `{... }` to represent the contents
# of a dataset.
a = { ['a', 'b', 'c'], ['a', 'b'], ['a', 'b', 'c', 'd'] }

a.apply(tf.data.experimental.unbatch()) == {
    'a', 'b', 'c', 'a', 'b', 'a', 'b', 'c', 'd'}
object window(int size, Nullable<int> shift, int stride, bool drop_remainder)
Combines (nests of) input elements into a dataset of (nests of) windows. A "window" is a finite dataset of flat elements of size `size` (or possibly
fewer if there are not enough input elements to fill the window and
`drop_remainder` evaluates to false). The `stride` argument determines the stride of the input elements, and the
`shift` argument determines the shift of the window. For example, letting `{...}` represent a Dataset:
- `tf.data.Dataset.range(7).window(2)` produces
`{{0, 1}, {2, 3}, {4, 5}, {6}}`
- `tf.data.Dataset.range(7).window(3, 2, 1, True)` produces
`{{0, 1, 2}, {2, 3, 4}, {4, 5, 6}}`
- `tf.data.Dataset.range(7).window(3, 1, 2, True)` produces
`{{0, 2, 4}, {1, 3, 5}, {2, 4, 6}}`
Note that when the `window` transformation is applied to a dataset of
nested elements, it produces a dataset of nested windows. For example:
- `tf.data.Dataset.from_tensor_slices((range(4), range(4))).window(2)`
produces `{({0, 1}, {0, 1}), ({2, 3}, {2, 3})}`
- `tf.data.Dataset.from_tensor_slices({"a": range(4)}).window(2)`
produces `{{"a": {0, 1}}, {"a": {2, 3}}}`
Parameters
-
int
size - A `tf.int64` scalar `tf.Tensor`, representing the number of elements of the input dataset to combine into a window.
-
Nullable<int>
shift - (Optional.) A `tf.int64` scalar `tf.Tensor`, representing the forward shift of the sliding window in each iteration. Defaults to `size`.
-
int
stride - (Optional.) A `tf.int64` scalar `tf.Tensor`, representing the stride of the input elements in the sliding window.
-
bool
drop_remainder - (Optional.) A `tf.bool` scalar `tf.Tensor`, representing whether a window should be dropped in case its size is smaller than `window_size`.
Returns
-
object
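To make the windowing rules above concrete, a short sketch in the underlying Python `tf.data` API (each window is itself a small dataset, so it is batched here just to materialize its contents):
```
import tensorflow as tf

# Windows of size 3, shifted by 2, keeping only full windows:
# produces {0, 1, 2}, {2, 3, 4}, {4, 5, 6}.
windows = tf.data.Dataset.range(7).window(3, shift=2, stride=1,
                                          drop_remainder=True)

# Each element of `windows` is a nested Dataset; batch each one so the
# values can be printed as plain tensors.
for w in windows.flat_map(lambda w: w.batch(3)):
    print(w.numpy())  # [0 1 2], [2 3 4], [4 5 6]
```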
object window_dyn(object size, object shift, ImplicitContainer<T> stride, ImplicitContainer<T> drop_remainder)
Combines (nests of) input elements into a dataset of (nests of) windows. A "window" is a finite dataset of flat elements of size `size` (or possibly
fewer if there are not enough input elements to fill the window and
`drop_remainder` evaluates to false). The `stride` argument determines the stride of the input elements, and the
`shift` argument determines the shift of the window. For example, letting `{...}` represent a Dataset:
- `tf.data.Dataset.range(7).window(2)` produces
`{{0, 1}, {2, 3}, {4, 5}, {6}}`
- `tf.data.Dataset.range(7).window(3, 2, 1, True)` produces
`{{0, 1, 2}, {2, 3, 4}, {4, 5, 6}}`
- `tf.data.Dataset.range(7).window(3, 1, 2, True)` produces
`{{0, 2, 4}, {1, 3, 5}, {2, 4, 6}}`
Note that when the `window` transformation is applied to a dataset of
nested elements, it produces a dataset of nested windows. For example:
- `tf.data.Dataset.from_tensor_slices((range(4), range(4))).window(2)`
produces `{({0, 1}, {0, 1}), ({2, 3}, {2, 3})}`
- `tf.data.Dataset.from_tensor_slices({"a": range(4)}).window(2)`
produces `{{"a": {0, 1}}, {"a": {2, 3}}}`
Parameters
-
object
size - A `tf.int64` scalar `tf.Tensor`, representing the number of elements of the input dataset to combine into a window.
-
object
shift - (Optional.) A `tf.int64` scalar `tf.Tensor`, representing the forward shift of the sliding window in each iteration. Defaults to `size`.
-
ImplicitContainer<T>
stride - (Optional.) A `tf.int64` scalar `tf.Tensor`, representing the stride of the input elements in the sliding window.
-
ImplicitContainer<T>
drop_remainder - (Optional.) A `tf.bool` scalar `tf.Tensor`, representing whether a window should be dropped in case its size is smaller than `window_size`.
Returns
-
object