Type tf.data.experimental
Namespace tensorflow
Methods
- bucket_by_sequence_length
- bucket_by_sequence_length_dyn
- bytes_produced_stats
- bytes_produced_stats
- bytes_produced_stats_dyn
- cardinality
- cardinality
- cardinality_dyn
- copy_to_device
- dense_to_sparse_batch
- from_variant
- from_variant
- from_variant_dyn
- get_next_as_optional
- get_next_as_optional
- get_next_as_optional
- get_next_as_optional
- get_next_as_optional
- get_next_as_optional_dyn
- get_single_element
- get_structure
- get_structure
- get_structure_dyn
- group_by_reducer
- group_by_window
- latency_stats
- latency_stats
- latency_stats_dyn
- make_batched_features_dataset
- make_batched_features_dataset
- make_csv_dataset
- make_saveable_from_iterator
- map_and_batch
- map_and_batch_with_legacy_function
- map_and_batch_with_legacy_function_dyn
- parallel_interleave
- parallel_interleave
- parse_example_dataset
- prefetch_to_device
- rejection_resample
- scan
- scan
- scan
- scan
- scan
- scan
- shuffle_and_repeat
- take_while
- take_while_dyn
- to_variant
- to_variant
- to_variant_dyn
Properties
- AUTOTUNE
- bucket_by_sequence_length_fn
- bytes_produced_stats_fn
- cardinality_fn
- choose_from_datasets_fn
- copy_to_device_fn
- Counter_fn
- dense_to_sparse_batch_fn
- enumerate_dataset_fn
- from_variant_fn
- get_next_as_optional_fn
- get_single_element_fn
- get_structure_fn
- group_by_reducer_fn
- group_by_window_fn
- ignore_errors_fn
- INFINITE_CARDINALITY
- latency_stats_fn
- make_batched_features_dataset_fn
- make_csv_dataset_fn
- make_saveable_from_iterator_fn
- map_and_batch_fn
- map_and_batch_with_legacy_function_fn
- parallel_interleave_fn
- parse_example_dataset_fn
- prefetch_to_device_fn
- rejection_resample_fn
- sample_from_datasets_fn
- scan_fn
- shuffle_and_repeat_fn
- take_while_fn
- to_variant_fn
- unbatch_fn
- unique_fn
- UNKNOWN_CARDINALITY
Public static methods
object bucket_by_sequence_length(object element_length_func, IEnumerable<int> bucket_boundaries, IEnumerable<int> bucket_batch_sizes, object padded_shapes, object padding_values, bool pad_to_bucket_boundary, bool no_padding, bool drop_remainder)
A transformation that buckets elements in a `Dataset` by length. Elements of the `Dataset` are grouped together by length and then are padded
and batched. This is useful for sequence tasks in which the elements have variable length.
Grouping together elements that have similar lengths reduces the total
fraction of padding in a batch which increases training step efficiency.
Parameters
-
object
element_length_func - function from element in `Dataset` to
tf.int32
, determines the length of the element, which will determine the bucket it goes into. -
IEnumerable<int>
bucket_boundaries - `list<int>`, upper length boundaries of the buckets. -
IEnumerable<int>
bucket_batch_sizes - `list<int>`, batch size per bucket. Length should be `len(bucket_boundaries) + 1`. -
object
padded_shapes - Nested structure of
tf.TensorShape
to pass to tf.data.Dataset.padded_batch
. If not provided, will use `dataset.output_shapes`, which will result in variable length dimensions being padded out to the maximum length in each batch. -
object
padding_values - Values to pad with, passed to
tf.data.Dataset.padded_batch
. Defaults to padding with 0. -
bool
pad_to_bucket_boundary - bool, if `False`, will pad dimensions with unknown size to maximum length in batch. If `True`, will pad dimensions with unknown size to bucket boundary minus 1 (i.e., the maximum length in each bucket), and caller must ensure that the source `Dataset` does not contain any elements with length longer than `max(bucket_boundaries)`.
-
bool
no_padding - `bool`, indicates whether to pad the batch features (features
need to be either of type
tf.SparseTensor
or of same shape). -
bool
drop_remainder - (Optional.) A
tf.bool
scalar tf.Tensor
, representing whether the last batch should be dropped in the case it has fewer than `batch_size` elements; the default behavior is not to drop the smaller batch.
Returns
-
object
- A `Dataset` transformation function, which can be passed to
tf.data.Dataset.apply
.
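For illustration, a minimal Python sketch of the equivalent TensorFlow call (the generator, length function, boundaries, and batch sizes below are made-up values, not part of this reference):
import tensorflow as tf

# Hypothetical dataset of variable-length integer sequences.
dataset = tf.data.Dataset.from_generator(
    lambda: ([1] * n for n in [3, 12, 25]), tf.int64, tf.TensorShape([None]))
dataset = dataset.apply(tf.data.experimental.bucket_by_sequence_length(
    element_length_func=lambda seq: tf.shape(seq)[0],  # tf.int32 length of each element
    bucket_boundaries=[10, 20],                        # buckets: <10, 10-19, >=20
    bucket_batch_sizes=[4, 4, 4]))                     # len(bucket_boundaries) + 1 sizes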
object bucket_by_sequence_length_dyn(object element_length_func, object bucket_boundaries, object bucket_batch_sizes, object padded_shapes, object padding_values, ImplicitContainer<T> pad_to_bucket_boundary, ImplicitContainer<T> no_padding, ImplicitContainer<T> drop_remainder)
object bytes_produced_stats(IEnumerable<string> tag)
Records the number of bytes produced by each element of the input dataset. To consume the statistics, associate a `StatsAggregator` with the output
dataset.
Parameters
-
IEnumerable<string>
tag - String. All statistics recorded by the returned transformation will be associated with the given `tag`.
Returns
-
object
- A `Dataset` transformation function, which can be passed to
tf.data.Dataset.apply
.
object bytes_produced_stats(string tag)
Records the number of bytes produced by each element of the input dataset. To consume the statistics, associate a `StatsAggregator` with the output
dataset.
Parameters
-
string
tag - String. All statistics recorded by the returned transformation will be associated with the given `tag`.
Returns
-
object
- A `Dataset` transformation function, which can be passed to
tf.data.Dataset.apply
.
object bytes_produced_stats_dyn(object tag)
Records the number of bytes produced by each element of the input dataset. To consume the statistics, associate a `StatsAggregator` with the output
dataset.
Parameters
-
object
tag - String. All statistics recorded by the returned transformation will be associated with the given `tag`.
Returns
-
object
- A `Dataset` transformation function, which can be passed to
tf.data.Dataset.apply
.
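As a hedged Python sketch (TF 1.x/2.0-era API; the tag string and dataset are arbitrary), the statistics are consumed by applying the transformation and attaching a `StatsAggregator` through the dataset options:
import tensorflow as tf

dataset = tf.data.Dataset.range(100)
dataset = dataset.apply(tf.data.experimental.bytes_produced_stats("bytes_produced"))

aggregator = tf.data.experimental.StatsAggregator()
options = tf.data.Options()
options.experimental_stats.aggregator = aggregator  # collect the recorded statistics
dataset = dataset.with_options(options)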
object cardinality_dyn(object dataset)
Returns the cardinality of `dataset`, if known. The operation returns the cardinality of `dataset`. The operation may return
tf.data.experimental.INFINITE_CARDINALITY
if `dataset` contains an infinite
number of elements or tf.data.experimental.UNKNOWN_CARDINALITY
if the
analysis fails to determine the number of elements in `dataset` (e.g. when the
dataset source is a file).
Parameters
-
object
dataset - A
tf.data.Dataset
for which to determine cardinality.
Returns
-
object
- A scalar
tf.int64
`Tensor` representing the cardinality of `dataset`. If the cardinality is infinite or unknown, the operation returns the named constant `INFINITE_CARDINALITY` or `UNKNOWN_CARDINALITY`, respectively.
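A brief Python sketch of the corresponding TensorFlow call and the two sentinel constants (the example datasets are illustrative):
import tensorflow as tf

n = tf.data.experimental.cardinality(tf.data.Dataset.range(42))          # scalar int64 Tensor, 42
inf = tf.data.experimental.cardinality(tf.data.Dataset.range(1).repeat())
# inf == tf.data.experimental.INFINITE_CARDINALITY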
object copy_to_device(string target_device, string source_device)
A transformation that copies dataset elements to the given `target_device`.
Parameters
-
string
target_device - The name of a device to which elements will be copied.
-
string
source_device - The original device on which `input_dataset` will be placed.
Returns
-
object
- A `Dataset` transformation function, which can be passed to
tf.data.Dataset.apply
.
object dense_to_sparse_batch(int batch_size, IEnumerable<int> row_shape)
A transformation that batches ragged elements into
tf.SparseTensor
s. Like `Dataset.padded_batch()`, this transformation combines multiple
consecutive elements of the dataset, which might have different
shapes, into a single element. The resulting element has three
components (`indices`, `values`, and `dense_shape`), which
comprise a tf.SparseTensor
that represents the same data. The
`row_shape` represents the dense shape of each row in the
resulting tf.SparseTensor
, to which the effective batch size is
prepended.
Parameters
-
int
batch_size - A
tf.int64
scalar tf.Tensor
, representing the number of consecutive elements of this dataset to combine in a single batch. -
IEnumerable<int>
row_shape - A
tf.TensorShape
or tf.int64
vector tensor-like object representing the equivalent dense shape of a row in the resulting tf.SparseTensor
. Each element of this dataset must have the same rank as `row_shape`, and must have size less than or equal to `row_shape` in each dimension.
Returns
-
object
- A `Dataset` transformation function, which can be passed to
tf.data.Dataset.apply
.
Show Example
# NOTE: The following examples use `{ ... }` to represent the
# contents of a dataset.
a = { ['a', 'b', 'c'], ['a', 'b'], ['a', 'b', 'c', 'd'] }

a.apply(tf.data.experimental.dense_to_sparse_batch(
    batch_size=2, row_shape=[6])) ==
{
    ([[0, 0], [0, 1], [0, 2], [1, 0], [1, 1]],  # indices
     ['a', 'b', 'c', 'a', 'b'],                 # values
     [2, 6]),                                   # dense_shape
    ([[0, 0], [0, 1], [0, 2], [0, 3]],
     ['a', 'b', 'c', 'd'],
     [1, 6])
}
_VariantDataset from_variant(IGraphNodeBase variant, TensorSpec structure)
Constructs a dataset from the given variant and structure.
Parameters
-
IGraphNodeBase
variant - A scalar
tf.variant
tensor representing a dataset. -
TensorSpec
structure - A
tf.data.experimental.Structure
object representing the structure of each element in the dataset.
Returns
-
_VariantDataset
- A
tf.data.Dataset
instance.
_VariantDataset from_variant(IGraphNodeBase variant, ValueTuple<TensorSpec, object, object> structure)
Constructs a dataset from the given variant and structure.
Parameters
-
IGraphNodeBase
variant - A scalar
tf.variant
tensor representing a dataset. -
ValueTuple<TensorSpec, object, object>
structure - A
tf.data.experimental.Structure
object representing the structure of each element in the dataset.
Returns
-
_VariantDataset
- A
tf.data.Dataset
instance.
object from_variant_dyn(object variant, object structure)
Constructs a dataset from the given variant and structure.
Parameters
-
object
variant - A scalar
tf.variant
tensor representing a dataset. -
object
structure - A
tf.data.experimental.Structure
object representing the structure of each element in the dataset.
Returns
-
object
- A
tf.data.Dataset
instance.
_OptionalImpl get_next_as_optional(Trackable iterator)
Returns an `Optional` that contains the next value from the iterator. If `iterator` has reached the end of the sequence, the returned `Optional`
will have no value.
Parameters
-
Trackable
iterator - A `tf.compat.v1.data.Iterator` object.
Returns
-
_OptionalImpl
- An `Optional` object representing the next value from the iterator (if it has one) or no value.
_OptionalImpl get_next_as_optional(IEnumerable<object> iterator)
Returns an `Optional` that contains the next value from the iterator. If `iterator` has reached the end of the sequence, the returned `Optional`
will have no value.
Parameters
-
IEnumerable<object>
iterator - A `tf.compat.v1.data.Iterator` object.
Returns
-
_OptionalImpl
- An `Optional` object representing the next value from the iterator (if it has one) or no value.
_OptionalImpl get_next_as_optional(IEnumerator<object> iterator)
Returns an `Optional` that contains the next value from the iterator. If `iterator` has reached the end of the sequence, the returned `Optional`
will have no value.
Parameters
-
IEnumerator<object>
iterator - A `tf.compat.v1.data.Iterator` object.
Returns
-
_OptionalImpl
- An `Optional` object representing the next value from the iterator (if it has one) or no value.
_OptionalImpl get_next_as_optional(MultiDeviceIteratorV2 iterator)
Returns an `Optional` that contains the next value from the iterator. If `iterator` has reached the end of the sequence, the returned `Optional`
will have no value.
Parameters
-
MultiDeviceIteratorV2
iterator - A `tf.compat.v1.data.Iterator` object.
Returns
-
_OptionalImpl
- An `Optional` object representing the next value from the iterator (if it has one) or no value.
_OptionalImpl get_next_as_optional(object iterator)
Returns an `Optional` that contains the next value from the iterator. If `iterator` has reached the end of the sequence, the returned `Optional`
will have no value.
Parameters
-
object
iterator - A `tf.compat.v1.data.Iterator` object.
Returns
-
_OptionalImpl
- An `Optional` object representing the next value from the iterator (if it has one) or no value.
object get_next_as_optional_dyn(object iterator)
Returns an `Optional` that contains the next value from the iterator. If `iterator` has reached the end of the sequence, the returned `Optional`
will have no value.
Parameters
-
object
iterator - A `tf.compat.v1.data.Iterator` object.
Returns
-
object
- An `Optional` object representing the next value from the iterator (if it has one) or no value.
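A hedged Python sketch using a TF 1.x-style iterator; `has_value` and `get_value` are methods of the TensorFlow `Optional` type:
import tensorflow as tf

dataset = tf.data.Dataset.range(3)
iterator = tf.compat.v1.data.make_one_shot_iterator(dataset)
opt = tf.data.experimental.get_next_as_optional(iterator)
has_next = opt.has_value()   # tf.bool scalar tensor
value = opt.get_value()      # only meaningful when has_next evaluates to True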
object get_single_element(Dataset dataset)
object get_structure(object dataset_or_iterator)
Returns the type specification of an element of a `Dataset` or `Iterator`.
Parameters
-
object
dataset_or_iterator - A
tf.data.Dataset
or tf.data.Iterator
.
Returns
-
object
- A nested structure of
tf.TypeSpec
objects matching the structure of an element of `dataset_or_iterator` and specifying the type of individual components.
object get_structure(IEnumerable<IGraphNodeBase> dataset_or_iterator)
Returns the type specification of an element of a `Dataset` or `Iterator`.
Parameters
-
IEnumerable<IGraphNodeBase>
dataset_or_iterator - A
tf.data.Dataset
or tf.data.Iterator
.
Returns
-
object
- A nested structure of
tf.TypeSpec
objects matching the structure of an element of `dataset_or_iterator` and specifying the type of individual components.
object get_structure_dyn(object dataset_or_iterator)
Returns the type specification of an element of a `Dataset` or `Iterator`.
Parameters
-
object
dataset_or_iterator - A
tf.data.Dataset
or tf.data.Iterator
.
Returns
-
object
- A nested structure of
tf.TypeSpec
objects matching the structure of an element of `dataset_or_iterator` and specifying the type of individual components.
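For illustration, a small Python sketch (the element structure below is made up):
import tensorflow as tf

dataset = tf.data.Dataset.from_tensor_slices(([1, 2, 3], ["a", "b", "c"]))
spec = tf.data.experimental.get_structure(dataset)
# e.g. a tuple of TensorSpec objects: (TensorSpec(shape=(), dtype=tf.int32, ...),
#                                      TensorSpec(shape=(), dtype=tf.string, ...))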
object group_by_reducer(PythonFunctionContainer key_func, Reducer reducer)
object group_by_window(PythonFunctionContainer key_func, PythonFunctionContainer reduce_func, Nullable<int> window_size, object window_size_func)
object latency_stats(string tag)
Records the latency of producing each element of the input dataset. To consume the statistics, associate a `StatsAggregator` with the output
dataset.
Parameters
-
string
tag - String. All statistics recorded by the returned transformation will be associated with the given `tag`.
Returns
-
object
- A `Dataset` transformation function, which can be passed to
tf.data.Dataset.apply
.
object latency_stats(IEnumerable<string> tag)
Records the latency of producing each element of the input dataset. To consume the statistics, associate a `StatsAggregator` with the output
dataset.
Parameters
-
IEnumerable<string>
tag - String. All statistics recorded by the returned transformation will be associated with the given `tag`.
Returns
-
object
- A `Dataset` transformation function, which can be passed to
tf.data.Dataset.apply
.
object latency_stats_dyn(object tag)
Records the latency of producing each element of the input dataset. To consume the statistics, associate a `StatsAggregator` with the output
dataset.
Parameters
-
object
tag - String. All statistics recorded by the returned transformation will be associated with the given `tag`.
Returns
-
object
- A `Dataset` transformation function, which can be passed to
tf.data.Dataset.apply
.
DatasetV1Adapter make_batched_features_dataset(IEnumerable<object> file_pattern, int batch_size, IDictionary<string, object> features, ImplicitContainer<T> reader, string label_key, object reader_args, Nullable<int> num_epochs, bool shuffle, int shuffle_buffer_size, Nullable<int> shuffle_seed, object prefetch_buffer_size, Nullable<int> reader_num_threads, Nullable<int> parser_num_threads, bool sloppy_ordering, bool drop_final_batch)
DatasetV1Adapter make_batched_features_dataset(IEnumerable<object> file_pattern, int batch_size, IDictionary<string, object> features, PythonClassContainer reader, string label_key, object reader_args, Nullable<int> num_epochs, bool shuffle, int shuffle_buffer_size, Nullable<int> shuffle_seed, object prefetch_buffer_size, Nullable<int> reader_num_threads, Nullable<int> parser_num_threads, bool sloppy_ordering, bool drop_final_batch)
DatasetV1Adapter make_csv_dataset(IEnumerable<object> file_pattern, int batch_size, object column_names, object column_defaults, object label_name, object select_columns, string field_delim, bool use_quote_delim, string na_value, bool header, Nullable<int> num_epochs, bool shuffle, int shuffle_buffer_size, object shuffle_seed, object prefetch_buffer_size, object num_parallel_reads, bool sloppy, int num_rows_for_inference, object compression_type, bool ignore_errors)
Reads CSV files into a dataset, where each element is a (features, labels)
tuple that corresponds to a batch of CSV rows. The features dictionary
maps feature column names to `Tensor`s containing the corresponding
feature data, and labels is a `Tensor` containing the batch's label data.
Parameters
-
IEnumerable<object>
file_pattern - List of files or patterns of file paths containing CSV
records. See
tf.io.gfile.glob
for pattern rules. -
int
batch_size - An int representing the number of records to combine in a single batch.
-
object
column_names - An optional list of strings that corresponds to the CSV columns, in order. One per column of the input record. If this is not provided, infers the column names from the first row of the records. These names will be the keys of the features dict of each dataset element.
-
object
column_defaults - An optional list of default values for the CSV fields. One item per selected column of the input record. Each item in the list is either a valid CSV dtype (float32, float64, int32, int64, or string), or a `Tensor` with one of the aforementioned types. The tensor can either be a scalar default value (if the column is optional), or an empty tensor (if the column is required). If a dtype is provided instead of a tensor, the column is also treated as required. If this list is not provided, tries to infer types based on reading the first num_rows_for_inference rows of files specified, and assumes all columns are optional, defaulting to `0` for numeric values and `""` for string values. If both this and `select_columns` are specified, these must have the same lengths, and `column_defaults` is assumed to be sorted in order of increasing column index.
-
object
label_name - An optional string corresponding to the label column. If provided, the data for this column is returned as a separate `Tensor` from the features dictionary, so that the dataset complies with the format expected by a `tf.Estimator.train` or `tf.Estimator.evaluate` input function.
-
object
select_columns - An optional list of integer indices or string column names, that specifies a subset of columns of CSV data to select. If column names are provided, these must correspond to names provided in `column_names` or inferred from the file header lines. When this argument is specified, only a subset of CSV columns will be parsed and returned, corresponding to the columns specified. Using this results in faster parsing and lower memory usage. If both this and `column_defaults` are specified, these must have the same lengths, and `column_defaults` is assumed to be sorted in order of increasing column index.
-
string
field_delim - An optional `string`. Defaults to `","`. Char delimiter to separate fields in a record.
-
bool
use_quote_delim - An optional bool. Defaults to `True`. If false, treats double quotation marks as regular characters inside of the string fields.
-
string
na_value - Additional string to recognize as NA/NaN.
-
bool
header - A bool that indicates whether the first rows of provided CSV files correspond to header lines with column names, and should not be included in the data.
-
Nullable<int>
num_epochs - An int specifying the number of times this dataset is repeated. If None, cycles through the dataset forever.
-
bool
shuffle - A bool that indicates whether the input should be shuffled.
-
int
shuffle_buffer_size - Buffer size to use for shuffling. A large buffer size ensures better shuffling, but increases memory usage and startup time.
-
object
shuffle_seed - Randomization seed to use for shuffling.
-
object
prefetch_buffer_size - An int specifying the number of feature batches to prefetch for performance improvement. Recommended value is the number of batches consumed per training step. Defaults to auto-tune.
-
object
num_parallel_reads - Number of threads used to read CSV records from files. If >1, the results will be interleaved. Defaults to `1`.
-
bool
sloppy - If `True`, reading performance will be improved at the cost of non-deterministic ordering. If `False`, the order of elements produced is deterministic prior to shuffling (elements are still randomized if `shuffle=True`. Note that if the seed is set, then order of elements after shuffling is deterministic). Defaults to `False`.
-
int
num_rows_for_inference - Number of rows of a file to use for type inference if record_defaults is not provided. If None, reads all the rows of all the files. Defaults to 100.
-
object
compression_type - (Optional.) A
tf.string
scalar evaluating to one of `""` (no compression), `"ZLIB"`, or `"GZIP"`. Defaults to no compression. -
bool
ignore_errors - (Optional.) If `True`, ignores errors with CSV file parsing, such as malformed data or empty lines, and moves on to the next valid CSV record. Otherwise, the dataset raises an error and stops processing when encountering any invalid records. Defaults to `False`.
Returns
-
DatasetV1Adapter
- A dataset, where each element is a (features, labels) tuple that corresponds to a batch of `batch_size` CSV rows. The features dictionary maps feature column names to `Tensor`s containing the corresponding column data, and labels is a `Tensor` containing the column data for the label column specified by `label_name`.
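A hedged Python sketch of the corresponding TensorFlow call; the file pattern and the `label` column name are hypothetical:
import tensorflow as tf

dataset = tf.data.experimental.make_csv_dataset(
    "/path/to/data/*.csv",   # hypothetical file pattern
    batch_size=32,
    label_name="label",      # hypothetical label column
    num_epochs=1)
for features, labels in dataset.take(1):
    print(sorted(features.keys()), labels.shape)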
object map_and_batch(PythonFunctionContainer map_func, int batch_size, Nullable<int> num_parallel_batches, bool drop_remainder, Nullable<int> num_parallel_calls)
object map_and_batch_with_legacy_function(object map_func, object batch_size, object num_parallel_batches, bool drop_remainder, object num_parallel_calls)
Fused implementation of `map` and `batch`. (deprecated) Warning: THIS FUNCTION IS DEPRECATED. It will be removed in a future version.
Instructions for updating:
Use `tf.data.experimental.map_and_batch()`. NOTE: This is an escape hatch for existing uses of `map_and_batch` that do not
work with V2 functions. New uses are strongly discouraged and existing uses
should migrate to `map_and_batch` as this method will not be removed in V2.
Parameters
-
object
map_func - A function mapping a nested structure of tensors to another nested structure of tensors.
-
object
batch_size - A
tf.int64
scalar tf.Tensor
, representing the number of consecutive elements of this dataset to combine in a single batch. -
object
num_parallel_batches - (Optional.) A
tf.int64
scalar tf.Tensor
, representing the number of batches to create in parallel. On one hand, higher values can help mitigate the effect of stragglers. On the other hand, higher values can increase contention if CPU is scarce. -
bool
drop_remainder - (Optional.) A
tf.bool
scalar tf.Tensor
, representing whether the last batch should be dropped in case its size is smaller than desired; the default behavior is not to drop the smaller batch. -
object
num_parallel_calls - (Optional.) A
tf.int32
scalar tf.Tensor
, representing the number of elements to process in parallel. If not specified, `batch_size * num_parallel_batches` elements will be processed in parallel. If the value tf.data.experimental.AUTOTUNE
is used, then the number of parallel calls is set dynamically based on available CPU.
Returns
-
object
- A `Dataset` transformation function, which can be passed to
tf.data.Dataset.apply
.
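Since the documentation above recommends migrating to `map_and_batch`, a hedged Python sketch of that call (the dataset, map function, and batch size are illustrative):
import tensorflow as tf

dataset = tf.data.Dataset.range(1000)
dataset = dataset.apply(tf.data.experimental.map_and_batch(
    map_func=lambda x: x * 2,
    batch_size=32,
    num_parallel_calls=tf.data.experimental.AUTOTUNE))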
object map_and_batch_with_legacy_function_dyn(object map_func, object batch_size, object num_parallel_batches, ImplicitContainer<T> drop_remainder, object num_parallel_calls)
object parallel_interleave(PythonFunctionContainer map_func, Nullable<int> cycle_length, int block_length, Nullable<bool> sloppy, Nullable<int> buffer_output_elements, Nullable<int> prefetch_input_elements)
object parallel_interleave(object map_func, Nullable<int> cycle_length, int block_length, Nullable<bool> sloppy, Nullable<int> buffer_output_elements, Nullable<int> prefetch_input_elements)
A parallel version of the `Dataset.interleave()` transformation. (deprecated) Warning: THIS FUNCTION IS DEPRECATED. It will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_deterministic`. `parallel_interleave()` maps `map_func` across its input to produce nested
datasets, and outputs their elements interleaved. Unlike
tf.data.Dataset.interleave
, it gets elements from `cycle_length` nested
datasets in parallel, which increases the throughput, especially in the
presence of stragglers. Furthermore, the `sloppy` argument can be used to
improve performance, by relaxing the requirement that the outputs are produced
in a deterministic order, and allowing the implementation to skip over nested
datasets whose elements are not readily available when requested. Example usage is shown below.
WARNING: If `sloppy` is `True`, the order of produced elements is not
deterministic.
Parameters
-
object
map_func - A function mapping a nested structure of tensors to a `Dataset`.
-
Nullable<int>
cycle_length - The number of input `Dataset`s to interleave from in parallel.
-
int
block_length - The number of consecutive elements to pull from an input `Dataset` before advancing to the next input `Dataset`.
-
Nullable<bool>
sloppy - If false, elements are produced in deterministic order. Otherwise, the implementation is allowed, for the sake of expediency, to produce elements in a non-deterministic order.
-
Nullable<int>
buffer_output_elements - The number of elements each iterator being interleaved should buffer (similar to the `.prefetch()` transformation for each interleaved iterator).
-
Nullable<int>
prefetch_input_elements - The number of input elements to transform to iterators before they are needed for interleaving.
Returns
-
object
- A `Dataset` transformation function, which can be passed to
tf.data.Dataset.apply
.
Show Example
# Preprocess 4 files concurrently.
filenames = tf.data.Dataset.list_files("/path/to/data/train*.tfrecords")
dataset = filenames.apply(
    tf.data.experimental.parallel_interleave(
        lambda filename: tf.data.TFRecordDataset(filename),
        cycle_length=4))
object parse_example_dataset(IDictionary<string, object> features, Nullable<int> num_parallel_calls)
A transformation that parses `Example` protos into a `dict` of tensors. Parses a number of serialized `Example` protos given in `serialized`. We refer
to `serialized` as a batch with `batch_size` many entries of individual
`Example` protos. This op parses serialized examples into a dictionary mapping keys to `Tensor`
and `SparseTensor` objects. `features` is a dict from keys to `VarLenFeature`,
`SparseFeature`, and `FixedLenFeature` objects. Each `VarLenFeature`
and `SparseFeature` is mapped to a `SparseTensor`, and each
`FixedLenFeature` is mapped to a `Tensor`. See
tf.io.parse_example
for more
details about feature dictionaries.
Returns
-
object
- A dataset transformation function, which can be passed to
tf.data.Dataset.apply
.
object prefetch_to_device(string device, object buffer_size)
A transformation that prefetches dataset values to the given `device`. NOTE: Although the transformation creates a
tf.data.Dataset
, the
transformation must be the final `Dataset` in the input pipeline.
Parameters
-
string
device - A string. The name of a device to which elements will be prefetched.
-
object
buffer_size - (Optional.) The number of elements to buffer on `device`. Defaults to an automatically chosen value.
Returns
-
object
- A `Dataset` transformation function, which can be passed to
tf.data.Dataset.apply
.
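A brief Python sketch; the device string is illustrative, and the transformation must come last in the pipeline:
import tensorflow as tf

dataset = tf.data.Dataset.range(10).batch(2)
dataset = dataset.apply(
    tf.data.experimental.prefetch_to_device("/gpu:0", buffer_size=1))
# No further transformations should be applied after prefetch_to_device.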
object rejection_resample(object class_func, IEnumerable<double> target_dist, IEnumerable<double> initial_dist, Nullable<int> seed)
A transformation that resamples a dataset to achieve a target distribution. **NOTE** Resampling is performed via rejection sampling; some fraction
of the input values will be dropped.
Parameters
-
object
class_func - A function mapping an element of the input dataset to a scalar
tf.int32
tensor. Values should be in `[0, num_classes)`. -
IEnumerable<double>
target_dist - A floating point type tensor, shaped `[num_classes]`.
-
IEnumerable<double>
initial_dist - (Optional.) A floating point type tensor, shaped `[num_classes]`. If not provided, the true class distribution is estimated live in a streaming fashion.
-
Nullable<int>
seed - (Optional.) Python integer seed for the resampler.
Returns
-
object
- A `Dataset` transformation function, which can be passed to
tf.data.Dataset.apply
.
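A hedged Python sketch assuming a dataset of `(features, label)` pairs with two classes; the data and target distribution are illustrative:
import tensorflow as tf

# Hypothetical imbalanced dataset of (features, label) pairs.
dataset = tf.data.Dataset.from_tensor_slices(
    ([0.1, 0.2, 0.3, 0.4], [0, 0, 0, 1]))
resampled = dataset.apply(tf.data.experimental.rejection_resample(
    class_func=lambda features, label: label,  # tf.int32 class in [0, num_classes)
    target_dist=[0.5, 0.5]))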
object scan(int64 initial_state, PythonFunctionContainer scan_func)
object scan(ValueTuple<IEnumerable<int>, int> initial_state, PythonFunctionContainer scan_func)
object scan(IEnumerable<int> initial_state, PythonFunctionContainer scan_func)
object scan(TensorArray initial_state, PythonFunctionContainer scan_func)
object scan(int initial_state, PythonFunctionContainer scan_func)
object scan(IGraphNodeBase initial_state, PythonFunctionContainer scan_func)
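For illustration, a Python sketch of a running sum with `scan`, where `scan_func` maps (old_state, input_element) to (new_state, output_element):
import tensorflow as tf

initial_state = tf.constant(0, dtype=tf.int64)
dataset = tf.data.Dataset.range(5).apply(
    tf.data.experimental.scan(
        initial_state,
        lambda state, x: (state + x, state + x)))  # emit the cumulative total
# yields 0, 1, 3, 6, 10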
object shuffle_and_repeat(Nullable<int> buffer_size, Nullable<int> count, Nullable<int> seed)
Shuffles and repeats a Dataset returning a new permutation for each epoch. (deprecated) Warning: THIS FUNCTION IS DEPRECATED. It will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.shuffle(buffer_size, seed)` followed by `tf.data.Dataset.repeat(count)`. Static tf.data optimizations will take care of using the fused implementation.
`dataset.apply(tf.data.experimental.shuffle_and_repeat(buffer_size, count))` is equivalent to `dataset.shuffle(buffer_size, reshuffle_each_iteration=True).repeat(count)`.
The difference is that the latter dataset is not serializable. So, if you need to checkpoint an input pipeline with reshuffling you must use this implementation.
Parameters
-
Nullable<int>
buffer_size - A
tf.int64
scalar tf.Tensor
, representing the maximum number elements that will be buffered when prefetching. -
Nullable<int>
count - (Optional.) A
tf.int64
scalar tf.Tensor
, representing the number of times the dataset should be repeated. The default behavior (if `count` is `None` or `-1`) is for the dataset to be repeated indefinitely. -
Nullable<int>
seed - (Optional.) A
tf.int64
scalar tf.Tensor
, representing the random seed that will be used to create the distribution. See `tf.compat.v1.set_random_seed` for behavior.
Returns
-
object
- A `Dataset` transformation function, which can be passed to
tf.data.Dataset.apply
.
object take_while(object predicate)
A transformation that stops dataset iteration based on a `predicate`.
Parameters
-
object
predicate - A function that maps a nested structure of tensors (having shapes
and types defined by `self.output_shapes` and `self.output_types`) to a
scalar
tf.bool
tensor.
Returns
-
object
- A `Dataset` transformation function, which can be passed to
tf.data.Dataset.apply
.
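A minimal Python sketch (the predicate is illustrative):
import tensorflow as tf

dataset = tf.data.Dataset.range(10).apply(
    tf.data.experimental.take_while(lambda x: x < 5))
# yields 0, 1, 2, 3, 4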
object take_while_dyn(object predicate)
A transformation that stops dataset iteration based on a `predicate`.
Parameters
-
object
predicate - A function that maps a nested structure of tensors (having shapes
and types defined by `self.output_shapes` and `self.output_types`) to a
scalar
tf.bool
tensor.
Returns
-
object
- A `Dataset` transformation function, which can be passed to
tf.data.Dataset.apply
.
object to_variant(Dataset dataset)
Returns a variant representing the given dataset.
Parameters
-
Dataset
dataset - A
tf.data.Dataset
.
Returns
-
object
- A scalar
tf.variant
tensor representing the given dataset.
object to_variant(Dataset dataset)
Returns a variant representing the given dataset.
Parameters
-
Dataset
dataset - A
tf.data.Dataset
.
Returns
-
object
- A scalar
tf.variant
tensor representing the given dataset.
object to_variant_dyn(object dataset)
Returns a variant representing the given dataset.
Parameters
-
object
dataset - A
tf.data.Dataset
.
Returns
-
object
- A scalar
tf.variant
tensor representing the given dataset.
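As a hedged Python sketch, `to_variant`, `get_structure`, and `from_variant` compose into a round trip over an illustrative dataset:
import tensorflow as tf

dataset = tf.data.Dataset.range(3)
variant = tf.data.experimental.to_variant(dataset)        # scalar tf.variant tensor
structure = tf.data.experimental.get_structure(dataset)   # element type specification
rebuilt = tf.data.experimental.from_variant(variant, structure)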