Type experimental
Namespace tensorflow.compat.v2.data.experimental
Methods
- choose_from_datasets
- choose_from_datasets_dyn
- Counter
- Counter_dyn
- make_batched_features_dataset
- make_batched_features_dataset_dyn
- make_csv_dataset
- make_csv_dataset_dyn
- sample_from_datasets
- sample_from_datasets_dyn
Public static methods
_DirectedInterleaveDataset choose_from_datasets(IEnumerable<object> datasets, DatasetV1Adapter choice_dataset)
_DirectedInterleaveDataset choose_from_datasets(IEnumerable<object> datasets, Dataset choice_dataset)
Creates a dataset that deterministically chooses elements from `datasets`. For example, given the datasets defined under "Show Example" below, the elements of `result` will be:
```
"foo", "bar", "baz", "foo", "bar", "baz", "foo", "bar", "baz"
```
Parameters
- IEnumerable<object> datasets - A list of tf.data.Dataset objects with compatible structure.
- Dataset choice_dataset - A tf.data.Dataset of scalar tf.int64 tensors between `0` and `len(datasets) - 1`.
Returns
- _DirectedInterleaveDataset - A dataset that interleaves elements from `datasets` according to the values of `choice_dataset`.
Show Example
datasets = [tf.data.Dataset.from_tensors("foo").repeat(),
            tf.data.Dataset.from_tensors("bar").repeat(),
            tf.data.Dataset.from_tensors("baz").repeat()]

# Define a dataset containing `[0, 1, 2, 0, 1, 2, 0, 1, 2]`.
choice_dataset = tf.data.Dataset.range(3).repeat(3)
result = tf.data.experimental.choose_from_datasets(datasets, choice_dataset)
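For completeness, a hedged sketch of consuming `result` (assuming TF 2.x eager execution):
```
# Prints "foo", "bar", "baz" three times over, matching the order above.
for elem in result.take(9):
    print(elem.numpy().decode())
```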
object choose_from_datasets_dyn(object datasets, object choice_dataset)
Creates a dataset that deterministically chooses elements from `datasets`. For example, given the datasets defined under "Show Example" below, the elements of `result` will be:
```
"foo", "bar", "baz", "foo", "bar", "baz", "foo", "bar", "baz"
```
Parameters
- object datasets - A list of tf.data.Dataset objects with compatible structure.
- object choice_dataset - A tf.data.Dataset of scalar tf.int64 tensors between `0` and `len(datasets) - 1`.
Returns
- object - A dataset that interleaves elements from `datasets` according to the values of `choice_dataset`.
Show Example
datasets = [tf.data.Dataset.from_tensors("foo").repeat(),
            tf.data.Dataset.from_tensors("bar").repeat(),
            tf.data.Dataset.from_tensors("baz").repeat()]

# Define a dataset containing `[0, 1, 2, 0, 1, 2, 0, 1, 2]`.
choice_dataset = tf.data.Dataset.range(3).repeat(3)
result = tf.data.experimental.choose_from_datasets(datasets, choice_dataset)
object Counter(int start, int step, ImplicitContainer<T> dtype)
object Counter(int start, IGraphNodeBase step, ImplicitContainer<T> dtype)
object Counter(IGraphNodeBase start, int step, ImplicitContainer<T> dtype)
object Counter(IGraphNodeBase start, IGraphNodeBase step, ImplicitContainer<T> dtype)
object Counter_dyn(ImplicitContainer<T> start, ImplicitContainer<T> step, ImplicitContainer<T> dtype)
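These overloads carry no description here; per the upstream Python API, `Counter` creates a `Dataset` that counts from `start` in steps of size `step`, producing scalar elements of `dtype` (default `tf.int64`). A minimal sketch of the Python-API counterpart, assuming TF 2.x eager execution:
```
import tensorflow as tf

# Counts from `start` in steps of `step`; elements are scalar tf.int64 tensors.
counter = tf.data.experimental.Counter(start=0, step=2)
for value in counter.take(4):
    print(value.numpy())  # 0, 2, 4, 6
```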
object make_batched_features_dataset(IEnumerable<object> file_pattern, int batch_size, IDictionary<string, object> features, PythonClassContainer reader, string label_key, IEnumerable<object> reader_args, Nullable<int> num_epochs, bool shuffle, int shuffle_buffer_size, Nullable<int> shuffle_seed, Nullable<int> prefetch_buffer_size, Nullable<int> reader_num_threads, Nullable<int> parser_num_threads, bool sloppy_ordering, bool drop_final_batch)
object make_batched_features_dataset(IEnumerable<object> file_pattern, int batch_size, IDictionary<string, object> features, ImplicitContainer<T> reader, string label_key, IEnumerable<object> reader_args, Nullable<int> num_epochs, bool shuffle, int shuffle_buffer_size, Nullable<int> shuffle_seed, Nullable<int> prefetch_buffer_size, Nullable<int> reader_num_threads, Nullable<int> parser_num_threads, bool sloppy_ordering, bool drop_final_batch)
object make_batched_features_dataset_dyn(object file_pattern, object batch_size, object features, ImplicitContainer<T> reader, object label_key, object reader_args, object num_epochs, ImplicitContainer<T> shuffle, ImplicitContainer<T> shuffle_buffer_size, object shuffle_seed, object prefetch_buffer_size, object reader_num_threads, object parser_num_threads, ImplicitContainer<T> sloppy_ordering, ImplicitContainer<T> drop_final_batch)
Returns a `Dataset` of feature dictionaries from `Example` protos. If the `label_key` argument is provided, returns a `Dataset` of tuples comprising feature dictionaries and labels. Example:
```
serialized_examples = [
features {
feature { key: "age" value { int64_list { value: [ 0 ] } } }
feature { key: "gender" value { bytes_list { value: [ "f" ] } } }
feature { key: "kws" value { bytes_list { value: [ "code", "art" ] } } }
},
features {
feature { key: "age" value { int64_list { value: [] } } }
feature { key: "gender" value { bytes_list { value: [ "f" ] } } }
feature { key: "kws" value { bytes_list { value: [ "sports" ] } } }
}
]
```
We can parse these with the following `features` argument:
```
features: {
"age": FixedLenFeature([], dtype=tf.int64, default_value=-1),
"gender": FixedLenFeature([], dtype=tf.string),
"kws": VarLenFeature(dtype=tf.string),
}
```
The expected output is shown under "Show Example" below.
Parameters
- object file_pattern - List of files or patterns of file paths containing `Example` records. See tf.io.gfile.glob for pattern rules.
- object batch_size - An int representing the number of records to combine in a single batch.
- object features - A `dict` mapping feature keys to `FixedLenFeature` or `VarLenFeature` values. See tf.io.parse_example.
- ImplicitContainer<T> reader - A function or class that can be called with a `filenames` tensor and (optional) `reader_args` and returns a `Dataset` of `Example` tensors. Defaults to tf.data.TFRecordDataset.
- object label_key - (Optional) A string corresponding to the key under which labels are stored in the `tf.Example` features. If provided, it must be one of the `features` keys; otherwise a `ValueError` is raised.
- object reader_args - Additional arguments to pass to the reader class.
- object num_epochs - Integer specifying the number of times to read through the dataset. If `None`, cycles through the dataset forever. Defaults to `None`.
- ImplicitContainer<T> shuffle - A boolean indicating whether the input should be shuffled. Defaults to `True`.
- ImplicitContainer<T> shuffle_buffer_size - Buffer size of the ShuffleDataset. A large capacity ensures better shuffling but increases memory usage and startup time.
- object shuffle_seed - Randomization seed to use for shuffling.
- object prefetch_buffer_size - Number of feature batches to prefetch in order to improve performance. The recommended value is the number of batches consumed per training step. Defaults to auto-tune.
- object reader_num_threads - Number of threads used to read `Example` records. If greater than 1, the results will be interleaved. Defaults to `1`.
- object parser_num_threads - Number of threads to use for parsing `Example` tensors into a dictionary of `Feature` tensors. Defaults to `2`.
- ImplicitContainer<T> sloppy_ordering - If `True`, reading performance will be improved at the cost of non-deterministic ordering. If `False`, the order of elements produced is deterministic prior to shuffling (elements are still randomized if `shuffle=True`; note that if the seed is set, the order of elements after shuffling is deterministic). Defaults to `False`.
- ImplicitContainer<T> drop_final_batch - If `True` and the batch size does not evenly divide the input dataset size, the final smaller batch will be dropped. Defaults to `False`.
Returns
- object - A dataset of `dict` elements, or a tuple of `dict` elements and label. Each `dict` maps feature keys to `Tensor` or `SparseTensor` objects.
Show Example
{ "age": [[0], [-1]], "gender": [["f"], ["f"]], "kws": SparseTensor( indices=[[0, 0], [0, 1], [1, 0]], values=["code", "art", "sports"] dense_shape=[2, 2]), }
Dataset make_csv_dataset(IEnumerable<object> file_pattern, int batch_size, IEnumerable<string> column_names, IEnumerable<IGraphNodeBase> column_defaults, object label_name, IEnumerable<object> select_columns, string field_delim, bool use_quote_delim, string na_value, bool header, Nullable<int> num_epochs, bool shuffle, int shuffle_buffer_size, object shuffle_seed, Nullable<int> prefetch_buffer_size, Nullable<int> num_parallel_reads, bool sloppy, int num_rows_for_inference, object compression_type, bool ignore_errors)
object make_csv_dataset_dyn(object file_pattern, object batch_size, object column_names, object column_defaults, object label_name, object select_columns, ImplicitContainer<T> field_delim, ImplicitContainer<T> use_quote_delim, ImplicitContainer<T> na_value, ImplicitContainer<T> header, object num_epochs, ImplicitContainer<T> shuffle, ImplicitContainer<T> shuffle_buffer_size, object shuffle_seed, object prefetch_buffer_size, object num_parallel_reads, ImplicitContainer<T> sloppy, ImplicitContainer<T> num_rows_for_inference, object compression_type, ImplicitContainer<T> ignore_errors)
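These overloads carry no description here; per the upstream Python API, `make_csv_dataset` reads CSV files into a dataset of batches, where each element is a dictionary keyed by column name (or a `(features, label)` tuple when `label_name` is set). A minimal sketch of the Python-API counterpart, assuming header rows and a hypothetical label column:
```
import tensorflow as tf

# Each element is (features_dict, label) because `label_name` is set.
dataset = tf.data.experimental.make_csv_dataset(
    file_pattern="data/train-*.csv",  # hypothetical path
    batch_size=16,
    label_name="target",              # hypothetical label column
    num_epochs=1)

for features, label in dataset.take(1):
    print(features.keys(), label.shape)
```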
object sample_from_datasets(IEnumerable<object> datasets, IGraphNodeBase weights, Nullable<int> seed)
object sample_from_datasets(IEnumerable<object> datasets, IEnumerable<double> weights, Nullable<int> seed)
Samples elements at random from the datasets in `datasets`.
Parameters
- IEnumerable<object> datasets - A list of tf.data.Dataset objects with compatible structure.
- IEnumerable<double> weights - (Optional.) A list of `len(datasets)` floating-point values where `weights[i]` represents the probability with which an element should be sampled from `datasets[i]`, or a tf.data.Dataset object where each element is such a list. Defaults to a uniform distribution across `datasets`.
- Nullable<int> seed - (Optional.) A tf.int64 scalar tf.Tensor, representing the random seed that will be used to create the distribution. See `tf.compat.v1.set_random_seed` for behavior.
Returns
- object - A dataset that interleaves elements from `datasets` at random, according to `weights` if provided, otherwise with uniform probability.
object sample_from_datasets(IEnumerable<object> datasets, ndarray weights, Nullable<int> seed)
object sample_from_datasets(IEnumerable<object> datasets, Dataset weights, Nullable<int> seed)
object sample_from_datasets_dyn(object datasets, object weights, object seed)
Samples elements at random from the datasets in `datasets`.
Parameters
- object datasets - A list of tf.data.Dataset objects with compatible structure.
- object weights - (Optional.) A list of `len(datasets)` floating-point values where `weights[i]` represents the probability with which an element should be sampled from `datasets[i]`, or a tf.data.Dataset object where each element is such a list. Defaults to a uniform distribution across `datasets`.
- object seed - (Optional.) A tf.int64 scalar tf.Tensor, representing the random seed that will be used to create the distribution. See `tf.compat.v1.set_random_seed` for behavior.
Returns
- object - A dataset that interleaves elements from `datasets` at random, according to `weights` if provided, otherwise with uniform probability.
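A minimal usage sketch of the Python-API counterpart, mixing two infinite datasets with 80/20 weights (values chosen for illustration):
```
import tensorflow as tf

ds_a = tf.data.Dataset.from_tensors("a").repeat()
ds_b = tf.data.Dataset.from_tensors("b").repeat()

# Roughly 80% of elements come from `ds_a`, 20% from `ds_b`.
mixed = tf.data.experimental.sample_from_datasets(
    [ds_a, ds_b], weights=[0.8, 0.2], seed=42)

for elem in mixed.take(10):
    print(elem.numpy().decode())
```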