Type tf.keras.datasets.reuters
Namespace tensorflow
Public static methods
ValueTuple<object, object> load_data(string path, object num_words, int skip_top, object maxlen, double test_split, int seed, int start_char, int oov_char, int index_from, IDictionary<string, object> kwargs)
Loads the Reuters newswire classification dataset.
Parameters
- string path - where to cache the data (relative to `~/.keras/dataset`).
- object num_words - max number of words to include. Words are ranked by how often they occur (in the training set) and only the most frequent words are kept.
- int skip_top - skip the top N most frequently occurring words (which may not be informative).
- object maxlen - truncate sequences after this length.
- double test_split - fraction of the dataset to be used as test data.
- int seed - random seed for sample shuffling.
- int start_char - the start of a sequence will be marked with this character. Set to 1 because 0 is usually the padding character.
- int oov_char - words that were cut out because of the `num_words` or `skip_top` limit will be replaced with this character.
- int index_from - index actual words with this index and higher.
- IDictionary<string, object> kwargs - used for backwards compatibility.
Returns
- ValueTuple<object, object> - Tuple of Numpy arrays: `(x_train, y_train), (x_test, y_test)`. Note that the 'out of vocabulary' character is only used for words that were present in the training set but were excluded because they did not make the `num_words` cut. Words that appear in the test set but were never seen in the training set have simply been skipped.
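The sketch below shows one way this overload might be called. It is illustrative only: the `using tensorflow;` import and the static `tf` entry point are assumptions based on the namespace shown on this page, and the argument values mirror the upstream Keras defaults rather than anything documented here.

```csharp
// Minimal usage sketch (not from the binding's own samples). Assumes the
// `tensorflow` namespace exposes a static `tf` entry point as implied above;
// argument values follow the upstream Keras defaults and may need adjusting.
using System.Collections.Generic;
using tensorflow;

static class ReutersLoadExample
{
    static void Main()
    {
        // This overload has no optional parameters, so every argument is passed explicitly.
        var (train, test) = tf.keras.datasets.reuters.load_data(
            path: "reuters.npz",   // cached under ~/.keras/dataset
            num_words: null,       // keep the full vocabulary
            skip_top: 0,           // do not drop the most frequent words
            maxlen: null,          // no truncation
            test_split: 0.2,       // 20% of the samples form the test set
            seed: 113,             // upstream Keras default shuffling seed
            start_char: 1,         // index marking the start of each sequence
            oov_char: 2,           // index substituted for out-of-vocabulary words
            index_from: 3,         // real word indices start at this offset
            kwargs: null);         // backwards-compatibility arguments, none needed

        // `train` and `test` each hold an (x, y) pair of NumPy arrays.
    }
}
```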
object load_data_dyn(ImplicitContainer<T> path, object num_words, ImplicitContainer<T> skip_top, object maxlen, ImplicitContainer<T> test_split, ImplicitContainer<T> seed, ImplicitContainer<T> start_char, ImplicitContainer<T> oov_char, ImplicitContainer<T> index_from, IDictionary<string, object> kwargs)
Loads the Reuters newswire classification dataset.
Parameters
- ImplicitContainer<T> path - where to cache the data (relative to `~/.keras/dataset`).
- object num_words - max number of words to include. Words are ranked by how often they occur (in the training set) and only the most frequent words are kept.
- ImplicitContainer<T> skip_top - skip the top N most frequently occurring words (which may not be informative).
- object maxlen - truncate sequences after this length.
- ImplicitContainer<T> test_split - fraction of the dataset to be used as test data.
- ImplicitContainer<T> seed - random seed for sample shuffling.
- ImplicitContainer<T> start_char - the start of a sequence will be marked with this character. Set to 1 because 0 is usually the padding character.
- ImplicitContainer<T> oov_char - words that were cut out because of the `num_words` or `skip_top` limit will be replaced with this character.
- ImplicitContainer<T> index_from - index actual words with this index and higher.
- IDictionary<string, object> kwargs - used for backwards compatibility.
Returns
- object - Tuple of Numpy arrays: `(x_train, y_train), (x_test, y_test)`. Note that the 'out of vocabulary' character is only used for words that were present in the training set but were excluded because they did not make the `num_words` cut. Words that appear in the test set but were never seen in the training set have simply been skipped.
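The `start_char`, `oov_char`, and `index_from` conventions matter when turning a returned sequence back into words. The sketch below is purely illustrative: it assumes the common Keras values (start_char 1, oov_char 2, index_from 3) and a hypothetical rank-to-word dictionary, since this page does not document a word-index lookup.

```csharp
// Illustration only (not part of the binding): undoing the start_char /
// oov_char / index_from convention when decoding an encoded newswire.
// `indexToWord` maps a word's frequency rank to the word itself; how to
// obtain such a dictionary is not covered by this page.
using System;
using System.Collections.Generic;

static class ReutersDecodeSketch
{
    const int StartChar = 1;  // marks the beginning of a sequence
    const int OovChar = 2;    // stands in for out-of-vocabulary words
    const int IndexFrom = 3;  // real word indices are shifted up by this amount

    static string Decode(IEnumerable<int> sequence, IReadOnlyDictionary<int, string> indexToWord)
    {
        var words = new List<string>();
        foreach (int index in sequence)
        {
            if (index == StartChar) continue;                    // skip the sequence marker
            if (index == OovChar) { words.Add("<unk>"); continue; }
            int rank = index - IndexFrom;                        // undo the index_from offset
            words.Add(indexToWord.TryGetValue(rank, out var w) ? w : "<unk>");
        }
        return string.Join(" ", words);
    }

    static void Main()
    {
        // Toy vocabulary keyed by frequency rank (1 = most frequent), purely illustrative.
        var indexToWord = new Dictionary<int, string> { { 1, "the" }, { 2, "of" }, { 3, "said" } };
        // Encoded sample: start marker, "the" (1 + 3), an OOV token, "said" (3 + 3).
        Console.WriteLine(Decode(new[] { 1, 4, 2, 6 }, indexToWord));  // prints: the <unk> said
    }
}
```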