Type tf.keras.datasets.reuters
Namespace tensorflow
Public static methods
ValueTuple<object, object> load_data(string path, object num_words, int skip_top, object maxlen, double test_split, int seed, int start_char, int oov_char, int index_from, IDictionary<string, object> kwargs)
Loads the Reuters newswire classification dataset.
Parameters
-
string
path - Where to cache the data (relative to `~/.keras/dataset`).
-
object
num_words - Maximum number of words to include. Words are ranked by how often they occur (in the training set), and only the `num_words` most frequent words are kept.
-
int
skip_top - Skip the top N most frequently occurring words (which may not be informative).
-
object
maxlen - Truncate sequences after this length.
-
double
test_split - Fraction of the dataset to be used as test data.
-
int
seed - Random seed for sample shuffling.
-
int
start_char - The start of a sequence will be marked with this character. Set to 1 because 0 is usually the padding character.
-
int
oov_char - Words that were cut out because of the `num_words` or `skip_top` limit will be replaced with this character.
-
int
index_from - Index actual words with this index and higher.
-
IDictionary<string, object>
kwargs - Used for backwards compatibility.
Returns
-
ValueTuple<object, object>
- Tuple of NumPy arrays: `(x_train, y_train), (x_test, y_test)`. Note that the 'out of vocabulary' character is only used for words that were present in the training set but excluded because they did not make the `num_words` cut. Words that were not seen in the training set but appear in the test set have simply been skipped.
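A minimal usage sketch (C#), assuming the method is reachable as `tf.keras.datasets.reuters.load_data` from the `tensorflow` namespace listed above and that the two tuple items wrap `(x_train, y_train)` and `(x_test, y_test)`; the argument values shown are the usual Keras defaults, not documented defaults of this overload.

```csharp
using tensorflow;  // assumption: namespace as listed above

// Hedged sketch: every parameter is passed explicitly since this overload
// declares no optional arguments; the values mirror the usual Keras defaults.
var data = tf.keras.datasets.reuters.load_data(
    path: "reuters.npz",
    num_words: 10000,   // keep only the 10,000 most frequent words
    skip_top: 0,        // do not skip any of the most frequent words
    maxlen: null,       // no truncation
    test_split: 0.2,    // 20% of the samples become test data
    seed: 113,
    start_char: 1,
    oov_char: 2,
    index_from: 3,
    kwargs: null);

var train = data.Item1;  // assumed to hold (x_train, y_train) as NumPy arrays
var test  = data.Item2;  // assumed to hold (x_test, y_test) as NumPy arrays
```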
object load_data_dyn(ImplicitContainer<T> path, object num_words, ImplicitContainer<T> skip_top, object maxlen, ImplicitContainer<T> test_split, ImplicitContainer<T> seed, ImplicitContainer<T> start_char, ImplicitContainer<T> oov_char, ImplicitContainer<T> index_from, IDictionary<string, object> kwargs)
Loads the Reuters newswire classification dataset.
Parameters
-
ImplicitContainer<T>
path - Where to cache the data (relative to `~/.keras/dataset`).
-
object
num_words - Maximum number of words to include. Words are ranked by how often they occur (in the training set), and only the `num_words` most frequent words are kept.
-
ImplicitContainer<T>
skip_top - Skip the top N most frequently occurring words (which may not be informative).
-
object
maxlen - Truncate sequences after this length.
-
ImplicitContainer<T>
test_split - Fraction of the dataset to be used as test data.
-
ImplicitContainer<T>
seed - Random seed for sample shuffling.
-
ImplicitContainer<T>
start_char - The start of a sequence will be marked with this character. Set to 1 because 0 is usually the padding character.
-
ImplicitContainer<T>
oov_char - Words that were cut out because of the `num_words` or `skip_top` limit will be replaced with this character.
-
ImplicitContainer<T>
index_from - Index actual words with this index and higher.
-
IDictionary<string, object>
kwargs - Used for backwards compatibility.
Returns
-
object
- Tuple of NumPy arrays: `(x_train, y_train), (x_test, y_test)`. Note that the 'out of vocabulary' character is only used for words that were present in the training set but excluded because they did not make the `num_words` cut. Words that were not seen in the training set but appear in the test set have simply been skipped.
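The dynamic variant can be called in the same way; a minimal sketch, assuming the `ImplicitContainer<T>` parameters accept plain values through implicit conversion and that the returned `object` carries the same nested tuple as above.

```csharp
// Hedged sketch: same Keras-style values as above; the ImplicitContainer<T>
// parameters are assumed to convert implicitly from plain .NET values.
object result = tf.keras.datasets.reuters.load_data_dyn(
    path: "reuters.npz",
    num_words: 10000,
    skip_top: 0,
    maxlen: null,
    test_split: 0.2,
    seed: 113,
    start_char: 1,
    oov_char: 2,
    index_from: 3,
    kwargs: null);
```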