shuffle_dataset
- opennmt.data.shuffle_dataset(buffer_size, shuffle_shards=True, dataset_size=None)
Transformation that shuffles the dataset based on its size.
Example
>>> dataset = tf.data.Dataset.range(6)
>>> dataset = dataset.apply(opennmt.data.shuffle_dataset(3))
>>> list(dataset.as_numpy_iterator())
[2, 3, 1, 0, 4, 5]
- Parameters
buffer_size – The number of elements from which to sample.
shuffle_shards – When buffer_size is smaller than the dataset size, the dataset is first sharded in a random order to add another level of shuffling.
dataset_size – If the dataset size is already known, it can be passed here to avoid a slower generic computation of the dataset size later.
- Returns
A tf.data.Dataset transformation.
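To make the role of buffer_size more concrete, the following is a hypothetical pure-Python sketch of buffer-based sampling. It is not the OpenNMT implementation: it assumes buffer_size acts like the shuffle buffer of tf.data.Dataset.shuffle, and the function name buffered_shuffle and its seed argument are made up for illustration.

```python
import random
from typing import Iterable, Iterator, List, Optional, TypeVar

T = TypeVar("T")


def buffered_shuffle(
    items: Iterable[T], buffer_size: int, seed: Optional[int] = None
) -> Iterator[T]:
    """Illustrative shuffle buffer (hypothetical helper, not an OpenNMT API)."""
    if buffer_size < 1:
        raise ValueError("buffer_size must be at least 1")
    rng = random.Random(seed)
    buffer: List[T] = []
    for item in items:
        if len(buffer) < buffer_size:
            # Fill the buffer before emitting anything.
            buffer.append(item)
            continue
        # Buffer is full: emit a random element and put the new item in its slot.
        index = rng.randrange(buffer_size)
        yield buffer[index]
        buffer[index] = item
    # Input exhausted: emit the remaining buffered elements in random order.
    rng.shuffle(buffer)
    yield from buffer


if __name__ == "__main__":
    # The order varies between runs unless a seed is given.
    print(list(buffered_shuffle(range(6), buffer_size=3)))
```

In this sketch an element can only be emitted once it has entered the buffer, so a buffer much smaller than the dataset yields only local shuffling. That is the situation in which the additional shard-level shuffling described for shuffle_shards is useful.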