Skip to content

Add shuffle_after_epoch_seed argument to file-based readers.#6236

Merged
JanuszL merged 2 commits intoNVIDIA:mainfrom
JanuszL:add_global_shufling_seed
Mar 5, 2026
Merged

Add shuffle_after_epoch_seed argument to file-based readers.#6236
JanuszL merged 2 commits intoNVIDIA:mainfrom
JanuszL:add_global_shufling_seed

Conversation

@JanuszL
Copy link
Contributor

@JanuszL JanuszL commented Feb 27, 2026

  • Adds an optional shuffle_after_epoch_seed (int32) argument to all
    readers that support shuffle_after_epoch: readers.file, readers.numpy,
    readers.fits, readers.coco, and readers.nemo_asr.
  • Previously the per-epoch shuffle was always seeded with a fixed
    constant, making the dataset order identical across training runs.
    The new argument lets users supply a custom base seed so that
    different training runs produce statistically independent orderings,
    while still guaranteeing a consistent global permutation across shards.
  • When omitted, the old fixed seed is used, preserving backward compatibility.
  • Tests added in test_shuffling.py and test_numpy.py.

Category:

New feature (non-breaking change which adds functionality)
Relates to #5827

Description:

  • Adds an optional shuffle_after_epoch_seed (int64) argument to all
    readers that support shuffle_after_epoch: readers.file, readers.numpy,
    readers.fits, readers.coco, and readers.nemo_asr.
  • Previously the per-epoch shuffle was always seeded with a fixed
    constant, making the dataset order identical across training runs.
    The new argument lets users supply a custom base seed so that
    different training runs produce statistically independent orderings,
    while still guaranteeing a consistent global permutation across shards.
  • When omitted, the old fixed seed is used, preserving backward compatibility.
  • Tests added in test_shuffling.py and test_numpy.py.

Additional information:

Affected modules and functionalities:

  • readers.file, readers.numpy, readers.fits, readers.coco, and readers.nemo_asr

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
      • test_shuffling.py and test_numpy.py extended
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

@JanuszL
Copy link
Contributor Author

JanuszL commented Feb 27, 2026

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [44956153]: BUILD FAILED

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Feb 27, 2026

Greptile Summary

This PR adds configurable per-epoch shuffle seeding to file-based readers, addressing the limitation where all training runs used identical shuffle patterns. The implementation consistently adds an optional shuffle_after_epoch_seed (int32_t) argument across five reader types: file, numpy, fits, coco, and nemo_asr.

Key changes:

  • Replaced hardcoded kDaliDataloaderSeed with configurable shuffle_after_epoch_seed that defaults to the same constant for backward compatibility
  • Added proper validation warning when seed is specified but shuffle_after_epoch is disabled
  • Comprehensive documentation explains multi-GPU training considerations and proper usage
  • Thorough test coverage validates reproducibility, different seed behavior, and backward compatibility

Assessment:
The implementation is clean, consistent across all affected readers, and properly addresses previous review feedback regarding int32_t type usage. The changes maintain backward compatibility while enabling the requested functionality for statistically independent training runs.

Confidence Score: 5/5

  • This PR is safe to merge with no identified issues
  • Implementation is consistent across all readers, previous review concerns have been addressed (int32_t type), comprehensive tests validate all scenarios, and backward compatibility is preserved
  • No files require special attention

Important Files Changed

Filename Overview
dali/operators/reader/loader/file_loader.h Adds shuffle_after_epoch_seed argument support with proper default handling and warning for misuse
dali/operators/reader/loader/file_label_loader.h Mirrors file_loader.h changes, adds shuffle seed support with consistent implementation
dali/operators/reader/loader/nemo_asr_loader.h Adds shuffle seed with lambda initialization pattern, includes warning for misuse
dali/test/python/reader/test_shuffling.py Adds comprehensive tests for seed reproducibility, different seeds, and backward compatibility
dali/test/python/reader/test_numpy.py Adds numpy-specific shuffle seed tests covering all critical scenarios

Last reviewed commit: 4445fea

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

11 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

@JanuszL JanuszL force-pushed the add_global_shufling_seed branch from c2311b0 to 7c24cd6 Compare February 27, 2026 09:23
* Adds an optional `shuffle_after_epoch_seed` (int32) argument to all
  readers that support `shuffle_after_epoch`: readers.file, readers.numpy,
  readers.fits, readers.coco, and readers.nemo_asr.
* Previously the per-epoch shuffle was always seeded with a fixed
  constant, making the dataset order identical across training runs.
  The new argument lets users supply a custom base seed so that
  different training runs produce statistically independent orderings,
  while still guaranteeing a consistent global permutation across shards.
* When omitted, the old fixed seed is used, preserving backward compatibility.
* Tests added in test_shuffling.py and test_numpy.py.

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@JanuszL JanuszL force-pushed the add_global_shufling_seed branch from 7c24cd6 to c616299 Compare February 27, 2026 09:34
@JanuszL
Copy link
Contributor Author

JanuszL commented Feb 27, 2026

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [44958231]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [44958231]: BUILD PASSED

current_index_(0),
current_epoch_(0) {
int32_t seed_arg = kDaliDataloaderSeed;
spec.TryGetArgument(seed_arg, "shuffle_after_epoch_seed");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Open question: should we raise a warning, if shuffle_after_epoch_seed is set when shuffle_after_epoch=False?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Fixed


.. note::
When using multiple DALI pipelines (e.g., for multi-GPU training), all pipeline
instances should use the same ``shuffle_after_epoch_seed`` to ensure a consistent
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: I believe that with the paramlinks you can use single quote and it'll link to the parameter

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@JanuszL JanuszL force-pushed the add_global_shufling_seed branch from 2f8b0c0 to 5bcb802 Compare February 27, 2026 12:30
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@JanuszL JanuszL force-pushed the add_global_shufling_seed branch from 5bcb802 to 4445fea Compare February 27, 2026 12:30
@JanuszL
Copy link
Contributor Author

JanuszL commented Feb 27, 2026

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [44966443]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [44966443]: BUILD PASSED

@JanuszL JanuszL merged commit ef8bd5c into NVIDIA:main Mar 5, 2026
7 checks passed
@JanuszL JanuszL deleted the add_global_shufling_seed branch March 5, 2026 08:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants