Update comments in data parallel example to use sampler #7914

Open

JackCaoG wants to merge 1 commit into master
Conversation

JackCaoG (Collaborator) commented Aug 26, 2024

fix #7904

@JackCaoG marked this pull request as ready for review August 26, 2024 20:10
if xr.world_size() > 1:
  train_sampler = torch.utils.data.distributed.DistributedSampler(
      train_dataset,
      num_replicas=xr.world_size(),

Collaborator:

nit: dist.world_size

  train_sampler = torch.utils.data.distributed.DistributedSampler(
      train_dataset,
      num_replicas=xr.world_size(),
      rank=xr.global_ordinal(),

Collaborator:

nit: dist.get_rank
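
A sketch of the variant these two nits point toward, using torch.distributed accessors in place of the torch_xla runtime helpers (the actual torch.distributed names are dist.get_world_size() and dist.get_rank()). It assumes dist.init_process_group() has already been called somewhere in the example, which is not shown in this diff; the surrounding variable names are the same as in the fragments above.

import torch
import torch.distributed as dist

# Same sampler construction, but sized and ranked via torch.distributed so
# the example reads like upstream PyTorch code.
train_sampler = None
if dist.is_initialized() and dist.get_world_size() > 1:
  train_sampler = torch.utils.data.distributed.DistributedSampler(
      train_dataset,
      num_replicas=dist.get_world_size(),
      rank=dist.get_rank())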

# want each process to handle different parts of the data.
'''
train_sampler = None
if xr.world_size() > 1:

Collaborator:

also dist.world_size
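
Piecing the fragments above together, a minimal sketch of the sampler block the PR adds as commented-out guidance, shown here uncommented and with an illustrative DataLoader hookup; the batch size and the loader arguments are assumptions, not part of the diff.

import torch
import torch_xla.runtime as xr

# Shard the dataset across replicas when running with more than one process.
train_sampler = None
if xr.world_size() > 1:
  train_sampler = torch.utils.data.distributed.DistributedSampler(
      train_dataset,
      num_replicas=xr.world_size(),
      rank=xr.global_ordinal())

# shuffle and sampler are mutually exclusive in DataLoader, so only let the
# loader shuffle when no sampler is supplied.
train_loader = torch.utils.data.DataLoader(
    train_dataset,
    batch_size=128,
    sampler=train_sampler,
    shuffle=(train_sampler is None))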

# below code is commented out because in this example we used a fake data
# loader that does not take sampler. However this logic is needed if you
# want each process to handle different parts of the data.
'''

Collaborator:

Is it clearer to just apply this sampler to the fake dataset anyway?


JackCaoG (Author):

The fake dataset is an XLA util (https://github.com/pytorch/xla/blob/master/examples/train_resnet_base.py#L25-L29); it does not take a sampler.


Collaborator:

Oh, I just kind of assumed that SampleGenerator was an idiomatic Dataset. Can you try just making it inherit from IterableDataset since it has __iter__ and __len__ already? You should then be able to wrap it in a standard sampler and data loader.

If that doesn't work, then one of us can follow up. Our examples should be as close to PyTorch as possible.
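
A possible follow-up, sketched as an assumption rather than taken from the PR: DataLoader rejects a sampler when the dataset is an IterableDataset, so one simple route is a small map-style stand-in for the fake data, which the sampler logic above can then shard directly. The class name, shapes, and sizes below are illustrative.

import torch
from torch.utils.data import DataLoader, Dataset
from torch.utils.data.distributed import DistributedSampler
import torch_xla.runtime as xr


class FakeImageDataset(Dataset):
  """Map-style stand-in for SampleGenerator: random image/label pairs."""

  def __init__(self, sample_count, img_dim=224, num_classes=1000):
    self.sample_count = sample_count
    self.img_dim = img_dim
    self.num_classes = num_classes

  def __len__(self):
    return self.sample_count

  def __getitem__(self, index):
    image = torch.randn(3, self.img_dim, self.img_dim)
    label = torch.randint(0, self.num_classes, (), dtype=torch.int64)
    return image, label


train_dataset = FakeImageDataset(sample_count=1024)
train_sampler = None
if xr.world_size() > 1:
  train_sampler = DistributedSampler(
      train_dataset,
      num_replicas=xr.world_size(),
      rank=xr.global_ordinal())
train_loader = DataLoader(train_dataset, batch_size=128, sampler=train_sampler)

With this in place, each replica iterates over a disjoint 1/world_size slice of the indices per epoch, which is the sharding behaviour the linked issue is about.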

Development

Successfully merging this pull request may close these issues.

torch_xla.distributed.parallel_loader doesn't shard data
2 participants