Hi,
I am trying to compare NCCL vs. NCCLX performance. However, I am unable to use both in the same script; I have to run them one after the other.
Is this because NCCLX uses the same library names as NCCL, which leads to conflicts?
This also causes problems in other cases, e.g. when I want to use NCCLX collectives in existing infrastructure that uses torch.distributed with the NCCL backend for other distributed operations.
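
To check the library-name theory, a small diagnostic like the one below could list which NCCL-like shared libraries are actually mapped into the Python process. This is only a sketch: it is Linux-only (it reads /proc/self/maps), the helper names are made up for illustration, and it assumes the NCCLX shared library has "nccl" somewhere in its file name, which I have not verified.

```python
import os


def loaded_libs_matching(maps_text, needle="nccl"):
    """Return the sorted, unique file paths in /proc/<pid>/maps text
    whose file name contains `needle` (case-insensitive)."""
    paths = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, [path]
        parts = line.split(None, 5)
        if len(parts) == 6 and needle in os.path.basename(parts[5]).lower():
            paths.add(parts[5])
    return sorted(paths)


def loaded_libs_in_current_process(needle="nccl"):
    """Linux only: inspect the current process. Returns [] elsewhere."""
    maps_path = "/proc/self/maps"
    if not os.path.exists(maps_path):
        return []
    with open(maps_path) as f:
        return loaded_libs_matching(f.read(), needle)
```

Calling `loaded_libs_in_current_process()` after both backends have been imported or initialized would show whether one library file or two were loaded, and from which paths.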