rccl
RCCL (pronounced "Rickle") is a stand-alone library of standard
collective communication routines for GPUs, implementing all-reduce,
all-gather, reduce, broadcast, reduce-scatter, gather, scatter, and
all-to-all. There is also initial support for direct GPU-to-GPU
send and receive operations. It has been optimized to achieve high
bandwidth on platforms using PCIe and xGMI, as well as over networks using
InfiniBand Verbs or TCP/IP sockets. RCCL supports an arbitrary
number of GPUs installed in a single node or multiple nodes, and
can be used in either single- or multi-process (e.g., MPI)
applications.
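As a rough illustration of what several of the collectives named above compute, the following sketch models each rank's buffer as a plain Python list. It assumes the usual MPI/NCCL-style meanings with sum as the reduction; the function names are illustrative only and are not RCCL's API, and nothing here touches a GPU.

```python
from typing import List

Buffers = List[List[int]]  # buffers[r] is the data held by rank r (hypothetical model)


def all_reduce(inputs: Buffers) -> Buffers:
    """Every rank ends up with the element-wise sum of all ranks' buffers."""
    total = [sum(column) for column in zip(*inputs)]
    return [list(total) for _ in inputs]


def all_gather(inputs: Buffers) -> Buffers:
    """Every rank ends up with the concatenation of all ranks' buffers."""
    gathered = [x for buf in inputs for x in buf]
    return [list(gathered) for _ in inputs]


def reduce_scatter(inputs: Buffers) -> Buffers:
    """The element-wise sum is split into equal chunks; rank r keeps chunk r."""
    n = len(inputs)
    total = [sum(column) for column in zip(*inputs)]
    chunk = len(total) // n
    return [total[r * chunk:(r + 1) * chunk] for r in range(n)]


def broadcast(inputs: Buffers, root: int) -> Buffers:
    """Every rank ends up with a copy of the root rank's buffer."""
    return [list(inputs[root]) for _ in inputs]


def all_to_all(inputs: Buffers) -> Buffers:
    """Rank r sends its j-th chunk to rank j; each rank concatenates what it receives."""
    n = len(inputs)
    chunk = len(inputs[0]) // n
    return [
        [x for src in inputs for x in src[dst * chunk:(dst + 1) * chunk]]
        for dst in range(n)
    ]
```

With two ranks holding `[1, 2]` and `[3, 4]`, for example, `all_reduce` leaves both ranks with `[4, 6]`, while `reduce_scatter` leaves rank 0 with `[4]` and rank 1 with `[6]`.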
The collective operations are implemented using ring and tree
algorithms and have been optimized for throughput and latency. For
best performance, small operations can be either batched into
larger operations or aggregated through the API.
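To make the ring idea concrete, here is a minimal CPU-only simulation of a textbook ring all-reduce (a reduce-scatter phase followed by an all-gather phase). It is a sketch under stated assumptions, not RCCL's implementation: ranks are modelled as Python lists, the reduction is a sum, the buffer length is assumed to be divisible by the number of ranks, and the name `ring_all_reduce` is hypothetical.

```python
from typing import List


def ring_all_reduce(buffers: List[List[int]]) -> List[List[int]]:
    """Simulate a sum all-reduce over ranks connected in a ring (rank r sends to r+1)."""
    n = len(buffers)
    length = len(buffers[0])
    if any(len(b) != length for b in buffers):
        raise ValueError("all ranks must hold buffers of the same length")
    if length % n != 0:
        raise ValueError("this sketch assumes the length is divisible by the rank count")

    chunk = length // n
    data = [list(b) for b in buffers]  # work on copies; inputs stay untouched

    def span(c: int) -> range:
        """Index range covered by chunk c."""
        return range(c * chunk, (c + 1) * chunk)

    # Phase 1, reduce-scatter: in step s, rank r sends chunk (r - s) mod n to its
    # right neighbour, which adds it to its own copy. After n - 1 steps, rank r
    # holds the fully reduced chunk (r + 1) mod n.
    for s in range(n - 1):
        # Snapshot outgoing chunks first so all transfers in a step are simultaneous.
        outgoing = [[data[r][i] for i in span((r - s) % n)] for r in range(n)]
        for r in range(n):
            dst = (r + 1) % n
            for k, i in enumerate(span((r - s) % n)):
                data[dst][i] += outgoing[r][k]

    # Phase 2, all-gather: in step s, rank r forwards the reduced chunk
    # (r + 1 - s) mod n to its right neighbour, which overwrites its copy.
    for s in range(n - 1):
        outgoing = [[data[r][i] for i in span((r + 1 - s) % n)] for r in range(n)]
        for r in range(n):
            dst = (r + 1) % n
            for k, i in enumerate(span((r + 1 - s) % n)):
                data[dst][i] = outgoing[r][k]

    return data
```

In this scheme each rank sends 2(n - 1) chunks of size length/n in total, so the per-rank traffic stays close to twice the buffer size regardless of the number of ranks. That property is why ring schedules are generally associated with high bandwidth on large messages, while tree schedules need fewer sequential steps and so tend to help latency on small ones.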
Checkout Package
osc -A https://api.opensuse.org checkout home:alveus:amd:rocm/rccl && cd $_
Source Files
Filename | Size | Changed |
---|---|---|
RCCL-6.4.2.tar.gz | 1.81 MB | |
_constraints | 322 Bytes | |
rccl.spec | 8.65 KB | |