Non-blocking collectives (MPI)
Non-blocking point-to-point communication lets a process post a send or receive and continue computing while the transfer happens in the background (see blocking and non-blocking). Non-blocking collectives extend the same idea to collective operations. MPI-3 introduced variants of the collectives that return immediately with a request handle instead of blocking until the operation completes. MPI_Ibcast, MPI_Ireduce, MPI_Iallreduce, MPI_Iscatter, and others follow the same request/wait pattern as MPI_Isend and MPI_Irecv. Until the request completes in MPI_Wait (or in an MPI_Test call that reports completion), the send buffer must not be modified and the receive buffer must not be read or written.
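double local = compute_first_part();
double global;
MPI_Request req;
/* Start the reduction; the call returns immediately. */
MPI_Iallreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD, &req);
do_independent_work();   /* must not touch local or global */
MPI_Wait(&req, MPI_STATUS_IGNORE);
use(global);             /* global is valid only after MPI_Wait */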
Not every implementation overlaps the collective with the intervening computation; some make little or no progress in the background and effectively do the work inside MPI_Wait. On implementations that use hardware-assisted collectives (common on high-end interconnects such as InfiniBand), the overlap can hide significant latency in reduction-heavy algorithms.
