Deadlock (MPI)

In the classic multithreaded deadlock, thread A holds mutex 1 and waits for mutex 2, while thread B holds mutex 2 and waits for mutex 1. Neither can proceed because each is blocked on a resource the other holds. MPI deadlocks follow the same pattern, with blocking sends taking the place of mutex acquires. If process 0 calls MPI_Send to process 1 while process 1 calls MPI_Send to process 0, and the implementation does not buffer the messages, each send blocks until the peer posts a matching MPI_Recv. Neither process ever reaches its MPI_Recv, because each is still stuck inside its own send.

// deadlock when the message is too large to be buffered:
// both processes block in MPI_Send, so neither reaches its MPI_Recv
MPI_Send(buf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD);
MPI_Recv(buf, N, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

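The standard fixes are: use MPI_Sendrecv, which combines the send and the receive in one call so the library can progress both without deadlocking; post a non-blocking MPI_Isend first, then MPI_Recv, then MPI_Wait; or alternate send/receive order by rank so even-rank processes send first and odd-rank processes receive first. MPI_Sendrecv and the MPI_Isend variant need separate send and receive buffers (MPI_Sendrecv_replace exists for the single-buffer case), and the even/odd ordering only helps when the two peers have ranks of opposite parity. The MPI standard allows MPI_Send either to buffer the message and return or to block until a matching receive is posted. Most implementations buffer small messages and block on large ones, so code that relies on buffering is non-portable: it can work for small N on one machine and silently deadlock for larger N or on a different implementation.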