# Thread affinity (OpenMP)

On a laptop or single-socket desktop, every core typically reaches the same RAM at the same speed. On a multi-socket server, each socket has its own bank of RAM (NUMA, non-uniform memory access): accessing local memory is fast, but crossing the inter-socket interconnect to reach another socket's RAM can cost roughly 2–3× more, depending on the hardware. By default, the OS is free to migrate threads between cores and sockets, silently moving a thread away from the memory it initialised. **Thread affinity** pins threads to specific hardware locations to prevent migration and keep threads close to their data.

```c
#pragma omp parallel proc_bind(spread) num_threads(8)
{
    // threads are spread evenly across the available places
    // (sockets, if OMP_PLACES=sockets); good for bandwidth-bound work
}

#pragma omp parallel proc_bind(close)
{
    // threads are packed near the master thread; good for latency-sensitive
    // work that shares a last-level cache
}
```

The `proc_bind` policies are:

- `spread`: distribute threads as evenly as possible across the available places (sockets, cores, or hardware threads, depending on `OMP_PLACES`); maximises total memory bandwidth
- `close`: place threads as close together as possible relative to the master thread; maximises shared cache reuse
- `master`: place all threads on the same place as the master thread (OpenMP 5.1 renames this policy to `primary` and deprecates `master`)

`OMP_PLACES` defines what a "place" is: `sockets`, `cores`, or `threads` (hardware threads / hyperthreads). If it is unset, the default is implementation-defined, so set it explicitly when placement matters. `OMP_PROC_BIND` sets the default policy for all parallel regions that do not specify `proc_bind` explicitly; if it is set to `false`, binding is disabled and `proc_bind` clauses are ignored.

On a single-socket desktop the difference is negligible; on a dual- or quad-socket HPC node it can determine whether bandwidth scales or saturates at a fraction of peak.
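
Binding only helps if the data actually lives near the thread that uses it. On Linux, a page is normally placed on the NUMA node of the thread that first writes it, so a common pattern is to initialise an array with the same loop schedule and binding policy as the loop that later reads it. The following is a minimal sketch of that pattern, not a tuned implementation. The helper names `report_places` and `first_touch_sum` are hypothetical, and the code assumes an OpenMP 4.5 or later runtime for `omp_get_place_num` and `omp_get_num_places`; it falls back to serial execution when built without OpenMP.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: print which place each thread is bound to.
   Returns the size of the thread team (1 when built without OpenMP). */
int report_places(void)
{
    int team = 1;
#ifdef _OPENMP
    #pragma omp parallel proc_bind(close)
    {
        #pragma omp single
        team = omp_get_num_threads();

        /* omp_get_place_num() returns -1 if the thread is not bound. */
        #pragma omp critical
        printf("thread %d of %d runs on place %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads(),
               omp_get_place_num(), omp_get_num_places());
    }
#else
    printf("built without OpenMP: one thread, no places\n");
#endif
    return team;
}

/* Hypothetical helper: allocate n doubles, initialise them in parallel so
   each page is first touched by the thread that later reads it, then sum.
   Returns -1.0 if the allocation fails. */
double first_touch_sum(size_t n)
{
    double *a = malloc(n * sizeof *a);
    if (a == NULL)
        return -1.0;

    /* First touch: same static schedule and binding as the compute loop. */
    #pragma omp parallel for schedule(static) proc_bind(spread)
    for (size_t i = 0; i < n; i++)
        a[i] = 1.0;

    double sum = 0.0;
    #pragma omp parallel for schedule(static) proc_bind(spread) reduction(+:sum)
    for (size_t i = 0; i < n; i++)
        sum += a[i];

    free(a);
    return sum;
}
```

To try it, call both helpers from a `main`, build with an OpenMP flag such as `gcc -fopenmp`, and run with, for example, `OMP_PLACES=cores OMP_PROC_BIND=true` set in the environment. With binding disabled, `report_places` should print place `-1` for every thread, which is a quick way to check whether the settings took effect.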