parallel-computing-overview
**This is an old revision of the document!**
Table of Contents
Overview
Paradigms
Laws and models
| Name | Formula | What it predicts |
|---|---|---|
| Amdahl's law | $S = \dfrac{1}{s + (1-s)/p}$ | Maximum speedup given serial fraction $s$ and $p$ processors |
| Gustafson's law | $S = p - s(p-1)$ | Speedup when problem size scales with $p$ |
| Roofline model | $\min(\pi,\; I \cdot \beta)$ | Whether a kernel is compute-bound or memory-bandwidth-bound |
Common failure modes
| Issue | Cause | Detection |
|---|---|---|
| Race condition | Concurrent unsynchronized read/write | Helgrind, ThreadSanitizer |
| Deadlock | Circular lock dependency | Missing progress; gdb info threads |
| False sharing | Two variables in the same cache line | perf c2c, cache miss counters |
| Load imbalance | Unequal work distribution | Per-thread time profile |
| Memory ordering | Relaxed atomics with wrong assumptions | Litmus tests; address and memory sanitizers |
parallel-computing-overview.1781515952.txt.gz ยท Last modified: by Ivan Janevski
