Site Tools


hash-map

**This is an old revision of the document!**

Table of Contents

Hash map

A hash map (also called a hash table) is a data structure that maps arbitrary keys to values in O(1) average time. A hash function converts a key to an integer index into a backing Array of buckets. The ideal hash function distributes keys uniformly so that each bucket holds roughly one entry; when two keys hash to the same bucket, that is a collision and must be resolved.

The two main collision resolution strategies are chaining and open addressing. In chaining, each bucket holds a Linked list of entries that hashed to it; lookup traverses the list. In open addressing, all entries live in the backing array and a collision is resolved by probing adjacent slots (linear probing, quadratic probing, double hashing) until an empty one is found.

// simple open-addressing hash map (linear probing)
struct entry { int key; int val; bool used; };
struct entry table[CAPACITY];
 
int lookup(int key) {
    int i = hash(key) % CAPACITY;
    while (table[i].used && table[i].key != key)
        i = (i + 1) % CAPACITY;
    return table[i].used ? table[i].val : -1;
}

Performance depends on the load factor: the ratio of entries to capacity. As the load factor approaches 1, collision chains grow and average lookup degrades toward O(n). Hash maps typically resize (double the backing array and rehash all entries) when the load factor exceeds a threshold, commonly 0.7–0.75. After resizing, average O(1) is restored.

Open addressing has better cache behaviour than chaining because entries are stored in the array itself rather than in separate heap allocations. Linear probing in particular has excellent spatial locality; its main weakness is primary clustering — long runs of occupied slots that slow lookup. Robin Hood hashing and quadratic probing mitigate this.

Operation Average Worst case
Lookup O(1) O(n)
Insert O(1) amortised O(n)
Delete O(1) O(n)
hash-map.1781171083.txt.gz · Last modified: by 127.0.0.1