You are viewing a free preview of this lesson.
Subscribe to unlock all 10 lessons in this course and every other course on LearningBro.
This lesson covers hash tables — a data structure that provides fast average-case access to data using a hash function to compute an index from a key. Hash tables are used extensively in computing for dictionaries, caches, database indexing, and more.
A hash table (also called a hash map) is a data structure that stores key-value pairs. It uses a hash function to convert each key into an index (hash) that determines where the value is stored in an underlying array.
Key → Hash Function → Index → Store/Retrieve Value
"Alice" → h("Alice") → 3 → array[3] = "Alice's data"
"Bob" → h("Bob") → 7 → array[7] = "Bob's data"
| Property | Description |
|---|---|
| Key | A unique identifier for each data item. |
| Value | The data associated with the key. |
| Hash function | A function that converts a key into an array index. |
| Bucket/Slot | A position in the underlying array. |
| Average access time | O(1) for insertion, deletion, and lookup. |
A hash function takes a key and produces an integer index within the range of the array. A good hash function should:
A common approach for string keys is to sum the ASCII values and take the modulo of the array size:
hash("Cat") = (67 + 97 + 116) MOD 10 = 280 MOD 10 = 0
hash("Dog") = (68 + 111 + 103) MOD 10 = 282 MOD 10 = 2
def simple_hash(key: str, table_size: int) -> int:
total = 0
for char in key:
total += ord(char)
return total % table_size
print(simple_hash("Cat", 10)) # 0
print(simple_hash("Dog", 10)) # 2
A collision occurs when two different keys produce the same hash value (index). Since an array slot can hold only one item, collisions must be resolved.
Linear probing handles collisions by searching for the next available slot in the array.
When a collision occurs at index i:
Array size = 7, hash function: key MOD 7
| Key | Hash | Slot | Notes |
|---|---|---|---|
| 10 | 10 MOD 7 = 3 | 3 | Inserted at slot 3 |
| 17 | 17 MOD 7 = 3 | 4 | Collision at 3 → probe to 4 |
| 24 | 24 MOD 7 = 3 | 5 | Collision at 3, 4 → probe to 5 |
| 5 | 5 MOD 7 = 5 | 6 | Collision at 5 → probe to 6 |
Linear probing can cause primary clustering — groups of occupied slots form clusters, increasing the average search time as the table fills up.
Exam Tip: Linear probing is the most commonly tested collision resolution method. Be prepared to trace through insertions showing how collisions are resolved. Always use the formula: next slot = (current + 1) % table_size.
Subscribe to continue reading
Get full access to this lesson and all 10 lessons in this course.