Hash Tables
A hash table (also known as a hash map) is a data structure that offers a way to store and retrieve values efficiently using a hash function. It is one of the most commonly used data structures for implementing associative arrays or maps, where you associate a key with a value.
Hash tables are popular because of their ability to provide constant-time average complexity for search, insert, and delete operations, making them crucial in various applications, such as databases, caches, and indexing systems.
A hash table is a collection of key-value pairs, where each key is unique. The basic idea is to use a hash function to convert the key into an index, where the corresponding value is stored in an array or list.
The key steps in how a hash table works are:
A hash table typically consists of:
There are several key operations that can be performed on a hash table:
The insertion operation involves:
To search for a value in a hash table:
To delete a key-value pair:
Collisions occur when two different keys hash to the same index. There are several methods to handle collisions:
Division Method: A simple approach that divides the key by a prime number and returns the remainder.
Multiplication Method: Multiplies the key by a constant, then takes the fractional part and multiplies it by the size of the array.
Here’s a simple implementation of a hash table using chaining for collision resolution:
class HashTable:
def __init__(self, size):
self.size = size
self.table = [[] for _ in range(size)] # Create an empty table with 'size' lists
def hash_function(self, key):
return hash(key) % self.size # Hash the key and mod by table size
# Insert a key-value pair
def insert(self, key, value):
index = self.hash_function(key)
# Check if the key already exists in the chain, and update its value if it does
for item in self.table[index]:
if item[0] == key:
item[1] = value
return
# If the key doesn't exist, append a new (key, value) pair
self.table[index].append([key, value])
# Search for a value by key
def search(self, key):
index = self.hash_function(key)
for item in self.table[index]:
if item[0] == key:
return item[1] # Return the value if the key is found
return None # Return None if key is not found
# Delete a key-value pair
def delete(self, key):
index = self.hash_function(key)
for i, item in enumerate(self.table[index]):
if item[0] == key:
del self.table[index][i] # Remove the (key, value) pair
return True
return False # Return False if the key wasn't found
# Display the hash table
def display(self):
for i in range(self.size):
if self.table[i]:
print(f"Index {i}: {self.table[i]}")
# Example usage
ht = HashTable(10)
ht.insert("apple", 100)
ht.insert("banana", 200)
ht.insert("orange", 300)
# Display the hash table
ht.display()
# Search for a key
print("Value for 'apple':", ht.search("apple")) # Output: 100
# Delete a key
ht.delete("banana")
ht.display() # 'banana' should be removed
# Search for a deleted key
print("Value for 'banana':", ht.search("banana")) # Output: None
hash()
function and applying modulo operation to fit within the table size.In the chaining technique, when two keys hash to the same index, a linked list (or another list-like structure) is created at that index to store multiple key-value pairs. This method allows multiple keys to share the same index, but it requires extra space for the linked list.
Open addressing involves finding another open slot in the table if a collision occurs. It uses techniques like linear probing, quadratic probing, or double hashing to find the next available slot.
dict
).