What is Hash map?

Hash map is a data structure that allows you to store a key and a value pair. A hash map will always have unique set of keys. In Java hash map allows to have null keys and null values.

It is a commonly used data structure in many programming languages. For example, C# has Dictionary.aspx>) and Python has dict() which are implementations oh Map.

How is a hash map implemented?

A hash map works on the principle of Hashing. This allows to efficiently add, delete and lookup a particular key in a collection.

A Hash function, usually returns an integer for a given key. hashCode() method in the Object class is an example of hash function. The integer value returned by hash function is called hash value or simply hash. Hash function must produce same hash value for two equal keys.

Hash Collision

You may be already aware that, multiple keys can return same hash value. This is called hash collision.

For example, the Strings “Aa” and “BB” have same hashCode in the default implementation.

1
2
3
"Aa".hashCode() = 2112
"BB".hashCode() = 2112

An efficient HashMap should deal with hash collisions efficiently. Here efficiency refers to both space and time efficiency.

Hash collision is handled by creating buckets for each hash values. Every entry that collides with that hash, goes into that bucket.

For each (key, value) pair a corresponding Entry is created. This entry is mapped to a hash value of the key.

Hash function returns a direct index, from which an Entry can be accessed, directly. This makes it easier to find an Entry in the hashmap.

Update Java 8 uses Treemap after a threshold

Java 8 HashSet is backed by an instance of HashMap
https://docs.oracle.com/javase/8/docs/api/java/util/HashSet.html

A hash function that returns a unique hash number is called a universal hash function. In practice, it is extremely hard to assign unique numbers to objects. The later is always possible only if you know (or approximate) the number of objects to be processed.

Thus, we say that our hash function has the following properties

  • It always returns a number for an object.
  • Two equal objects will always have the same number
  • Two unequal objects not always have different numbers

References