Signup/Sign In

HashMap Performance Improvements in JAVA 8 using TREEIFY_THRESHOLD

Posted in Programming   LAST UPDATED: SEPTEMBER 6, 2021

    Java provides us an ample amount of data structures which are very useful. When it comes to performance and lookups the Map stands out of the crowd. It stores the key-value combination as an entry where the retrieval can be performed with the help of a key.

    There are a lot of articles available on the internet which tell us what a HashMap is, what methods it has but in this article, we are trying to cover some interesting information about HashMap, which is how JAVA 8 improved the performance of Hashmaps.

    To really understand the changes that JAVA 8 has provided we will be required to understand how a Hashmap works. We have tried to describe that briefly in this article.

    How does HashMap work?

    This is one of the most important questions asked in the interviews also if you want to utilize the power of HashMap to the full extent you should know the internal workings of HashMap.

    HashMap works on the principle of Hashing as its name suggests but that's not all. Hashing leads to another problem which is called Hash Collision which I will explain in the next subheading but as of now let's continue with the internal working of Hashmap and Hashing.

    Java Object class contains two methods i.e. equals() and hashcode(). To get our HashMap working as per expectation we should override these methods properly. Any issues in these methods can even fail the whole purpose as there can be a scenario in which the results are not retrieved even when we still have the key-value present.

    HashMap uses Map.Entry to save the object in the HashMap and when there is a hash collision, the LinkedList of an entry is created. So we can think of it like this, on every hash collision a bucket is created which will have a list of all the key-value pairs whose keys have the same hash code.

    Hashcode generates the hash and provides the right bucket and then the rest of the job is done by the equals() method to return back the right key.

    Understanding Hash Collision

    Hash Collision is a problem that occurs when we have the same hash code for two different keys. For example, if our map has two keys MAY and YAM and we are generating the hashcode by calculating the sum of the numbers assigned to every character in the string like M=13, A=1, Y=25. So the hashcode for MAY and YAM will be the same as 39 and MAY and YAM keys will go to the same bucket giving rise to the same hash for different keys.

    Hashmap collision example

    To achieve better performance the hash collisions should be minimum otherwise the hashmap will become a linear data structure.

    When does HashMap turn into a linear complexity data structure?

    As discussed in the Hash Collision section, there are scenarios when the O(1) complexity of HashMap is turned to linear i.e. O(n) complexity. Suppose if we write a hashcode that leads to hash collision for every key that is being passed then all the entries will be sent to a single bucket and just like a Linked list, this hashmap will have to be traversed, end to end to get the exact entry.

    Improvements in Java 8

    In Java 8, HashMap replaces the linked list with another useful data structure i.e. binary tree on breaching a certain threshold, which is known as TREEIFY_THRESHOLD. Once this threshold is reached the linked list of Entries is converted to the TreeNodes which reduces the time complexity from O(n) to O(log(n)).

    TreeNodes are nothing but the structures supporting the binary trees which have two nodes, smaller node goes to the left and the larger to the right. Whenever we want to search for any key the whole left or right subtree is discarded with a single check. This is how the time complexity is reduced to O(log(n). This change has lead to a significant improvement of HashMap.

    Once the number of entries is decreased due to removal or resizing of HashMap, the structure is converted back to the older implementation which is LinkedList.

    Conclusion:

    In this article, we learned how Hashmap works, how Hash collision works, and how in case of too many hash collisions, a hashmap is reduced to a LinkedList-like behavior, and how this is solved in Java 8.

    If you have any doubt, or you couldn't understand anything, feel free to comment below.

    You may also like:

    About the author:
    I have been into tech development for around 9 years now have a good hands on with the languages ike java,python php etc. I have a pretty good interest in Design and Algorithms as well.
    Tags:Java Hashmap Hash Collision
    IF YOU LIKE IT, THEN SHARE IT
     

    RELATED POSTS