# Java entry series [12] hashCode and equals

## Foreword

In the previous two sections, we explained the Hashtable algorithm and source code analysis in detail. For the hash function that always cannot escape the calculation of hashCode, in this section we will analyze the hashCode and equals in detail. At the same time, you will see that the content in this section is ”Learned to organize (spit out a sentence, the Chinese version of this book is really rubbish), for the book” Effective Java “is of great learning value, but I will not explain a series directly from this book like other children’s shoes What I use is to learn the corresponding place and then refer to different java classic books to summarize. The step-by-step method is more effective. OK, let’s start.

## equals

Looking at the section on “Effective Java” about equals, I directly throw the following five rules that must be adhered to when rewriting equals. When I saw these major features, I was stunned. This is not the time when university lines explain matrices. What are the characteristics of this?

1. Reflexivity: For a non-empty object x, x.equals (x) must return true.

2. Symmetry: For non-empty objects x and y, if x.equals (y) equals true, then y.equals (x) must also return true.

3. Transitivity: For non-empty objects x, y, and z, if x.equals (y) and y.equals (z) are equal to true, then x.equals (z) must also return true

4. Consistency: For non-empty objects x and y, if the information of the object is not modified using equals, no matter how many times it is called, then x.equals (y) is either true or false

5. For a non-empty object x, x.equals (null) must return false

It is well understood about the first point that non-null objects must be referenced by themselves. For the example given in the second point, it is case-insensitive when comparing rewritten objects to certain strings, but string objects are case-sensitive. Write this way, this will lead to the problem of symmetry inconsistency. For the third point, pay attention to the transitivity of equals when inheriting. The fourth point emphasizes the invariance of calling equals through equals. Throws a null pointer exception. Then we can actually use the following points as templates when rewriting equals.

1. Use “==” to determine whether two objects refer to the same

2.Use the instanceof operator to check if the parameter types are the same

3. If the types are the same, convert the parameters to the correct type

4, whether each value in the comparison object is equal, return true if all values are equal, otherwise false

The above templates are from “Effective Java” ‘s summary of rewriting equals. Of course, we can find the above shadow from the equals in the rewritten string object. The equals method of the string object is as follows:

public boolean equals (Object anObject) { // Judge whether the object references are equal, and return directly if equal ( if this == anObject) { return true ; } // Determine if the object parameter type is correct if (anObject instanceof String) { // If the parameter types are the same, convert to the corresponding parameter type String anotherString = (String) anObject; int n = value.length; // Compare whether all values in the parameter object are equal if (n == anotherString.value.length) { char v1 [] = value; char v2 [] = anotherString.value; int i = 0 ; while (n--! = 0 ) { if (v1 [i]! = V2 [i]) return false ; i ++ ; } return true ; } } return false ; }

Well, here we have explained equals, it is still relatively simple, so why do we have to rewrite hashCode when rewriting equals? The main reason is that this is a general convention. If you compare a HashMap or HashSet based on a hashed collection, the address of the storage object needs to be calculated by a hash function. If you do not do this, unexpected problems will occur . So what’s the unexpected problem?

## hashCode

Below we use an example to explain why we must rewrite hashCode when rewriting equals.

public class Person { int age; String name; public Person ( int age, String name) { this .age = age; this .name = name; } @Override public boolean equals (Object obj) { if ( this == obj) { return true ; } if (obj instanceof Person) { Person p = (Person) obj; return ( this .age == p.age && this .name == p.name); } return false ; } }

As above, we give a Person object with two attributes, age and name. When rewriting, judging that age and name are equal, they can be regarded as the same person. Below we perform the following operations on the console, and then we see that it will print out What’s the result?

Person p1 = new Person (12, "Jeffcky" ); Person p2 = new Person (12, "Jeffcky" ); Hashtable hashtable = new Hashtable (); hashtable.put (p1, "v1" ); System.out.println (hashtable.get (p2));

It is not difficult to understand, because the storage address of the Hashtable object is based on hashCode, but we have not rewritten the hashCode above, so when we instantiate the object p2, even if the two objects of equals are overridden, the value of p2 will definitely not be obtained Because hashCode doesn’t wait, let’s rewrite hashCode

@Override public int hashCode () { return (31 * Integer.valueOf ( this .age) .hashCode () + name.hashCode ()); }

We see that string objects override hashCode because strings are used very frequently, and we are most likely to use them in hash collections. Let’s take a look at the hashCode implementation of string objects.

The above figure marks the hashCode core that calculates the string, that is, the hash function. From the above, we can see that it is calculated by the ASCII code of each character in the string. At the same time, we can also expand to look at the hashcode of the source numerical type. . The above calculation method is finally summarized by our mathematical calculation method:

s [0] * 31 ^ (n-1) + s [1] * 31 ^ (n-2) + ... + s [n-1]

For example, we calculate the hashCode of the string [AC]. According to the calculation formula above, it is

65 * 31 ^ (2-1) + 67 * 31 ^ (2-2) = 2082

The reason for choosing 31 mentioned in “Effective Java” is that it is an odd prime number. If the multiplier is an even number and the multiplication overflows, the information will be lost, because 2 multiplication is equivalent to a shift operation. The benefits of using prime numbers are not obvious, but it is customary to use prime numbers to calculate hash results . I seriously doubt that the translator understands the meaning wrong. I can’t be convinced by the reasons given in the book for choosing prime numbers. Here I will explain my personal thoughts.

### Why hash functions use prime numbers

The reason for choosing 31 is because it is prime (prime), not because it is odd . When we insert an element into the hash table, how does the hash identify in which bucket the element needs to be stored? This is an important issue, making it mandatory to have the hash tell us in which bucket to store the value in a constant amount of time for quick retrieval. What we can think of is a fool-style operation, that is, cyclic traversal comparison. This sequential search will directly lead to the deterioration of hash performance, which directly depends on the number of values contained in the hash table. In other words, this will have a linear performance cost (O (N)), and as the number of keys (N) becomes larger, performance is conceivable. Another complication is the actual type of value we are dealing with. If we are dealing with strings and other complex types, the amount of checking or comparing itself will cause the cost to become high again. Based on the above description, we need to solve at least two problems. One is to facilitate fast retrieval instead of sequential retrieval, and the other is to solve the comparison of complex type values. The simple way to solve this problem is to hope for a way to decompose complex values into easy-to-use keys or hashes. The easiest way to achieve this is to generate a unique number, which must be unique because we want to distinguish one Value and another value. Prime numbers are unique numbers. They are unique in that because prime numbers are used to form prime numbers, the product of a prime number and any other number has the greatest possible uniqueness (not unique as the number of pixels itself). This property of prime numbers is unique in Kazakhstan. Use of the Greek function can reduce the number of collisions (or collisions). For example, using 4 * 8, it is more likely to conflict than a prime product such as 3 * 5. 32 can be calculated by 1 * 32 or 2 * 16 or 4 * 8 or 2 ^ 5, but 3 * 5 can only Get 1 as 15 or 3 * 5.

## Summary

In this article, we have discussed hashCode and equals in detail, and analyzed the reasons for using prime numbers in hash functions. There is also a section left here to be added when learning the virtual machine. Learn the specific implementation of hashCode by analyzing the source code of the virtual machine. Next In this section, we will enter the learning and analysis of HashMap source code. Thank you for reading. See you in the next section.

Java entry series [1] string creation method

Java entry series [2] string features

Java entry series [3] stringbuilder string buffer

Java entry series [4] packaging class

Java entry series [5] inheritance abstract classes interfaceson

Java entry series [6] Principles of dynamic array

Java entry series [7] collection arraylist

Java entry series [8] principles of doubley linked list alogrithm

Java entry series [9] linkedlist source code analysis

Java entry series [10] Hash algorithm principle

Java entry series [11] Hashtable source code analysis

Java entry series [12] Hashcode and equals

Java entry series [13] Principles of the read-black tree algorithmn method

Java entry series [14] Hashmap source code ananlysis

Orignal link:https://www.cnblogs.com/CreateMyself/p/11537683.html