String hashCode() documentation vs implementation

  • A+
Category:Languages

Below is the source code snippet of String.hashCode() method from Java 8 (1.8.0_131 to be precise)

/**  * Returns a hash code for this string. The hash code for a  * {@code String} object is computed as  * <blockquote><pre>  * s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]  * </pre></blockquote>  * using {@code int} arithmetic, where {@code s[i]} is the  * <i>i</i>th character of the string, {@code n} is the length of  * the string, and {@code ^} indicates exponentiation.  * (The hash value of the empty string is zero.)  *  * @return  a hash code value for this object.  */ public int hashCode() {     int h = hash;     if (h == 0 && value.length > 0) {         char val[] = value;          for (int i = 0; i < value.length; i++) {             h = 31 * h + val[i];         }         hash = h;     }     return h; } 

You can see that, the documentation says, that hashCode() is computed using below formula

s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] 

while the actual implementation is different

for (int i = 0; i < value.length; i++) {     h = 31 * h + val[i]; } 

Am I missing any obvious thing? Please help me.


The implementation is correct, with the caveat that integer overflow may occur (which is ok here, it doesn't harm anything).

Here's the steps on a sample string "CAT".

h = 0 

First loop:

i = 0 h = 31 * 0 + 'C' (67) = 67 

Second loop:

i = 1 h = 31 * 67 + 'A' (65) = 2142 

Third loop:

i = 2 h = 31 * 2142 + 'T' (84) = 66486 

Let's derive the formula from the code. Here, n is the index of i into the string s. Each iteration of the for loop performs this formula.

hn = 31hn-1 + sn

h0 /* after loop i = 0 */ = s[0] h1 /* after loop i = 1 */ = 31*h0 + s[1] = 31*s[0] + s[1] h2 /* after loop i = 2 */ = 31*h1 + s[2] = 31*(31*s[0] + s[1]) + s[2] h = 31*31*s[0] + 31*s[1] + s[2] 

The exponents you see for the powers of 31 arise because each loop multiplies in another factor of 31 before adding the value of the next character.

Comment

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen: