# How to compute the hash of a tuple in python

• A+
Category：Languages

In python, if I have a tuple with many elements, is its hash calculated from its elements' `id`s or its elements' content?

In this example,

``` a = (1, [1,2]) hash(a) ```

It errors out saying list is unhashable. So I guess it's not computed by id, or probably there is a check on whether the element is mutable.

Now see this example

``` class A: pass a0 = A() ta = (1, a0) hash(ta) # -1122968024 a0.x = 20 hash(ta) # -1122968024 ```

Here it turns out the hash of `ta` does not change with the modification of its element, i.e., `a0`. So maybe `a0`'s id is used for the hash calculation? Is `a0` somehow considered as immutable? How does python know if a type is mutable?

Now consider this case

``` b = (1, 2) id(b) # 3980742764 c = (1, 2) id(c) # 3980732588 tb = (1, b) tc = (1, c) hash(tb) # -1383040070 hash(tc) # -1383040070 ```

It seems the content of `b` and `c` are used for the hash calculation.

How should I understand these examples?

Neither. It is calculated on the basis of the hashes of these elements, not the contents (values).

Take a look at this paragraph in python's documentation glossary.

Whether something is hashable or not, and how it is hashed, depends on the implementation of its `.__hash__()` method. Python itself has no idea about mutability of an object.

In your first example, `tuple` happens to hash itself on the basis of its elements, while a `list` doesn't have a hash at all - the `.__hash__()` method is not implemented for it (and for a good reason). That's why a `tuple` with a `list` object inside of it is not hashable.

Now, having that in mind, let's have a look at python data model documentation, and what it has to say on the topic:

User-defined classes have `__eq__()` and `__hash__()` methods by default; with them, all objects compare unequal (except with themselves) and `x.__hash__()` returns an appropriate value such that `x == y` implies both that `x is y` and `hash(x) == hash(y)`.

That's why you don't have to define `.__hash__()` for your classes - python does it for you in this case. The default implementation doesn't take instance fields into the account though. That's why you can change the values inside your object without changing it's hash.

In this regard you're right - the (CPython) default implementation of the hashing function for custom classes relies on the `id()` of an object, and not on the values inside of it. It is an implementation detail, and it differs between Python versions though. In more recent versions of Python the relation between `hash()` and `id()` involves some randomization.

## But how does it actually hash itself?

While the details are quite complicated and probably involve some advanced math, the implementation of the hash function for tuple objects is written in C, and can be seen here (see `static Py_hash_t tuplehash(PyTupleObject *v)`.

The calculation involves XORing a constant with the hashes of each of the tuple's elements. The line responsible for hashing of the elements is this one:

``y = PyObject_Hash(*p++); ``

So, to answer your original question: it does a bunch of XOR hokus-pocus with the hashes of each of it's elements. Whether or not the contents of these elements are used depends on their specific hash functions.