Python - id() and hash() Functions | Application Architect

Key Insights

id() returns an integer that identifies an object for its lifetime (in CPython, its memory address) and answers “is this the exact same object?”, while hash() returns an integer computed for dictionary and set operations and answers “can this object be used as a key?”
Objects with the same hash aren’t necessarily equal (hash collisions), but any object used as a dictionary key must keep the same hash for its entire lifetime
Defining __eq__() without __hash__() makes instances unhashable, and defining the two inconsistently leads to subtle, maddening bugs in collections

Understanding Object Identity vs. Hashability

Python developers frequently conflate id() and hash(), assuming they serve similar purposes. They don’t. These functions answer fundamentally different questions about objects, and understanding the distinction is critical when debugging reference issues, implementing custom classes, or reasoning about dictionary and set behavior.

id() tells you which object you are holding (in CPython, where it lives in memory). hash() tells you how an object can be indexed in hash-based collections. One is about identity; the other is about lookup efficiency. Let’s dig into both.

The id() Function Deep Dive

The id() function returns an integer identifier for an object. In CPython (the standard implementation), this is the object’s memory address. The only guarantee Python makes is that this identifier is unique and constant for the object’s lifetime, so two objects whose lifetimes do not overlap may end up with the same id().

# Basic id() behavior
x = [1, 2, 3]
print(f"id of x: {id(x)}")  # e.g., 140234866534720

y = x  # y references the same object
print(f"id of y: {id(y)}")  # Same as x

z = [1, 2, 3]  # New list with same contents
print(f"id of z: {id(z)}")  # Different from x

print(f"x is y: {x is y}")  # True - same object
print(f"x is z: {x is z}")  # False - different objects
print(f"x == z: {x == z}")  # True - equal contents

The is operator compares object identity. For two objects that are both alive, a is b gives the same result as id(a) == id(b), although Python performs the check directly without calling id().
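
To make the difference between is and == concrete, here is a minimal sketch. AlwaysEqual is a hypothetical class invented for illustration: its __eq__() claims equality with everything, yet identity checks are unaffected because is cannot be overridden.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ reports equality with any object."""

    def __eq__(self, other):
        return True


first = AlwaysEqual()
second = AlwaysEqual()

equal_by_value = first == second      # dispatches to AlwaysEqual.__eq__
same_object = first is second         # identity check, ignores __eq__
ids_match = id(first) == id(second)   # agrees with `is` while both are alive

print(equal_by_value, same_object, ids_match)  # True False False
```

This is why identity checks such as x is None are safe even when a class defines an unusual __eq__().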

With immutable objects, the interpreter is free to reuse an existing object instead of creating a new one, so identity checks can give surprising results:

# Immutable objects and id()
a = 42
b = 42
print(f"id(a): {id(a)}, id(b): {id(b)}")
print(f"a is b: {a is b}")  # True in CPython - small integers are cached

# String interning
s1 = "hello"
s2 = "hello"
print(f"s1 is s2: {s1 is s2}")  # True in CPython - identifier-like string constants are interned

# But not always...
s3 = "hello world!"
s4 = "hello world!"
print(f"s3 is s4: {s3 is s4}")  # May be False - depends on context

# Reassignment rebinds the name to a different object; the old string is not modified
x = "original"
original_id = id(x)
x = "modified"
print(f"ID changed: {id(x) != original_id}")  # True

CPython caches small integers (-5 to 256) and interns certain strings for performance. Never rely on this behavior: it is an implementation detail that varies across Python versions and implementations. Compare values with ==, and reserve is for identity checks such as x is None.
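
When shared string objects are genuinely wanted, the standard library offers sys.intern() to request them explicitly instead of relying on automatic interning. The sketch below uses illustrative variable names and builds the strings at runtime so that the compiler cannot merge them.

```python
import sys

# Build two equal strings at runtime so they are not merged as constants.
parts = ["hello", " ", "world!"]
s_runtime_1 = "".join(parts)
s_runtime_2 = "".join(parts)

# Value comparison is always reliable.
values_equal = s_runtime_1 == s_runtime_2

# `s_runtime_1 is s_runtime_2` is usually False in CPython, but that is an
# implementation detail, so this sketch does not depend on it.

# sys.intern() returns the canonical interned object for a string value.
interned_1 = sys.intern(s_runtime_1)
interned_2 = sys.intern(s_runtime_2)
interned_same = interned_1 is interned_2

print(values_equal, interned_same)  # True True
```

Even with explicit interning, == remains the right operator for comparing string values; interning is only a memory and speed optimization.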

The hash() Function Deep Dive

hash() returns an integer computed from an object’s value, designed for fast dictionary key lookup and set membership testing. The critical requirement: an object’s hash must remain constant for its entire lifetime.

# Hashing immutable built-in types
print(hash("python"))      # Stable within one process; str hashes are salted per run by default
print(hash((1, 2, 3)))     # Tuples are hashable
print(hash(42))            # Small integers hash to themselves (in CPython, hash(-1) is -2)
print(hash(3.14))          # Floats are hashable

# Unhashable types raise TypeError
try:
    hash([1, 2, 3])
except TypeError as e:
    print(f"Lists: {e}")  # unhashable type: 'list'

try:
    hash({"a": 1})
except TypeError as e:
    print(f"Dicts: {e}")  # unhashable type: 'dict'

try:
    hash({1, 2, 3})
except TypeError as e:
    print(f"Sets: {e}")   # unhashable type: 'set'

Why are the built-in mutable containers unhashable? Because their equality is based on their contents: if the contents changed after the object was added to a dictionary, its hash would change too, and the dictionary would lose track of it. Python prevents this catastrophe by making mutable built-in types unhashable.

# frozenset is the immutable (hashable) version of set
fs = frozenset([1, 2, 3])
print(hash(fs))  # Works fine

# Tuples containing only hashable elements are hashable
print(hash((1, "a", (2, 3))))  # Works

# But tuples containing unhashable elements aren't
try:
    hash((1, [2, 3]))
except TypeError as e:
    print(f"Mixed tuple: {e}")  # unhashable type: 'list'

Key Differences Between id() and hash()

The relationship between id() and hash() is neither one-to-one nor predictable:

# Same id implies same hash (trivially - it's the same object)
x = "test"
y = x
print(f"Same id: {id(x) == id(y)}")    # True
print(f"Same hash: {hash(x) == hash(y)}")  # True (same object)

# Same hash does NOT imply same id
a = 0
b = 0.0
print(f"Same id: {id(a) == id(b)}")    # False - different objects
print(f"Same hash: {hash(a) == hash(b)}")  # True - equal values hash equally
print(f"Equal: {a == b}")               # True

# Different ids, same hash (by design for equal values)
s1 = "hello"
s2 = "".join(['h', 'e', 'l', 'l', 'o'])  # Constructed differently
print(f"Same id: {id(s1) == id(s2)}")    # Likely False
print(f"Same hash: {hash(s1) == hash(s2)}")  # True - same content
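
One visible consequence of equal values hashing equally is that a dictionary or set treats them as a single key, even across types. A small sketch with illustrative variable names:

```python
# 1, 1.0 and True compare equal, so their hashes must match.
same_hash = hash(1) == hash(1.0) == hash(True)

# A dict therefore sees one key: the first key object is kept and each
# later entry only replaces the value.
d = {1: "int", 1.0: "float", True: "bool"}

# A set collapses equal values in the same way.
zero_like = {0, 0.0, False}

print(same_hash)       # True
print(d)               # {1: 'bool'}
print(len(zero_like))  # 1
```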

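Python relies on a critical invariant: if two objects compare equal, they must have the same hash. The interpreter does not verify this for custom classes; it is a contract that __eq__() and __hash__() must uphold together. The reverse isn’t required: hash collisions are expected and handled by dictionaries and sets.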

# Demonstrating hash collisions
# These have the same hash but aren't equal
class AlwaysSameHash:
    def __init__(self, value):
        self.value = value
    
    def __hash__(self):
        return 42  # Terrible hash function, but legal
    
    def __eq__(self, other):
        return isinstance(other, AlwaysSameHash) and self.value == other.value

a = AlwaysSameHash(1)
b = AlwaysSameHash(2)
print(f"Same hash: {hash(a) == hash(b)}")  # True
print(f"Equal: {a == b}")                   # False

# Both can exist in a set (hash collision handled)
s = {a, b}
print(f"Set size: {len(s)}")  # 2
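
Collisions are harmless, but the opposite mistake is not. The sketch below shows what happens when equal objects return different hashes. InconsistentKey is a hypothetical class written for illustration, with equality by value and a hash based on identity.

```python
class InconsistentKey:
    """Hypothetical class: equality by value, hash by identity (a bug)."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, InconsistentKey):
            return NotImplemented
        return self.value == other.value

    def __hash__(self):
        return id(self)  # breaks the rule: equal objects hash differently


k1 = InconsistentKey(1)
k2 = InconsistentKey(1)

are_equal = k1 == k2                 # True: same value
set_size = len({k1, k2})             # 2: hashes differ, so __eq__ is never consulted
found = k2 in {k1: "stored"}         # False: lookup uses the hash of k2

print(are_equal, set_size, found)  # True 2 False
```

The set holds two "equal" members and the dictionary cannot find an equal key, which is exactly the kind of bug the invariant exists to prevent.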

Implementing hash() in Custom Classes

By default, instances of custom classes are hashable, with a hash derived from their identity (in CPython, from id()). This works because the default __eq__() also compares by identity:

class DefaultHashable:
    def __init__(self, value):
        self.value = value

a = DefaultHashable(1)
b = DefaultHashable(1)
print(f"hash(a): {hash(a)}")
print(f"hash(b): {hash(b)}")  # Different from a
print(f"a == b: {a == b}")     # False - identity comparison

When you override __eq__() without also defining __hash__(), Python automatically sets __hash__ to None, making instances unhashable:

class BrokenHashable:
    def __init__(self, value):
        self.value = value
    
    def __eq__(self, other):
        return isinstance(other, BrokenHashable) and self.value == other.value

a = BrokenHashable(1)
try:
    hash(a)
except TypeError as e:
    print(f"Error: {e}")  # unhashable type: 'BrokenHashable'
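
The effect can also be inspected without triggering the exception. This sketch uses collections.abc.Hashable from the standard library. ValueOnlyEq is a hypothetical class standing in for any class that defines __eq__() alone.

```python
from collections.abc import Hashable


class ValueOnlyEq:
    """Hypothetical class that defines __eq__ but not __hash__."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, ValueOnlyEq):
            return NotImplemented
        return self.value == other.value


hash_attr_is_none = ValueOnlyEq.__hash__ is None           # set by Python
reported_hashable = isinstance(ValueOnlyEq(1), Hashable)   # ABC sees __hash__ is None

# Caveat: the ABC only checks that a __hash__ method exists. A tuple has
# one, even when an element inside it turns out to be unhashable.
tuple_looks_hashable = isinstance((1, [2, 3]), Hashable)
try:
    hash((1, [2, 3]))
    tuple_really_hashable = True
except TypeError:
    tuple_really_hashable = False

print(hash_attr_is_none, reported_hashable)         # True False
print(tuple_looks_hashable, tuple_really_hashable)  # True False
```

Calling hash() remains the only fully reliable test; the isinstance() check is a quick first filter.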

To create properly hashable objects, implement both methods consistently:

class Point:
    def __init__(self, x, y):
        self._x = x
        self._y = y
    
    @property
    def x(self):
        return self._x
    
    @property
    def y(self):
        return self._y
    
    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return self._x == other._x and self._y == other._y
    
    def __hash__(self):
        return hash((self._x, self._y))
    
    def __repr__(self):
        return f"Point({self._x}, {self._y})"

# Now it works correctly
p1 = Point(1, 2)
p2 = Point(1, 2)
p3 = Point(3, 4)

print(f"p1 == p2: {p1 == p2}")  # True
print(f"hash(p1) == hash(p2): {hash(p1) == hash(p2)}")  # True

points = {p1, p2, p3}
print(f"Set: {points}")  # Two unique points

lookup = {p1: "first", p3: "second"}
print(f"lookup[p2]: {lookup[p2]}")  # "first" - p2 equals p1
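
For simple value classes like this, the standard library's dataclasses module can generate consistent __eq__() and __hash__() methods. The sketch below is an alternative to the hand-written Point, not a replacement the article prescribes. FrozenPoint and the variable names are illustrative.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)  # frozen=True with the default eq=True also generates __hash__
class FrozenPoint:
    x: int
    y: int


fp1 = FrozenPoint(1, 2)
fp2 = FrozenPoint(1, 2)

consistent = fp1 == fp2 and hash(fp1) == hash(fp2)

# Attribute assignment is rejected, so the hash cannot drift.
try:
    fp1.x = 100
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True

# "Changing" a point means building a new one.
moved = replace(fp1, x=100)

labels = {fp1: "first"}
print(consistent, mutation_blocked)  # True True
print(labels[fp2], moved)            # first FrozenPoint(x=100, y=2)
```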

Common Pitfalls and Best Practices

Pitfall 1: Mutating objects used as dictionary keys

class MutablePoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    
    def __eq__(self, other):
        return isinstance(other, MutablePoint) and self.x == other.x and self.y == other.y
    
    def __hash__(self):
        return hash((self.x, self.y))  # Danger: based on mutable state

p = MutablePoint(1, 2)
d = {p: "value"}
print(d[p])  # "value"

p.x = 100  # Mutate the key
print(p in d)  # False - hash changed, lookup fails
print(list(d.keys()))  # The key is still stored (shown with the default repr), just unfindable
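
The entry becomes stranded because, in CPython, a dictionary files each entry under the hash computed at insertion time and does not rehash keys that change afterwards. The sketch below uses MutableKey, a hypothetical one-field stand-in for MutablePoint, to show that the entry survives and becomes reachable again once the original state is restored.

```python
class MutableKey:
    """Hypothetical stand-in for MutablePoint: hash depends on mutable state."""

    def __init__(self, x):
        self.x = x

    def __eq__(self, other):
        return isinstance(other, MutableKey) and self.x == other.x

    def __hash__(self):
        return hash(self.x)


key = MutableKey(1)
table = {key: "value"}

key.x = 100  # the hash of `key` no longer matches the one stored in the dict

# Iteration does not use hashing, so the entry is still visible this way.
still_stored = any(k is key for k in table)

# A lookup such as `key in table` typically fails at this point, because
# the dict searches using the new hash.

key.x = 1  # restore the state the key had when it was inserted
found_again = key in table

print(still_stored, found_again)  # True True
```

Restoring the state is a debugging aid rather than a fix; the durable solution is to hash only attributes that never change.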

Pitfall 2: Using id() for equality checks

# Wrong: using id() to check equality
def bad_contains(lst, item):
    return any(id(x) == id(item) for x in lst)

# Right: use equality
def good_contains(lst, item):
    return item in lst  # Uses __eq__
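
A short run makes the difference visible. The helper names below are illustrative; they mirror the two functions above but are redefined so the sketch is self-contained.

```python
def contains_by_identity(items, target):
    """Identity-based search: True only for the very same object."""
    return any(item is target for item in items)


def contains_by_equality(items, target):
    """Value-based search: relies on __eq__ via the `in` operator."""
    return target in items


stored = [[1, 2], [3, 4]]
probe = [1, 2]  # equal to stored[0], but a separate list object

by_identity = contains_by_identity(stored, probe)
by_equality = contains_by_equality(stored, probe)
same_object_found = contains_by_identity(stored, stored[0])

print(by_identity, by_equality, same_object_found)  # False True True
```

An identity-based search is only appropriate when the question really is "is this exact object in the list?", for example when tracking which instances have already been visited.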

Best practice: using id() to trace references while debugging

def debug_references():
    data = {"key": "value"}
    print(f"Created dict: id={id(data)}")
    
    cache = {}
    cache["data"] = data
    print(f"After caching: id={id(cache['data'])}")  # Same id
    
    data = {"key": "new_value"}  # Reassignment
    print(f"After reassignment: id={id(data)}")  # New id
    print(f"Cache still has original: id={id(cache['data'])}")

debug_references()

Conclusion

Use id() when you need to verify object identity—debugging reference issues, understanding Python’s memory model, or confirming that two variables point to the exact same object. Use hash() when implementing objects that need to work as dictionary keys or set members.

| Aspect | id() | hash() |
| --- | --- | --- |
| Returns | Memory address (CPython) | Computed integer |
| Purpose | Object identity | Collection indexing |
| Mutable built-ins (list, dict, set) | Always works | Raises TypeError |
| Equal objects | Different ids possible | Must have same hash |
| Custom classes | Always available | Available by default; requires `__hash__()` if `__eq__()` is defined |

The golden rule: if you implement __eq__(), implement __hash__() using the same attributes, and make sure those attributes are immutable. Your future self debugging a dictionary lookup failure will thank you.