Mutable and Immutable Objects

Understanding the differences in data types to make better programs

by Aquiles Carattino, Aug. 23, 2018 (tags: mutable oop immutable objects)

People who start programming in Python quickly stumble upon the existence of lists and tuples. Both are defined similarly, and they look the same. Sometimes they are even used interchangeably. Therefore, the obvious question is, why do we have two different types of elements for the same goal? The answer lies in understanding the differences between mutable and immutable data types in Python.

Even after programming Python applications for a while, it can be hard to choose between lists and tuples. Sometimes, the choice gives rise to obscure bugs that are very hard to find and correct. In this article, we will discuss the differences between lists and tuples, or more generally between mutable and immutable data types, and how they can be used in our programs.

Lists and Tuples

In Python, when we want to define a list, we can do the following:

>>> var1 = [1, 2, 3]

And we can get its elements by their position:

>>> var1[0]
1
>>> var1[1]
2

If we want to replace the value of an element, we can do the following:

>>> var1[0] = 0
>>> var1[0]
0

We can do the same with a tuple, which uses () instead of [] in its definition:

>>> var2 = (1, 2, 3)
>>> var2[0]
1

However, if we try to change the value of an element we will get an error:

>>> var2[0] = 0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment

This is the first, crucial difference between a list and a tuple. Once defined, a tuple cannot have its elements reassigned, but a list can. In this sense, we say that tuples are immutable and lists are mutable. Deciding when to use one or the other depends on the application. As a rule of thumb, tuples tend to be slightly lighter to create and store, while lists are the better choice if we ever need to expand the collection at a later stage, because they can grow in place.
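A minimal sketch of this difference, using a hypothetical helper called try_assign (it is not part of the article's code) that reports whether item assignment succeeded. The sizes printed at the end come from sys.getsizeof and are an implementation detail that varies between Python versions:

```python
import sys


def try_assign(sequence, index, value):
    """Return True if item assignment works, False if the type forbids it."""
    try:
        sequence[index] = value
    except TypeError:
        return False
    return True


numbers_list = [1, 2, 3]
numbers_tuple = (1, 2, 3)

print(try_assign(numbers_list, 0, 0))   # the list accepts the new value
print(try_assign(numbers_tuple, 0, 0))  # the tuple raises TypeError, caught above

# Shallow size of each container in bytes; exact numbers depend on the interpreter.
print(sys.getsizeof(numbers_list), sys.getsizeof(numbers_tuple))
```

After running it, numbers_list holds the new value in its first position, while numbers_tuple is unchanged.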

Whether an object can be changed after it has been created is what makes it mutable or immutable. Tuples and lists are only the first examples. We can look deeper into how Python works.

Mutable and Immutable Data Types

Inspired by the excellent article written by Luciano Ramalho, we can think about variables in Python as labels instead of boxes. In Python, a variable is a label that we assign to an object; it is the way we, as humans, have to identify it. However, what is important is the data underlying the label, its value, and its type.

A useful tool to understand this concept is the id function. We can apply it to any variable, and it will return its identity. If we want to be sure about dealing with the same object, we can check whether the value returned by id is the same. We can do the following:

>>> var1 = [1, 2, 3]
>>> var2 = (1, 2, 3)
>>> id(var1)
44045192
>>> id(var2)
43989032

It is easy to see that both variables have different identities. Now we can expand both the list and the tuple with some new values and check whether their identities are the same:

>>> var1 += [4, 5]
>>> var2 += (4, 5)
>>> print(var1)
[1, 2, 3, 4, 5]
>>> print(var2)
(1, 2, 3, 4, 5)
>>> id(var1)
44045192
>>> id(var2)
30323024

The code above already shows something interesting: we have appended the same values to both the list (var1) and the tuple (var2). However, if we ask for their id, we see that var1 did not change its identity, while var2 has a new one. It means that we have expanded the list in place, but we have created an entirely new tuple. This is one reason why expanding a list is more memory efficient than expanding a tuple.
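The same check can be written without reading id values by eye. This is a small sketch with a hypothetical helper, extended_in_place, that keeps a second label on the starting object and compares it with the result of +=:

```python
def extended_in_place(obj, extra):
    """Return True if `obj += extra` kept the same object, False otherwise."""
    original = obj   # second label for the object we start with
    obj += extra     # lists extend in place; tuples build a new object
    return obj is original


print(extended_in_place([1, 2, 3], [4, 5]))  # list
print(extended_in_place((1, 2, 3), (4, 5)))  # tuple
```

Keeping the original label alive inside the function also guarantees that the old object is not discarded before the comparison takes place.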

Tuples are not the only immutable data type in Python. Still, they are a great tool to learn because they can be directly compared to lists, which are mutable. Other immutable data types are:

  1. int
  2. float
  3. decimal.Decimal
  4. complex
  5. bool
  6. str
  7. tuple
  8. range
  9. frozenset
  10. bytes

Perhaps we didn't think about it before, but when we assign an integer, float, etc. to a variable, the underlying object can't be modified. Updating the variable makes it point to a new object. We can verify it by inspecting this code:

>>> var1 = 1
>>> id(var1)
1644063776
>>> var1 += 1
>>> id(var1)
1644063808

We see that a new object is created when we add a value to var1, and the name var1 now points to it; therefore, its identity changes. The same would happen with all the other data types listed above.
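Another way to look at it, as a short sketch with illustrative names: if we keep a second label on the original object before updating, we can see that the original is left untouched and only the first name moves on to a new object.

```python
count = 1000
snapshot = count   # second label for the same int object
count += 1         # binds `count` to a new int; the old object is untouched

print(count, snapshot)

text = "abc"
alias = text
text += "def"      # strings behave the same way
print(text, alias)
```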

Mutable objects, on the other hand, are the following:

  1. list
  2. dictionary
  3. set
  4. bytearray
  5. instances of user-defined classes (by default)

These are the kinds of objects that can be changed in place, without creating a new one to store the updated values. This is why we could expand a list without changing its identity, and why we can modify a dictionary while keeping the same underlying object:

>>> var1 = {'a': 1, 'b': 2}
>>> id(var1)
140711021092288
>>> var1['b'] = 3
>>> id(var1)
140711021092288
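For the built-in containers, a rough way to tell both groups apart in code is to check them against the abstract base classes in collections.abc. This is only a sketch: the helper name is_mutable_container is made up for this example, and the check says nothing about numbers or about instances of our own classes.

```python
from collections.abc import MutableMapping, MutableSequence, MutableSet

MUTABLE_ABCS = (MutableSequence, MutableMapping, MutableSet)


def is_mutable_container(obj):
    """Rough check: does the object implement a mutable container interface?"""
    return isinstance(obj, MUTABLE_ABCS)


samples = ([1], {"a": 1}, {1}, bytearray(b"x"), (1,), "x", frozenset({1}), b"x")
for sample in samples:
    print(type(sample).__name__, is_mutable_container(sample))
```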

Two labels for the same object

An interesting pattern is giving two names (i.e. two labels) to the same object, for example:

>>> var1 = [0, 1, 2]
>>> var2 = var1
>>> id(var1)
44372872
>>> id(var2)
44372872

Both var1 and var2 have the same identity, which means that they are labels to the same object. In Python, we can verify this by using is instead of comparing the identity:

>>> var1 is var2
True

And if we update one of the values of var1:

>>> var1 += [3, 4, 5]
>>> print(var2)
[0, 1, 2, 3, 4, 5]
>>> var1 is var2
True

We can see that after updating the value of var1, the value of var2 also changed. This happens only with mutable types. With immutable objects, updating a value creates a new object, so after the update each name points to a different object. The same example as before, but with tuples:

>>> var1 = (1, 2)
>>> var2 = var1
>>> var1 is var2
True
>>> var1 += (3, 4)
>>> var1 is var2
False
>>> var2
(1, 2)
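If we want a second list that does not follow the changes of the first, we need a new object rather than a new label. A minimal sketch, using list() to build the new object (the variable names are only illustrative):

```python
original = [0, 1, 2]
alias = original              # same object, second label
independent = list(original)  # new list holding the same elements

original += [3, 4, 5]

print(alias)        # follows the change, it is the same object
print(independent)  # keeps the old contents
```

Note that list() copies only the outer list; elements that are themselves mutable would still be shared between both lists.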

Equal objects

Sometimes we would like to compare whether two variables have the same underlying values and not if they point to the same object. We can use the == operator to compare the contents instead of the objects' identity. We can define two lists with the same values:

>>> var1 = [1, 2, 3]
>>> var2 = [1, 2, 3]

If we check the identities of var1 and var2, we will see that they are different objects:

>>> var1 is var2
False

Even though we see that they are the same (we defined them to have the same values), they are two different objects. If we want to compare the values instead of the identities, we can do the following:

>>> var1 == var2
True

The example above also works if we define tuples instead of lists. The fact that the contents are the same is not enough to know whether the variables point to the same object.
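The same distinction applies to our own classes, where == is whatever we define in the __eq__ method. A small sketch with a hypothetical Point class, used only for illustration:

```python
class Point:
    """Hypothetical class used only to illustrate == versus is."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        # Two points are equal when their coordinates match.
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)


p1 = Point(1, 2)
p2 = Point(1, 2)

print(p1 == p2)  # compares the coordinates
print(p1 is p2)  # compares the identities
```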

Singletons

Without going into too much detail, it is worth mentioning that there is a type of object called a singleton. By definition, they are objects that can be created only once. Therefore, any variable pointing to them should point to the same object. Let's see a quick example using some integers:

>>> a = 1
>>> b = 1
>>> a is 1
True
>>> a is b
True
>>> a == b
True

In CPython, the integers between -5 and 256 are cached and behave like singletons. A variable pointing to any of them will have the same identity as any other variable pointing to the same number. This approach saves memory because we only have one integer defined and as many variables as we want pointing to it. Keep in mind that this is an implementation detail, and that recent Python versions emit a SyntaxWarning for comparisons such as a is 1, so numbers should be compared with == in real code. Integers are not the only example. Booleans and None are also singletons:

>>> a = True
>>> a is True
True
>>> b = None
>>> b is None
True
>>> b == None
True

Using is instead of == has some advantages. One of them is speed. We can run the following in the command line:

python -m timeit "1 == 1"

And then:

python -m timeit "1 is 1"

On average, the first expression takes around 20 nanoseconds, while the second takes approximately 17 nanoseconds, although the exact figures depend on the machine and the Python version. Singletons are a topic to cover independently because we can also define our own.
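Another reason that is often given, and the one behind the PEP 8 recommendation to compare against None with is, is that == can be overridden by a class while is cannot. A sketch with a deliberately odd, hypothetical class:

```python
class AlwaysEqual:
    """Hypothetical class whose == answers True for everything."""

    def __eq__(self, other):
        return True


value = AlwaysEqual()

print(value == None)  # True, although value is clearly not None
print(value is None)  # False: identity checks cannot be overridden
```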

Mutable objects and functions

We have just seen that if two labels have the same id, they refer to the same object. If we change it through one label, the change is visible through the other. The same applies when working with functions that take mutable objects as arguments. Imagine that we develop a function that takes a list as input, divides all of its elements by two, and then returns the average. The function would look like this:

def divide_and_average(var):
    for i in range(len(var)):
        var[i] /= 2
    avg = sum(var)/len(var)
    return avg

It is very interesting to see what happens when we use this function:

my_list = [1, 2, 3]
print(divide_and_average(my_list))
print(my_list)

The output will be:

1.0
[0.5, 1.0, 1.5]

When we execute the function, we are changing the values of the variable my_list. It is very powerful because it allows us to change the elements of a list in-place while we are returning a different element. Sometimes, however, we don't want to do this and want to preserve the value of the original list. It may seem like a good idea to create a new variable. For example:

def divide_and_average(var1):
    var = var1
    [...]

However, we will see that this doesn't change the output. As we saw earlier, the identity of var and var1 would be the same. To get around this, we can make a copy of the object using the copy module:

import copy

def divide_and_average(var1):
    var = copy.copy(var1)
    [...]

If we now run the same example, we see that the original my_list variable is not altered. What we have just done is called a shallow copy of an object. It is also possible to perform a deep copy, but we leave its implications for a different article. You can check Deep and shallow copies of objects to learn more about the subject.
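Putting the pieces together, the complete function could look like the sketch below. The body is assembled from the earlier version of divide_and_average; the names values and halved are chosen here for readability and are not from the original code.

```python
import copy


def divide_and_average(values):
    halved = copy.copy(values)      # shallow copy: a new list, same elements
    for i in range(len(halved)):
        halved[i] /= 2              # only the copy is modified
    return sum(halved) / len(halved)


my_list = [1, 2, 3]
print(divide_and_average(my_list))
print(my_list)  # the caller's list keeps its original values
```

A shallow copy is enough here because the list holds numbers, which are immutable.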

Default Arguments in Functions

A common practice when defining a function is to assign default values to its arguments. This allows us to add new parameters without changing the downstream code, and it also lets us call the function with fewer arguments, making it easier to use. Let's see, for example, a function that increases the value of the elements of a list. The code would look like:

def increase_values(var1=[1, 1], value=0):
    value += 1
    var1[0] += value
    var1[1] += value
    return var1

If we call this function without arguments, it will use the default value [1, 1] for the list, and the default increase value of 0. What happens if we use this function twice, without any arguments?

print(increase_values())
print(increase_values())

The first time, it prints [2, 2] as expected, but the second time it prints [3, 3]. It means that the default argument of the function is changing every time we run it. When we run the script, Python evaluates the function definition only once and creates the default list and the default value at that moment. Because lists are mutable, every call that modifies var1 in place modifies that one default list. The default for value, on the other hand, is an immutable integer: value += 1 only binds the local name to a new object, so the default remains 0 for all subsequent function calls.
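We can watch this happen by inspecting the __defaults__ attribute of the function, where Python stores the default values. A short sketch that repeats the function definition from above:

```python
def increase_values(var1=[1, 1], value=0):
    value += 1
    var1[0] += value
    var1[1] += value
    return var1


print(increase_values.__defaults__)  # defaults as created at definition time
increase_values()
print(increase_values.__defaults__)  # the list changed, the integer did not
```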

The next logical question is: how can we prevent this from happening? The short answer is to use immutable types as default arguments for functions. We could have used None, for instance:

def increase_values(var1=None, value=0):
    if var1 is None:
        var1 = [1, 1]
    ...
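Filled in with the body of the first version of increase_values, the function could look like this sketch:

```python
def increase_values(var1=None, value=0):
    if var1 is None:
        var1 = [1, 1]   # a fresh list is created on every call
    value += 1
    var1[0] += value
    var1[1] += value
    return var1


print(increase_values())
print(increase_values())  # same result as the first call
```

A list passed in explicitly by the caller is still modified in place, which may or may not be what we want.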

Of course, the decision always depends on the use case. We may want to update the default value from one call to another. Imagine that we need to perform a computationally expensive calculation, and we don't want to run the function twice with the same input but use a cache of values instead. We could do the following:

def calculate(var1, var2, cache={}):
    try:
        value = cache[var1, var2]
    except KeyError:
        value = expensive_computation(var1, var2)
        cache[var1, var2] = value
    return value

When we run calculate for the first time, nothing is stored in the cache dictionary. As we call the function with new arguments, cache grows, storing each new result. If we repeat the arguments at some point, they will already be in the dictionary, and the stored value will be returned. Notice that we are leveraging exception handling to avoid checking explicitly whether the combination of values already exists in the cache. Also note that the arguments must be hashable, since they are used as a dictionary key.

Our immutable objects

Python is very flexible, and it gives us a lot of control over how to customize its behavior. As we can see from the list at the beginning of this article, custom-created classes belong to the mutable types. But what happens if we want to define immutable objects? The answer is to modify how the class behaves when assigning attributes, which we can achieve by reimplementing the __setattr__ method.

class MyImmutable:
    def __setattr__(self, key, value):
        raise TypeError('MyImmutable cannot be modified after instantiation')

If we instantiate the class and try to assign a value to an attribute of it, an error will appear:

>>> my_immutable = MyImmutable()
>>> my_immutable.var1 = 2
Traceback (most recent call last):
  File ".\AE_custom_objects.py", line 14, in <module>
    my_immutable.var1 = 2
  File ".\AE_custom_objects.py", line 7, in __setattr__
    raise TypeError('MyImmutable cannot be modified after instantiation')
TypeError: MyImmutable cannot be modified after instantiation

We have an object that we can't modify after instantiation. But that also means there is not much we can do with it. Imagine we would like to store some initial values. If we create a standard __init__ method, it will fail:

class MyImmutable:
    def __init__(self, var1, var2):
        self.var1 = var1
        self.var2 = var2
    [...]

As soon as we try to instantiate this class, the TypeError will be raised. Even within the class itself, assigning values to attributes is achieved through the __setattr__ method. To bypass it, we need to use the super() object:

class MyImmutable:
    def __init__(self, var1, var2):
        super().__setattr__('var1', var1)
        super().__setattr__('var2', var2)

    def __setattr__(self, key, value):
        raise TypeError('MyImmutable cannot be modified after instantiation')

    def __str__(self):
        return 'MyImmutable var1: {}, var2: {}'.format(self.var1, self.var2)

We can now use it as follows:

>>> my_immutable = MyImmutable(1, 2)
>>> print(my_immutable)
MyImmutable var1: 1, var2: 2
>>> my_immutable.var1 = 2
[...]
TypeError: MyImmutable cannot be modified after instantiation

It is a bit of a workaround, but maybe we can find a use for this kind of pattern. Another exciting resource worth checking is the namedtuple (https://docs.python.org/3/library/collections.html#collections.namedtuple). As the name suggests, it allows us to have immutable objects with named attributes. Its source code can be of great inspiration to understanding the inner workings of Python.
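A minimal sketch of how a namedtuple behaves. The type name Measurement is made up for this example, and the field names mirror the var1 and var2 attributes used above:

```python
from collections import namedtuple

# Hypothetical type with two named fields.
Measurement = namedtuple("Measurement", ["var1", "var2"])

m = Measurement(var1=1, var2=2)
print(m.var1, m.var2)

try:
    m.var1 = 10  # attributes of a namedtuple cannot be reassigned
except AttributeError as error:
    print("Cannot modify:", error)

# _replace returns a new object and leaves the original untouched.
updated = m._replace(var1=10)
print(m, updated)
```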

Conclusions

Understanding the differences between mutable and immutable Python types often does not seem important until it is too late. In most cases, we can develop complex applications exchanging tuples for lists. We may even be altering the value of a variable inside a function without realizing it and without significant consequences. But eventually we will run into a severe bug that is tough to track down and that may be related to the use (or misuse) of mutable types.

As a personal note, I found such a bug while performing a complicated experiment with a microscope. I wanted to be able to refocus automatically on certain bright spots after an image was acquired. The first time, the algorithm worked fine. The second time, it was pretty much OK, but from the third time onwards it was not even close to reaching the desired values. The problem was defining the initial range for the scan as a list and dividing it by a factor after every iteration.

The example code is available on GitHub

If you want to keep learning, you can read more about why tuples may seem to change and what happens when you use mutable or immutable variables as class attributes.

Article written by Aquiles Carattino

Header Photo by rawpixel on Unsplash
