# Everything You Need to Know About Python's Namedtuples

Here's the truth: named tuples are underrated. I mean... *REALLY* underrated!

Many developers, unfortunately, overlook this important data structure. When used appropriately, a `namedtuple()` can make your code cleaner, faster and easier to read. And this is what we're going to see in this article.

We'll see the most important aspects of a named tuple in Python 3 and, starting from the very basics, we'll move up to more complex concepts. You’ll learn why you should use them and how you can use them to create cleaner, pythonic code.

By the end of this guide, you’ll have learned:
- why you should use them and how they can improve code readability
- how to convert a **namedtuple to dict**  
- how to create a **namedtuple from dict**
- when to choose a **namedtuple vs. dict**
- the best way to convert a **namedtuple to JSON**
- how to add **optional default values** to a namedtuple
- the unfamiliar way to add a **docstring to namedtuples**
- the **difference between named tuples and dataclasses**
- how to **add a method** to a namedtuple
- how to add **type hints** to a namedtuple

## Table of Contents

1. [What Is a `Namedtuple` And Why You Should Use It](#what-is-a-namedtuple-and-why-you-should-use-it)
2. [How to Create a `namedtuple` from Dict or Regular Tuple](#how-to-create-a-namedtuple-from-dict-or-regular-tuple)
3. [How to Convert a Named Tuple to Dict or Regular Tuple](#how-to-convert-a-named-tuple-to-dict-or-regular-tuple)
4. [How to Sort a List of `namedtuple`s](#how-to-sort-a-list-of-namedtuples)
5. [How to Serialize `namedtuple`s to JSON](#how-to-serialize-namedtuples-to-json)
6. [How to Add a `docstring` to a `namedtuple`](#how-to-add-a-docstring-to-a-namedtuple)
7. [How to Add a Method to a Namedtuple](#how-to-add-a-method-to-a-namedtuple)
8. [What Are the Differences Between `namedtuple`s and Data Classes?](#what-are-the-differences-between-namedtuples-and-data-classes)
9. [How to Add Type Hints to a `namedtuple`](#how-to-add-type-hints-to-a-namedtuple)
10. [How to Add Optional Default Values to a `namedtuple`](#how-to-add-optional-default-values-to-a-namedtuple)
11. [Conclusion](#conclusion)

## What Is a `Namedtuple` And Why You Should Use It

`namedtuple` is a very interesting—and also underrated—data structure. 

A named tuple is an extension of the regular built-in tuple (`namedtuple` is a tuple subclass). It provides the same features as the conventional tuple, but also allows you to access fields via attribute lookup using dot notation, that is, using their names instead of only indexes.
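To make that concrete, here is a minimal sketch showing that a named tuple supports both index-based and name-based access and is still a real tuple. The `Point` type and its fields are made up purely for illustration.

```python
from collections import namedtuple

# Hypothetical example type, used only for illustration.
Point = namedtuple("Point", "x y")

p = Point(x=1, y=2)

# Index access works exactly like a regular tuple...
first = p[0]
# ...and so does attribute access by field name.
second = p.y

# A named tuple is still a tuple: it unpacks and compares like one.
x, y = p
is_tuple = isinstance(p, tuple)
equals_plain_tuple = p == (1, 2)
```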

It’s very common to find Python code that relies heavily on regular tuples, or sometimes dictionaries, to store data. And don’t get me wrong, both dictionaries and regular tuples have their value; the problem lies in misusing them. For example:

Suppose that you have a function that converts a string into a color. The color must be represented in a 4-dimensional space, the RGBA.

```python
def convert_string_to_color(desc: str, alpha: float = 0.0):
    if desc == "green":
        return 50, 205, 50, alpha
    elif desc == "blue":
        return 0, 0, 255, alpha
    else:
        return 0, 0, 0, alpha
```
Then, we can use it like this:
```python
r, g, b, a = convert_string_to_color(desc="blue", alpha=1.0)
```
Ok, that works, but... we have a couple of problems here. The first one is that nothing enforces the order in which the returned values are unpacked. That is, there's nothing stopping another developer from calling `convert_string_to_color` like this:
```python
g, b, r, a = convert_string_to_color(desc="blue", alpha=1.0)
```
Also, we may not know that the function returns 4 values, and end up calling the function like so:
```python
r, g, b = convert_string_to_color(desc="blue", alpha=1.0)
```
This fails with a `ValueError`, since the function returns four values but we only unpack three.

> That's true. But why don't you use a dictionary instead?

Python’s dictionaries are a very versatile data structure. They can serve as an easy and convenient way to store multiple values. However, a `dict` doesn’t come without shortcomings. Due to its flexibility, dictionaries are very easily abused. As an illustration, let us convert our example to use a dictionary instead of tuple.

```python
def convert_string_to_color(desc: str, alpha: float = 0.0):
    if desc == "green":
        return {"r": 50, "g": 205, "b": 50, "alpha": alpha}
    elif desc == "blue":
        return {"r": 0, "g": 0, "b": 255, "alpha": alpha}
    else:
        return {"r": 0, "g": 0, "b": 0, "alpha": alpha}
```
Ok, we now can use it like this, expecting just one value to be returned:
```python
color = convert_string_to_color(desc="blue", alpha=1.0)
```
No need to remember the order, but it has at least two drawbacks. The first one is that we must keep track of the key names. If we change `{"r": 0, "g": 0, "b": 0, "alpha": alpha}` to `{"red": 0, "green": 0, "blue": 0, "a": alpha}`, we’ll get a `KeyError` when accessing a field, as the keys `r`, `g`, `b`, and `alpha` no longer exist.

The second issue with `dict`s is that they are not *hashable*. That means we cannot store them in a `set` or use them as keys in another dictionary. Let’s imagine that we want to keep track of how many colors a particular image has. If we use `collections.Counter` to count them, we’ll get `TypeError: unhashable type: 'dict'`.
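A minimal sketch of that failure, using the dictionary version of a color from above. The exact wording of the error message may differ between Python versions, so the snippet only prints it.

```python
from collections import Counter

blue = {"r": 0, "g": 0, "b": 255, "alpha": 1.0}

try:
    # Counter stores its items as dictionary keys, so they must be hashable.
    Counter([blue, blue])
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```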

Also, dictionaries are mutable objects, so we can add as many new keys as we want. Trust me, this is a recipe for nasty bugs that are really hard to track down.

> Ok, fine, that makes sense. So, now what? What can I use instead?

`namedtuple`s! Just... use them!

Converting our function to use `namedtuple`s is as easy as this:
```python
from collections import namedtuple
...
Color = namedtuple("Color", "r g b alpha")
...
def convert_string_to_color(desc: str, alpha: float = 0.0):
    if desc == "green":
        return Color(r=50, g=205, b=50, alpha=alpha)
    elif desc == "blue":
        return Color(r=0, g=0, b=255, alpha=alpha)
    else:
        return Color(r=0, g=0, b=0, alpha=alpha)
```
Like in the `dict` case, we can assign the result to a single variable and use it as we please. There’s no need to remember the ordering. And if you’re using an IDE such as *PyCharm* or *VSCode*, you get autocompletion out of the box.
```python
color = convert_string_to_color(desc="blue", alpha=1.0)
...
has_alpha = color.alpha > 0.0
...
is_black = color.r == 0 and color.g == 0 and color.b == 0
```
To top it all off, `namedtuple`s are immutable objects. If another developer on the team thinks it’s a good idea to add a new field during runtime, the program will fail.
```console
>>> blue = Color(r=0, g=0, b=255, alpha=1.0)

>>> blue.e = 0
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-13-8c7f9b29c633> in <module>
----> 1 blue.e = 0

AttributeError: 'Color' object has no attribute 'e'
```
Not only that, now we can use a `Counter` to track how many colors a collection has.
```console
>>> from collections import Counter
>>> Counter([blue, blue])
Counter({Color(r=0, g=0, b=255, alpha=1.0): 2})
```

## How to Create a `namedtuple` from Dict or Regular Tuple

Now that we understand the motivations behind using `namedtuple`, it’s time to learn how to convert normal tuples and dictionaries into named tuples. 

Let's say that you have a dictionary instance containing the RGBA values for a color. If you want to instantiate the `Color` namedtuple we just created, you can pass the dict as keyword arguments to the named tuple constructor:
```python
>>> Color = namedtuple("Color", "r g b alpha")
>>> c = {"r": 50, "g": 205, "b": 50, "alpha": 0}
>>> Color(**c)
Color(r=50, g=205, b=50, alpha=0)
```
That’s it. We can just leverage the `**` construct to unpack the `dict` as keyword arguments into a `namedtuple`.

> What if I want to create a `namedtuple` class from the `dict`? I mean... not an instance but named tuple *class*?

No problem. If you pass a dict to the `namedtuple` factory function, it creates a named tuple class that uses the dictionary's keys as field names.

```python
>>> c = {"r": 50, "g": 205, "b": 50, "alpha": 0}
>>> Color = namedtuple("Color", c)
>>> Color(**c)
Color(r=50, g=205, b=50, alpha=0)
```
Then, to create a new `Color` *instance* from a `dict` we can just unpack the dictionary as keyword arguments, like in the previous example.
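Starting from a regular tuple works in much the same way. The sketch below shows two options: unpacking the tuple as positional arguments with `*`, or using the `_make()` class method, which accepts any iterable. The `rgba` values are arbitrary sample data.

```python
from collections import namedtuple

Color = namedtuple("Color", "r g b alpha")

# Arbitrary sample values, in the same order as the fields.
rgba = (50, 205, 50, 0.5)

# Option 1: unpack the tuple as positional arguments.
green = Color(*rgba)

# Option 2: use the _make() class method, which accepts any iterable.
also_green = Color._make(rgba)
```

Both calls rely on the values being in field order, so they are only as safe as the tuple they start from.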

## How to Convert a Named Tuple to Dict or Regular Tuple

We've just learned how to create a `namedtuple` from a `dict`. What about the inverse? How can we convert a named tuple to a dictionary instance?

It turns out that `namedtuple` comes with a method called `._asdict()`. So, converting it is as simple as calling the method.
```python
>>> blue = Color(r=0, g=0, b=255, alpha=1.0)
>>> blue._asdict()
{'r': 0, 'g': 0, 'b': 255, 'alpha': 1.0}
```
You may be wondering why the method name starts with a `_`. Unfortunately, this is one of Python's inconsistencies. Usually, a leading `_` marks a *private* method or attribute. However, `namedtuple` prefixes its public methods and attributes with it to avoid conflicts with field names. Besides `_asdict`, there’s also `_replace`, `_fields`, and `_field_defaults`. You can find all of them [here](https://docs.python.org/3/library/collections.html#collections.somenamedtuple._asdict).
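As a quick illustration of those other helpers, here is a small sketch built on the `Color` tuple from earlier. Note that `_replace` returns a new instance and leaves the original untouched.

```python
from collections import namedtuple

Color = namedtuple("Color", "r g b alpha")
blue = Color(r=0, g=0, b=255, alpha=1.0)

# _fields lists the field names in order.
fields = Color._fields

# _replace builds a *new* tuple with some fields changed.
translucent_blue = blue._replace(alpha=0.5)

# _field_defaults maps field names to defaults; it is empty here
# because Color was declared without any.
defaults = Color._field_defaults
```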

To convert a named tuple into a regular tuple, it's enough to pass it to the `tuple` constructor.

```python
>>> tuple(Color(r=50, g=205, b=50, alpha=0.1))
(50, 205, 50, 0.1)
```

## How to Sort a List of `namedtuple`s

Another common use case is storing several `namedtuple` instances in a list and sorting them based on some criterion. For example, say that we have a list of colors and we need to sort them by alpha intensity.

Fortunately, Python offers a very *pythonic* way of doing that. We can use the `operator.attrgetter` function. According to the [docs](https://docs.python.org/3/library/operator.html#operator.attrgetter), `attrgetter` “returns a callable object that fetches attr from its operand”. In layman’s terms, we give `attrgetter` the name of the field we want to sort by and pass the result to the `sorted` function as its `key`. Example:

```python
from operator import attrgetter
...
colors = [
    Color(r=50, g=205, b=50, alpha=0.1),
    Color(r=50, g=205, b=50, alpha=0.5),
    Color(r=50, g=0, b=0, alpha=0.3)
]
...
>>> sorted(colors, key=attrgetter("alpha"))
[Color(r=50, g=205, b=50, alpha=0.1),
 Color(r=50, g=0, b=0, alpha=0.3),
 Color(r=50, g=205, b=50, alpha=0.5)]
```
Now, the list of colors is sorted in ascending order by `alpha` intensity!
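The same approach extends to other orderings. As a sketch, `attrgetter` also accepts several field names (it then returns a tuple of values, which gives a multi-key sort), and `sorted` takes `reverse=True` for descending order. The colors below are sample data chosen so that two of them tie on `alpha`.

```python
from collections import namedtuple
from operator import attrgetter

Color = namedtuple("Color", "r g b alpha")

colors = [
    Color(r=50, g=205, b=50, alpha=0.5),
    Color(r=0, g=0, b=255, alpha=0.1),
    Color(r=50, g=0, b=0, alpha=0.5),
]

# Sort by alpha first, then by the green channel to break ties.
by_alpha_then_green = sorted(colors, key=attrgetter("alpha", "g"))

# Highest alpha first.
by_alpha_desc = sorted(colors, key=attrgetter("alpha"), reverse=True)
```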

## How to Serialize `namedtuple`s to JSON

Sometimes you may need to save a `namedtuple` to JSON. As you probably know, Python dictionaries can be converted to JSON through the `json` module. So if we convert our tuple to a dictionary with the `_asdict` method, we’re all set. As an example, consider this scenario:

```python
>>> blue = Color(r=0, g=0, b=255, alpha=1.0)
>>> import json
>>> json.dumps(blue._asdict())
'{"r": 0, "g": 0, "b": 255, "alpha": 1.0}'
```
As you can see, `json.dumps` converts a `dict` into a JSON string.
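It is worth seeing why the detour through `_asdict` matters. In the sketch below, passing the named tuple straight to `json.dumps` produces a JSON array, because the `json` module treats it like any other tuple, so the field names are lost. Going through `_asdict` keeps them, which also makes it possible to rebuild the object later.

```python
import json
from collections import namedtuple

Color = namedtuple("Color", "r g b alpha")
blue = Color(r=0, g=0, b=255, alpha=1.0)

# Serialized directly, the named tuple becomes a plain JSON array.
as_array = json.dumps(blue)

# Through _asdict() the field names survive the round trip.
payload = json.dumps(blue._asdict())
restored = Color(**json.loads(payload))
```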

## How to Add a `docstring` to a `namedtuple`

In Python, we can document methods, classes and modules using plain strings. This string is then made available as a special attribute named `__doc__`. That being said, how can we add documentation to our `Color` `namedtuple`?

There’s no right answer to this, but we can do it in two ways. The first one (and a bit more cumbersome) is to extend the tuple using a wrapper. By doing so, we can then define the `docstring` in this wrapper. As an example, consider the following snippet:

```python
_Color = namedtuple("Color", "r g b alpha")

class Color(_Color):
    """A namedtuple that represents a color.
    It has 4 fields:
    r - red
    g - green
    b - blue
    alpha - the alpha channel
    """

>>> print(Color.__doc__)
A namedtuple that represents a color.
    It has 4 fields:
    r - red
    g - green
    b - blue
    alpha - the alpha channel
>>> help(Color)
Help on class Color in module __main__:

class Color(Color)
 |  Color(r, g, b, alpha)
 |  
 |  A namedtuple that represents a color.
 |  It has 4 fields:
 |  r - red
 |  g - green
 |  b - blue
 |  alpha - the alpha channel
 |  
 |  Method resolution order:
 |      Color
 |      Color
 |      builtins.tuple
 |      builtins.object
 |  
 |  Data descriptors defined here:
 |  
 |  __dict__
 |      dictionary for instance variables (if defined)
```
As you can see, by inheriting from the `_Color` tuple, we added a `__doc__` attribute to it.

The second way of adding `docstring` is just setting `__doc__`. You see? There’s no need to extend the tuple in the first place.

```python
>>> Color.__doc__ = """A namedtuple that represents a color.
    It has 4 fields:
    r - red
    g - green
    b - blue
    alpha - the alpha channel
    """
```
Just bear in mind that these methods only work on Python 3+.
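The same trick can be applied to individual fields, since each field is exposed as an attribute on the class with its own writable `__doc__`. A minimal sketch follows; the wording of the docstrings is just an example.

```python
from collections import namedtuple

Color = namedtuple("Color", "r g b alpha")

# Document the class as a whole...
Color.__doc__ = "A namedtuple that represents a color."

# ...and individual fields (example wording).
Color.r.__doc__ = "The red channel"
Color.alpha.__doc__ = "The alpha channel"
```

These per-field docstrings then show up in the output of `help(Color)`.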

## How to Add a Method to a Namedtuple 

You can add a method to a named tuple class by using inheritance. Following the previous example, we can extend it not only to add a docstring but also to add custom methods.

```python
>>> from collections import namedtuple

>>> _Color = namedtuple("Color", "r g b")

>>> class Color(_Color):
...     """A namedtuple that represents a color.
...     It has 3 fields:
...     r - red
...     g - green
...     b - blue
...     """
...     def to_hex(self) -> str:
...         return f"#{self.r:02x}{self.g:02x}{self.b:02x}"
...
>>> blue = Color(r=0, g=0, b=255)

>>> blue.to_hex()
'#0000ff'
```
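One detail to be aware of when subclassing: unless the subclass declares `__slots__ = ()`, its instances get a `__dict__`, which means new attributes can be attached at runtime again. The following sketch adds the empty `__slots__` to keep the subclass as strict as the plain named tuple.

```python
from collections import namedtuple


class Color(namedtuple("Color", "r g b")):
    """A namedtuple that represents a color."""

    # No per-instance __dict__, so new attributes cannot be added.
    __slots__ = ()

    def to_hex(self) -> str:
        return f"#{self.r:02x}{self.g:02x}{self.b:02x}"


blue = Color(r=0, g=0, b=255)

try:
    blue.e = 0
    rejected = False
except AttributeError:
    rejected = True
```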

## What Are the Differences Between `namedtuple`s and Data Classes?

Before Python 3.7, creating a simple container of data involved using either:
- a `namedtuple` 
- a regular class 
- a third-party library, such as `attrs`. 

If you wanted to go the class route, that meant implementing a couple of methods yourself. For instance, a regular class requires an `__init__` method to set the attributes during instantiation. If you wanted the class to be *hashable*, you had to implement a `__hash__` method. To compare different objects, you also needed an `__eq__` method. And finally, to make debugging easier, you needed a `__repr__` method. Let’s revisit our color use case using a regular class.

```python
class Color:
    """A regular class that represents a color."""

    def __init__(self, r, g, b, alpha=0.0):
        self.r = r
        self.g = g
        self.b = b
        self.alpha = alpha

    def __hash__(self):
        return hash((self.r, self.g, self.b, self.alpha))

    def __repr__(self):
        return "{0}({1}, {2}, {3}, {4})".format(
            self.__class__.__name__, self.r, self.g, self.b, self.alpha
        )

    def __eq__(self, other):
        if not isinstance(other, Color):
            return False
        return (
            self.r == other.r
            and self.g == other.g
            and self.b == other.b
            and self.alpha == other.alpha
        )
```
As you can see, there's a lot to implement when all you need is a container to hold the data, without bothering with distracting details. Also, a key reason why people preferred to implement a class is that its instances are mutable. In fact, the [PEP](https://www.python.org/dev/peps/pep-0557/#abstract) that introduced Data Classes refers to them as "mutable `namedtuple`s with defaults".

Now, let's see how this class is implemented as a Data Class.
```python
from dataclasses import dataclass
...
@dataclass
class Color:
    """A regular class that represents a color."""
    r: float
    g: float
    b: float
    alpha: float
```
> Wow! Is that it?

Yes, that's it. As simple as that! A major difference is that, since `__init__` is generated for you, you can just define the attributes after the `docstring`. Also, they must be annotated with a type hint.

Besides being mutable, a Data Class can also have optional fields out of the box. Let’s say that our `Color` class does not require an `alpha` field. We can then make it `Optional`.

```python
from dataclasses import dataclass
from typing import Optional
...
@dataclass
class Color:
    """A regular class that represents a color."""
    r: float
    g: float
    b: float
    alpha: Optional[float] = None
```

And we can instantiate it like so:
```console
>>> blue = Color(r=0, g=0, b=255)
```

Since they're mutable, we can change whatever field we want:
```console
>>> blue = Color(r=0, g=0, b=255)
>>> blue.r = 1
>>> # or even add more fields on the fly
>>> blue.e = 10
```
Unfortunately, `namedtuple`s don't have optional fields by default: every field must be supplied unless you take extra steps to give it a default value.

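Caveat: a Data Class with the default settings is not hashable. To get a `__hash__` method, either make the class immutable with `frozen=True` or, if it must stay mutable, force one by setting `unsafe_hash` to `True`: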
```python
@dataclass(unsafe_hash=True)
class Color:
    ...
```
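For comparison, `frozen=True` is the option that actually makes instances immutable, and with the default `eq=True` it also gives them a `__hash__`. A minimal sketch, where the class name `FrozenColor` is made up for this example:

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class FrozenColor:
    r: float
    g: float
    b: float
    alpha: float


blue = FrozenColor(r=0, g=0, b=255, alpha=1.0)

# Equal instances hash alike, so the set keeps only one of them.
unique_colors = {blue, FrozenColor(r=0, g=0, b=255, alpha=1.0)}

try:
    blue.r = 1
    mutated = True
except FrozenInstanceError:
    mutated = False
```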
Another difference is that unpacking is a first-class citizen with `namedtuple`s. If you want your Data Class to have the same behavior, you must implement it yourself.
```python
from dataclasses import dataclass, astuple
...
@dataclass
class Color:
    """A regular class that represents a color."""
    r: float
    g: float
    b: float
    alpha: float

    def __iter__(self):
        yield from astuple(self)
```
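If adding `__iter__` to the class feels too intrusive, a lighter alternative is to call `astuple` at the point where the values are unpacked. A small sketch:

```python
from dataclasses import astuple, dataclass


@dataclass
class Color:
    r: float
    g: float
    b: float
    alpha: float


blue = Color(r=0, g=0, b=255, alpha=1.0)

# astuple() returns the field values as a plain tuple, in declaration order.
r, g, b, alpha = astuple(blue)
```

Keep in mind that `astuple` copies the values recursively, so it does more work than unpacking a named tuple.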
### Performance Comparison

Comparing only the features is not enough; named tuples and data classes differ in performance too. Data classes are implemented in pure Python and store their fields in a per-instance `dict`, so accessing a field with dot notation is an ordinary attribute lookup.

On the other hand, `namedtuple`s are just an extension of a regular `tuple`. That means their implementation is based on faster C code and has a smaller memory footprint.

To show that, consider this experiment on Python 3.8.5.
```console
In [6]: import sys

In [7]: ColorTuple = namedtuple("Color", "r g b alpha")

In [8]: @dataclass
   ...: class ColorClass:
   ...:     """A regular class that represents a color."""
   ...:     r: float
   ...:     g: float
   ...:     b: float
   ...:     alpha: float
   ...: 

In [9]: color_tup = ColorTuple(r=50, g=205, b=50, alpha=1.0)

In [10]: color_cls = ColorClass(r=50, g=205, b=50, alpha=1.0)

In [11]: %timeit color_tup.r
36.8 ns ± 0.109 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [12]: %timeit color_cls.r
38.4 ns ± 0.112 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [15]: sys.getsizeof(color_tup)
Out[15]: 72

In [16]: sys.getsizeof(color_cls) + sys.getsizeof(vars(color_cls))
Out[16]: 152
```
As you can see, accessing a field takes roughly the same time in both (in this run the named tuple is marginally faster), but when it comes to memory usage, the data class takes up about twice as much space as the tuple.
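The experiment above relies on IPython's `%timeit`. If you want to repeat it in a plain script, the following sketch uses the standard `timeit` module instead. The names `ColorTuple` and `ColorClass` mirror the session above, and the iteration count is an arbitrary choice. The numbers depend heavily on the machine and the Python version, so none are shown here.

```python
import sys
import timeit
from collections import namedtuple
from dataclasses import dataclass

ColorTuple = namedtuple("ColorTuple", "r g b alpha")


@dataclass
class ColorClass:
    r: float
    g: float
    b: float
    alpha: float


color_tup = ColorTuple(r=50, g=205, b=50, alpha=1.0)
color_cls = ColorClass(r=50, g=205, b=50, alpha=1.0)

# Total time, in seconds, for 100,000 attribute lookups on each object.
tuple_time = timeit.timeit("c.r", globals={"c": color_tup}, number=100_000)
class_time = timeit.timeit("c.r", globals={"c": color_cls}, number=100_000)

# For the data class, count the instance plus the dict holding its fields.
tuple_size = sys.getsizeof(color_tup)
class_size = sys.getsizeof(color_cls) + sys.getsizeof(vars(color_cls))

print(f"namedtuple: {tuple_time:.4f}s, {tuple_size} bytes")
print(f"dataclass:  {class_time:.4f}s, {class_size} bytes")
```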

## How to Add Type Hints to a `namedtuple`

As we saw, Data Classes use type hints by default. However, we can have them on `namedtuple`s as well. By importing the `NamedTuple` type from `typing` and inheriting from it, we can have our `Color` tuple annotated.

```python
from typing import NamedTuple
...
class Color(NamedTuple):
    """A namedtuple that represents a color."""
    r: float
    g: float
    b: float
    alpha: float
```

Another detail that might have gone unnoticed is that this way also allows us to have `docstring`s. If we type `help(Color)` we'll be able to see them.
```console
Help on class Color in module __main__:

class Color(builtins.tuple)
 |  Color(r: float, g: float, b: float, alpha: float)
 |  
 |  A namedtuple that represents a color.
 |  
 |  Method resolution order:
 |      Color
 |      builtins.tuple
 |      builtins.object
 |  
 |  Methods defined here:
 |  
 |  __getnewargs__(self)
 |      Return self as a plain tuple.  Used by copy and pickle.
 |  
 |  __repr__(self)
 |      Return a nicely formatted representation string
 |  
 |  _asdict(self)
 |      Return a new dict which maps field names to their values.

```
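A class defined this way is still an ordinary named tuple, so everything covered so far keeps working. One caveat, sketched below: the type hints are there for readers and static type checkers, and Python does not enforce them at runtime.

```python
from typing import NamedTuple, get_type_hints


class Color(NamedTuple):
    """A namedtuple that represents a color."""
    r: float
    g: float
    b: float
    alpha: float


blue = Color(r=0, g=0, b=255, alpha=1.0)

# Still a tuple: unpacking and _asdict() behave as before.
r, g, b, alpha = blue
as_dict = blue._asdict()

# The annotations can be inspected...
hints = get_type_hints(Color)

# ...but they are not checked when an instance is created.
not_checked = Color(r="red", g=0, b=0, alpha=1.0)
```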

## How to Add Optional Default Values to a `namedtuple`

Earlier, we learned that Data Classes can have optional values. Also, I mentioned that mimicking the same behavior on a named tuple requires some hacking. As it turns out, we can use inheritance, as in the example below.

```python
from collections import namedtuple

class Color(namedtuple("Color", "r g b alpha")):
    __slots__ = ()
    def __new__(cls, r, g, b, alpha=None):
        return super().__new__(cls, r, g, b, alpha)
>>> c = Color(r=0, g=0, b=0)
>>> c
Color(r=0, g=0, b=0, alpha=None)
```
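On Python 3.7 and later there are also two built-in alternatives to the inheritance approach, sketched below. The `namedtuple` factory accepts a `defaults` argument, which is applied to the rightmost fields, and a `typing.NamedTuple` class can assign defaults directly in the class body. The name `TypedColor` is made up to keep the two variants apart.

```python
from collections import namedtuple
from typing import NamedTuple, Optional

# defaults applies to the rightmost fields, so this single value targets alpha.
Color = namedtuple("Color", "r g b alpha", defaults=(None,))


class TypedColor(NamedTuple):
    r: float
    g: float
    b: float
    alpha: Optional[float] = None  # fields with defaults must come last


c = Color(r=0, g=0, b=0)
t = TypedColor(r=0, g=0, b=0)
```

In both cases, the declared defaults can be inspected through `_field_defaults`.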

## Conclusion

Named tuples are a very powerful data structure. They allow us to write pythonic code that's cleaner and more reliable. Despite the competition from the new Data Classes, they still have plenty of firewood to burn. In this tutorial, we learned several ways of making use of `namedtuple`s, and I hope you find them useful.
