How to Unit Test Complex Data Like Numpy Arrays in Python

In this tutorial, we'll learn how to test complex data structures in Python.

Examples of such data include images, nested dictionaries, dictionaries of numpy arrays and even file contents. Learning how to test general data is a very helpful skill, especially if you work in Data Science, Machine Learning or other kinds of scientific programming.

Table of Contents

  1. Requirements
  2. Installation
  3. Creating a Demo App
  4. Conclusion


Requirements

For this tutorial we'll use Python 3.7+ (the version pinned in pyproject.toml), pytest, poetry, pandas, numpy, Pillow (PIL) and pytest-regressions.


Installation

To create a new project, we’ll use poetry. If you don’t know what poetry is or how to use it, please refer to the amazing official docs.

The first step is to create a new directory and initialize the project.

mkdir -p complex_data/app
mkdir -p complex_data/tests
cd complex_data

Then, create the pyproject.toml file, which lists all the required dependencies.

$ cat <<EOF > pyproject.toml
[tool.poetry]
name = "complex_data"
version = "0.1.0"
description = "Sample project to learn how to test complex data in Python"
authors = ["miguendes"]
license = "MIT"

[tool.poetry.dependencies]
python = "^3.7"
pytest = "^6.0.1"
pytest-regressions = "^2.0.1"
numpy = "^1.19.1"
pandas = "^1.1.2"
pillow = "^7.2.0"

[build-system]
requires = ["poetry-core>=1.0.0a5"]
build-backend = "poetry.core.masonry.api"
EOF

Now that everything is set up, let's install the packages. Poetry will take care of creating the virtualenv for you and will install everything there.

poetry install

Creating a Demo App

Now, let’s create a demo app so we can build tests for it. This demo will be very basic and enough to demonstrate how to test complex data. The concepts can be applied to any domain that deals with a similar type of data.

To start with, we’ll create two files. The first one will be the actual demo implementation. The second one will hold all the test cases we are going to build. Alternatively, you can also clone the project from GitHub.

touch app/
touch app/
touch tests/
touch tests/

Testing Nested Dictionaries

A public REST API is a popular way to get data from third-party entities. Such APIs usually return the data in JSON format, and sometimes the data is organized in a deeply nested structure. In Python this data will most likely be represented as a dictionary. As an example of a function that returns a deeply nested dictionary, consider the following.

# app/
def config_data(verbose_desc: str, simple_desc: str) -> dict:
    return {
        "agg_results": {
            "t1": [True, False, True, False],
            "nums": [1.3, 3.4, 23.456, 21.3456],
            "count": 120,
        },
        "sample": {"desc": {"verbose": verbose_desc, "simple": simple_desc}},
    }

This function takes two parameters, verbose_desc and simple_desc, and sets them under sample -> desc. Now, how can we test this function? One way is to inspect each key->value pair.

def test_config_data():
    actual = config_data(verbose_desc="first experiment", simple_desc="first")
    assert actual["agg_results"]["t1"] == [True, False, True, False]
    assert actual["agg_results"]["nums"] == [1.3, 3.4, 23.456, 21.3456]
    assert actual["agg_results"]["count"] == 120

    assert actual["sample"]["desc"]["verbose"] == "first experiment"
    assert actual["sample"]["desc"]["simple"] == "first"

If we run the test, we can see that it works, but... You know, it’s too verbose and error-prone. For example, if we add new keys we will need to update the test to add more assert statements. Also, it doesn’t scale at all if the dictionary contains large data. So what is the alternative?


It turns out there’s a very interesting pytest plugin that makes this kind of test very easy and simple. With just a few lines of code, you can test this config data.

# tests/
def test_config_data_with_pytest_regressions(data_regression):
    actual = config_data(verbose_desc="first experiment", simple_desc="first")
    data_regression.check(actual)

That's it! The only step needed is to request the data_regression fixture and pass the actual data to its check method. pytest-regressions will create a YAML file with the expected data the first time you run the test. Then it will use this file when you run the test again. As a result, if a regression is introduced, the plugin will detect the diff and the test will fail.

When you run the test for the first time, it fails, and pytest-regressions stores the output in a directory named after the test file.

FAILED     [100%] (test_config_data_with_pytest_regressions)
data_regression = <pytest_regressions.data_regression.DataRegressionFixture object at 0x7ff6c4b4f130>

    def test_config_data_with_pytest_regressions(data_regression):
        actual = config_data(verbose_desc="first experiment", simple_desc="first")
>       data_regression.check(actual)
E       Failed: File not found in data directory, created:
E       - /home/miguel/projects/tutorials/complex_data/tests/test_complex_app/test_config_data_with_pytest_regressions.yml

If you run it again, the test passes.

PASSED     [100%]
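To make the create-then-compare cycle more concrete, here is a minimal sketch of the idea using only the standard library. It is not the plugin's actual implementation: the helper name check_against_baseline is hypothetical, it stores JSON rather than YAML, and it returns a status string instead of failing the test on the first run.

```python
import json
import tempfile
from pathlib import Path


def check_against_baseline(actual: dict, baseline: Path) -> str:
    """Hypothetical helper: compare `actual` with a stored baseline file."""
    if not baseline.exists():
        # First run: nothing to compare against yet, so record the data.
        baseline.write_text(json.dumps(actual, indent=2, sort_keys=True))
        return "created"
    expected = json.loads(baseline.read_text())
    return "match" if actual == expected else "mismatch"


with tempfile.TemporaryDirectory() as tmp:
    baseline = Path(tmp) / "config_data.json"
    data = {"agg_results": {"count": 120, "nums": [1.3, 3.4]}}

    first_run = check_against_baseline(data, baseline)   # baseline is written
    second_run = check_against_baseline(data, baseline)  # same data as baseline

    data["agg_results"]["count"] = 121                   # simulate a regression
    third_run = check_against_baseline(data, baseline)

print(first_run, second_run, third_run)
```

The point of the design is that the expected data lives in a file next to the tests, so the test body stays one line long no matter how large the dictionary grows.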

Testing Images

Image manipulation is a very common task in AI. In order to train Machine Learning algorithms, images must be pre-processed. For instance, some algorithms require the image to be in gray-scale. Sometimes, images must be scaled down, or up. On other occasions you might want to do image augmentation, which usually consists of rotating or skewing the image.

Now, let’s say that we need to scale an image down to 100x100 and convert it to gray as a pre-processing step. We can write a very simple function called convert_image_to_gray. To test this function, we can rely on the image_regression fixture provided by pytest-regressions. The following example illustrates that.

# app/
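from typing import Tuple
from PIL import Image

def convert_image_to_gray(
    input_path: str, output_path: str, size: Tuple[int, int] = (200, 200)
) -> None:
    image = Image.open(input_path)
    gray_image = image.convert("L")
    gray_image.thumbnail(size, Image.ANTIALIAS)
    gray_image.save(output_path, "PNG")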

The test for it is just like this:

# tests/
from pathlib import Path

def test_convert_image_to_gray(image_regression):
    output_file = Path("tests/resources/python_logo_gray.png")
    convert_image_to_gray(
        "tests/resources/python_logo.png", str(output_file), size=(100, 100)
    )
    image_regression.check(output_file.read_bytes(), diff_threshold=1.0)

And that's it again! No extra setup steps, no complex asserts, just one line.
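One detail worth keeping in mind about the resizing step: Pillow's thumbnail method keeps the aspect ratio, so the requested size acts as an upper bound rather than an exact output size. The sketch below assumes Pillow is installed and uses a synthetic in-memory image (an assumption made for illustration, instead of the python_logo.png file from the test).

```python
from PIL import Image

# A synthetic 400x200 RGB image stands in for a real input file.
image = Image.new("RGB", (400, 200), color=(30, 120, 200))

gray_image = image.convert("L")   # "L" is 8-bit gray-scale
gray_image.thumbnail((100, 100))  # resizes in place, keeping the aspect ratio

mode = gray_image.mode
size = gray_image.size
print(mode, size)
```

Because the source image is twice as wide as it is tall, the result fits inside the 100x100 box without being stretched to fill it.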

Testing Dictionaries of Numpy Arrays

Another common practice in Data Science is to manipulate numpy arrays. It’s very common to have dictionaries with string keys and numpy arrays as values. Testing them is just as annoying as testing nested dictionaries, and it tends to get worse when the arrays are made up of floating-point numbers.
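To illustrate why floating-point arrays are harder to test, the short sketch below (assuming numpy is installed; the values are made up for illustration) compares a computed array against hand-written expected values, first exactly and then with a tolerance.

```python
import numpy as np

a = np.array([0.1, 0.2, 0.3])
b = np.array([3.0, 3.0, 3.0])
product = a * b

# The values we would naively write down as the expected result.
expected = np.array([0.3, 0.6, 0.9])

# Exact comparison is fragile because of floating-point rounding.
exact_match = bool(np.array_equal(product, expected))

# A tolerance-based comparison accepts the tiny rounding differences.
close_match = bool(np.allclose(product, expected))

# The assertion-style equivalent raises AssertionError on a mismatch.
np.testing.assert_allclose(product, expected, rtol=1e-7)

print(exact_match, close_match)
```

Any test that compares float arrays needs some notion of tolerance like this, which is one more thing to get right when writing the asserts by hand.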

As in our previous examples, let's consider a function that returns a dictionary of numpy arrays.

# app/
from typing import Dict
import numpy as np

def element_wise_mult(a: np.ndarray, b: np.ndarray) -> Dict[str, np.ndarray]:
    res = a * b
    return {"res": res}

This function just multiplies two 1D numpy arrays element-wise and stores the result in a dictionary. To test it, we can use the num_regression fixture. As you may have guessed, the interface is the same as for the other fixtures: you call the check method. A full test definition goes like this:

# tests/
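import numpy as np

def test_elemwise_multi_calculation(num_regression):
    np.random.seed(0)  # fixed seed so the stored baseline is reproducible
    a = np.random.randn(6)
    b = np.random.randn(6)
    result = element_wise_mult(a, b)
    num_regression.check(result)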

Testing Files

Python has three categories of file objects: raw binary files, buffered binary files and text files. A binary file can be used to store any kind of binary data such as images, mp3s, videos and so on. A text file, on the other hand, can be a simple .txt, an .html, an .md or just your good old .py script.
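The three categories map directly onto what the built-in open function returns, depending on the mode and buffering arguments. Here is a small sketch using a throwaway temporary file (the file name sample.txt is just an illustration):

```python
import io
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "sample.txt"
    path.write_bytes(b"hello")

    # Raw binary file: binary mode with buffering disabled.
    with open(path, "rb", buffering=0) as raw:
        raw_type = type(raw)

    # Buffered binary file: the default for binary mode.
    with open(path, "rb") as buffered:
        buffered_type = type(buffered)
        payload = buffered.read()

    # Text file: bytes are decoded into str while reading.
    with open(path, "r", encoding="utf-8") as text:
        text_type = type(text)
        content = text.read()

print(raw_type.__name__, buffered_type.__name__, text_type.__name__)
```

Reading the same file in binary mode gives bytes, while text mode gives a str, which is why the two examples that follow pass different arguments to the regression fixture.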

Binary File Example

The simplest example involving a binary file is a function that takes a path string, loads a binary file from disk and returns its bytes. pytest-regressions comes with a file_regression fixture that helps a lot when testing file contents.

# app/
def read_from_file(path: str) -> bytes:
    with open(path, "rb") as f:
        return f.read()

The test goes like this:

# tests/
def test_read_from_file(file_regression, datadir):
    contents = read_from_file("tests/resources/data.bin")
    file_regression.check(contents, binary=True, extension=".bin")

Text File Example

Like I said before, an .html file is just a text file. In this example we'll consider a function that converts a dictionary to an HTML table.

# app/
def dict_to_html(data: dict) -> str:
    html = "<table><tr><th>" + "</th><th>".join(data.keys()) + "</th></tr>"

    for row in zip(*data.values()):
        html += "<tr><td>" + "</td><td>".join(row) + "</td></tr>"

    html += "</table>"

    return html

And the test...

# tests/
def test_dict_to_html(file_regression):
    data = {
        "Heights": ["30", "12", "12"],
        "Download Count": ["123", "34", "2"],
    }

    html = dict_to_html(data)
    file_regression.check(html, extension=".html")
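To see what kind of content ends up in the stored .html file, here is a standalone sketch that repeats the function from above and runs it on a shortened version of the test data. It also shows one limitation that follows from the use of str.join: every cell must already be a string.

```python
def dict_to_html(data: dict) -> str:
    # Same logic as the function shown above, repeated so this runs standalone.
    html = "<table><tr><th>" + "</th><th>".join(data.keys()) + "</th></tr>"
    for row in zip(*data.values()):
        html += "<tr><td>" + "</td><td>".join(row) + "</td></tr>"
    html += "</table>"
    return html


data = {"Heights": ["30", "12"], "Download Count": ["123", "34"]}
html = dict_to_html(data)
print(html)

# str.join only accepts strings, so integer cells raise a TypeError.
try:
    dict_to_html({"Heights": [30, 12]})
    rejected_ints = False
except TypeError:
    rejected_ints = True
```

Even for this tiny table the output is a long single-line string, which hints at why keeping it in a separate file is more comfortable than pasting it into the test.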

Well... We could just assert the string returned, right?

You might be asking whether we could just assert the returned string. That’s a fair point. However, if the string is too big, the test setup becomes polluted and overly verbose. As an alternative, we could store it in a file and then load it. Well, that’s actually what pytest-regressions does, with the benefit of hiding that complexity.


Conclusion

Testing complex data can be tricky and cumbersome. Fortunately, pytest is highly extensible through plugins. pytest-regressions is a lesser-known but very useful plugin that makes testing complex data easier. I hope you learned something useful and see you next time!