<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[A blog about Python, testing, and best practices]]></title><description><![CDATA[Helping developers learn more about Python, testing, and best practices. I share everything I know through tutorials, and code examples.]]></description><link>https://miguendes.me</link><generator>RSS for Node</generator><lastBuildDate>Mon, 20 Apr 2026 03:48:52 GMT</lastBuildDate><atom:link href="https://miguendes.me/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[The Best Ways to Compare Two Lists in Python]]></title><description><![CDATA[A while ago I wrote a guide on how to compare two dictionaries in Python 3, and how this task is not as simple as it might sound. It turns out comparing two lists in Python is just so tricky as comparing dicts.
The way we've been taught to compare tw...]]></description><link>https://miguendes.me/python-compare-lists</link><guid isPermaLink="true">https://miguendes.me/python-compare-lists</guid><category><![CDATA[Python 3]]></category><category><![CDATA[Python]]></category><dc:creator><![CDATA[Miguel Brito]]></dc:creator><pubDate>Sun, 12 Dec 2021 08:38:46 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1639129372385/8FJrwRpJr.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A while ago I wrote a guide on <a target="_blank" href="https://miguendes.me/the-best-way-to-compare-two-dictionaries-in-python">how to compare two dictionaries</a> in Python 3, and how this task is not as simple as it might sound. It turns out comparing two lists in Python is just as tricky as comparing <code>dict</code>s.</p>
<p>The way we've been taught to compare two objects in Python is a bit misleading. Most books and tutorials teach object comparison by using either the <code>==</code> or the <code>is</code> operator. In reality, these two operators cover just a small fraction of the most frequent use cases. </p>
<p>For example:</p>
<ul>
<li>what if we want to compare a list of floating-point numbers considering a certain tolerance?</li>
<li>what if we want to compare two lists while ignoring the order in which the elements appear?</li>
<li>maybe we need to compare two lists and return the elements that appear in both</li>
<li>sometimes we might want to get the difference between two lists</li>
<li>what if we have two lists of strings and need to compare them while ignoring case?</li>
<li>what if we need to compare lists of <code>numpy</code> arrays with each other?</li>
<li>or maybe we have a list of custom objects, or a list of dictionaries.</li>
</ul>
<p>The list goes on and on, and for all of these use cases using <code>==</code> doesn't help.</p>
<p>That's what we are going to see in this article. We’ll learn the best ways of comparing two lists in Python for several use cases where the <code>==</code> operator is not enough.</p>
<p>Ready? Let's go! </p>
<h2 id="heading-comparing-if-two-lists-are-equal-in-python">Comparing if two lists are equal in Python</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639165088994/lBeHepjcd.png" alt="Comparing if two lists are equal in python" /></p>
<p>The easiest way to compare two lists for equality is to use the <code>==</code> operator. This comparison method works well for simple cases, but as we'll see later, it doesn't work with advanced comparisons.</p>
<p>An example of a simple case would be a list of <code>int</code> or <code>str</code> objects.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>numbers = [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>target = [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>numbers == target
<span class="hljs-literal">True</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>] == [<span class="hljs-number">1</span>, <span class="hljs-number">3</span>, <span class="hljs-number">2</span>]
<span class="hljs-literal">False</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>[<span class="hljs-string">'name'</span>, <span class="hljs-string">'lastname'</span>] == [<span class="hljs-string">'name'</span>, <span class="hljs-string">'lastname'</span>]
<span class="hljs-literal">True</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>[<span class="hljs-string">'name'</span>, <span class="hljs-string">'lastname'</span>] == [<span class="hljs-string">'name'</span>, <span class="hljs-string">'last name'</span>]   
<span class="hljs-literal">False</span>
</code></pre>
<p>Pretty simple, right? Unfortunately, the world is complex, and so is production grade code. In the real world, things get complicated really fast. As an illustration, consider the following cases.</p>
<p>Suppose you have a list of floating-point numbers that is built dynamically. You can add single elements, or elements derived from a mathematical operation such as <code>0.1 + 0.1 + 0.1</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>numbers = []
<span class="hljs-meta">&gt;&gt;&gt; </span>numbers.append(<span class="hljs-number">0.1</span> + <span class="hljs-number">0.1</span> + <span class="hljs-number">0.1</span>)  <span class="hljs-comment"># derive the element based on a summation</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>numbers.append(<span class="hljs-number">0.2</span>) <span class="hljs-comment"># add a single element</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>target = [<span class="hljs-number">0.3</span>, <span class="hljs-number">0.2</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>numbers == target  <span class="hljs-comment"># compares the lists</span>
<span class="hljs-literal">False</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>numbers  <span class="hljs-comment"># Ooopppssss....</span>
[<span class="hljs-number">0.30000000000000004</span>, <span class="hljs-number">0.2</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>target
[<span class="hljs-number">0.3</span>, <span class="hljs-number">0.2</span>]
</code></pre>
<p>Clearly, floating point arithmetic has its <a target="_blank" href="https://docs.python.org/3/tutorial/floatingpoint.html">limitations</a>, and sometimes we want to compare two lists but ignore precision errors, or even define some tolerance. For cases like this, the <code>==</code> operator won’t suffice.</p>
<p>Things can get more complicated if the lists have custom objects or objects from other libraries, such as <code>numpy</code>.</p>
<pre><code class="lang-python">In [<span class="hljs-number">1</span>]: <span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np

In [<span class="hljs-number">2</span>]: numbers = [np.ones(<span class="hljs-number">3</span>), np.zeros(<span class="hljs-number">2</span>)]

In [<span class="hljs-number">3</span>]: numbers
Out[<span class="hljs-number">3</span>]: [array([<span class="hljs-number">1.</span>, <span class="hljs-number">1.</span>, <span class="hljs-number">1.</span>]), array([<span class="hljs-number">0.</span>, <span class="hljs-number">0.</span>])]

In [<span class="hljs-number">4</span>]: target = [np.ones(<span class="hljs-number">3</span>), np.zeros(<span class="hljs-number">2</span>)]

In [<span class="hljs-number">5</span>]: numbers == target
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
&lt;ipython-input<span class="hljs-number">-5</span>-b832db4b039d&gt; <span class="hljs-keyword">in</span> &lt;module&gt;
----&gt; <span class="hljs-number">1</span> numbers == target

ValueError: The truth value of an array <span class="hljs-keyword">with</span> more than one element <span class="hljs-keyword">is</span> ambiguous. Use a.any() <span class="hljs-keyword">or</span> a.all()
</code></pre>
<p>You might also like to compare the lists and return the matches. Or maybe compare the two lists and return the differences. Or perhaps you want to compare two lists ignoring the duplicates, or compare a list of dictionaries in Python. </p>
<p>In every single case, using <code>==</code> is not the answer, and that's what we are going to see next: how to perform complex comparison operations between two lists in Python.</p>
<h2 id="heading-comparing-two-lists-of-float-numbers">Comparing two lists of float numbers</h2>
<p>In the previous section, we saw that floating point arithmetic can cause precision errors. If we have a list of floats and want to compare it with another list, chances are that the <code>==</code> operator won't help.</p>
<p>Let's revisit the example from the previous section and see what is the best way of comparing two lists of floats.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>numbers = []
<span class="hljs-meta">&gt;&gt;&gt; </span>numbers.append(<span class="hljs-number">0.1</span> + <span class="hljs-number">0.1</span> + <span class="hljs-number">0.1</span>)  <span class="hljs-comment"># derive the element based on a summation</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>numbers.append(<span class="hljs-number">0.2</span>) <span class="hljs-comment"># add a single element</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>target = [<span class="hljs-number">0.3</span>, <span class="hljs-number">0.2</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>numbers == target  <span class="hljs-comment"># compares the lists</span>
<span class="hljs-literal">False</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>numbers  <span class="hljs-comment"># Ooopppssss....</span>
[<span class="hljs-number">0.30000000000000004</span>, <span class="hljs-number">0.2</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>target
[<span class="hljs-number">0.3</span>, <span class="hljs-number">0.2</span>]
</code></pre>
<p>As you see, <code>0.1 + 0.1 + 0.1 = 0.30000000000000004</code>, which causes the comparison to fail. Now, how can we do better? Is it even possible?</p>
<p>There are a few ways of approaching this task. One would be to create our own custom function that iterates over the elements and compares them one by one using the <a target="_blank" href="https://docs.python.org/3/library/math.html#math.isclose"><code>math.isclose()</code></a> function.</p>
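<p>As a rough sketch of that first option, the helper below walks both lists in parallel and checks each pair with <code>math.isclose()</code>. The function name <code>lists_are_close</code> and its default tolerances are illustrative choices, not something the standard library or this article prescribes.</p>

```python
import math


def lists_are_close(first, second, rel_tol=1e-9, abs_tol=0.0):
    """Return True when both lists have the same length and every pair of elements is close."""
    # zip() stops at the shorter list, so lengths are checked explicitly
    if len(first) != len(second):
        return False
    return all(
        math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, b in zip(first, second)
    )


numbers = [0.1 + 0.1 + 0.1, 0.2]
target = [0.3, 0.2]

print(numbers == target)                 # False
print(lists_are_close(numbers, target))  # True
```

<p>Unlike <code>==</code>, this only tells us whether the lists match, not where they differ.</p>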
<p>Fortunately we don't have to reinvent the wheel. As I showed in the <a target="_blank" href="https://miguendes.me/the-best-way-to-compare-two-dictionaries-in-python">"how to compare two dicts"</a> article, we can use a library called <code>deepdiff</code> for that. This library supports different types of objects and lists are one of them.</p>
<p>The example below starts off by setting up the two lists we want to compare. We then pass them to the <code>deepdiff.DeepDiff</code> constructor, which returns the difference. That's great, the returned value is much more informative than a simple boolean. </p>
<p>Since we want to ignore the precision error, we can <a target="_blank" href="https://zepworks.com/deepdiff/current/diff.html">set the number of digits AFTER the decimal point</a> to be used in the comparison. </p>
<p>The result is an empty dict, which means the lists are equal. If we try comparing against a list whose float already differs within the first 3 digits after the decimal point, the library will return that diff.</p>
<p>For reproducibility, in this article I used <code>deepdiff</code> version <code>5.6.0</code>, the latest at the time of writing.</p>
<pre><code class="lang-python">In [<span class="hljs-number">1</span>]: <span class="hljs-keyword">from</span> deepdiff <span class="hljs-keyword">import</span> DeepDiff

In [<span class="hljs-number">2</span>]: numbers = []

In [<span class="hljs-number">3</span>]: numbers.append(<span class="hljs-number">0.1</span> + <span class="hljs-number">0.1</span> + <span class="hljs-number">0.1</span>)  <span class="hljs-comment"># derive the element based on a summation</span>

In [<span class="hljs-number">4</span>]: numbers.append(<span class="hljs-number">0.2</span>) <span class="hljs-comment"># add a single element</span>

In [<span class="hljs-number">5</span>]: target = [<span class="hljs-number">0.3</span>, <span class="hljs-number">0.2</span>]

<span class="hljs-comment"># if we don't specify the number of significant digits, the comparison will use ==</span>
In [<span class="hljs-number">6</span>]: DeepDiff(numbers, target)
Out[<span class="hljs-number">6</span>]: 
{<span class="hljs-string">'values_changed'</span>: {<span class="hljs-string">'root[0]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">0.3</span>,
   <span class="hljs-string">'old_value'</span>: <span class="hljs-number">0.30000000000000004</span>}}}

<span class="hljs-comment"># 0.30000000000000004 and 0.3 are equal if we only look at the first 3 digits after the decimal point</span>
In [<span class="hljs-number">7</span>]: DeepDiff(numbers, target, significant_digits=<span class="hljs-number">3</span>)
Out[<span class="hljs-number">7</span>]: {}

In [<span class="hljs-number">8</span>]: numbers
Out[<span class="hljs-number">8</span>]: [<span class="hljs-number">0.30000000000000004</span>, <span class="hljs-number">0.2</span>]

In [<span class="hljs-number">9</span>]: target = [<span class="hljs-number">0.341</span>, <span class="hljs-number">0.2</span>]

<span class="hljs-comment"># 0.341 and 0.30000000000000004 differ within the first 3 digits after the decimal point</span>
In [<span class="hljs-number">10</span>]: DeepDiff(numbers, target, significant_digits=<span class="hljs-number">3</span>)
Out[<span class="hljs-number">10</span>]: 
{<span class="hljs-string">'values_changed'</span>: {<span class="hljs-string">'root[0]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">0.341</span>,
   <span class="hljs-string">'old_value'</span>: <span class="hljs-number">0.30000000000000004</span>}}}
</code></pre>
<h2 id="heading-comparing-if-two-lists-without-order-unordered-lists-are-equal">Comparing if two lists without order (unordered lists) are equal</h2>
<p>Lists in Python are ordered, so two lists holding the same elements in a different order are not equal. Sometimes, though, we want to treat two lists as the same as long as they have the same elements, regardless of their order.</p>
<p>There are three ways of doing this:</p>
<ul>
<li>sorting the lists and using the <code>==</code> operator</li>
<li>converting them to <code>set</code>s and using the <code>==</code> operator</li>
<li>using <code>deepdiff</code></li>
</ul>
<p>The first two methods assume the elements can be safely compared using the <code>==</code> operator. They don’t work for floating-point numbers and other complex objects, but as we saw in the previous section, we can use <code>deepdiff</code> for those.</p>
<h3 id="heading-sorting-the-lists-and-using-the-operator">Sorting the lists and using the <code>==</code> operator</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639167507922/6PwNOccSb.png" alt="comparing two lists in python using the sorted function" /></p>
<p>You can sort lists in Python in two different ways:</p>
<ul>
<li>using the <code>list.sort()</code> method</li>
<li>using the <code>sorted()</code> function</li>
</ul>
<p>The first method sorts a list in place, and that means your list will be modified. It's a good idea to not modify a list in place as it can introduce bugs that are hard to detect.</p>
<p>Using <code>sorted</code> is better since it returns a new list and keeps the original unmodified.</p>
<p>Let's see how it works.</p>
<pre><code class="lang-python">In [<span class="hljs-number">6</span>]: numbers = [<span class="hljs-number">10</span>, <span class="hljs-number">30</span>, <span class="hljs-number">20</span>]

In [<span class="hljs-number">7</span>]: target = [<span class="hljs-number">10</span>, <span class="hljs-number">20</span>, <span class="hljs-number">30</span>]

In [<span class="hljs-number">8</span>]: numbers == target
Out[<span class="hljs-number">8</span>]: <span class="hljs-literal">False</span>

In [<span class="hljs-number">9</span>]: sorted(numbers) == sorted(target)
Out[<span class="hljs-number">9</span>]: <span class="hljs-literal">True</span>

In [<span class="hljs-number">10</span>]: sorted(numbers)
Out[<span class="hljs-number">10</span>]: [<span class="hljs-number">10</span>, <span class="hljs-number">20</span>, <span class="hljs-number">30</span>]

In [<span class="hljs-number">11</span>]: sorted(target)
Out[<span class="hljs-number">11</span>]: [<span class="hljs-number">10</span>, <span class="hljs-number">20</span>, <span class="hljs-number">30</span>]
</code></pre>
<p>As a consequence, by sorting the lists first we ensure that both lists will have the same order, and thus can be compared using the <code>==</code> operator.</p>
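<p>To make the difference between the two sorting options concrete, here is a minimal sketch that shows which one touches the original list. The variable names are only for illustration.</p>

```python
numbers = [10, 30, 20]

# sorted() builds a new list and leaves the original untouched
ordered_copy = sorted(numbers)
print(ordered_copy)  # [10, 20, 30]
print(numbers)       # [10, 30, 20]

# list.sort() reorders the list in place and returns None
result = numbers.sort()
print(result)   # None
print(numbers)  # [10, 20, 30]
```

<p>Because <code>list.sort()</code> returns <code>None</code>, writing <code>numbers.sort() == target.sort()</code> would always be <code>True</code>, which is one more reason to prefer <code>sorted()</code> for comparisons.</p>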
<h3 id="heading-converting-the-lists-to-a-set">Converting the <code>list</code>s to a <code>set</code></h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639167841292/pSz_8OBMH.png" alt="comparing two lists in python using a set" /></p>
<p>Contrary to lists, sets in Python don’t care about order. For example, a set <code>{1, 2, 3}</code> is the same as <code>{2, 3, 1}</code>. As such, we can use this feature to compare the two lists ignoring the elements’ order.</p>
<p>To do so, we convert each list into a set, then use <code>==</code> to compare them. </p>
<pre><code class="lang-python">In [<span class="hljs-number">12</span>]: numbers = [<span class="hljs-number">10</span>, <span class="hljs-number">30</span>, <span class="hljs-number">20</span>]

In [<span class="hljs-number">13</span>]: target = [<span class="hljs-number">10</span>, <span class="hljs-number">20</span>, <span class="hljs-number">30</span>]

In [<span class="hljs-number">14</span>]: set(numbers) == set(target)
Out[<span class="hljs-number">14</span>]: <span class="hljs-literal">True</span>

In [<span class="hljs-number">15</span>]: set(numbers)
Out[<span class="hljs-number">15</span>]: {<span class="hljs-number">10</span>, <span class="hljs-number">20</span>, <span class="hljs-number">30</span>}

In [<span class="hljs-number">16</span>]: set(target)
Out[<span class="hljs-number">16</span>]: {<span class="hljs-number">10</span>, <span class="hljs-number">20</span>, <span class="hljs-number">30</span>}
</code></pre>
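<p>One caveat worth keeping in mind: a set also discards duplicates, so two lists with different repetition counts can still compare equal. If duplicates matter, a possible alternative (not covered in the original examples) is <code>collections.Counter</code> from the standard library, which compares how many times each element occurs. The helper name <code>same_elements</code> is hypothetical, and like sets, this approach needs hashable elements.</p>

```python
from collections import Counter


def same_elements(first, second):
    """Compare two lists ignoring order but respecting how often each element occurs."""
    return Counter(first) == Counter(second)


numbers = [10, 30, 20, 10]
target = [10, 20, 30, 30]

# both lists collapse to {10, 20, 30}, so the sets are equal
print(set(numbers) == set(target))  # True

# the counts differ: 10 appears twice in one list, 30 twice in the other
print(same_elements(numbers, target))  # False

print(same_elements([10, 30, 20], [10, 20, 30]))  # True
```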
<h3 id="heading-using-the-deepdiff-library">Using the <code>deepdiff</code> library</h3>
<p>This library also allows us to ignore the order in sequences such as <code>list</code>s. By default, it will take the order into consideration, but if we set <code>ignore_order</code> to <code>True</code>, then we're all good. Let's see this in action.</p>
<pre><code class="lang-python">In [<span class="hljs-number">11</span>]: numbers = [<span class="hljs-number">10</span>, <span class="hljs-number">30</span>, <span class="hljs-number">20</span>]

In [<span class="hljs-number">12</span>]: target = [<span class="hljs-number">10</span>, <span class="hljs-number">20</span>, <span class="hljs-number">30</span>]

In [<span class="hljs-number">13</span>]: DeepDiff(numbers, target)
Out[<span class="hljs-number">13</span>]: 
{<span class="hljs-string">'values_changed'</span>: {<span class="hljs-string">'root[1]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">20</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-number">30</span>},
  <span class="hljs-string">'root[2]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">30</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-number">20</span>}}}

In [<span class="hljs-number">14</span>]: DeepDiff(numbers, target, ignore_order=<span class="hljs-literal">True</span>)
Out[<span class="hljs-number">14</span>]: {}
</code></pre>
<p>Using <code>deepdiff</code> has pros and cons. In the end, it is an external library you need to install, so if you can use a <code>set</code> to compare the lists, then stick to it. However, if you have other use cases where it can shine, then I’d go with it.</p>
<h2 id="heading-how-to-compare-two-lists-and-return-matches">How to compare two lists and return matches</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639297052330/gGdBccXi7.png" alt="getting the intersection of two lists in python" /></p>
<p>In this section, we'll see how we can compare two lists and find their intersection. In other words, we want to find the values that appear in both. </p>
<p>To do that, we can once more use a <code>set</code> and take their <a target="_blank" href="https://docs.python.org/3/library/stdtypes.html#set.intersection">intersection</a>. </p>
<pre><code class="lang-python">In [<span class="hljs-number">1</span>]: t1 = [<span class="hljs-number">2</span>, <span class="hljs-number">1</span>, <span class="hljs-number">0</span>, <span class="hljs-number">7</span>, <span class="hljs-number">4</span>, <span class="hljs-number">9</span>, <span class="hljs-number">3</span>]

In [<span class="hljs-number">2</span>]: t2 = [<span class="hljs-number">7</span>, <span class="hljs-number">6</span>, <span class="hljs-number">11</span>, <span class="hljs-number">12</span>, <span class="hljs-number">9</span>, <span class="hljs-number">23</span>, <span class="hljs-number">2</span>]

In [<span class="hljs-number">3</span>]: set(t1).intersection(set(t2))
Out[<span class="hljs-number">3</span>]: {<span class="hljs-number">2</span>, <span class="hljs-number">7</span>, <span class="hljs-number">9</span>}

<span class="hljs-comment"># the &amp; operator is a shorthand for the set.intersection() method </span>
In [<span class="hljs-number">4</span>]: set(t1) &amp; set(t2)
Out[<span class="hljs-number">4</span>]: {<span class="hljs-number">2</span>, <span class="hljs-number">7</span>, <span class="hljs-number">9</span>}
</code></pre>
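<p>A set intersection returns a set, so the original order of the elements is not part of the result. If the matches should come back in the order they appear in the first list, one possible sketch is a list comprehension that uses a set only for the membership test. The names <code>lookup</code> and <code>matches</code> are illustrative.</p>

```python
t1 = [2, 1, 0, 7, 4, 9, 3]
t2 = [7, 6, 11, 12, 9, 23, 2]

# a set makes each "in" check fast
lookup = set(t2)

# keep the elements of t1 that also appear in t2, in t1's order
matches = [item for item in t1 if item in lookup]
print(matches)  # [2, 7, 9]
```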
<h2 id="heading-how-to-compare-two-lists-in-python-and-return-differences">How to compare two lists in Python and return differences</h2>
<p>We can find the difference between two lists in Python in two different ways:</p>
<ul>
<li>using <code>set</code></li>
<li>using the <code>deepdiff</code> library</li>
</ul>
<h3 id="heading-using-set">Using <code>set</code></h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1639297653548/-ESDTI045.png" alt="getting the difference between two lists in python using set" /></p>
<p>Just like we did to determine the intersection, we can leverage the <code>set</code> data structure to find the difference between two lists in Python. </p>
<p>If we want to get all the elements that are present in the first list but not in the second, we can use <code>set.difference()</code>. </p>
<p>On the other hand, if we want to find all the elements that are in either of the lists but not both, then we can use <code>set.symmetric_difference()</code>.</p>
<pre><code class="lang-python">In [<span class="hljs-number">8</span>]: t1 = [<span class="hljs-number">2</span>, <span class="hljs-number">1</span>, <span class="hljs-number">0</span>, <span class="hljs-number">7</span>, <span class="hljs-number">4</span>, <span class="hljs-number">9</span>, <span class="hljs-number">3</span>]

In [<span class="hljs-number">9</span>]: t2 = [<span class="hljs-number">7</span>, <span class="hljs-number">6</span>, <span class="hljs-number">11</span>, <span class="hljs-number">12</span>, <span class="hljs-number">9</span>, <span class="hljs-number">23</span>, <span class="hljs-number">2</span>]

In [<span class="hljs-number">10</span>]: set(t1).difference(set(t2))
Out[<span class="hljs-number">10</span>]: {<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>}

In [<span class="hljs-number">11</span>]: set(t2).difference(set(t1))
Out[<span class="hljs-number">11</span>]: {<span class="hljs-number">6</span>, <span class="hljs-number">11</span>, <span class="hljs-number">12</span>, <span class="hljs-number">23</span>}

In [<span class="hljs-number">12</span>]: set(t1).symmetric_difference(set(t2))
Out[<span class="hljs-number">12</span>]: {<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">6</span>, <span class="hljs-number">11</span>, <span class="hljs-number">12</span>, <span class="hljs-number">23</span>}

In [<span class="hljs-number">13</span>]: set(t1) - set(t2)
Out[<span class="hljs-number">13</span>]: {<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>}

In [<span class="hljs-number">14</span>]: set(t1) ^ set(t2)
Out[<span class="hljs-number">14</span>]: {<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">6</span>, <span class="hljs-number">11</span>, <span class="hljs-number">12</span>, <span class="hljs-number">23</span>}
</code></pre>
<p>This method has a limitation: the symmetric difference lumps everything that differs into a single set, so we can't tell which list each element came from. What if we want to know which elements in that diff belong to which list?</p>
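<p>With sets alone, one way to answer that question is to keep the two one-sided differences apart instead of merging them. This is only a small sketch, and the dictionary keys <code>only_in_t1</code> and <code>only_in_t2</code> are made-up labels.</p>

```python
t1 = [2, 1, 0, 7, 4, 9, 3]
t2 = [7, 6, 11, 12, 9, 23, 2]

first, second = set(t1), set(t2)

# report each side of the difference separately
diff = {
    "only_in_t1": sorted(first - second),
    "only_in_t2": sorted(second - first),
}
print(diff)  # {'only_in_t1': [0, 1, 3, 4], 'only_in_t2': [6, 11, 12, 23]}
```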
<h3 id="heading-using-deepdiff">Using <code>deepdiff</code></h3>
<p>As we've seen so far, this library is powerful and it returns a nice diff. Let's see what happens when we use <code>deepdiff</code> to get the difference between two lists in Python.</p>
<pre><code class="lang-python">In [<span class="hljs-number">15</span>]: t1 = [<span class="hljs-number">2</span>, <span class="hljs-number">1</span>, <span class="hljs-number">0</span>, <span class="hljs-number">7</span>, <span class="hljs-number">4</span>, <span class="hljs-number">9</span>, <span class="hljs-number">3</span>]

In [<span class="hljs-number">16</span>]: t2 = [<span class="hljs-number">7</span>, <span class="hljs-number">6</span>, <span class="hljs-number">11</span>, <span class="hljs-number">12</span>, <span class="hljs-number">9</span>, <span class="hljs-number">23</span>, <span class="hljs-number">2</span>]

In [<span class="hljs-number">17</span>]: DeepDiff(t1, t2)
Out[<span class="hljs-number">17</span>]: 
{<span class="hljs-string">'values_changed'</span>: {<span class="hljs-string">'root[0]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">7</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-number">2</span>},
  <span class="hljs-string">'root[1]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">6</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-number">1</span>},
  <span class="hljs-string">'root[2]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">11</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-number">0</span>},
  <span class="hljs-string">'root[3]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">12</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-number">7</span>},
  <span class="hljs-string">'root[4]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">9</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-number">4</span>},
  <span class="hljs-string">'root[5]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">23</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-number">9</span>},
  <span class="hljs-string">'root[6]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">2</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-number">3</span>}}}

In [<span class="hljs-number">18</span>]: DeepDiff(t1, t2, ignore_order=<span class="hljs-literal">True</span>)
Out[<span class="hljs-number">18</span>]: 
{<span class="hljs-string">'values_changed'</span>: {<span class="hljs-string">'root[4]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">6</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-number">4</span>},
  <span class="hljs-string">'root[6]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">11</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-number">3</span>},
  <span class="hljs-string">'root[1]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">12</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-number">1</span>}},
 <span class="hljs-string">'iterable_item_added'</span>: {<span class="hljs-string">'root[5]'</span>: <span class="hljs-number">23</span>},
 <span class="hljs-string">'iterable_item_removed'</span>: {<span class="hljs-string">'root[2]'</span>: <span class="hljs-number">0</span>}}
</code></pre>
<p>Accordingly, <code>deepdiff</code> returns what changed from one list to the other. The right approach then will depend on your use case. If you want a detailed diff, then use <code>DeepDiff</code>. Otherwise, just use a <code>set</code>.</p>
<h2 id="heading-how-to-compare-two-lists-of-strings">How to compare two lists of strings</h2>
<p>Comparing two lists of strings in Python depends largely on what type of comparison you want to make. That's because we can <a target="_blank" href="https://miguendes.me/python-compare-strings">compare a string in a handful of ways</a>.</p>
<p>In this section, we'll see 3 different ways of doing that. </p>
<p>The simplest one is using a <code>==</code> operator, like we saw in the beginning. This method is suitable if you want a strict comparison between each string. </p>
<pre><code class="lang-python">In [<span class="hljs-number">1</span>]: names = [<span class="hljs-string">'jack'</span>, <span class="hljs-string">'josh'</span>, <span class="hljs-string">'james'</span>]

In [<span class="hljs-number">2</span>]: target = [<span class="hljs-string">'jack'</span>, <span class="hljs-string">'josh'</span>, <span class="hljs-string">'james'</span>]

In [<span class="hljs-number">3</span>]: names == target
Out[<span class="hljs-number">3</span>]: <span class="hljs-literal">True</span>
</code></pre>
<p>Things start to get messy if you want to compare the lists of strings while ignoring case. Using <code>==</code> for that just doesn't work.</p>
<pre><code class="lang-python">In [<span class="hljs-number">4</span>]: names = [<span class="hljs-string">'Jack'</span>, <span class="hljs-string">'Josh'</span>, <span class="hljs-string">'James'</span>]

In [<span class="hljs-number">5</span>]: target = [<span class="hljs-string">'jack'</span>, <span class="hljs-string">'josh'</span>, <span class="hljs-string">'james'</span>]

In [<span class="hljs-number">6</span>]: names == target
Out[<span class="hljs-number">6</span>]: <span class="hljs-literal">False</span>
</code></pre>
<p>The best tool for that is again <code>deepdiff</code>. It allows us to ignore the string case by passing a boolean flag to it.</p>
<pre><code class="lang-python">In [<span class="hljs-number">1</span>]: <span class="hljs-keyword">import</span> deepdiff

In [<span class="hljs-number">2</span>]: names = [<span class="hljs-string">'Jack'</span>, <span class="hljs-string">'Josh'</span>, <span class="hljs-string">'James'</span>]

In [<span class="hljs-number">3</span>]: target = [<span class="hljs-string">'jack'</span>, <span class="hljs-string">'josh'</span>, <span class="hljs-string">'james'</span>]

<span class="hljs-comment"># ignoring string case</span>
In [<span class="hljs-number">4</span>]: deepdiff.DeepDiff(names, target, ignore_string_case=<span class="hljs-literal">True</span>)
Out[<span class="hljs-number">4</span>]: {}

<span class="hljs-comment"># considering the case</span>
In [<span class="hljs-number">5</span>]: deepdiff.DeepDiff(names, target)
Out[<span class="hljs-number">5</span>]: 
{<span class="hljs-string">'values_changed'</span>: {<span class="hljs-string">'root[0]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-string">'jack'</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-string">'Jack'</span>},
  <span class="hljs-string">'root[1]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-string">'josh'</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-string">'Josh'</span>},
  <span class="hljs-string">'root[2]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-string">'james'</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-string">'James'</span>}}}
</code></pre>
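<p>If adding a dependency is not an option, the same position-by-position, case-insensitive check can be sketched with the standard library. The helper name <code>equal_ignoring_case</code> is made up for this illustration, and unlike <code>deepdiff</code> it only answers yes or no instead of returning a diff.</p>

```python
def equal_ignoring_case(first, second):
    """Return True if both lists hold the same strings, position by position, ignoring case."""
    # lists of different sizes can never be equal
    if len(first) != len(second):
        return False
    # casefold() is a more thorough version of lower()
    return all(a.casefold() == b.casefold() for a, b in zip(first, second))


names = ['Jack', 'Josh', 'James']
target = ['jack', 'josh', 'james']

print(equal_ignoring_case(names, target))      # True
print(equal_ignoring_case(names, target[:2]))  # False
```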
<p>We can also ignore the order in which the strings appear in the lists.</p>
<pre><code class="lang-python">In [<span class="hljs-number">6</span>]: names = [<span class="hljs-string">'Jack'</span>, <span class="hljs-string">'James'</span>, <span class="hljs-string">'Josh'</span>]

In [<span class="hljs-number">7</span>]: target = [<span class="hljs-string">'jack'</span>, <span class="hljs-string">'josh'</span>, <span class="hljs-string">'james'</span>]

<span class="hljs-comment"># ignoring the order and string case</span>
In [<span class="hljs-number">8</span>]: deepdiff.DeepDiff(names, target, ignore_string_case=<span class="hljs-literal">True</span>, ignore_order=<span class="hljs-literal">True</span>)
Out[<span class="hljs-number">8</span>]: {}

<span class="hljs-comment"># considering the order but ignoring the case</span>
In [<span class="hljs-number">9</span>]: deepdiff.DeepDiff(names, target, ignore_string_case=<span class="hljs-literal">True</span>)
Out[<span class="hljs-number">9</span>]: 
{<span class="hljs-string">'values_changed'</span>: {<span class="hljs-string">'root[1]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-string">'josh'</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-string">'james'</span>},
  <span class="hljs-string">'root[2]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-string">'james'</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-string">'josh'</span>}}}
</code></pre>
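<p>For a quick yes/no answer that ignores both order and case, one possible standard-library sketch is to compare the case-folded strings as multisets with <code>collections.Counter</code>. The function name below is hypothetical and, again, it does not tell you <em>where</em> the lists differ.</p>

```python
from collections import Counter


def same_items_ignoring_case_and_order(first, second):
    """Compare two lists of strings as multisets of case-folded values."""
    # Counter keeps track of duplicates, which a plain set would lose
    return Counter(s.casefold() for s in first) == Counter(s.casefold() for s in second)


names = ['Jack', 'James', 'Josh']
target = ['jack', 'josh', 'james']

print(same_items_ignoring_case_and_order(names, target))              # True
print(same_items_ignoring_case_and_order(['Jack', 'Jack'], ['jack']))  # False
```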
<p>You can also go further and perform advanced comparisons by passing a custom operator to <code>DeepDiff</code>. </p>
<p>For example, suppose you want to compare the strings while <a target="_blank" href="https://miguendes.me/python-compare-strings#heading-how-to-compare-two-strings-and-ignore-whitespace">ignoring any whitespace</a> they may have. </p>
<p>Or perhaps you want to perform a <a target="_blank" href="https://miguendes.me/python-compare-strings#heading-how-to-compare-two-strings-for-similarity-fuzzy-string-matching">fuzzy matching</a> using an edit distance metric.</p>
<p>To do that, we can write the comparison logic in an operator class and pass an instance of it to <code>DeepDiff</code>. </p>
<p>In this first example, we'll <a target="_blank" href="https://miguendes.me/python-trim-string">ignore any whitespace by trimming the strings</a> before comparing them.</p>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">IgnoreWhitespaceOperator</span>:</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">match</span>(<span class="hljs-params">self, level</span>) -&gt; bool:</span>
        <span class="hljs-keyword">return</span> <span class="hljs-literal">True</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">give_up_diffing</span>(<span class="hljs-params">self, level, diff_instance</span>) -&gt; bool:</span>
        <span class="hljs-keyword">if</span> isinstance(level.t1, str) <span class="hljs-keyword">and</span> isinstance(level.t2, str):
            <span class="hljs-keyword">return</span> level.t1.strip() == level.t2.strip()
        <span class="hljs-keyword">return</span> <span class="hljs-literal">False</span>
</code></pre>
<p>Then we can just plug it into <code>DeepDiff</code> by adding it to the list of <code>custom_operators</code>, like so: <code>custom_operators=[IgnoreWhitespaceOperator()]</code>.</p>
<pre><code class="lang-python">In [<span class="hljs-number">6</span>]: <span class="hljs-keyword">from</span> deepdiff <span class="hljs-keyword">import</span> DeepDiff

In [<span class="hljs-number">13</span>]: names = [<span class="hljs-string">'Jack'</span>, <span class="hljs-string">'James '</span>, <span class="hljs-string">'  Josh '</span>]

In [<span class="hljs-number">14</span>]: target = [<span class="hljs-string">'Jack'</span>, <span class="hljs-string">'James'</span>, <span class="hljs-string">'Josh'</span>,]

<span class="hljs-comment"># the operator will ignore the spaces in both lists</span>
In [<span class="hljs-number">15</span>]: DeepDiff(names, target, custom_operators=[IgnoreWhitespaceOperator()])
Out[<span class="hljs-number">15</span>]: {}

In [<span class="hljs-number">16</span>]: target = [<span class="hljs-string">'Jack'</span>, <span class="hljs-string">'James'</span>, <span class="hljs-string">'Josh'</span>, <span class="hljs-string">'Jelly'</span>]

<span class="hljs-comment"># if one of the lists has an additional member, this will be flagged</span>
In [<span class="hljs-number">17</span>]: DeepDiff(names, target, custom_operators=[IgnoreWhitespaceOperator()])
Out[<span class="hljs-number">17</span>]: {<span class="hljs-string">'iterable_item_added'</span>: {<span class="hljs-string">'root[3]'</span>: <span class="hljs-string">'Jelly'</span>}}

In [<span class="hljs-number">18</span>]: target = [<span class="hljs-string">'Jack'</span>, <span class="hljs-string">'Josh'</span>, <span class="hljs-string">'James'</span>]

<span class="hljs-comment"># by default, the library doesn't ignore order</span>
In [<span class="hljs-number">19</span>]: DeepDiff(names, target, custom_operators=[IgnoreWhitespaceOperator()])
Out[<span class="hljs-number">19</span>]: 
{<span class="hljs-string">'values_changed'</span>: {<span class="hljs-string">'root[1]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-string">'Josh'</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-string">'James '</span>},
  <span class="hljs-string">'root[2]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-string">'James'</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-string">'  Josh '</span>}}}

<span class="hljs-comment"># if you don't care about order, be explicit</span>
In [<span class="hljs-number">20</span>]: DeepDiff(names, target, ignore_order=<span class="hljs-literal">True</span>, custom_operators=[IgnoreWhitespaceOperator()])
Out[<span class="hljs-number">20</span>]: {}
</code></pre>
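<p>The fuzzy matching idea mentioned earlier can be tried out without <code>deepdiff</code> as well. The sketch below is only an approximation: it swaps the edit distance metric for the similarity ratio from the standard library's <code>difflib.SequenceMatcher</code>, and both the function name <code>fuzzy_equal</code> and the <code>0.8</code> threshold are arbitrary choices made for this example.</p>

```python
from difflib import SequenceMatcher


def fuzzy_equal(first, second, threshold=0.8):
    """Return True if every pair of strings is similar enough (ratio between 0 and 1)."""
    if len(first) != len(second):
        return False
    return all(
        # ratio() is 1.0 for identical strings and 0.0 when nothing matches
        SequenceMatcher(None, a, b).ratio() >= threshold
        for a, b in zip(first, second)
    )


print(fuzzy_equal(['Jack', 'Jonathan'], ['Jack', 'Jonathon']))  # True
print(fuzzy_equal(['Jack'], ['Paul']))                          # False
```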
<h2 id="heading-how-to-compare-two-lists-of-dictionaries">How to compare two lists of dictionaries</h2>
<p>Comparing two lists of dictionaries in Python is definitely intricate without the help of an external library. As we've seen so far, <code>deepdiff</code> is versatile enough to compare deeply nested objects such as lists of dictionaries.</p>
<p>Let's see what happens when we pass two lists of dictionaries.</p>
<pre><code class="lang-python">In [<span class="hljs-number">1</span>]: <span class="hljs-keyword">from</span> deepdiff <span class="hljs-keyword">import</span> DeepDiff

In [<span class="hljs-number">2</span>]: first_list = [
   ...:     {
   ...:         <span class="hljs-string">'number'</span>: <span class="hljs-number">1</span>,
   ...:         <span class="hljs-string">'list'</span>: [<span class="hljs-string">'one'</span>, <span class="hljs-string">'two'</span>]
   ...:     },
   ...:     {
   ...:         <span class="hljs-string">'number'</span>: <span class="hljs-number">2</span>,
   ...:         <span class="hljs-string">'list'</span>: [<span class="hljs-string">'one'</span>, <span class="hljs-string">'two'</span>]
   ...:     },
   ...: ]

In [<span class="hljs-number">3</span>]: target_list = [
   ...:     {
   ...:         <span class="hljs-string">'number'</span>: <span class="hljs-number">3</span>,
   ...:         <span class="hljs-string">'list'</span>: [<span class="hljs-string">'one'</span>, <span class="hljs-string">'two'</span>]
   ...:     },
   ...:     {
   ...:         <span class="hljs-string">'number'</span>: <span class="hljs-number">2</span>,
   ...:         <span class="hljs-string">'list'</span>: [<span class="hljs-string">'one'</span>, <span class="hljs-string">'two'</span>]
   ...:     },
   ...: ]

In [<span class="hljs-number">4</span>]: DeepDiff(first_list, target_list)
Out[<span class="hljs-number">4</span>]: {<span class="hljs-string">'values_changed'</span>: {<span class="hljs-string">"root[0]['number']"</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">3</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-number">1</span>}}}
</code></pre>
<p>It outputs the exact location where the elements differ and what the difference is!</p>
<p>Let's see another example where a list has a missing element.</p>
<pre><code class="lang-python">In [<span class="hljs-number">2</span>]: first_list = [
   ...:     {
   ...:         <span class="hljs-string">'number'</span>: <span class="hljs-number">1</span>,
   ...:         <span class="hljs-string">'list'</span>: [<span class="hljs-string">'one'</span>, <span class="hljs-string">'two'</span>]
   ...:     },
   ...:     {
   ...:         <span class="hljs-string">'number'</span>: <span class="hljs-number">2</span>,
   ...:         <span class="hljs-string">'list'</span>: [<span class="hljs-string">'one'</span>, <span class="hljs-string">'two'</span>]
   ...:     },
   ...: ]

In [<span class="hljs-number">5</span>]: target = [
   ...:     {
   ...:         <span class="hljs-string">'number'</span>: <span class="hljs-number">3</span>,
   ...:         <span class="hljs-string">'list'</span>: [<span class="hljs-string">'one'</span>, <span class="hljs-string">'two'</span>]
   ...:     },
   ...: ]

In [<span class="hljs-number">6</span>]: DeepDiff(first_list, target)
Out[<span class="hljs-number">6</span>]: 
{<span class="hljs-string">'values_changed'</span>: {<span class="hljs-string">"root[0]['number']"</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">3</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-number">1</span>}},
 <span class="hljs-string">'iterable_item_removed'</span>: {<span class="hljs-string">'root[1]'</span>: {<span class="hljs-string">'number'</span>: <span class="hljs-number">2</span>, <span class="hljs-string">'list'</span>: [<span class="hljs-string">'one'</span>, <span class="hljs-string">'two'</span>]}}}
</code></pre>
<p>It says that the second dictionary has been removed, which is the case for this example.</p>
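<p>To get a feel for what such a diff involves, here is a deliberately simplified sketch in plain Python. The function <code>diff_list_of_dicts</code> is hypothetical and much less capable than <code>deepdiff</code>: it only compares dictionaries that sit at the same index and silently ignores extra elements when the lists have different lengths.</p>

```python
def diff_list_of_dicts(first, second):
    """Return {index: {key: (old, new)}} for same-position dicts whose values differ."""
    changes = {}
    for index, (old, new) in enumerate(zip(first, second)):
        changed = {
            key: (old.get(key), new.get(key))
            for key in old.keys() | new.keys()  # union of the keys of both dicts
            if old.get(key) != new.get(key)
        }
        if changed:
            changes[index] = changed
    return changes


first_list = [
    {'number': 1, 'list': ['one', 'two']},
    {'number': 2, 'list': ['one', 'two']},
]
target_list = [
    {'number': 3, 'list': ['one', 'two']},
    {'number': 2, 'list': ['one', 'two']},
]

print(diff_list_of_dicts(first_list, target_list))  # {0: {'number': (1, 3)}}
```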
<h2 id="heading-how-to-compare-two-list-of-lists">How to compare two lists of lists</h2>
<p>Comparing multidimensional lists (a.k.a. lists of lists) is easy for <code>deepdiff</code>. It works just like a list of <code>dict</code>s.</p>
<p>In the example below, we have two multidimensional lists that we want to compare. When passed to <code>DeepDiff</code>, it returns the exact location in which the elements differ.</p>
<p>For example, at position <code>[1][0]</code>, the new value is 8 and the old one is 3. Another interesting aspect is that it works for deeply nested structures: <code>deepdiff</code> also highlights the difference at the <code>[2][0][0]</code> position.</p>
<pre><code class="lang-python">In [<span class="hljs-number">1</span>]: <span class="hljs-keyword">from</span> deepdiff <span class="hljs-keyword">import</span> DeepDiff

In [<span class="hljs-number">2</span>]: first_list = [[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], [<span class="hljs-number">3</span>, <span class="hljs-number">4</span>], [[<span class="hljs-number">5</span>]]]

In [<span class="hljs-number">3</span>]: target_list = [[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], [<span class="hljs-number">8</span>, <span class="hljs-number">4</span>], [[<span class="hljs-number">7</span>]]]

In [<span class="hljs-number">4</span>]: DeepDiff(first_list, target_list)
Out[<span class="hljs-number">4</span>]: 
{<span class="hljs-string">'values_changed'</span>: {<span class="hljs-string">'root[1][0]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">8</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-number">3</span>},
  <span class="hljs-string">'root[2][0][0]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">7</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-number">5</span>}}}
</code></pre>
<p>When we feed the library two identical multidimensional lists, it returns an empty result.</p>
<pre><code class="lang-python">In [<span class="hljs-number">3</span>]: target_list = [[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], [<span class="hljs-number">8</span>, <span class="hljs-number">4</span>], [[<span class="hljs-number">7</span>]]]

In [<span class="hljs-number">5</span>]: second_list = [[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], [<span class="hljs-number">8</span>, <span class="hljs-number">4</span>], [[<span class="hljs-number">7</span>]]]

In [<span class="hljs-number">7</span>]: DeepDiff(second_list, target_list)
Out[<span class="hljs-number">7</span>]: {}
</code></pre>
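<p>The way the differing positions are located can be illustrated with a small recursive function. This is a rough sketch of the idea, not how <code>deepdiff</code> is implemented: <code>diff_nested</code> is a made-up helper that walks both lists in parallel and builds <code>root[...]</code> style paths, and it reports lists of different lengths as a single changed value.</p>

```python
def diff_nested(old, new, path='root'):
    """Return {path: {'old_value': ..., 'new_value': ...}} for every differing leaf."""
    same_shape = isinstance(old, list) and isinstance(new, list) and len(old) == len(new)
    if same_shape:
        changes = {}
        for index, (a, b) in enumerate(zip(old, new)):
            # recurse into each position, extending the path as we go
            changes.update(diff_nested(a, b, f'{path}[{index}]'))
        return changes
    if old != new:
        return {path: {'old_value': old, 'new_value': new}}
    return {}


first_list = [[1, 2], [3, 4], [[5]]]
target_list = [[1, 2], [8, 4], [[7]]]

print(diff_nested(first_list, target_list))
# {'root[1][0]': {'old_value': 3, 'new_value': 8},
#  'root[2][0][0]': {'old_value': 5, 'new_value': 7}}
```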
<h2 id="heading-how-to-compare-two-lists-of-objects">How to compare two lists of objects</h2>
<p>Sometimes we have a list of custom objects that we want to compare. Maybe we want to get a diff, or just check if they contain the same elements. The solution for this problem couldn't be different: use <code>deepdiff</code>.</p>
<p>The following example demonstrates the power of this library. We're going to compare two lists containing custom objects, and we'll be able to assert whether they are equal and, if not, what the differences are.</p>
<p>In the example below, we have two lists of <code>Person</code> objects. The only difference between the two is that the <code>Person</code> object in the last position has a different age. <code>deepdiff</code> not only finds the right position, <code>[1]</code>, but also pinpoints the <code>age</code> field as the one that differs.</p>
<pre><code class="lang-python">In [<span class="hljs-number">9</span>]: <span class="hljs-keyword">from</span> deepdiff <span class="hljs-keyword">import</span> DeepDiff

In [<span class="hljs-number">10</span>]: first = [Person(<span class="hljs-string">'Jack'</span>, <span class="hljs-number">34</span>), Person(<span class="hljs-string">'Janine'</span>, <span class="hljs-number">23</span>)]

In [<span class="hljs-number">11</span>]: target = [Person(<span class="hljs-string">'Jack'</span>, <span class="hljs-number">34</span>), Person(<span class="hljs-string">'Janine'</span>, <span class="hljs-number">24</span>)]

In [<span class="hljs-number">12</span>]: DeepDiff(first, target)
Out[<span class="hljs-number">12</span>]: {<span class="hljs-string">'values_changed'</span>: {<span class="hljs-string">'root[1].age'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">24</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-number">23</span>}}}

In [<span class="hljs-number">14</span>]: second = [Person(<span class="hljs-string">'Jack'</span>, <span class="hljs-number">34</span>), Person(<span class="hljs-string">'Janine'</span>, <span class="hljs-number">24</span>)]

In [<span class="hljs-number">15</span>]: DeepDiff(second, target)
Out[<span class="hljs-number">15</span>]: {}
</code></pre>
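<p>The session above relies on a <code>Person</code> class whose definition is not shown. A plausible stand-in, and this is an assumption rather than the class used to produce the output above, is a small dataclass with <code>name</code> and <code>age</code> fields. Since dataclasses generate <code>__eq__</code> by default, it also makes a quick standard-library check possible.</p>

```python
from dataclasses import dataclass


@dataclass
class Person:
    # hypothetical definition: the field names are inferred from the diff output
    name: str
    age: int


first = [Person('Jack', 34), Person('Janine', 23)]
target = [Person('Jack', 34), Person('Janine', 24)]

# dataclasses compare field by field, so == works on the lists directly
print(first == target)  # False

# indexes at which the two lists hold different objects
mismatches = [i for i, (a, b) in enumerate(zip(first, target)) if a != b]
print(mismatches)  # [1]
```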
<h2 id="heading-how-to-compare-two-lists-of-numpy-arrays">How to compare two lists of numpy arrays</h2>
<p>In this section, we'll see how to compare two lists of <code>numpy</code> arrays. This is a fairly common task for those who work with data science and/or machine learning. </p>
<p>We saw in the first section that using the <code>==</code> operator doesn't work well with lists of <code>numpy</code> arrays. Luckily we can use... guess what!? Yes, we can use <code>deepdiff</code>.</p>
<p>The example below shows two lists with different <code>numpy</code> arrays and the library can detect the exact position in which they differ. How cool is that?</p>
<pre><code class="lang-python">In [<span class="hljs-number">16</span>]: <span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np

In [<span class="hljs-number">17</span>]: <span class="hljs-keyword">from</span> deepdiff <span class="hljs-keyword">import</span> DeepDiff

In [<span class="hljs-number">18</span>]: first = [np.ones(<span class="hljs-number">3</span>), np.array([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>])]

In [<span class="hljs-number">19</span>]: target = [np.zeros(<span class="hljs-number">4</span>), np.array([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>])]

In [<span class="hljs-number">20</span>]: DeepDiff(first, target)
Out[<span class="hljs-number">20</span>]: 
{<span class="hljs-string">'values_changed'</span>: {<span class="hljs-string">'root[0][0]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">0.0</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-number">1.0</span>},
  <span class="hljs-string">'root[0][1]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">0.0</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-number">1.0</span>},
  <span class="hljs-string">'root[0][2]'</span>: {<span class="hljs-string">'new_value'</span>: <span class="hljs-number">0.0</span>, <span class="hljs-string">'old_value'</span>: <span class="hljs-number">1.0</span>}},
 <span class="hljs-string">'iterable_item_added'</span>: {<span class="hljs-string">'root[0][3]'</span>: <span class="hljs-number">0.0</span>, <span class="hljs-string">'root[1][3]'</span>: <span class="hljs-number">4</span>}}
</code></pre>
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this post, we saw many ways to compare two lists in Python. The best method depends on what kind of elements we have and how we want to compare. Hopefully, you now know how to:</p>
<ul>
<li>check if two lists are equal in Python</li>
<li>compare two lists without order (unordered lists)</li>
<li>compare two lists in Python and return matches</li>
<li>compare two lists in Python and return differences</li>
<li>compare two lists of strings</li>
<li>compare two lists of dictionaries</li>
<li>compare two lists of lists</li>
<li>compare two lists of objects</li>
<li>compare two lists of numpy arrays</li>
</ul>
<p>Other posts you may like:</p>
<ul>
<li><p><a target="_blank" href="https://miguendes.me/the-best-way-to-compare-two-dictionaries-in-python">The Best Way to Compare Two Dictionaries in Python</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/python-compare-strings">How to Compare Two Strings in Python (in 8 Easy Ways)</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/python-flatten-list">7 Different Ways to Flatten a List of Lists in Python</a></p>
</li>
</ul>
<p>See you next time!</p>
<p>This post was originally published at <a target="_blank" href="https://miguendes.me/python-compare-lists">https://miguendes.me</a></p>
]]></content:encoded></item><item><title><![CDATA[How to Compare Two Strings in Python (in 8 Easy Ways)]]></title><description><![CDATA[Comparing strings is a fundamental task common to any programming language.
When it comes to Python, there are several ways of doing it. The best one will always depend on the use case, but we can narrow them down to a few that best fit this goal.
In...]]></description><link>https://miguendes.me/python-compare-strings</link><guid isPermaLink="true">https://miguendes.me/python-compare-strings</guid><category><![CDATA[Python]]></category><category><![CDATA[Python 3]]></category><category><![CDATA[python beginner]]></category><dc:creator><![CDATA[Miguel Brito]]></dc:creator><pubDate>Sun, 28 Nov 2021 10:39:34 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1636963424696/rIA28EZDS.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Comparing strings is a fundamental task common to any programming language.</p>
<p>When it comes to Python, there are several ways of doing it. The best one will always depend on the use case, but we can narrow them down to a few that best fit this goal.</p>
<p>In this article, we'll do exactly that. </p>
<p>By the end of this tutorial, you'll have learned:</p>
<ul>
<li><a class="post-section-overview" href="#comparing-strings-using-the-and-operators">how to compare strings using the <code>==</code> and <code>!=</code> operators</a></li>
<li><a class="post-section-overview" href="#comparing-strings-using-the-is-operator">how to use the <code>is</code> operator to compare two strings</a></li>
<li><a class="post-section-overview" href="#comparing-strings-using-the-and-operators">how to compare strings using the <code>&lt;</code>, <code>&gt;</code>, <code>&lt;=</code>, and <code>&gt;=</code> operators</a></li>
<li><a class="post-section-overview" href="#compare-two-strings-by-ignoring-the-case">how to compare two strings while ignoring the case</a></li>
<li><a class="post-section-overview" href="#how-to-compare-two-strings-and-ignore-whitespace">how to ignore whitespaces when performing string comparison</a></li>
<li><a class="post-section-overview" href="#how-to-compare-two-strings-for-similarity-fuzzy-string-matching">how to determine if two strings are similar by doing fuzzy matching</a></li>
<li><a class="post-section-overview" href="#how-to-compare-two-strings-and-return-the-difference">how to compare two strings and return the difference</a></li>
<li><a class="post-section-overview" href="#string-comparison-not-working">how to debug when the string comparison is not working</a></li>
</ul>
<p>Let's go!</p>
<h2 id="heading-comparing-strings-using-the-and-operators">Comparing strings using the <code>==</code> and <code>!=</code> operators</h2>
<p>The simplest way to check if two strings are equal in Python is to use the <code>==</code> operator. And if you are looking for the opposite, then <code>!=</code> is what you need. That's it!</p>
<p><code>==</code> and <code>!=</code> are boolean operators, meaning they return <code>True</code> or <code>False</code>. For example, <code>==</code> returns <code>True</code> if the two strings match, and <code>False</code> otherwise.  </p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>name = <span class="hljs-string">'Carl'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>another_name = <span class="hljs-string">'Carl'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>name == another_name
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>name != another_name
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>yet_another_name = <span class="hljs-string">'Josh'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>name == yet_another_name
<span class="hljs-literal">False</span>
</code></pre>
<p>These operators are also <strong>case sensitive</strong>, which means uppercase letters are treated differently. The illustration below shows just that: <code>city</code> starts with an uppercase <code>L</code> whereas <code>capital</code> starts with a lowercase <code>l</code>, so Python returns <code>False</code> when comparing them with <code>==</code>. The code example that follows shows the same behavior with <code>'Carl'</code> and <code>'carl'</code>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1636966091121/CXWX9rJ46.png" alt="python_is_string_2.png" /></p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>name = <span class="hljs-string">'Carl'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>yet_another_name = <span class="hljs-string">'carl'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>name == yet_another_name
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>name != yet_another_name
<span class="hljs-literal">True</span>
</code></pre>
<h2 id="heading-comparing-strings-using-the-is-operator">Comparing strings using the <code>is</code> operator</h2>
<p>Another way of comparing if two strings are equal in Python is using the <code>is</code> operator. However, the kind of comparison it performs is different from <code>==</code>. The <code>is</code> operator checks whether the two strings are the same <strong><em>instance</em></strong>.</p>
<p>In Python—and in many other languages—we say two objects are the same instance if they are the same object in memory. </p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>name = <span class="hljs-string">'John Jabocs Howard'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>another_name = name

<span class="hljs-meta">&gt;&gt;&gt; </span>name <span class="hljs-keyword">is</span> another_name
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>yet_another_name = <span class="hljs-string">'John Jabocs Howard'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>name <span class="hljs-keyword">is</span> yet_another_name
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>id(name)
<span class="hljs-number">140142470447472</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>id(another_name)
<span class="hljs-number">140142470447472</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>id(yet_another_name)
<span class="hljs-number">140142459568816</span>
</code></pre>
<p>The image below shows how this example would be represented in memory.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1636965445606/L-CTA-zRG.png" alt="python_is_string_1.png" /></p>
<p>As you see, we're comparing <strong><em>identities</em></strong>, <strong><em>not</em></strong> content. Two names have the same identity when they refer to the very same object and therefore share the same memory location, which is what <code>id()</code> reports in CPython. Keep that in mind when using the <code>is</code> operator.</p>
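<p>The difference between equality and identity is easy to reproduce in a script by building one of the strings at runtime. The snippet below is a small sketch of that; the use of <code>sys.intern</code> at the end is an addition for illustration and shows that two equal strings can be made to share a single object on purpose.</p>

```python
import sys

name = 'John Jabocs Howard'
# str.join builds a brand-new string object at runtime
rebuilt = ' '.join(['John', 'Jabocs', 'Howard'])

print(name == rebuilt)  # True: same content
print(name is rebuilt)  # False in CPython: two distinct objects

# sys.intern returns one canonical object per distinct string value
print(sys.intern(name) is sys.intern(rebuilt))  # True
```

<p>Because the outcome of <code>is</code> depends on how and where a string was created, it is safer to reserve it for identity checks such as <code>x is None</code> and to use <code>==</code> for content.</p>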
<h2 id="heading-comparing-strings-using-the-operators">Comparing strings using the &lt;, &gt;, &lt;=, and &gt;= operators</h2>
<p>The third way of comparing strings is alphabetically. This is useful when we need to determine the lexicographical order of two strings. </p>
<p>Let's see an example.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>name = <span class="hljs-string">'maria'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>another_name = <span class="hljs-string">'marcus'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>name &lt; another_name
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>name &gt; another_name
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>name &lt;= another_name
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>name &gt;= another_name
<span class="hljs-literal">True</span>
</code></pre>
<p>To determine the order, Python compares the strings character by character. In our example, the first three letters, <code>mar</code>, are the same, but the next one is not: <code>c</code> from <code>marcus</code> comes before <code>i</code> from <code>maria</code>.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637140896277/UnUqTuz7zY.png" alt="python_is_string_4.png" /></p>
<p>It's important to keep in mind that these comparisons are <strong>case-sensitive</strong>. Python treats upper-case and lower-case letters differently. For example, if we change <code>"maria"</code> to <code>"Maria"</code>, then the result is different because <code>M</code> comes before <code>m</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>name = <span class="hljs-string">'Maria'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>another_name = <span class="hljs-string">'marcus'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>name &lt; another_name
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>ord(<span class="hljs-string">'M'</span>) &lt; ord(<span class="hljs-string">'m'</span>)
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>ord(<span class="hljs-string">'M'</span>)
<span class="hljs-number">77</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>ord(<span class="hljs-string">'m'</span>)
<span class="hljs-number">109</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1637139948016/5O4MliYYS.png" alt="python_is_string_3.png" /></p>
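<p>The same rule applies when sorting, since <code>sorted</code> relies on these operators. As a quick sketch, passing <code>key=str.casefold</code> is one common way to get an ordering that ignores case. The list of names is made up for the example.</p>

```python
names = ['marcus', 'Maria', 'anna']

# default ordering: every uppercase letter sorts before any lowercase letter
print(sorted(names))                    # ['Maria', 'anna', 'marcus']

# case-insensitive ordering: compare the case-folded versions instead
print(sorted(names, key=str.casefold))  # ['anna', 'marcus', 'Maria']
```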
<blockquote>
<p>⚠️ WARNING ⚠️: Avoid comparing strings that represent numbers using these operators. The comparison is done based on alphabetical ordering, which causes <code>"2" &lt; "10"</code> to evaluate to <code>False</code>.</p>
</blockquote>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>a = <span class="hljs-string">'2'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>b = <span class="hljs-string">'10'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>a &lt; b
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>a &lt;= b
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>a &gt; b
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>a &gt;= b
<span class="hljs-literal">True</span>
</code></pre>
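<p>If the strings are known to hold integers, a simple way around this is to convert them before comparing or sorting. The sketch below assumes every string is a valid integer; <code>int()</code> raises <code>ValueError</code> otherwise.</p>

```python
values = ['10', '2', '1']

# alphabetical ordering: '10' sorts before '2'
print(sorted(values))           # ['1', '10', '2']

# numeric ordering: compare the converted integers instead
print(sorted(values, key=int))  # ['1', '2', '10']
print(max(values, key=int))     # 10
```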
<h2 id="heading-compare-two-strings-by-ignoring-the-case">Compare two strings by ignoring the case</h2>
<p>Sometimes we may need to compare two strings—<a target="_blank" href="https://miguendes.me/python-compare-lists">a list of strings</a>, or even a <a target="_blank" href="https://miguendes.me/the-best-way-to-compare-two-dictionaries-in-python">dictionary of strings</a>—regardless of the case. </p>
<p>Achieving that will depend on the alphabet we're dealing with. For ASCII strings, we can either convert both strings to lowercase using <code>str.lower()</code>, or uppercase with <code>str.upper()</code> and compare them.</p>
<p>For other alphabets, such as Greek or German, converting to lowercase to make the strings case insensitive doesn't always work. Let's see some examples.</p>
<p>Suppose we have the German word <code>'Straße'</code>, which means <code>"Street"</code>. You can also write the same word without the <code>ß</code>, in which case it becomes <code>Strasse</code>. Let's see what happens when we lowercase both spellings and compare them.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>a = <span class="hljs-string">'Straße'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>b = <span class="hljs-string">'strasse'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>a.lower() == b.lower()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>a.lower()
<span class="hljs-string">'straße'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>b.lower()
<span class="hljs-string">'strasse'</span>
</code></pre>
<p>That happens because <code>ß</code> is already a lowercase letter, so a simple call to <code>str.lower()</code> leaves it untouched. It is equivalent to <code>ss</code>, but lowercasing alone never performs that conversion.</p>
<p>The best way to ignore case and make effective case insensitive string comparisons is to use <a target="_blank" href="https://docs.python.org/3/library/stdtypes.html#str.casefold"><code>str.casefold</code></a>. According to the <a target="_blank" href="https://docs.python.org/3/library/stdtypes.html#str.casefold">docs</a>:</p>
<blockquote>
<p>Casefolding is similar to lowercasing but more aggressive because it is intended to remove all case distinctions in a string. </p>
</blockquote>
<p>Let's see what happens when we use <code>str.casefold</code> instead.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>a = <span class="hljs-string">'Straße'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>b = <span class="hljs-string">'strasse'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>a.casefold() == b.casefold()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>a.casefold()
<span class="hljs-string">'strasse'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>b.casefold()
<span class="hljs-string">'strasse'</span>
</code></pre>
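<p>The same idea extends to the list of strings mentioned at the start of this section. The snippet below is only a minimal sketch; the helper names <code>equals_ignore_case</code> and <code>lists_equal_ignore_case</code> are made up for illustration.</p>
<pre><code class="lang-python">def equals_ignore_case(left, right):
    """Return True when both strings match after case folding."""
    return left.casefold() == right.casefold()


def lists_equal_ignore_case(first, second):
    """Compare two lists of strings element by element, ignoring case."""
    return [s.casefold() for s in first] == [s.casefold() for s in second]


print(equals_ignore_case('Straße', 'STRASSE'))
print(lists_equal_ignore_case(['Straße', 'Haus'], ['strasse', 'HAUS']))
</code></pre>
<p>Folding both sides before comparing keeps the original strings intact, which is handy when you still need to display them as the user typed them.</p>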
<h2 id="heading-how-to-compare-two-strings-and-ignore-whitespace">How to compare two strings and ignore whitespace</h2>
<p>Sometimes you might want to compare two strings by ignoring space characters. The best solution for this problem depends on where the spaces are, whether there are multiple spaces in the string and so on.</p>
<p>The first example assumes that the only difference between the strings is that one of them has leading and/or trailing spaces. In this case, we can <a target="_blank" href="https://miguendes.me/python-trim-string">trim both strings using the <code>str.strip</code> method</a> and use the <code>==</code> operator to compare them.</p>
<pre><code class="lang-python">
<span class="hljs-meta">&gt;&gt;&gt; </span>s1 = <span class="hljs-string">'Hey, I really like this post.'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>s2 = <span class="hljs-string">'      Hey, I really like this post.   '</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>s1.strip() == s2.strip()
<span class="hljs-literal">True</span>
</code></pre>
<p>However, sometimes you have a string with whitespace all over it, including multiple spaces inside it. In that case, <code>str.strip</code> is not enough.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>s2 = <span class="hljs-string">'      Hey, I really      like this post.   '</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>s1 = <span class="hljs-string">'Hey, I really like this post.'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>s1.strip() == s2.strip()
<span class="hljs-literal">False</span>
</code></pre>
<p>The alternative then is to <a target="_blank" href="https://miguendes.me/python-trim-string#removing-only-duplicates">remove the duplicate whitespaces using a regular expression</a>. This substitution only collapses runs of whitespace into a single space, so we still need to strip the leading and trailing ones.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> re

<span class="hljs-meta">&gt;&gt;&gt; </span>s2 = <span class="hljs-string">'      Hey, I really      like this post.   '</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>s1 = <span class="hljs-string">'Hey, I really like this post.'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>re.sub(<span class="hljs-string">r'\s+'</span>, <span class="hljs-string">' '</span>, s1.strip())
<span class="hljs-string">'Hey, I really like this post.'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>re.sub(<span class="hljs-string">r'\s+'</span>, <span class="hljs-string">' '</span>, s2.strip())
<span class="hljs-string">'Hey, I really like this post.'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>re.sub(<span class="hljs-string">r'\s+'</span>, <span class="hljs-string">' '</span>, s1.strip()) == re.sub(<span class="hljs-string">r'\s+'</span>, <span class="hljs-string">' '</span>, s2.strip())
<span class="hljs-literal">True</span>
</code></pre>
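<p>If you'd rather avoid regular expressions, a similar result can be had with <code>str.split</code> and <code>str.join</code>. This is a different technique from the regex above, shown here as a small sketch; <code>normalize_spaces</code> is a hypothetical helper name.</p>
<pre><code class="lang-python">def normalize_spaces(text):
    """Collapse every run of whitespace into one space and trim both ends."""
    # str.split() with no arguments splits on runs of whitespace and drops
    # leading and trailing whitespace, so no explicit strip() is needed.
    return ' '.join(text.split())


s1 = 'Hey, I really like this post.'
s2 = '      Hey, I really      like this post.   '

print(normalize_spaces(s1) == normalize_spaces(s2))
</code></pre>
<p>Because <code>split()</code> treats tabs and newlines as whitespace too, this also normalizes strings that mix different kinds of spacing.</p>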
<p>Or if you don't care about duplicates and want to remove everything, then just pass the empty string as the second argument to <code>re.sub</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>s2 = <span class="hljs-string">'      Hey, I really      like this post.   '</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>s1 = <span class="hljs-string">'Hey, I really like this post.'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>re.sub(<span class="hljs-string">r'\s+'</span>, <span class="hljs-string">''</span>, s1.strip())
<span class="hljs-string">'Hey,Ireallylikethispost.'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>re.sub(<span class="hljs-string">r'\s+'</span>, <span class="hljs-string">''</span>, s2.strip())
<span class="hljs-string">'Hey,Ireallylikethispost.'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>re.sub(<span class="hljs-string">r'\s+'</span>, <span class="hljs-string">''</span>, s1.strip()) == re.sub(<span class="hljs-string">r'\s+'</span>, <span class="hljs-string">''</span>, s2.strip())
<span class="hljs-literal">True</span>
</code></pre>
<p>The last method is to use a translation table. This solution is an interesting alternative to regex. Note that the table below removes only the space character, not tabs or newlines.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>table = str.maketrans({<span class="hljs-string">' '</span>: <span class="hljs-literal">None</span>})

<span class="hljs-meta">&gt;&gt;&gt; </span>table
{<span class="hljs-number">32</span>: <span class="hljs-literal">None</span>}

<span class="hljs-meta">&gt;&gt;&gt; </span>s1.translate(table)
<span class="hljs-string">'Hey,Ireallylikethispost.'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>s2.translate(table)
<span class="hljs-string">'Hey,Ireallylikethispost.'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>s1.translate(table) == s2.translate(table)
<span class="hljs-literal">True</span>
</code></pre>
<p>A nice thing about this method is that it allows removing not only spaces but other chars such as <a target="_blank" href="https://stackoverflow.com/questions/16474848/python-how-to-compare-strings-and-ignore-white-space-and-special-characters">punctuation</a> as well.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> string

<span class="hljs-meta">&gt;&gt;&gt; </span>table = str.maketrans(dict.fromkeys(string.punctuation + <span class="hljs-string">' '</span>))

<span class="hljs-meta">&gt;&gt;&gt; </span>s1.translate(table)
<span class="hljs-string">'HeyIreallylikethispost'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>s2.translate(table)
<span class="hljs-string">'HeyIreallylikethispost'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>s1.translate(table) == s2.translate(table)
<span class="hljs-literal">True</span>
</code></pre>
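<p>To drop tabs and newlines as well, the table can be built from <code>string.whitespace</code>. The following is a sketch that uses the three-argument form of <code>str.maketrans</code>, where the third argument lists the characters to delete; the sample strings are made up for the example.</p>
<pre><code class="lang-python">import string

# The third argument of str.maketrans lists characters that get deleted.
table = str.maketrans('', '', string.whitespace + string.punctuation)

s1 = 'Hey, I really like this post.'
s2 = '\tHey, I really\nlike   this post. '

print(s1.translate(table))
print(s1.translate(table) == s2.translate(table))
</code></pre>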
<h2 id="heading-how-to-compare-two-strings-for-similarity-fuzzy-string-matching">How to compare two strings for similarity (fuzzy string matching)</h2>
<p>Another popular string comparison use case is checking if two strings are almost equal. In this task, we're interested in knowing how similar they are instead of comparing their equality.</p>
<p>To make it easier to understand, consider a scenario where we have two strings and we are willing to ignore spelling errors. Unfortunately, that's not possible with the <code>==</code> operator.</p>
<p>We can solve this problem in two different ways:</p>
<ul>
<li>using <code>difflib</code> from the standard library</li>
<li>using an external library such as <a target="_blank" href="https://github.com/jamesturk/jellyfish"><code>jellyfish</code></a></li>
</ul>
<h3 id="heading-using-difflib">Using <code>difflib</code></h3>
<p>The <a target="_blank" href="https://docs.python.org/3/library/difflib.html"><code>difflib</code></a> module in the standard library has a <code>SequenceMatcher</code> class whose <code>ratio()</code> method returns a measure of the strings' similarity as a float between 0 and 1. </p>
<p>Suppose you have two similar strings, say <code>a = "preview"</code>, and <code>b = "previeu"</code>. The only difference between them is the final letter. Let's imagine that this difference is small enough for you and you want to ignore it. </p>
<p>By using <code>SequenceMatcher.ratio()</code> we can measure how similar they are and use that number to decide whether the two strings are similar enough.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> difflib <span class="hljs-keyword">import</span> SequenceMatcher

<span class="hljs-meta">&gt;&gt;&gt; </span>a = <span class="hljs-string">"preview"</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>b = <span class="hljs-string">"previeu"</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>SequenceMatcher(a=a, b=b).ratio()
<span class="hljs-number">0.8571428571428571</span>
</code></pre>
<p>In this example, <code>SequenceMatcher</code> tells us that the two strings are roughly 86% similar. We can then compare this ratio against a threshold and ignore the difference when the ratio is above it.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">is_string_similar</span>(<span class="hljs-params">s1: str, s2: str, threshold: float = <span class="hljs-number">0.8</span></span>) -&gt; bool:</span>
    ...:     <span class="hljs-keyword">return</span> SequenceMatcher(a=s1, b=s2).ratio() &gt; threshold
    ...:

<span class="hljs-meta">&gt;&gt;&gt; </span>is_string_similar(s1=<span class="hljs-string">"preview"</span>, s2=<span class="hljs-string">"previeu"</span>)
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>is_string_similar(s1=<span class="hljs-string">"preview"</span>, s2=<span class="hljs-string">"preview"</span>)
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>is_string_similar(s1=<span class="hljs-string">"preview"</span>, s2=<span class="hljs-string">"previewjajdj"</span>)
<span class="hljs-literal">False</span>
</code></pre>
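<p>The same ratio also drives <code>difflib.get_close_matches</code>, which picks the best candidates from a list. Here is a short sketch; the candidate words are invented for the example.</p>
<pre><code class="lang-python">from difflib import get_close_matches

candidates = ['preview', 'review', 'prevent']

# n limits how many matches are returned; cutoff is the minimum ratio.
best = get_close_matches('previeu', candidates, n=1, cutoff=0.8)
print(best)

# Nothing in the list resembles this word, so the result is empty.
print(get_close_matches('xyz', candidates, n=1, cutoff=0.8))
</code></pre>
<p>This is handy for "did you mean...?" style suggestions, where you care about the closest known word rather than a yes/no answer.</p>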
<p>There's one problem, though. A sensible threshold depends on the length of the strings. For example, two very small strings, say <code>a = "ab"</code> and <code>b = "ac"</code>, will be 50% different even though only one character changed.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>SequenceMatcher(a=<span class="hljs-string">"ab"</span>, b=<span class="hljs-string">"ac"</span>).ratio()
<span class="hljs-number">0.5</span>
</code></pre>
<p>So, setting up a decent threshold may be tricky. As an alternative, we can try another algorithm, one that counts the edits, including transpositions of letters, needed to turn one string into the other. And the good news is, such an algorithm exists, and that's what we'll see next.</p>
<h3 id="heading-using-damerau-levenshtein-distance">Using Damerau-Levenshtein distance</h3>
<p>The <a target="_blank" href="http://en.wikipedia.org/wiki/Damerau-Levenshtein_distance">Damerau-Levenshtein algorithm</a> counts the minimum number of operations needed to change one string into another. </p>
<p>In other words, it tells how many insertions, deletions, or substitutions of a single character, or transpositions of two adjacent characters, we need to perform so that the two strings become equal.</p>
<p>In Python, we can use the function <code>damerau_levenshtein_distance</code> from the <a target="_blank" href="https://github.com/jamesturk/jellyfish"><code>jellyfish</code></a> library. </p>
<p>Let's see what the Damerau-Levenshtein distance is for the last example from the previous section.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> jellyfish

<span class="hljs-meta">&gt;&gt;&gt; </span>jellyfish.damerau_levenshtein_distance(<span class="hljs-string">'ab'</span>, <span class="hljs-string">'ac'</span>)
<span class="hljs-number">1</span>
</code></pre>
<p>It's 1! So that means to transform <code>"ac"</code> into <code>"ab"</code> we need 1 change. What about the first example?</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>s1 = <span class="hljs-string">"preview"</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>s2 = <span class="hljs-string">"previeu"</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>jellyfish.damerau_levenshtein_distance(s1, s2)
<span class="hljs-number">1</span>
</code></pre>
<p>It's 1 too! And that makes a lot of sense; after all, we just need to edit the last letter to make them equal.</p>
<p>This way, we can set the threshold based on the number of changes instead of a ratio.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">are_strings_similar</span>(<span class="hljs-params">s1: str, s2: str, threshold: int = <span class="hljs-number">2</span></span>) -&gt; bool:</span>
    ...:     <span class="hljs-keyword">return</span> jellyfish.damerau_levenshtein_distance(s1, s2) &lt;= threshold
    ...: 

<span class="hljs-meta">&gt;&gt;&gt; </span>are_strings_similar(<span class="hljs-string">"ab"</span>, <span class="hljs-string">"ac"</span>)
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>are_strings_similar(<span class="hljs-string">"ab"</span>, <span class="hljs-string">"ackiol"</span>)
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>are_strings_similar(<span class="hljs-string">"ab"</span>, <span class="hljs-string">"cb"</span>)
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>are_strings_similar(<span class="hljs-string">"abcf"</span>, <span class="hljs-string">"abcd"</span>)
<span class="hljs-literal">True</span>

<span class="hljs-comment"># these ones are not that similar, but we have a default threshold of 2</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>are_strings_similar(<span class="hljs-string">"abcf"</span>, <span class="hljs-string">"acfg"</span>)
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>are_strings_similar(<span class="hljs-string">"abcf"</span>, <span class="hljs-string">"acyg"</span>)
<span class="hljs-literal">False</span>
</code></pre>
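<p>To get a feel for what is being counted, here is a pure-Python sketch of the <em>optimal string alignment</em> distance, a restricted variant of Damerau-Levenshtein. It is not how <code>jellyfish</code> implements its function, and the name <code>osa_distance</code> is made up; the two can disagree on some inputs, but for the short examples above they give the same numbers.</p>
<pre><code class="lang-python">def osa_distance(s1, s2):
    """Optimal string alignment distance (restricted Damerau-Levenshtein)."""
    rows, cols = len(s1) + 1, len(s2) + 1
    # dist[i][j] holds the distance between s1[:i] and s2[:j].
    dist = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        dist[i][0] = i  # delete every character of s1[:i]
    for j in range(cols):
        dist[0][j] = j  # insert every character of s2[:j]

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # substitution (or match)
            )
            # Transposition needs two characters on both sides.
            swappable = 1 not in (i, j)
            if swappable and s1[i - 1] == s2[j - 2] and s1[i - 2] == s2[j - 1]:
                dist[i][j] = min(dist[i][j], dist[i - 2][j - 2] + 1)
    return dist[-1][-1]


print(osa_distance('ab', 'ac'))
print(osa_distance('ab', 'ba'))
print(osa_distance('preview', 'previeu'))
</code></pre>
<p>Swapping two neighbouring letters, as in <code>'ab'</code> and <code>'ba'</code>, counts as a single edit, which is what sets this family of distances apart from plain Levenshtein.</p>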
<h2 id="heading-how-to-compare-two-strings-and-return-the-difference">How to compare two strings and return the difference</h2>
<p>Sometimes we know in advance that two strings are different and we want to know what makes them different. In other words, we want to obtain their "diff". </p>
<p>In the previous section, we used <a target="_blank" href="https://docs.python.org/3/library/difflib.html"><code>difflib</code></a> as a way of telling if two strings were similar enough. This module is actually more powerful than that, and we can use it to compare the strings and show their differences.</p>
<p>The annoying thing is that it requires a list of strings instead of just a single string. It then returns a generator whose items you can join into a single string and print the difference.</p>
<pre><code class="lang-python">
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> difflib

<span class="hljs-meta">&gt;&gt;&gt; </span>d = difflib.Differ()

<span class="hljs-meta">&gt;&gt;&gt; </span>diff = d.compare([<span class="hljs-string">'my string for test'</span>], [<span class="hljs-string">'my str for test'</span>])

<span class="hljs-meta">&gt;&gt;&gt; </span>diff
&lt;generator object Differ.compare at <span class="hljs-number">0x7f27703250b0</span>&gt;

<span class="hljs-meta">&gt;&gt;&gt; </span>lines = list(diff)  <span class="hljs-comment"># a generator can only be consumed once, so keep the result</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>lines
[<span class="hljs-string">'- my string for test'</span>, <span class="hljs-string">'?       ---\n'</span>, <span class="hljs-string">'+ my str for test'</span>]

<span class="hljs-meta">&gt;&gt;&gt; </span>print(<span class="hljs-string">'\n'</span>.join(lines))
- my string <span class="hljs-keyword">for</span> test
?       ---

+ my str <span class="hljs-keyword">for</span> test
</code></pre>
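<p>For multi-line strings, <code>difflib.unified_diff</code> produces the more familiar patch-style output. The sketch below wraps it in a hypothetical helper, <code>diff_text</code>, that splits the strings into lines for you; the <code>'old'</code> and <code>'new'</code> labels are arbitrary.</p>
<pre><code class="lang-python">import difflib


def diff_text(old, new):
    """Return the unified diff of two multi-line strings as a list of lines."""
    return list(difflib.unified_diff(
        old.splitlines(),
        new.splitlines(),
        fromfile='old',
        tofile='new',
        lineterm='',  # the input lines carry no newline, so the headers should not either
    ))


old = 'my string for test\nsecond line'
new = 'my str for test\nsecond line'

print('\n'.join(diff_text(old, new)))
</code></pre>
<p>When the two strings are identical the helper returns an empty list, which makes it easy to use in a condition.</p>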
<h2 id="heading-string-comparison-not-working">String comparison not working?</h2>
<p>In this section, we'll discuss the reasons why your string comparison is not working and how to fix it. The two main reasons based on my experience are:</p>
<ul>
<li>using the wrong operator</li>
<li>having a trailing space or newline</li>
</ul>
<h3 id="heading-comparing-strings-using-is-instead-of">Comparing strings using <code>is</code> instead of <code>==</code></h3>
<p>This one is very common amongst novice Python developers. It's easy to use the wrong operator, especially when comparing strings.</p>
<p>As we've discussed in this article, only use the <code>is</code> operator <strong><em>if</em></strong> you want to check if the two strings are the same <strong><em>instance</em></strong>.</p>
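<p>A quick way to see the difference is to build one of the strings at runtime. This is a small sketch; identity results can vary between Python implementations because of string interning, which is exactly why <code>is</code> is the wrong tool for comparing contents.</p>
<pre><code class="lang-python">a = 'hello world'
b = ' '.join(['hello', 'world'])  # built at runtime, same characters

print(a == b)  # compares the contents
print(a is b)  # compares object identity
</code></pre>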
<h3 id="heading-having-a-trailing-whitespace-of-newline-n">Having a trailing whitespace or newline (<code>\n</code>)</h3>
<p>This one is very common when reading a string from the <code>input</code> function. Whenever we use this function to collect information, the user might accidentally add a trailing space.</p>
<p>If you store the result from the <code>input</code> in a variable, you won't easily see the problem.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>a = <span class="hljs-string">'hello'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>b = input(<span class="hljs-string">'Enter a word: '</span>)
Enter a word: hello 

<span class="hljs-meta">&gt;&gt;&gt; </span>a == b
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>a
<span class="hljs-string">'hello'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>b
<span class="hljs-string">'hello '</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>a == b.strip()
<span class="hljs-literal">True</span>
</code></pre>
<p>The solution here is to <a target="_blank" href="https://miguendes.me/python-trim-string">strip</a> the whitespace from the string the user enters and then compare it. You can do the same for any input source you don't trust.</p>
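<p>When a comparison fails for no visible reason, printing the <code>repr()</code> of the value usually exposes the hidden characters. Below is a minimal sketch; <code>matches</code> is a hypothetical helper and <code>raw</code> stands in for text coming from a file or a form.</p>
<pre><code class="lang-python">def matches(user_text, expected):
    """Compare after removing leading and trailing whitespace."""
    return user_text.strip() == expected


raw = 'hello \n'  # stands in for text read from a file or a form

print(repr(raw))  # repr() makes the trailing space and newline visible
print(matches(raw, 'hello'))
</code></pre>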
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this guide, we saw 8 different ways of comparing strings in Python and the two most common mistakes. We saw how we can leverage different operations to perform string comparison and how to use external libraries to do fuzzy string matching. </p>
<p>Key takeaways:</p>
<ul>
<li>Use the <code>==</code> and <code>!=</code> operators to compare two strings for equality</li>
<li>Use the <code>is</code> operator to check if two strings are the same instance</li>
<li>Use the <code>&lt;</code>, <code>&gt;</code>, <code>&lt;=</code>, and <code>&gt;=</code> operators to compare strings alphabetically</li>
<li>Use <code>str.casefold()</code> to compare two strings ignoring the case</li>
<li>Trim strings using native methods or regex to ignore whitespace when performing string comparison</li>
<li>Use <code>difflib</code> or <code>jellyfish</code> to check if two strings are almost equal (fuzzy matching)</li>
<li>Use <code>difflib</code> to compare two strings and return the difference</li>
<li>String comparison is not working? Check for trailing or leading spaces, or understand if you are using the right operator for the job</li>
</ul>
<p>That's it for today, and I hope you learned something new. See you next time!</p>
<p>Other posts you may like:</p>
<ul>
<li><p><a target="_blank" href="https://miguendes.me/python-isdigit-isnumeric-isdecimal">How to Choose Between isdigit(), isdecimal() and isnumeric() in Python</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/the-best-way-to-compare-two-dictionaries-in-python">The Best Way to Compare Two Dictionaries in Python</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/python-compare-lists">The Best Ways to Compare Two Lists in Python</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/python-trim-string">15 Easy Ways to Trim a String in Python</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/pylint-consider-using-f-string">Pylint: How to fix "c0209: formatting a regular string which could be a f-string (consider-using-f-string)"</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/how-to-implement-a-random-string-generator-with-python">How to Implement a Random String Generator With Python</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/how-to-check-if-a-string-is-a-valid-url-in-python">How to Check If a String Is a Valid URL in Python</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/73-examples-to-help-you-master-pythons-f-strings">Python F-String: 73 Examples to Help You Master It</a></p>
</li>
</ul>
<p>This post was originally published at <a target="_blank" href="https://miguendes.me/python-compare-strings">https://miguendes.me</a></p>
]]></content:encoded></item><item><title><![CDATA[Pylint: How to fix "c0209: formatting a regular string which could be a f-string (consider-using-f-string)"]]></title><description><![CDATA[Some weeks ago I faced this problem in one of my projects after upgrading pylint to 2.11.
The error was:
script.py:7:8: C0209: Formatting a regular string which could be a f-string (consider-using-f-string)

At first I found it very confusing; my c...]]></description><link>https://miguendes.me/pylint-consider-using-f-string</link><guid isPermaLink="true">https://miguendes.me/pylint-consider-using-f-string</guid><category><![CDATA[Python]]></category><category><![CDATA[python beginner]]></category><category><![CDATA[Bugs and Errors]]></category><dc:creator><![CDATA[Miguel Brito]]></dc:creator><pubDate>Sun, 14 Nov 2021 09:11:30 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1636879217193/fk9gGJpPr.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Some weeks ago I faced this problem in one of my projects after upgrading <a target="_blank" href="https://pylint.org/"><code>pylint</code></a> to <code>2.11</code>.</p>
<p>The error was:</p>
<pre><code class="lang-console">script.py:7:8: C0209: Formatting a regular string which could be a f-string (consider-using-f-string)
</code></pre>
<p>At first I found it very confusing; my code was the same and it'd been working fine before the upgrade. I decided to dig a little deeper and found this <a target="_blank" href="https://github.com/PyCQA/pylint/pull/4796">pull request</a> on Pylint's GitHub page.</p>
<p>It turns out this is a new feature that <a target="_blank" href="https://github.com/PyCQA/pylint/releases/tag/v2.11.0">landed in Pylint 2.11.0.</a></p>
<p>In this post, you will see 3 different ways to fix this "formatting a regular string which could be a f-string (consider-using-f-string)" error.</p>
<h2 id="heading-how-to-fix-formatting-a-regular-string-which-could-be-a-f-string-consider-using-f-string">How to Fix "formatting a regular string which could be a f-string (consider-using-f-string)"</h2>
<p>Before Python introduced <a target="_blank" href="https://miguendes.me/73-examples-to-help-you-master-pythons-f-strings">f-strings</a>, one could use <code>%</code> or the <code>str.format</code> method to format a string. Even though these methods are still valid, f-strings are slightly preferred now. </p>
<p>As a way of nudging developers to make this migration, the new Pylint version raises this error when it detects the old way of formatting strings.</p>
<p>To fix that you can either:</p>
<ul>
<li>replace the old formatting method with a f-string</li>
<li>ignore the Pylint error </li>
</ul>
<h3 id="heading-replacing-or-the-strformat-with-a-f-string">Replacing <code>%</code> or the <code>str.format</code> with a f-string</h3>
<p>Let's consider this small script that uses both methods.</p>
<pre><code class="lang-python">name = <span class="hljs-string">'world'</span>

a = <span class="hljs-string">'my hello %s'</span> % name

print(a)

b = <span class="hljs-string">'again this name is {}'</span>.format(name) 

print(b)
</code></pre>
<p>If we run Pylint 2.11.0+ on it, we get a few errors:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1636878592489/KbPKGr1aD.png" alt="C0209: Formatting a regular string which could be a f-string (consider-using-f-string)" /></p>
<p>If it's OK for you to update to f-strings, then that’s the recommended way. How you do that depends on how you're formatting your strings, but when in doubt you can <a target="_blank" href="https://miguendes.me/73-examples-to-help-you-master-pythons-f-strings">check this article</a> to learn the myriad ways you can use an f-string.</p>
<p>In my case, replacing <code>%</code> and <code>str.format</code> becomes:</p>
<pre><code class="lang-python">name = <span class="hljs-string">'world'</span>

a = <span class="hljs-string">f'my hello <span class="hljs-subst">{name}</span>'</span>

print(a)

b = <span class="hljs-string">f'again this name is <span class="hljs-subst">{name}</span>'</span>

print(b)
</code></pre>
<p>If we re-run Pylint, we get:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1636878965935/brdWKINwL.png" alt="c0209: formatting a regular string which could be a f-string (consider-using-f-string)" /></p>
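<p>Format specifiers carry over unchanged, so conversions that involve number formatting are just as mechanical. Here is a small sketch with made-up values showing the three styles side by side:</p>
<pre><code class="lang-python">price = 3.14159
name = 'world'

old_percent = '%s costs %.2f' % (name, price)
old_format = '{} costs {:.2f}'.format(name, price)
new_fstring = f'{name} costs {price:.2f}'

print(new_fstring)
print(old_percent == old_format == new_fstring)
</code></pre>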
<p>In the next section, we'll use flags to disable this error.</p>
<h3 id="heading-ignoring-the-error-using-flags">Ignoring the error using flags</h3>
<p>You can also ignore the error instead of converting your code to f-strings. To do that, you can either add a disabling flag at the top of the Python file, or disable it line-by-line.</p>
<h4 id="heading-ignoring-all-errors-in-the-file">Ignoring all errors in the file</h4>
<p>When you place this flag at the very top of your Python file, Pylint ignores that error across the whole file.</p>
<pre><code class="lang-python"><span class="hljs-comment"># pylint: disable=consider-using-f-string</span>

name = <span class="hljs-string">'world'</span>

a = <span class="hljs-string">'my hello %s'</span> % name

print(a)

b = <span class="hljs-string">'again this name is {}'</span>.format(name)

print(b)
</code></pre>
<p>When we re-run the check, Pylint returns:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1636879041373/eUKFsrxH5.png" alt="c0209: formatting a regular string which could be a f-string (consider-using-f-string)" /></p>
<p>Another alternative is to add this flag to a <code>.pylintrc</code> file. This file should be placed at the root of your project, and by doing so, Pylint will ignore the error across the whole project.</p>
<p>A minimal example in this case would be:</p>
<pre><code><span class="hljs-comment"># .pylintrc</span>

<span class="hljs-section">[MASTER]</span>

<span class="hljs-attr">disable</span>=consider-using-f-string
</code></pre><p>After this change, if we re-run Pylint we get:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1636879041373/eUKFsrxH5.png" alt="c0209: formatting a regular string which could be a f-string (consider-using-f-string)" /></p>
<h4 id="heading-ignoring-individual-errors">Ignoring individual errors</h4>
<p>To ignore individual cases, place the disabling flag <code>#pylint: disable=consider-using-f-string</code> next to each expression you want to ignore.</p>
<pre><code class="lang-python">name = <span class="hljs-string">'world'</span>

a = <span class="hljs-string">'my hello %s'</span> % name <span class="hljs-comment">#pylint: disable=consider-using-f-string</span>

print(a)

b = <span class="hljs-string">'again this name is {}'</span>.format(name) <span class="hljs-comment">#pylint: disable=consider-using-f-string</span>

print(b)
</code></pre>
<p>The errors will now be suppressed:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1636879041373/eUKFsrxH5.png" alt="c0209: formatting a regular string which could be a f-string (consider-using-f-string)" /></p>
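<p>A middle ground, as far as I understand Pylint's message control, is to disable the check for a block of lines and turn it back on afterwards with an <code>enable</code> comment. Treat the following as a sketch and confirm the behaviour against your Pylint version:</p>
<pre><code class="lang-python">name = 'world'

# pylint: disable=consider-using-f-string
a = 'my hello %s' % name
b = 'again this name is {}'.format(name)
# pylint: enable=consider-using-f-string

# From here on the check applies again.
c = f'checked again: {name}'

print(a)
print(b)
print(c)
</code></pre>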
<h2 id="heading-conclusion">Conclusion</h2>
<p>That's it for today. I hope this article helped you understand and fix the infamous <code>"c0209: formatting a regular string which could be a f-string (consider-using-f-string)"</code> error.</p>
<p>See you next time!</p>
]]></content:encoded></item><item><title><![CDATA[Python pathlib Cookbook: 57+ Examples to Master It (2022)]]></title><description><![CDATA[When I started learning Python, there was one thing I always had trouble with: dealing with directories and file paths!
I remember the struggle to manipulate paths as strings using the os module. I was constantly looking up error messages related to ...]]></description><link>https://miguendes.me/python-pathlib</link><guid isPermaLink="true">https://miguendes.me/python-pathlib</guid><category><![CDATA[Python]]></category><category><![CDATA[Python 3]]></category><category><![CDATA[python beginner]]></category><category><![CDATA[python projects]]></category><dc:creator><![CDATA[Miguel Brito]]></dc:creator><pubDate>Sun, 31 Oct 2021 08:46:16 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1635668729067/bKrWQvvdV.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When I started learning Python, there was one thing I always had trouble with: dealing with directories and file paths!</p>
<p>I remember the struggle to manipulate paths as strings using the <code>os</code> module. I was constantly looking up error messages related to improper path manipulation.</p>
<p>The <a target="_blank" href="https://docs.python.org/3/library/os.html"><code>os</code></a> module never felt intuitive and ergonomic to me, but my luck changed when <a target="_blank" href="https://docs.python.org/3/library/pathlib.html"><code>pathlib</code></a> landed in Python 3.4. It was a breath of fresh air, much easier to use, and felt more <em>Pythonic</em> to me.</p>
<p>The only problem was: finding examples on how to use it was hard; the documentation only covered a few use cases. And yes, Python's docs are good, but for newcomers, examples are a must.</p>
<p>Even though the docs are much better now, they don't showcase the module in a problem-solving fashion. That’s why I decided to create this cookbook.</p>
<p>This article is a brain dump of everything I know about <code>pathlib</code>. It's meant to be a reference rather than a linear guide. Feel free to jump around to sections that are more relevant to you.</p>
<p>In this guide, we'll go over dozens of use cases such as:</p>
<ul>
<li>how to create (touch) an empty file</li>
<li>how to convert a path to string</li>
<li>getting the home directory</li>
<li>creating new directories, doing it recursively, and dealing with issues when they</li>
<li>getting the current working directory</li>
<li>get the file extension from a filename</li>
<li>get the parent directory of a file or script</li>
<li>read and write text or binary files</li>
<li>how to delete files</li>
<li>how to create nested directories</li>
<li>how to list all files and folders in a directory</li>
<li>how to list all subdirectories recursively</li>
<li>how to remove a directory along with its contents</li>
</ul>
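<p>As a quick taste, here is a minimal sketch that touches on a few of the items above. It works inside a temporary directory so that it doesn't change any real files; the file and folder names are made up.</p>
<pre><code class="lang-python">import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Create nested directories in one go.
    nested = root / 'a' / 'b'
    nested.mkdir(parents=True, exist_ok=True)

    # Create (touch) an empty file, then write and read text.
    note = nested / 'note.txt'
    note.touch()
    note.write_text('hello')
    content = note.read_text()

    suffix = note.suffix            # file extension
    parent_name = note.parent.name  # name of the parent directory

    # List everything under root recursively.
    names = sorted(p.name for p in root.rglob('*'))

    note.unlink()                   # delete the file
    still_there = note.exists()

print(suffix, parent_name, content, names, still_there)
</code></pre>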
<p>I hope you enjoy!</p>
<h1 id="heading-table-of-contents">Table of contents</h1>
<ul>
<li><a class="post-section-overview" href="#what-is-pathlib-in-python">What is <code>pathlib</code> in Python?</a></li>
<li><a class="post-section-overview" href="#the-anatomy-of-a-pathlibpath">The anatomy of a <code>pathlib.Path</code></a></li>
<li><a class="post-section-overview" href="#how-to-convert-a-path-to-string">How to convert a path to string</a></li>
<li><a class="post-section-overview" href="#how-to-join-a-path-by-adding-parts-or-other-paths">How to join a path by adding parts or other paths</a></li>
<li><a class="post-section-overview" href="#working-with-directories-using-pathlib">Working with directories using <code>pathlib</code></a><ul>
<li><a class="post-section-overview" href="#how-to-get-the-current-working-directory-cwd-with-pathlib">How to get the current working directory (cwd) with <code>pathlib</code></a></li>
<li><a class="post-section-overview" href="#how-to-get-the-home-directory-with-pathlib">How to get the home directory with <code>pathlib</code></a></li>
<li><a class="post-section-overview" href="#how-to-expand-the-initial-path-component-with-pathexpanduser">How to expand the initial path component with <code>Path.expanduser()</code></a></li>
<li><a class="post-section-overview" href="#how-to-list-all-files-and-directories">How to list all files and directories</a></li>
<li><a class="post-section-overview" href="#using-isdir-to-list-only-the-directories">Using <code>is_dir</code> to list only the directories</a></li>
<li><a class="post-section-overview" href="#getting-a-list-of-all-subdirectories-in-the-current-directory-recursively">Getting a list of all subdirectories in the current directory recursively</a></li>
<li><a class="post-section-overview" href="#how-to-recursively-iterate-through-all-files">How to recursively iterate through all files</a></li>
<li><a class="post-section-overview" href="#how-to-change-directories-with-python-pathlib">How to change directories with Python pathlib</a></li>
<li><a class="post-section-overview" href="#how-to-delete-directories-with-pathlib">How to delete directories with <code>pathlib</code></a></li>
<li><a class="post-section-overview" href="#how-to-remove-a-directory-along-with-its-contents-with-pathlib">How to remove a directory along with its contents with <code>pathlib</code></a></li>
</ul>
</li>
<li><a class="post-section-overview" href="#working-with-files-using-pathlib">Working with files using <code>pathlib</code></a><ul>
<li><a class="post-section-overview" href="#how-to-touch-a-file-and-create-parent-directories">How to touch a file and create parent directories</a></li>
<li><a class="post-section-overview" href="#how-to-get-the-filename-from-path">How to get the filename from path</a></li>
<li><a class="post-section-overview" href="#how-to-get-the-file-extension-from-a-filename-using-pathlib">How to get the file extension from a filename using <code>pathlib</code></a></li>
<li><a class="post-section-overview" href="#how-to-open-a-file-for-reading-with-pathlib">How to open a file for reading with <code>pathlib</code></a></li>
<li><a class="post-section-overview" href="#how-to-read-text-files-with-pathlib">How to read text files with <code>pathlib</code></a></li>
<li><a class="post-section-overview" href="#how-to-read-json-files-from-path-with-pathlib">How to read JSON files from path with <code>pathlib</code></a></li>
<li><a class="post-section-overview" href="#how-to-write-a-text-file-with-pathlib">How to write a text file with <code>pathlib</code></a></li>
<li><a class="post-section-overview" href="#how-to-copy-files-with-pathlib">How to copy files with <code>pathlib</code></a></li>
<li><a class="post-section-overview" href="#how-to-delete-a-file-with-pathlib">How to delete a file with <code>pathlib</code></a></li>
<li><a class="post-section-overview" href="#how-to-delete-all-files-in-a-directory-with-pathlib">How to delete all files in a directory with <code>pathlib</code></a></li>
<li><a class="post-section-overview" href="#how-to-rename-a-file-using-pathlib">How to rename a file using <code>pathlib</code></a></li>
<li><a class="post-section-overview" href="#how-to-get-the-parent-directory-of-a-file-with-pathlib">How to get the parent directory of a file with <code>pathlib</code></a></li>
</ul>
</li>
<li><a class="post-section-overview" href="#conclusion">Conclusion</a></li>
</ul>
<h2 id="heading-what-is-pathlib-in-python">What is <code>pathlib</code> in Python?</h2>
<p><code>pathlib</code> is a Python module created to make it easier to work with paths in a file system. This module debuted in Python 3.4 and was proposed in <a target="_blank" href="https://www.python.org/dev/peps/pep-0428/">PEP 428</a>.</p>
<p>Prior to Python 3.4, the <code>os</code> module from the standard library was the go-to module for handling paths. <code>os</code> provides several functions that manipulate paths represented as plain Python strings. For example, to join two paths using <code>os</code>, one can use the <code>os.path.join</code> function.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> os
<span class="hljs-meta">&gt;&gt;&gt; </span>os.path.join(<span class="hljs-string">'/home/user'</span>, <span class="hljs-string">'projects'</span>)
<span class="hljs-string">'/home/user/projects'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>os.path.expanduser(<span class="hljs-string">'~'</span>)
<span class="hljs-string">'C:\\Users\\Miguel'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>home = os.path.expanduser(<span class="hljs-string">'~'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>os.path.join(home, <span class="hljs-string">'projects'</span>)
<span class="hljs-string">'C:\\Users\\Miguel\\projects'</span>
</code></pre>
<p>Representing paths as strings encourages inexperienced Python developers to perform common path operations with string methods, for example, joining paths with <code>+</code> instead of <code>os.path.join()</code>. This can lead to subtle bugs and makes the code hard to reuse across platforms.</p>
<p>Moreover, if you want the path operations to be platform agnostic, you will need multiple calls to various <code>os</code> functions such as <code>os.path.dirname()</code>, <code>os.path.basename()</code>, and others. </p>
<p>In an attempt to fix these issues, Python 3.4 incorporated the <code>pathlib</code> module. It provides a high-level abstraction that works well on POSIX systems, such as Linux, as well as on Windows. It abstracts away the path's representation and provides the operations as methods and properties.</p>
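<p>To make the contrast concrete, here's a minimal sketch that performs the same three operations both ways. The file path is made up for illustration. I'm also assuming <code>posixpath</code> (the implementation behind <code>os.path</code> on Linux and macOS) and <code>PurePosixPath</code>, so that the result is the same no matter which system you run it on.</p>
<pre><code class="lang-python">import posixpath
from pathlib import PurePosixPath

raw = '/home/user/projects/app/settings.toml'  # hypothetical path

# The os.path way: one function call per operation, all on plain strings
parent_str = posixpath.dirname(raw)
name_str = posixpath.basename(raw)
sibling_str = posixpath.join(parent_str, 'README.md')

# The pathlib way: the same operations as properties and operators
path = PurePosixPath(raw)
parent = path.parent
name = path.name
sibling = path.parent / 'README.md'

print(parent_str, name_str, sibling_str)
print(parent, name, sibling)
</code></pre>
<p>Both approaches produce the same values. The difference is that the <code>pathlib</code> version keeps everything attached to a single object.</p>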
<h2 id="heading-the-anatomy-of-a-pathlibpath">The anatomy of a <code>pathlib.Path</code></h2>
<p>To make the basic components of a <code>Path</code> easier to understand, this section breaks a path down into its parts, first on Linux and then on Windows.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1635494392030/uomUWH2vC.png" alt="Python pathlib Path parts Linux" /></p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'/home/miguel/projects/blog/config.tar.gz'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>path.drive
<span class="hljs-string">''</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>path.root
<span class="hljs-string">'/'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>path.anchor
<span class="hljs-string">'/'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>path.parent
PosixPath(<span class="hljs-string">'/home/miguel/projects/blog'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>path.name
<span class="hljs-string">'config.tar.gz'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>path.stem
<span class="hljs-string">'config.tar'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>path.suffix
<span class="hljs-string">'.gz'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>path.suffixes
[<span class="hljs-string">'.tar'</span>, <span class="hljs-string">'.gz'</span>]
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1635494412217/w8PsZFUD-.png" alt="Python pathlib Path parts Windows" /></p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">r'C:/Users/Miguel/projects/blog/config.tar.gz'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>path.drive
<span class="hljs-string">'C:'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>path.root
<span class="hljs-string">'\\'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>path.anchor
<span class="hljs-string">'C:\\'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>path.parent
WindowsPath(<span class="hljs-string">'C:/Users/Miguel/projects/blog'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>path.name
<span class="hljs-string">'config.tar.gz'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>path.stem
<span class="hljs-string">'config.tar'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>path.suffix
<span class="hljs-string">'.gz'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>path.suffixes
[<span class="hljs-string">'.tar'</span>, <span class="hljs-string">'.gz'</span>]
</code></pre>
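<p>If you don't have both operating systems at hand, one way to explore these components is with the "pure" path classes, which never touch the file system and can be instantiated anywhere. The sketch below reuses the two example paths from this section and also prints <code>.parts</code>, which splits a path into a tuple of its components.</p>
<pre><code class="lang-python">from pathlib import PurePosixPath, PureWindowsPath

posix = PurePosixPath('/home/miguel/projects/blog/config.tar.gz')
windows = PureWindowsPath('C:/Users/Miguel/projects/blog/config.tar.gz')

# .parts splits a path into its components, anchor first
print(posix.parts)
print(windows.parts)

# drive, root and anchor are where the two flavours differ
print(repr(posix.drive), repr(posix.root), repr(posix.anchor))
print(repr(windows.drive), repr(windows.root), repr(windows.anchor))
</code></pre>
<p>A POSIX path has no drive, so <code>drive</code> is an empty string and the anchor is just the root.</p>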
<h2 id="heading-how-to-convert-a-path-to-string">How to convert a path to string</h2>
<p><code>Path</code> objects implement the magic <code>__str__</code> method, and we can use it to convert a path to a string. Having this method implemented means you can get a path's string representation by passing it to the <code>str</code> constructor, as in the example below.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'/home/miguel/projects/tutorial'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>str(path)
<span class="hljs-string">'/home/miguel/projects/tutorial'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>repr(path)
<span class="hljs-string">"PosixPath('/home/miguel/projects/tutorial')"</span>
</code></pre>
<p>The example above illustrates a <code>PosixPath</code>, but you can also convert a <code>WindowsPath</code> to a string using the same mechanism.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">r'C:/Users/Miguel/projects/blog/config.tar.gz'</span>)

<span class="hljs-comment"># when we convert a WindowsPath to string, Python adds backslashes</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>str(path)
<span class="hljs-string">'C:\\Users\\Miguel\\projects\\blog\\config.tar.gz'</span>

<span class="hljs-comment"># whereas repr shows the path with forward slashes, even on Windows</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>repr(path)
<span class="hljs-string">"WindowsPath('C:/Users/Miguel/projects/blog/config.tar.gz')"</span>
</code></pre>
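<p>Besides <code>str()</code>, there are two other standard ways to get a string out of a path: <code>os.fspath()</code>, which is what functions accepting path-like objects call internally, and <code>as_posix()</code>, which always uses forward slashes. Here's a small sketch, assuming a <code>PureWindowsPath</code> so that it behaves the same on any system.</p>
<pre><code class="lang-python">import os
from pathlib import PureWindowsPath

path = PureWindowsPath('C:/Users/Miguel/projects/blog/config.tar.gz')

as_text = str(path)            # native separators (backslashes here)
as_fs = os.fspath(path)        # same string, via the os.PathLike protocol
as_forward = path.as_posix()   # always forward slashes

print(as_text)
print(as_fs)
print(as_forward)
</code></pre>
<p><code>as_posix()</code> can be handy when a Windows path needs to end up in a URL or a config file that expects forward slashes.</p>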
<h2 id="heading-how-to-join-a-path-by-adding-parts-or-other-paths">How to join a path by adding parts or other paths</h2>
<p>One of the things I like the most about <code>pathlib</code> is how easy it is to join two or more paths, or parts. There are three main ways you can do that:</p>
<ul>
<li>you can pass all the individual parts of a path to the constructor</li>
<li>use the <code>.joinpath</code> method</li>
<li>use the <code>/</code> operator</li>
</ul>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-comment"># pass all the parts to the constructor</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">'.'</span>, <span class="hljs-string">'projects'</span>, <span class="hljs-string">'python'</span>, <span class="hljs-string">'source'</span>)
PosixPath(<span class="hljs-string">'projects/python/source'</span>)

<span class="hljs-comment"># Using the / operator to join another path object</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">'.'</span>, <span class="hljs-string">'projects'</span>, <span class="hljs-string">'python'</span>) / Path(<span class="hljs-string">'source'</span>)
PosixPath(<span class="hljs-string">'projects/python/source'</span>)

<span class="hljs-comment"># Using the / operator to join a string</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">'.'</span>, <span class="hljs-string">'projects'</span>, <span class="hljs-string">'python'</span>) / <span class="hljs-string">'source'</span>
PosixPath(<span class="hljs-string">'projects/python/source'</span>)

<span class="hljs-comment"># Using the joinpath method</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">'.'</span>, <span class="hljs-string">'projects'</span>, <span class="hljs-string">'python'</span>).joinpath(<span class="hljs-string">'source'</span>)
PosixPath(<span class="hljs-string">'projects/python/source'</span>)
</code></pre>
<p>On Windows, <code>Path</code> returns a <code>WindowsPath</code> instead, but it works the same way as on Linux.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">'.'</span>, <span class="hljs-string">'projects'</span>, <span class="hljs-string">'python'</span>, <span class="hljs-string">'source'</span>)
WindowsPath(<span class="hljs-string">'projects/python/source'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">'.'</span>, <span class="hljs-string">'projects'</span>, <span class="hljs-string">'python'</span>) / Path(<span class="hljs-string">'source'</span>)
WindowsPath(<span class="hljs-string">'projects/python/source'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">'.'</span>, <span class="hljs-string">'projects'</span>, <span class="hljs-string">'python'</span>) / <span class="hljs-string">'source'</span>
WindowsPath(<span class="hljs-string">'projects/python/source'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">'.'</span>, <span class="hljs-string">'projects'</span>, <span class="hljs-string">'python'</span>).joinpath(<span class="hljs-string">'source'</span>)
WindowsPath(<span class="hljs-string">'projects/python/source'</span>)
</code></pre>
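<p>Two details are worth a quick sketch: <code>joinpath</code> accepts several segments at once, and joining an <em>absolute</em> segment discards everything that came before it. The directory and file names below are made up, and I'm assuming <code>PurePosixPath</code> to keep the result independent of the operating system.</p>
<pre><code class="lang-python">from pathlib import PurePosixPath

base = PurePosixPath('projects')

# joinpath accepts several segments at once
nested = base.joinpath('python', 'source', 'main.py')

# the / operator can be chained to get the same result
chained = base / 'python' / 'source' / 'main.py'

# an absolute segment discards everything before it
replaced = base / '/etc' / 'hosts'

print(nested)
print(chained)
print(replaced)
</code></pre>
<p>The last case mirrors what <code>os.path.join</code> does, and it's an easy way to end up somewhere unexpected if one of the parts comes from user input.</p>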
<h2 id="heading-working-with-directories-using-pathlib">Working with directories using <code>pathlib</code></h2>
<p>In this section, we'll see how we can traverse, or walk, through directories with <code>pathlib</code>. And when it comes to navigating folders, there are many things we can do, such as:</p>
<ul>
<li><a class="post-section-overview" href="#how-to-get-the-current-working-directory-cwd-with-pathlib">getting the current working directory</a></li>
<li><a class="post-section-overview" href="#how-to-get-the-home-directory-with-pathlib">getting the home directory</a></li>
<li><a class="post-section-overview" href="#how-to-expand-the-initial-path-component-with-pathexpanduser">expanding the home directory</a></li>
<li><a class="post-section-overview" href="#creating-directories-with-pathlib">creating new directories, doing it recursively, and dealing with issues when they already exist</a></li>
<li><a class="post-section-overview" href="#how-to-create-parent-directories-recursively-if-not-exists">how to create nested directories</a></li>
<li><a class="post-section-overview" href="#how-to-list-all-files-and-directories">listing all files and folders in a directory</a></li>
<li><a class="post-section-overview" href="#using-isdir-to-list-only-the-directories">listing only folders in a directory</a></li>
<li><a class="post-section-overview" href="#how-to-list-only-the-files-with-isfile">listing only the files in a directory</a></li>
<li><a class="post-section-overview" href="#how-to-list-only-the-files-with-isfile">getting the number of files in a directory</a></li>
<li><a class="post-section-overview" href="#how-to-recursively-iterate-through-all-files">listing all subdirectories recursively</a></li>
<li><a class="post-section-overview" href="#how-to-recursively-iterate-through-all-files">listing all files in a directory and subdirectories recursively</a></li>
<li><a class="post-section-overview" href="#recursively-list-all-files-with-a-given-extension-or-pattern">recursively listing all files with a given extension or pattern</a></li>
<li><a class="post-section-overview" href="#how-to-change-directories-with-python-pathlib">changing current working directories</a></li>
<li><a class="post-section-overview" href="#how-to-delete-directories-with-pathlib">removing an empty directory</a></li>
<li><a class="post-section-overview" href="#how-to-remove-a-directory-along-with-its-contents-with-pathlib">removing a directory along with its contents</a></li>
</ul>
<h3 id="heading-how-to-get-the-current-working-directory-cwd-with-pathlib">How to get the current working directory (cwd) with <code>pathlib</code></h3>
<p>The <code>pathlib</code> module provides a classmethod, <code>Path.cwd()</code>, to get the current working directory in Python. It returns a <code>PosixPath</code> instance on Linux and on other Unix systems such as macOS or OpenBSD. Under the hood, <code>Path.cwd()</code> is just a <a target="_blank" href="https://miguendes.me/how-to-find-the-current-working-directory-in-python">wrapper for the classic <code>os.getcwd()</code></a>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>Path.cwd()
PosixPath(<span class="hljs-string">'/home/miguel/Desktop/pathlib'</span>)
</code></pre>
<p>On Windows, it returns a WindowsPath.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>Path.cwd()
<span class="hljs-meta">&gt;&gt;&gt; </span>WindowsPath(<span class="hljs-string">'C:/Users/Miguel/pathlib'</span>)
</code></pre>
<p>You can also print it by converting it to a string using an <a target="_blank" href="https://miguendes.me/73-examples-to-help-you-master-pythons-f-strings">f-string</a>, for example.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>print(<span class="hljs-string">f'This is the current directory: <span class="hljs-subst">{Path.cwd()}</span>'</span>)
This <span class="hljs-keyword">is</span> the current directory: /home/miguel/Desktop/pathlib
</code></pre>
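<p>The current working directory also matters because it is what relative paths are anchored to. As a small sketch, assuming a made-up file name that doesn't even need to exist, <code>Path.absolute()</code> shows where a relative path would point:</p>
<pre><code class="lang-python">from pathlib import Path

relative = Path('data.csv')  # hypothetical file name, it does not need to exist

# absolute() anchors a relative path at the current working directory
anchored = relative.absolute()

print(anchored)
</code></pre>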
<h3 id="heading-how-to-get-the-home-directory-with-pathlib">How to get the home directory with <code>pathlib</code></h3>
<p>When <code>pathlib</code> arrived in Python 3.4, <code>Path</code> had no method for getting the home directory. This changed in Python 3.5, with the inclusion of the <code>Path.home()</code> method.</p>
<p>In Python 3.4, one has to use <code>os.path.expanduser</code>, which is awkward and unintuitive.</p>
<pre><code class="lang-python"><span class="hljs-comment"># In python 3.4</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pathlib, os
<span class="hljs-meta">&gt;&gt;&gt; </span>pathlib.Path(os.path.expanduser(<span class="hljs-string">"~"</span>))
PosixPath(<span class="hljs-string">'/home/miguel'</span>)
</code></pre>
<p>From Python 3.5 onwards, you just call <code>Path.home()</code>.</p>
<pre><code class="lang-python"><span class="hljs-comment"># In Python 3.5+</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pathlib

<span class="hljs-meta">&gt;&gt;&gt; </span>pathlib.Path.home()
PosixPath(<span class="hljs-string">'/home/miguel'</span>)
</code></pre>
<p><code>Path.home()</code> also works well on Windows.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pathlib

<span class="hljs-meta">&gt;&gt;&gt; </span>pathlib.Path.home()
WindowsPath(<span class="hljs-string">'C:/Users/Miguel'</span>)
</code></pre>
<h3 id="heading-how-to-expand-the-initial-path-component-with-pathexpanduser">How to expand the initial path component with <code>Path.expanduser()</code></h3>
<p>In Unix systems, the home directory can be abbreviated as <code>~</code> (the tilde symbol). For example, this allows us to write a full path like <code>/home/miguel/Desktop</code> as just <code>~/Desktop/</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'~/Desktop/'</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>path.expanduser()
PosixPath(<span class="hljs-string">'/home/miguel/Desktop'</span>)
</code></pre>
<p>Despite being more popular on Unix systems, this representation also works on Windows.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'~/projects'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>path.expanduser()
WindowsPath(<span class="hljs-string">'C:/Users/Miguel/projects'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>path.expanduser().exists()
<span class="hljs-literal">True</span>
</code></pre>
<blockquote>
<p><strong>What's the opposite of <code>os.path.expanduser()</code>?</strong></p>
</blockquote>
<p>Unfortunately, the <code>pathlib</code> module doesn't have any method to do the inverse operation. If you want to condense the expanded path back to its shorter version, you need to get the path relative to your home directory using <code>Path.relative_to</code>, and place the <code>~</code> in front of it.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'~/Desktop/'</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>expanded_path = path.expanduser()
<span class="hljs-meta">&gt;&gt;&gt; </span>expanded_path
PosixPath(<span class="hljs-string">'/home/miguel/Desktop'</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'~'</span> / expanded_path.relative_to(Path.home())
PosixPath(<span class="hljs-string">'~/Desktop'</span>)
</code></pre>
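<p>One caveat with this trick: <code>Path.relative_to</code> raises <code>ValueError</code> when the path is not inside the home directory. Below is a minimal sketch of a helper that accounts for that. The name <code>collapse_user</code> is my own invention, not something <code>pathlib</code> provides.</p>
<pre><code class="lang-python">from pathlib import Path


def collapse_user(path):
    """Return path with the home directory replaced by '~', when possible."""
    home = Path.home()
    try:
        return Path('~') / Path(path).relative_to(home)
    except ValueError:
        # path is not inside the home directory, so leave it untouched
        return Path(path)


inside = Path.home() / 'Desktop' / 'notes.txt'   # hypothetical file
outside = Path('some_relative_dir/file.txt')     # hypothetical file

print(collapse_user(inside))
print(collapse_user(outside))
</code></pre>
<p>Calling <code>expanduser()</code> on the result gives back the original path, so the two operations round-trip.</p>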
<h3 id="heading-creating-directories-with-pathlib">Creating directories with <code>pathlib</code></h3>
<p>A directory is nothing more than a location for storing files and other directories, also called folders. <code>pathlib.Path</code> comes with a method to create new directories named <code>Path.mkdir()</code>.</p>
<p>This method takes three arguments:</p>
<ul>
<li><code>mode</code>: Used to determine the file mode and access flags</li>
<li><code>parents</code>: Similar to the <code>mkdir -p</code> command in Unix systems. Defaults to <code>False</code>, which means it raises <code>FileNotFoundError</code> if a parent directory is missing. When it's <code>True</code>, <code>Path.mkdir()</code> creates the missing parent directories.</li>
<li><code>exist_ok</code>: Defaults to <code>False</code>, which raises <code>FileExistsError</code> if the directory being created already exists. When you set it to <code>True</code>, <code>pathlib</code> ignores the error, as long as the existing path is a directory and not a regular file.</li>
</ul>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-comment"># lists all files and directories in the current folder</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>list(Path.cwd().iterdir())
[PosixPath(<span class="hljs-string">'/home/miguel/path/not_created_yet'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/path/reports'</span>)]

<span class="hljs-comment"># create a new path instance</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'new_directory'</span>)

<span class="hljs-comment"># only the path instance has been created, but it doesn't exist on disk yet</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>path.exists()
<span class="hljs-literal">False</span>

<span class="hljs-comment"># create path on disk</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>path.mkdir()

<span class="hljs-comment"># now it exists</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>path.exists()
<span class="hljs-literal">True</span>

<span class="hljs-comment"># indeed, it shows up</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>list(Path.cwd().iterdir())
[PosixPath(<span class="hljs-string">'/home/miguel/path/not_created_yet'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/path/reports'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/path/new_directory'</span>)]
</code></pre>
<h4 id="heading-creating-a-directory-that-already-exists">Creating a directory that already exists</h4>
<p>When you have a directory path and it already exists, Python raises <code>FileExistsError</code> if you call <code>Path.mkdir()</code> on it. In the previous section, we briefly mentioned that this happens because by default the <code>exist_ok</code> argument is set to <code>False</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>list(Path.cwd().iterdir())
[PosixPath(<span class="hljs-string">'/home/miguel/path/not_created_yet'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/path/reports'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/path/new_directory'</span>)]

<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'new_directory'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>path.exists()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>path.mkdir()
---------------------------------------------------------------------------
FileExistsError                           Traceback (most recent call last)
&lt;ipython-input<span class="hljs-number">-25</span><span class="hljs-number">-4</span>b7d1fa6f6eb&gt; <span class="hljs-keyword">in</span> &lt;module&gt;
----&gt; <span class="hljs-number">1</span> path.mkdir()

~/.pyenv/versions/<span class="hljs-number">3.9</span><span class="hljs-number">.4</span>/lib/python3<span class="hljs-number">.9</span>/pathlib.py <span class="hljs-keyword">in</span> mkdir(self, mode, parents, exist_ok)
   <span class="hljs-number">1311</span>         <span class="hljs-keyword">try</span>:
-&gt; <span class="hljs-number">1312</span>             self._accessor.mkdir(self, mode)
   <span class="hljs-number">1313</span>         <span class="hljs-keyword">except</span> FileNotFoundError:
   <span class="hljs-number">1314</span>             <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> parents <span class="hljs-keyword">or</span> self.parent == self:

FileExistsError: [Errno <span class="hljs-number">17</span>] File exists: <span class="hljs-string">'new_directory'</span>
</code></pre>
<p>To call <code>Path.mkdir()</code> on a folder that already exists without getting an error, you need to set <code>exist_ok</code> to <code>True</code>. This is useful if you don't want to check using <code>if</code>s or deal with exceptions, for example. Another benefit is that if the directory is not empty, <code>pathlib</code> won't overwrite it.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'new_directory'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>path.exists()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>path.mkdir(exist_ok=<span class="hljs-literal">True</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>list(Path.cwd().iterdir())
[PosixPath(<span class="hljs-string">'/home/miguel/path/not_created_yet'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/path/reports'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/path/new_directory'</span>)]

<span class="hljs-meta">&gt;&gt;&gt; </span>(path / <span class="hljs-string">'new_file.txt'</span>).touch()

<span class="hljs-meta">&gt;&gt;&gt; </span>list(path.iterdir())
[PosixPath(<span class="hljs-string">'new_directory/new_file.txt'</span>)]

<span class="hljs-meta">&gt;&gt;&gt; </span>path.mkdir(exist_ok=<span class="hljs-literal">True</span>)

<span class="hljs-comment"># the file is still there, pathlib didn't overwrite it</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>list(path.iterdir())
[PosixPath(<span class="hljs-string">'new_directory/new_file.txt'</span>)]
</code></pre>
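<p>Keep in mind that <code>exist_ok=True</code> only forgives an existing <em>directory</em>. If a regular file already uses that name, <code>Path.mkdir()</code> still raises <code>FileExistsError</code>. The sketch below demonstrates this inside a temporary directory, so it doesn't leave anything behind. The <code>raised</code> flag is only there to record what happened.</p>
<pre><code class="lang-python">import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    occupied = Path(tmp) / 'new_directory'
    occupied.touch()  # a regular file now uses that name

    try:
        occupied.mkdir(exist_ok=True)
        raised = False
    except FileExistsError:
        # exist_ok does not help when the existing path is a file
        raised = True

print(raised)
</code></pre>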
<h4 id="heading-how-to-create-parent-directories-recursively-if-not-exists">How to create parent directories recursively if they don't exist</h4>
<p>Sometimes you might want to create not only a single directory but also a parent and a subdirectory in one go. </p>
<p>The good news is that <code>Path.mkdir()</code> can handle situations like this well thanks to its <code>parents</code> argument. When <code>parents</code> is set to <code>True</code>, <code>Path.mkdir()</code> creates the missing parent directories; this behavior is similar to the <code>mkdir -p</code> command in Unix systems.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'new_parent_dir/sub_dir'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>path.mkdir()
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
&lt;ipython-input<span class="hljs-number">-35</span><span class="hljs-number">-4</span>b7d1fa6f6eb&gt; <span class="hljs-keyword">in</span> &lt;module&gt;
----&gt; <span class="hljs-number">1</span> path.mkdir()

~/.pyenv/versions/<span class="hljs-number">3.9</span><span class="hljs-number">.4</span>/lib/python3<span class="hljs-number">.9</span>/pathlib.py <span class="hljs-keyword">in</span> mkdir(self, mode, parents, exist_ok)
   <span class="hljs-number">1311</span>         <span class="hljs-keyword">try</span>:
-&gt; <span class="hljs-number">1312</span>             self._accessor.mkdir(self, mode)
   <span class="hljs-number">1313</span>         <span class="hljs-keyword">except</span> FileNotFoundError:
   <span class="hljs-number">1314</span>             <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> parents <span class="hljs-keyword">or</span> self.parent == self:

FileNotFoundError: [Errno <span class="hljs-number">2</span>] No such file <span class="hljs-keyword">or</span> directory: <span class="hljs-string">'new_parent_dir/sub_dir'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>path.mkdir(parents=<span class="hljs-literal">True</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>path.exists()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>path.parent
PosixPath(<span class="hljs-string">'new_parent_dir'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>path
PosixPath(<span class="hljs-string">'new_parent_dir/sub_dir'</span>)
</code></pre>
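<p>In practice, I find it convenient to combine <code>parents=True</code> with <code>exist_ok=True</code>, which makes the call safe to repeat. Here's a minimal sketch, run inside a temporary directory so it cleans up after itself:</p>
<pre><code class="lang-python">import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'new_parent_dir' / 'sub_dir'

    # the first call creates both levels, the second one is a no-op
    path.mkdir(parents=True, exist_ok=True)
    path.mkdir(parents=True, exist_ok=True)

    parent_created = path.parent.is_dir()
    child_created = path.is_dir()

print(parent_created, child_created)
</code></pre>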
<h3 id="heading-how-to-list-all-files-and-directories">How to list all files and directories</h3>
<p>There are many ways you can list files in a directory with Python's <code>pathlib</code>. We'll see each one in this section.</p>
<p>To list everything in a directory, files and subdirectories alike, you can use the <code>Path.iterdir()</code> method. For performance reasons, it returns a generator, which you can either iterate over directly or convert to a list for convenience.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'/home/miguel/projects/pathlib'</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>list(path.iterdir())
[PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/script.py'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/README.md'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/tests'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/src'</span>)]
</code></pre>
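<p>If you don't need the whole list at once, you can loop over the generator directly. The snippet below is a minimal sketch of that idea. It builds a throwaway directory with <code>tempfile</code> (an assumption made only so the example runs anywhere), and the file names are placeholders.</p>

```python
import tempfile
from pathlib import Path

# Build a throwaway directory so the sketch doesn't depend on a real project.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'script.py').touch()
    (root / 'README.md').touch()
    (root / 'tests').mkdir()

    # iterdir() yields entries one by one, in arbitrary order.
    entry_names = []
    for entry in root.iterdir():
        entry_names.append(entry.name)

    # Sort the names so the result is stable across platforms.
    entry_names.sort()

print(entry_names)
```

<p>Since <code>iterdir()</code> makes no promise about ordering, sorting is only there to make the output predictable.</p>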
<h3 id="heading-using-isdir-to-list-only-the-directories">Using <code>is_dir</code> to list only the directories</h3>
<p>We've seen that <code>iterdir</code> yields <code>Path</code> objects. To list only the directories in a folder, you can filter them with the <code>Path.is_dir()</code> method. The example below gets all the folders inside the directory.</p>
<blockquote>
<p>⚠️ WARNING: This example only lists the immediate subdirectories. In the next subsection, we'll see how to list all subdirectories.</p>
</blockquote>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'/home/miguel/projects/pathlib'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>[p <span class="hljs-keyword">for</span> p <span class="hljs-keyword">in</span> path.iterdir() <span class="hljs-keyword">if</span> p.is_dir()]
[PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/tests'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/src'</span>)]
</code></pre>
<h3 id="heading-getting-a-list-of-all-subdirectories-in-the-current-directory-recursively">Getting a list of all subdirectories in the current directory recursively</h3>
<p>In this section, we'll see how to walk through a directory and all of its subdirectories. This time we'll use another method from <code>pathlib.Path</code> named <code>glob</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'/home/miguel/projects/pathlib'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>[p <span class="hljs-keyword">for</span> p <span class="hljs-keyword">in</span> path.glob(<span class="hljs-string">'**/*'</span>) <span class="hljs-keyword">if</span> p.is_dir()]
[PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/tests'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/src'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/src/dir'</span>)]
</code></pre>
<p>As you can see, <code>Path.glob</code> also returns the nested subdirectory <code>src/dir</code>.</p>
<p>Remembering to pass <code>'**/'</code> to <code>glob()</code> is a bit annoying, but there's a way to simplify this by using <code>Path.rglob()</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'/home/miguel/projects/pathlib'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>[p <span class="hljs-keyword">for</span> p <span class="hljs-keyword">in</span> path.rglob(<span class="hljs-string">'*'</span>) <span class="hljs-keyword">if</span> p.is_dir()]
[PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/tests'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/src'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/src/dir'</span>)]
</code></pre>
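<p>To convince ourselves that both spellings find the same directories, here is a small sketch that compares them on a temporary tree. The directory layout mirrors the example above, but it is created on the fly with <code>tempfile</code>, which is an assumption of this sketch.</p>

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'src' / 'dir').mkdir(parents=True)
    (root / 'tests').mkdir()
    (root / 'script.py').touch()

    # Both calls walk the tree recursively; we keep only the directories.
    via_glob = {p for p in root.glob('**/*') if p.is_dir()}
    via_rglob = {p for p in root.rglob('*') if p.is_dir()}

    # Paths relative to the root are easier to read and compare.
    relative_dirs = sorted(p.relative_to(root).as_posix() for p in via_rglob)

print(via_glob == via_rglob)
print(relative_dirs)
```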
<h3 id="heading-how-to-list-only-the-files-with-isfile">How to list only the files with <code>is_file</code></h3>
<p>Just as <code>pathlib</code> provides a method to check if a path is a directory, it also provides one to check if a path is a file. This method is called <code>Path.is_file()</code>, and you can use it to filter out the directories and list all the files in a folder.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path
<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'/home/miguel/projects/pathlib'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>[p <span class="hljs-keyword">for</span> p <span class="hljs-keyword">in</span> path.iterdir() <span class="hljs-keyword">if</span> p.is_file()]
[PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/script.py'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/README.md'</span>)]
</code></pre>
<blockquote>
<p>⚠️ WARNING: This example only lists the files inside the current directory. In the next subsection, we'll see how to list all files inside the subdirectories as well.</p>
</blockquote>
<p>Another nice use case is using <code>Path.iterdir()</code> to count the number of files inside a folder.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path
<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'/home/miguel/projects/pathlib'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>len([p <span class="hljs-keyword">for</span> p <span class="hljs-keyword">in</span> path.iterdir() <span class="hljs-keyword">if</span> p.is_file()])
<span class="hljs-number">2</span>
</code></pre>
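<p>For big folders, you may not want to build a list just to count its items. One option, sketched below under the assumption of a temporary directory with placeholder files, is to feed a generator expression to <code>sum()</code>.</p>

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'script.py').touch()
    (root / 'README.md').touch()
    (root / 'src').mkdir()

    # Add 1 for every file; no intermediate list is created.
    file_count = sum(1 for p in root.iterdir() if p.is_file())

print(file_count)
```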
<h3 id="heading-how-to-recursively-iterate-through-all-files">How to recursively iterate through all files</h3>
<p>In previous sections, we used <code>Path.rglob()</code> to list all directories recursively. We can do the same for files by filtering the paths with the <code>Path.is_file()</code> method.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'/home/miguel/projects/pathlib'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>[p <span class="hljs-keyword">for</span> p <span class="hljs-keyword">in</span> path.rglob(<span class="hljs-string">'*'</span>) <span class="hljs-keyword">if</span> p.is_file()]
[PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/script.py'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/README.md'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/tests/test_script.py'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/src/dir/walk.py'</span>)]
</code></pre>
<h3 id="heading-how-to-recursively-list-all-files-with-a-given-extension-or-pattern">How to recursively list all files with a given extension or pattern</h3>
<p>In the previous example, we listed all files in a directory, but what if we want to filter by extension? For that, <code>pathlib.Path</code> has a method named <code>match()</code>, which returns <code>True</code> if the path matches the given pattern, and <code>False</code> otherwise.</p>
<p>In the example below, we list all <code>.py</code> files recursively.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'/home/miguel/projects/pathlib'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>[p <span class="hljs-keyword">for</span> p <span class="hljs-keyword">in</span> path.rglob(<span class="hljs-string">'*'</span>) <span class="hljs-keyword">if</span> p.is_file() <span class="hljs-keyword">and</span> p.match(<span class="hljs-string">'*.py'</span>)]
[PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/script.py'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/tests/test_script.py'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/src/dir/walk.py'</span>)]
</code></pre>
<p>We can use the same trick for other kinds of files. For example, we might want to list all images in a directory or subdirectories.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path
<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'/home/miguel/pictures'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>[p <span class="hljs-keyword">for</span> p <span class="hljs-keyword">in</span> path.rglob(<span class="hljs-string">'*'</span>)
         <span class="hljs-keyword">if</span> p.match(<span class="hljs-string">'*.jpeg'</span>) <span class="hljs-keyword">or</span> p.match(<span class="hljs-string">'*.jpg'</span>) <span class="hljs-keyword">or</span> p.match(<span class="hljs-string">'*.png'</span>)
]
[PosixPath(<span class="hljs-string">'/home/miguel/pictures/dog.png'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/pictures/london/sunshine.jpg'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/pictures/london/building.jpeg'</span>)]
</code></pre>
<p>We can actually simplify it even further by passing the pattern straight to <code>Path.glob</code> or <code>Path.rglob</code> and letting them do the matching. (Thanks to <code>u/laundmo</code> and <code>u/SquareRootsi</code> for pointing this out!)</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'/home/miguel/projects/pathlib'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>list(path.rglob(<span class="hljs-string">'*.py'</span>))
[PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/script.py'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/tests/test_script.py'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/src/dir/walk.py'</span>)]

<span class="hljs-meta">&gt;&gt;&gt; </span>list(path.glob(<span class="hljs-string">'*.py'</span>))
[PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/script.py'</span>)]

<span class="hljs-meta">&gt;&gt;&gt; </span>list(path.glob(<span class="hljs-string">'**/*.py'</span>))
[PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/script.py'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/tests/test_script.py'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/projects/pathlib/src/dir/walk.py'</span>)]
</code></pre>
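<p>Each <code>glob</code> call takes a single pattern, so for several extensions at once another option is to compare <code>Path.suffix</code> against a set. The sketch below assumes a hypothetical helper named <code>find_images</code> and a hypothetical <code>IMAGE_SUFFIXES</code> set. Unlike the <code>match()</code> version, it lowercases the suffix and skips directories.</p>

```python
import tempfile
from pathlib import Path

# Hypothetical set of extensions we consider to be images.
IMAGE_SUFFIXES = {'.jpeg', '.jpg', '.png'}


def find_images(root: Path) -> list:
    """Return, sorted, every file under root whose suffix is in IMAGE_SUFFIXES."""
    return sorted(
        p for p in root.rglob('*')
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


with tempfile.TemporaryDirectory() as tmp:
    pictures = Path(tmp)
    (pictures / 'london').mkdir()
    (pictures / 'dog.png').touch()
    (pictures / 'notes.txt').touch()
    (pictures / 'london' / 'sunshine.JPG').touch()

    image_names = [p.name for p in find_images(pictures)]

print(image_names)
```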
<h3 id="heading-how-to-change-directories-with-python-pathlib">How to change directories with Python pathlib</h3>
<p>Unfortunately, <code>pathlib</code> has no built-in method to change directories. However, it is possible to combine it with the <code>os.chdir()</code> function, and use it to change the current directory to a different one.</p>
<blockquote>
<p>⚠️ WARNING: For Python versions prior to 3.6, <code>os.chdir</code> only accepts paths as strings.</p>
</blockquote>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> os
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pathlib

<span class="hljs-meta">&gt;&gt;&gt; </span>pathlib.Path.cwd()
PosixPath(<span class="hljs-string">'/home/miguel'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>target_dir = <span class="hljs-string">'/home'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>os.chdir(target_dir)

<span class="hljs-meta">&gt;&gt;&gt; </span>pathlib.Path.cwd()
PosixPath(<span class="hljs-string">'/home'</span>)
</code></pre>
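<p>Changing the working directory affects the whole process, so it's easy to forget to switch back. A possible way around that is a small context manager. The sketch below uses a hypothetical helper named <code>working_directory</code>, which is not part of <code>pathlib</code>, and a temporary directory as the target.</p>

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def working_directory(target):
    """Temporarily switch the current directory, restoring it on exit."""
    previous = Path.cwd()
    os.chdir(target)
    try:
        yield Path.cwd()
    finally:
        # Runs even if the body raises, so we always go back.
        os.chdir(previous)


start = Path.cwd()
with tempfile.TemporaryDirectory() as tmp:
    with working_directory(tmp) as current:
        # resolve() irons out symlinks such as /tmp on some systems.
        inside = current.resolve() == Path(tmp).resolve()

restored = Path.cwd() == start
print(inside, restored)
```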
<h3 id="heading-how-to-delete-directories-with-pathlib">How to delete directories with <code>pathlib</code></h3>
<p>Deleting directories using <code>pathlib</code> depends on whether the folder is empty or not. To delete an empty directory, we can use the <code>Path.rmdir()</code> method.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'new_empty_dir'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>path.mkdir()

<span class="hljs-meta">&gt;&gt;&gt; </span>path.exists()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>path.rmdir()

<span class="hljs-meta">&gt;&gt;&gt; </span>path.exists()
<span class="hljs-literal">False</span>
</code></pre>
<p>If we put a file or another directory inside it and then try to delete it, <code>Path.rmdir()</code> raises an error.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'non_empty_dir'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>path.mkdir()

<span class="hljs-meta">&gt;&gt;&gt; </span>(path / <span class="hljs-string">'file.txt'</span>).touch()

<span class="hljs-meta">&gt;&gt;&gt; </span>path
PosixPath(<span class="hljs-string">'non_empty_dir'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>path.exists()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>path.rmdir()
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
&lt;ipython-input<span class="hljs-number">-64</span><span class="hljs-number">-00</span>bf20b27a59&gt; <span class="hljs-keyword">in</span> &lt;module&gt;
----&gt; <span class="hljs-number">1</span> path.rmdir()

~/.pyenv/versions/<span class="hljs-number">3.9</span><span class="hljs-number">.4</span>/lib/python3<span class="hljs-number">.9</span>/pathlib.py <span class="hljs-keyword">in</span> rmdir(self)
   <span class="hljs-number">1350</span>         Remove this directory.  The directory must be empty.
                      ...
-&gt; <span class="hljs-number">1352</span>         self._accessor.rmdir(self)
   <span class="hljs-number">1353</span>
   <span class="hljs-number">1354</span>     <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">lstat</span>(<span class="hljs-params">self</span>):</span>

OSError: [Errno <span class="hljs-number">39</span>] Directory <span class="hljs-keyword">not</span> empty: <span class="hljs-string">'non_empty_dir'</span>
</code></pre>
<p>Now, the question is: how to delete non-empty directories with <code>pathlib</code>?</p>
<p>This is what we'll see next.</p>
<h3 id="heading-how-to-remove-a-directory-along-with-its-contents-with-pathlib">How to remove a directory along with its contents with <code>pathlib</code></h3>
<p>To delete a non-empty directory, we first need to remove everything inside it.</p>
<p>To do that with <code>pathlib</code>, we need to create a function that uses <code>Path.iterdir()</code> to walk or traverse the directory and:</p>
<ul>
<li>if the path is a file, we call <code>Path.unlink()</code></li>
<li>otherwise, we call the function recursively. When there are no more files, that is, when the folder is empty, just call <code>Path.rmdir()</code></li>
</ul>
<p>Let's use the following example of a non-empty directory with nested folders and files in it.</p>
<pre><code class="lang-console">$ tree /home/miguel/Desktop/blog/pathlib/sandbox/
/home/miguel/Desktop/blog/pathlib/sandbox/
├── article.txt
└── reports
    ├── another_nested
    │   └── some_file.png
    └── article.txt

2 directories, 3 files
</code></pre>
<p>To remove it we can use the following recursive function.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">remove_all</span>(<span class="hljs-params">root: Path</span>):</span>
         <span class="hljs-keyword">for</span> path <span class="hljs-keyword">in</span> root.iterdir():
             <span class="hljs-keyword">if</span> path.is_file():
                 print(<span class="hljs-string">f'Deleting the file: <span class="hljs-subst">{path}</span>'</span>)
                 path.unlink()
             <span class="hljs-keyword">else</span>:
                 remove_all(path)
         print(<span class="hljs-string">f'Deleting the empty dir: <span class="hljs-subst">{root}</span>'</span>)
         root.rmdir()
</code></pre>
<p>Then, we invoke it on the root directory, which gets removed as well.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>root = Path(<span class="hljs-string">'/home/miguel/Desktop/blog/pathlib/sandbox'</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>root
PosixPath(<span class="hljs-string">'/home/miguel/Desktop/blog/pathlib/sandbox'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>root.exists()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>remove_all(root)
Deleting the file: /home/miguel/Desktop/blog/pathlib/sandbox/reports/another_nested/some_file.png
Deleting the empty dir: /home/miguel/Desktop/blog/pathlib/sandbox/reports/another_nested
Deleting the file: /home/miguel/Desktop/blog/pathlib/sandbox/reports/article.txt
Deleting the empty dir: /home/miguel/Desktop/blog/pathlib/sandbox/reports
Deleting the file: /home/miguel/Desktop/blog/pathlib/sandbox/article.txt
Deleting the empty dir: /home/miguel/Desktop/blog/pathlib/sandbox

<span class="hljs-meta">&gt;&gt;&gt; </span>root
PosixPath(<span class="hljs-string">'/home/miguel/Desktop/blog/pathlib/sandbox'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>root.exists()
<span class="hljs-literal">False</span>
</code></pre>
<p>I need to be honest, this solution works fine but it's not the most appropriate one. <code>pathlib</code> is not well suited to this kind of operation.</p>
<p>As suggested by <code>u/Rawing7</code> on Reddit, a better approach is to use <a target="_blank" href="https://docs.python.org/3/library/shutil.html#shutil.rmtree"><code>shutil.rmtree</code></a>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> shutil

<span class="hljs-meta">&gt;&gt;&gt; </span>root = Path(<span class="hljs-string">'/home/miguel/Desktop/blog/pathlib/sandbox'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>root.exists()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>shutil.rmtree(root)

<span class="hljs-meta">&gt;&gt;&gt; </span>root.exists()
<span class="hljs-literal">False</span>
</code></pre>
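<p>If the directory might already be gone, calling <code>shutil.rmtree</code> on it raises an error. One way to guard against that is a tiny wrapper. The function name <code>remove_tree</code> below is hypothetical, and the sandbox is recreated in a temporary location so the sketch is safe to run.</p>

```python
import shutil
import tempfile
from pathlib import Path


def remove_tree(root: Path) -> bool:
    """Delete root and everything under it; return False if it isn't a directory."""
    if not root.is_dir():
        return False
    shutil.rmtree(root)
    return True


# Recreate a small nested sandbox in a temporary location.
base = Path(tempfile.mkdtemp())
sandbox = base / 'sandbox'
(sandbox / 'reports' / 'another_nested').mkdir(parents=True)
(sandbox / 'article.txt').touch()

first = remove_tree(sandbox)   # the tree exists, so it gets deleted
second = remove_tree(sandbox)  # nothing left to delete this time
sandbox_exists = sandbox.exists()

remove_tree(base)  # clean up the temporary parent as well
print(first, second, sandbox_exists)
```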
<h2 id="heading-working-with-files">Working with files</h2>
<p>In this section, we'll use <code>pathlib</code> to perform operations on files. For example, we'll see how we can:</p>
<ul>
<li>create new files</li>
<li>copy existing files</li>
<li>delete files with <code>pathlib</code></li>
<li>read and write files with <code>pathlib</code></li>
</ul>
<p>Specifically, we'll learn how to:</p>
<ul>
<li><a class="post-section-overview" href="#how-to-touch-create-an-empty-a-file">create (touch) an empty file</a></li>
<li><a class="post-section-overview" href="#touch-a-file-with-timestamp">touch a file with timestamp</a></li>
<li><a class="post-section-overview" href="#how-to-touch-a-file-and-create-parent-directories">touch a new file and create the parent directories if they don't exist</a></li>
<li><a class="post-section-overview" href="#how-to-get-the-filename-from-path">get the file name</a></li>
<li><a class="post-section-overview" href="#how-to-get-the-file-extension-from-a-filename-using-pathlib">get the file extension from a filename</a></li>
<li><a class="post-section-overview" href="#how-to-open-a-file-for-reading-with-pathlib">open a file for reading</a></li>
<li><a class="post-section-overview" href="#how-to-read-text-files-with-pathlib">read a text file</a></li>
<li><a class="post-section-overview" href="#how-to-read-json-files-from-path-with-pathlib">read a JSON file</a></li>
<li><a class="post-section-overview" href="#how-to-read-binary-files-with-pathlib">read a binary file</a></li>
<li><a class="post-section-overview" href="#how-to-open-all-files-in-a-directory-in-python">open all the files in a folder</a></li>
<li><a class="post-section-overview" href="#how-to-write-a-text-file-with-pathlib">write a text file</a></li>
<li><a class="post-section-overview" href="#how-to-write-json-files-to-path-with-pathlib">write a JSON file</a></li>
<li><a class="post-section-overview" href="#how-to-write-bytes-data-to-a-file">write bytes data to a file</a></li>
<li><a class="post-section-overview" href="#how-to-copy-files-with-pathlib">copy an existing file to another directory</a></li>
<li><a class="post-section-overview" href="#how-to-delete-a-file-with-pathlib">delete a single file</a></li>
<li><a class="post-section-overview" href="#how-to-delete-all-files-in-a-directory-with-pathlib">delete all files in a directory</a></li>
<li><a class="post-section-overview" href="#how-to-rename-a-file-using-pathlib">rename a file by changing its name, or by adding a new extension</a></li>
<li><a class="post-section-overview" href="#how-to-get-the-parent-directory-of-a-file-with-pathlib">get the parent directory of a file or script</a></li>
</ul>
<h3 id="heading-how-to-touch-create-an-empty-a-file">How to touch (create an empty) file</h3>
<p><code>pathlib</code> provides a method named <code>Path.touch()</code> that creates an empty file. This method is very handy when you need to create a placeholder file if it does not exist.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">'empty.txt'</span>).exists()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">'empty.txt'</span>).touch()

<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">'empty.txt'</span>).exists()
<span class="hljs-literal">True</span>
</code></pre>
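<p>By default, touching a file that already exists is fine, since <code>exist_ok</code> defaults to <code>True</code>. If you'd rather fail when the file is already there, you can pass <code>exist_ok=False</code>. The snippet below is a minimal sketch of that behaviour, run inside a temporary directory (an assumption of this example).</p>

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    placeholder = Path(tmp) / 'empty.txt'

    placeholder.touch()  # creates the file
    placeholder.touch()  # fine: exist_ok defaults to True

    try:
        placeholder.touch(exist_ok=False)  # the file is already there
        raised = False
    except FileExistsError:
        raised = True

print(raised)
```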
<h3 id="heading-touch-a-file-with-timestamp">Touch a file with timestamp</h3>
<p>To create a timestamped empty file, we first need to determine the <a target="_blank" href="https://stackoverflow.com/questions/9637838/convert-string-date-to-timestamp-in-python">timestamp format</a>. </p>
<p>One way to do that is to use the <code>time</code> and <code>datetime</code> modules. First we define a date format, then we use the <code>datetime</code> module to create the datetime object. Finally, we use <code>time.mktime</code> to get back the timestamp.</p>
<p>Once we have the timestamp, we can just use <a target="_blank" href="https://miguendes.me/73-examples-to-help-you-master-pythons-f-strings">f-strings to build the filename</a>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> time, datetime

<span class="hljs-meta">&gt;&gt;&gt; </span>s = <span class="hljs-string">'02/03/2021'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>d = datetime.datetime.strptime(s, <span class="hljs-string">"%d/%m/%Y"</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>d
datetime.datetime(<span class="hljs-number">2021</span>, <span class="hljs-number">3</span>, <span class="hljs-number">2</span>, <span class="hljs-number">0</span>, <span class="hljs-number">0</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>d.timetuple()
time.struct_time(tm_year=<span class="hljs-number">2021</span>, tm_mon=<span class="hljs-number">3</span>, tm_mday=<span class="hljs-number">2</span>, tm_hour=<span class="hljs-number">0</span>, tm_min=<span class="hljs-number">0</span>, tm_sec=<span class="hljs-number">0</span>, tm_wday=<span class="hljs-number">1</span>, tm_yday=<span class="hljs-number">61</span>, tm_isdst=<span class="hljs-number">-1</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>time.mktime(d.timetuple())
<span class="hljs-number">1614643200.0</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>int(time.mktime(d.timetuple()))
<span class="hljs-number">1614643200</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">f'empty_<span class="hljs-subst">{int(time.mktime(d.timetuple()))}</span>.txt'</span>).exists()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">f'empty_<span class="hljs-subst">{int(time.mktime(d.timetuple()))}</span>.txt'</span>).touch()

<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">f'empty_<span class="hljs-subst">{int(time.mktime(d.timetuple()))}</span>.txt'</span>).exists()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>str(Path(<span class="hljs-string">f'empty_<span class="hljs-subst">{int(time.mktime(d.timetuple()))}</span>.txt'</span>))
<span class="hljs-string">'empty_1614643200.txt'</span>
</code></pre>
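<p>Keep in mind that <code>time.mktime</code> interprets the date in the machine's local timezone, so the number can differ from one computer to another. If that matters, one alternative is to pin the date to UTC and call <code>datetime.timestamp()</code>. The sketch below assumes UTC is what you want, which the example above does not state.</p>

```python
from datetime import datetime, timezone
from pathlib import Path

s = '02/03/2021'

# Parse the date, then attach UTC so the result doesn't depend on local time.
d = datetime.strptime(s, '%d/%m/%Y').replace(tzinfo=timezone.utc)

timestamp = int(d.timestamp())
filename = Path(f'empty_{timestamp}.txt')

print(timestamp)
print(filename)
```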
<h3 id="heading-how-to-touch-a-file-and-create-parent-directories">How to touch a file and create parent directories</h3>
<p>Another common problem when creating empty files is to place them in a directory that doesn't exist yet. The reason is that <code>path.touch()</code> only works if the directory exists. To illustrate that, let's see an example.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">'path/not_created_yet/empty.txt'</span>)
PosixPath(<span class="hljs-string">'path/not_created_yet/empty.txt'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">'path/not_created_yet/empty.txt'</span>).exists()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">'path/not_created_yet/empty.txt'</span>).touch()
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
&lt;ipython-input<span class="hljs-number">-24</span><span class="hljs-number">-177</span>d43b041e9&gt; <span class="hljs-keyword">in</span> &lt;module&gt;
----&gt; <span class="hljs-number">1</span> Path(<span class="hljs-string">'path/not_created_yet/empty.txt'</span>).touch()

~/.pyenv/versions/<span class="hljs-number">3.9</span><span class="hljs-number">.4</span>/lib/python3<span class="hljs-number">.9</span>/pathlib.py <span class="hljs-keyword">in</span> touch(self, mode, exist_ok)
   <span class="hljs-number">1302</span>         <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> exist_ok:
   <span class="hljs-number">1303</span>             flags |= os.O_EXCL
-&gt; <span class="hljs-number">1304</span>         fd = self._raw_open(flags, mode)
   <span class="hljs-number">1305</span>         os.close(fd)
   <span class="hljs-number">1306</span>

~/.pyenv/versions/<span class="hljs-number">3.9</span><span class="hljs-number">.4</span>/lib/python3<span class="hljs-number">.9</span>/pathlib.py <span class="hljs-keyword">in</span> _raw_open(self, flags, mode)
   <span class="hljs-number">1114</span>         <span class="hljs-keyword">as</span> os.open() does.
                      ...
-&gt; <span class="hljs-number">1116</span>         <span class="hljs-keyword">return</span> self._accessor.open(self, flags, mode)
   <span class="hljs-number">1117</span>
   <span class="hljs-number">1118</span>     <span class="hljs-comment"># Public API</span>

FileNotFoundError: [Errno <span class="hljs-number">2</span>] No such file <span class="hljs-keyword">or</span> directory: <span class="hljs-string">'path/not_created_yet/empty.txt'</span>
</code></pre>
<p>If the target directory does not exist, <code>pathlib</code> raises <code>FileNotFoundError</code>. To fix that, we need to create the directory first. The simplest way, as described in the <a class="post-section-overview" href="#creating-directories-with-pathlib">"creating directories" section</a>, is to call <code>Path.mkdir(parents=True, exist_ok=True)</code>. This call creates the directory along with any missing parent directories, and does nothing if it already exists.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">'path/not_created_yet/empty.txt'</span>).exists()
<span class="hljs-literal">False</span>

<span class="hljs-comment"># let's create the empty folder first</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>folder = Path(<span class="hljs-string">'path/not_created_yet/'</span>)

<span class="hljs-comment"># it doesn't exist yet</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>folder.exists()
<span class="hljs-literal">False</span>

<span class="hljs-comment"># create it</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>folder.mkdir(parents=<span class="hljs-literal">True</span>, exist_ok=<span class="hljs-literal">True</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>folder.exists()
<span class="hljs-literal">True</span>

<span class="hljs-comment"># the folder exists, but we still need to create the empty file</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">'path/not_created_yet/empty.txt'</span>).exists()
<span class="hljs-literal">False</span>

<span class="hljs-comment"># create it as usual using pathlib touch</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">'path/not_created_yet/empty.txt'</span>).touch()

<span class="hljs-comment"># verify it exists</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">'path/not_created_yet/empty.txt'</span>).exists()
<span class="hljs-literal">True</span>
</code></pre>
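<p>The two steps above can be wrapped in a small helper. The function name <code>touch_with_parents</code> is hypothetical, as <code>pathlib</code> has no such method, and the sketch runs inside a temporary directory to keep things tidy.</p>

```python
import tempfile
from pathlib import Path


def touch_with_parents(path: Path) -> Path:
    """Create any missing parent directories, then touch the file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.touch()
    return path


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'path' / 'not_created_yet' / 'empty.txt'

    existed_before = target.exists()
    touch_with_parents(target)
    exists_after = target.is_file()

print(existed_before, exists_after)
```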
<h3 id="heading-how-to-get-the-filename-from-path">How to get the filename from path</h3>
<p>A <code>Path</code> comes with not only methods but also properties. One of them is <code>Path.name</code>, which, as the name implies, returns the filename of the path. This property ignores the parent directories and returns only the file name, including the extension.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>picture = Path(<span class="hljs-string">'/home/miguel/Desktop/profile.png'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>picture.name
<span class="hljs-string">'profile.png'</span>
</code></pre>
<h4 id="heading-how-to-get-the-filename-without-the-extension">How to get the filename without the extension</h4>
<p>Sometimes, you might need to retrieve the file name without the extension. A natural way of doing this would be splitting the string on the dot. However, <code>pathlib.Path</code> comes with another helper property named <code>Path.stem</code>, which returns the final component of the path, without the extension.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>picture = Path(<span class="hljs-string">'/home/miguel/Desktop/profile.png'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>picture.stem
<span class="hljs-string">'profile'</span>
</code></pre>
<h3 id="heading-how-to-get-the-file-extension-from-a-filename-using-pathlib">How to get the file extension from a filename using <code>pathlib</code></h3>
<p>If the <code>Path.stem</code> property returns the filename excluding the extension, how can we do the opposite? How to retrieve only the extension?</p>
<p>We can do that using the <code>Path.suffix</code> property.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>picture = Path(<span class="hljs-string">'/home/miguel/Desktop/profile.png'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>picture.suffix
<span class="hljs-string">'.png'</span>
</code></pre>
<p>Some files, such as <code>.tar.gz</code> archives, have a two-part extension, and <code>Path.suffix</code> returns only the last part. To get the whole extension, you need the <code>Path.suffixes</code> property.</p>
<p>This property returns a list of all suffixes for that path. We can then join the list into a single string.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>backup = Path(<span class="hljs-string">'/home/miguel/Desktop/photos.tar.gz'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>backup.suffix
<span class="hljs-string">'.gz'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>backup.suffixes
[<span class="hljs-string">'.tar'</span>, <span class="hljs-string">'.gz'</span>]

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">''</span>.join(backup.suffixes)
<span class="hljs-string">'.tar.gz'</span>
</code></pre>
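<p>If you need both the base name and the full multi-part extension, the join above can be wrapped in a small helper. This is only a sketch: <code>split_all_suffixes</code> is a hypothetical name, not part of <code>pathlib</code>, and it assumes every dot-separated part after the first one belongs to the extension, which is not true for names like <code>report.v1.txt</code>.</p>
<pre><code class="lang-python">from pathlib import Path


def split_all_suffixes(path):
    """Hypothetical helper: return (base name, combined extension)."""
    path = Path(path)
    # Join every suffix, e.g. ['.tar', '.gz'] becomes '.tar.gz'
    extension = ''.join(path.suffixes)
    # Cut the combined extension off the end of the file name
    base = path.name[:len(path.name) - len(extension)]
    return base, extension


print(split_all_suffixes('/home/miguel/Desktop/photos.tar.gz'))
print(split_all_suffixes('/home/miguel/Desktop/README'))
</code></pre>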
<h3 id="heading-how-to-open-a-file-for-reading-with-pathlib">How to open a file for reading with <code>pathlib</code></h3>
<p>Another great feature from <code>pathlib</code> is the ability to open a file pointed to by the path. The behavior is similar to the built-in <code>open()</code> function. In fact, it accepts pretty much the same parameters.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>p = Path(<span class="hljs-string">'/home/miguel/Desktop/blog/pathlib/recipe.txt'</span>)

<span class="hljs-comment"># open the file</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>f = p.open()

<span class="hljs-comment"># read it</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>lines = f.readlines()

<span class="hljs-meta">&gt;&gt;&gt; </span>print(lines)
[<span class="hljs-string">'1. Boil water. \n'</span>, <span class="hljs-string">'2. Warm up teapot. ...\n'</span>, <span class="hljs-string">'3. Put tea into teapot and add hot water.\n'</span>, <span class="hljs-string">'4. Cover teapot and steep tea for 5 minutes.\n'</span>, <span class="hljs-string">'5. Strain tea solids and pour hot tea into tea cups.\n'</span>]

<span class="hljs-comment"># then make sure to close the file descriptor</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>f.close()

<span class="hljs-comment"># or use a context manager, and read the file in one go</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">with</span> p.open() <span class="hljs-keyword">as</span> f:
             lines = f.readlines()

<span class="hljs-meta">&gt;&gt;&gt; </span>print(lines)
[<span class="hljs-string">'1. Boil water. \n'</span>, <span class="hljs-string">'2. Warm up teapot. ...\n'</span>, <span class="hljs-string">'3. Put tea into teapot and add hot water.\n'</span>, <span class="hljs-string">'4. Cover teapot and steep tea for 5 minutes.\n'</span>, <span class="hljs-string">'5. Strain tea solids and pour hot tea into tea cups.\n'</span>]

<span class="hljs-comment"># you can also read the whole content as string</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">with</span> p.open() <span class="hljs-keyword">as</span> f:
             content = f.read()


<span class="hljs-meta">&gt;&gt;&gt; </span>print(content)
<span class="hljs-number">1.</span> Boil water.
<span class="hljs-number">2.</span> Warm up teapot. ...
<span class="hljs-number">3.</span> Put tea into teapot <span class="hljs-keyword">and</span> add hot water.
<span class="hljs-number">4.</span> Cover teapot <span class="hljs-keyword">and</span> steep tea <span class="hljs-keyword">for</span> <span class="hljs-number">5</span> minutes.
<span class="hljs-number">5.</span> Strain tea solids <span class="hljs-keyword">and</span> pour hot tea into tea cups.
</code></pre>
<h3 id="heading-how-to-read-text-files-with-pathlib">How to read text files with <code>pathlib</code></h3>
<p>In the previous section, we used the <code>Path.open()</code> method and the file object's <code>read()</code> method to read the contents of the text file as a string. Even though it works just fine, you still need to close the file, or use the <code>with</code> keyword to close it automatically.</p>
<p><code>pathlib</code> comes with a <code>.read_text()</code> method that does that for you, which is much more convenient.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>p = Path(<span class="hljs-string">'/home/miguel/Desktop/blog/pathlib/recipe.txt'</span>)

<span class="hljs-comment"># just call '.read_text()', no need to close the file</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>content = p.read_text()

<span class="hljs-meta">&gt;&gt;&gt; </span>print(content)
<span class="hljs-number">1.</span> Boil water.
<span class="hljs-number">2.</span> Warm up teapot. ...
<span class="hljs-number">3.</span> Put tea into teapot <span class="hljs-keyword">and</span> add hot water.
<span class="hljs-number">4.</span> Cover teapot <span class="hljs-keyword">and</span> steep tea <span class="hljs-keyword">for</span> <span class="hljs-number">5</span> minutes.
<span class="hljs-number">5.</span> Strain tea solids <span class="hljs-keyword">and</span> pour hot tea into tea cups.
</code></pre>
<blockquote>
<p>The file is opened and then closed. The optional parameters have the same meaning as in open(). <a target="_blank" href="https://docs.python.org/3/library/pathlib.html#pathlib.Path.read_text">pathlib docs</a></p>
</blockquote>
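<p>Since the optional parameters mirror those of <code>open()</code>, you can pass an explicit <code>encoding</code> instead of relying on the platform default. The sketch below uses a temporary directory and a made-up file so it can run anywhere; the file name and contents are assumptions for illustration only.</p>
<pre><code class="lang-python">import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    recipe = Path(tmp_dir) / 'recipe.txt'
    # Create a small file containing a non-ASCII character
    recipe.write_text('1. Boil water.\n2. Serve with crème.\n', encoding='utf-8')
    # Read it back, stating the encoding explicitly
    content = recipe.read_text(encoding='utf-8')

lines = content.splitlines()
print(lines)
</code></pre>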
<h3 id="heading-how-to-read-json-files-from-path-with-pathlib">How to read JSON files from path with <code>pathlib</code></h3>
<p>A JSON file is nothing more than a text file structured according to the JSON specification. To read a JSON file, we can open the path for reading, as we do for text files, and pass the file object to the <code>json.load()</code> function from the <code>json</code> module.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> json
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>response = Path(<span class="hljs-string">'./jsons/response.json'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">with</span> response.open() <span class="hljs-keyword">as</span> f:
        resp = json.load(f)

<span class="hljs-meta">&gt;&gt;&gt; </span>resp
{<span class="hljs-string">'name'</span>: <span class="hljs-string">'remi'</span>, <span class="hljs-string">'age'</span>: <span class="hljs-number">28</span>}
</code></pre>
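<p>An alternative that skips the explicit <code>open()</code> call is to combine <code>Path.read_text()</code> with <code>json.loads()</code>, which parses a string rather than a file object. A minimal sketch, using a temporary file with made-up contents so it is self-contained:</p>
<pre><code class="lang-python">import json
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    response = Path(tmp_dir) / 'response.json'
    # Prepare a sample JSON file (contents are illustrative)
    response.write_text('{"name": "remi", "age": 28}', encoding='utf-8')
    # read_text() opens and closes the file; json.loads() parses the string
    resp = json.loads(response.read_text(encoding='utf-8'))

print(resp)
</code></pre>
<p>This reads the whole file into memory first, which is usually fine for small configuration or response files.</p>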
<h3 id="heading-how-to-read-binary-files-with-pathlib">How to read binary files with <code>pathlib</code></h3>
<p>At this point, if you know how to read a text file, then reading binary files will be easy. We can do this in two ways:</p>
<ul>
<li>with the <code>Path.open()</code> method passing the flags <code>rb</code></li>
<li>with the <code>Path.read_bytes()</code> method</li>
</ul>
<p>Let's start with the first method.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>picture = Path(<span class="hljs-string">'/home/miguel/Desktop/profile.png'</span>)

<span class="hljs-comment"># open the file in binary mode</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>f = picture.open(<span class="hljs-string">'rb'</span>)

<span class="hljs-comment"># read it</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>image_bytes = f.read()

<span class="hljs-meta">&gt;&gt;&gt; </span>print(image_bytes)
<span class="hljs-string">b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x01R\x00\x00\x01p\x08\x02\x00\x00\x00e\xd3d\x85\x00\x00\x00\x03sBIT\x08\x08\x08\xdb\xe1O\xe0\x00\x00\x00\x10tEXtSoftware\x00Shutterc\x82\xd0\t\x00\x00 \x00IDATx\xda\xd4\xbdkw\x1cY\x92\x1ch\xe6~#2\x13\xe0\xa3\xaa\xbbg
...  [OMITTED] ....
0e\xe5\x88\xfc\x7fa\x1a\xc2p\x17\xf0N\xad\x00\x00\x00\x00IEND\xaeB`\x82'</span>

<span class="hljs-comment"># then make sure to close the file descriptor</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>f.close()

<span class="hljs-comment"># or use a context manager, and read the file in one go</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">with</span> picture.open(<span class="hljs-string">'rb'</span>) <span class="hljs-keyword">as</span> f:
            image_bytes = f.read()

<span class="hljs-meta">&gt;&gt;&gt; </span>print(image_bytes)
<span class="hljs-string">b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x01R\x00\x00\x01p\x08\x02\x00\x00\x00e\xd3d\x85\x00\x00\x00\x03sBIT\x08\x08\x08\xdb\xe1O\xe0\x00\x00\x00\x10tEXtSoftware\x00Shutterc\x82\xd0\t\x00\x00 \x00IDATx\xda\xd4\xbdkw\x1cY\x92\x1ch\xe6~#2\x13\xe0\xa3\xaa\xbbg
...  [OMITTED] ....
0e\xe5\x88\xfc\x7fa\x1a\xc2p\x17\xf0N\xad\x00\x00\x00\x00IEND\xaeB`\x82'</span>
</code></pre>
<p>And just like <code>Path.read_text()</code>, <code>pathlib</code> comes with a <code>.read_bytes()</code> method that can open and close the file for you.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-comment"># just call '.read_bytes()', no need to close the file</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>picture = Path(<span class="hljs-string">'/home/miguel/Desktop/profile.png'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>picture.read_bytes()
<span class="hljs-string">b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x01R\x00\x00\x01p\x08\x02\x00\x00\x00e\xd3d\x85\x00\x00\x00\x03sBIT\x08\x08\x08\xdb\xe1O\xe0\x00\x00\x00\x10tEXtSoftware\x00Shutterc\x82\xd0\t\x00\x00 \x00IDATx\xda\xd4\xbdkw\x1cY\x92\x1ch\xe6~#2\x13\xe0\xa3\xaa\xbbg
...  [OMITTED] ....
0e\xe5\x88\xfc\x7fa\x1a\xc2p\x17\xf0N\xad\x00\x00\x00\x00IEND\xaeB`\x82'</span>
</code></pre>
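<p>When you only need the first few bytes, for example to check a file signature, <code>Path.open('rb')</code> lets you read a fixed number of bytes instead of loading the whole file. The sketch below checks for the 8-byte PNG signature; <code>looks_like_png</code> is a hypothetical helper and the files are fabricated in a temporary directory for the demo.</p>
<pre><code class="lang-python">import tempfile
from pathlib import Path

# The fixed 8-byte signature that starts every PNG file
PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'


def looks_like_png(path):
    """Hypothetical helper: compare the first bytes with the PNG signature."""
    with Path(path).open('rb') as f:
        return f.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE


with tempfile.TemporaryDirectory() as tmp_dir:
    fake_png = Path(tmp_dir) / 'profile.png'
    fake_png.write_bytes(PNG_SIGNATURE + b'rest of the file')
    text_file = Path(tmp_dir) / 'notes.txt'
    text_file.write_bytes(b'just text')
    results = (looks_like_png(fake_png), looks_like_png(text_file))

print(results)
</code></pre>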
<h3 id="heading-how-to-open-all-files-in-a-directory-in-python">How to open all files in a directory in Python</h3>
<p>Let's imagine you need a Python script to search all files in a directory and open them all. Maybe you want to filter by extension, or you want to do it recursively. If you've been following this guide from the beginning, you now know <a class="post-section-overview" href="#how-to-list-only-the-files-with-isfile">how to use the <code>Path.iterdir()</code> method</a>.</p>
<p>To open all files in a directory, we can combine <code>Path.iterdir()</code> with <code>Path.is_file()</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pathlib

<span class="hljs-comment"># we can use iterdir to traverse all paths in a directory</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">for</span> path <span class="hljs-keyword">in</span> pathlib.Path(<span class="hljs-string">"my_images"</span>).iterdir():
        <span class="hljs-comment"># if the path is a file, then we open it in binary mode</span>
        <span class="hljs-keyword">if</span> path.is_file():
            <span class="hljs-keyword">with</span> path.open(<span class="hljs-string">"rb"</span>) <span class="hljs-keyword">as</span> f:
                image_bytes = f.read()
                load_image_from_bytes(image_bytes)
</code></pre>
<p>If you need to do it recursively, we can use <code>Path.rglob()</code> instead of <code>Path.iterdir()</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pathlib
<span class="hljs-comment"># we can use rglob to walk nested directories</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">for</span> path <span class="hljs-keyword">in</span> pathlib.Path(<span class="hljs-string">"my_images"</span>).rglob(<span class="hljs-string">'*'</span>):
        <span class="hljs-comment"># if the path is a file, then we open it</span>
        <span class="hljs-keyword">if</span> path.is_file():
            <span class="hljs-keyword">with</span> path.open(<span class="hljs-string">"rb"</span>) <span class="hljs-keyword">as</span> f:
                image_bytes = f.read()
                load_image_from_bytes(image_bytes)
</code></pre>
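<p>To filter by extension, one option is to replace the <code>'*'</code> pattern with something like <code>'*.png'</code> and use <code>Path.glob()</code> for a single directory or <code>Path.rglob()</code> for nested ones. The following is a sketch under those assumptions; <code>read_files_with_suffix</code> is a hypothetical helper and the directory layout is created on the fly.</p>
<pre><code class="lang-python">import tempfile
from pathlib import Path


def read_files_with_suffix(directory, suffix, recursive=False):
    """Hypothetical helper: map each matching relative path to its bytes."""
    root = Path(directory)
    pattern = '*' + suffix
    # rglob() also descends into subdirectories, glob() stays at the top level
    paths = root.rglob(pattern) if recursive else root.glob(pattern)
    return {
        path.relative_to(root).as_posix(): path.read_bytes()
        for path in paths
        if path.is_file()
    }


with tempfile.TemporaryDirectory() as tmp_dir:
    images = Path(tmp_dir)
    (images / 'nested').mkdir()
    (images / 'a.png').write_bytes(b'a')
    (images / 'notes.txt').write_bytes(b'n')
    (images / 'nested' / 'b.png').write_bytes(b'b')
    top_level = read_files_with_suffix(images, '.png')
    everything = read_files_with_suffix(images, '.png', recursive=True)

print(sorted(top_level))
print(sorted(everything))
</code></pre>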
<h3 id="heading-how-to-write-a-text-file-with-pathlib">How to write a text file with <code>pathlib</code></h3>
<p>In previous sections, we saw how to read text files using <code>Path.read_text()</code>.</p>
<p>To write a text file to disk, <code>pathlib</code> comes with a <code>Path.write_text()</code> method. The benefit of using this method is that it opens the file, writes the data, and closes the file for you, and the <a target="_blank" href="https://docs.python.org/3/library/pathlib.html#pathlib.Path.write_text">optional parameters have the same meaning as in open()</a>.</p>
<blockquote>
<p>⚠️ WARNING: If the file already exists, <code>Path.write_text()</code> will overwrite its contents.</p>
</blockquote>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pathlib

<span class="hljs-meta">&gt;&gt;&gt; </span>file_path = pathlib.Path(<span class="hljs-string">'/home/miguel/Desktop/blog/recipe.txt'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>recipe_txt = <span class="hljs-string">'''
    1. Boil water.
    2. Warm up teapot. ...
    3. Put tea into teapot and add hot water.
    4. Cover teapot and steep tea for 5 minutes.
    5. Strain tea solids and pour hot tea into tea cups.
    '''</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>file_path.exists()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>file_path.write_text(recipe_txt)
<span class="hljs-number">180</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>content = file_path.read_text()

<span class="hljs-meta">&gt;&gt;&gt; </span>print(content)

<span class="hljs-number">1.</span> Boil water.
<span class="hljs-number">2.</span> Warm up teapot. ...
<span class="hljs-number">3.</span> Put tea into teapot <span class="hljs-keyword">and</span> add hot water.
<span class="hljs-number">4.</span> Cover teapot <span class="hljs-keyword">and</span> steep tea <span class="hljs-keyword">for</span> <span class="hljs-number">5</span> minutes.
<span class="hljs-number">5.</span> Strain tea solids <span class="hljs-keyword">and</span> pour hot tea into tea cups.
</code></pre>
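<p>Because <code>Path.write_text()</code> always replaces the existing contents, appending requires opening the file in <code>'a'</code> mode with <code>Path.open()</code>. A small sketch that contrasts the two behaviours, using a throwaway file in a temporary directory:</p>
<pre><code class="lang-python">import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    steps = Path(tmp_dir) / 'steps.txt'

    steps.write_text('1. Boil water.\n')
    # A second write_text() call replaces what was there before
    steps.write_text('2. Warm up teapot.\n')
    after_overwrite = steps.read_text()

    # Opening in append mode keeps the existing contents
    with steps.open('a') as f:
        f.write('3. Put tea into teapot.\n')
    after_append = steps.read_text()

print(repr(after_overwrite))
print(repr(after_append))
</code></pre>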
<h3 id="heading-how-to-write-json-files-to-path-with-pathlib">How to write JSON files to path with <code>pathlib</code></h3>
<p>Python represents JSON objects as plain dictionaries. To write them to a file as JSON using <code>pathlib</code>, we need to combine the <code>json.dump()</code> function and <code>Path.open()</code>, the same way we did to read a JSON file from disk.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> json

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pathlib

<span class="hljs-meta">&gt;&gt;&gt; </span>resp = {<span class="hljs-string">'name'</span>: <span class="hljs-string">'remi'</span>, <span class="hljs-string">'age'</span>: <span class="hljs-number">28</span>}

<span class="hljs-meta">&gt;&gt;&gt; </span>response = pathlib.Path(<span class="hljs-string">'./response.json'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>response.exists()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">with</span> response.open(<span class="hljs-string">'w'</span>) <span class="hljs-keyword">as</span> f:
         json.dump(resp, f)


<span class="hljs-meta">&gt;&gt;&gt; </span>response.read_text()
<span class="hljs-string">'{"name": "remi", "age": 28}'</span>
</code></pre>
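<p>If you prefer not to manage the file object yourself, a possible alternative is to serialize with <code>json.dumps()</code> and hand the resulting string to <code>Path.write_text()</code>. The sketch below also passes <code>indent=2</code> to produce a human-readable file; the path and data are illustrative.</p>
<pre><code class="lang-python">import json
import tempfile
from pathlib import Path

resp = {'name': 'remi', 'age': 28}

with tempfile.TemporaryDirectory() as tmp_dir:
    response = Path(tmp_dir) / 'response.json'
    # json.dumps() returns a string, write_text() opens, writes and closes
    response.write_text(json.dumps(resp, indent=2), encoding='utf-8')
    written = response.read_text(encoding='utf-8')
    round_trip = json.loads(written)

print(written)
</code></pre>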
<h3 id="heading-how-to-write-bytes-data-to-a-file">How to write bytes data to a file</h3>
<p>To write bytes to a file, we can use either the <code>Path.open()</code> method with the <code>wb</code> mode or the <code>Path.write_bytes()</code> method.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>image_path_1 = Path(<span class="hljs-string">'./profile.png'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>image_bytes = <span class="hljs-string">b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00 [OMITTED] \x00IEND\xaeB`\x82'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">with</span> image_path_1.open(<span class="hljs-string">'wb'</span>) <span class="hljs-keyword">as</span> f:
         f.write(image_bytes)


<span class="hljs-meta">&gt;&gt;&gt; </span>image_path_1.read_bytes()
<span class="hljs-string">b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00 [OMITTED] \x00IEND\xaeB`\x82'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>image_path_2 = Path(<span class="hljs-string">'./profile_2.png'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>image_path_2.exists()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>image_path_2.write_bytes(image_bytes)
<span class="hljs-number">37</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>image_path_2.read_bytes()
<span class="hljs-string">b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00 [OMITTED] \x00IEND\xaeB`\x82'</span>
</code></pre>
<h3 id="heading-how-to-copy-files-with-pathlib">How to copy files with <code>pathlib</code></h3>
<p><code>pathlib</code>, at least in the Python version used in this guide (3.9), does not provide a method to copy files. However, we can still copy a file represented by a path. There are two different ways of doing that:</p>
<ul>
<li>using the <code>shutil</code> module</li>
<li>using the <code>Path.read_bytes()</code> and <code>Path.write_bytes()</code> methods</li>
</ul>
<p>For the first alternative, we use the <code>shutil.copyfile(src, dst)</code> function and pass the source and destination path.</p>
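<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> shutil
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path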

<span class="hljs-meta">&gt;&gt;&gt; </span>src = Path(<span class="hljs-string">'/home/miguel/Desktop/blog/pathlib/sandbox/article.txt'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>src.exists()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>dst = Path(<span class="hljs-string">'/home/miguel/Desktop/blog/pathlib/sandbox/reports/article.txt'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>dst.exists()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>shutil.copyfile(src, dst)
PosixPath(<span class="hljs-string">'/home/miguel/Desktop/blog/pathlib/sandbox/reports/article.txt'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>dst.exists()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>dst.read_text()
<span class="hljs-string">'This is \n\nan \n\ninteresting article.\n'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>dst.read_text() == src.read_text()
<span class="hljs-literal">True</span>
</code></pre>
<blockquote>
<p>⚠️ WARNING: <code>shutil</code> prior to Python 3.6 cannot handle <code>Path</code> instances. You need to convert the path to string first.</p>
</blockquote>
<p>The second method involves reading the whole source file into memory, then writing its bytes to the destination.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>src = Path(<span class="hljs-string">'/home/miguel/Desktop/blog/pathlib/sandbox/article.txt'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>src.exists()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>dst = Path(<span class="hljs-string">'/home/miguel/Desktop/blog/pathlib/sandbox/reports/article.txt'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>dst.exists()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>dst.write_bytes(src.read_bytes())
<span class="hljs-number">36</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>dst.exists()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>dst.read_text()
<span class="hljs-string">'This is \n\nan \n\ninteresting article.\n'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>dst.read_text() == src.read_text()
<span class="hljs-literal">True</span>
</code></pre>
<blockquote>
<p>⚠️ WARNING: This method will overwrite the destination path. If that's a concern, either check whether the file exists first, or open the file for writing with the <code>x</code> flag. This flag opens the file for exclusive creation, so it fails with <code>FileExistsError</code> if the file already exists.</p>
</blockquote>
<p>Another downside of this approach is that it loads the file to memory. If the file is big, prefer <code>shutil.copyfileobj</code>. It supports buffering and can <a target="_blank" href="https://docs.python.org/3/library/shutil.html#shutil.copyfileobj">read the file in chunks</a>, thus avoiding uncontrolled memory consumption.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>src = Path(<span class="hljs-string">'/home/miguel/Desktop/blog/pathlib/sandbox/article.txt'</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>dst = Path(<span class="hljs-string">'/home/miguel/Desktop/blog/pathlib/sandbox/reports/article.txt'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> dst.exists():
         dst.write_bytes(src.read_bytes())
     <span class="hljs-keyword">else</span>:
         print(<span class="hljs-string">'File already exists, aborting...'</span>)

File already exists, aborting...

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">with</span> dst.open(<span class="hljs-string">'xb'</span>) <span class="hljs-keyword">as</span> f:
         f.write(src.read_bytes())

---------------------------------------------------------------------------
FileExistsError                           Traceback (most recent call last)
&lt;ipython-input<span class="hljs-number">-25</span><span class="hljs-number">-1974</span>c5808b1a&gt; <span class="hljs-keyword">in</span> &lt;module&gt;
----&gt; <span class="hljs-number">1</span> <span class="hljs-keyword">with</span> dst.open(<span class="hljs-string">'xb'</span>) <span class="hljs-keyword">as</span> f:
      <span class="hljs-number">2</span>     f.write(src.read_bytes())
      <span class="hljs-number">3</span>
</code></pre>
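<p>For large files, the chunked copy mentioned earlier could look like the sketch below. It combines <code>shutil.copyfileobj()</code> with the <code>x</code> flag so that an existing destination is never overwritten. The helper name <code>copy_in_chunks</code> and the 64 KiB chunk size are assumptions chosen for illustration.</p>
<pre><code class="lang-python">import shutil
import tempfile
from pathlib import Path


def copy_in_chunks(src, dst, chunk_size=64 * 1024):
    """Hypothetical helper: copy src to dst without loading it all in memory."""
    # 'xb' raises FileExistsError if dst already exists
    with Path(src).open('rb') as fsrc, Path(dst).open('xb') as fdst:
        shutil.copyfileobj(fsrc, fdst, chunk_size)


with tempfile.TemporaryDirectory() as tmp_dir:
    src = Path(tmp_dir) / 'article.txt'
    dst = Path(tmp_dir) / 'article_copy.txt'
    src.write_bytes(b'This is an interesting article.\n' * 1000)

    copy_in_chunks(src, dst)
    same_content = dst.read_bytes() == src.read_bytes()

    # A second copy to the same destination is refused
    try:
        copy_in_chunks(src, dst)
        refused_overwrite = False
    except FileExistsError:
        refused_overwrite = True

print(same_content, refused_overwrite)
</code></pre>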
<h3 id="heading-how-to-delete-a-file-with-pathlib">How to delete a file with <code>pathlib</code></h3>
<p>You can remove a file or symbolic link with the <code>Path.unlink()</code> method.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>Path(<span class="hljs-string">'path/reports/report.csv'</span>).touch()

<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'path/reports/report.csv'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>path.exists()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>path.unlink()

<span class="hljs-meta">&gt;&gt;&gt; </span>path.exists()
<span class="hljs-literal">False</span>
</code></pre>
<p>As of Python 3.8, this method takes one argument named <code>missing_ok</code>. By default, <code>missing_ok</code> is set to <code>False</code>, which means it will raise a <code>FileNotFoundError</code> if the file doesn't exist.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'path/reports/report.csv'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>path.exists()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>path.unlink()
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
&lt;ipython-input<span class="hljs-number">-6</span><span class="hljs-number">-8</span>eea53121d7f&gt; <span class="hljs-keyword">in</span> &lt;module&gt;
----&gt; <span class="hljs-number">1</span> path.unlink()

~/.pyenv/versions/<span class="hljs-number">3.9</span><span class="hljs-number">.4</span>/lib/python3<span class="hljs-number">.9</span>/pathlib.py <span class="hljs-keyword">in</span> unlink(self, missing_ok)
   <span class="hljs-number">1342</span>         <span class="hljs-keyword">try</span>:
-&gt; <span class="hljs-number">1343</span>             self._accessor.unlink(self)
   <span class="hljs-number">1344</span>         <span class="hljs-keyword">except</span> FileNotFoundError:
   <span class="hljs-number">1345</span>             <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> missing_ok:

FileNotFoundError: [Errno <span class="hljs-number">2</span>] No such file <span class="hljs-keyword">or</span> directory: <span class="hljs-string">'path/reports/report.csv'</span>

<span class="hljs-comment"># when missing_ok is True, no error is raised</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>path.unlink(missing_ok=<span class="hljs-literal">True</span>)
</code></pre>
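<p>On Python versions older than 3.8, where <code>missing_ok</code> is not available, a similar effect can be obtained by catching <code>FileNotFoundError</code> yourself. A minimal sketch, where <code>unlink_if_exists</code> is a hypothetical helper that also reports whether anything was deleted:</p>
<pre><code class="lang-python">import tempfile
from pathlib import Path


def unlink_if_exists(path):
    """Hypothetical helper: delete the file, return False if it was missing."""
    try:
        Path(path).unlink()
    except FileNotFoundError:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp_dir:
    report = Path(tmp_dir) / 'report.csv'
    report.touch()
    first_call = unlink_if_exists(report)   # the file is there
    second_call = unlink_if_exists(report)  # the file is already gone

print(first_call, second_call)
</code></pre>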
<h3 id="heading-how-to-delete-all-files-in-a-directory-with-pathlib">How to delete all files in a directory with <code>pathlib</code></h3>
<p>To remove all files in a folder, we need to traverse it and check if the path is a file, and if so, call <code>Path.unlink()</code> on it as we saw in the previous section.</p>
<p>To walk over the contents of a directory, we can use <code>Path.iterdir()</code>. Let's consider the following directory.</p>
<pre><code class="lang-shell">$ tree /home/miguel/path/
/home/miguel/path/
├── jsons
│   └── response.json
├── new_parent_dir
│   └── sub_dir
├── non_empty_dir
│   └── file.txt
├── not_created_yet
│   └── empty.txt
├── number.csv
├── photo_1.png
├── report.md
└── reports
</code></pre>
<p>This approach only deletes the files directly under the given directory, so <strong><em>it is not</em></strong> recursive.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> pathlib

<span class="hljs-meta">&gt;&gt;&gt; </span>path = pathlib.Path(<span class="hljs-string">'/home/miguel/path'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>list(path.iterdir())
[PosixPath(<span class="hljs-string">'/home/miguel/path/jsons'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/path/non_empty_dir'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/path/not_created_yet'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/path/reports'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/path/photo_1.png'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/path/number.csv'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/path/new_parent_dir'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/path/report.md'</span>)]

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">for</span> p <span class="hljs-keyword">in</span> path.iterdir():
        <span class="hljs-keyword">if</span> p.is_file():
            p.unlink()


<span class="hljs-meta">&gt;&gt;&gt; </span>list(path.iterdir())
[PosixPath(<span class="hljs-string">'/home/miguel/path/jsons'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/path/non_empty_dir'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/path/not_created_yet'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/path/reports'</span>),
 PosixPath(<span class="hljs-string">'/home/miguel/path/new_parent_dir'</span>)]
</code></pre>
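<p>If you do want to remove files in nested directories as well, one possible approach is to swap <code>Path.iterdir()</code> for <code>Path.rglob('*')</code>. The sketch below leaves the directories themselves in place; <code>delete_files_recursively</code> is a hypothetical helper and the directory tree is created in a temporary location.</p>
<pre><code class="lang-python">import tempfile
from pathlib import Path


def delete_files_recursively(directory):
    """Hypothetical helper: unlink every file below directory, keep folders."""
    # Build the full listing first, so nothing is deleted mid-traversal
    paths = list(Path(directory).rglob('*'))
    deleted = 0
    for path in paths:
        if path.is_file():
            path.unlink()
            deleted += 1
    return deleted


with tempfile.TemporaryDirectory() as tmp_dir:
    root = Path(tmp_dir)
    (root / 'reports' / '2021').mkdir(parents=True)
    (root / 'report.md').touch()
    (root / 'reports' / 'summary.csv').touch()
    (root / 'reports' / '2021' / 'q1.csv').touch()

    deleted = delete_files_recursively(root)
    remaining = sorted(p.relative_to(root).as_posix() for p in root.rglob('*'))

print(deleted, remaining)
</code></pre>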
<h3 id="heading-how-to-rename-a-file-using-pathlib">How to rename a file using <code>pathlib</code></h3>
<p><code>pathlib</code> also comes with a method to rename files called <code>Path.rename(target)</code>. It takes a target file path and renames the source to the target. As of Python 3.8, <code>Path.rename()</code> returns the new Path instance.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>src_file = Path(<span class="hljs-string">'recipe.txt'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>src_file.write_text(<span class="hljs-string">'An delicious recipe'</span>)
<span class="hljs-number">19</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>src_file.read_text()
<span class="hljs-string">'An delicious recipe'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>target = Path(<span class="hljs-string">'new_recipe.txt'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>src_file.rename(target)
PosixPath(<span class="hljs-string">'new_recipe.txt'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>src_file
PosixPath(<span class="hljs-string">'recipe.txt'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>src_file.exists()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>target.read_text()
<span class="hljs-string">'An delicious recipe'</span>
</code></pre>
<h4 id="heading-renaming-only-file-extension">Renaming only file extension</h4>
<p>If all you want is to change the file extension to something else, for example, from <code>.txt</code> to <code>.md</code>, you can use <code>Path.rename(target)</code> in conjunction with the <code>Path.with_suffix(suffix)</code> method. <code>with_suffix</code> returns a new path and does the following:</p>
<ul>
<li>replaces the existing suffix with the supplied one</li>
<li>appends a new suffix, if the original path doesn’t have one</li>
<li>removes the suffix, if the supplied suffix is an empty string</li>
</ul>
<p>Let's see an example where we change our recipe file from plain text <code>.txt</code> to markdown <code>.md</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>src_file = Path(<span class="hljs-string">'recipe.txt'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>src_file.write_text(<span class="hljs-string">'A delicious recipe'</span>)
<span class="hljs-number">18</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>new_src_file = src_file.rename(src_file.with_suffix(<span class="hljs-string">'.md'</span>))

<span class="hljs-meta">&gt;&gt;&gt; </span>new_src_file
PosixPath(<span class="hljs-string">'recipe.md'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>src_file.exists()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>new_src_file.exists()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>new_src_file.read_text()
<span class="hljs-string">'A delicious recipe'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>removed_extension_file = new_src_file.rename(new_src_file.with_suffix(<span class="hljs-string">''</span>))

<span class="hljs-meta">&gt;&gt;&gt; </span>removed_extension_file
PosixPath(<span class="hljs-string">'recipe'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>removed_extension_file.read_text()
<span class="hljs-string">'A delicious recipe'</span>
</code></pre>
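<p>The same combination scales to a whole directory. The sketch below is one possible approach rather than a definitive recipe: the function name <code>change_extension</code> and the file names are made up for the example, and it only looks at files directly inside the given directory.</p>

```python
import tempfile
from pathlib import Path


def change_extension(directory, old_suffix, new_suffix):
    """Rename every file in directory ending in old_suffix to new_suffix."""
    renamed = []
    # sorted() builds the full list first, so we never rename while globbing
    for path in sorted(Path(directory).glob(f"*{old_suffix}")):
        renamed.append(path.rename(path.with_suffix(new_suffix)))
    return renamed


with tempfile.TemporaryDirectory() as tmp:
    for name in ("recipe.txt", "notes.txt", "todo.md"):
        (Path(tmp) / name).write_text("some text")
    change_extension(tmp, ".txt", ".md")
    result = sorted(p.name for p in Path(tmp).iterdir())
    # result is ['notes.md', 'recipe.md', 'todo.md']
```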
<h3 id="heading-how-to-get-the-parent-directory-of-a-file-with-pathlib">How to get the parent directory of a file with <code>pathlib</code></h3>
<p>Sometimes we want to get the directory a file belongs to. You can get that through a <code>Path</code> property named <code>parent</code>. This property represents the logical parent of the path: it is computed from the path itself, so it works for files and directories alike, even when the path doesn't exist on disk.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path

<span class="hljs-meta">&gt;&gt;&gt; </span>path = Path(<span class="hljs-string">'path/reports/report.csv'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>path.exists()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>parent_dir = path.parent

<span class="hljs-meta">&gt;&gt;&gt; </span>parent_dir
PosixPath(<span class="hljs-string">'path/reports'</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>parent_dir.parent
PosixPath(<span class="hljs-string">'path'</span>)
</code></pre>
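<p>If you need every ancestor rather than just the closest one, the related <code>parents</code> property walks up the path one level at a time. Here is a small sketch using <code>PurePosixPath</code> so the output is the same on every platform; the path is the same made-up one as above.</p>

```python
from pathlib import PurePosixPath

path = PurePosixPath("path/reports/report.csv")

# parents walks up the path one logical level at a time
ancestors = [str(p) for p in path.parents]
# ancestors is ['path/reports', 'path', '.']

# parent is purely lexical: ".." components are not resolved
lexical_parent = PurePosixPath("path/reports/..").parent
# lexical_parent is PurePosixPath('path/reports')
```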
<h2 id="heading-conclusion">Conclusion</h2>
<p>That was a lot to learn, and I hope you enjoyed it as much as I enjoyed writing it.</p>
<p><code>pathlib</code> has been part of the standard library since Python 3.4 and it's a great solution when it comes to handling paths.</p>
<p>In this guide, we went through tons of examples covering the most important use cases in which <code>pathlib</code> shines.</p>
<p>I hope this cookbook is useful to you.</p>
<p>Other posts you may like:</p>
<ul>
<li><p><a target="_blank" href="https://miguendes.me/how-to-find-the-current-working-directory-in-python">Find the Current Working Directory in Python</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/python-compare-lists">The Best Ways to Compare Two Lists in Python</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/73-examples-to-help-you-master-pythons-f-strings">Python F-String: 73 Examples to Help You Master It</a></p>
</li>
</ul>
<p>See you next time!</p>
<p>This article was originally published at <a target="_blank" href="https://miguendes.me/python-pathlib">https://miguendes.me</a></p>
]]></content:encoded></item><item><title><![CDATA[How to Disable Autouse Fixtures in pytest]]></title><description><![CDATA[pytest is a very robust framework that comes with lots of features. 
One such feature is the autouse fixtures, a.k.a xUnit setup on steroids. They are a special type of fixture that gets invoked automatically, and its main use case is to act as a set...]]></description><link>https://miguendes.me/pytest-disable-autouse</link><guid isPermaLink="true">https://miguendes.me/pytest-disable-autouse</guid><category><![CDATA[Python]]></category><category><![CDATA[Testing]]></category><category><![CDATA[Beginner Developers]]></category><category><![CDATA[Tutorial]]></category><dc:creator><![CDATA[Miguel Brito]]></dc:creator><pubDate>Sun, 17 Oct 2021 07:37:22 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1632646107008/1Y37WAEI9.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><code>pytest</code> is a very robust framework that comes with lots of features. </p>
<p>One such feature is <code>autouse</code> fixtures, a.k.a. xUnit setup on steroids. They are a special type of fixture that gets invoked automatically, and their main use case is to act as a setup/teardown function.</p>
<p>Another use case is to perform some task, like mocking an external dependency, that must happen before every test.</p>
<p>For example, suppose you have a set of functions that execute HTTP calls. For each one, you provide a test. To ensure your tests don't call the real API, you can <a target="_blank" href="https://miguendes.me/3-ways-to-test-api-client-applications-in-python">mock the call</a> using a library such as <code>responses</code>.</p>
<p>However, if you want one of the tests to call the API, as in an integration test, then you'll have to disable the <code>autouse</code> fixture. And that's what we're going to see today.</p>
<p>In this post, we'll learn a simple technique to disable <code>autouse</code> fixtures for one or more tests.</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ol>
<li><a class="post-section-overview" href="#pytests-autouse-fixture-example"><code>pytest</code>'s <code>autouse</code> fixture - example</a></li>
<li><a class="post-section-overview" href="#disabling-an-autouse-fixture">Disabling an <code>autouse</code> fixture </a></li>
<li><a class="post-section-overview" href="#conclusion">Conclusion</a></li>
</ol>
<h2 id="heading-pytest-fixture-autouse-example"><code>pytest</code> Fixture Autouse - Example</h2>
<p>In this section, we'll build an example to illustrate the usage of <a target="_blank" href="https://docs.pytest.org/en/6.2.x/fixture.html#autouse-fixtures-fixtures-you-don-t-have-to-request">autouse fixtures</a> and how we can disable them when necessary.</p>
<p>For this example, we'll write some tests that mock the random module. </p>
<p>Consider the following case where we'll be building a random password generator. The function takes a password length and returns a random string of size <em>length</em>. And to do that, it uses <code>random.choices</code> to randomly pick <code>k</code> chars from a seed string called <code>all_chars</code>.</p>
<pre><code class="lang-python"><span class="hljs-comment"># file: autouse/__init__.py</span>

<span class="hljs-keyword">import</span> random
<span class="hljs-keyword">import</span> string


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_random_password</span>(<span class="hljs-params">length: int = <span class="hljs-number">20</span></span>) -&gt; str:</span>
    <span class="hljs-string">"""
    Generates a random password with exactly `length` chars.
    """</span>

    all_chars = string.ascii_letters + string.digits + string.punctuation
    <span class="hljs-keyword">return</span> <span class="hljs-string">''</span>.join(random.choices(all_chars, k=length))
</code></pre>
<p>Since we don't control how <code>random.choices</code> picks, we cannot test it in a deterministic way. To make that happen, we can patch <code>random.choices</code> and make it return a fixed list of chars.</p>
<blockquote>
<p>You can also call <code>random.seed</code> with a fixed number before every test run by making it an <em>autouse fixture</em>.</p>
</blockquote>
<pre><code class="lang-python"><span class="hljs-comment"># file: tests/test_random.py</span>

<span class="hljs-keyword">import</span> unittest.mock

<span class="hljs-keyword">import</span> pytest

<span class="hljs-keyword">from</span> autouse <span class="hljs-keyword">import</span> get_random_password


<span class="hljs-meta">@pytest.fixture(autouse=True)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">patch_random</span>():</span>
    <span class="hljs-keyword">with</span> unittest.mock.patch(<span class="hljs-string">'autouse.random.choices'</span>) <span class="hljs-keyword">as</span> mocked_choices:
        mocked_choices.return_value = [<span class="hljs-string">'a'</span>, <span class="hljs-string">'B'</span>, <span class="hljs-string">'c'</span>, <span class="hljs-string">'2'</span>]
        <span class="hljs-keyword">yield</span>


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_mocked_random_char</span>():</span>
    <span class="hljs-keyword">assert</span> get_random_password() == <span class="hljs-string">'aBc2'</span>
</code></pre>
<p>The benefit of the <code>autouse</code> fixture is that we don't need to pass it to every test that needs it. And by using <code>yield</code>, we undo the patching after the test finishes, which is great for cleaning up.</p>
<p>If we run this test, it passes just fine.</p>
<pre><code class="lang-console">============================= test session starts ==============================
collecting ... collected 1 item

test_random.py::test_mocked_random_char PASSED                           [100%]

========================= 1 passed, 1 warning in 0.05s =========================
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1632589051784/ikRPvi5MW.png" alt="pytest autouse fixture being injected inside a test" /></p>
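<p>The note above mentions seeding as an alternative to patching. Here is a minimal sketch of why that works, written without <code>pytest</code> so it runs on its own. The helper <code>seeded_password</code> is hypothetical; in a real test suite the <code>random.seed</code> call would live inside the autouse fixture instead.</p>

```python
import random
import string


def get_random_password(length=20):
    all_chars = string.ascii_letters + string.digits + string.punctuation
    return ''.join(random.choices(all_chars, k=length))


def seeded_password(seed, length=20):
    # Hypothetical helper: reseeding puts the generator in a known state,
    # so the same seed always produces the same password
    random.seed(seed)
    return get_random_password(length)


first = seeded_password(42)
second = seeded_password(42)
# first and second are identical because the seed is the same
```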
<h2 id="heading-disabling-an-autouse-fixture">Disabling an <code>autouse</code> fixture</h2>
<p>Now, let's say that we want to test the robustness of our random password generator and check that it never generates the same string twice in a row.</p>
<p>To do that, we need to call the real function, and not patch it. Let's create this test and see what it does.</p>
<pre><code class="lang-python"><span class="hljs-comment"># file: tests/test_random.py</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_random_char_does_not_duplicate</span>():</span>
    password_one = get_random_password()
    password_two = get_random_password()
    <span class="hljs-keyword">assert</span> password_one != password_two
</code></pre>
<p>But when we run this test, it fails:</p>
<pre><code class="lang-console">test_random.py::test_random_char_does_not_duplicate FAILED               [100%]
test_random.py:18 (test_random_char_does_not_duplicate)
def test_random_char_does_not_duplicate():
        password_one = get_random_password()
        password_two = get_random_password()
&gt;       assert password_one != password_two
E       AssertionError: assert 'aBc2' != 'aBc2'

test_random.py:22: AssertionError
</code></pre>
<p>The reason is that <code>pytest</code> injects the <code>autouse</code> fixture into every test case within the scope you specified.</p>
<p>Now the question is, how can we disable an <code>autouse</code> fixture for one or more tests in pytest?</p>
<p>One way to do that is to create a custom <code>pytest</code> mark and annotate the test with it. For example:</p>
<pre><code class="lang-python"><span class="hljs-meta">@pytest.fixture(autouse=True)</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">patch_random</span>(<span class="hljs-params">request</span>):</span>
    <span class="hljs-keyword">if</span> <span class="hljs-string">'disable_autouse'</span> <span class="hljs-keyword">in</span> request.keywords:
        <span class="hljs-keyword">yield</span> 
    <span class="hljs-keyword">else</span>:
        <span class="hljs-keyword">with</span> unittest.mock.patch(<span class="hljs-string">'autouse.random.choices'</span>) <span class="hljs-keyword">as</span> mocked_choices:
            mocked_choices.return_value = [<span class="hljs-string">'a'</span>, <span class="hljs-string">'B'</span>, <span class="hljs-string">'c'</span>, <span class="hljs-string">'2'</span>]
            <span class="hljs-keyword">yield</span>
</code></pre>
<p>In this example, the fixture looks for a custom <code>pytest</code> mark called <code>disable_autouse</code>, which we'll use to annotate the <code>test_random_char_does_not_duplicate</code> test.</p>
<p>This mark becomes available in the <a target="_blank" href="https://docs.pytest.org/en/6.2.x/reference.html?highlight=request#request">request fixture</a>. We can pass this <code>request</code> fixture to the <code>autouse</code> one and check if the keyword <code>disable_autouse</code> is in the list of keywords.</p>
<p>When that's the case, we don't mock, just <code>yield</code>, which gives back the control to <code>test_random_char_does_not_duplicate</code>, thus avoiding mocking the <code>random.choices</code> function.</p>
<p>Let's see what happens when we run the test with this mark...</p>
<pre><code class="lang-python"><span class="hljs-meta">@pytest.mark.disable_autouse</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_random_char_does_not_duplicate</span>():</span>
    password_one = get_random_password()
    password_two = get_random_password()
    <span class="hljs-keyword">assert</span> password_one != password_two
</code></pre>
<p>The test passes, since it's not mocked anymore.</p>
<pre><code class="lang-console">============================= test session starts ==============================
collecting ... collected 1 item

test_random.py::test_random_char_does_not_duplicate PASSED               [100%]

========================= 1 passed, 1 warning in 0.03s =========================
</code></pre>
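<p>Custom marks that are not registered make <code>pytest</code> emit a <code>PytestUnknownMarkWarning</code>, and they become errors when running with <code>--strict-markers</code>. One way to register the mark is the <code>pytest_configure</code> hook. The sketch below assumes a <code>tests/conftest.py</code> file, and the description text is my own wording.</p>

```python
# file: tests/conftest.py (assumed location)


def pytest_configure(config):
    # Register the custom mark so pytest recognises it
    config.addinivalue_line(
        "markers",
        "disable_autouse: skip the patch_random autouse fixture for this test",
    )
```

<p>The same registration can also be done declaratively, through the <code>markers</code> option in the <code>pytest</code> configuration file.</p>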
<h2 id="heading-conclusion">Conclusion</h2>
<p><code>pytest</code> has some great features, such as <code>autouse</code> fixtures. They make it easier to set up and tear down unit tests, but if we ever want to disable one, things get trickier.</p>
<p>In this post, we learned how to disable an autouse fixture in pytest by marking the tests with a <a target="_blank" href="https://docs.pytest.org/en/6.2.x/mark.html">custom pytest mark</a>. I hope you enjoyed this article and see you next time.</p>
<p>Other posts you may like:</p>
<ul>
<li><p><a target="_blank" href="https://miguendes.me/3-ways-to-test-api-client-applications-in-python">Learn how to unit test REST APIs in Python with Pytest by example.</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/7-pytest-features-and-plugins-that-will-save-you-tons-of-time">7 pytest Features and Plugins That Will Save You Tons of Time</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/how-to-use-fixtures-as-arguments-in-pytestmarkparametrize">How to Use Fixtures as Arguments in pytest.mark.parametrize</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/how-to-check-if-an-exception-is-raised-or-not-with-pytest">How to Check if an Exception Is Raised (or Not) With pytest</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/7-pytest-plugins-you-must-definitely-use">7 pytest Plugins You Must Definitely Use</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/how-to-test-complex-data-in-python">How to Unit Test Complex Data Like Numpy Arrays in Python</a></p>
</li>
</ul>
<p>References:</p>
<p><a target="_blank" href="https://stackoverflow.com/questions/38748257/disable-autouse-fixtures-on-specific-pytest-marks">Disable autouse fixtures on specific pytest marks</a></p>
<p><a target="_blank" href="https://stackoverflow.com/questions/39558812/pytest-is-there-a-way-to-ignore-an-autouse-fixture">pytest - is there a way to ignore an autouse fixture?</a></p>
<p>This article was originally published at <a target="_blank" href="https://miguendes.me/pytest-disable-autouse">https://miguendes.me</a></p>
]]></content:encoded></item><item><title><![CDATA[15 Easy Ways to Trim a String in Python]]></title><description><![CDATA[I'm not gonna lie. There are multiple ways you can trim a string in Python.
But... the truth is, you don't need to know every one of them.
In this article, you'll see only the most important techniques, such as stripping leading and trailing spaces (...]]></description><link>https://miguendes.me/python-trim-string</link><guid isPermaLink="true">https://miguendes.me/python-trim-string</guid><category><![CDATA[Python]]></category><category><![CDATA[Tutorial]]></category><category><![CDATA[python beginner]]></category><dc:creator><![CDATA[Miguel Brito]]></dc:creator><pubDate>Sun, 03 Oct 2021 08:11:26 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1633073121953/hhDCfuqCs.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I'm not gonna lie. There are multiple ways you can trim a string in Python.</p>
<p>But... the truth is, you don't need to know every one of them.</p>
<p>In this article, you'll see only the most important techniques, such as stripping leading and trailing spaces (as well as the ones inside the string). You'll also learn how to remove tabs, newlines, carriage return (CRLF), and other characters. And we'll be using nothing more than native methods and regex—no external libraries required!</p>
<p>By the end of this article, you'll have mastered:</p>
<ul>
<li><p><a class="post-section-overview" href="#how-to-trim-characters-from-a-string">How to trim a string</a></p>
<ul>
<li><p><a class="post-section-overview" href="#stripping-leading-whitespace-from-beginning-of-a-string">by stripping leading whitespace from the beginning</a></p>
</li>
<li><p><a class="post-section-overview" href="#stripping-trailing-whitespace-from-end-of-a-string">by stripping trailing whitespace from the end</a></p>
</li>
<li><p><a class="post-section-overview" href="#removing-spaces-from-from-start-and-end-of-a-string">by removing spaces from the start and end of a string</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#how-to-trim-newlines">How to trim newlines</a></p>
</li>
<li><p><a class="post-section-overview" href="#how-to-trim-carriage-return-crlf">How to trim carriage return (CRLF)</a></p>
</li>
<li><p><a class="post-section-overview" href="#how-to-trim-tabs">How to trim tabs</a></p>
</li>
<li><p><a class="post-section-overview" href="#how-to-trim-a-combination-of-characters-from-a-string">How to trim a combination of characters from a string</a></p>
</li>
<li><p><a class="post-section-overview" href="#how-to-remove-multiple-spaces-inside-a-string">How to remove multiple spaces inside a string</a></p>
<ul>
<li><p><a class="post-section-overview" href="#removing-only-duplicates">by removing only duplicates</a></p>
</li>
<li><p><a class="post-section-overview" href="#removing-all-spaces">by removing all spaces</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#how-to-strip-a-list-of-strings">How to strip a list of strings</a></p>
</li>
<li><a class="post-section-overview" href="#how-to-strip-an-numpy-array-of-strings">How to strip a (NumPy) array of strings</a></li>
</ul>
<h2 id="heading-how-to-trim-characters-from-a-string">How to Trim Characters From a String</h2>
<p>Trimming a string means deleting certain chars from the start, the end, or both sides of a string. Removing unwanted chars makes it easier to <a target="_blank" href="https://miguendes.me/python-compare-strings">compare strings</a> and can prevent hard-to-debug issues.</p>
<p>You can remove any kind of character, but usually what we're interested in is deleting blank spaces, newlines, carriage returns (CRLF), tabs and other special symbols.</p>
<p>In this section, we're going to see how to remove leading and trailing whitespace, newline characters, carriage returns (CRLF), and tabs.</p>
<h3 id="heading-stripping-leading-whitespace-from-beginning-of-a-string">Stripping Leading Whitespace From Beginning of a String</h3>
<p>The <code>str</code> class has a very convenient method to trim leading spaces named <a target="_blank" href="https://docs.python.org/3/library/stdtypes.html#str.lstrip"><code>str.lstrip</code></a>, a shorthand for "left-strip", since it trims a string from the left-hand side. You can think of it as a left trim.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'   hello   '</span>.lstrip()
<span class="hljs-string">'hello   '</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631520853178/hYHhQ7pef.png" alt="using python .lstrip method to remove leading spaces from a string" /></p>
<p>When called with no arguments, <code>str.lstrip</code> removes all leading whitespace. But if all you want is to strip only the first char, there are two ways of doing this. The first one assumes that there will always be at least one whitespace char at the beginning of the string. If that's the case, then you can just slice it.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>s = <span class="hljs-string">'  hello'</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>s = s[<span class="hljs-number">1</span>:]
<span class="hljs-meta">&gt;&gt;&gt; </span>s
<span class="hljs-string">' hello'</span>
</code></pre>
<p>If there's no guarantee of that, we'll need to check first if the string starts with space.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">strip_first</span>(<span class="hljs-params">s: str, ch: str = <span class="hljs-string">' '</span></span>) -&gt; str:</span>
     <span class="hljs-keyword">if</span> s <span class="hljs-keyword">and</span> s[<span class="hljs-number">0</span>] == ch:
         <span class="hljs-keyword">return</span> s[<span class="hljs-number">1</span>:]
     <span class="hljs-keyword">return</span> s

<span class="hljs-meta">&gt;&gt;&gt; </span>strip_first(<span class="hljs-string">'hello'</span>)
<span class="hljs-string">'hello'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>strip_first(<span class="hljs-string">'   hello'</span>)
<span class="hljs-string">'  hello'</span>
</code></pre>
<h3 id="heading-stripping-trailing-whitespace-from-end-of-a-string">Stripping Trailing Whitespace From End of a String</h3>
<p>The way to remove trailing spaces from the end of the string is to use <a target="_blank" href="https://docs.python.org/3/library/stdtypes.html#str.rstrip"><code>str.rstrip</code></a>. </p>
<p>This method takes an optional string of <em>chars</em> and trims the string from the right. It removes every trailing char that matches one of those you passed, and stops as soon as it finds one that doesn't match. If you don't pass anything, <code>str.rstrip()</code> removes whitespace. You can think of it as a right trim.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'   hello   '</span>.rstrip()
<span class="hljs-string">'   hello'</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'***hello***'</span>.rstrip(<span class="hljs-string">'*'</span>)
<span class="hljs-string">'***hello'</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631521195422/3V9MhcO_a.png" alt="using python .rstrip method to remove trailing spaces from the end of a string" /></p>
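<p>Because the argument is treated as a set of characters rather than a suffix, <code>rstrip</code> can remove more than you expect. The short sketch below contrasts it with <code>str.removesuffix</code>, which requires Python 3.9 or later; the file name is made up for the example.</p>

```python
filename = "test.txt"

# rstrip treats ".txt" as the set of characters {'.', 't', 'x'}
stripped = filename.rstrip(".txt")
# stripped is 'tes' because the final 't' of 'test' is also in the set

# removesuffix (Python 3.9+) removes the exact trailing substring
without_suffix = filename.removesuffix(".txt")
# without_suffix is 'test'
```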
<p>Sometimes you might want to trim only the last character of a string. We can use the same logic from the previous example: check if the last char is a space, and use slicing to remove it.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">strip_last</span>(<span class="hljs-params">s: str, ch: str = <span class="hljs-string">' '</span></span>) -&gt; str:</span>
     <span class="hljs-keyword">if</span> s <span class="hljs-keyword">and</span> s[<span class="hljs-number">-1</span>] == ch:
         <span class="hljs-keyword">return</span> s[:<span class="hljs-number">-1</span>]
     <span class="hljs-keyword">return</span> s


<span class="hljs-meta">&gt;&gt;&gt; </span>strip_last(<span class="hljs-string">'hello'</span>)
<span class="hljs-string">'hello'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>strip_last(<span class="hljs-string">'hello '</span>)
<span class="hljs-string">'hello'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>strip_last(<span class="hljs-string">''</span>)
<span class="hljs-string">''</span>
</code></pre>
<h3 id="heading-removing-spaces-from-from-start-and-end-of-a-string">Removing Spaces From Start and End of a String</h3>
<p>If all you want is to remove whitespaces from start and end of string, <a target="_blank" href="https://docs.python.org/3/library/stdtypes.html#str.strip"><code>str.strip</code></a> will serve you better. </p>
<p>This method trims both sides of the string. And just like <code>str.lstrip</code> and <code>str.rstrip</code>, you can pass any combination of chars as an argument, and it removes them from both ends.</p>
<blockquote>
<p>⚠️ WARNING ⚠️: A common misconception is that Python has a <code>trim()</code> function. It doesn't; <code>str.strip</code> is the equivalent.</p>
</blockquote>
<pre><code class="lang-python"><span class="hljs-comment"># by default, strip removes whitespaces</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'   hello   '</span>.strip()
<span class="hljs-string">'hello'</span>
<span class="hljs-comment"># but you can also strip other characters</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'***hello***'</span>.strip(<span class="hljs-string">'*'</span>)
<span class="hljs-string">'hello'</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631521436839/b8jUI1YxQ.png" alt="using python .strip to remove spaces from both sides of a string" /></p>
<h2 id="heading-how-to-trim-newlines">How to Trim Newlines</h2>
<p>We've seen how <code>str.strip</code> can remove blank spaces from both sides of a string. I've also mentioned that this method takes a <em>chars</em> argument that you can use to pass a combination of characters you want to trim.</p>
<p>To trim line breaks, you can pass <code>\n</code> and it will strip all newlines from both sides of the string.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>s = <span class="hljs-string">"""
<span class="hljs-meta">... </span>
<span class="hljs-meta">... </span>
<span class="hljs-meta">... </span> hello
<span class="hljs-meta">... </span>
<span class="hljs-meta">... </span>
<span class="hljs-meta">... </span>"""</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>s
<span class="hljs-string">'\n\n\n hello\n\n\n'</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>s.strip(<span class="hljs-string">'\n'</span>)
<span class="hljs-string">' hello'</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631522146211/eWAv1hx_1m.png" alt="strip new lines from a string using .strip method" /></p>
<h2 id="heading-how-to-trim-carriage-return-crlf">How to Trim Carriage Return (CRLF)</h2>
<p>Carriage Return (<em>CR</em>) and Line Feed (<em>LF</em>) are control characters that, when concatenated as <code>\r\n</code>, mark the end of a line. This is how Microsoft Windows, Symbian OS and other non-Unix operating systems represent a new line <a target="_blank" href="https://stackoverflow.com/a/1552782">[source]</a>.</p>
<p>Removing them from a string is the same as removing a single newline. You pass <code>\r\n</code> to <code>str.strip</code> and it removes any leading or trailing <code>\r</code> and <code>\n</code> characters.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>s = <span class="hljs-string">"  hello world\r\n\r\n"</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>print(s)
  hello world


<span class="hljs-meta">&gt;&gt;&gt; </span>s.strip(<span class="hljs-string">'\r\n'</span>)
<span class="hljs-string">'  hello world'</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631523169122/tXciFhqkn.png" alt="trimming carriage return - line feed (CRLF) from a string in python" /></p>
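<p>When the text spans several lines with mixed line endings, <code>str.splitlines</code> is another option worth knowing. The sketch below uses a made-up string and shows that <code>rstrip('\r\n')</code> leaves other trailing whitespace alone.</p>

```python
raw = "first line  \r\nsecond line\nthird line\r\n"

# splitlines understands "\n", "\r\n" and "\r" line endings
lines = raw.splitlines()
# lines is ['first line  ', 'second line', 'third line']

# rstrip("\r\n") drops only line-ending characters and keeps trailing spaces
cleaned = "first line  \r\n".rstrip("\r\n")
# cleaned is 'first line  '
```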
<h2 id="heading-how-to-trim-tabs">How to Trim Tabs</h2>
<p>If you are following this guide from the beginning, you might already know how to do this. Trimming tabs from a string in Python works the same as for other characters: you use <code>str.strip</code> and pass the <code>'\t'</code> string to it.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>s = <span class="hljs-string">"\t\t\t  hello  world \t"</span>       
<span class="hljs-meta">&gt;&gt;&gt; </span>s
<span class="hljs-string">'\t\t\t  hello  world \t'</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>print(s)
              hello  world     
<span class="hljs-meta">&gt;&gt;&gt; </span>s.strip(<span class="hljs-string">'\t'</span>)
<span class="hljs-string">'  hello  world '</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631523357785/9OWG3y_bh.png" alt="stripping tabs from a string using strip method in python" /></p>
<p>And that's it!</p>
<h2 id="heading-how-to-trim-a-combination-of-characters-from-a-string">How to Trim a Combination of Characters From a String</h2>
<p>As I mentioned before, <code>str.strip</code> takes a string as its argument, not just a single char. The string is treated as a set of chars: every char in it is removed from the beginning and end of your string, in whatever order they appear.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>s = <span class="hljs-string">"  \ns hello world \n    s"</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>s    
<span class="hljs-string">'  \ns hello world \n    s'</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>print(s)

s hello world 
    s
<span class="hljs-meta">&gt;&gt;&gt; </span>s.strip(<span class="hljs-string">'\n s'</span>)
<span class="hljs-string">'hello world'</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631695493856/iYeyK5RyL.png" alt="trimming a combination of more than one char from both sides of a string in python" /></p>
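<p>To make the "set of chars" behaviour concrete, here is a small sketch. It reuses the string from the example above and adds a second, made-up one.</p>

```python
s = "  \ns hello world \n    s"

# The order of the characters in the argument does not matter
same_result = s.strip("\n s") == s.strip("s \n")
# same_result is True

# Matching characters in the middle of the string are left alone
inner = "--hello--world--".strip("-")
# inner is 'hello--world'
```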
<h2 id="heading-how-to-remove-multiple-spaces-inside-a-string">How to Remove Multiple Spaces Inside a String</h2>
<p>Sometimes you want to do more than trimming; let's say you want to remove spaces inside the string. There are two ways of doing this: one is to remove only the duplicates; the other is to remove all spaces.</p>
<h3 id="heading-removing-only-duplicates">Removing Only Duplicates</h3>
<p>To remove only the duplicated spaces, you can use the regex module <a target="_blank" href="https://docs.python.org/3/library/re.html"><code>re</code></a>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> re
<span class="hljs-meta">&gt;&gt;&gt; </span>s = <span class="hljs-string">"   Python   is really   a    great language.    "</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>re.sub(<span class="hljs-string">r"\s+"</span>, <span class="hljs-string">" "</span>, s)
<span class="hljs-string">' Python is really a great language. '</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631690536081/xryOFAo3I.png" alt="removing duplicate spaces in a string in python using re module" /></p>
<p>This collapses every run of consecutive whitespace into a single space. What if you want to do not only that, but also trim the string by removing the leading and trailing blanks?</p>
<p>One way is to split the string and then join it back like so:</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>s = <span class="hljs-string">"   Python   is really   a    great language.    "</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">" "</span>.join(s.split())
<span class="hljs-string">'Python is really a great language.'</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-comment"># This gives the same result as using regex and then stripping the whitespace</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>re.sub(<span class="hljs-string">r"\s+"</span>, <span class="hljs-string">" "</span>, s).strip()
<span class="hljs-string">'Python is really a great language.'</span>
</code></pre>
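<p>If you do this often, it can be handy to wrap the split-and-join trick in a small function. The sketch below is only an illustration: <code>normalize_spaces</code> is a made-up name, not something from the standard library.</p>
<pre><code class="lang-python">import re


def normalize_spaces(text):
    # Hypothetical helper: collapse whitespace runs and trim both ends.
    return " ".join(text.split())


s = "   Python   is really   a    great language.    "

# Both approaches should agree on this input
assert normalize_spaces(s) == re.sub(r"\s+", " ", s).strip()

print(normalize_spaces("tabs\tand\nnewlines   too"))  # tabs and newlines too
</code></pre>
<p>Because <code>str.split()</code> with no arguments splits on any whitespace, tabs and newlines are collapsed too, just like with <code>\s+</code>.</p>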
<h3 id="heading-removing-all-spaces">Removing All Spaces</h3>
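<p>Now, if you want to remove all whitespace from your string, either use regex or call the <a target="_blank" href="https://docs.python.org/3/library/stdtypes.html#str.replace"><code>str.replace</code></a> method. Keep in mind that the pattern <code>\s+</code> matches any whitespace, including tabs and newlines, whereas <code>replace(' ', '')</code> removes only the space character.</p>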
<h4 id="heading-using-re-regex-module">Using <code>re</code> (regex module)</h4>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> re
<span class="hljs-meta">&gt;&gt;&gt; </span>s = <span class="hljs-string">"   Python   is really   a    great language.    "</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>re.sub(<span class="hljs-string">r"\s+"</span>, <span class="hljs-string">""</span>, s)
<span class="hljs-string">'Pythonisreallyagreatlanguage.'</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631690673798/4pWbyJlMu.png" alt="removing all spaces from a string in python using regex" /></p>
<h4 id="heading-using-replace">Using <code>replace</code></h4>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>s = <span class="hljs-string">"   Python   is really   a    great language.    "</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>s.replace(<span class="hljs-string">' '</span>, <span class="hljs-string">''</span>)
<span class="hljs-string">'Pythonisreallyagreatlanguage.'</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631690935792/4dTCkSt_e.png" alt="removing all spaces from a string using .replace" /></p>
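<p>The two approaches give the same result for the string above, but they can differ once tabs or newlines are involved. Here is a small sketch that shows the difference, using a sample string I made up for the occasion:</p>
<pre><code class="lang-python">import re

s = "Python\tis \n great"

# replace() only touches the exact character it is given
only_spaces_removed = s.replace(" ", "")

# \s+ matches spaces, tabs, newlines and other whitespace
all_whitespace_removed = re.sub(r"\s+", "", s)

# split() + join() is a regex-free alternative
also_all_whitespace_removed = "".join(s.split())

print(repr(only_spaces_removed))          # 'Python\tis\ngreat'
print(repr(all_whitespace_removed))       # 'Pythonisgreat'
print(repr(also_all_whitespace_removed))  # 'Pythonisgreat'
</code></pre>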
<h2 id="heading-how-to-strip-a-list-of-strings">How to Strip a List of Strings</h2>
<p>Trimming a list of strings is almost the same as trimming an individual one. The only difference is that you have to iterate over the list and call the <code>str.strip</code> method on each string. A list comprehension, for example, returns a new list with all strings trimmed.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>lst = [<span class="hljs-string">"string1\n"</span>, <span class="hljs-string">"string2\n"</span>, <span class="hljs-string">"string3\n"</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>[s.strip(<span class="hljs-string">'\n'</span>) <span class="hljs-keyword">for</span> s <span class="hljs-keyword">in</span> lst]
[<span class="hljs-string">'string1'</span>, <span class="hljs-string">'string2'</span>, <span class="hljs-string">'string3'</span>]
</code></pre>
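<p>When the strings come from a file or user input, you will often want to trim any kind of whitespace and drop the entries that end up empty. A minimal sketch, assuming a made-up list called <code>lines</code>:</p>
<pre><code class="lang-python">lines = ["  string1\n", "\tstring2  ", "\n", "string3"]

# strip() with no argument removes any leading/trailing whitespace
stripped = [line.strip() for line in lines]

# keep only the entries that still have content after trimming
non_empty = [line.strip() for line in lines if line.strip()]

print(stripped)   # ['string1', 'string2', '', 'string3']
print(non_empty)  # ['string1', 'string2', 'string3']
</code></pre>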
<h2 id="heading-how-to-strip-an-numpy-array-of-strings">How to Strip a (NumPy) Array of Strings</h2>
<p>It's very common to use <a target="_blank" href="https://numpy.org">NumPy</a> for data science tasks due to its performance and ease of use. </p>
<p>If you have an array of strings and want to trim each one of them, NumPy comes with an efficient vectorized implementation of <a target="_blank" href="https://numpy.org/doc/stable/reference/generated/numpy.char.strip.html#numpy.char.strip"><code>strip</code></a>. </p>
<p>In fact, it also has <code>.lstrip</code>, <code>.rstrip</code>, <code>.replace</code>, and many other <a target="_blank" href="https://numpy.org/doc/stable/reference/routines.char.html">string operations</a>.</p>
<p>The vectorized versions work slightly differently: they are not methods but functions in the <code>numpy.char</code> module. So you must pass both the array and the characters you want to trim.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np
<span class="hljs-meta">&gt;&gt;&gt; </span>arr = np.array([<span class="hljs-string">' helloworld   '</span>, <span class="hljs-string">' hello'</span>])
<span class="hljs-meta">&gt;&gt;&gt; </span>arr
array([<span class="hljs-string">' helloworld   '</span>, <span class="hljs-string">' hello'</span>], dtype=<span class="hljs-string">'&lt;U14'</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>np.char.strip(arr, <span class="hljs-string">' '</span>)
array([<span class="hljs-string">'helloworld'</span>, <span class="hljs-string">'hello'</span>], dtype=<span class="hljs-string">'&lt;U14'</span>)
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631692401628/hWOSR_Yw7.png" alt="python_numpy_1.png" /></p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this post, you learned several ways of trimming a string in Python, including how to trim lists and arrays of strings. Python allows us to strip leading and trailing characters easily. And if, instead of removing the extra chars on each side, you want to remove the ones inside the string, you can count on the <a target="_blank" href="https://docs.python.org/3/library/re.html">regex module</a>. I hope you've found this article helpful and see you next time! </p>
<p>References:</p>
<p>https://stackoverflow.com/questions/761804/how-do-i-trim-whitespace-from-a-string</p>
<p>https://stackoverflow.com/questions/8270092/remove-all-whitespace-in-a-string</p>
<p>https://stackoverflow.com/questions/1546226/is-there-a-simple-way-to-remove-multiple-spaces-in-a-string</p>
<p>Other posts you may like:</p>
<ul>
<li><p><a target="_blank" href="https://miguendes.me/python-isdigit-isnumeric-isdecimal">How to Choose Between isdigit(), isdecimal() and isnumeric() in Python</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/python-compare-strings">How to Compare Two Strings in Python (in 8 Easy Ways)</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/pylint-consider-using-f-string">Pylint: How to fix "c0209: formatting a regular string which could be a f-string (consider-using-f-string)"</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/how-to-implement-a-random-string-generator-with-python">How to Implement a Random String Generator With Python</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/how-to-check-if-a-string-is-a-valid-url-in-python">How to Check If a String Is a Valid URL in Python</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/73-examples-to-help-you-master-pythons-f-strings">Python F-String: 73 Examples to Help You Master It</a></p>
</li>
</ul>
<p>See you next time!</p>
<p>This post was originally published at <a target="_blank" href="https://miguendes.me/python-trim-string">https://miguendes.me</a></p>
]]></content:encoded></item><item><title><![CDATA[How to Choose Between isdigit(), isdecimal() and isnumeric() in Python]]></title><description><![CDATA[In this post, you'll learn the subtle difference between str.isdigit, str.isdecimal, and
str.isnumeric in Python 3 and how to choose the best one for the job.
When processing strings, usually by reading them from some source, you might want to check ...]]></description><link>https://miguendes.me/python-isdigit-isnumeric-isdecimal</link><guid isPermaLink="true">https://miguendes.me/python-isdigit-isnumeric-isdecimal</guid><category><![CDATA[Python]]></category><category><![CDATA[Tutorial]]></category><category><![CDATA[guide]]></category><category><![CDATA[Beginner Developers]]></category><dc:creator><![CDATA[Miguel Brito]]></dc:creator><pubDate>Sat, 18 Sep 2021 08:59:43 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1630917583850/C_HkLpqgE.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In this post, you'll learn the subtle difference between <code>str.isdigit</code>, <code>str.isdecimal</code>, and
<code>str.isnumeric</code> in Python 3 and how to choose the best one for the job.</p>
<p>When processing strings, usually by reading them from some source, you might want to check if the given string is a number. The string class (<code>str</code>) comes with 3 different methods that you can use for that purpose.</p>
<p>Each of them has pros and cons, and understanding the difference between them will save you tons of development and debugging time. </p>
<p>In this article, you will:</p>
<ul>
<li>learn what <code>str.isdigit()</code>, <code>str.isdecimal()</code>, and <code>str.isnumeric()</code> do, their limitations, how to use them, and when you should use them </li>
<li>understand the difference between <code>isdigit</code> vs <code>isnumeric</code> vs <code>isdecimal</code></li>
<li>understand why <code>isdigit</code>, <code>isnumeric</code>, or <code>isdecimal</code> is not working for you</li>
<li>learn how to solve common problems that cannot be easily solved with them, such as:<ul>
<li>how to check whether a string holds a float number</li>
<li>how to use <code>isdigit</code>, <code>isnumeric</code>, or <code>isdecimal</code> with negative numbers</li>
</ul>
</li>
</ul>
<h2 id="table-of-contents">Table of Contents</h2>
<ol>
<li><a class="post-section-overview" href="#how-isdigit-works-and-when-to-use-it">How <code>isdigit()</code> Works and When to Use It</a></li>
<li><a class="post-section-overview" href="#how-isdecimal-works-and-when-to-use-it">How <code>isdecimal()</code> Works and When to Use It</a></li>
<li><a class="post-section-overview" href="#how-isnumeric-works-and-when-to-use-it">How <code>isnumeric()</code> Works and When to Use It</a></li>
<li><p><a class="post-section-overview" href="#solving-common-problems">Solving Common Problems</a></p>
<p> 4.1. <a class="post-section-overview" href="#how-to-check-if-float-numbers-are-digits">How to Check if Float Numbers Are Digits?</a></p>
<p> 4.2. <a class="post-section-overview" href="#how-to-check-if-negative-numbers-are-digits">How to Check if Negative Numbers Are Digits?</a></p>
<p> 4.3. <a class="post-section-overview" href="#why-isdigit-is-not-working-for-me">Why <code>isdigit</code> Is Not Working for Me?</a></p>
</li>
<li><p><a class="post-section-overview" href="#conclusion">Conclusion</a></p>
</li>
</ol>
<h2 id="how-isdigit-works-and-when-to-use-it">How <code>isdigit()</code> Works and When to Use It</h2>
<p><code>str.isdigit()</code> is the most obvious choice if you want to determine if a string - or a character - is a digit in Python. </p>
<p>According to its <a target="_blank" href="https://docs.python.org/3/library/stdtypes.html#str.isdigit">documentation</a>, this method returns <code>True</code> if all characters in the string are digits and the string is not empty; otherwise it returns <code>False</code>.  Let's see some examples:</p>
<pre><code class="lang-python"><span class="hljs-comment"># all characters in the string are digits</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'102030'</span>.isdigit()
<span class="hljs-literal">True</span>

<span class="hljs-comment"># 'a' is not a digit</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'102030a'</span>.isdigit()
<span class="hljs-literal">False</span>

<span class="hljs-comment"># isdigit fails if there's whitespace</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">' 102030'</span>.isdigit()
<span class="hljs-literal">False</span>

<span class="hljs-comment"># it must be at least one char long</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">''</span>.isdigit()
<span class="hljs-literal">False</span>

<span class="hljs-comment"># a dot '.' is not a digit either</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'12.5'</span>.isdigit()
<span class="hljs-literal">False</span>
</code></pre>
<blockquote>
<p>Contrary to what many people think, <code>isdigit</code> is not a function but a method of the <code>str</code>, <code>bytes</code>, and <code>bytearray</code> classes.</p>
</blockquote>
<p>This works well for simple cases, but look again at what happens when a string contains a space:</p>
<pre><code class="lang-python"><span class="hljs-comment"># ' ' (space) is not a digit</span>
In [<span class="hljs-number">8</span>]: <span class="hljs-string">' 102030'</span>.isdigit()
Out[<span class="hljs-number">8</span>]: <span class="hljs-literal">False</span>
</code></pre>
<p>This fails because the string starts with a space. As a result, we cannot use <code>isdigit()</code> as is on strings read from unprocessed sources such as the <code>input()</code> function. You must always remember to <a target="_blank" href="https://miguendes.me/python-trim-string">preprocess the input</a> before checking it with <code>isdigit()</code>. That might be one of the reasons <code>isdigit</code> is not working for you.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631777099538/W-M1rCS5K.png" alt="how python str isdigit deals with integers and floats" /></p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>a = input(<span class="hljs-string">'Enter a number: '</span>)
Enter a number: <span class="hljs-number">56</span> 

<span class="hljs-meta">&gt;&gt;&gt; </span>a
<span class="hljs-string">'56 '</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>a.isdigit()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>a.strip()
<span class="hljs-string">'56'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>a.strip().isdigit()
<span class="hljs-literal">True</span>
</code></pre>
<p>Despite this strict behavior, <code>isdigit</code> has some gotchas. If we read the documentation carefully, it says that the method can also handle "superscript digits".</p>
<blockquote>
<p>Digits include decimal characters and digits that need special handling, such as the compatibility superscript digits.</p>
</blockquote>
<p>But how does that work? Will it return <code>True</code> for strings with superscripts such as <strong>2⁷</strong>? </p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>d = <span class="hljs-string">'2'</span> + <span class="hljs-string">'\u2077'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>d
<span class="hljs-string">'2⁷'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>d.isdigit()
<span class="hljs-literal">True</span>

<span class="hljs-comment"># a string made only of superscripts is accepted</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'⁵'</span>.isdigit()
<span class="hljs-literal">True</span>

<span class="hljs-comment"># and so is a superscript followed by a regular digit</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'⁵5'</span>.isdigit()
<span class="hljs-literal">True</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631777153785/vDEhiFmQ2.png" alt="how python isdigit handling superscript characters" /></p>
<p>It turns out it does! You can actually use it with <code>input()</code>:</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>a = input(<span class="hljs-string">'Enter a number:'</span>)
Enter a number:<span class="hljs-number">2</span>⁷

<span class="hljs-meta">&gt;&gt;&gt; </span>a
<span class="hljs-string">'2⁷'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>a.isdigit()
<span class="hljs-literal">True</span>
</code></pre>
<p>Even though it works well with superscripts, it doesn't handle fraction characters. This method is really about single digits.</p>
<pre><code class="lang-python"><span class="hljs-comment"># fractions in Unicode are not digits</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'⅕'</span>.isdigit()
<span class="hljs-literal">False</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631777194798/-jiwnzqFT.png" alt="image showing how python isdigit work with fraction chars in unicode" /></p>
<p>As we can see, <code>str.isdigit()</code> works really well with Unicode characters. If we take a look at the unit test suite for this method, we can see some interesting test cases.</p>
<pre><code class="lang-python"><span class="hljs-comment"># https://github.com/python/cpython/blob/3.10/Lib/test/test_unicode.py#L704</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_isdigit</span>(<span class="hljs-params">self</span>):</span>
        super().test_isdigit()
        self.checkequalnofix(<span class="hljs-literal">True</span>, <span class="hljs-string">'\u2460'</span>, <span class="hljs-string">'isdigit'</span>)
        self.checkequalnofix(<span class="hljs-literal">False</span>, <span class="hljs-string">'\xbc'</span>, <span class="hljs-string">'isdigit'</span>)
        self.checkequalnofix(<span class="hljs-literal">True</span>, <span class="hljs-string">'\u0660'</span>, <span class="hljs-string">'isdigit'</span>)

        <span class="hljs-keyword">for</span> ch <span class="hljs-keyword">in</span> [<span class="hljs-string">'\U00010401'</span>, <span class="hljs-string">'\U00010427'</span>, <span class="hljs-string">'\U00010429'</span>, <span class="hljs-string">'\U0001044E'</span>,
                   <span class="hljs-string">'\U0001F40D'</span>, <span class="hljs-string">'\U0001F46F'</span>, <span class="hljs-string">'\U00011065'</span>]:
            self.assertFalse(ch.isdigit(), <span class="hljs-string">'{!a} is not a digit.'</span>.format(ch))
        <span class="hljs-keyword">for</span> ch <span class="hljs-keyword">in</span> [<span class="hljs-string">'\U0001D7F6'</span>, <span class="hljs-string">'\U00011066'</span>, <span class="hljs-string">'\U000104A0'</span>, <span class="hljs-string">'\U0001F107'</span>]:
            self.assertTrue(ch.isdigit(), <span class="hljs-string">'{!a} is a digit.'</span>.format(ch))
</code></pre>
<p>The image below shows some of these test cases.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631777231144/HnSeydtl6.png" alt="image showing how python isdigit deal with unicode digits" /></p>
<blockquote>
<p>str.isdigit() works really well with numeric Unicode</p>
</blockquote>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631779373243/qU6qqvuQt.png" alt="image showing that isdigit cannot work with non-numeric values" /></p>
<blockquote>
<p>Unicode characters that don't represent digits are not accepted</p>
</blockquote>
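<p>If you're curious why a given character passes or fails, the standard <code>unicodedata</code> module can help. The sketch below looks up three of the characters from the CPython test cases quoted earlier; the loop itself is just an illustration, not part of the test suite.</p>
<pre><code class="lang-python">import unicodedata

# characters taken from the CPython test cases quoted earlier
for ch in ["\u2460", "\xbc", "\u0660"]:
    name = unicodedata.name(ch)
    # digit() returns the digit value, or the given default when there is none
    value = unicodedata.digit(ch, None)
    print(name, value, ch.isdigit())

# CIRCLED DIGIT ONE 1 True
# VULGAR FRACTION ONE QUARTER None False
# ARABIC-INDIC DIGIT ZERO 0 True
</code></pre>
<p>In these cases, the characters accepted by <code>isdigit()</code> are the ones that have a digit value in the Unicode database.</p>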
<h3 id="summary-of-what-isdigit-cannot-do">Summary of What <code>isdigit</code> Cannot Do</h3>
<p><em>Can it handle whitespace?</em></p>
<p>No</p>
<p><em>Can it handle hexadecimal?</em></p>
<p>No</p>
<p><em>Does it raise an exception?</em></p>
<p>No</p>
<p><em>Does it accept negative digits (with minus sign)?</em></p>
<p>No</p>
<h3 id="when-to-use-it">When to Use it?</h3>
<p>Use <code>str.isdigit</code> when you want to verify that each and every character in a string is a single digit, that is, not punctuation, not a letter, and not negative.</p>
<h2 id="how-isdecimal-works-and-when-to-use-it">How <code>isdecimal()</code> Works and When to Use It</h2>
<p>The <code>str.isdecimal()</code> method is very similar: <a target="_blank" href="https://docs.python.org/3/library/stdtypes.html#str.isdecimal">it returns <code>True</code> if all chars are decimal characters and the string is not empty</a>. Superscripts are <em>NOT</em> decimal characters, so the method returns <code>False</code> for them.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'5'</span>.isdecimal()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'⁵'</span>.isdecimal()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'5⁵'</span>.isdecimal()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'-4'</span>.isdecimal()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'4.5'</span>.isdecimal()
<span class="hljs-literal">False</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631777302775/r0RC4jH0H.png" alt="isdecimal cannot accept superscript" /></p>
<blockquote>
<p>Superscripts are not decimal numbers</p>
</blockquote>
<p><code>isdecimal</code> also accepts Unicode characters that are used to form numbers in base 10 in other scripts. For example, the Arabic-Indic digit zero is considered a decimal character; as a result, <code>'٠'.isdecimal()</code> returns <code>True</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'٠'</span>.isdecimal()
<span class="hljs-literal">True</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631777340755/nUq5RXWof.png" alt="python isdecimal returns true for Arabic-Indic numbers in base 10" /></p>
<blockquote>
<p>Arabic-Indic digits such as '٠' are decimal characters in base 10</p>
</blockquote>
<h3 id="when-to-use-it">When to Use it?</h3>
<p>Use <code>str.isdecimal</code> when you want to verify that each and every character in a string can form a base 10 number. Since punctuation, superscripts, letters, and the minus sign are not decimal characters, strings containing them return <code>False</code>.</p>
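<p>One practical consequence is that <code>isdecimal</code> pairs well with <code>int()</code>, which accepts Unicode decimal digits but rejects superscripts. The helper below is a sketch with a made-up name (<code>to_int_or_none</code>); the non-ASCII characters are written as escapes to keep the example easy to copy.</p>
<pre><code class="lang-python">def to_int_or_none(text):
    # Hypothetical helper: convert only when every char is a decimal character.
    return int(text) if text.isdecimal() else None


arabic_indic_012 = "\u0660\u0661\u0662"  # Arabic-Indic digits zero, one, two
superscript_five = "\u2075"

print(to_int_or_none("42"))              # 42
print(to_int_or_none(arabic_indic_012))  # 12
print(to_int_or_none(superscript_five))  # None

# isdigit() says True for the superscript, yet int() cannot convert it
print(superscript_five.isdigit())        # True
</code></pre>
<p>Had the check used <code>isdigit</code> instead, the superscript would have passed the test and <code>int()</code> would have raised a <code>ValueError</code>.</p>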
<h2 id="how-isnumeric-works-and-when-to-use-it">How <code>isnumeric()</code> Works and When to Use It</h2>
<p>This one overlaps significantly with <code>isdigit</code> and <code>isdecimal</code>. According to the <a target="_blank" href="https://docs.python.org/3/library/stdtypes.html#str.isnumeric">documentation</a>, <code>isnumeric</code> returns <code>True</code> if all characters in the string are numeric and the string is not empty.</p>
<p>The key difference here is the word <em>numeric</em>. What is the difference between a <em>numeric</em> character and a <em>digit</em> character?</p>
<p>The difference is that a digit character has the Unicode property <code>Numeric_Type=Digit</code> or <code>Numeric_Type=Decimal</code>, whereas a numeric character is any Unicode character that represents a numeric value, which also covers <code>Numeric_Type=Numeric</code>, and that includes fractions!</p>
<p>Not only that, <code>isnumeric</code> works well with Roman numerals!</p>
<p>Let's see some examples in action.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'⅕'</span>.isdigit()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'⅕'</span>.isnumeric()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'⁵'</span>.isnumeric()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'5⁵'</span>.isnumeric()
<span class="hljs-literal">True</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631777406082/4WNzWz51J.png" alt="several examples of python's isnumeric method that returns true" /></p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'-4'</span>.isnumeric()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'4.5'</span>.isnumeric()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'5 '</span>.isnumeric()
<span class="hljs-literal">False</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631777441608/bJw5mMmAz.png" alt="isnumeric returns false with float or negative numbers" /></p>
<pre><code class="lang-python"><span class="hljs-comment"># 'ⅮⅪ' is written with Unicode Roman numeral characters and represents 511 in base 10</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'ⅮⅪ'</span>.isnumeric()
<span class="hljs-literal">True</span>

<span class="hljs-comment"># Roman numerals are not digits</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'ⅮⅪ'</span>.isdigit()
<span class="hljs-literal">False</span>

<span class="hljs-comment"># Ascii letters 'D', 'X', and 'I' are not numeric</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'DXI'</span>.isnumeric()
<span class="hljs-literal">False</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1631777465986/mXu1nv-1e.png" alt="isnumeric works well with roman numbers" /></p>
<h3 id="when-to-use-it">When to Use it?</h3>
<p>Use <code>str.isnumeric</code> when you want to verify that each and every character in a string is a valid numeric char, including fractions, superscripts, and Roman numerals. Since punctuation, letters, and the minus sign are not numeric characters, strings containing them evaluate to <code>False</code>.</p>
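<p>To see the three methods side by side, the sketch below runs them on four sample characters and also prints the numeric value that the <code>unicodedata</code> module reports. The choice of samples is mine, picked to cover each category discussed so far.</p>
<pre><code class="lang-python">import unicodedata

# digit five, superscript five, fraction one fifth, Roman numeral twelve
samples = ["5", "\u2075", "\u2155", "\u216b"]

for ch in samples:
    flags = (ch.isdecimal(), ch.isdigit(), ch.isnumeric())
    # numeric() returns the character's value as a float
    print(unicodedata.name(ch), flags, unicodedata.numeric(ch))

# DIGIT FIVE (True, True, True) 5.0
# SUPERSCRIPT FIVE (False, True, True) 5.0
# VULGAR FRACTION ONE FIFTH (False, False, True) 0.2
# ROMAN NUMERAL TWELVE (False, False, True) 12.0
</code></pre>
<p>For these samples, each method accepts everything the previous one accepts plus a bit more: <code>isdecimal</code> is the strictest and <code>isnumeric</code> the most permissive.</p>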
<h2 id="solving-common-problems">Solving Common Problems</h2>
<p>In this section, we'll see how to fix the most common problems when using <code>isdigit</code>, <code>isnumeric</code>, and <code>isdecimal</code>.</p>
<h3 id="how-to-check-if-float-numbers-are-digits">How to Check if Float Numbers Are Digits?</h3>
<p>The best way to check that is to try to convert the string to <code>float</code>. </p>
<p>If the <code>float</code> constructor doesn't raise any exceptions, then the string is a valid float. This follows a <a target="_blank" href="https://docs.python.org/3/glossary.html">Pythonic idiom</a> called EAFP (<em>easier to ask for forgiveness than permission</em>).</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">is_float_digit</span>(<span class="hljs-params">n: str</span>) -&gt; bool:</span>
     <span class="hljs-keyword">try</span>:
         float(n)
         <span class="hljs-keyword">return</span> <span class="hljs-literal">True</span>
     <span class="hljs-keyword">except</span> ValueError:
         <span class="hljs-keyword">return</span> <span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>is_float_digit(<span class="hljs-string">'23.45'</span>)
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>is_float_digit(<span class="hljs-string">'23.45a'</span>)
<span class="hljs-literal">False</span>
</code></pre>
<p><strong>CAUTION</strong>: <code>is_float_digit</code> does not accept superscripts, because <code>float()</code> rejects them! If you need to accept them, one way is to <a target="_blank" href="https://stackoverflow.com/a/7643705">remove the first '.'</a> and then call <code>isdigit()</code> on the result.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>is_float_digit(<span class="hljs-string">'23.45⁵'</span>)
<span class="hljs-literal">False</span>
</code></pre>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">is_float_digit_v2</span>(<span class="hljs-params">n: str</span>) -&gt; bool:</span>
     <span class="hljs-keyword">return</span> n.replace(<span class="hljs-string">'.'</span>, <span class="hljs-string">''</span>, <span class="hljs-number">1</span>).isdigit()

<span class="hljs-meta">&gt;&gt;&gt; </span>is_float_digit_v2(<span class="hljs-string">'23.45⁵'</span>)
<span class="hljs-literal">True</span>
</code></pre>
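<p>One thing to keep in mind with the EAFP version is that <code>float()</code> is more permissive than you might expect: it accepts surrounding whitespace, exponent notation, underscores between digits, and the special values <code>nan</code> and <code>inf</code>. If that's too lenient for your use case, a stricter variant could look like the sketch below, where <code>is_finite_float</code> is a hypothetical name of my own.</p>
<pre><code class="lang-python">import math


def is_finite_float(text):
    # Hypothetical stricter variant: reject nan and inf, which float() accepts.
    try:
        return math.isfinite(float(text))
    except ValueError:
        return False


candidates = ["23.45", " 23.45\n", "1e3", "1_000.5", "nan", "inf", "23.45a"]
for candidate in candidates:
    print(repr(candidate), is_finite_float(candidate))

# '23.45' True
# ' 23.45\n' True
# '1e3' True
# '1_000.5' True
# 'nan' False
# 'inf' False
# '23.45a' False
</code></pre>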
<h3 id="how-to-check-if-negative-numbers-are-digits">How to Check if Negative Numbers Are Digits?</h3>
<p>How you check numbers that start with a minus sign depends on the target type. </p>
<p>Since we're talking about digits here, it makes sense to check whether the string can be converted to <code>int</code>. This is very similar to the EAFP approach discussed for floats. </p>
<p>However, just like the previous approach, it doesn't handle superscripts.</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">is_negative_number_digit</span>(<span class="hljs-params">n: str</span>) -&gt; bool:</span>
     <span class="hljs-keyword">try</span>:
         int(n)
         <span class="hljs-keyword">return</span> <span class="hljs-literal">True</span>
     <span class="hljs-keyword">except</span> ValueError:
         <span class="hljs-keyword">return</span> <span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>is_negative_number_digit(<span class="hljs-string">'-2345'</span>)
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>is_negative_number_digit(<span class="hljs-string">'-2345⁵'</span>)
<span class="hljs-literal">False</span>
</code></pre>
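<p>To fix that, a simple way is to <a target="_blank" href="https://stackoverflow.com/a/28279773">strip the minus sign</a>. Note that <code>lstrip('-')</code> removes every leading minus sign, so a string such as <code>'--5'</code> also passes this check.</p>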
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">is_negative_number_digit_v2</span>(<span class="hljs-params">n: str</span>) -&gt; bool:</span>
     <span class="hljs-keyword">return</span> n.lstrip(<span class="hljs-string">'-'</span>).isdigit()

<span class="hljs-meta">&gt;&gt;&gt; </span>is_negative_number_digit_v2(<span class="hljs-string">'-2345'</span>)
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>is_negative_number_digit_v2(<span class="hljs-string">'-2345⁵'</span>)
<span class="hljs-literal">True</span>
</code></pre>
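<p>If accepting several leading minus signs is a problem, you can remove at most one of them before calling <code>isdigit()</code>. This is just a sketch under that assumption; <code>is_signed_digit</code> is a made-up name.</p>
<pre><code class="lang-python">def is_signed_digit(text):
    # Hypothetical variant: allow at most one leading minus sign.
    unsigned = text[1:] if text.startswith("-") else text
    return unsigned.isdigit()


print(is_signed_digit("-2345"))        # True
print(is_signed_digit("2345"))         # True
print(is_signed_digit("--2345"))       # False
print(is_signed_digit("-"))            # False

# for comparison, lstrip('-') removes both minus signs
print("--2345".lstrip("-").isdigit())  # True
</code></pre>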
<h3 id="why-isdigit-is-not-working-for-me">Why <code>isdigit</code> Is Not Working for Me?</h3>
<p>The most common issue that prevents <code>isdigit</code> / <code>isnumeric</code> / <code>isdecimal</code> from working properly is leading or trailing whitespace in the string. Before using them, it's imperative to <a target="_blank" href="https://miguendes.me/python-trim-string">remove any leading or trailing whitespace, including characters such as newline (<code>\n</code>)</a>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">' 54'</span>.isdigit()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">' 54'</span>.strip().isdigit()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">' 54'</span>.isnumeric()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">' 54'</span>.strip().isnumeric()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">' 54'</span>.isdecimal()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">' 54'</span>.strip().isdecimal()
<span class="hljs-literal">True</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'65\n'</span>.isdigit()
<span class="hljs-literal">False</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'65\n'</span>.strip().isdigit()
<span class="hljs-literal">True</span>
</code></pre>
<h2 id="conclusion">Conclusion</h2>
<p>In this post, we saw the subtle difference between <code>isdigit</code> vs <code>isdecimal</code> vs <code>isnumeric</code> and how to choose the most appropriate string method for your use case. We also saw cases that they cannot handle on their own and how to overcome those limitations.</p>
<p>That's it for today and I hope you've enjoyed this post!</p>
<p>References: </p>
<p>https://docs.python.org/3/library/stdtypes.html</p>
<p>https://www.fileformat.info/info/unicode/char/0660/browsertest.htm</p>
<p>https://stackoverflow.com/a/7643705</p>
<p>https://stackoverflow.com/a/28279773</p>
]]></content:encoded></item><item><title><![CDATA[One Year of Blogging]]></title><description><![CDATA[On the 31st of Aug 2020, I published my first ever blog post. It’s been a wild journey, full of ups and downs, but a very rewarding one. In this post, I will detail everything I’ve learned, show some numbers and plans for the next year to come.
The M...]]></description><link>https://miguendes.me/one-year-of-blogging</link><guid isPermaLink="true">https://miguendes.me/one-year-of-blogging</guid><category><![CDATA[Hashnode]]></category><category><![CDATA[Blogging]]></category><category><![CDATA[Developer Blogging]]></category><dc:creator><![CDATA[Miguel Brito]]></dc:creator><pubDate>Sun, 05 Sep 2021 08:09:26 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1630829100832/aa2enQwrB.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>On the 31st of Aug 2020, I published my first ever blog post. It’s been a wild journey, full of ups and downs, but a very rewarding one. In this post, I will detail everything I’ve learned, show some numbers and plans for the next year to come.</p>
<h2 id="the-motivation">The Motivation</h2>
<p>I started blogging with two goals in mind:</p>
<ul>
<li><p>to share things I find interesting</p>
</li>
<li><p>to improve my writing skills in English</p>
</li>
</ul>
<h3 id="sharing-interesting-stuff">Sharing Interesting Stuff</h3>
<p>The definition of “interesting” is in the eye of the beholder. What I find cool might not catch your attention, and some things will be considered boring. If it’s boring, why share it?</p>
<p>Had I held this mindset, I’d never have gotten started. So I promised myself: I’d write what I considered interesting and if it helps other people, then great! If not, that’s fine!</p>
<p>Adopting this attitude was crucial to help me get started. Going from 0 to something requires more energy than staying consistent. So once we overcome this initial barrier, it’s just a matter of putting in a bit more energy from time to time.</p>
<p>After producing the first few articles, I realized that what was interesting to me was also interesting to other people. Each “thank you” would put more gas into my motivation tank, so I kept going.</p>
<h3 id="improving-my-writing-skills">Improving My Writing Skills</h3>
<p>You might have noticed that English is not my native language. In fact, my English is pretty much self-taught. I started learning it primarily to consume English content and never had the intention to produce anything.</p>
<p>But... what is the point of learning a language and not being able to use it fully? </p>
<p>Even though I could write shorter messages, e-mails and things like that, expressing complex ideas was a challenge to me.</p>
<p>Writing is considerably harder than passive reading, and doing that in a second language is even harder. So I asked myself, “What can I do to improve my writing skills?”</p>
<p>The answer could not be more obvious: write more!</p>
<h2 id="the-downs-part-1">The Downs - part. 1</h2>
<p>I spent quite a long time trying to set up a blog platform. I chose <a target="_blank" href="https://gohugo.io/">hugo</a> because it's fast and has many features, making it very flexible. The only problem is that this flexibility comes with a cost: it's a sea of distraction. </p>
<p>I think I spent almost a month trying to tweak it, testing themes and whatnot. I then realized that I was focusing on the wrong thing. The focus should be on the content, not on the appearance.</p>
<p>Luckily, I bumped into a tweet from <a class="user-mention" href="https://hashnode.com/@Catalinpit">Catalin Pit</a> recommending <a target="_blank" href="https://hashnode.com">hashnode</a>. It was exactly what I was looking for: I set up my blog in less than an hour and wrote my first article in a few days. </p>
<h2 id="the-downs-part-2">The Downs - part. 2</h2>
<p>That end of August was also the end of summer in Europe, and a second wave of Covid was looming. A new lockdown was imminent, and I’d have to stay inside as much as I could. Not only that, I also didn’t have to commute every day, which used to consume about 1h of my day.</p>
<p>With this “free time” available, I set a pretty impressive streak and wrote one article per week. My goal was to publish every Saturday morning. </p>
<p>And I did that until... I went back to Brazil for Christmas and decided that I wouldn’t touch a computer during that time. I had some signs of burnout and thought that would be the way to reverse it.</p>
<p>After the break, I returned to the UK and had little to no motivation to write again. Maybe I was still burned out, I don’t know. The truth is that I wrote only 4 articles after the break, whereas I used to write that many each month.</p>
<p>I’m not sure if I can do 1 post a week ever again. My focus has shifted from quantity to quality. It’s not that my first few articles lacked quality, but keeping this standard requires too much time and energy.</p>
<h2 id="the-ups-the-stats">The Ups - The Stats</h2>
<p>Now it’s time for some interesting stuff: the stats! Up to this point, not considering this article, I’ve written 22 posts. Out of those, 5 bring in most of the organic traffic.</p>
<h3 id="traffic-by-channel">Traffic By Channel</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1630332030067/ODoSRe4DM.png" alt="miguendes_visits.png" /></p>
<p>In 1 year, people have viewed this blog around 129K times, according to Google Analytics. This number might be higher if we consider that <a target="_blank" href="https://plausible.io/blog/google-analytics-adblockers-missing-data">tech-savvy people block GA</a>. Taking only organic search into account, 46% of the views come from search engines such as Google, Bing and Duck Duck Go. It’s actually not bad considering that I'm not an SEO expert.</p>
<h3 id="organic-visits">Organic Visits</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1630332045084/sfBlnGdjN.png" alt="miguendes_organic.png" /></p>
<p>In this chart, we can view the organic traffic every month. I started investing in SEO after subscribing to <a target="_blank" href="https://bloggingfordevs.com/">Monica Lent's Blogging for Devs newsletter</a> by the end of 2020. The things I learned from it really paid off, as you can see in the graph! </p>
<p>The views had been growing steadily until July. The decline presumably has something to do with <a target="_blank" href="https://www.tecmark.co.uk/blog/google-july-21-core-update">Google’s release of an update to its core algorithm</a>.</p>
<p>I experienced the same thing in December, and the traffic will perhaps recover like it did back then.</p>
<p>PS: If you want to learn more about SEO, I definitely recommend <a target="_blank" href="https://bloggingfordevs.com/">Monica's Blogging for Devs</a> and her new course <a target="_blank" href="https://seofordevs.com/?rh_ref=6d97a59b">SEO for Devs</a>.</p>
<h3 id="the-most-popular-ones-top-5-overall">The Most Popular Ones - Top 5 Overall</h3>
<p>Here’s a list of my popular posts according to GA, and backed by <a target="_blank" href="https://hashnode.com">hashnode</a> analytics.</p>
<p><a target="_blank" href="https://miguendes.me/73-examples-to-help-you-master-pythons-f-strings">73 Examples to Help You Master Python’s f-strings</a></p>
<p><a target="_blank" href="https://miguendes.me/the-best-way-to-compare-two-dictionaries-in-python">The Best Way to Compare Two Dictionaries in Python</a></p>
<p><a target="_blank" href="https://miguendes.me/5-hidden-python-features-you-probably-never-heard-of">5 Hidden Python Features You Probably Never Heard Of</a></p>
<p><a target="_blank" href="https://miguendes.me/how-to-shoot-yourself-in-the-foot-with-python-part-1">How to Shoot Yourself in the Foot With Python. Part 1.</a></p>
<p><a target="_blank" href="https://miguendes.me/how-to-use-datetimetimedelta-in-python-with-examples">How to Use datetime.timedelta in Python With Examples</a></p>
<h3 id="the-most-searched-ones-top-5-overall">The Most Searched Ones - Top 5 Overall</h3>
<p>And here's a list of my popular articles based on organic searches only.</p>
<p><a target="_blank" href="https://miguendes.me/the-best-way-to-compare-two-dictionaries-in-python">The Best Way to Compare Two Dictionaries in Python</a></p>
<p><a target="_blank" href="https://miguendes.me/how-to-use-datetimetimedelta-in-python-with-examples">How to Use datetime.timedelta in Python With Examples</a></p>
<p><a target="_blank" href="https://miguendes.me/how-to-pass-multiple-arguments-to-a-map-function-in-python">How to Pass Multiple Arguments to a map Function in Python</a></p>
<p><a target="_blank" href="https://miguendes.me/how-to-check-if-an-exception-is-raised-or-not-with-pytest">How to Check if an Exception Is Raised (or Not) With pytest</a></p>
<p><a target="_blank" href="https://miguendes.me/how-to-use-fixtures-as-arguments-in-pytestmarkparametrize">How to Use Fixtures as Arguments in pytest.mark.parametrize</a></p>
<h2 id="plans-for-the-future">Plans for the Future</h2>
<p>I write mostly about Python and its ecosystem. I’ll probably continue writing about it but I’m also considering adding a bit more of Machine Learning to it. I’ve been working with ML for a good four years and I think I have some cool stuff to share. </p>
<p>For now, my plan is to find a schedule that works for me and that allows me to keep consistent. I’m not sure if I can write one article per week, but one every two weeks seems very doable. Still, the focus will be always on quality first.</p>
<h2 id="conclusion">Conclusion</h2>
<p>That’s it, folks! I hope you liked this post and see you next time!
I'd like to thank the awesome <a class="user-mention" href="https://hashnode.com/@dailydevtips">Chris Bongers</a> for proofreading this article and suggesting a few improvements, thanks mate!</p>
]]></content:encoded></item><item><title><![CDATA[How I Patched Python to Include This Ruby Feature]]></title><description><![CDATA[In this post, I'll present how I changed Python’s source code and compiled from scratch to accept "else-less" if expressions, similar to Ruby's "inline if", also known as conditional modifier 👇

Why?
The idea of having an else-less if expression in ...]]></description><link>https://miguendes.me/what-if-python-had-this-ruby-feature</link><guid isPermaLink="true">https://miguendes.me/what-if-python-had-this-ruby-feature</guid><category><![CDATA[Python]]></category><category><![CDATA[python projects]]></category><dc:creator><![CDATA[Miguel Brito]]></dc:creator><pubDate>Sat, 14 Aug 2021 09:37:58 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1628930803931/X7QUbnfrL.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In this post, I'll present how I changed Python’s source code and compiled from scratch to accept "else-less" if expressions, similar to Ruby's "inline if", also known as <a target="_blank" href="https://docs.ruby-lang.org/en/2.0.0/syntax/control_expressions_rdoc.html"><em>conditional modifier</em></a> 👇</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1628326823840/iHE46R0jv.gif" alt="return_if_python.gif" /></p>
<h2 id="heading-why">Why?</h2>
<p>The idea of having an else-less if expression in Python came to my mind when I had to work with a Ruby service at my past job. Ruby, <a target="_blank" href="https://www.python.org/dev/peps/pep-0020/#id2">contrary to Python</a>, makes lots of things implicit [Citation needed], and this kind of <em>if expression</em> is one of them. I say it's implicit because it returns <code>nil</code> if the expression evaluates to <code>false</code>. This is also called  <a target="_blank" href="https://docs.ruby-lang.org/en/2.0.0/syntax/control_expressions_rdoc.html"><em>conditional modifier</em></a>. </p>
<pre><code class="lang-ruby">$ irb
<span class="hljs-meta">irb(main):001:0&gt;</span> RUBY_VERSION
=&gt; <span class="hljs-string">"2.7.1"</span>
<span class="hljs-meta">irb(main):002:0&gt;</span> a = <span class="hljs-number">42</span> if true
=&gt; <span class="hljs-number">42</span>
<span class="hljs-meta">irb(main):003:0&gt;</span> b = <span class="hljs-number">21</span> if false
=&gt; nil
<span class="hljs-meta">irb(main):004:0&gt;</span> b
=&gt; nil
<span class="hljs-meta">irb(main):005:0&gt;</span> a
=&gt; <span class="hljs-number">42</span>
</code></pre>
<p>In Python, one cannot do that without explicitly adding an <code>else</code> to the expression. In fact, as of <a target="_blank" href="https://github.com/python/cpython/pull/27506">this PR</a>, the interpreter will tell you right away that the <code>else</code> is mandatory in the <code>SyntaxError</code> message. </p>
<pre><code class="lang-python">$ ./python
Python <span class="hljs-number">3.11</span><span class="hljs-number">.0</span>a0 (heads/main:<span class="hljs-number">938e84</span>b4fa, Aug  <span class="hljs-number">6</span> <span class="hljs-number">2021</span>, <span class="hljs-number">08</span>:<span class="hljs-number">59</span>:<span class="hljs-number">36</span>) [GCC <span class="hljs-number">7.5</span><span class="hljs-number">.0</span>] on linux
Type <span class="hljs-string">"help"</span>, <span class="hljs-string">"copyright"</span>, <span class="hljs-string">"credits"</span> <span class="hljs-keyword">or</span> <span class="hljs-string">"license"</span> <span class="hljs-keyword">for</span> more information.
<span class="hljs-meta">&gt;&gt;&gt; </span>a = <span class="hljs-number">42</span> <span class="hljs-keyword">if</span> <span class="hljs-literal">True</span>
  File <span class="hljs-string">"&lt;stdin&gt;"</span>, line <span class="hljs-number">1</span>
    a = <span class="hljs-number">42</span> <span class="hljs-keyword">if</span> <span class="hljs-literal">True</span>
        ^^^^^^^^^^
SyntaxError: expected <span class="hljs-string">'else'</span> after <span class="hljs-string">'if'</span> expression
</code></pre>
<p>However, I find Ruby's if actually very convenient. This convenience became more evident when I had to go back to Python and write things like this:</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>my_var = <span class="hljs-number">42</span> <span class="hljs-keyword">if</span> some_cond <span class="hljs-keyword">else</span> <span class="hljs-literal">None</span>
</code></pre>
<p>So I thought to myself, what would it be like if Python had a similar feature? Could I do it myself? How hard would that be?</p>
<h2 id="heading-how">How?</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1628236985423/DKywoxVRq.webp" alt="Confused Lady Meme - Me trying to figure out Python's source code." /></p>
<blockquote>
<p>Me trying to make sense of CPython's source code.</p>
</blockquote>
<p>Digging into CPython's code and changing the language's syntax didn't sound trivial to me. </p>
<p>Luckily, during the same week, I found out on Twitter that <a target="_blank" href="https://tonybaloney.github.io/">Anthony Shaw</a> had just written a <a target="_blank" href="https://realpython.com/products/cpython-internals-book/">book on CPython Internals</a> and it was available for pre-release. I didn't think twice and bought the book. </p>
<p>I've got to be honest, I'm the kind of person who buys things and doesn't use them immediately. As I had other plans in mind, I let it "gather dust" in my home folder for a while. </p>
<p>Until... I had to work with that Ruby service again. It reminded me of the CPython Internals book and how challenging hacking the guts of Python would be.</p>
<p>The first thing was to go through the book from the very start and try to follow each step. The book focuses on Python 3.9, so in order to follow along, one needs to check out the 3.9 tag, and that's what I did.</p>
<p>I learned about how the code is structured and then how to compile it. The next chapters show how to extend the grammar and add new things, such as a new operator. </p>
<p>As I got familiar with the code base and how to tweak the grammar, I decided to give it a spin and make my own changes to it. </p>
<h3 id="heading-the-first-failed-attempt">The First (Failed) Attempt</h3>
<p>As I started finding my way around CPython's code from the latest main branch, I noticed that lots of things had changed since Python 3.9, yet some fundamental concepts didn't. </p>
<p>My first shot was to dig into the grammar definition and find the if expression rule. The file is currently named <code>Grammar/python.gram</code>. Locating the rule was not difficult; an ordinary CTRL+F for the 'else' keyword was enough.</p>
<pre><code>file: Grammar<span class="hljs-operator">/</span>python.gram
...
expression[expr_ty] (memo):
    <span class="hljs-operator">|</span> invalid_expression
    <span class="hljs-operator">|</span> a<span class="hljs-operator">=</span>disjunction <span class="hljs-string">'if'</span> b<span class="hljs-operator">=</span>disjunction <span class="hljs-string">'else'</span> c<span class="hljs-operator">=</span>expression { _PyAST_IfExp(b, a, c, EXTRA) }
    <span class="hljs-operator">|</span> disjunction
    <span class="hljs-operator">|</span> lambdef
....
</code></pre><p>Now with the rule in hand, my idea was to add one more option to the current if expression where it would match <code>a=disjunction 'if' b=disjunction</code> and <code>c</code> expression would be <code>NULL</code>. </p>
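<p>This new rule should be placed immediately after the complete one. The parser tries the alternatives in order, so if the shorter rule came first, it would always match <code>a=disjunction 'if' b=disjunction</code> and then raise a <code>SyntaxError</code> on a regular <em>if expression</em> with an <code>else</code>.</p>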
<pre><code>expression[expr_ty] (memo):
    <span class="hljs-operator">|</span> invalid_expression
    <span class="hljs-operator">|</span> a<span class="hljs-operator">=</span>disjunction <span class="hljs-string">'if'</span> b<span class="hljs-operator">=</span>disjunction <span class="hljs-string">'else'</span> c<span class="hljs-operator">=</span>expression { _PyAST_IfExp(b, a, c, EXTRA) }
    <span class="hljs-operator">|</span> a<span class="hljs-operator">=</span>disjunction <span class="hljs-string">'if'</span> b<span class="hljs-operator">=</span>disjunction  { _PyAST_IfExp(b, a, NULL, EXTRA) }
    <span class="hljs-operator">|</span> disjunction
    <span class="hljs-operator">|</span> lambdef
....
</code></pre><h4 id="heading-regenerating-the-parser-and-compiling-python-from-source">Regenerating the Parser and Compiling Python From Source</h4>
<p>CPython comes with a <code>Makefile</code> containing lots of useful commands. One of them is the <a target="_blank" href="https://github.com/python/cpython/blob/3.10/Makefile.pre.in#L850_L856"><code>regen-pegen</code> command</a> which converts <code>Grammar/python.gram</code> into <code>Parser/parser.c</code>. </p>
<p>Besides changing the grammar, I had to modify the AST for the <em>if expression</em>. AST stands for Abstract Syntax Tree and it is a way of representing the syntactic structure of the source code as a tree. For more information about ASTs, I highly recommend the <a target="_blank" href="https://craftinginterpreters.com/">Crafting Interpreters book</a> by <a target="_blank" href="https://journal.stuffwithstuff.com/">Robert Nystrom</a>. </p>
<p>Moving on, if you look closely, the rule for the <em>if expression</em> goes like this:</p>
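<p>To get a feel for what such a tree looks like, here is a small sketch using the standard <code>ast</code> module on a stock interpreter. The expression and the name <code>cond</code> are placeholders I made up for illustration.</p>
<pre><code class="lang-python">import ast

# Parse a regular conditional expression (with its mandatory else) in eval mode.
tree = ast.parse("42 if cond else None", mode="eval")
if_exp = tree.body

# The node has three children, matching the three parts of the grammar rule.
print(type(if_exp).__name__)
print("test  :", ast.dump(if_exp.test))
print("body  :", ast.dump(if_exp.body))
print("orelse:", ast.dump(if_exp.orelse))
</code></pre>
<p>The <code>test</code>, <code>body</code> and <code>orelse</code> fields are the same ones that show up later in the C implementation.</p>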
<pre><code>    <span class="hljs-operator">|</span> a<span class="hljs-operator">=</span>disjunction <span class="hljs-string">'if'</span> b<span class="hljs-operator">=</span>disjunction <span class="hljs-string">'else'</span> c<span class="hljs-operator">=</span>expression { _PyAST_IfExp(b, a, c, EXTRA) }
</code></pre><p>This means that, when the parser matches this rule, it calls <code>_PyAST_IfExp</code>, which gives back an <code>expr_ty</code> data structure. So this gave me a clue: in order to implement the behavior of the new rule, I'd need to change <code>_PyAST_IfExp</code>.</p>
<p>To find where it is located, I used my <code>ripgrep</code> skills and searched for it inside the source root.</p>
<pre><code class="lang-bash">$ rg _PyAST_IfExp -C2 .

[OMITTED]
Python/Python-ast.c
2686-
2687-expr_ty
2688:_PyAST_IfExp(expr_ty <span class="hljs-built_in">test</span>, expr_ty body, expr_ty orelse, int lineno, int
2689-             col_offset, int end_lineno, int end_col_offset, PyArena *arena)
2690-{
[OMITTED]
</code></pre>
<p>... And the implementation goes like this:</p>
<pre><code class="lang-C">expr_ty
_PyAST_IfExp(expr_ty test, expr_ty body, expr_ty orelse, <span class="hljs-keyword">int</span> lineno, <span class="hljs-keyword">int</span>
             col_offset, <span class="hljs-keyword">int</span> end_lineno, <span class="hljs-keyword">int</span> end_col_offset, PyArena *arena)
{
    expr_ty p;
    <span class="hljs-keyword">if</span> (!test) {
        PyErr_SetString(PyExc_ValueError,
                        <span class="hljs-string">"field 'test' is required for IfExp"</span>);
        <span class="hljs-keyword">return</span> <span class="hljs-literal">NULL</span>;
    }
    <span class="hljs-keyword">if</span> (!body) {
        PyErr_SetString(PyExc_ValueError,
                        <span class="hljs-string">"field 'body' is required for IfExp"</span>);
        <span class="hljs-keyword">return</span> <span class="hljs-literal">NULL</span>;
    }
    <span class="hljs-keyword">if</span> (!orelse) {
        PyErr_SetString(PyExc_ValueError,
                        <span class="hljs-string">"field 'orelse' is required for IfExp"</span>);
        <span class="hljs-keyword">return</span> <span class="hljs-literal">NULL</span>;
    }
    p = (expr_ty)_PyArena_Malloc(arena, <span class="hljs-keyword">sizeof</span>(*p));
    <span class="hljs-keyword">if</span> (!p)
        <span class="hljs-keyword">return</span> <span class="hljs-literal">NULL</span>;
    p-&gt;kind = IfExp_kind;
    p-&gt;v.IfExp.test = test;
    p-&gt;v.IfExp.body = body;
    p-&gt;v.IfExp.orelse = orelse;
    p-&gt;lineno = lineno;
    p-&gt;col_offset = col_offset;
    p-&gt;end_lineno = end_lineno;
    p-&gt;end_col_offset = end_col_offset;
    <span class="hljs-keyword">return</span> p;
}
</code></pre>
<p>Since I pass <strong>orelse</strong> as <code>NULL</code>, I thought it was just a matter of changing the body of <code>if (!orelse)</code> to assign <code>None</code> to <code>orelse</code>.</p>
<pre><code class="lang-diff">    if (!orelse) {
<span class="hljs-deletion">-       PyErr_SetString(PyExc_ValueError,</span>
<span class="hljs-deletion">-                       "field 'orelse' is required for IfExp");</span>
<span class="hljs-deletion">-       return NULL;</span>
<span class="hljs-addition">+       orelse = Py_None;</span>
    }
</code></pre>
<p>Now time to test it, I compile the code with <code>make -j8 -s</code> and fire up the interpreter.</p>
<pre><code class="lang-bash">$ make -j8 -s                                                                                                                                                            
Python/Python-ast.c: In <span class="hljs-keyword">function</span> ‘_PyAST_IfExp’:
Python/Python-ast.c:2703:16: warning: assignment from incompatible pointer <span class="hljs-built_in">type</span> [-Wincompatible-pointer-types]
         orelse = Py_None;
</code></pre>
<blockquote>
<p>Despite the glaringly obvious warning, I decided to ignore it just to see what happens. 😅</p>
</blockquote>
<pre><code class="lang-python">$ ./python
Python <span class="hljs-number">3.11</span><span class="hljs-number">.0</span>a0 (heads/ruby-<span class="hljs-keyword">if</span>-new-dirty:f92b9133ef, Aug  <span class="hljs-number">2</span> <span class="hljs-number">2021</span>, <span class="hljs-number">09</span>:<span class="hljs-number">13</span>:<span class="hljs-number">02</span>) [GCC <span class="hljs-number">7.5</span><span class="hljs-number">.0</span>] on linux
Type <span class="hljs-string">"help"</span>, <span class="hljs-string">"copyright"</span>, <span class="hljs-string">"credits"</span> <span class="hljs-keyword">or</span> <span class="hljs-string">"license"</span> <span class="hljs-keyword">for</span> more information.
<span class="hljs-meta">&gt;&gt;&gt; </span>a = <span class="hljs-number">42</span> <span class="hljs-keyword">if</span> <span class="hljs-literal">True</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>a
<span class="hljs-number">42</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>b = <span class="hljs-number">21</span> <span class="hljs-keyword">if</span> <span class="hljs-literal">False</span>
[<span class="hljs-number">1</span>]    <span class="hljs-number">16805</span> segmentation fault (core dumped)  ./python
</code></pre>
<p>Ouch! It works for the <code>if True</code> case, but assigning <code>Py_None</code> to <code>expr_ty orelse</code> causes a segfault.</p>
<p>Time to go back to see what is wrong.</p>
<h3 id="heading-the-second-attempt">The Second Attempt</h3>
<p>It wasn't too difficult to figure out where I messed up. <code>orelse</code> is an <code>expr_ty</code> and I'm assigning to it a <code>Py_None</code> which is a <code>PyObject *</code>. Again, thanks to <code>ripgrep</code> I found its definition.</p>
<pre><code class="lang-bash">$ rg constant -tc -C2

Include/internal/pycore_asdl.h
14-typedef PyObject * string;
15-typedef PyObject * object;
16:typedef PyObject * constant;
</code></pre>
<p>Now, how did I find out <code>Py_None</code> was a constant? </p>
<p>Whilst reviewing <code>Grammar/python.gram</code> file, I found that one of the rules for the new pattern matching syntax is defined like this:</p>
<pre><code># Literal patterns are used <span class="hljs-keyword">for</span> equality <span class="hljs-keyword">and</span> <span class="hljs-keyword">identity</span> constraints
literal_pattern[pattern_ty]:
    | <span class="hljs-keyword">value</span>=signed_number !(<span class="hljs-string">'+'</span> | <span class="hljs-string">'-'</span>) { _PyAST_MatchValue(<span class="hljs-keyword">value</span>, EXTRA) }
    | <span class="hljs-keyword">value</span>=complex_number { _PyAST_MatchValue(<span class="hljs-keyword">value</span>, EXTRA) }
    | <span class="hljs-keyword">value</span>=strings { _PyAST_MatchValue(<span class="hljs-keyword">value</span>, EXTRA) }
    | <span class="hljs-string">'None'</span> { _PyAST_MatchSingleton(Py_None, EXTRA) }
</code></pre><p>However, this rule is a <code>pattern_ty</code> not an <code>expr_ty</code>. But that's fine. What really matters is to understand what <code>_PyAST_MatchSingleton</code> actually is. Then, I searched for it in <code>Python/Python-ast.c</code>.</p>
<pre><code class="lang-C">file: Python/Python-ast.c
...
pattern_ty
_PyAST_MatchSingleton(constant value, <span class="hljs-keyword">int</span> lineno, <span class="hljs-keyword">int</span> col_offset, <span class="hljs-keyword">int</span>
                      end_lineno, <span class="hljs-keyword">int</span> end_col_offset, PyArena *arena)
...
</code></pre>
<p>Now back to the "drawing board", I look for the definition of a <code>None</code> node in the grammar. To my great relief, I find it!</p>
<pre><code>atom[expr_ty]:
    <span class="hljs-operator">|</span> NAME
    <span class="hljs-operator">|</span> <span class="hljs-string">'True'</span> { _PyAST_Constant(Py_True, NULL, EXTRA) }
    <span class="hljs-operator">|</span> <span class="hljs-string">'False'</span> { _PyAST_Constant(Py_False, NULL, EXTRA) }
    <span class="hljs-operator">|</span> <span class="hljs-string">'None'</span> { _PyAST_Constant(Py_None, NULL, EXTRA) }
....
</code></pre><p>At this point, I had all the information I needed. To return an <code>expr_ty</code> representing <code>None</code>, I need to create a constant node in the AST by using the <code>_PyAST_Constant</code> function.</p>
<pre><code class="lang-diff">    | a=disjunction 'if' b=disjunction 'else' c=expression { _PyAST_IfExp(b, a, c, EXTRA) }
<span class="hljs-deletion">-   | a=disjunction 'if' b=disjunction { _PyAST_IfExp(b, a, NULL, EXTRA) }</span>
<span class="hljs-addition">+   | a=disjunction 'if' b=disjunction { _PyAST_IfExp(b, a, _PyAST_Constant(Py_None, NULL, EXTRA), EXTRA) }</span>
    | disjunction
</code></pre>
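<p>The same idea can be sketched in pure Python on a stock interpreter: build the <code>IfExp</code> node by hand and fill the missing <code>orelse</code> with a <code>Constant</code> node holding <code>None</code>. This is only an analogy to the grammar action above, and the helpers <code>else_less_if</code> and <code>evaluate</code> are hypothetical names I made up for it.</p>
<pre><code class="lang-python">import ast


def else_less_if(body_src, test_src):
    """Build the AST for `body if test else None` by hand (hypothetical helper)."""
    body = ast.parse(body_src, mode="eval").body
    test = ast.parse(test_src, mode="eval").body
    # Mirror the grammar action: the missing else becomes a Constant node for None.
    node = ast.IfExp(test=test, body=body, orelse=ast.Constant(value=None))
    tree = ast.Expression(body=node)
    # Hand-built nodes lack line and column info, which compile() requires.
    ast.fix_missing_locations(tree)
    return tree


def evaluate(tree):
    """Compile and evaluate an ast.Expression tree (hypothetical helper)."""
    return eval(compile(tree, "sketch", "eval"))


print(evaluate(else_less_if("42", "True")))
print(evaluate(else_less_if("21", "False")))
</code></pre>
<p>Passing a real node for <code>orelse</code>, rather than a bare <code>None</code>, is the Python-level counterpart of handing <code>_PyAST_IfExp</code> a valid <code>expr_ty</code>.</p>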
<p>Now I must revert <code>Python/Python-ast.c</code> as well. Since I'm feeding it a valid <code>expr_ty</code>, it will never be <code>NULL</code>.</p>
<pre><code class="lang-diff">file: Python/Python-ast.c
...
     if (!orelse) {
<span class="hljs-deletion">-        orelse = Py_None;</span>
<span class="hljs-addition">+        PyErr_SetString(PyExc_ValueError,</span>
<span class="hljs-addition">+                        "field 'orelse' is required for IfExp");</span>
<span class="hljs-addition">+        return NULL;</span>
     }
...
</code></pre>
<p>Let's compile it again and see what happens!</p>
<pre><code class="lang-bash">$ make -j8 -s &amp;&amp; ./python                             
Python 3.11.0a0 (heads/ruby-if-new-dirty:25c439ebef, Aug  2 2021, 09:25:18) [GCC 7.5.0] on linux
Type <span class="hljs-string">"help"</span>, <span class="hljs-string">"copyright"</span>, <span class="hljs-string">"credits"</span> or <span class="hljs-string">"license"</span> <span class="hljs-keyword">for</span> more information.
&gt;&gt;&gt; c = 42 <span class="hljs-keyword">if</span> True
&gt;&gt;&gt; c
42
&gt;&gt;&gt; b = 21 <span class="hljs-keyword">if</span> False
&gt;&gt;&gt; <span class="hljs-built_in">type</span>(b)
&lt;class <span class="hljs-string">'NoneType'</span>&gt;
&gt;&gt;&gt;
</code></pre>
<p>WOT!? It works! 🎉🎉🎉</p>
<p>Now, we need to do one more test. Ruby functions allow returning a value if a condition matches and if not, the rest of the function body gets executed. Like this 👇</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1628326726524/LAuTU6e-8.png" alt="ruby_function.png" /></p>
<p>At this point I wonder if that would work out-of-the-box. I rush to the interpreter again and write the same function.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">f</span>(<span class="hljs-params">test</span>):</span>
<span class="hljs-meta">... </span>    <span class="hljs-keyword">return</span> <span class="hljs-number">42</span> <span class="hljs-keyword">if</span> test
<span class="hljs-meta">... </span>    print(<span class="hljs-string">'missed return'</span>)
<span class="hljs-meta">... </span>    <span class="hljs-keyword">return</span> <span class="hljs-number">21</span>
<span class="hljs-meta">... </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>f(<span class="hljs-literal">False</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>f(<span class="hljs-literal">True</span>)
<span class="hljs-number">42</span>
&gt;&gt;&gt;
</code></pre>
<p>Ooopss...</p>
<p><img src="https://memeguy.com/photos/images/fuck-your-dreams-kid-135651.gif" alt="fuck your dreams kid meme" /></p>
<p>The function returns <code>None</code> if <em>test</em> is <code>False</code>... To help me debug this, I summoned the <a target="_blank" href="https://docs.python.org/3/library/ast.html"><code>ast</code> module</a>. The official docs define it like so:</p>
<blockquote>
<p>The ast module helps Python applications to process trees of the Python abstract syntax grammar. The abstract syntax itself might change with each Python release; this module helps to find out programmatically what the current grammar looks like.</p>
</blockquote>
<p>Let's print the AST for this function...</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> ast
<span class="hljs-meta">&gt;&gt;&gt; </span>fc = <span class="hljs-string">'''
<span class="hljs-meta">... </span>def f(test):
<span class="hljs-meta">... </span>    return 42 if test
<span class="hljs-meta">... </span>    print('missed return')
<span class="hljs-meta">... </span>    return 21
<span class="hljs-meta">... </span>'''</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>print(ast.dump(ast.parse(fc), indent=<span class="hljs-number">4</span>))
Module(
    body=[
        FunctionDef(
            name=<span class="hljs-string">'f'</span>,
            args=arguments(
                posonlyargs=[],
                args=[
                    arg(arg=<span class="hljs-string">'test'</span>)],
                kwonlyargs=[],
                kw_defaults=[],
                defaults=[]),
            body=[
                Return(
                    value=IfExp(
                        test=Name(id=<span class="hljs-string">'test'</span>, ctx=Load()),
                        body=Constant(value=<span class="hljs-number">42</span>),
                        orelse=Constant(value=<span class="hljs-literal">None</span>))),
                Expr(
                    value=Call(
                        func=Name(id=<span class="hljs-string">'print'</span>, ctx=Load()),
                        args=[
                            Constant(value=<span class="hljs-string">'missed return'</span>)],
                        keywords=[])),
                Return(
                    value=Constant(value=<span class="hljs-number">21</span>))],
            decorator_list=[])],
    type_ignores=[])
</code></pre>
<p>Now things make more sense: my change to the grammar was just syntactic sugar. It turns an expression like <code>a if b</code> into <code>a if b else None</code>. The problem is that the <code>return</code> statement always executes, returning either the value or <code>None</code>, so the rest of the function is never reached.</p>
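<p>To see the effect without a patched interpreter, here is a minimal sketch in stock Python that writes the desugared form out by hand. The name <code>f_desugared</code> is mine and only serves this illustration.</p>
<pre><code class="lang-python">def f_desugared(test):
    # What the else-less rule turns `return 42 if test` into.
    return 42 if test else None
    # Everything below is unreachable: the return above always executes.
    print('missed return')
    return 21


print(f_desugared(True))   # 42
print(f_desugared(False))  # None
</code></pre>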
<p>We can also look at the <a target="_blank" href="https://en.wikipedia.org/wiki/Bytecode"><em>bytecode</em></a> generated to understand what exactly is executed by the interpreter. And for that, we can use the <a target="_blank" href="https://docs.python.org/3/library/dis.html"><code>dis</code> module</a>. According to the docs:</p>
<blockquote>
<p>The dis module supports the analysis of CPython bytecode by disassembling it.</p>
</blockquote>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> dis
<span class="hljs-meta">&gt;&gt;&gt; </span>dis.dis(f)
  <span class="hljs-number">2</span>           <span class="hljs-number">0</span> LOAD_FAST                <span class="hljs-number">0</span> (test)
              <span class="hljs-number">2</span> POP_JUMP_IF_FALSE        <span class="hljs-number">4</span> (to <span class="hljs-number">8</span>)
              <span class="hljs-number">4</span> LOAD_CONST               <span class="hljs-number">1</span> (<span class="hljs-number">42</span>)
              <span class="hljs-number">6</span> RETURN_VALUE
        &gt;&gt;    <span class="hljs-number">8</span> LOAD_CONST               <span class="hljs-number">0</span> (<span class="hljs-literal">None</span>)
             <span class="hljs-number">10</span> RETURN_VALUE
</code></pre>
<p>What this basically means is that when <em>test</em> is false, execution jumps to offset 8, which pushes <code>None</code> onto the stack and returns it.</p>
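<p>The same check can be done programmatically on an unmodified interpreter. This is only a sketch: <code>f_desugared</code> is a hand-written stand-in for the patched function, and the exact opcodes differ between Python versions, so it looks at what the instructions refer to rather than at their names.</p>
<pre><code class="lang-python">import dis


def f_desugared(test):
    return 42 if test else None
    print('missed return')
    return 21


# Collect every name or constant the compiled instructions refer to.
referenced = [ins.argval for ins in dis.get_instructions(f_desugared)]

# If the compiler dropped the unreachable statements, neither the call to
# print nor its argument shows up in the bytecode.
print('print' in referenced)
print('missed return' in referenced)
</code></pre>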
<h3 id="heading-supporting-return-if">Supporting "return-if"</h3>
<p>To support the same Ruby feature, I can turn the expression <code>return 42 if test</code> into a regular if statement that returns if <code>test</code> is true.</p>
<p>To do that, I needed to add one more rule. This time, it would be a rule that matches the <code>return &lt;value&gt; if &lt;test&gt;</code> piece of code. Not only that, we need a <code>_PyAST_</code> function that creates the node for us. Let's then call it <code>_PyAST_ReturnIfExpr</code>.</p>
<pre><code class="lang-diff">file: Grammar/python.gram

return_stmt[stmt_ty]:
<span class="hljs-addition">+   | 'return' a=star_expressions 'if' b=disjunction { _PyAST_ReturnIfExpr(a, b, EXTRA) }</span>
    | 'return' a=[star_expressions] { _PyAST_Return(a, EXTRA) }
</code></pre>
<p>As mentioned previously, the implementations of all these functions reside in <code>Python/Python-ast.c</code> and their declarations in <code>Include/internal/pycore_ast.h</code>, so I put <code>_PyAST_ReturnIfExpr</code> there.</p>
<pre><code class="lang-diff">file: Include/internal/pycore_ast.h

 stmt_ty _PyAST_Return(expr_ty value, int lineno, int col_offset, int
                       end_lineno, int end_col_offset, PyArena *arena);
<span class="hljs-addition">+stmt_ty _PyAST_ReturnIfExpr(expr_ty value, expr_ty test, int lineno, int col_offset, int</span>
<span class="hljs-addition">+                      end_lineno, int end_col_offset, PyArena *arena);</span>
 stmt_ty _PyAST_Delete(asdl_expr_seq * targets, int lineno, int col_offset, int
                       end_lineno, int end_col_offset, PyArena *arena);
</code></pre>
<pre><code class="lang-diff">file: Python/Python-ast.c

 }

<span class="hljs-addition">+stmt_ty</span>
<span class="hljs-addition">+_PyAST_ReturnIfExpr(expr_ty value, expr_ty test, int lineno, int col_offset, int end_lineno, int</span>
<span class="hljs-addition">+              end_col_offset, PyArena *arena)</span>
<span class="hljs-addition">+{</span>
<span class="hljs-addition">+    stmt_ty ret, p;</span>
<span class="hljs-addition">+    ret = _PyAST_Return(value, lineno, col_offset, end_lineno, end_col_offset, arena);</span>
<span class="hljs-addition">+</span>
<span class="hljs-addition">+    asdl_stmt_seq *body;  </span>
<span class="hljs-addition">+    body = _Py_asdl_stmt_seq_new(1, arena); </span>
<span class="hljs-addition">+    asdl_seq_SET(body, 0, ret);</span>
<span class="hljs-addition">+</span>
<span class="hljs-addition">+    p = _PyAST_If(test, body, NULL, lineno, col_offset, end_lineno, end_col_offset, arena);</span>
<span class="hljs-addition">+</span>
<span class="hljs-addition">+    return p;</span>
<span class="hljs-addition">+}</span>
<span class="hljs-addition">+</span>
 stmt_ty
</code></pre>
<p>Let's pause for a bit to examine the implementation of <code>_PyAST_ReturnIfExpr</code>. Like I mentioned previously, I want to turn <code>return &lt;value&gt; if &lt;test&gt;</code> into <code>if &lt;test&gt;: return &lt;value&gt;</code>. </p>
<p>Both <code>return</code> and the regular <code>if</code> are statements, so in CPython they're represented as <code>stmt_ty</code>. <code>_PyAST_If</code> expects an <code>expr_ty test</code> and a body, which is a sequence of statements. In this case, <code>body</code> is <code>asdl_stmt_seq *body</code>.</p>
<p>As a result, what we really want here is an <code>if</code> statement whose body contains a single statement: <code>return &lt;value&gt;</code>.</p>
<p>CPython provides some convenient functions to build an <code>asdl_stmt_seq *</code>, and one of them is <code>_Py_asdl_stmt_seq_new</code>. So I used it to create the body and add the return statement I created a few lines before with <code>_PyAST_Return</code>.</p>
<p>Once that's done, the last step is to pass the <code>test</code> as well as the <code>body</code> to <code>_PyAST_If</code>. </p>
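<p>The same construction can be mirrored from Python with the <code>ast</code> module, which may make the C code easier to picture. This is a sketch on stock Python, not the C implementation: <code>ast.If</code> and <code>ast.Return</code> play the roles of <code>_PyAST_If</code> and <code>_PyAST_Return</code>, and the names <code>source</code>, <code>guard</code> and <code>namespace</code> are placeholders of mine.</p>
<pre><code class="lang-python">import ast

# Start from a function whose first statement is a placeholder.
source = """
def f(test):
    pass
    print('missed return')
    return 21
"""
tree = ast.parse(source)
func = tree.body[0]

# Build `if test: return 42` by hand: an If node whose body is a
# one-element list holding the Return node.
guard = ast.If(
    test=ast.Name(id='test', ctx=ast.Load()),
    body=[ast.Return(value=ast.Constant(value=42))],
    orelse=[],
)
func.body[0] = guard

# Hand-built nodes carry no line numbers, so fill them in before compiling.
ast.fix_missing_locations(tree)
namespace = {}
exec(compile(tree, 'generated_ast', 'exec'), namespace)
f = namespace['f']

print(f(True))   # 42
print(f(False))  # prints 'missed return', then 21
</code></pre>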
<p>And before I forget, you may be wondering what on earth <code>PyArena *arena</code> is. An <strong>arena</strong> is a CPython abstraction for memory allocation. AST nodes are allocated out of larger blocks owned by the arena and are all released together when the arena is freed, so they never have to be freed one by one. CPython's object allocator uses the same word for the large regions it obtains, on platforms that support it, with <a target="_blank" href="http://man7.org/linux/man-pages/man2/mmap.2.html"><code>mmap()</code></a> and splits into contiguous chunks <a target="_blank" href="https://realpython.com/products/cpython-internals-book/">[reference]</a>.</p>
<p>Now it's time to regenerate the parser and test it one more time.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">f</span>(<span class="hljs-params">test</span>):</span>
<span class="hljs-meta">... </span>    <span class="hljs-keyword">return</span> <span class="hljs-number">42</span> <span class="hljs-keyword">if</span> test
<span class="hljs-meta">... </span>    print(<span class="hljs-string">'missed return'</span>)
<span class="hljs-meta">... </span>    <span class="hljs-keyword">return</span> <span class="hljs-number">21</span>
<span class="hljs-meta">... </span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> dis
<span class="hljs-meta">&gt;&gt;&gt; </span>f(<span class="hljs-literal">False</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>f(<span class="hljs-literal">True</span>)
<span class="hljs-number">42</span>
</code></pre>
<p>Oh no! It doesn't work... </p>
<p><img src="https://i.imgur.com/GnXw2GO.gif" alt="godfather sad face" /></p>
<p>Let's check the bytecode...</p>
<pre><code class="lang-python">
<span class="hljs-meta">&gt;&gt;&gt; </span>dis.dis(f)
  <span class="hljs-number">2</span>           <span class="hljs-number">0</span> LOAD_FAST                <span class="hljs-number">0</span> (test)
              <span class="hljs-number">2</span> POP_JUMP_IF_FALSE        <span class="hljs-number">4</span> (to <span class="hljs-number">8</span>)
              <span class="hljs-number">4</span> LOAD_CONST               <span class="hljs-number">1</span> (<span class="hljs-number">42</span>)
              <span class="hljs-number">6</span> RETURN_VALUE
        &gt;&gt;    <span class="hljs-number">8</span> LOAD_CONST               <span class="hljs-number">0</span> (<span class="hljs-literal">None</span>)
             <span class="hljs-number">10</span> RETURN_VALUE
&gt;&gt;&gt;
</code></pre>
<p>... the same bloody bytecode instructions again!</p>
<h3 id="heading-going-back-to-the-compilers-class">Going Back to the Compilers Class</h3>
<p>At that point, I was clueless. I had no idea what was going on until... I decided to go down the rabbit hole of expanding the grammar rules. </p>
<p>The new rule I added went like this <code>'return' a=star_expressions 'if' b=disjunction { _PyAST_ReturnIfExpr(a, b, EXTRA) }</code>.</p>
<p>My only hypothesis is that <code>a=star_expressions 'if' b=disjunction</code> is being resolved to the else-less rule I added in the beginning. </p>
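<p>Going over the grammar one more time, I confirmed that my theory holds: <code>star_expressions</code> eventually matches <code>a=disjunction 'if' b=disjunction  { _PyAST_IfExp(b, a, NULL, EXTRA) }</code>, so it consumes the whole <code>42 if test</code> before my new rule ever gets to see the <code>'if'</code>.</p>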
<p>The only way to fix this is by getting rid of the <code>star_expressions</code>. So I change the rule to:</p>
<pre><code class="lang-diff"> return_stmt[stmt_ty]:
<span class="hljs-deletion">-    | 'return' a=star_expressions 'if' b=disjunction { _PyAST_ReturnIfExpr(a, b, EXTRA) }</span>
<span class="hljs-addition">+    | 'return' a=disjunction guard=guard !'else' { _PyAST_ReturnIfExpr(a, guard, EXTRA) }</span>
     | 'return' a=[star_expressions] { _PyAST_Return(a, EXTRA) }
</code></pre>
<p>You might be wondering: what is <code>guard</code>, what is <code>!'else'</code>, and what is <code>star_expressions</code>?</p>
<p>This 'guard' is a rule that is part of the pattern matching rules. The new pattern matching feature added in Python 3.10 allows things like this:</p>
<pre><code class="lang-python">match point:
    case Point(x, y) <span class="hljs-keyword">if</span> x == y:
        print(<span class="hljs-string">f"Y=X at <span class="hljs-subst">{x}</span>"</span>)
    case Point(x, y):
        print(<span class="hljs-string">f"Not on the diagonal"</span>)
</code></pre>
<p>And the rule goes like this:</p>
<pre><code><span class="hljs-keyword">guard</span>[expr_ty]: '<span class="hljs-keyword">if</span>' <span class="hljs-keyword">guard</span>=named_expression { <span class="hljs-keyword">guard</span> }
</code></pre><p>With that, I added one more check. To avoid failing with a <code>SyntaxError</code>, we need to make sure the rule matches only code like <code>return value if cond</code>. Thus, to prevent code such as <code>return a if cond else b</code> from being matched prematurely, I added a <code>!'else'</code> to the rule.</p>
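<p>To make the role of <code>!'else'</code> more concrete, here is a deliberately tiny model of the two <code>return_stmt</code> alternatives. It is a toy sketch, not how CPython's PEG parser is implemented: it works on a pre-split list of tokens, and <code>parse_return</code> and its tuple output are inventions of mine. What it does keep is the ordering of the alternatives and the negative lookahead.</p>
<pre><code class="lang-python">def parse_return(tokens):
    """Toy model of the two return_stmt alternatives, tried in order."""
    if not tokens or tokens[0] != 'return':
        raise SyntaxError("expected 'return'")
    rest = tokens[1:]

    # Alternative 1: 'return' value 'if' test !'else'
    if 'if' in rest:
        if_pos = rest.index('if')
        value = rest[:if_pos]
        after_if = rest[if_pos + 1:]
        # The test runs up to the next 'else', or to the end of the line.
        test_end = after_if.index('else') if 'else' in after_if else len(after_if)
        test = after_if[:test_end]
        # Negative lookahead: succeed only if no 'else' follows the test.
        no_else_follows = test_end == len(after_if)
        if value and test and no_else_follows:
            return ('If', test, ('Return', value))

    # Alternative 2: 'return' [expression], the pre-existing rule.
    return ('Return', rest)


print(parse_return('return a if cond'.split()))
print(parse_return('return a if cond else b'.split()))
</code></pre>
<p>The first call takes the new alternative and produces an <code>If</code> wrapping a <code>Return</code>. The second one is rejected by the lookahead and falls through to the plain <code>return</code> rule, which leaves the regular if-else expression intact.</p>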
<p>Last, but not least, <code>star_expressions</code> allows us to return unpacked iterables. For example:</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">f</span>():</span>
<span class="hljs-meta">... </span>    a = [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>]
<span class="hljs-meta">... </span>    <span class="hljs-keyword">return</span> <span class="hljs-number">0</span>, *a
<span class="hljs-meta">... </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>f()
(<span class="hljs-number">0</span>, <span class="hljs-number">1</span>, <span class="hljs-number">2</span>)
</code></pre>
<p>In this case, <code>0, *a</code> is a tuple, which falls under the category of <code>star_expressions</code>. The regular if-expression doesn't allow using <code>star_expressions</code> with it AFAIK, so changing our new <code>return</code> rule won't be an issue.</p>
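<p>One way to probe that claim on a stock interpreter is to ask the parser directly. The helper <code>parses</code> below is a name I made up for this sketch, and it only covers the one case I am sure about: a starred item used as a branch of a conditional expression.</p>
<pre><code class="lang-python">import ast


def parses(source):
    """Return True if stock Python accepts `source`, False on SyntaxError."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


print(parses('x = 0, *a'))           # True: starred item inside a tuple
print(parses('x = 1 if c else *a'))  # False: starred item as a branch
</code></pre>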
<h3 id="heading-does-it-work-yet">Does it work yet?</h3>
<p>After fixing the return rule, I regenerate the grammar one more time and compile it. </p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">f</span>(<span class="hljs-params">test</span>):</span>
<span class="hljs-meta">... </span>    <span class="hljs-keyword">return</span> <span class="hljs-number">42</span> <span class="hljs-keyword">if</span> test
<span class="hljs-meta">... </span>    print(<span class="hljs-string">'missed return'</span>)
<span class="hljs-meta">... </span>    <span class="hljs-keyword">return</span> <span class="hljs-number">21</span>
<span class="hljs-meta">... </span>
<span class="hljs-meta">&gt;&gt;&gt; </span>f(<span class="hljs-literal">False</span>)
missed <span class="hljs-keyword">return</span>
<span class="hljs-number">21</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>f(<span class="hljs-literal">True</span>)
<span class="hljs-number">42</span>
</code></pre>
<p>And... IT WORKS!!</p>
<p><img src="https://media3.giphy.com/media/VaSVVc3evuyXAdN9rq/giphy.gif" alt /></p>
<p>Let's check the bytecode then...</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> dis
<span class="hljs-meta">&gt;&gt;&gt; </span>dis.dis(f)
  <span class="hljs-number">2</span>           <span class="hljs-number">0</span> LOAD_FAST                <span class="hljs-number">0</span> (test)
              <span class="hljs-number">2</span> POP_JUMP_IF_FALSE        <span class="hljs-number">4</span> (to <span class="hljs-number">8</span>)
              <span class="hljs-number">4</span> LOAD_CONST               <span class="hljs-number">1</span> (<span class="hljs-number">42</span>)
              <span class="hljs-number">6</span> RETURN_VALUE

  <span class="hljs-number">3</span>     &gt;&gt;    <span class="hljs-number">8</span> LOAD_GLOBAL              <span class="hljs-number">0</span> (<span class="hljs-keyword">print</span>)
             <span class="hljs-number">10</span> LOAD_CONST               <span class="hljs-number">2</span> (<span class="hljs-string">'missed return'</span>)
             <span class="hljs-number">12</span> CALL_FUNCTION            <span class="hljs-number">1</span>
             <span class="hljs-number">14</span> POP_TOP

  <span class="hljs-number">4</span>          <span class="hljs-number">16</span> LOAD_CONST               <span class="hljs-number">3</span> (<span class="hljs-number">21</span>)
             <span class="hljs-number">18</span> RETURN_VALUE
&gt;&gt;&gt;
</code></pre>
<p>That's precisely what I wanted. In fact, to make sure, let's also see if the AST is the same as the one for a regular <code>if</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> ast
<span class="hljs-meta">&gt;&gt;&gt; </span>print(ast.dump(ast.parse(fc), indent=<span class="hljs-number">4</span>))
Module(
    body=[
        FunctionDef(
            name=<span class="hljs-string">'f'</span>,
            args=arguments(
                posonlyargs=[],
                args=[
                    arg(arg=<span class="hljs-string">'test'</span>)],
                kwonlyargs=[],
                kw_defaults=[],
                defaults=[]),
            body=[
                If(
                    test=Name(id=<span class="hljs-string">'test'</span>, ctx=Load()),
                    body=[
                        Return(
                            value=Constant(value=<span class="hljs-number">42</span>))],
                    orelse=[]),
                Expr(
                    value=Call(
                        func=Name(id=<span class="hljs-string">'print'</span>, ctx=Load()),
                        args=[
                            Constant(value=<span class="hljs-string">'missed return'</span>)],
                        keywords=[])),
                Return(
                    value=Constant(value=<span class="hljs-number">21</span>))],
            decorator_list=[])],
    type_ignores=[])
&gt;&gt;&gt;
</code></pre>
<p>And indeed it is! </p>
<pre><code class="lang-python">If(
    test=Name(id=<span class="hljs-string">'test'</span>, ctx=Load()),
    body=[
        Return(
            value=Constant(value=<span class="hljs-number">42</span>))],
    orelse=[]),
</code></pre>
<p>This node is the same as the one that would be generated by</p>
<pre><code class="lang-python"><span class="hljs-keyword">if</span> test: <span class="hljs-keyword">return</span> <span class="hljs-number">42</span>
</code></pre>
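<p>For comparison, this is how the regular statement can be inspected on an unmodified interpreter. It is a small sketch that relies on <code>ast.parse</code> only building the tree, so the bare <code>return</code> outside a function is not rejected at this stage.</p>
<pre><code class="lang-python">import ast

# Parse the regular spelling and look at the first statement of the module.
node = ast.parse('if test: return 42').body[0]

print(type(node).__name__)          # If
print(type(node.body[0]).__name__)  # Return
print(node.body[0].value.value)     # 42
print(node.orelse)                  # []
</code></pre>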
<h2 id="heading-if-its-not-tested-its-broken">If It's Not Tested, It's Broken?</h2>
<p>To conclude this journey, I thought it'd be a good idea to add some unit tests as well. Before writing anything new, I wanted to get an idea of what I had broken. </p>
<p>With the code tested manually, I run all tests using the 'test' module, <code>python -m test -j8</code>. The <code>-j8</code> means we'll use 8 processes to run the tests in parallel.</p>
<pre><code class="lang-bash">$ ./python -m <span class="hljs-built_in">test</span> -j8
</code></pre>
<p>To my surprise, only one test failed! 😱</p>
<pre><code class="lang-console">== Tests result: FAILURE ==

406 tests OK.

1 test failed:
    test_grammar
</code></pre>
<p>Since I ran all the tests, the output is hard to navigate, so I re-ran only this one in isolation.</p>
<pre><code class="lang-console">======================================================================
FAIL: test_listcomps (test.test_grammar.GrammarTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/miguel/projects/cpython/Lib/test/test_grammar.py", line 1732, in test_listcomps
    check_syntax_error(self, "[x if y]")
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/miguel/projects/cpython/Lib/test/support/__init__.py", line 497, in check_syntax_error
    with testcase.assertRaisesRegex(SyntaxError, errtext) as cm:
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: SyntaxError not raised

----------------------------------------------------------------------

Ran 76 tests in 0.038s

FAILED (failures=1)
test test_grammar failed
test_grammar failed (1 failure)

== Tests result: FAILURE ==

1 test failed:
    test_grammar

1 re-run test:
    test_grammar

Total duration: 82 ms
Tests result: FAILURE
</code></pre>
<p>And there it is! The test expects a syntax error when running a <code>[x if y]</code> expression, which my grammar change now accepts. We can safely remove that assertion and re-run the tests.</p>
<pre><code class="lang-console">== Tests result: SUCCESS ==

1 test OK.

Total duration: 112 ms
Tests result: SUCCESS
</code></pre>
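<p>For reference, the assertion I removed still holds on an unmodified interpreter. The sketch below mimics what <code>check_syntax_error</code> verifies; <code>raises_syntax_error</code> is a helper name I made up, not part of the test suite.</p>
<pre><code class="lang-python">def raises_syntax_error(source):
    """Return True if compiling `source` raises SyntaxError."""
    try:
        compile(source, 'snippet', 'eval')
    except SyntaxError:
        return True
    return False


print(raises_syntax_error('[x if y]'))         # True on unmodified CPython
print(raises_syntax_error('[x if y else z]'))  # False
</code></pre>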
<p>Now that everything is OK, it's time to add a few more tests. It's important to test not only the new "else-less if" but also the new <code>return</code> statement.</p>
<p>By navigating through the <code>test_grammar.py</code> file we can find a test for pretty much every grammar rule. The first one I looked for is <code>test_if_else_expr</code>. This test doesn't fail, but it only covers the regular if-else expression. To make it more robust, we need to add two new assertions for the else-less form: one where the condition is true and one where it is false.</p>
<pre><code class="lang-python">        self.assertEqual((<span class="hljs-number">6</span> &lt; <span class="hljs-number">4</span> <span class="hljs-keyword">if</span> <span class="hljs-number">0</span>), <span class="hljs-literal">None</span>)
        self.assertEqual((<span class="hljs-number">6</span> &lt; <span class="hljs-number">4</span> <span class="hljs-keyword">if</span> <span class="hljs-number">1</span>), <span class="hljs-literal">False</span>)
</code></pre>
<p>I ran everything again, and all tests passed this time.</p>
<p>P.S.: <code>bool</code> in Python is a <a target="_blank" href="https://docs.python.org/3/c-api/bool.html">subclass of integer</a>, so you can use 1 to denote <code>True</code> and 0 for <code>False</code>.</p>
<pre><code class="lang-console">Ran 76 tests in 0.087s

OK

== Tests result: SUCCESS ==

1 test OK.

Total duration: 174 ms
Tests result: SUCCESS
</code></pre>
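<p>The note about <code>bool</code> is easy to confirm on stock Python. The sketch below also spells out what the two new assertions amount to once the implicit <code>else None</code> is written by hand; the variable names are mine, and I use a comparison that is simply false rather than the exact one from the test.</p>
<pre><code class="lang-python">print(issubclass(bool, int))  # True
print(True == 1, False == 0)  # True True

# Stock-Python spelling of the else-less form: the condition is falsy in
# the first line and truthy in the second.
value_when_false = (6 == 4) if 0 else None
value_when_true = (6 == 4) if 1 else None

print(value_when_false)  # None
print(value_when_true)   # False
</code></pre>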
<p>Lastly, we need the tests for the <code>return</code> rule. They're defined in the <code>test_return</code> test. Just like the if-expression one, this test passes with no modification.</p>
<p>To test this new use case, I created a function that receives a <code>bool</code> argument and returns early if the argument is true. When it's false, it skips the return, just like in the manual tests I had been doing up to this point.</p>
<pre><code class="lang-python">        <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">g4</span>(<span class="hljs-params">test</span>):</span>
            a = <span class="hljs-number">1</span>
            <span class="hljs-keyword">return</span> a <span class="hljs-keyword">if</span> test
            a += <span class="hljs-number">1</span>
            <span class="hljs-keyword">return</span> a

        self.assertEqual(g4(<span class="hljs-literal">False</span>), <span class="hljs-number">2</span>)
        self.assertEqual(g4(<span class="hljs-literal">True</span>), <span class="hljs-number">1</span>)
</code></pre>
<p>Now, save the file and re-run <code>test_grammar</code> one more time.</p>
<pre><code class="lang-console">----------------------------------------------------------------------

Ran 76 tests in 0.087s

OK

== Tests result: SUCCESS ==

1 test OK.

Total duration: 174 ms
Tests result: SUCCESS
</code></pre>
<p>All good in the hood! <code>test_grammar</code> passes with flying colors and the last thing, just in case, is to re-run the full test suite. </p>
<pre><code class="lang-bash">$ ./python -m <span class="hljs-built_in">test</span> -j8
</code></pre>
<p>After a while, all tests pass and I'm very happy with the result.</p>
<h2 id="heading-limitations">Limitations</h2>
<p>If you know Ruby well, by this point you've probably noticed that what I did here is not 100% the same as a conditional modifier. For example, in Ruby you can attach these modifiers to other statements, such as an assignment.</p>
<pre><code class="lang-ruby"><span class="hljs-meta">irb(main):002:0&gt;</span> a = <span class="hljs-number">42</span>
<span class="hljs-meta">irb(main):003:0&gt;</span> a += <span class="hljs-number">1</span> if false
=&gt; nil
<span class="hljs-meta">irb(main):004:0&gt;</span> a
=&gt; <span class="hljs-number">42</span>
<span class="hljs-meta">irb(main):005:0&gt;</span> a += <span class="hljs-number">1</span> if true
=&gt; <span class="hljs-number">43</span>
</code></pre>
<p>I cannot do the same with my implementation.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>a = <span class="hljs-number">42</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>a += <span class="hljs-number">1</span> <span class="hljs-keyword">if</span> <span class="hljs-literal">False</span>
Traceback (most recent call last):
  File <span class="hljs-string">"&lt;stdin&gt;"</span>, line <span class="hljs-number">1</span>, <span class="hljs-keyword">in</span> &lt;module&gt;
TypeError: unsupported operand type(s) <span class="hljs-keyword">for</span> +=: <span class="hljs-string">'int'</span> <span class="hljs-keyword">and</span> <span class="hljs-string">'NoneType'</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>a += <span class="hljs-number">1</span> <span class="hljs-keyword">if</span> <span class="hljs-literal">True</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>a
<span class="hljs-number">43</span>
</code></pre>
<p>What this reveals is that the <code>return</code> rule I created is just a workaround. If I want to make it as close as possible to Ruby's conditional modifier, I'll need to make it work with other statements as well, not just <code>return</code>.</p>
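<p>The <code>TypeError</code> itself follows from the desugaring of the else-less expression: the right-hand side becomes <code>1 if False else None</code>, so Python ends up adding <code>None</code> to an integer. A sketch on stock Python, with the <code>else None</code> written out by hand and <code>error</code> as a placeholder name of mine:</p>
<pre><code class="lang-python">a = 42
error = None
try:
    # Equivalent to the patched interpreter's `a += 1 if False`.
    a += 1 if False else None
except TypeError as exc:
    error = exc

print(type(error).__name__)  # TypeError
print(a)                     # 42, the failed += left `a` untouched
</code></pre>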
<p>Nevertheless, this is fine. My goal with this experiment was just to learn more about Python internals and see how I would navigate a little-known code base written in C and make the appropriate changes to it. And I have to admit that I'm pretty happy with the results!</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Adding new syntax inspired by Ruby is a really nice exercise to learn more about the internals of Python. Of course, if I had to turn this into a PR, the core developers would probably find a few shortcomings, just as I found and described in the previous section. However, since I did this just for fun, I'm very happy with the results.</p>
<p>The source code with all my changes is on my CPython fork under the <a target="_blank" href="https://github.com/miguendes/cpython/tree/ruby-if-new">branch <code>ruby-if-new</code></a>.</p>
<p>I hope you've found this post cool and let's see what comes next!</p>
<p>Other posts you may like:</p>
<ul>
<li><a target="_blank" href="https://miguendes.me/useful-resources-to-learn-pythons-internals-from-scratch">11 Useful Resources To Learn Python's Internals From Scratch</a> </li>
</ul>
<p>See you next time!</p>
]]></content:encoded></item><item><title><![CDATA[11 Useful Resources To Learn Python's Internals From Scratch]]></title><description><![CDATA["How does Python work internally?" 
I have been asking myself that question for the past few months and now it seems that I'm starting to grasp, slowly...
During this time, I have grown a strong interest in learning more about the internal working of...]]></description><link>https://miguendes.me/useful-resources-to-learn-pythons-internals-from-scratch</link><guid isPermaLink="true">https://miguendes.me/useful-resources-to-learn-pythons-internals-from-scratch</guid><category><![CDATA[Python]]></category><category><![CDATA[resources]]></category><category><![CDATA[guide]]></category><dc:creator><![CDATA[Miguel Brito]]></dc:creator><pubDate>Sat, 24 Jul 2021 08:48:42 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1627117381018/DGphYfnIr.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>"How does Python work internally?" </p>
<p>I have been asking myself that question for the past few months, and now it seems that I'm slowly starting to grasp it...</p>
<p>During this time, I have developed a strong interest in learning more about the inner workings of Python. I find the CPython implementation so fascinating that I even started <a target="_blank" href="https://github.com/python/cpython/pulls?q=is%3Apr+author%3Amiguendes+is%3Aclosed">contributing to the language</a>.</p>
<p>The CPython runtime is the most popular one, but there are a few others, like <a target="_blank" href="https://www.pypy.org/">PyPy</a>. Unlike PyPy, CPython's core is written in C, whereas the standard library is a blend of Python and C.</p>
<p>For newcomers, navigating through the code can be a daunting task. Fortunately, there are some nice resources out there that can help to smooth the learning curve.</p>
<p>In this post, I'll show you my favorite resources to start learning more about the inner workings of Python, a.k.a CPython Internals.</p>
<p>By the end of this tutorial, you should be able to:</p>
<ul>
<li>choose the best books that will help you understand Python's source code</li>
<li>learn more about the CPython internals via public talks</li>
<li>find the best blogs and other resources that cover the Python internals</li>
</ul>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ol>
<li><a class="post-section-overview" href="#books-about-cpython-internals">Books About CPython Internals</a>  </li>
<li><a class="post-section-overview" href="#videos-related-to-cpython-internals">Videos Related to CPython Internals</a></li>
<li><a class="post-section-overview" href="#blog-posts">Blog Posts</a></li>
<li><a class="post-section-overview" href="#other-resources-about-cpython-internals">Other Resources About CPython Internals</a></li>
<li><a class="post-section-overview" href="#conclusion">Conclusion</a></li>
</ol>
<h2 id="heading-books-about-cpython-internals">Books About CPython Internals</h2>
<p>Python <a target="_blank" href="https://www.zibtek.com/blog/the-incredible-growth-of-python/">has grown immensely</a> over the past few years, and the more people learn it, the greater the demand for learning materials on advanced topics.</p>
<p>A few years ago, I bet just a few curious developers would have been interested in learning more about <a target="_blank" href="https://github.com/python/cpython">CPython</a>.</p>
<p>These days it’s not unusual to find comprehensive blog posts, videos and books going over the inner guts of Python. Speaking of books, I can surely vouch for two:</p>
<ul>
<li>CPython Internals: Your Guide to the Python 3 Interpreter by Anthony Shaw</li>
<li>Inside The Python Virtual Machine by Obi Ike-Nwosu</li>
</ul>
<h3 id="heading-cpython-internals-your-guide-to-the-python-3-interpreter-a-brief-review">CPython Internals: Your Guide to the Python 3 Interpreter - A Brief Review</h3>
<p>This is the most recently published book about CPython. Among the things it covers, you will find information about:</p>
<ul>
<li>How to build and compile Python from source on MacOS, Linux and Windows</li>
<li>How to set up a development environment</li>
<li>Python’s grammar and language specification</li>
<li>The eval loop</li>
<li>How Python manages memory</li>
<li>How to run the test suite</li>
</ul>
<p>You can find the full table of contents on the <a target="_blank" href="https://realpython.com/products/cpython-internals-book/">Real Python website</a>. It comes in digital format and paperback. The eBook versions are DRM-free and available in three formats: EPUB, MOBI and PDF. The paperback can be found on <a target="_blank" href="https://www.amazon.co.uk/dp/1775093344/">Amazon</a>.</p>
<p><img src="https://i.gr-assets.com/images/S/compressed.photo.goodreads.com/books/1591116551l/53412684._SY475_.jpg" alt="CPython Internals Book Cover" /></p>
<p>The nicest thing to me is that Anthony demonstrates how to add a new operator to the language: the almost-equal. This operator is represented by “~=” and can determine if two numbers are close enough to each other but not exactly equal. It walks through all the fundamental changes required to make this happen, including extending the grammar.</p>
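<p>Stock Python has no <code>~=</code> operator, but the idea behind it is easy to try out. The closest standard-library tool I know of is <code>math.isclose</code>; the snippet below is my own sketch of "almost equal", not code from the book.</p>
<pre><code class="lang-python">import math

# Floating-point rounding makes the exact comparison fail...
print(0.1 + 0.2 == 0.3)              # False
# ...while a tolerance-based comparison treats the values as equal.
print(math.isclose(0.1 + 0.2, 0.3))  # True
</code></pre>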
<p>Thanks to this book, I started contributing to CPython and already have <a target="_blank" href="https://github.com/python/cpython/pulls?q=is%3Apr+author%3Amiguendes+is%3Aclosed">a handful of PRs</a> merged, including enhancements, documentation and bug fixes.</p>
<p>Lastly, the <a target="_blank" href="https://github.com/tonybaloney/cpython-book-samples">code samples are available on GitHub</a>.</p>
<h3 id="heading-inside-the-python-virtual-machine-a-brief-review">Inside The Python Virtual Machine - A Brief Review</h3>
<p>This is another good book covering Python's internals and, if I'm not mistaken, the first one to explore CPython in detail. <em>Inside The Python Virtual Machine</em> is much shorter than <em>CPython Internals</em> but covers some parts of the language in more detail, such as Python objects and code and frame objects. It's <a target="_blank" href="https://leanpub.com/insidethepythonvirtualmachine">available for free as PDF, ePub and Kindle (mobi) on Leanpub</a>, but I definitely encourage you to buy it.</p>
<p><img src="https://d2sofvawe08yqg.cloudfront.net/insidethepythonvirtualmachine/s_hero?1620490351" alt="Inside The Python Virtual Machine Book Cover" /></p>
<p>What I loved the most is the in-depth examination of Python objects. It does a brilliant job dissecting the types, goes over the internals of objects and their attributes and concludes with an explanation of the Method Resolution Order (MRO). I haven’t read it in full yet but I like it so far.</p>
<h2 id="heading-videos-related-to-cpython-internals">Videos Related to CPython Internals</h2>
<p>When it comes to video content, there isn’t much structured material out there. The first one I learned about was P. Guo’s series covering Python 2.7. Unfortunately, he’s taken the series down from his website but you can still find it via an unlisted playlist.</p>
<h3 id="heading-cpython-internals-a-ten-hour-codewalk-through-the-python-interpreter-source-code">CPython internals: A ten-hour codewalk through the Python interpreter source code</h3>
<p>Check out the first video of the series.</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/LhadeL7_EIU"></iframe>

<p>The full playlist can be found <a target="_blank" href="https://www.youtube.com/playlist?list=PLzV58Zm8FuBL6OAv1Yu6AwXZrnsFbbR0S">here</a>.</p>
<h3 id="heading-pablo-salgado-the-soul-of-the-beast">Pablo Salgado - The soul of the beast</h3>
<p>In this talk, Pablo Galindo, who is a Python core developer, looks at Python’s former grammar and its limitations. This presentation is fantastic for those who want to understand the general structure of the compiler. By the end, Pablo shows how you can add a new operator to Python, the 'arrow operator' <code>-&gt;</code>.</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/1_23AVsiQEc"></iframe>

<h3 id="heading-eric-snow-to-gil-or-not-to-gil-the-future-of-multi-core-cpython-pycon-2019">Eric Snow - to GIL or not to GIL: the Future of Multi-Core (C)Python - PyCon 2019</h3>
<p>In this presentation, Eric Snow talks about Python’s GIL (Global Interpreter Lock) and the future developments to circumvent its impact on performance and unlock multi-core capability in Python. It won’t dive into the source code but it’s a good intro to one of the most controversial topics in Python.</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/7RlqbHCCVyc"></iframe>


<h3 id="heading-pablo-galindo-salgado-time-to-take-out-the-rubbish-garbage-collector-pycon-2019">Pablo Galindo Salgado - Time to take out the rubbish: garbage collector - PyCon 2019</h3>
<p>In this talk, Pablo presents the "magic" behind Python’s memory management by detailing the inner workings of the garbage collector and why it matters. He illustrates some gotchas, such as the reason you cannot rely on the <code>__del__</code> method, and describes the reference counter in detail.</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/CLW5Lyc1FN8"></iframe>

<h3 id="heading-cpython-full-course-dev-internals">CPython Full Course - Dev Internals</h3>
<p>I’ve recently found this resource, and it looks very neat. The author examines the implementation of <code>NoneType</code> and also demonstrates how to add a <code>__len__</code> method to <code>int</code>s. </p>
<p>What caught my eye was that the author uses <a target="_blank" href="https://www.cs.cmu.edu/~gilpin/tutorial/"><code>gdb</code></a> to debug the C portion of the code. I find it nice because it’s not so easy to find videos demonstrating how to debug the CPython code.</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/Ql5XYKNZCzY"></iframe>


<h2 id="heading-blog-posts">Blog Posts</h2>
<p>My favorite blog post series on this topic is from <a target="_blank" href="https://tenthousandmeters.com/">Ten thousand meters</a> by Victor. I believe it’s the most complete and up-to-date coverage of the internals of the Python interpreter in the form of blog posts.</p>
<p>Another exceptional series, albeit slightly outdated, is the <a target="_blank" href="https://eli.thegreenplace.net/tag/python-internals"><em>Python internals</em> series</a> from Eli Bendersky. There is lots of interesting stuff, including adding new keywords and great coverage of symbol tables.</p>
<h2 id="heading-other-resources-about-cpython-internals">Other Resources About CPython Internals</h2>
<h3 id="heading-a-python-interpreter-written-in-python-by-allison-kaptur">A Python Interpreter Written in Python by Allison Kaptur</h3>
<p>This article / mini book is a great resource to understand Python’s bytecode. In only 500 lines of code Allison implements <em>Byterun</em>, a Python interpreter written in pure Python. </p>
<p>As impressive as it may sound, <em>Byterun</em> can actually run a variety of simple Python programs. After reading it you’ll have a much better understanding of the Python interpreter.</p>
<p>The booklet can be found on <a target="_blank" href="https://www.aosabook.org/en/500L/a-python-interpreter-written-in-python.html">https://www.aosabook.org</a>. </p>
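<p>To get a first feel for what bytecode looks like before reading the booklet, the standard library's <code>dis</code> module can disassemble any function. The snippet below is only a minimal sketch: the <code>add</code> function is a made-up example, and the exact instruction names differ between Python versions.</p>
<pre><code class="lang-python">import dis


def add(a, b):
    # Hypothetical function, used only to have something to disassemble.
    return a + b


# Print a human-readable listing of the bytecode of add().
dis.dis(add)

# Collect only the instruction names, e.g. to inspect them programmatically.
opnames = [instruction.opname for instruction in dis.get_instructions(add)]
print(opnames)
</code></pre>
<p>Whatever the version, you should see instructions that load the two arguments, one binary operation, and a return instruction.</p>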
<h3 id="heading-pythons-official-docs">Python's Official Docs</h3>
<p>The official documentation is also an excellent place to go. The text can be particularly dry, but that's what you usually expect from a reference. A nice section in particular is the <a target="_blank" href="https://devguide.python.org/exploring/">exploring guide</a>. It goes through the source code’s structure and also links to other resources. The official website also <a target="_blank" href="https://docs.python.org/3.9/c-api/index.html">shows the C-API in considerable detail</a>.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Learning about CPython can be disheartening, but thanks to the increasing demand, more and more materials have been developed, which softens the learning curve considerably. In this article I presented my preferred resources to learn about the internals of Python, ranging from books to videos and blog posts. I hope this can be as useful to you as it is to me. </p>
<p>Other posts you may like:</p>
<ul>
<li><a target="_blank" href="https://miguendes.me/what-if-python-had-this-ruby-feature">How I Patched Python to Include This Ruby Feature</a> </li>
</ul>
<p>See you next time!</p>
]]></content:encoded></item><item><title><![CDATA[How to Implement a Random String Generator With Python]]></title><description><![CDATA[In this post, you'll learn how to create a random string in Python using different methods; but, beware! Some of them only work with Python 3.6+.
By the end of this article, you should be able to:

use the choice function to generate a random string ...]]></description><link>https://miguendes.me/how-to-implement-a-random-string-generator-with-python</link><guid isPermaLink="true">https://miguendes.me/how-to-implement-a-random-string-generator-with-python</guid><category><![CDATA[Python]]></category><category><![CDATA[Tutorial]]></category><dc:creator><![CDATA[Miguel Brito]]></dc:creator><pubDate>Sat, 03 Apr 2021 19:50:55 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1617997854782/m6xJVXUd0.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In this post, you'll learn how to create a random string in Python using different methods; but, beware! Some of them only work with Python 3.6+.</p>
<p>By the end of this article, you should be able to:</p>
<ul>
<li>use the choice function to generate a random string from <code>string.ascii_letters</code>,  <code>string.digits</code> + <code>string.punctuation</code> characters in Python 3</li>
<li>generate a secure random string, useful to create random passwords</li>
</ul>
<h2 id="heading-generating-a-random-string-with-upper-case-lower-case-digits-and-punctuation">Generating a Random String With Upper Case, Lower Case, Digits and Punctuation</h2>
<p>The <code>string</code> module comes with a <a target="_blank" href="https://docs.python.org/3/library/string.html#string-constants">nice set of constants</a> that we can combine with the <code>random</code> module to create our random string.</p>
<h3 id="heading-using-pythons-stringasciiletters-stringdigits-stringpunctuation-characters">Using Python's <code>string.ascii_letters + string.digits + string.punctuation</code> Characters</h3>
<p>By concatenating <code>string.ascii_letters</code> + <code>string.digits</code> + <code>string.punctuation</code>, we will have our pool of characters that we can pick at random using the  <a target="_blank" href="https://docs.python.org/3/library/random.html#random.choices"><code>random.choices()</code></a> method. This method returns a k sized list of elements chosen from the population with replacement. In our case, <code>k</code> will be the size of our random string.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> string

<span class="hljs-meta">&gt;&gt;&gt; </span>string.ascii_letters
<span class="hljs-string">'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>string.digits
<span class="hljs-string">'0123456789'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>string.punctuation
<span class="hljs-string">'!"#$%&amp;\'()*+,-./:;&lt;=&gt;?@[\\]^_`{|}~'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>all_chars = string.ascii_letters + string.digits + string.punctuation

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> random

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">''</span>.join(random.choices(all_chars, k=<span class="hljs-number">10</span>))
<span class="hljs-string">'4`&lt;WJ."=$r'</span>
</code></pre>
<p>This works well, but there is one limitation: <code>random.choices</code> is available only on Python 3.6+. To make that work in older versions, you'll need to call the <code>random.choice</code> function <code>k</code> times.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">''</span>.join((random.choice(all_chars) <span class="hljs-keyword">for</span> _ <span class="hljs-keyword">in</span> range(<span class="hljs-number">10</span>)))
<span class="hljs-string">'d&amp;6Bx5PX(R'</span>
</code></pre>
<h2 id="heading-how-to-generate-a-cryptographically-secure-random-string">How to Generate a Cryptographically Secure Random String</h2>
<p>The previous method works well if all we want is a random string for simple use cases. If we want a random string to use as a password, then we need a more secure method. The reason is that the <code>random</code> module does not use a cryptographically secure pseudo-random number generator: its output is deterministic and can be reproduced from the generator's state.</p>
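<p>A small sketch can make that reproducibility concrete. The helper name <code>seeded_string</code> and the seed value are made up for this illustration; the point is only that two generators started from the same seed produce the same "random" string.</p>
<pre><code class="lang-python">import random
import string

all_chars = string.ascii_letters + string.digits + string.punctuation


def seeded_string(seed, k=10):
    # A dedicated Random instance leaves the module-level generator untouched.
    rng = random.Random(seed)
    return ''.join(rng.choices(all_chars, k=k))


first = seeded_string(42)
second = seeded_string(42)

# Same seed, same sequence: the two strings are identical.
print(first == second)  # True
</code></pre>
<p>Anyone who can recover or guess the generator's state can predict the strings it will produce, which is why this approach is not suitable for passwords.</p>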
<p>As an alternative we must resort to the Operating System’s pseudo-random number generator. The good news is that Python can access that using the <code>random.SystemRandom</code> class, which <a target="_blank" href="https://stackoverflow.com/a/23728630/14386821">ensures that sequences are not reproducible</a>.</p>
<p>We can re-use the previous example and change just one minor detail.</p>
<p>From: <code>random.choices</code> to <code>random.SystemRandom().choices</code></p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>all_chars = string.ascii_letters + string.digits + string.punctuation

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> random

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">''</span>.join(random.SystemRandom().choices(all_chars, k=<span class="hljs-number">10</span>))
<span class="hljs-string">'T$WoW.sdQc'</span>
</code></pre>
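<p>On Python 3.6+, the standard library also ships the <code>secrets</code> module, which is meant for exactly this kind of use. The following is a minimal sketch rather than a complete password policy; the function name <code>generate_password</code> and the default length are assumptions made for the example.</p>
<pre><code class="lang-python">import secrets
import string

all_chars = string.ascii_letters + string.digits + string.punctuation


def generate_password(length=10):
    # secrets.choice draws from the operating system's secure random source.
    return ''.join(secrets.choice(all_chars) for _ in range(length))


password = generate_password(16)
print(password)
</code></pre>
<p>The output changes on every run and cannot be seeded, which is the behaviour we want for secrets.</p>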
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this post we saw 2 different ways of creating random strings in Python. I hope you find it useful.</p>
<p>Other posts you may like:</p>
<ul>
<li><a class="post-section-overview" href="https://miguendes.me/design-patterns-that-make-sense-in-python-simple-factory">Design Patterns That Make Sense in Python: Simple Factory</a></li>
<li><a target="_blank" href="https://miguendes.me/how-to-pass-multiple-arguments-to-a-map-function-in-python">How to Pass Multiple Arguments to a map Function in Python</a></li>
<li><a target="_blank" href="https://miguendes.me/73-examples-to-help-you-master-pythons-f-strings">73 Examples to Help You Master Python's f-strings</a></li>
<li><a target="_blank" href="https://miguendes.me/3-ways-to-test-api-client-applications-in-python">3 Ways to Unit Test REST APIs in Python</a></li>
<li><a target="_blank" href="https://miguendes.me/everything-you-need-to-know-about-pythons-namedtuples">Everything You Need to Know About Python's Namedtuples</a></li>
<li><a target="_blank" href="https://miguendes.me/the-best-way-to-compare-two-dictionaries-in-python">The Best Way to Compare Two Dictionaries in Python</a></li>
<li><a target="_blank" href="https://miguendes.me/5-hidden-python-features-you-probably-never-heard-of">5 Hidden Python Features You Probably Never Heard Of</a></li>
</ul>
<p>See you next time!</p>
<p>This post was originally published at <a target="_blank" href="https://miguendes.me/how-to-implement-a-random-string-generator-with-python">https://miguendes.me</a></p>
]]></content:encoded></item><item><title><![CDATA[How to Sort a Dict in Descending Order by Value With Python]]></title><description><![CDATA[In this post, you will learn how to sort a Python dictionary by value descending i.e. in reverse order.
Say that you have the following dictionary containing your grades associated with a subject. You want to sort the values, in this case the grades,...]]></description><link>https://miguendes.me/how-to-sort-a-dict-in-descending-order-by-value-with-python</link><guid isPermaLink="true">https://miguendes.me/how-to-sort-a-dict-in-descending-order-by-value-with-python</guid><category><![CDATA[Python]]></category><category><![CDATA[Tutorial]]></category><dc:creator><![CDATA[Miguel Brito]]></dc:creator><pubDate>Sat, 20 Feb 2021 19:17:33 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1618337853243/yYc2wMg05.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In this post, you will learn how to sort a Python dictionary by value descending i.e. in reverse order.</p>
<p>Say that you have the following dictionary containing your grades associated with a subject. You want to sort the values, in this case the grades, in a descending manner - the highest grade will appear first and the lowest last.</p>
<p>For example, you have this:</p>
<pre><code class="lang-python">grades = {<span class="hljs-string">"Math"</span>: <span class="hljs-number">34</span>, <span class="hljs-string">"Science"</span>: <span class="hljs-number">12</span>, <span class="hljs-string">"English"</span>: <span class="hljs-number">89</span>, <span class="hljs-string">"Physics"</span>: <span class="hljs-number">8</span>}
</code></pre>
<p>... And you want this:</p>
<pre><code class="lang-python">{<span class="hljs-string">'English'</span>: <span class="hljs-number">89</span>, <span class="hljs-string">'Math'</span>: <span class="hljs-number">34</span>, <span class="hljs-string">'Science'</span>: <span class="hljs-number">12</span>, <span class="hljs-string">'Physics'</span>: <span class="hljs-number">8</span>}
</code></pre>
<p>You can do this in at least 3 different ways.</p>
<h2 id="heading-sorting-a-dict-by-value-descending-using-list-comprehension">Sorting a dict by value descending using list comprehension</h2>
<p>The quickest way is to iterate over the key-value pairs of your current <code>dict</code>, swap each one into a <code>(value, key)</code> tuple, and call <code>sorted</code> on those tuples with <code>reverse=True</code>. Tuples compare element by element, so the pairs end up ordered by value.</p>
<p>If you are using Python 3.7 or later, regular <code>dict</code>s preserve insertion order by default. So let's use it!</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>grades = {<span class="hljs-string">"Math"</span>: <span class="hljs-number">34</span>, <span class="hljs-string">"Science"</span>: <span class="hljs-number">12</span>, <span class="hljs-string">"English"</span>: <span class="hljs-number">89</span>, <span class="hljs-string">"Physics"</span>: <span class="hljs-number">8</span>}
<span class="hljs-meta">&gt;&gt;&gt; </span>grades
{<span class="hljs-string">'Math'</span>: <span class="hljs-number">34</span>, <span class="hljs-string">'Science'</span>: <span class="hljs-number">12</span>, <span class="hljs-string">'English'</span>: <span class="hljs-number">89</span>, <span class="hljs-string">'Physics'</span>: <span class="hljs-number">8</span>}
<span class="hljs-meta">&gt;&gt;&gt; </span>value_key_pairs = ((value, key) <span class="hljs-keyword">for</span> (key,value) <span class="hljs-keyword">in</span> grades.items())
<span class="hljs-meta">&gt;&gt;&gt; </span>sorted_value_key_pairs = sorted(value_key_pairs, reverse=<span class="hljs-literal">True</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>sorted_value_key_pairs
[(<span class="hljs-number">89</span>, <span class="hljs-string">'English'</span>), (<span class="hljs-number">34</span>, <span class="hljs-string">'Math'</span>), (<span class="hljs-number">12</span>, <span class="hljs-string">'Science'</span>), (<span class="hljs-number">8</span>, <span class="hljs-string">'Physics'</span>)]
<span class="hljs-meta">&gt;&gt;&gt; </span>{k: v <span class="hljs-keyword">for</span> v, k <span class="hljs-keyword">in</span> sorted_value_key_pairs}
 {<span class="hljs-string">'English'</span>: <span class="hljs-number">89</span>, <span class="hljs-string">'Math'</span>: <span class="hljs-number">34</span>, <span class="hljs-string">'Science'</span>: <span class="hljs-number">12</span>, <span class="hljs-string">'Physics'</span>: <span class="hljs-number">8</span>}
</code></pre>
<p><em>And Voila!</em> You have your sorted grades <code>dict</code> in a descending fashion.</p>
<blockquote>
<p>What if I have Python 3.6 or lower?</p>
</blockquote>
<p>In this case, you can use <code>OrderedDict</code> from the <code>collections</code> module.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> collections <span class="hljs-keyword">import</span> OrderedDict

<span class="hljs-meta">&gt;&gt;&gt; </span>OrderedDict((k, v) <span class="hljs-keyword">for</span> v, k <span class="hljs-keyword">in</span> sorted_value_key_pairs)
OrderedDict([(<span class="hljs-string">'English'</span>, <span class="hljs-number">89</span>), (<span class="hljs-string">'Math'</span>, <span class="hljs-number">34</span>), (<span class="hljs-string">'Science'</span>, <span class="hljs-number">12</span>), (<span class="hljs-string">'Physics'</span>, <span class="hljs-number">8</span>)])
</code></pre>
<h2 id="heading-sorting-a-dictionary-in-descending-order-using-the-operator-module">Sorting a dictionary in descending order using the <code>operator</code> module</h2>
<p>The <code>operator</code> module provides a functional interface to built-in operators like <code>&lt;</code>, <code>&gt;</code>, <code>==</code> and so on.</p>
<p>This module has many useful functions and one of them is <code>itemgetter</code>. This function returns a callable object that fetches an item from whatever you pass to it, using that object's <code>__getitem__()</code> method. </p>
<p>In a nutshell, if you do <code>callable = operator.itemgetter(1)</code>, and pass a <em>subscriptable</em> object, say <code>('Physics', 8)</code> to this <em>callable</em>, it will return the equivalent of <code>('Physics', 8)[1]</code>.</p>
<p>For example,</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> operator

<span class="hljs-meta">&gt;&gt;&gt; </span>subject_grade_pair = (<span class="hljs-string">'Physics'</span>, <span class="hljs-number">8</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>get_grade = operator.itemgetter(<span class="hljs-number">1</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>get_grade(subject_grade_pair)
<span class="hljs-number">8</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>subject_grade_pair[<span class="hljs-number">1</span>]
<span class="hljs-number">8</span>
</code></pre>
<p>Now, the question is, how can we use this to sort the values?</p>
<p>The <code>sorted</code> built-in function accepts not only the <em>iterable</em> you want to sort but also an optional <em>key</em>. This key argument is nothing more than a function of one argument that <code>sorted</code> calls on each item to get the value it compares. And that’s exactly what we need to sort our list!</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> operator

<span class="hljs-comment"># remember, the grade is the second item in the subject - grade pair</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>sort_by_grade = operator.itemgetter(<span class="hljs-number">1</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>grades = {<span class="hljs-string">"Math"</span>: <span class="hljs-number">34</span>, <span class="hljs-string">"Science"</span>: <span class="hljs-number">12</span>, <span class="hljs-string">"English"</span>: <span class="hljs-number">89</span>, <span class="hljs-string">"Physics"</span>: <span class="hljs-number">8</span>}
<span class="hljs-meta">&gt;&gt;&gt; </span>grades
{<span class="hljs-string">'Math'</span>: <span class="hljs-number">34</span>, <span class="hljs-string">'Science'</span>: <span class="hljs-number">12</span>, <span class="hljs-string">'English'</span>: <span class="hljs-number">89</span>, <span class="hljs-string">'Physics'</span>: <span class="hljs-number">8</span>}

<span class="hljs-meta">&gt;&gt;&gt; </span>sorted_key_value_pairs = sorted(grades.items(), key=sort_by_grade, reverse=<span class="hljs-literal">True</span>)
<span class="hljs-meta">&gt;&gt;&gt; </span>{k: v <span class="hljs-keyword">for</span> k, v <span class="hljs-keyword">in</span> sorted_key_value_pairs}
{<span class="hljs-string">'English'</span>: <span class="hljs-number">89</span>, <span class="hljs-string">'Math'</span>: <span class="hljs-number">34</span>, <span class="hljs-string">'Science'</span>: <span class="hljs-number">12</span>, <span class="hljs-string">'Physics'</span>: <span class="hljs-number">8</span>}
</code></pre>
<p>At this point you might wonder: </p>
<blockquote>
<p>This <code>sort_by_grade</code> looks like a glorified lambda...</p>
</blockquote>
<p>Well, good shout, this brings us to the last section.</p>
<h2 id="heading-using-lambda-to-sort-the-dictionary-in-descending-order">Using <code>lambda</code> to sort the dictionary in descending order</h2>
<p>So, as we saw in the previous section, <code>operator.itemgetter</code> returns a callable that is equivalent to calling the <code>__getitem__()</code> method on a <em>subscriptable</em> object.</p>
<p>This is remarkably similar to passing a lambda as a key which takes a tuple, say <code>("Math", 34)</code>, and returns the second item, which is the grade.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>sort_by_grade_lambda = <span class="hljs-keyword">lambda</span> subject_grade_pair: subject_grade_pair[<span class="hljs-number">1</span>]

<span class="hljs-meta">&gt;&gt;&gt; </span>sorted_key_value_pairs = sorted(grades.items(), key=sort_by_grade_lambda, reverse=<span class="hljs-literal">True</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>{k: v <span class="hljs-keyword">for</span> k, v <span class="hljs-keyword">in</span> sorted_key_value_pairs}
{<span class="hljs-string">'English'</span>: <span class="hljs-number">89</span>, <span class="hljs-string">'Math'</span>: <span class="hljs-number">34</span>, <span class="hljs-string">'Science'</span>: <span class="hljs-number">12</span>, <span class="hljs-string">'Physics'</span>: <span class="hljs-number">8</span>}
</code></pre>
<p>And this is how you use a <code>lambda</code> function to sort a dict's items by value in descending order.</p>
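<p>If you need this in more than one place, the approach can be wrapped in a small helper. This is just a sketch assuming Python 3.7+ (so that a regular <code>dict</code> keeps insertion order); the name <code>sort_by_value_desc</code> is made up for the example.</p>
<pre><code class="lang-python">from operator import itemgetter


def sort_by_value_desc(mapping):
    # items() yields (key, value) pairs, so itemgetter(1) picks the value.
    # dict() rebuilds the mapping in the sorted order.
    return dict(sorted(mapping.items(), key=itemgetter(1), reverse=True))


grades = {"Math": 34, "Science": 12, "English": 89, "Physics": 8}
sorted_grades = sort_by_value_desc(grades)
print(sorted_grades)
# {'English': 89, 'Math': 34, 'Science': 12, 'Physics': 8}
</code></pre>
<p>Because <code>sorted</code> receives <code>(key, value)</code> pairs directly, there is no need to swap them, and <code>dict()</code> can consume the result as is.</p>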
<h2 id="heading-sorting-a-dictionary-with-complex-objects-as-values">Sorting a dictionary with complex objects as values</h2>
<p>So far we've only dealt with simple objects such as <code>int</code>. What happens if we have a complex object, such as a custom <code>Grade</code> object as a dictionary value?</p>
<p>Let's see how it works.</p>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Grade</span>:</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, grade: int, cutoff: int = <span class="hljs-number">70</span></span>):</span>
        self.grade = grade
        self.cutoff = cutoff
        self.passed = grade &gt;= cutoff

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__repr__</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-keyword">return</span> <span class="hljs-string">f"&lt;Grade(grade=<span class="hljs-subst">{self.grade}</span>, cutoff=<span class="hljs-subst">{self.cutoff}</span>, passed=<span class="hljs-subst">{self.passed}</span>&gt;"</span>
</code></pre>
<p>Let's try to sort it...</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>grades = {<span class="hljs-string">"Math"</span>: Grade(grade=<span class="hljs-number">34</span>), <span class="hljs-string">"Science"</span>: Grade(grade=<span class="hljs-number">12</span>), <span class="hljs-string">"English"</span>: Grade(grade=<span class="hljs-number">89</span>), <span class="hljs-string">"Physics"</span>: Grade(grade=<span class="hljs-number">8</span>)}
<span class="hljs-meta">&gt;&gt;&gt; </span>grades
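{<span class="hljs-string">'Math'</span>: &lt;Grade(grade=<span class="hljs-number">34</span>, cutoff=<span class="hljs-number">70</span>, passed=<span class="hljs-literal">False</span>&gt;,
 <span class="hljs-string">'Science'</span>: &lt;Grade(grade=<span class="hljs-number">12</span>, cutoff=<span class="hljs-number">70</span>, passed=<span class="hljs-literal">False</span>&gt;,
 <span class="hljs-string">'English'</span>: &lt;Grade(grade=<span class="hljs-number">89</span>, cutoff=<span class="hljs-number">70</span>, passed=<span class="hljs-literal">True</span>&gt;,
 <span class="hljs-string">'Physics'</span>: &lt;Grade(grade=<span class="hljs-number">8</span>, cutoff=<span class="hljs-number">70</span>, passed=<span class="hljs-literal">False</span>&gt;}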

<span class="hljs-meta">&gt;&gt;&gt; </span>value_key_pairs = ((value, key) <span class="hljs-keyword">for</span> (key,value) <span class="hljs-keyword">in</span> grades.items())
<span class="hljs-meta">&gt;&gt;&gt; </span>sorted_value_key_pairs = sorted(value_key_pairs, reverse=<span class="hljs-literal">True</span>)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
&lt;ipython-input<span class="hljs-number">-14</span><span class="hljs-number">-0</span>c94e26fda4a&gt; <span class="hljs-keyword">in</span> &lt;module&gt;
----&gt; <span class="hljs-number">1</span> sorted_value_key_pairs = sorted(value_key_pairs, reverse=<span class="hljs-literal">True</span>)

TypeError: <span class="hljs-string">'&lt;'</span> <span class="hljs-keyword">not</span> supported between instances of <span class="hljs-string">'Grade'</span> <span class="hljs-keyword">and</span> <span class="hljs-string">'Grade'</span>
</code></pre>
<p>Oops! It didn't work. The reason is that, as the error message says, <code>Grade</code> doesn't implement the <code>__lt__</code> method, so Python has no way to tell which of two <code>Grade</code> instances comes first.</p>
<p>To fix that we can either implement the <code>__lt__</code> method or use the lambda as we used before with an adaptation. Let's see the lambda approach first.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>sort_by_grade_lambda = <span class="hljs-keyword">lambda</span> subject_grade_pair: subject_grade_pair[<span class="hljs-number">1</span>].grade

<span class="hljs-meta">&gt;&gt;&gt; </span>sorted_key_value_pairs = sorted(grades.items(), key=sort_by_grade_lambda, reverse=<span class="hljs-literal">True</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>{k: v <span class="hljs-keyword">for</span> k, v <span class="hljs-keyword">in</span> sorted_key_value_pairs}
{<span class="hljs-string">'English'</span>: &lt;Grade(grade=<span class="hljs-number">89</span>, cutoff=<span class="hljs-number">70</span>, passed=<span class="hljs-literal">True</span>&gt;,
 <span class="hljs-string">'Math'</span>: &lt;Grade(grade=<span class="hljs-number">34</span>, cutoff=<span class="hljs-number">70</span>, passed=<span class="hljs-literal">False</span>&gt;,
 <span class="hljs-string">'Science'</span>: &lt;Grade(grade=<span class="hljs-number">12</span>, cutoff=<span class="hljs-number">70</span>, passed=<span class="hljs-literal">False</span>&gt;,
 <span class="hljs-string">'Physics'</span>: &lt;Grade(grade=<span class="hljs-number">8</span>, cutoff=<span class="hljs-number">70</span>, passed=<span class="hljs-literal">False</span>&gt;}
</code></pre>
<h3 id="heading-implementing-lt">Implementing <code>__lt__</code></h3>
<p>Let's see how it looks when we implement the <code>&lt;</code> operator.</p>
<pre><code class="lang-python"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Grade</span>:</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, grade: int, cutoff: int = <span class="hljs-number">70</span></span>):</span>
        self.grade = grade
        self.cutoff = cutoff
        self.passed = grade &gt;= cutoff

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__lt__</span>(<span class="hljs-params">self, other: <span class="hljs-string">"Grade"</span></span>) -&gt; bool:</span>
        <span class="hljs-keyword">return</span> self.grade &lt; other.grade

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__repr__</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-keyword">return</span> <span class="hljs-string">f"&lt;Grade(grade=<span class="hljs-subst">{self.grade}</span>, cutoff=<span class="hljs-subst">{self.cutoff}</span>, passed=<span class="hljs-subst">{self.passed}</span>&gt;"</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>grades = {<span class="hljs-string">"Math"</span>: Grade(grade=<span class="hljs-number">34</span>), <span class="hljs-string">"Science"</span>: Grade(grade=<span class="hljs-number">12</span>), <span class="hljs-string">"English"</span>: Grade(grade=<span class="hljs-number">89</span>), <span class="hljs-string">"Physics"</span>: Grade(grade=<span class="hljs-number">8</span>)}

<span class="hljs-meta">&gt;&gt;&gt; </span>grades
{<span class="hljs-string">'Math'</span>: &lt;Grade(grade=<span class="hljs-number">34</span>, cutoff=<span class="hljs-number">70</span>, passed=<span class="hljs-literal">False</span>&gt;,
 <span class="hljs-string">'Science'</span>: &lt;Grade(grade=<span class="hljs-number">12</span>, cutoff=<span class="hljs-number">70</span>, passed=<span class="hljs-literal">False</span>&gt;,
 <span class="hljs-string">'English'</span>: &lt;Grade(grade=<span class="hljs-number">89</span>, cutoff=<span class="hljs-number">70</span>, passed=<span class="hljs-literal">True</span>&gt;,
 <span class="hljs-string">'Physics'</span>: &lt;Grade(grade=<span class="hljs-number">8</span>, cutoff=<span class="hljs-number">70</span>, passed=<span class="hljs-literal">False</span>&gt;}

<span class="hljs-meta">&gt;&gt;&gt; </span>value_key_pairs = ((value, key) <span class="hljs-keyword">for</span> (key,value) <span class="hljs-keyword">in</span> grades.items())

<span class="hljs-meta">&gt;&gt;&gt; </span>sorted_value_key_pairs = sorted(value_key_pairs, reverse=<span class="hljs-literal">True</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>sorted_value_key_pairs
[(&lt;Grade(grade=<span class="hljs-number">89</span>, cutoff=<span class="hljs-number">70</span>, passed=<span class="hljs-literal">True</span>&gt;, <span class="hljs-string">'English'</span>),
 (&lt;Grade(grade=<span class="hljs-number">34</span>, cutoff=<span class="hljs-number">70</span>, passed=<span class="hljs-literal">False</span>&gt;, <span class="hljs-string">'Math'</span>),
 (&lt;Grade(grade=<span class="hljs-number">12</span>, cutoff=<span class="hljs-number">70</span>, passed=<span class="hljs-literal">False</span>&gt;, <span class="hljs-string">'Science'</span>),
 (&lt;Grade(grade=<span class="hljs-number">8</span>, cutoff=<span class="hljs-number">70</span>, passed=<span class="hljs-literal">False</span>&gt;, <span class="hljs-string">'Physics'</span>)]

<span class="hljs-meta">&gt;&gt;&gt; </span>{k: v <span class="hljs-keyword">for</span> v, k <span class="hljs-keyword">in</span> sorted_value_key_pairs}
{<span class="hljs-string">'English'</span>: &lt;Grade(grade=<span class="hljs-number">89</span>, cutoff=<span class="hljs-number">70</span>, passed=<span class="hljs-literal">True</span>&gt;,
 <span class="hljs-string">'Math'</span>: &lt;Grade(grade=<span class="hljs-number">34</span>, cutoff=<span class="hljs-number">70</span>, passed=<span class="hljs-literal">False</span>&gt;,
 <span class="hljs-string">'Science'</span>: &lt;Grade(grade=<span class="hljs-number">12</span>, cutoff=<span class="hljs-number">70</span>, passed=<span class="hljs-literal">False</span>&gt;,
 <span class="hljs-string">'Physics'</span>: &lt;Grade(grade=<span class="hljs-number">8</span>, cutoff=<span class="hljs-number">70</span>, passed=<span class="hljs-literal">False</span>&gt;}
</code></pre>
<p><em>And voilà!</em> Your grades dictionary is now sorted by value in descending order.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>In this post we saw 3 different ways of sorting a dictionary in descending order with Python. I hope you find it useful.</p>
<p>Other posts you may like:</p>
<ul>
<li><a target="_blank" href="https://miguendes.me/design-patterns-that-make-sense-in-python-simple-factory">Design Patterns That Make Sense in Python: Simple Factory</a></li>
<li><a target="_blank" href="https://miguendes.me/how-to-pass-multiple-arguments-to-a-map-function-in-python">How to Pass Multiple Arguments to a map Function in Python</a></li>
<li><a target="_blank" href="https://miguendes.me/73-examples-to-help-you-master-pythons-f-strings">73 Examples to Help You Master Python's f-strings</a></li>
<li><a target="_blank" href="https://miguendes.me/3-ways-to-test-api-client-applications-in-python">3 Ways to Test API Client Applications in Python</a></li>
<li><a target="_blank" href="https://miguendes.me/everything-you-need-to-know-about-pythons-namedtuples">Everything You Need to Know About Python's Namedtuples</a></li>
<li><a target="_blank" href="https://miguendes.me/the-best-way-to-compare-two-dictionaries-in-python">The Best Way to Compare Two Dictionaries in Python</a></li>
<li><a target="_blank" href="https://miguendes.me/5-hidden-python-features-you-probably-never-heard-of">5 Hidden Python Features You Probably Never Heard Of</a></li>
</ul>
<p>See you next time!</p>
]]></content:encoded></item><item><title><![CDATA[How to Check If a String Is a Valid URL in Python]]></title><description><![CDATA[How to check if a URL is valid in Python? You'll be surprised how easy it is to check that.
In this article you'll learn how to determine if a string is a valid web address or not.
The good news is, you don't need to write your own URL validator. We'll...]]></description><link>https://miguendes.me/how-to-check-if-a-string-is-a-valid-url-in-python</link><guid isPermaLink="true">https://miguendes.me/how-to-check-if-a-string-is-a-valid-url-in-python</guid><category><![CDATA[Python]]></category><category><![CDATA[Tutorial]]></category><dc:creator><![CDATA[Miguel Brito]]></dc:creator><pubDate>Sat, 26 Dec 2020 19:31:27 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1617906124390/7pD-mlxv4.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>How to check if a URL is valid in Python? You'll be surprised how easy it is to check that.</p>
<p>In this article you'll learn how to determine if a string is a valid web address or not.</p>
<p>The good news is, you don't need to write your own URL validator. We'll see how we can leverage a third-party URL validator to do the job for us.</p>
<h2 id="using-the-validators-package">Using the <code>validators</code> package</h2>
<p>The  <a target="_blank" href="https://github.com/kvesteri/validators"><code>validators</code></a> package is a tool that comes with a wide range of validation utilities. You can validate all sorts of inputs such as emails, IP addresses, bitcoin addresses and, of course, URLs. </p>
<p>The URL validation function is available at the root of the module. It returns <code>True</code> if the string is a valid URL; otherwise it returns an instance of <code>ValidationFailure</code>, which is a bit weird but not a deal breaker.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> validators

<span class="hljs-meta">&gt;&gt;&gt; </span>validators.url(<span class="hljs-string">"http://localhost:8000"</span>)
<span class="hljs-literal">True</span>
</code></pre>
<p>Since the return value is not always a boolean, we can wrap the call in a helper function that always returns a <code>bool</code>:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> validators
<span class="hljs-keyword">from</span> validators <span class="hljs-keyword">import</span> ValidationFailure

...


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">is_string_an_url</span>(<span class="hljs-params">url_string: str</span>) -&gt; bool:</span>
    result = validators.url(url_string)

    <span class="hljs-keyword">if</span> isinstance(result, ValidationFailure):
        <span class="hljs-keyword">return</span> <span class="hljs-literal">False</span>

    <span class="hljs-keyword">return</span> result

...

<span class="hljs-meta">&gt;&gt;&gt; </span>is_string_an_url(<span class="hljs-string">"http://localhost:8000"</span>)
<span class="hljs-literal">True</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>is_string_an_url(<span class="hljs-string">"http://.www.foo.bar/"</span>)
<span class="hljs-literal">False</span>
</code></pre>
<p>⚠️ WARNING: You must <a target="_blank" href="https://miguendes.me/python-trim-string">trim all leading and trailing spaces</a> from the URL string before calling <code>validators.url</code>.</p>
<pre><code class="lang-python"><span class="hljs-comment"># URL has a whitespace at the end</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>url = <span class="hljs-string">"http://localhost:8000 "</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>is_string_an_url(url)
<span class="hljs-literal">False</span>
<span class="hljs-comment"># strip any leading or trailing spaces from the URL</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>is_string_an_url(url.strip())
<span class="hljs-literal">True</span>
</code></pre>
<h2 id="using-djangos-urlvalidator">Using <code>django</code>'s URLValidator</h2>
<p><code>django</code> is a great web framework that has many features. It bundles several utilities that make web development easier. One such utility is the <code>validators</code> module, which contains, amongst other things, a URL validator.</p>
<p>You can check whether a string is a URL by creating an instance of <code>URLValidator</code> and calling it. The validator raises a <code>ValidationError</code> when the string is not a valid URL.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> django.core.validators <span class="hljs-keyword">import</span> URLValidator
<span class="hljs-keyword">from</span> django.core.exceptions <span class="hljs-keyword">import</span> ValidationError

...


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">is_string_an_url</span>(<span class="hljs-params">url_string: str</span>) -&gt; bool:</span>
    validate_url = URLValidator()

    <span class="hljs-keyword">try</span>:
        validate_url(url_string)
    <span class="hljs-keyword">except</span> ValidationError:
        <span class="hljs-keyword">return</span> <span class="hljs-literal">False</span>

    <span class="hljs-keyword">return</span> <span class="hljs-literal">True</span>
</code></pre>
<p>This works well, but adding Django as a dependency just to use its validator is a bit too much, unless your project already uses Django. If it doesn't, the lighter <code>validators</code> package is the better choice.</p>
<h2 id="conclusion">Conclusion</h2>
<p>In this post we saw 2 different ways of validating a URL in Python. I hope you find it useful.</p>
<p>Other posts you may like:</p>
<ul>
<li><a target="_blank" href="https://miguendes.me/design-patterns-that-make-sense-in-python-simple-factory">Design Patterns That Make Sense in Python: Simple Factory</a></li>
<li><a target="_blank" href="https://miguendes.me/how-to-pass-multiple-arguments-to-a-map-function-in-python">How to Pass Multiple Arguments to a map Function in Python</a></li>
<li><a target="_blank" href="https://miguendes.me/73-examples-to-help-you-master-pythons-f-strings">73 Examples to Help You Master Python's f-strings</a></li>
<li><a target="_blank" href="https://miguendes.me/everything-you-need-to-know-about-pythons-namedtuples">Everything You Need to Know About Python's Namedtuples</a></li>
<li><a target="_blank" href="https://miguendes.me/the-best-way-to-compare-two-dictionaries-in-python">The Best Way to Compare Two Dictionaries in Python</a></li>
<li><a target="_blank" href="https://miguendes.me/5-hidden-python-features-you-probably-never-heard-of">5 Hidden Python Features You Probably Never Heard Of</a></li>
</ul>
<p>See you next time!</p>
]]></content:encoded></item><item><title><![CDATA[How to Find the Current Working Directory in Python]]></title><description><![CDATA[Python provides two different ways to get the current working directory. The first method uses the os module and the second uses the newer pathlib.
Using the os Module to Get the Current Directory
First thing you need to do is to import the module.
>...]]></description><link>https://miguendes.me/how-to-find-the-current-working-directory-in-python</link><guid isPermaLink="true">https://miguendes.me/how-to-find-the-current-working-directory-in-python</guid><category><![CDATA[Python]]></category><category><![CDATA[Tutorial]]></category><category><![CDATA[Python 3]]></category><dc:creator><![CDATA[Miguel Brito]]></dc:creator><pubDate>Sat, 19 Dec 2020 20:04:28 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1617735657494/nusqONN-2.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Python provides two different ways to get the current working directory. The first method uses the <code>os</code> module and the second uses <a target="_blank" href="https://miguendes.me/python-pathlib">the newer <code>pathlib</code></a>.</p>
<h2 id="heading-using-the-os-module-to-get-the-current-directory">Using the <code>os</code> Module to Get the Current Directory</h2>
<p>First thing you need to do is to import the module.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> os
</code></pre>
<p>Then, you just need to call the <code>getcwd</code> function.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> os
...
<span class="hljs-meta">&gt;&gt;&gt; </span>os.getcwd()
<span class="hljs-string">'/home/miguel'</span>
</code></pre>
<p>And that is it! As you can see, the function returns a string. This is not very flexible: if you want to list all the files in that directory, for example, you have to pass the string to yet another <code>os</code> function.</p>
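<p>To make that concrete, here is a minimal sketch of listing the current directory with nothing but <code>os</code>. The helper name <code>list_cwd_entries</code> is made up for this illustration and is not part of any library.</p>

```python
import os


def list_cwd_entries() -> list:
    """Hypothetical helper: full paths of the entries in the current directory."""
    cwd = os.getcwd()  # a plain string, e.g. '/home/miguel'
    # os.listdir returns bare names, so we join them back onto the directory.
    return sorted(os.path.join(cwd, name) for name in os.listdir(cwd))


entries = list_cwd_entries()
```

<p>Every step takes a string and hands back a string (or a list of them), so it's up to us to glue the pieces together.</p>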
<p>The <code>os</code> module is pretty ancient and I wouldn’t recommend using it nowadays. In the following section, I’m going to show you the modern way of getting the current working directory in Python.</p>
<h2 id="heading-getting-the-current-working-directory-through-the-pathlib-module">Getting the Current Working Directory Through the <code>pathlib</code> Module</h2>
<p>The <code>pathlib</code> module was proposed in 2012 and added to Python in the 3.4 version. The idea was to provide an object-oriented API for filesystem paths. This module provides classes that represent the filesystem paths with semantics appropriate for different operating systems. Also, Path objects are immutable and <em>hashable</em>, which helps prevent programming errors caused by mutability.</p>
<p>To get the current working directory using <code>pathlib</code> you can use the <em>classmethod</em> <code>cwd</code> from the <code>Path</code> class. But first, you need to import it.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path
</code></pre>
<p>Then, you can call the method.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> pathlib <span class="hljs-keyword">import</span> Path
...
<span class="hljs-meta">&gt;&gt;&gt; </span>Path.cwd()
PosixPath(<span class="hljs-string">'/home/miguel'</span>)
</code></pre>
<p>As you can see, the output is different from that of <code>os.getcwd()</code>. As I mentioned earlier, all paths follow the semantics of the underlying filesystem. In my case, I'm using Linux, so the output is a <code>PosixPath</code>. On Windows, <code>cwd</code> returns a <code>WindowsPath</code>.</p>
<p>Because the result is an object, you get many handy features, such as iterating over all files just by calling a method. However, if you still want the string representation, you can call <code>str</code> on the result of <code>Path.cwd()</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>str(Path.cwd())
<span class="hljs-string">'/home/miguel'</span>
</code></pre>
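<p>As a quick illustration of iterating over a directory "just by calling a method", the sketch below uses <code>Path.iterdir()</code>. The variable names are only for this example.</p>

```python
from pathlib import Path

cwd = Path.cwd()

# iterdir() yields a Path object for every entry in the directory.
entries = sorted(cwd.iterdir())

# Each entry is itself a Path, so we can keep calling methods on it.
file_names = [entry.name for entry in entries if entry.is_file()]
```

<p>There's no string juggling here: each entry already knows its parent, its name and whether it's a file.</p>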
<h3 id="heading-pathcwd-under-the-hood"><code>Path.cwd</code> Under the Hood</h3>
<p>How does <code>Path</code> know the current directory? The answer is: it calls <code>os.getcwd()</code> and wraps the result in an instance of <code>Path</code>. The following snippet shows the implementation in the standard library at the time of writing.</p>
<pre><code class="lang-python"><span class="hljs-meta">    @classmethod</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">cwd</span>(<span class="hljs-params">cls</span>):</span>
        <span class="hljs-string">"""Return a new path pointing to the current working directory
        (as returned by os.getcwd()).
        """</span>
        <span class="hljs-keyword">return</span> cls(os.getcwd())
</code></pre>
<blockquote>
<p>Wait! On Linux it returns a <code>PosixPath</code> but the class method belongs to <code>Path</code>. How does it know?</p>
</blockquote>
<p>Great question! <code>Path</code> does some magic behind the scenes before creating the object. It implements the <code>__new__</code> magic method and checks <code>os.name</code> to determine the underlying operating system. Check the implementation.</p>
<pre><code class="lang-python">    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__new__</span>(<span class="hljs-params">cls, *args, **kwargs</span>):</span>
        <span class="hljs-keyword">if</span> cls <span class="hljs-keyword">is</span> Path:
            cls = WindowsPath <span class="hljs-keyword">if</span> os.name == <span class="hljs-string">'nt'</span> <span class="hljs-keyword">else</span> PosixPath
        self = cls._from_parts(args, init=<span class="hljs-literal">False</span>)
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> self._flavour.is_supported:
            <span class="hljs-keyword">raise</span> NotImplementedError(<span class="hljs-string">"cannot instantiate %r on your system"</span>
                                      % (cls.__name__,))
        self._init()
        <span class="hljs-keyword">return</span> self
</code></pre>
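<p>We don't need to read the source to see the effect of that dispatch. The small check below is a sketch that mirrors the <code>os.name</code> test from the snippet above and confirms that <code>Path.cwd()</code> hands back the concrete class for the current platform.</p>

```python
import os
from pathlib import Path, PosixPath, WindowsPath

# Same decision as in __new__: 'nt' means Windows, anything else is POSIX.
expected_cls = WindowsPath if os.name == "nt" else PosixPath

cwd = Path.cwd()

is_concrete = type(cwd) is expected_cls    # PosixPath on Linux/macOS
matches_os = cwd == Path(os.getcwd())      # same directory as os.getcwd()
```

<p>Both flags should be <code>True</code>, regardless of the operating system you run this on.</p>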
<h2 id="heading-conclusion">Conclusion</h2>
<p>That's it for today, folks! I hope you enjoyed this brief article.</p>
<p>Other posts you may like:</p>
<ul>
<li><p><a target="_blank" href="https://miguendes.me/python-pathlib">Python pathlib Cookbook: 57+ Examples to Master It (2021)</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/3-ways-to-test-api-client-applications-in-python">3 Ways to Test API Client Applications in Python</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/5-hidden-python-features-you-probably-never-heard-of">5 Hidden Python Features You Probably Never Heard Of</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/7-pytest-features-and-plugins-that-will-save-you-tons-of-time">7 pytest Features and Plugins That Will Save You Tons of Time</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/everything-you-need-to-know-about-pythons-namedtuples">Everything You Need to Know About Python's Namedtuples</a></p>
</li>
</ul>
<p>See you next time!</p>
<p>This post was originally published at <a target="_blank" href="https://miguendes.me/how-to-find-the-current-working-directory-in-python">https://miguendes.me</a></p>
]]></content:encoded></item><item><title><![CDATA[7 Different Ways to Flatten a List of Lists in Python]]></title><description><![CDATA[Ever wondered how can you flatten, or unnest, a 2D list of lists in Python?
In another words, how to turn 2-D lists into 1D: 

[[1, 2], [3, 4], [5, 6, 7], [8]] -> [1, 2, 3, 4, 5, 6, 7, 8]?
[[1, 2], [4, 5], [[[7]]], [[[[8]]]]] -> [1, 2, 3, 4, 5, 6, 7,...]]></description><link>https://miguendes.me/python-flatten-list</link><guid isPermaLink="true">https://miguendes.me/python-flatten-list</guid><category><![CDATA[Python]]></category><category><![CDATA[Python 3]]></category><category><![CDATA[Tutorial]]></category><dc:creator><![CDATA[Miguel Brito]]></dc:creator><pubDate>Sat, 12 Dec 2020 10:03:02 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1634377119049/lemPujOYy.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Ever wondered how you can flatten, or unnest, a 2D list of lists in Python?</p>
<p>In other words, how do you turn 2D lists into 1D: </p>
<ul>
<li><code>[[1, 2], [3, 4], [5, 6, 7], [8]]</code> -&gt; <code>[1, 2, 3, 4, 5, 6, 7, 8]</code>?</li>
<li><code>[[1, 2], [4, 5], [[[7]]], [[[[8]]]]]</code> -&gt; <code>[1, 2, 4, 5, 7, 8]</code>?</li>
<li><code>[1, 2, 3, [4, 5, 6], [[7, 8]]]</code> -&gt; <code>[1, 2, 3, 4, 5, 6, 7, 8]</code>?</li>
</ul>
<p>What about lists with mixed types, such as lists of strings or lists of tuples?</p>
<ul>
<li><code>[[1, 2], "three", ["four", "five"]]</code> -&gt; <code>[1, 2,  "three", "four", "five"]</code></li>
<li><code>[[1, 2], (3, 4), (5, 6, 7), [8]]</code> -&gt; <code>[1, 2, 3, 4, 5, 6, 7, 8]</code></li>
</ul>
<p>In this post, we’ll see how we can unnest an arbitrarily nested list of lists in 7 different ways. Each method has pros and cons, and varies in performance. By going over each one, you’ll learn how to identify the most appropriate solution for your problem by creating your own <code>flatten()</code> function in Python. </p>
<p>For all examples, we'll use Python 3, and for the tests <code>pytest</code>.</p>
<p>By the end of this guide, you'll have learned:</p>
<ul>
<li>how to flatten / unnest a list of mixed types, including list of strings, list of tuples or ints</li>
<li>the best way to flatten lists of lists with list comprehensions</li>
<li>how to unfold a list and remove duplicates</li>
<li>how to convert a nested list of lists using the built-in function <code>sum</code> from the standard library</li>
<li>how to use numpy to flatten nested lists</li>
<li>how to use <code>itertools</code> chain to create a flat list</li>
<li>the best way to flatten a nested list using recursion or without recursion</li>
</ul>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ol>
<li><a class="post-section-overview" href="#flattening-a-list-of-lists-with-list-comprehensions">Flattening a list of lists with list comprehensions</a></li>
<li><a class="post-section-overview" href="#how-to-flatten-list-of-strings-tuples-or-mixed-types">How to flatten list of strings, tuples or mixed types</a></li>
<li><a class="post-section-overview" href="#how-to-flatten-a-list-and-remove-duplicates">How to flatten a nested list and remove duplicates</a></li>
<li><a class="post-section-overview" href="#flattening-a-list-of-lists-with-the-sum-function">Flattening a nested list of lists with the <code>sum</code> function</a></li>
<li><a class="post-section-overview" href="#flattening-using-itertoolschain">Flattening using <code>itertools.chain</code></a></li>
<li><a class="post-section-overview" href="#flatten-a-regular-list-of-lists-with-numpy">Flatten a regular list of lists with numpy</a></li>
<li><p><a class="post-section-overview" href="#flattening-irregular-lists">Flattening irregular lists</a></p>
<ul>
<li><p><a class="post-section-overview" href="#the-recursive-approach">The recursive approach</a></p>
</li>
<li><p><a class="post-section-overview" href="#the-iterative-approach">The iterative approach</a></p>
</li>
</ul>
</li>
<li><p><a class="post-section-overview" href="#which-method-is-faster-a-performance-comparison">Which method is faster? A performance comparison</a></p>
</li>
<li><a class="post-section-overview" href="#conclusion">Conclusion</a></li>
</ol>
<h2 id="heading-flattening-a-list-of-lists-with-list-comprehensions">Flattening a list of lists with list comprehensions</h2>
<p>Let’s imagine that we have a simple list of lists like this <code>[[1, 3], [2, 5], [1]]</code> and we want to flatten it. </p>
<p>In other words, we want to convert the original list into a flat list like this <code>[1, 3, 2, 5, 1]</code>. The first way of doing that is through list/generator comprehensions. We iterate over each sublist and then over each item in it, producing one element at a time. </p>
<p>The following function accepts a list of lists as an argument and returns a generator, which avoids building the whole flat list in memory up front. We can then consume the generator to create a single list.</p>
<p>To make sure everything works as expected, we can assert the behavior with the <code>test_flatten</code> unit test.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> List, Any, Iterable


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">flatten_gen_comp</span>(<span class="hljs-params">lst: List[Any]</span>) -&gt; Iterable[Any]:</span>
    <span class="hljs-string">"""Flatten a list using generators comprehensions."""</span>
    <span class="hljs-keyword">return</span> (item
            <span class="hljs-keyword">for</span> sublist <span class="hljs-keyword">in</span> lst
            <span class="hljs-keyword">for</span> item <span class="hljs-keyword">in</span> sublist)


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_flatten</span>():</span>
    lst = [[<span class="hljs-number">1</span>, <span class="hljs-number">3</span>], [<span class="hljs-number">2</span>, <span class="hljs-number">5</span>], [<span class="hljs-number">1</span>]]

    <span class="hljs-keyword">assert</span> list(flatten_gen_comp(lst)) == [<span class="hljs-number">1</span>, <span class="hljs-number">3</span>, <span class="hljs-number">2</span>, <span class="hljs-number">5</span>, <span class="hljs-number">1</span>]
</code></pre>
<blockquote>
<p>This function returns a generator. To get a list back, we need to convert the generator to a list.</p>
</blockquote>
<p>When we run the test we can see it passing...</p>
<pre><code class="lang-console">============================= test session starts ==============================

flatten.py::test_flatten PASSED                                          [100%]

============================== 1 passed in 0.01s ===============================

Process finished with exit code 0
</code></pre>
<p>If you prefer you can make the code shorter by using a lambda function.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>flatten_lambda = <span class="hljs-keyword">lambda</span> lst: (item <span class="hljs-keyword">for</span> sublist <span class="hljs-keyword">in</span> lst <span class="hljs-keyword">for</span> item <span class="hljs-keyword">in</span> sublist)

<span class="hljs-meta">&gt;&gt;&gt; </span>lst = [[<span class="hljs-number">1</span>, <span class="hljs-number">3</span>], [<span class="hljs-number">2</span>, <span class="hljs-number">5</span>], [<span class="hljs-number">1</span>]]

<span class="hljs-meta">&gt;&gt;&gt; </span>list(flatten_lambda(lst))
[<span class="hljs-number">1</span>, <span class="hljs-number">3</span>, <span class="hljs-number">2</span>, <span class="hljs-number">5</span>, <span class="hljs-number">1</span>]
</code></pre>
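<p>If you want the list right away and don't mind holding it all in memory, the same double loop works as a plain list comprehension. This is a small variation on the functions above; the name <code>flatten_list_comp</code> is only used for this sketch.</p>

```python
from typing import Any, List


def flatten_list_comp(lst: List[List[Any]]) -> List[Any]:
    """Flatten a list of lists with a list comprehension (one level deep)."""
    # Reads like two nested for loops: outer loop first, inner loop second.
    return [item for sublist in lst for item in sublist]


flat = flatten_list_comp([[1, 3], [2, 5], [1]])
```

<p>The trade-off is simple: the generator version is lazy, whereas this one builds the complete list eagerly.</p>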
<h2 id="heading-how-to-flatten-list-of-strings-tuples-or-mixed-types">How to flatten list of strings, tuples or mixed types</h2>
<p>The technique we've seen assumes every top-level element is a sublist. Any other iterable at the top level gets unpacked as well, which is the case for strings. Let's see what happens if we pass in a list that contains both lists and strings.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> List, Any, Iterable


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">flatten_gen_comp</span>(<span class="hljs-params">lst: List[Any]</span>) -&gt; Iterable[Any]:</span>
    <span class="hljs-string">"""Flatten a list using generators comprehensions."""</span>
    <span class="hljs-keyword">return</span> (item
            <span class="hljs-keyword">for</span> sublist <span class="hljs-keyword">in</span> lst
            <span class="hljs-keyword">for</span> item <span class="hljs-keyword">in</span> sublist)


<span class="hljs-meta">&gt;&gt;&gt; </span>lst = [[<span class="hljs-string">'hello'</span>, <span class="hljs-string">'world'</span>], [<span class="hljs-string">'my'</span>, <span class="hljs-string">'dear'</span>], <span class="hljs-string">'friend'</span>]
<span class="hljs-meta">&gt;&gt;&gt; </span>list(flatten_gen_comp(lst))
[<span class="hljs-string">'hello'</span>, <span class="hljs-string">'world'</span>, <span class="hljs-string">'my'</span>, <span class="hljs-string">'dear'</span>, <span class="hljs-string">'f'</span>, <span class="hljs-string">'r'</span>, <span class="hljs-string">'i'</span>, <span class="hljs-string">'e'</span>, <span class="hljs-string">'n'</span>, <span class="hljs-string">'d'</span>]
</code></pre>
<p>Oops, that's not what we want! Since strings are iterable, the function unfolds the top-level string into its characters.</p>
<p>One way of preventing that is by checking whether the item is a list or not. If it isn't, we don't iterate over it.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> List, Any, Iterable


<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">flatten</span>(<span class="hljs-params">lst: List[Any]</span>) -&gt; Iterable[Any]:</span>
    <span class="hljs-string">"""Flatten a list using generators comprehensions.
        Returns a flattened version of list lst.
    """</span>

    <span class="hljs-keyword">for</span> sublist <span class="hljs-keyword">in</span> lst:
         <span class="hljs-keyword">if</span> isinstance(sublist, list):
             <span class="hljs-keyword">for</span> item <span class="hljs-keyword">in</span> sublist:
                 <span class="hljs-keyword">yield</span> item
         <span class="hljs-keyword">else</span>:
             <span class="hljs-keyword">yield</span> sublist

<span class="hljs-meta">&gt;&gt;&gt; </span>lst = [[<span class="hljs-string">'hello'</span>, <span class="hljs-string">'world'</span>], [<span class="hljs-string">'my'</span>, <span class="hljs-string">'dear'</span>], <span class="hljs-string">'friend'</span>]

<span class="hljs-meta">&gt;&gt;&gt; </span>list(flatten(lst))
[<span class="hljs-string">'hello'</span>, <span class="hljs-string">'world'</span>, <span class="hljs-string">'my'</span>, <span class="hljs-string">'dear'</span>, <span class="hljs-string">'friend'</span>]
</code></pre>
<p>Since we check whether <code>sublist</code> is a list or not, this also works when the list contains tuples or any other iterable: they are kept intact rather than unpacked.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>lst = [[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], <span class="hljs-number">3</span>, (<span class="hljs-number">4</span>, <span class="hljs-number">5</span>)]
<span class="hljs-meta">&gt;&gt;&gt; </span>list(flatten(lst))
[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, (<span class="hljs-number">4</span>, <span class="hljs-number">5</span>)]
</code></pre>
<p>Lastly, this flatten function works for lists of mixed types.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>lst = [[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], <span class="hljs-number">3</span>, (<span class="hljs-number">4</span>, <span class="hljs-number">5</span>), [<span class="hljs-string">"string"</span>], <span class="hljs-string">"hello"</span>]

<span class="hljs-meta">&gt;&gt;&gt; </span>list(flatten(lst))
[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, (<span class="hljs-number">4</span>, <span class="hljs-number">5</span>), <span class="hljs-string">'string'</span>, <span class="hljs-string">'hello'</span>]
</code></pre>
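<p>If you would rather unpack tuples too, while still leaving strings alone, one option is to widen the <code>isinstance</code> check. The sketch below is a variation of the function above, under the assumption that only lists and tuples count as containers; the name <code>flatten_lists_and_tuples</code> is made up for this example.</p>

```python
from typing import Any, Iterable, List


def flatten_lists_and_tuples(lst: List[Any]) -> Iterable[Any]:
    """Flatten one level, unpacking lists and tuples but not strings."""
    for sublist in lst:
        # isinstance accepts a tuple of types, so we can check both at once.
        if isinstance(sublist, (list, tuple)):
            yield from sublist
        else:
            yield sublist


mixed = [[1, 2], 3, (4, 5), ["string"], "hello"]
result = list(flatten_lists_and_tuples(mixed))
```

<p>With this version, <code>(4, 5)</code> is unpacked into <code>4</code> and <code>5</code>, and <code>"hello"</code> still comes out as a single item.</p>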
<h2 id="heading-how-to-flatten-a-list-and-remove-duplicates">How to flatten a list and remove duplicates</h2>
<p>To flatten a list of lists and return a list without duplicates, the simplest way is to convert the final output to a <code>set</code>.</p>
<p>There are a few downsides: a <code>set</code> does not preserve the original order of the items, every item must be hashable, and if the list is big there'll be a performance penalty since we need to create the <code>set</code> using the generator, then convert the <code>set</code> to a <code>list</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> List, Any, Iterable


<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">flatten</span>(<span class="hljs-params">lst: List[Any]</span>) -&gt; Iterable[Any]:</span>
    <span class="hljs-string">"""Flatten a list using generators comprehensions.
        Returns a flattened version of list lst.
    """</span>

    <span class="hljs-keyword">for</span> sublist <span class="hljs-keyword">in</span> lst:
         <span class="hljs-keyword">if</span> isinstance(sublist, list):
             <span class="hljs-keyword">for</span> item <span class="hljs-keyword">in</span> sublist:
                 <span class="hljs-keyword">yield</span> item
         <span class="hljs-keyword">else</span>:
             <span class="hljs-keyword">yield</span> sublist

<span class="hljs-meta">&gt;&gt;&gt; </span>lst = [[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>], <span class="hljs-number">3</span>, (<span class="hljs-number">4</span>, <span class="hljs-number">5</span>), [<span class="hljs-string">"string"</span>], <span class="hljs-string">"hello"</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-string">"hello"</span>]

<span class="hljs-meta">&gt;&gt;&gt; </span>list(set(flatten(lst)))
[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-string">'hello'</span>, <span class="hljs-number">4</span>, (<span class="hljs-number">4</span>, <span class="hljs-number">5</span>), <span class="hljs-string">'string'</span>]
</code></pre>
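<p>If the order of the items matters, one possible alternative is to deduplicate with <code>dict.fromkeys</code>, since dictionaries keep insertion order in Python 3.7+. The sketch below is only an illustration: <code>flatten_one_level</code> and <code>unique_in_order</code> are names I made up for this example, and, just like the <code>set</code> version, it assumes every item is hashable.</p>

```python
from typing import Any, Iterable, List


def flatten_one_level(lst: List[Any]) -> Iterable[Any]:
    """Yield the items of lst, unpacking sub-lists one level deep."""
    for sublist in lst:
        if isinstance(sublist, list):
            yield from sublist
        else:
            yield sublist


def unique_in_order(items: Iterable[Any]) -> List[Any]:
    """Drop duplicates while keeping the first occurrence of each item."""
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(items))


lst = [[1, 2], 3, (4, 5), ["string"], "hello", 3, 4, "hello"]
result = unique_in_order(flatten_one_level(lst))
print(result)  # [1, 2, 3, (4, 5), 'string', 'hello', 4]
```

<p>Unlike the <code>set</code> version, the result here follows the order in which the items first appear.</p>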
<h2 id="heading-flattening-a-list-of-lists-with-the-sum-function">Flattening a list of lists with the <code>sum</code> function</h2>
<p>The second strategy is a bit unconventional and, truth be told, very "magical". </p>
<p>Did you know that we can use the built-in function <code>sum</code> to create flattened lists? </p>
<p>All we need to do is pass the list as the first argument, along with an empty list as the start value. The following code snippet illustrates that.</p>
<p>If you’re curious about this approach, I discuss it in more detail in <a target="_blank" href="https://miguendes.me/5-hidden-python-features-you-probably-never-heard-of#can-you-flat-this-list">another post</a>.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> List, Any, Iterable


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">flatten_sum</span>(<span class="hljs-params">lst: List[Any]</span>) -&gt; Iterable[Any]:</span>
    <span class="hljs-string">"""Flatten a list using sum."""</span>
    <span class="hljs-keyword">return</span> sum(lst, [])


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_flatten</span>():</span>
    lst = [[<span class="hljs-number">1</span>, <span class="hljs-number">3</span>], [<span class="hljs-number">2</span>, <span class="hljs-number">5</span>], [<span class="hljs-number">1</span>]]

    <span class="hljs-keyword">assert</span> list(flatten_sum(lst)) == [<span class="hljs-number">1</span>, <span class="hljs-number">3</span>, <span class="hljs-number">2</span>, <span class="hljs-number">5</span>, <span class="hljs-number">1</span>]
</code></pre>
<p>And the tests pass too...</p>
<pre><code class="lang-console">flatten.py::test_flatten PASSED
</code></pre>
<p>Even though it seems clever, it's not a good idea to use it in production IMHO. As we'll see later, this is the worst way to flatten a list in terms of performance.</p>
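<p>To get a feeling for why <code>sum</code> struggles, it helps to spell out what it roughly does. The sketch below is my own approximation (the name <code>flatten_sum_expanded</code> is made up for this example), not the actual CPython implementation: every <code>+</code> builds a brand-new list, so the items accumulated so far get copied over and over.</p>

```python
from typing import Any, List


def flatten_sum_expanded(lst: List[List[Any]]) -> List[Any]:
    """Roughly what sum(lst, []) does, written as an explicit loop."""
    result: List[Any] = []
    for sublist in lst:
        # "+" creates a new list each time, copying everything
        # accumulated so far, so the total work grows quadratically.
        result = result + sublist
    return result


lst = [[1, 3], [2, 5], [1]]
expanded = flatten_sum_expanded(lst)
print(expanded)  # [1, 3, 2, 5, 1]
```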
<h2 id="heading-flattening-using-itertoolschain">Flattening using <code>itertools.chain</code></h2>
<p>The third alternative is to use the <code>chain</code> function from the <code>itertools</code> module. In a nutshell, <code>chain</code> creates a single iterator from a sequence of other iterables. This function is a perfect match for our use case.</p>
<p>An equivalent implementation of <code>chain</code>, taken from the <a target="_blank" href="https://docs.python.org/3/library/itertools.html#itertools.chain">official docs</a>, looks like this.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> itertools


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">chain</span>(<span class="hljs-params">*iterables</span>):</span>
    <span class="hljs-comment"># chain('ABC', 'DEF') --&gt; A B C D E F</span>
    <span class="hljs-keyword">for</span> it <span class="hljs-keyword">in</span> iterables:
        <span class="hljs-keyword">for</span> element <span class="hljs-keyword">in</span> it:
            <span class="hljs-keyword">yield</span> element
</code></pre>
<p>We can then flatten the multi-level lists like so:</p>
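<pre><code class="lang-python"><span class="hljs-keyword">import</span> itertools
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> List, Any, Iterable


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">flatten_chain</span>(<span class="hljs-params">lst: List[Any]</span>) -&gt; Iterable[Any]:</span>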
    <span class="hljs-string">"""Flatten a list using chain."""</span>
    <span class="hljs-keyword">return</span> itertools.chain(*lst)


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_flatten</span>():</span>
    lst = [[<span class="hljs-number">1</span>, <span class="hljs-number">3</span>], [<span class="hljs-number">2</span>, <span class="hljs-number">5</span>], [<span class="hljs-number">1</span>]]

    <span class="hljs-keyword">assert</span> list(flatten_chain(lst)) == [<span class="hljs-number">1</span>, <span class="hljs-number">3</span>, <span class="hljs-number">2</span>, <span class="hljs-number">5</span>, <span class="hljs-number">1</span>]
</code></pre>
<p>And the test passes...</p>
<pre><code class="lang-console">[OMITTED]
....
flatten.py::test_flatten PASSED   
...
[OMITTED]
</code></pre>
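<p>A close relative worth knowing is <code>itertools.chain.from_iterable</code>, which takes the outer list directly instead of unpacking it with <code>*</code>. The function name <code>flatten_chain_from_iterable</code> below is just one I picked for this sketch.</p>

```python
import itertools
from typing import Any, Iterable, List


def flatten_chain_from_iterable(lst: List[Any]) -> Iterable[Any]:
    """Flatten a list of lists using chain.from_iterable."""
    # No unpacking with *: the outer list is walked lazily.
    return itertools.chain.from_iterable(lst)


lst = [[1, 3], [2, 5], [1]]
flat = list(flatten_chain_from_iterable(lst))
print(flat)  # [1, 3, 2, 5, 1]
```

<p>Because nothing is unpacked up front, this version should also work when the outer iterable is itself a generator.</p>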
<h2 id="heading-flatten-a-regular-list-of-lists-with-numpy">Flatten a regular list of lists with numpy</h2>
<p>Another option to create flat lists from nested ones is to use numpy. This library is mostly used to represent and perform operations on multidimensional arrays such as 2D and 3D arrays. </p>
<p>What most people don't know is that some of its functions also work with multidimensional lists or other lists of iterables. For example, we can use the <code>numpy.concatenate</code> function to flatten a regular list of lists.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> numpy <span class="hljs-keyword">as</span> np

<span class="hljs-meta">&gt;&gt;&gt; </span>lst = [[<span class="hljs-number">1</span>, <span class="hljs-number">3</span>], [<span class="hljs-number">2</span>, <span class="hljs-number">5</span>], [<span class="hljs-number">1</span>], [<span class="hljs-number">7</span>, <span class="hljs-number">8</span>]]

<span class="hljs-meta">&gt;&gt;&gt; </span>list(np.concatenate(lst))
[<span class="hljs-number">1</span>, <span class="hljs-number">3</span>, <span class="hljs-number">2</span>, <span class="hljs-number">5</span>, <span class="hljs-number">1</span>, <span class="hljs-number">7</span>, <span class="hljs-number">8</span>]
</code></pre>
<h2 id="heading-flattening-irregular-lists">Flattening irregular lists</h2>
<p>So far we've been flattening regular lists, but what happens if we have a list like this <code>[[1, 3], [2, 5], 1]</code> or this <code>[1, [2, 3], [[4]], [], [[[[[[[[[5]]]]]]]]]]</code>? </p>
<p>Unfortunately, things don't end well if we try to apply any of the preceding approaches. In this section, we’ll see two solutions for that, one recursive and the other iterative.</p>
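<p>To make the problem concrete, here is a small sketch of what happens when <code>sum</code> and <code>itertools.chain</code> meet the first irregular list. The <code>errors</code> dictionary is just a helper for this example.</p>

```python
import itertools

irregular = [[1, 3], [2, 5], 1]
errors = {}

try:
    sum(irregular, [])
except TypeError as exc:
    # A list cannot be concatenated with an int.
    errors["sum"] = exc

try:
    # chain is lazy, so the error only shows up once we consume it.
    list(itertools.chain(*irregular))
except TypeError as exc:
    # The int 1 is not iterable.
    errors["chain"] = exc

for name, exc in errors.items():
    print(f"{name}: {exc}")
```

<p>Both calls fail with a <code>TypeError</code> as soon as they reach the bare <code>1</code>.</p>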
<h3 id="heading-the-recursive-approach">The recursive approach</h3>
<p>Solving the flattening problem recursively means iterating over each list element and deciding whether the item is already flat. If it is, we yield it; otherwise, we call <code>flatten_recursive</code> on it. Code speaks louder than words, so let’s see some code.</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> List, Any, Iterable


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">flatten_recursive</span>(<span class="hljs-params">lst: List[Any]</span>) -&gt; Iterable[Any]:</span>
    <span class="hljs-string">"""Flatten a list using recursion."""</span>
    <span class="hljs-keyword">for</span> item <span class="hljs-keyword">in</span> lst:
        <span class="hljs-keyword">if</span> isinstance(item, list):
            <span class="hljs-keyword">yield</span> <span class="hljs-keyword">from</span> flatten_recursive(item)
        <span class="hljs-keyword">else</span>:
            <span class="hljs-keyword">yield</span> item

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_flatten_recursive</span>():</span>
    lst = [[<span class="hljs-number">1</span>, <span class="hljs-number">3</span>], [<span class="hljs-number">2</span>, <span class="hljs-number">5</span>], <span class="hljs-number">1</span>]

    <span class="hljs-keyword">assert</span> list(flatten_recursive(lst)) == [<span class="hljs-number">1</span>, <span class="hljs-number">3</span>, <span class="hljs-number">2</span>, <span class="hljs-number">5</span>, <span class="hljs-number">1</span>]
</code></pre>
<p>Bear in mind that the <code>if isinstance(item, list)</code> check means it only unpacks lists. On the other hand, it will flatten lists of mixed types with no trouble.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>list(flatten_recursive(lst))
[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, (<span class="hljs-number">4</span>, <span class="hljs-number">5</span>), <span class="hljs-string">'string'</span>, <span class="hljs-string">'hello'</span>]
</code></pre>
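<p>If you also want tuples to be unpacked, one option is to make the container types configurable. This is only a sketch under my own assumptions: <code>flatten_recursive_types</code> and its <code>container_types</code> parameter are hypothetical names, not something defined earlier in this post.</p>

```python
from typing import Any, Iterable, Tuple, Type


def flatten_recursive_types(
    items: Iterable[Any],
    container_types: Tuple[Type, ...] = (list, tuple),
) -> Iterable[Any]:
    """Recursively flatten items, unpacking only the given container types."""
    for item in items:
        if isinstance(item, container_types):
            yield from flatten_recursive_types(item, container_types)
        else:
            yield item


mixed = [[1, 2], 3, (4, 5), ["string"], "hello"]
flat_mixed = list(flatten_recursive_types(mixed))
print(flat_mixed)  # [1, 2, 3, 4, 5, 'string', 'hello']
```

<p>I would avoid adding <code>str</code> to <code>container_types</code>: a one-character string iterates to itself, so the recursion would never bottom out.</p>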
<h3 id="heading-the-iterative-approach">The iterative approach</h3>
<p>The iterative approach is no doubt the most complex of all of them. </p>
<p>In this approach, we’ll use a <code>deque</code> to flatten the irregular list. Quoting the <a target="_blank" href="https://docs.python.org/3/library/collections.html#collections.deque">official docs</a>:</p>
<blockquote>
<p>"Deques are a generalization of stacks and queues (the name is pronounced “deck” and is short for “double-ended queue”). Deques support thread-safe, memory efficient appends and pops from either side of the deque with approximately the same O(1) performance in either direction." </p>
</blockquote>
<p>In other words, we can use <code>deque</code> to simulate the stacking operation of the recursive solution.</p>
<p>To do that, we’ll start just like we did in the recursive case: we’ll iterate through each element and, if the element is not a list, we’ll append it to the left of the <code>deque</code>. The <code>appendleft</code> method appends the element to the leftmost position of the <code>deque</code>, for example:</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> collections <span class="hljs-keyword">import</span> deque

<span class="hljs-meta">&gt;&gt;&gt; </span>l = deque()

<span class="hljs-meta">&gt;&gt;&gt; </span>l.appendleft(<span class="hljs-number">2</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>l
deque([<span class="hljs-number">2</span>])

<span class="hljs-meta">&gt;&gt;&gt; </span>l.appendleft(<span class="hljs-number">7</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>l
deque([<span class="hljs-number">7</span>, <span class="hljs-number">2</span>])
</code></pre>
<p>If the element <strong>is</strong> a list, though, we need to reverse it before passing it to the <code>extendleft</code> method, like this: <code>my_deque.extendleft(reversed(item))</code>. Similar to <code>extend</code> on a <code>list</code>, <code>extendleft</code> adds the items one at a time, but each one goes to the leftmost position of the <code>deque</code>. As a result, the items end up in reverse order. That’s exactly why we need to reverse the sub-list before extending left. To make things clearer, let’s see an example.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>l = deque()

<span class="hljs-meta">&gt;&gt;&gt; </span>l.extendleft([<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>])

<span class="hljs-meta">&gt;&gt;&gt; </span>l
deque([<span class="hljs-number">3</span>, <span class="hljs-number">2</span>, <span class="hljs-number">1</span>])
</code></pre>
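<p>And here is the same call with <code>reversed</code>, side by side with the plain version. The variable names are only for illustration; the <code>9</code> stands for an element that was already waiting in the <code>deque</code>.</p>

```python
from collections import deque

pending = deque([9])

# Without reversing, the sub-list ends up backwards at the front.
backwards = deque(pending)
backwards.extendleft([1, 2, 3])

# Reversing first keeps the sub-list in its original order.
in_order = deque(pending)
in_order.extendleft(reversed([1, 2, 3]))

print(backwards)  # deque([3, 2, 1, 9])
print(in_order)   # deque([1, 2, 3, 9])
```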
<p>The final step is to iterate over the <code>deque</code>, removing the leftmost element and yielding it if it’s not a list. If the element is a list, then we need to extend it left. The full implementation goes like this:</p>
<pre><code class="lang-python"><span class="hljs-keyword">from</span> collections <span class="hljs-keyword">import</span> deque
<span class="hljs-keyword">from</span> typing <span class="hljs-keyword">import</span> List, Any, Iterable


<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">flatten_deque</span>(<span class="hljs-params">lst: List[Any]</span>) -&gt; Iterable[Any]:</span>
    <span class="hljs-string">"""Flatten a list using a deque."""</span>
    q = deque()
    <span class="hljs-keyword">for</span> item <span class="hljs-keyword">in</span> lst:
        <span class="hljs-keyword">if</span> isinstance(item, list):
            q.extendleft(reversed(item))
        <span class="hljs-keyword">else</span>:
            q.appendleft(item)
        <span class="hljs-keyword">while</span> q:
            elem = q.popleft()
            <span class="hljs-keyword">if</span> isinstance(elem, list):
                q.extendleft(reversed(elem))
            <span class="hljs-keyword">else</span>:
                <span class="hljs-keyword">yield</span> elem

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_flatten_super_irregular</span>():</span>
    lst = [<span class="hljs-number">1</span>, [<span class="hljs-number">2</span>, <span class="hljs-number">3</span>], [<span class="hljs-number">4</span>], [], [[[[[[[[[<span class="hljs-number">5</span>]]]]]]]]]]

    <span class="hljs-keyword">assert</span> list(flatten_deque(lst)) == [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>, <span class="hljs-number">5</span>]
</code></pre>
<p>When we run the test, it happily passes...</p>
<pre><code class="lang-console">============================= test session starts ==============================
...

flatten.py::test_flatten_super_irregular PASSED                          [100%]

============================== 1 passed in 0.01s ===============================
</code></pre>
<p>This approach can also flatten lists of mixed types with no trouble.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>list(flatten_deque(lst))
[<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, (<span class="hljs-number">4</span>, <span class="hljs-number">5</span>), <span class="hljs-string">'string'</span>, <span class="hljs-string">'hello'</span>]
</code></pre>
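<p>One practical difference between the two approaches, as far as I can tell, is nesting depth. A recursive generator needs one extra frame per level, so a very deep list may run into Python's recursion limit (1,000 by default), while a <code>deque</code> keeps the pending work on the heap. The sketch below uses <code>flatten_deque_compact</code>, a shorter variant I wrote for this example that seeds the <code>deque</code> with the whole list up front.</p>

```python
from collections import deque
from typing import Any, Iterable, List


def flatten_deque_compact(lst: List[Any]) -> Iterable[Any]:
    """Compact variant of the deque approach (illustrative only)."""
    q = deque(lst)
    while q:
        elem = q.popleft()
        if isinstance(elem, list):
            # Put the sub-list back at the front, preserving its order.
            q.extendleft(reversed(elem))
        else:
            yield elem


# Build a list nested 5,000 levels deep: [[[ ... [0] ... ]]]
deep: List[Any] = [0]
for _ in range(5_000):
    deep = [deep]

flat_deep = list(flatten_deque_compact(deep))
print(flat_deep)  # [0]
```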
<h2 id="heading-which-method-is-faster-a-performance-comparison">Which method is faster? A performance comparison</h2>
<p>As we’ve seen in the previous section, if our multi-level list is irregular we have little choice. But assuming that this is not a frequent use case, how do these approaches compare in terms of performance?</p>
<p>In this last part, we’ll run a benchmark and compare which solutions perform best.</p>
<p>To do that, we can use the <code>timeit</code> module, which we can invoke in <code>IPython</code> with <code>%timeit [code_to_measure]</code>. The timings for each implementation are listed below. As we can see, <code>flatten_chain</code> is the fastest implementation of all: it flattened our list <code>lst</code> in <code>267 µs</code> on average. The slowest implementation is <code>flatten_sum</code>, taking around <code>42 ms</code> to flatten the same list.</p>
<p>PS: A special thanks to @hynekcer who pointed out a bug in this benchmark. Since most of the functions return a generator, we need to consume all elements in order to get a better assessment. We can either iterate over the generator or create a list out of it.</p>
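<p>Outside of IPython, the same idea can be sketched with the plain <code>timeit</code> module. The snippet below is only an illustration of why consuming the result matters; the number of repetitions is arbitrary and the timings will differ from machine to machine.</p>

```python
import itertools
import timeit

lst = [[1, 2, 3], [4, 5, 6], [7], [8, 9]] * 1_000


def flatten_chain(lst):
    """Flatten a list using chain (returns a lazy iterator)."""
    return itertools.chain(*lst)


# Only builds the lazy iterator; no element is actually produced.
lazy_time = timeit.timeit(lambda: flatten_chain(lst), number=100)

# Consumes the iterator, so the flattening work is really measured.
consumed_time = timeit.timeit(lambda: list(flatten_chain(lst)), number=100)

print(f"not consumed: {lazy_time:.6f}s, consumed: {consumed_time:.6f}s")
```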
<h4 id="heading-flatten-generator-comprehension">Flatten generator comprehension</h4>
<pre><code class="lang-console">In [1]: lst = [[1, 2, 3], [4, 5, 6], [7], [8, 9]] * 1_000

In [2]: %timeit list(flatten_gen_comp(lst))
615 µs ± 2.74 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
</code></pre>
<h4 id="heading-flatten-sum">Flatten sum</h4>
<pre><code class="lang-console">In [3]: %timeit list(flatten_sum(lst))
42 ms ± 660 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
</code></pre>
<h4 id="heading-flatten-chain">Flatten chain</h4>
<pre><code class="lang-console">In [4]: %timeit list(flatten_chain(lst))
267 µs ± 517 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
</code></pre>
<h4 id="heading-flatten-numpy">Flatten numpy</h4>
<pre><code class="lang-console">In [5]: %timeit list(flatten_numpy(lst))
4.65 ms ± 14.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
</code></pre>
<h4 id="heading-flatten-recursive">Flatten recursive</h4>
<pre><code class="lang-console">In [6]: %timeit list(flatten_recursive(lst))
3.02 ms ± 174 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
</code></pre>
<h4 id="heading-flatten-deque">Flatten deque</h4>
<pre><code class="lang-console">In [7]: %timeit list(flatten_deque(lst))
2.97 ms ± 21.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
</code></pre>
<h2 id="heading-conclusion">Conclusion</h2>
<p>We can flatten a multi-level list in several different ways with Python. Each approach has its pros and cons. In this post we looked at 5 different ways to create 1D lists from nested 2D lists, including:</p>
<ul>
<li>regular nested lists</li>
<li>irregular lists</li>
<li>list of mixed types</li>
<li>list of strings</li>
<li>list of tuples or ints</li>
<li>recursive and iterative approach</li>
<li>using the itertools module</li>
<li>removing duplicates</li>
</ul>
<p>Other posts you may like:</p>
<ul>
<li><a target="_blank" href="https://miguendes.me/everything-you-need-to-know-about-pythons-namedtuples">Everything You Need to Know About Python's Namedtuples</a></li>
<li><a target="_blank" href="https://miguendes.me/the-best-way-to-compare-two-dictionaries-in-python">The Best Way to Compare Two Dictionaries in Python</a></li>
<li><a target="_blank" href="https://miguendes.me/73-examples-to-help-you-master-pythons-f-strings">Python's f-strings: 73 Examples to Help You Master It</a></li>
<li><a target="_blank" href="https://miguendes.me/design-patterns-that-make-sense-in-python-simple-factory">Design Patterns That Make Sense in Python: Simple Factory</a></li>
<li><a target="_blank" href="https://miguendes.me/how-to-pass-multiple-arguments-to-a-map-function-in-python">How to Pass Multiple Arguments to a map Function in Python</a></li>
<li><a target="_blank" href="https://miguendes.me/3-ways-to-test-api-client-applications-in-python">3 Ways to Unit Test REST APIs in Python</a></li>
</ul>
<p>See you next time!</p>
]]></content:encoded></item><item><title><![CDATA[Design Patterns That Make Sense in Python: Simple Factory]]></title><description><![CDATA[In the first post of this series, I'll talk about Design Patterns that make sense in Python. We'll see how to implement them and how they are used in the standard library and other third-party packages. We'll see what is and how we can use the Simple...]]></description><link>https://miguendes.me/design-patterns-that-make-sense-in-python-simple-factory</link><guid isPermaLink="true">https://miguendes.me/design-patterns-that-make-sense-in-python-simple-factory</guid><category><![CDATA[Python]]></category><category><![CDATA[design patterns]]></category><category><![CDATA[Object Oriented Programming]]></category><category><![CDATA[best practices]]></category><dc:creator><![CDATA[Miguel Brito]]></dc:creator><pubDate>Sat, 05 Dec 2020 15:52:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1607182915432/YDVqXqaHy.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In the first post of this series, I'll talk about Design Patterns that make sense in Python. We'll see how to implement them and how they are used in the standard library and other third-party packages. We'll see what is and how we can use the Simple Factory Pattern. Not only that, by the end of this article you will be able to understand the problems it solves and why it makes sense in Python.</p>
<p><iframe src="https://giphy.com/embed/1kowbKFzLQqXu" width="480" height="282" class="giphy-embed"></iframe></p><p><a href="https://giphy.com/gifs/loop-1kowbKFzLQqXu">via GIPHY</a></p><p></p>
<h2 id="heading-introduction">Introduction</h2>
<p>Design patterns have been a popular subject since the book Design Patterns: Elements of Reusable Object-Oriented Software (a.k.a. GoF) was released back in 1994. GoF’s goal was to show techniques, a.k.a. patterns, to improve object-oriented design. In total, the book describes 23 patterns, classified into 3 groups:</p>
<ul>
<li><p>Creational</p>
</li>
<li><p>Behavioral</p>
</li>
<li><p>Structural</p>
</li>
</ul>
<p>Among the creational patterns, we have the Factory Method. According to the book, the goal of this pattern is to define an interface to create an object. The subclasses will then decide which class will be instantiated. There’s also another variation called Simple Factory. This pattern creates an instance of an object without exposing the details behind the construction. In this article, we’ll see how to do that in Python in an idiomatic way.</p>
<blockquote>
<p>When is this useful? Can't we just call the constructor directly?</p>
</blockquote>
<p>This pattern is helpful when you need to perform some extra setup before calling a constructor. In the next section we’ll see several examples of how it is used in the Python standard library and also in third-party packages such as <a target="_blank" href="https://pandas.pydata.org/">pandas</a>.</p>
<h2 id="heading-usage">Usage</h2>
<p>In this part, we’ll see how this pattern is used in practice and how you can implement it yourself.</p>
<h3 id="heading-python-standard-library">Python Standard Library</h3>
<p>The <code>datetime</code> module is one of the most important ones in the standard library. It defines a few classes such as <code>date</code>, <code>datetime</code>, and <code>timedelta</code>. This module uses the simple factory pattern extensively. A real example is the <code>date</code> class. It has a method called <code>fromtimestamp</code> that creates <code>date</code> instances given a timestamp. </p>
<pre><code class="lang-python">In [<span class="hljs-number">2</span>]: <span class="hljs-keyword">import</span> time

In [<span class="hljs-number">3</span>]: <span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> date

In [<span class="hljs-number">4</span>]: date.fromtimestamp(time.time())
Out[<span class="hljs-number">4</span>]: datetime.date(<span class="hljs-number">2020</span>, <span class="hljs-number">11</span>, <span class="hljs-number">10</span>)
</code></pre>
<p>If we look at the implementation, we can see that it extracts the year, month and day from the result of <code>_time.localtime(t)</code> and then calls the constructor (<code>cls</code>). This is the kind of setup that is abstracted away from the user.</p>
<pre><code class="lang-python"><span class="hljs-meta">    @classmethod</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fromtimestamp</span>(<span class="hljs-params">cls, t</span>):</span>
        <span class="hljs-string">"Construct a date from a POSIX timestamp (like time.time())."</span>
        y, m, d, hh, mm, ss, weekday, jday, dst = _time.localtime(t)
        <span class="hljs-keyword">return</span> cls(y, m, d)
</code></pre>
<p>Another great example is the <code>fromisocalendar</code> method, which performs an extensive setup. Instead of leaving it to the user, the class provides the functionality “for free” by hiding that from you.</p>
<pre><code class="lang-python"><span class="hljs-comment"># https://github.com/python/cpython/blob/c304c9a7efa8751b5bc7526fa95cd5f30aac2b92/Lib/datetime.py#L860-L893</span>
...
<span class="hljs-meta">    @classmethod</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">fromisocalendar</span>(<span class="hljs-params">cls, year, week, day</span>):</span>
        <span class="hljs-string">"""Construct a date from the ISO year, week number and weekday.
        This is the inverse of the date.isocalendar() function"""</span>
        <span class="hljs-comment"># Year is bounded this way because 9999-12-31 is (9999, 52, 5)</span>
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> MINYEAR &lt;= year &lt;= MAXYEAR:
            <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"Year is out of range: <span class="hljs-subst">{year}</span>"</span>)

        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> <span class="hljs-number">0</span> &lt; week &lt; <span class="hljs-number">53</span>:
            out_of_range = <span class="hljs-literal">True</span>

            <span class="hljs-keyword">if</span> week == <span class="hljs-number">53</span>:
                <span class="hljs-comment"># ISO years have 53 weeks in them on years starting with a</span>
                <span class="hljs-comment"># Thursday and leap years starting on a Wednesday</span>
                first_weekday = _ymd2ord(year, <span class="hljs-number">1</span>, <span class="hljs-number">1</span>) % <span class="hljs-number">7</span>
                <span class="hljs-keyword">if</span> (first_weekday == <span class="hljs-number">4</span> <span class="hljs-keyword">or</span> (first_weekday == <span class="hljs-number">3</span> <span class="hljs-keyword">and</span>
                                           _is_leap(year))):
                    out_of_range = <span class="hljs-literal">False</span>

            <span class="hljs-keyword">if</span> out_of_range:
                <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"Invalid week: <span class="hljs-subst">{week}</span>"</span>)

        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> <span class="hljs-number">0</span> &lt; day &lt; <span class="hljs-number">8</span>:
            <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">f"Invalid weekday: <span class="hljs-subst">{day}</span> (range is [1, 7])"</span>)

        <span class="hljs-comment"># Now compute the offset from (Y, 1, 1) in days:</span>
        day_offset = (week - <span class="hljs-number">1</span>) * <span class="hljs-number">7</span> + (day - <span class="hljs-number">1</span>)

        <span class="hljs-comment"># Calculate the ordinal day for monday, week 1</span>
        day_1 = _isoweek1monday(year)
        ord_day = day_1 + day_offset

        <span class="hljs-keyword">return</span> cls(*_ord2ymd(ord_day))
....
</code></pre>
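<p>From the caller's side, all of that validation boils down to a single call. Here is a short usage sketch, assuming Python 3.8+ (where <code>date.fromisocalendar</code> is available); the variable names are mine.</p>

```python
from datetime import date

# ISO year 2020, week 46, weekday 2 (Tuesday).
d = date.fromisocalendar(2020, 46, 2)
print(d)  # 2020-11-10

# isocalendar() is the inverse operation.
round_trip = tuple(d.isocalendar())
print(round_trip)  # (2020, 46, 2)

# 2021 has only 52 ISO weeks, so week 53 is rejected.
week_error = None
try:
    date.fromisocalendar(2021, 53, 1)
except ValueError as exc:
    week_error = exc

print(week_error)
```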
<h3 id="heading-pandas">Pandas</h3>
<p><code>pandas</code> is one of the most used Python packages thanks to the rise of Data Science and Machine Learning. Just like Python, <code>pandas</code> also makes use of factory methods. A classic example is the <code>from_dict</code> method that belongs to the <code>DataFrame</code> class.</p>
<pre><code class="lang-python">        &gt;&gt;&gt; data = {<span class="hljs-string">'row_1'</span>: [<span class="hljs-number">3</span>, <span class="hljs-number">2</span>, <span class="hljs-number">1</span>, <span class="hljs-number">0</span>], <span class="hljs-string">'row_2'</span>: [<span class="hljs-string">'a'</span>, <span class="hljs-string">'b'</span>, <span class="hljs-string">'c'</span>, <span class="hljs-string">'d'</span>]}
        &gt;&gt;&gt; pd.DataFrame.from_dict(data, orient=<span class="hljs-string">'index'</span>)
               <span class="hljs-number">0</span>  <span class="hljs-number">1</span>  <span class="hljs-number">2</span>  <span class="hljs-number">3</span>
        row_1  <span class="hljs-number">3</span>  <span class="hljs-number">2</span>  <span class="hljs-number">1</span>  <span class="hljs-number">0</span>
        row_2  a  b  c  d
</code></pre>
<p>When we inspect the implementation we can also see a lot of setup and extra checks.</p>
<pre><code class="lang-python"><span class="hljs-meta">    @classmethod</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">from_dict</span>(<span class="hljs-params">cls, data, orient=<span class="hljs-string">"columns"</span>, dtype=None, columns=None</span>) -&gt; DataFrame:</span>
        ...
        index = <span class="hljs-literal">None</span>
        orient = orient.lower()
        <span class="hljs-keyword">if</span> orient == <span class="hljs-string">"index"</span>:
            <span class="hljs-keyword">if</span> len(data) &gt; <span class="hljs-number">0</span>:
                <span class="hljs-comment"># TODO speed up Series case</span>
                <span class="hljs-keyword">if</span> isinstance(list(data.values())[<span class="hljs-number">0</span>], (Series, dict)):
                    data = _from_nested_dict(data)
                <span class="hljs-keyword">else</span>:
                    data, index = list(data.values()), list(data.keys())
        <span class="hljs-keyword">elif</span> orient == <span class="hljs-string">"columns"</span>:
            <span class="hljs-keyword">if</span> columns <span class="hljs-keyword">is</span> <span class="hljs-keyword">not</span> <span class="hljs-literal">None</span>:
                <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">"cannot use columns parameter with orient='columns'"</span>)
        <span class="hljs-keyword">else</span>:  <span class="hljs-comment"># pragma: no cover</span>
            <span class="hljs-keyword">raise</span> ValueError(<span class="hljs-string">"only recognize index or columns for orient"</span>)

        <span class="hljs-keyword">return</span> cls(data, index=index, columns=columns, dtype=dtype)
</code></pre>
<h3 id="heading-how-to-implement-it">How to Implement It</h3>
<p>The most idiomatic way of implementing factory methods in Python is by decorating them with <code>classmethod</code>. In Python, regular methods are attached to an object instance. We can access the object's fields via the <code>self</code> argument. Class methods, on the other hand, are bound not to an instance but to the class itself. That means when we call <code>MyClass.factory_method</code> we are passing <code>MyClass</code> as the first argument, called <code>cls</code>. This property makes them an excellent fit for factory methods, since calling <code>cls(args)</code> inside a <code>classmethod</code> is the same as calling <code>MyClass(args)</code>.</p>
<p>To design your own factory method, it’s sufficient to decorate it with <code>classmethod</code> and return a new instance built with the <code>cls</code> argument. For example, suppose we want to implement a <code>Point</code> class that can also be constructed from polar coordinates. The extra setup to convert from polar to Cartesian is kept inside the method. Not only is this more readable, it also keeps the constructor simple.</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> math


<span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Point</span>:</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, x: float, y: float</span>):</span>
        self.x = x
        self.y = y

<span class="hljs-meta">    @classmethod</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">from_polar</span>(<span class="hljs-params">cls, r: float, theta: float</span>) -&gt; "Point":</span>
        <span class="hljs-string">"""
        Converts a polar coordinate into cartesian point.

        &gt;&gt;&gt; Point.from_polar(r=-2**0.5, theta=math.pi / 4)
        Point(x=-1.00, y=-1.00)
        """</span>
        <span class="hljs-keyword">return</span> cls(r * math.cos(theta), r * math.sin(theta))

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__repr__</span>(<span class="hljs-params">self</span>):</span>
        <span class="hljs-keyword">return</span> <span class="hljs-string">f"<span class="hljs-subst">{self.__class__.__name__}</span>(x=<span class="hljs-subst">{self.x:<span class="hljs-number">.2</span>f}</span>, y=<span class="hljs-subst">{self.y:<span class="hljs-number">.2</span>f}</span>)"</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>Point.from_polar(r=<span class="hljs-number">-2</span>**<span class="hljs-number">0.5</span>, theta=math.pi / <span class="hljs-number">4</span>)
Point(x=<span class="hljs-number">-1.00</span>, y=<span class="hljs-number">-1.00</span>)
</code></pre>
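<p>A nice side effect of building the instance with <code>cls</code> instead of hard-coding the class name is that subclasses get the factory for free. The sketch below illustrates this with <code>Pixel</code>, a hypothetical subclass I made up for this example.</p>

```python
import math


class Point:
    def __init__(self, x: float, y: float):
        self.x = x
        self.y = y

    @classmethod
    def from_polar(cls, r: float, theta: float) -> "Point":
        # cls is whichever class the method was called on.
        return cls(r * math.cos(theta), r * math.sin(theta))


class Pixel(Point):
    """Hypothetical subclass used only for this illustration."""


point = Point.from_polar(r=1.0, theta=0.0)
pixel = Pixel.from_polar(r=1.0, theta=0.0)

print(type(point).__name__)  # Point
print(type(pixel).__name__)  # Pixel
```

<p>Had we written <code>return Point(...)</code> inside <code>from_polar</code>, <code>Pixel.from_polar</code> would have returned a plain <code>Point</code> instead.</p>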
<h3 id="heading-conclusion">Conclusion</h3>
<p>That's pretty much it! I hope you’ve learned something different and useful. Simple Factory methods are very cool and can abstract away a lot of boilerplate. Not to mention that they make your code clean and readable. In this post I showed how this pattern is used in the standard library and in other packages such as <code>pandas</code>. </p>
<p>Other posts you may like:</p>
<ul>
<li><a target="_blank" href="https://miguendes.me/how-to-pass-multiple-arguments-to-a-map-function-in-python">How to Pass Multiple Arguments to a map Function in Python</a></li>
<li><a target="_blank" href="https://miguendes.me/73-examples-to-help-you-master-pythons-f-strings">73 Examples to Help You Master Python's f-strings</a></li>
<li><a target="_blank" href="https://miguendes.me/how-to-check-if-an-exception-is-raised-or-not-with-pytest">How to Check if an Exception Is Raised (or Not) With pytest</a></li>
<li><a target="_blank" href="https://miguendes.me/3-ways-to-test-api-client-applications-in-python">3 Ways to Test API Client Applications in Python</a></li>
<li><a target="_blank" href="https://miguendes.me/everything-you-need-to-know-about-pythons-namedtuples">Everything You Need to Know About Python's Namedtuples</a></li>
<li><a target="_blank" href="https://miguendes.me/the-best-way-to-compare-two-dictionaries-in-python">The Best Way to Compare Two Dictionaries in Python</a></li>
<li><a target="_blank" href="https://miguendes.me/5-hidden-python-features-you-probably-never-heard-of">5 Hidden Python Features You Probably Never Heard Of</a></li>
<li><a target="_blank" href="https://miguendes.me/7-pytest-features-and-plugins-that-will-save-you-tons-of-time">7 pytest Features and Plugins That Will Save You Tons of Time</a></li>
</ul>
<p>See you next time!</p>
]]></content:encoded></item><item><title><![CDATA[How to Pass Multiple Arguments to a map Function in Python]]></title><description><![CDATA[Introduction
The map() function is everywhere in Python. It's a built in, it's part of the concurrent.futures.Executor, and also multiprocessing.Pool; but... it's limited!
It's limited because you cannot pass multiple arguments to it. However, what i...]]></description><link>https://miguendes.me/how-to-pass-multiple-arguments-to-a-map-function-in-python</link><guid isPermaLink="true">https://miguendes.me/how-to-pass-multiple-arguments-to-a-map-function-in-python</guid><category><![CDATA[Python]]></category><category><![CDATA[map]]></category><category><![CDATA[Functional Programming]]></category><category><![CDATA[multithreading]]></category><dc:creator><![CDATA[Miguel Brito]]></dc:creator><pubDate>Sat, 28 Nov 2020 10:06:29 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1606556144395/QZFZMJ76B.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="introduction">Introduction</h2>
<p>The <code>map()</code> function is everywhere in Python. It's a built-in, and it's also available as a method of <code>concurrent.futures.Executor</code> and <code>multiprocessing.Pool</code>; but... it's limited!</p>
<p>It's limited because, at first glance, there's no way to map a function that takes multiple arguments. However, what if I told you that there are some easy ways to do that?</p>
<p>In this post, I’m going to show what you can do to map a function that expects multiple arguments. By the end of this article, you'll know:</p>
<ul>
<li><a class="post-section-overview" href="#what-is-a-map-function-and-the-problem-with-it">what a map function is and the problem with it</a></li>
<li><a class="post-section-overview" href="#solution-1-mapping-multiple-arguments-with-itertoolsstarmap">how to map two or more arguments with <code>itertools.starmap()</code></a></li>
<li><a class="post-section-overview" href="#solution-2-using-functoolspartial-to-freeze-the-arguments">how to use <code>functools.partial</code> to "freeze" and pass multiple arguments to map</a></li>
<li><a class="post-section-overview" href="#solution-3-mapping-multiple-arguments-by-repeating-them">the way to map multiple arguments by "repeating" them</a></li>
<li><a class="post-section-overview" href="#problem-2-passing-multiple-parameters-to-multiprocessing-poolmap">how to pass multiple args to multiprocessing <code>pool.map</code></a></li>
<li><a class="post-section-overview" href="#problem-3-how-to-pass-multiple-arguments-to-concurrent-futures-executormap">how to pass multiple arguments to a concurrent futures ProcessPoolExecutor (or ThreadPoolExecutor)</a></li>
</ul>
<p>Let's go!</p>
<h2 id="what-is-a-map-function-and-the-problem-with-it">What Is a Map Function and the Problem With It</h2>
<p><code>map()</code> is a function that expects a function and one or more iterables as arguments.</p>
<p>For each item in these iterables, <code>map</code> applies the function passed as argument. The result is an iterator where each element is produced by the function you provided as argument. If you pass multiple iterables, you must pass a function that accepts that many arguments.</p>
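<p>As a quick illustration of both call styles, here is a minimal sketch. The helper names <code>double</code> and <code>add</code> are made up for this example.</p>
<pre><code class="lang-python">def double(x):
    return x * 2

def add(x, y):
    return x + y

# One iterable: the function receives one item per call.
doubled = list(map(double, [1, 2, 3]))

# Two iterables: the function receives one item from each per call.
added = list(map(add, [1, 2, 3], [10, 20, 30]))

print(doubled)  # [2, 4, 6]
print(added)    # [11, 22, 33]
</code></pre>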
<h3 id="the-problem">The Problem</h3>
<p>Let’s imagine that you have a function called <code>sum_four</code> that takes 4 arguments and returns their sum.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">sum_four</span>(<span class="hljs-params">a, b, c, d</span>):</span>
        <span class="hljs-keyword">return</span> a + b + c + d
</code></pre>
<p>Let’s also suppose that you are solving a very specific problem that requires the first 3 arguments to be fixed. In this problem, you want to compare how the function behaves when you vary only the last parameter.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>a, b, c = <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>sum_four(a=a, b=b, c=c, d=<span class="hljs-number">1</span>)
 <span class="hljs-number">7</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>sum_four(a=a, b=b, c=c, d=<span class="hljs-number">2</span>)
 <span class="hljs-number">8</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>sum_four(a=a, b=b, c=c, d=<span class="hljs-number">3</span>)
 <span class="hljs-number">9</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>sum_four(a=a, b=b, c=c, d=<span class="hljs-number">4</span>)
 <span class="hljs-number">10</span>
</code></pre>
<p>Now, say that you want to use <code>map</code>, because you like functional programming, or maybe because you come from a language that encourages this paradigm. </p>
<p>Since only <code>d</code> varies, we could store all the values of <code>d</code> we want to test in a list, like this: <code>all_d_values = [1, 2, 3, 4]</code>.</p>
<p>The issue is that, given a single list, <code>map</code> passes only one element to the function on each call, while <code>sum_four</code> expects four arguments. What can you do?</p>
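<p>To see the problem concretely, the sketch below maps <code>sum_four</code> over the list directly and captures the resulting error. The variable name <code>error_message</code> is only there for illustration.</p>
<pre><code class="lang-python">def sum_four(a, b, c, d):
    return a + b + c + d

all_d_values = [1, 2, 3, 4]

try:
    # map is lazy, so the error only shows up when the result is consumed.
    list(map(sum_four, all_d_values))
except TypeError as exc:
    error_message = str(exc)

# Python complains that three positional arguments are missing.
print(error_message)
</code></pre>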
<h3 id="solution-1-mapping-multiple-arguments-with-itertoolsstarmap">Solution 1 - Mapping Multiple Arguments with <code>itertools.starmap()</code></h3>
<p>The first solution is to <em>not</em> adopt the <code>map</code> function but use <code>itertools.starmap</code> instead. This function takes a function and an iterable of tuples as arguments. Then, <code>starmap</code> iterates over each tuple <code>t</code> and calls the function by unpacking the arguments, like this: <code>for t in tuples: function(*t)</code>.</p>
<p>To make things clearer, consider the following example.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> itertools

<span class="hljs-meta">&gt;&gt;&gt; </span>all_d_values = [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>]

<span class="hljs-meta">&gt;&gt;&gt; </span>items = ((a, b, c, d) <span class="hljs-keyword">for</span> d <span class="hljs-keyword">in</span> all_d_values)

<span class="hljs-meta">&gt;&gt;&gt; </span>list(items)  <span class="hljs-comment"># careful: this exhausts the generator</span>
 [(<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">1</span>), (<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">2</span>), (<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">3</span>), (<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>)]

<span class="hljs-meta">&gt;&gt;&gt; </span>items = ((a, b, c, d) <span class="hljs-keyword">for</span> d <span class="hljs-keyword">in</span> all_d_values)  <span class="hljs-comment"># so we create it again</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>list(itertools.starmap(sum_four, items))
 [<span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>, <span class="hljs-number">10</span>]
</code></pre>
<p>As you can see, there’s a lot of repetition, which can consume a lot of memory if the list is big. To avoid that, I made <code>items</code> a generator, so we only hold in memory the element being processed. Keep in mind that a generator can be consumed only once, so it has to be re-created before each new pass.</p>
<h3 id="solution-2-using-functoolspartial-to-freeze-the-arguments">Solution 2 - Using <code>functools.partial</code> to “Freeze” the Arguments</h3>
<p>The second solution is to use partial application and create a new partial function. According to the docs, <a target="_blank" href="https://docs.python.org/3/library/functools.html#functools.partial"><code>partial()</code></a> will "freeze" some portion of a function’s arguments and/or keywords, resulting in a new object with a simplified signature.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> functools

<span class="hljs-meta">&gt;&gt;&gt; </span>partial_sum_four = functools.partial(sum_four, a, b, c)

<span class="hljs-meta">&gt;&gt;&gt; </span>partial_sum_four(<span class="hljs-number">3</span>)
<span class="hljs-number">9</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>list(map(partial_sum_four, all_d_values))
[<span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>, <span class="hljs-number">10</span>]
</code></pre>
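<p>Positional freezing only covers the leading parameters. When the value that varies is <em>not</em> the last parameter, <code>partial</code> can freeze the others by keyword instead. Here is a minimal sketch; the <code>power</code> function is a made-up example, chosen because its argument order matters.</p>
<pre><code class="lang-python">import functools

def power(base, exponent):
    return base ** exponent

# Freeze the second parameter by keyword; map then supplies base positionally.
square = functools.partial(power, exponent=2)

squares = list(map(square, [1, 2, 3, 4]))
print(squares)  # [1, 4, 9, 16]
</code></pre>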
<h3 id="solution-3-mapping-multiple-arguments-by-repeating-them">Solution 3 - Mapping Multiple Arguments by "Repeating" Them</h3>
<p>The third alternative is to use <a target="_blank" href="https://docs.python.org/3/library/itertools.html#itertools.repeat"><code>itertools.repeat()</code></a>.</p>
<p>This function produces an iterator that returns the same object over and over again. It runs indefinitely if you don’t specify the <code>times</code> argument.</p>
<p>If we take a closer look at <code>map()</code>'s  <a target="_blank" href="https://docs.python.org/3/library/functions.html#map">signature</a>, it accepts a function and multiple iterables, <code>map(function, iterable, ...)</code>. </p>
<p>According to its description, </p>
<blockquote>
<p>If additional iterable arguments are passed, function must take that many arguments and is applied to the items from all iterables in parallel. With multiple iterables, the iterator stops when the shortest iterable is exhausted.</p>
</blockquote>
<p>Bingo! We can turn <code>a</code>, <code>b</code> and <code>c</code> into infinite iterables by using <code>itertools.repeat()</code>. As soon as <code>all_d_values</code>, which is the shortest iterable, is exhausted, <code>map()</code> will stop.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> itertools
<span class="hljs-meta">&gt;&gt;&gt; </span>list(map(sum_four, itertools.repeat(a), itertools.repeat(b), itertools.repeat(c), all_d_values))
 [<span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>, <span class="hljs-number">10</span>]
</code></pre>
<p>To put it another way, using <code>repeat()</code> is roughly equivalent to:</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>list(map(sum_four, [<span class="hljs-number">1</span>, <span class="hljs-number">1</span>, <span class="hljs-number">1</span>, <span class="hljs-number">1</span>], [<span class="hljs-number">2</span>, <span class="hljs-number">2</span>, <span class="hljs-number">2</span>, <span class="hljs-number">2</span>], [<span class="hljs-number">3</span>, <span class="hljs-number">3</span>, <span class="hljs-number">3</span>, <span class="hljs-number">3</span>], all_d_values))
 [<span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>, <span class="hljs-number">10</span>]
</code></pre>
<p>You don't need to worry too much about memory, as <code>repeat</code> produces the elements on the go. In fact, it returns a lazy <code>itertools.repeat</code> object (a <code>repeatobject</code> in the C source), not a <code>list</code> <a target="_blank" href="https://github.com/python/cpython/blob/3.9/Modules/itertoolsmodule.c#L4226">[ref]</a>.</p>
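<p>If you want to poke at <code>repeat()</code> on its own, the sketch below shows the finite form with <code>times</code> and a safe way to peek at the infinite form. The variable names are just for illustration.</p>
<pre><code class="lang-python">import itertools

# With times, repeat() is finite and safe to turn into a list.
three_ones = list(itertools.repeat(1, 3))

# Without times it never ends, so take a slice instead of calling list() on it.
first_five = list(itertools.islice(itertools.repeat("x"), 5))

print(three_ones)  # [1, 1, 1]
print(first_five)  # ['x', 'x', 'x', 'x', 'x']
</code></pre>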
<h2 id="problem-2-passing-multiple-parameters-to-multiprocessing-poolmap">Problem 2: Passing Multiple Parameters to multiprocessing <code>Pool.map</code></h2>
<p>This problem is very similar to using the regular <code>map()</code>. The only difference is that we need to pass multiple arguments to <code>multiprocessing</code>'s <code>Pool.map</code>.</p>
<p>Suppose that we want to speed up our code and run <code>sum_four</code> in parallel using processes. </p>
<p>The good news is, you can use the solutions above, with one exception: <code>Pool.map</code> only accepts one iterable. This means we cannot use <code>repeat()</code> here. Let's see the alternatives.</p>
<h3 id="using-poolstarmap">Using <code>pool.starmap</code></h3>
<p>The <code>Pool</code> class from the <code>multiprocessing</code> module implements a <code>starmap</code> method that works the same way as its counterpart from the <code>itertools</code> module, except that it returns a list.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> multiprocessing <span class="hljs-keyword">import</span> Pool

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> itertools

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">sum_four</span>(<span class="hljs-params">a, b, c, d</span>):</span>
                <span class="hljs-keyword">return</span> a + b + c + d

<span class="hljs-meta">&gt;&gt;&gt; </span>a, b, c = <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>all_d_values = [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>]

<span class="hljs-meta">&gt;&gt;&gt; </span>items = [(a, b, c, d) <span class="hljs-keyword">for</span> d <span class="hljs-keyword">in</span> all_d_values]

<span class="hljs-meta">&gt;&gt;&gt; </span>items
 [(<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">1</span>), (<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">2</span>), (<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">3</span>), (<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>)]

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">with</span> Pool(processes=<span class="hljs-number">4</span>) <span class="hljs-keyword">as</span> pool:
         res = pool.starmap(sum_four, items)

<span class="hljs-meta">&gt;&gt;&gt; </span>res
 [<span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>, <span class="hljs-number">10</span>]
</code></pre>
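<p>If you like the <code>repeat()</code> idea from Solution 3, it can still feed a <code>starmap</code>: <code>zip</code> merges the repeated values and the varying ones into the tuples that <code>starmap</code> expects. The sketch below uses <code>itertools.starmap</code> so that it runs in a single process. That is a simplification on my part, but the list of tuples it builds has the same shape as the <code>items</code> list passed to <code>pool.starmap</code> above.</p>
<pre><code class="lang-python">import itertools

def sum_four(a, b, c, d):
    return a + b + c + d

a, b, c = 1, 2, 3
all_d_values = [1, 2, 3, 4]

# zip stops at the shortest iterable, so the infinite repeats are harmless.
items = list(zip(itertools.repeat(a), itertools.repeat(b), itertools.repeat(c), all_d_values))

results = list(itertools.starmap(sum_four, items))

print(items)    # [(1, 2, 3, 1), (1, 2, 3, 2), (1, 2, 3, 3), (1, 2, 3, 4)]
print(results)  # [7, 8, 9, 10]
</code></pre>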
<h3 id="using-partial">Using <code>partial()</code></h3>
<p>As an alternative, we can also rely on the good old <code>partial</code> function.</p>
<pre><code class="lang-python">
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> functools

<span class="hljs-meta">&gt;&gt;&gt; </span>partial_sum_four = functools.partial(sum_four, a, b, c)

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">with</span> Pool(processes=<span class="hljs-number">4</span>) <span class="hljs-keyword">as</span> pool:
         res = pool.map(partial_sum_four, all_d_values)

<span class="hljs-meta">&gt;&gt;&gt; </span>res
 [<span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>, <span class="hljs-number">10</span>]
</code></pre>
<h2 id="problem-3-how-to-pass-multiple-arguments-to-concurrent-futures-executormap">Problem 3: How to Pass Multiple Arguments to concurrent futures <code>Executor.map</code>?</h2>
<p>The <a target="_blank" href="https://docs.python.org/3/library/concurrent.futures.html"><code>concurrent.futures</code></a> module provides a high-level interface called <code>Executor</code> to run callables  asynchronously. </p>
<p>There are two different implementations available, a <code>ThreadPoolExecutor</code> and a <code>ProcessPoolExecutor</code>. </p>
<p>Contrary to <code>multiprocessing.Pool</code>, an <code>Executor</code> does not have a <code>starmap()</code> method. However, its <code>map()</code> implementation supports multiple iterables, which allows us to use <code>repeat()</code>. Another difference is that <code>Executor.map</code> returns a generator, not a list.</p>
<h3 id="using-partial-with-a-processpoolexecutor-or-threadpoolexecutor">Using <code>partial()</code> With a ProcessPoolExecutor (or ThreadPoolExecutor)</h3>
<p>By "freezing" the arguments using <code>partial</code>, we can use the <code>map</code> method from <code>ProcessPoolExecutor</code> like a regular map function. Since both executors share the same interface, you can do the same with a <code>ThreadPoolExecutor</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> concurrent.futures <span class="hljs-keyword">import</span> ProcessPoolExecutor

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> functools

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">sum_four</span>(<span class="hljs-params">a, b, c, d</span>):</span>
                <span class="hljs-keyword">return</span> a + b + c + d

<span class="hljs-meta">&gt;&gt;&gt; </span>a, b, c = <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>all_d_values = [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>]

<span class="hljs-meta">&gt;&gt;&gt; </span>partial_sum_four = functools.partial(sum_four, a, b, c)

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">with</span> ProcessPoolExecutor(max_workers=<span class="hljs-number">4</span>) <span class="hljs-keyword">as</span> pool:
              res = list(pool.map(partial_sum_four, all_d_values))

<span class="hljs-meta">&gt;&gt;&gt; </span>res
 [<span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>, <span class="hljs-number">10</span>]
</code></pre>
<h3 id="using-repeat">Using <code>repeat()</code></h3>
<p>Again, we can use <code>itertools.repeat</code> to get the job done, as in the previous solutions.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> concurrent.futures <span class="hljs-keyword">import</span> ProcessPoolExecutor

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> itertools <span class="hljs-keyword">import</span> repeat

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">sum_four</span>(<span class="hljs-params">a, b, c, d</span>):</span>
                <span class="hljs-keyword">return</span> a + b + c + d

<span class="hljs-meta">&gt;&gt;&gt; </span>a, b, c = <span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>all_d_values = [<span class="hljs-number">1</span>, <span class="hljs-number">2</span>, <span class="hljs-number">3</span>, <span class="hljs-number">4</span>]

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">with</span> ProcessPoolExecutor(max_workers=<span class="hljs-number">4</span>) <span class="hljs-keyword">as</span> pool:
              res = list(pool.map(sum_four, repeat(a), repeat(b), repeat(c), all_d_values))

<span class="hljs-meta">&gt;&gt;&gt; </span>res
 [<span class="hljs-number">7</span>, <span class="hljs-number">8</span>, <span class="hljs-number">9</span>, <span class="hljs-number">10</span>]
</code></pre>
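<p>Since both executors share the same interface, the same call should work with threads. Here is a minimal sketch that swaps in <code>ThreadPoolExecutor</code>; everything else mirrors the example above.</p>
<pre><code class="lang-python">from concurrent.futures import ThreadPoolExecutor
from itertools import repeat

def sum_four(a, b, c, d):
    return a + b + c + d

a, b, c = 1, 2, 3
all_d_values = [1, 2, 3, 4]

with ThreadPoolExecutor(max_workers=4) as pool:
    # Executor.map yields results in input order; list() collects them.
    res = list(pool.map(sum_four, repeat(a), repeat(b), repeat(c), all_d_values))

print(res)  # [7, 8, 9, 10]
</code></pre>
<p>Threads tend to suit I/O-bound work better than CPU-bound work, so treat this as an interface demonstration rather than a speed-up for <code>sum_four</code>.</p>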
<h2 id="conclusion">Conclusion</h2>
<p>That’s it for today, folks! I hope you’ve learned something different and useful. The <code>map()</code> function makes Python feel like a functional programming language. <code>map()</code> is available not only as a built-in function but also as a method in the <code>multiprocessing</code> and <code>concurrent.futures</code> modules. In this article, I showed what I do to map functions that take several arguments.</p>
<p>Other posts you may like:</p>
<ul>
<li><a target="_blank" href="https://miguendes.me/how-to-use-datetimetimedelta-in-python-with-examples">How to Use datetime.timedelta in Python With Examples</a></li>
<li><a target="_blank" href="https://miguendes.me/73-examples-to-help-you-master-pythons-f-strings">73 Examples to Help You Master Python's f-strings</a></li>
<li><a target="_blank" href="https://miguendes.me/how-to-check-if-an-exception-is-raised-or-not-with-pytest">How to Check if an Exception Is Raised (or Not) With pytest</a></li>
<li><a target="_blank" href="https://miguendes.me/3-ways-to-test-api-client-applications-in-python">3 Ways to Test API Client Applications in Python</a></li>
<li><a target="_blank" href="https://miguendes.me/everything-you-need-to-know-about-pythons-namedtuples">Everything You Need to Know About Python's Namedtuples</a></li>
<li><a target="_blank" href="https://miguendes.me/the-best-way-to-compare-two-dictionaries-in-python">The Best Way to Compare Two Dictionaries in Python</a></li>
<li><a target="_blank" href="https://miguendes.me/5-hidden-python-features-you-probably-never-heard-of">5 Hidden Python Features You Probably Never Heard Of</a></li>
</ul>
]]></content:encoded></item><item><title><![CDATA[How to Use Fixtures as Arguments in pytest.mark.parametrize]]></title><description><![CDATA[TL;DR
Time is a precious resource so I won't waste yours. In this post, you'll learn how to use a pytest fixture in parametrize using a library or getfixturevalue.
Introduction
In this post, we'll see how we can use pytest.mark.parametrize with fixtu...]]></description><link>https://miguendes.me/how-to-use-fixtures-as-arguments-in-pytestmarkparametrize</link><guid isPermaLink="true">https://miguendes.me/how-to-use-fixtures-as-arguments-in-pytestmarkparametrize</guid><category><![CDATA[Python]]></category><category><![CDATA[pytest]]></category><category><![CDATA[Tutorial]]></category><category><![CDATA[100DaysOfCode]]></category><dc:creator><![CDATA[Miguel Brito]]></dc:creator><pubDate>Sat, 21 Nov 2020 10:43:35 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1605649072788/Wzu1TaZWH.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-tldr">TL;DR</h2>
<p>Time is a precious resource so I won't waste yours. In this post, you'll learn how to use a pytest fixture in parametrize using a library or getfixturevalue.</p>
<h2 id="heading-introduction">Introduction</h2>
<p>In this post, we'll see how we can use <code>pytest.mark.parametrize</code> with fixtures. This is a <a target="_blank" href="https://github.com/pytest-dev/pytest/issues/349">long-wanted feature</a> that dates back to 2013. Even though <code>pytest</code> doesn't support it yet, you'll see that we can actually make it happen.</p>
<h2 id="heading-problem">Problem</h2>
<h3 id="heading-you-want-to-pass-a-fixture-to-parametrize">You want to pass a fixture to parametrize.</h3>
<p>Suppose that you have a simple function called <code>is_even(n)</code> that returns <code>True</code> if <code>n</code> is divisible by 2. Then you create a simple test for it that receives a fixture named <code>two</code> that returns 2. To make the test more robust, you set up another fixture named <code>four</code> that returns 4. Now you have two individual tests, as illustrated below.</p>
<p>Implementation:</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">is_even</span>(<span class="hljs-params">n: int</span>) -&gt; bool:</span>
    <span class="hljs-string">"""Returns True if n is even."""</span>
    <span class="hljs-keyword">return</span> n % <span class="hljs-number">2</span> == <span class="hljs-number">0</span>
</code></pre>
<p>Tests:</p>
<pre><code class="lang-python"><span class="hljs-keyword">import</span> pytest

<span class="hljs-meta">@pytest.fixture()</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">two</span>():</span>
    <span class="hljs-keyword">return</span> <span class="hljs-number">2</span>

<span class="hljs-meta">@pytest.fixture()</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">four</span>():</span>
    <span class="hljs-keyword">return</span> <span class="hljs-number">4</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_four_is_even</span>(<span class="hljs-params">four</span>):</span>
    <span class="hljs-string">"""Asserts that four is even"""</span>
    <span class="hljs-keyword">assert</span> is_even(four)

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_two_is_even</span>(<span class="hljs-params">two</span>):</span>
    <span class="hljs-string">"""Asserts that two is even"""</span>
    <span class="hljs-keyword">assert</span> is_even(two)
</code></pre>
<p>If we run these tests, they pass, which is good. Even though you’re quite happy with the outcome, you need to test one more thing. You want to assert that <a target="_blank" href="https://proofwiki.org/wiki/Odd_Number_multiplied_by_Even_Number_is_Even">the multiplication of an even number by an odd one produces an even result</a>. To accomplish that, you create two more fixtures, <code>one</code> and <code>three</code>. You plan to use them as arguments in a parameterized test, like so:</p>
<pre><code class="lang-python"><span class="hljs-meta">@pytest.fixture()</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">one</span>():</span>
    <span class="hljs-keyword">return</span> <span class="hljs-number">1</span>

<span class="hljs-meta">@pytest.fixture()</span>
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">three</span>():</span>
    <span class="hljs-keyword">return</span> <span class="hljs-number">3</span>

<span class="hljs-meta">@pytest.mark.parametrize(</span>
    <span class="hljs-string">"a, b"</span>,
    [
        (one, four),
        (two, three),
    ],
)
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_multiply_is_even</span>(<span class="hljs-params">a, b</span>):</span>
    <span class="hljs-string">"""Assert that an odd number times even is even."""</span>
    <span class="hljs-keyword">assert</span> is_even(a * b)
</code></pre>
<p>When we run this test, we get the following output:</p>
<pre><code class="lang-console">_______________________ test_multiply_is_even[two-three] _______________________

a = &lt;function two at 0x7f9d862ee790&gt;, b = &lt;function three at 0x7f9d862eedc0&gt;

    @pytest.mark.parametrize(
        "a, b",
        [
            (one, four),
            (two, three),
        ],
    )
    def test_multiply_is_even(a, b):
        """Assert that an odd number times even is even."""
&gt;       assert is_even(a * b)
E       TypeError: unsupported operand type(s) for *: 'function' and 'function'

tests/test_variables.py:71: TypeError
=========================== short test summary info ============================
FAILED tests/test_variables.py::test_multiply_is_even[one-four] - TypeError: ...
FAILED tests/test_variables.py::test_multiply_is_even[two-three] - TypeError:...
============================== 2 failed in 0.05s ===============================
</code></pre>
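<p>As you can see, passing a fixture as an argument in a parameterized test doesn't work. The test receives the fixture functions themselves rather than the values they return, hence the <code>TypeError</code>.</p>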
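<p>The failure can be reproduced without <code>pytest</code> at all. In the sketch below, plain functions stand in for the fixtures (an assumption made for illustration; the traceback above shows the test received function objects in the same way).</p>
<pre><code class="lang-python">def two():
    return 2

def three():
    return 3

try:
    two * three  # what the parametrized test effectively did
except TypeError as exc:
    error_message = str(exc)

product = two() * three()  # calling the functions yields real numbers

print(error_message)  # unsupported operand type(s) for *: 'function' and 'function'
print(product)        # 6
</code></pre>
<p>Inside a test you shouldn't call fixtures by hand like this, which is why we need a way to ask <code>pytest</code> for their values.</p>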
<h2 id="heading-solution">Solution</h2>
<p>To make that possible, we have two alternatives. The first one is using <code>request.getfixturevalue</code>, which is built into <code>pytest</code>. This function dynamically runs a named fixture function and returns its value.</p>
<pre><code class="lang-python"><span class="hljs-meta">@pytest.mark.parametrize(</span>
    <span class="hljs-string">"a, b"</span>,
    [
        (<span class="hljs-string">"one"</span>, <span class="hljs-string">"four"</span>),
        (<span class="hljs-string">"two"</span>, <span class="hljs-string">"three"</span>),
    ],
)
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_multiply_is_even_request</span>(<span class="hljs-params">a, b, request</span>):</span>
    <span class="hljs-string">"""Assert that an odd number times even is even."""</span>
    a = request.getfixturevalue(a)
    b = request.getfixturevalue(b)
    <span class="hljs-keyword">assert</span> is_even(a * b)
</code></pre>
<p>If we run the test again we get the following:</p>
<pre><code class="lang-console">============================= test session starts ==============================
...
collecting ... collected 2 items

tests/test_variables.py::test_multiply_is_even_request[one-four] PASSED  [ 50%]
tests/test_variables.py::test_multiply_is_even_request[two-three] PASSED [100%]

============================== 2 passed in 0.02s ===============================

Process finished with exit code 0
</code></pre>
<p>Great! It works like a charm. However, there’s one more alternative, and for that we’ll need a third-party package called <a target="_blank" href="https://github.com/tvorog/pytest-lazy-fixture"><code>pytest-lazy-fixture</code></a>. Let’s see what the test looks like using this lib.</p>
<pre><code class="lang-python"><span class="hljs-meta">@pytest.mark.parametrize(</span>
    <span class="hljs-string">"a, b"</span>,
    [
        (pytest.lazy_fixture((<span class="hljs-string">"one"</span>, <span class="hljs-string">"four"</span>))),
        <span class="hljs-comment"># same as (pytest.lazy_fixture(("two", "three")))</span>
        (pytest.lazy_fixture(<span class="hljs-string">"two"</span>), pytest.lazy_fixture(<span class="hljs-string">"three"</span>)), 
    ],
)
<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">test_multiply</span>(<span class="hljs-params">a, b</span>):</span>
    <span class="hljs-string">"""Assert that an odd number times even is even."""</span>
    <span class="hljs-keyword">assert</span> is_even(a * b)
</code></pre>
<p>In this example, we use it either by passing a tuple with the fixture names or by passing each one of them as a separate argument. When we run this test, we can see it passes!</p>
<pre><code class="lang-console">============================= test session starts ==============================
...
collecting ... collected 2 items

tests/test_variables.py::test_multiply[one-four] PASSED                  [ 50%]
tests/test_variables.py::test_multiply[two-three] PASSED                 [100%]

============================== 2 passed in 0.02s ===============================

Process finished with exit code 0
</code></pre>
<h2 id="heading-conclusion">Conclusion</h2>
<p>That’s it for today, folks! I hope you’ve learned something different and useful. Being able to reuse fixtures in parametrized tests is a must when we want to avoid repetition. Unfortunately, <code>pytest</code> doesn’t support that yet. On the other hand, we can make it happen either by using <code>getfixturevalue</code> in <code>pytest</code> or through a third-party library. </p>
<p>Other posts you may like:</p>
<ul>
<li><p><a target="_blank" href="https://miguendes.me/3-ways-to-test-api-client-applications-in-python">Learn how to unit test REST APIs in Python with Pytest by example.</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/7-pytest-features-and-plugins-that-will-save-you-tons-of-time">7 pytest Features and Plugins That Will Save You Tons of Time</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/how-to-use-fixtures-as-arguments-in-pytestmarkparametrize">How to Use Fixtures as Arguments in pytest.mark.parametrize</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/how-to-check-if-an-exception-is-raised-or-not-with-pytest">How to Check if an Exception Is Raised (or Not) With pytest</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/7-pytest-plugins-you-must-definitely-use">7 pytest Plugins You Must Definitely Use</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/pytest-disable-autouse">How to Disable Autouse Fixtures in pytest</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/how-to-use-datetimetimedelta-in-python-with-examples">How to Use datetime.timedelta in Python With Examples</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/73-examples-to-help-you-master-pythons-f-strings">73 Examples to Help You Master Python's f-strings</a></p>
</li>
<li><p><a target="_blank" href="https://miguendes.me/the-best-way-to-compare-two-dictionaries-in-python">The Best Way to Compare Two Dictionaries in Python</a></p>
</li>
</ul>
<p>See you next time!</p>
<p>This post was originally published at <a target="_blank" href="https://miguendes.me/how-to-use-fixtures-as-arguments-in-pytestmarkparametrize">https://miguendes.me</a></p>
]]></content:encoded></item><item><title><![CDATA[How to Use datetime.timedelta in Python With Examples]]></title><description><![CDATA[In this tutorial, you'll learn how to use datetime.timedelta to perform date arithmetic. 
With timedelta you can add days, minutes, seconds, hours, weeks and more to a datetime.date, or a datetime.datetime object.
You'll also learn how to:

convert a...]]></description><link>https://miguendes.me/how-to-use-datetimetimedelta-in-python-with-examples</link><guid isPermaLink="true">https://miguendes.me/how-to-use-datetimetimedelta-in-python-with-examples</guid><category><![CDATA[Python]]></category><category><![CDATA[Tutorial]]></category><category><![CDATA[Beginner Developers]]></category><dc:creator><![CDATA[Miguel Brito]]></dc:creator><pubDate>Sat, 14 Nov 2020 11:09:20 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1605351339752/FKX87gkL4.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>In this tutorial, you'll learn how to use <code>datetime.timedelta</code> to perform date arithmetic. </p>
<p>With <code>timedelta</code> you can add days, minutes, seconds, hours, weeks and more to a <code>datetime.date</code>, or a <code>datetime.datetime</code> object.</p>
<p>You'll also learn how to:</p>
<ul>
<li>convert a <code>timedelta</code> to seconds, minutes, hours, or days</li>
<li>convert a <code>timedelta</code> to years</li>
<li>take the difference between two dates</li>
<li>format a <code>timedelta</code> as a string</li>
</ul>
<p>Let's go!</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ul>
<li><a class="post-section-overview" href="#adding-seconds-minutes-hours-days-weeks-and-whatnot-to-a-date">Adding Seconds, Minutes, Hours, Days, Weeks and Whatnot to a Date</a><ul>
<li><a class="post-section-overview" href="#how-to-use-timedelta-to-add-days-to-a-date-or-datetime-object">How to Use <code>timedelta</code> to Add Days to a <code>date</code> or <code>datetime</code> Object</a></li>
<li><a class="post-section-overview" href="#how-to-use-timedelta-to-add-minutes-to-a-datetime-object">How to Use <code>timedelta</code> to Add Minutes to a <code>datetime</code> Object</a></li>
<li><a class="post-section-overview" href="#how-to-use-timedelta-to-add-weeks-hours-seconds-milliseconds-to-a-datetime">How to Use <code>timedelta</code> to Add Weeks, Hours, Seconds, Milliseconds to a <code>datetime</code></a></li>
<li><a class="post-section-overview" href="#how-to-add-years-to-a-datetime-in-python">How to Add Years to a <code>datetime</code> in Python</a></li>
<li><a class="post-section-overview" href="#how-to-add-months-to-a-datetime-in-python">How to Add Months to a <code>datetime</code> in Python</a></li>
</ul>
</li>
<li><a class="post-section-overview" href="#how-to-convert-a-timedelta-to-seconds-minutes-hours-or-days">How to convert a <code>timedelta</code> to seconds, minutes, hours, or days</a></li>
<li><a class="post-section-overview" href="#how-to-take-the-difference-between-two-dates">How to Take the Difference Between Two Dates</a><ul>
<li><a class="post-section-overview" href="#how-to-calculate-the-number-of-days-between-two-dates">How to Calculate the Number of Days Between Two Dates</a></li>
<li><a class="post-section-overview" href="#how-to-calculate-the-number-of-minutes-between-two-dates">How to Calculate the Number of Minutes Between Two Dates</a></li>
</ul>
</li>
<li><a class="post-section-overview" href="#how-to-format-a-timedelta-as-string">How to Format a <code>timedelta</code> as string</a></li>
<li><a class="post-section-overview" href="#conclusion">Conclusion</a></li>
</ul>
<h2 id="heading-adding-seconds-minutes-hours-days-weeks-and-whatnot-to-a-date">Adding Seconds, Minutes, Hours, Days, Weeks and Whatnot to a Date</h2>
<p>A <code>timedelta</code> object denotes a duration. It can also represent the difference between two dates or times.</p>
<p>We can use this object to add a duration to a <code>date</code> or subtract one from it. Its constructor is defined as <code>datetime.timedelta(days=0, seconds=0, microseconds=0, milliseconds=0, minutes=0, hours=0, weeks=0)</code>. As you can see, all arguments are optional and default to 0. They can be <code>int</code>s or <code>float</code>s, positive or negative.</p>
<p>Even though you can pass weeks, hours, minutes, and milliseconds, only days, seconds, and microseconds are stored internally. The other units are converted to those three.</p>
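<p>To see that normalization in action, here is a small sketch. The input values are arbitrary and only there for illustration; <code>days</code>, <code>seconds</code>, and <code>microseconds</code> are the three stored fields.</p>

```python
from datetime import timedelta

# Mix several units; timedelta normalizes them into days, seconds, microseconds.
delta = timedelta(weeks=1, hours=36, minutes=90, milliseconds=1500)

# 1 week = 7 days, 36 h = 1 day + 12 h, 90 min = 1 h 30 min, 1500 ms = 1 s + 500000 us
print(delta.days, delta.seconds, delta.microseconds)  # 8 48601 500000

# Floats are normalized the same way: 1.5 days = 1 day + 43200 s.
half = timedelta(days=1.5)
print(half.days, half.seconds)  # 1 43200
```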
<p>In this section, we'll see basic arithmetic operations such as adding/subtracting a duration to/from a <code>date</code>.</p>
<h3 id="heading-how-to-use-timedelta-to-add-days-to-a-date-or-datetime-object">How to Use <code>timedelta</code> to Add Days to a <code>date</code> or <code>datetime</code> Object</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1605300869825/jVPk-Xr_V.png" alt="how to add days to a datetime object in Python" /></p>
<p>Since <code>timedelta</code> represents a duration, we can use it to add days to a <code>datetime</code>. The number of days can be positive or negative, thus allowing us to create a date in the future or in the past. The code snippet below shows an example.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> datetime

<span class="hljs-meta">&gt;&gt;&gt; </span>now = datetime.datetime.now()

<span class="hljs-meta">&gt;&gt;&gt; </span>now
datetime.datetime(<span class="hljs-number">2020</span>, <span class="hljs-number">11</span>, <span class="hljs-number">3</span>, <span class="hljs-number">22</span>, <span class="hljs-number">5</span>, <span class="hljs-number">21</span>, <span class="hljs-number">979147</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> timedelta

<span class="hljs-meta">&gt;&gt;&gt; </span>now + timedelta(days=<span class="hljs-number">3</span>)
datetime.datetime(<span class="hljs-number">2020</span>, <span class="hljs-number">11</span>, <span class="hljs-number">6</span>, <span class="hljs-number">22</span>, <span class="hljs-number">5</span>, <span class="hljs-number">21</span>, <span class="hljs-number">979147</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>now + timedelta(days=<span class="hljs-number">-3</span>)
datetime.datetime(<span class="hljs-number">2020</span>, <span class="hljs-number">10</span>, <span class="hljs-number">31</span>, <span class="hljs-number">22</span>, <span class="hljs-number">5</span>, <span class="hljs-number">21</span>, <span class="hljs-number">979147</span>)
</code></pre>
<p>As you can see, adding a positive number of days yields a future date whereas adding a negative number brings the date to the past.</p>
<p>If you want to add days to a <code>date</code> object, the process is the same.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>today = datetime.date.today()

<span class="hljs-meta">&gt;&gt;&gt; </span>today
datetime.date(<span class="hljs-number">2020</span>, <span class="hljs-number">11</span>, <span class="hljs-number">5</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>today + timedelta(days=<span class="hljs-number">3</span>)
datetime.date(<span class="hljs-number">2020</span>, <span class="hljs-number">11</span>, <span class="hljs-number">8</span>)
</code></pre>
<h3 id="heading-how-to-use-timedelta-to-add-minutes-to-a-datetime-object">How to Use <code>timedelta</code> to Add Minutes to a <code>datetime</code> Object</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1605300881611/IuQ-eTE7r.png" alt="image describing how to add minutes to a datetime object in Python" /></p>
<p>Since the <code>timedelta</code> constructor sets all arguments to 0 by default, we have the option to set only the ones we need. This allows us to add only minutes, for instance.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>now
datetime.datetime(<span class="hljs-number">2020</span>, <span class="hljs-number">11</span>, <span class="hljs-number">3</span>, <span class="hljs-number">22</span>, <span class="hljs-number">5</span>, <span class="hljs-number">21</span>, <span class="hljs-number">979147</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>now + timedelta(minutes=<span class="hljs-number">3</span>)
datetime.datetime(<span class="hljs-number">2020</span>, <span class="hljs-number">11</span>, <span class="hljs-number">3</span>, <span class="hljs-number">22</span>, <span class="hljs-number">8</span>, <span class="hljs-number">21</span>, <span class="hljs-number">979147</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>now + timedelta(minutes=<span class="hljs-number">-3</span>)
datetime.datetime(<span class="hljs-number">2020</span>, <span class="hljs-number">11</span>, <span class="hljs-number">3</span>, <span class="hljs-number">22</span>, <span class="hljs-number">2</span>, <span class="hljs-number">21</span>, <span class="hljs-number">979147</span>)
</code></pre>
<h3 id="heading-how-to-use-timedelta-to-add-weeks-hours-seconds-milliseconds-to-a-datetime">How to Use <code>timedelta</code> to Add Weeks, Hours, Seconds, Milliseconds to a <code>datetime</code></h3>
<p>Adding weeks, hours, seconds, milliseconds and even microseconds works in a similar fashion.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>now + timedelta(weeks=<span class="hljs-number">3</span>)
datetime.datetime(<span class="hljs-number">2020</span>, <span class="hljs-number">11</span>, <span class="hljs-number">24</span>, <span class="hljs-number">22</span>, <span class="hljs-number">5</span>, <span class="hljs-number">21</span>, <span class="hljs-number">979147</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>now + timedelta(hours=<span class="hljs-number">3</span>)
datetime.datetime(<span class="hljs-number">2020</span>, <span class="hljs-number">11</span>, <span class="hljs-number">4</span>, <span class="hljs-number">1</span>, <span class="hljs-number">5</span>, <span class="hljs-number">21</span>, <span class="hljs-number">979147</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>now + timedelta(microseconds=<span class="hljs-number">3</span>)
datetime.datetime(<span class="hljs-number">2020</span>, <span class="hljs-number">11</span>, <span class="hljs-number">3</span>, <span class="hljs-number">22</span>, <span class="hljs-number">5</span>, <span class="hljs-number">21</span>, <span class="hljs-number">979150</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>now + timedelta(milliseconds=<span class="hljs-number">3</span>)
datetime.datetime(<span class="hljs-number">2020</span>, <span class="hljs-number">11</span>, <span class="hljs-number">3</span>, <span class="hljs-number">22</span>, <span class="hljs-number">5</span>, <span class="hljs-number">21</span>, <span class="hljs-number">982147</span>)
</code></pre>
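<p>The arguments can also be combined in a single <code>timedelta</code>, and the same duration can be subtracted instead of added. The sketch below uses a fixed <code>start</code> value (an arbitrary choice, picked only so the output is reproducible) rather than <code>datetime.now()</code>.</p>

```python
from datetime import datetime, timedelta

start = datetime(2020, 11, 3, 22, 5, 21)  # fixed value so the result is reproducible

# One duration built from several units: 9 days, 3 hours and 30 minutes.
shift = timedelta(weeks=1, days=2, hours=3, minutes=30)

later = start + shift
earlier = start - shift

print(later)    # 2020-11-13 01:35:21
print(earlier)  # 2020-10-25 18:35:21
```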
<h3 id="heading-how-to-add-years-to-a-datetime-in-python">How to Add Years to a <code>datetime</code> in Python</h3>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1605300900657/xnhLg69XK.png" alt="image describing how to add years to a datetime object in Python" /></p>
<p>It's definitely possible to use <code>timedelta</code> to add years to a <code>datetime</code>, but some things can go wrong and it's easy to shoot yourself in the foot. For example, you need to take into account leap years yourself. </p>
<p>IMHO, the best way to add a certain number of years to a <code>datetime</code> is by using the <a target="_blank" href="https://pypi.org/project/python-dateutil/"><code>dateutil</code></a> library.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> datetime

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> dateutil.relativedelta <span class="hljs-keyword">import</span> relativedelta

<span class="hljs-meta">&gt;&gt;&gt; </span>now = datetime.datetime.now()

<span class="hljs-meta">&gt;&gt;&gt; </span>now
datetime.datetime(<span class="hljs-number">2020</span>, <span class="hljs-number">11</span>, <span class="hljs-number">4</span>, <span class="hljs-number">22</span>, <span class="hljs-number">9</span>, <span class="hljs-number">5</span>, <span class="hljs-number">672091</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>now + relativedelta(years=<span class="hljs-number">2</span>)
datetime.datetime(<span class="hljs-number">2022</span>, <span class="hljs-number">11</span>, <span class="hljs-number">4</span>, <span class="hljs-number">22</span>, <span class="hljs-number">9</span>, <span class="hljs-number">5</span>, <span class="hljs-number">672091</span>)
</code></pre>
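<p>If you would rather not add a dependency, a standard-library sketch is possible too. The helper name <code>add_years</code> is hypothetical, and clamping February 29 to February 28 in non-leap years is an assumption about what you want, not the only reasonable policy.</p>

```python
import datetime


def add_years(value, years):
    """Return value shifted by a number of years (works for date and datetime)."""
    try:
        return value.replace(year=value.year + years)
    except ValueError:
        # Only Feb 29 fails here when the target year is not a leap year.
        # Assumption: fall back to Feb 28 of the target year.
        return value.replace(year=value.year + years, day=28)


print(add_years(datetime.date(2020, 2, 29), 1))  # 2021-02-28
print(add_years(datetime.date(2020, 2, 29), 4))  # 2024-02-29
```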
<h3 id="heading-how-to-add-months-to-a-datetime-in-python">How to Add Months to a <code>datetime</code> in Python</h3>
<p>Adding months to a <code>datetime</code> has the same problem as adding years using <code>timedelta</code>. This feature is not supported by default and requires manual calculation. You can use days, but you’d need to know how many days each month has, and so on. In a nutshell, it’s too error-prone. Again, the best you can do is to use <code>dateutil.relativedelta</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> dateutil.relativedelta <span class="hljs-keyword">import</span> relativedelta
<span class="hljs-meta">&gt;&gt;&gt; </span>now
datetime.datetime(<span class="hljs-number">2020</span>, <span class="hljs-number">11</span>, <span class="hljs-number">4</span>, <span class="hljs-number">22</span>, <span class="hljs-number">9</span>, <span class="hljs-number">5</span>, <span class="hljs-number">672091</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>now + relativedelta(years=<span class="hljs-number">2</span>)
datetime.datetime(<span class="hljs-number">2022</span>, <span class="hljs-number">11</span>, <span class="hljs-number">4</span>, <span class="hljs-number">22</span>, <span class="hljs-number">9</span>, <span class="hljs-number">5</span>, <span class="hljs-number">672091</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>now + relativedelta(months=<span class="hljs-number">12</span>)
datetime.datetime(<span class="hljs-number">2021</span>, <span class="hljs-number">11</span>, <span class="hljs-number">4</span>, <span class="hljs-number">22</span>, <span class="hljs-number">9</span>, <span class="hljs-number">5</span>, <span class="hljs-number">672091</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>now + relativedelta(months=<span class="hljs-number">24</span>)
datetime.datetime(<span class="hljs-number">2022</span>, <span class="hljs-number">11</span>, <span class="hljs-number">4</span>, <span class="hljs-number">22</span>, <span class="hljs-number">9</span>, <span class="hljs-number">5</span>, <span class="hljs-number">672091</span>)
</code></pre>
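<p>For completeness, here is roughly what the manual calculation involves if you stick to the standard library. The helper name <code>add_months</code> is hypothetical, and clamping the day to the last day of the target month (so January 31 plus one month lands on the last day of February) is an assumption about the desired behavior.</p>

```python
import calendar
import datetime


def add_months(value, months):
    """Return value shifted by a number of months (works for date and datetime)."""
    # Count months from year 0 so that divmod handles year rollover in both directions.
    month_index = value.year * 12 + (value.month - 1) + months
    year, month_zero_based = divmod(month_index, 12)
    month = month_zero_based + 1
    # monthrange returns (weekday of first day, number of days in the month).
    last_day = calendar.monthrange(year, month)[1]
    return value.replace(year=year, month=month, day=min(value.day, last_day))


print(add_months(datetime.date(2020, 1, 31), 1))   # 2020-02-29
print(add_months(datetime.date(2020, 1, 15), -2))  # 2019-11-15
```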
<h2 id="heading-how-to-convert-a-timedelta-to-seconds-minutes-hours-or-days">How to convert a <code>timedelta</code> to seconds, minutes, hours, or days</h2>
<p>A <code>timedelta</code> object allows adding a delta to a <code>datetime</code>, but sometimes it is useful to convert it into a single time unit, such as seconds or minutes.</p>
<p>In this section, we'll explore how to do that.</p>
<h3 id="heading-how-to-convert-a-timedelta-to-seconds">How to convert a <code>timedelta</code> to seconds</h3>
<p>A <code>timedelta</code> has only one method, <code>timedelta.total_seconds()</code>. This method returns the total number of seconds in the duration. If we want to convert a <code>timedelta</code> object to seconds, we can just call it.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> datetime

<span class="hljs-meta">&gt;&gt;&gt; </span>delta = datetime.timedelta(days=<span class="hljs-number">1</span>, seconds=<span class="hljs-number">34</span>)

<span class="hljs-comment"># a day has 24h, each hour has 60min of 60s = 24*60*60 = 86400</span>
<span class="hljs-comment"># 86400s + 34s = 86434s</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>delta.total_seconds()
<span class="hljs-number">86434.0</span>
</code></pre>
<blockquote>
<p>What is the difference between <code>total_seconds()</code> and <code>timedelta.seconds</code>? </p>
</blockquote>
<p><code>total_seconds()</code>, as its name implies, corresponds to the total number of seconds in the <strong><em>whole duration</em></strong>. On the flip side, <code>timedelta.seconds</code> is a read-only attribute that represents the number of seconds <strong><em>within a day</em></strong>, that is, what remains once whole days have been moved to <code>timedelta.days</code>.</p>
<p>To be more precise, <code>timedelta.seconds</code> is always between 0 and 86399. If the number of seconds you pass is greater than 86399, <code>timedelta</code> moves every full day (86400 seconds) into <code>timedelta.days</code> and keeps only the remainder in <code>timedelta.seconds</code>, as you'll see in the next example.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> datetime

<span class="hljs-meta">&gt;&gt;&gt; </span>delta = datetime.timedelta(seconds=<span class="hljs-number">34</span>)

<span class="hljs-comment"># Delta is within 0 and 86399, so delta.seconds returns that number</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>delta.seconds
<span class="hljs-number">34</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>delta = datetime.timedelta(seconds=<span class="hljs-number">86434</span>)

<span class="hljs-comment"># 86434s is more than a day, so one full day (86400s) overflows to delta.days</span>
<span class="hljs-comment"># and delta.seconds keeps only the remaining 34s</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>delta.seconds
<span class="hljs-number">34</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>delta.days
<span class="hljs-number">1</span>
<span class="hljs-comment"># total_seconds returns 1 day in seconds + 34s = 86434s</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>delta.total_seconds()
<span class="hljs-number">86434.0</span>
</code></pre>
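<p>The difference matters most for negative durations. Because <code>seconds</code> always stays between 0 and 86399, the sign ends up in <code>days</code>. A short sketch:</p>

```python
from datetime import timedelta

delta = timedelta(seconds=-30)

# Normalization keeps seconds in the range 0..86399, so the sign lives in days:
# -30s is stored as -1 day + 86370s.
print(delta.days)             # -1
print(delta.seconds)          # 86370
print(delta.total_seconds())  # -30.0
```

<p>So if you need the length of a duration as a single number, <code>total_seconds()</code> is the safer choice.</p>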
<h3 id="heading-how-to-convert-a-timedelta-to-minutes">How to convert a <code>timedelta</code> to minutes</h3>
<p>To convert a <code>timedelta</code> to minutes you need to use a bit of math. Unfortunately, <code>timedelta</code> does not provide any way of accessing the number of minutes in a duration. In the end, you need to do the conversion yourself.</p>
<p>There are two different ways of doing this conversion:</p>
<ul>
<li>in the first one, you divide <code>total_seconds()</code> by the number of seconds in a minute, which is 60</li>
<li>in the second one, you divide the <code>timedelta</code> object by <code>timedelta(minutes=1)</code></li>
</ul>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> datetime

<span class="hljs-meta">&gt;&gt;&gt; </span>delta = datetime.timedelta(hours=<span class="hljs-number">3</span>, minutes=<span class="hljs-number">13</span>, seconds=<span class="hljs-number">34</span>)

<span class="hljs-comment"># a timedelta object has NO minutes attribute</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>delta.minutes
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
&lt;ipython-input<span class="hljs-number">-31</span>-b45e912051b9&gt; <span class="hljs-keyword">in</span> &lt;module&gt;
----&gt; <span class="hljs-number">1</span> delta.minutes

AttributeError: <span class="hljs-string">'datetime.timedelta'</span> object has no attribute <span class="hljs-string">'minutes'</span>

<span class="hljs-comment"># we set a variable to represent the number of seconds in a minute</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>NUM_SECONDS_IN_A_MIN = <span class="hljs-number">60</span>

<span class="hljs-comment"># we then divide the total seconds by the number of seconds in a minute</span>
<span class="hljs-comment"># this gives us around 193 minutes</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>delta.total_seconds() / NUM_SECONDS_IN_A_MIN
<span class="hljs-number">193.56666666666666</span>

<span class="hljs-comment"># alternatively, we use the divide operator</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>delta / datetime.timedelta(minutes=<span class="hljs-number">1</span>)
<span class="hljs-number">193.56666666666666</span>
</code></pre>
<h3 id="heading-how-to-convert-a-timedelta-to-hours">How to convert a <code>timedelta</code> to hours</h3>
<p>We can follow the same logic to convert a <code>timedelta</code> to hours. Instead of dividing <code>total_seconds()</code> by the number of seconds in a minute, or dividing the <code>timedelta</code> object by <code>timedelta(minutes=1)</code>, we do the same with hours: divide by 3600, or by <code>timedelta(hours=1)</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> datetime

<span class="hljs-meta">&gt;&gt;&gt; </span>delta = datetime.timedelta(hours=<span class="hljs-number">3</span>, minutes=<span class="hljs-number">13</span>, seconds=<span class="hljs-number">34</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>delta.hours
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
&lt;ipython-input<span class="hljs-number">-3</span><span class="hljs-number">-8</span>c2202cab691&gt; <span class="hljs-keyword">in</span> &lt;module&gt;
----&gt; <span class="hljs-number">1</span> delta.hours

AttributeError: <span class="hljs-string">'datetime.timedelta'</span> object has no attribute <span class="hljs-string">'hours'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>delta / datetime.timedelta(hours=<span class="hljs-number">1</span>)
<span class="hljs-number">3.226111111111111</span>
</code></pre>
<h3 id="heading-how-to-convert-a-timedelta-to-days">How to convert a <code>timedelta</code> to days</h3>
<p>Converting a <code>timedelta</code> to days is easier, and less confusing, than converting it to seconds. According to the <a target="_blank" href="https://docs.python.org/3/library/datetime.html#datetime.timedelta">docs</a>, only days, seconds and microseconds are stored internally. To get the number of whole days in a time delta, just use <code>timedelta.days</code>.</p>
<blockquote>
<p>⚠️ WARNING: <code>timedelta.days</code> holds only the whole-day part of the duration; anything stored in <code>seconds</code> and <code>microseconds</code> is left out. If you need the full duration expressed in days, fraction included, divide the time delta object by <code>datetime.timedelta(days=1)</code>.</p>
</blockquote>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> datetime

<span class="hljs-meta">&gt;&gt;&gt; </span>delta = datetime.timedelta(weeks=<span class="hljs-number">2</span>, days=<span class="hljs-number">3</span>, seconds=<span class="hljs-number">34</span>)

<span class="hljs-comment"># 1 week has 7 days, so 2 weeks has 14 days. 2 weeks + 3 days = 17 days</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>delta.days
<span class="hljs-number">17</span>

<span class="hljs-comment"># if you want the days including the fractional part, divide it by timedelta(days=1)</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>delta / datetime.timedelta(days=<span class="hljs-number">1</span>)
<span class="hljs-number">17.000393518518518</span>
</code></pre>
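<p>The division trick used for minutes, hours, and days generalizes to any unit the constructor accepts. The helper below is a sketch of that idea; the name <code>timedelta_to_unit</code> is hypothetical and not part of the standard library.</p>

```python
from datetime import timedelta


def timedelta_to_unit(delta, unit):
    """Express delta in the given unit: "seconds", "minutes", "hours", "days" or "weeks"."""
    # timedelta(**{unit: 1}) builds a one-unit duration, e.g. timedelta(hours=1).
    return delta / timedelta(**{unit: 1})


delta = timedelta(hours=36)

print(timedelta_to_unit(delta, "days"))     # 1.5
print(timedelta_to_unit(delta, "minutes"))  # 2160.0

# Floor division between two timedeltas returns an int instead of a float.
print(delta // timedelta(hours=1))          # 36
```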
<h3 id="heading-how-to-convert-a-timedelta-to-years">How to convert a <code>timedelta</code> to years</h3>
<p>If you've been following this guide since the beginning you might have started to pick up a pattern. However, I have bad news. </p>
<p>In the <a class="post-section-overview" href="#adding-seconds-minutes-hours-days-weeks-and-whatnot-to-a-date">first section</a>, I mentioned that you can create a <code>timedelta</code> by passing a combination of days, seconds, microseconds, milliseconds, minutes, hours, and weeks. </p>
<p>By default, <code>timedelta</code> doesn't support years. To work around that, we would need to calculate how many weeks there are in the number of years we want to pass. It's definitely possible to use <code>timedelta</code> to add years to a <code>datetime</code>, but some things can go wrong. For example, you need to take into account leap years yourself.</p>
<p>The idea here is to create a variable that holds the number of seconds in a year. A full year has 365 days, but to account for the leap years, we add 0.25 to it, so 365.25. Each day has 24 hours of 60 min, and each minute has 60s. Multiply everything and you get the number of seconds in a year.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> datetime

<span class="hljs-comment"># 1 year has roughly 52 weeks, so we create a delta of about 2 years with 2*52</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>delta = datetime.timedelta(weeks=<span class="hljs-number">2</span>*<span class="hljs-number">52</span>, days=<span class="hljs-number">3</span>, seconds=<span class="hljs-number">34</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>delta
datetime.timedelta(days=<span class="hljs-number">731</span>, seconds=<span class="hljs-number">34</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">timedelta_to_years</span>(<span class="hljs-params">delta: datetime.timedelta</span>) -&gt; float:</span>
        seconds_in_year = <span class="hljs-number">365.25</span>*<span class="hljs-number">24</span>*<span class="hljs-number">60</span>*<span class="hljs-number">60</span>
        <span class="hljs-keyword">return</span> delta.total_seconds() / seconds_in_year

<span class="hljs-meta">&gt;&gt;&gt; </span>timedelta_to_years(delta)
<span class="hljs-number">2.0013700027885517</span>

<span class="hljs-comment"># truncate to int, if you don't care about the fraction</span>
<span class="hljs-meta">&gt;&gt;&gt; </span>int(timedelta_to_years(delta))
<span class="hljs-number">2</span>
</code></pre>
<p>Another alternative, and to me the best one, is to get the duration in years by using the <a target="_blank" href="https://pypi.org/project/python-dateutil/"><code>python-dateutil</code></a> library.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> datetime

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> dateutil.relativedelta <span class="hljs-keyword">import</span> relativedelta

<span class="hljs-meta">&gt;&gt;&gt; </span>delta = relativedelta(years=<span class="hljs-number">2</span>, weeks=<span class="hljs-number">3</span>, months=<span class="hljs-number">1</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>delta
relativedelta(years=+<span class="hljs-number">2</span>, months=+<span class="hljs-number">1</span>, days=+<span class="hljs-number">21</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>delta.years
<span class="hljs-number">2</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>delta = relativedelta(years=<span class="hljs-number">2</span>, weeks=<span class="hljs-number">3</span>, months=<span class="hljs-number">15</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>delta.years
<span class="hljs-number">3</span>
</code></pre>
<h2 id="heading-how-to-take-the-difference-between-two-dates">How to Take the Difference Between Two Dates</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1605300913366/Iz4BP4gD9.png" alt="how to calculate the difference between now and yesterday in Python using datetime timedelta" /></p>
<p>As discussed earlier, <code>timedelta</code> can also represent the difference between two dates. The following sub-sections illustrate how you can do that.</p>
<h3 id="heading-how-to-calculate-the-number-of-days-between-two-dates">How to Calculate the Number of Days Between Two Dates</h3>
<p>To obtain the difference between two <code>datetime</code> objects in days, subtract one from the other with the <code>-</code> operator and read the <code>days</code> attribute of the resulting <code>timedelta</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>now = datetime.datetime.now()

<span class="hljs-meta">&gt;&gt;&gt; </span>now
datetime.datetime(<span class="hljs-number">2020</span>, <span class="hljs-number">11</span>, <span class="hljs-number">3</span>, <span class="hljs-number">22</span>, <span class="hljs-number">36</span>, <span class="hljs-number">21</span>, <span class="hljs-number">674967</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>yesterday = datetime.datetime(<span class="hljs-number">2020</span>, <span class="hljs-number">11</span>, <span class="hljs-number">2</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>now - yesterday
datetime.timedelta(days=<span class="hljs-number">1</span>, seconds=<span class="hljs-number">81381</span>, microseconds=<span class="hljs-number">674967</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>(now - yesterday).days
<span class="hljs-number">1</span>
</code></pre>
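<p>The same works for plain <code>date</code> objects. Note that the order of the operands decides the sign of the result. A small sketch with arbitrary example dates:</p>

```python
from datetime import date

start = date(2020, 11, 3)
end = date(2020, 12, 25)

print((end - start).days)     # 52
print((start - end).days)     # -52

# abs() works on a timedelta, which helps when the order of the dates is unknown.
print(abs(start - end).days)  # 52
```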
<h3 id="heading-how-to-calculate-the-number-of-minutes-between-two-dates">How to Calculate the Number of Minutes Between Two Dates</h3>
<p>This one requires more work, and we can achieve it in two different ways. The first one is using <code>divmod</code> and the second one is using <code>timedelta</code>. </p>
<p>According to the <a target="_blank" href="https://docs.python.org/3/library/functions.html#divmod">docs</a>, <code>divmod</code> takes two (non-complex) numbers as arguments and returns a pair of numbers consisting of their quotient and remainder when using integer division. In our case, we want to divide the total number of seconds contained in the duration by the number of seconds in one minute, which is 60.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>now = datetime.datetime.now()

<span class="hljs-meta">&gt;&gt;&gt; </span>now
datetime.datetime(<span class="hljs-number">2020</span>, <span class="hljs-number">11</span>, <span class="hljs-number">3</span>, <span class="hljs-number">22</span>, <span class="hljs-number">57</span>, <span class="hljs-number">12</span>, <span class="hljs-number">300437</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>yesterday = datetime.datetime(<span class="hljs-number">2020</span>, <span class="hljs-number">11</span>, <span class="hljs-number">2</span>, <span class="hljs-number">22</span>, <span class="hljs-number">57</span>, <span class="hljs-number">12</span>, <span class="hljs-number">300437</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>diff = now - yesterday

<span class="hljs-meta">&gt;&gt;&gt; </span>diff.total_seconds()
<span class="hljs-number">86400.0</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>diff / datetime.timedelta(minutes=<span class="hljs-number">1</span>)
<span class="hljs-number">1440.0</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>divmod(diff.total_seconds(), <span class="hljs-number">60</span>)
(<span class="hljs-number">1440.0</span>, <span class="hljs-number">0.0</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>int(diff / datetime.timedelta(minutes=<span class="hljs-number">1</span>))
<span class="hljs-number">1440</span>
</code></pre>
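<p>If you need this in more than one place, the two approaches can be folded into a small helper. This is only a sketch: the name <code>minutes_between</code> and the example dates are made up for illustration. It relies on the fact that <code>timedelta</code> objects support <code>//</code> and <code>%</code> with each other.</p>

```python
import datetime


def minutes_between(start, end):
    """Return (whole_minutes, leftover_seconds) between two datetimes."""
    diff = end - start
    one_minute = datetime.timedelta(minutes=1)
    # Floor-dividing a timedelta by another timedelta returns an int.
    whole_minutes = diff // one_minute
    # The modulo of two timedeltas is the timedelta that is left over.
    leftover = diff % one_minute
    return whole_minutes, leftover.total_seconds()


# Hypothetical dates: exactly one day and 90 seconds apart.
start = datetime.datetime(2020, 11, 2, 22, 57, 12)
end = datetime.datetime(2020, 11, 3, 22, 58, 42)

result = minutes_between(start, end)
print(result)
```

<p>Using <code>//</code> avoids the float that <code>/</code> produces, so there is no need to call <code>int</code> afterwards.</p>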
<h2 id="heading-how-to-format-a-timedelta-as-string">How to Format a <code>timedelta</code> as a String</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1605300923744/8UobTQrl9.png" alt="image showing how to format a timedelta object into a string" /></p>
<p>Sometimes we want to get a string representation of a <code>timedelta</code> object. Even though you can do that by calling <code>str(timedelta_obj)</code>, the result is not always what you want. The reason is that the format varies depending on the length of the duration the object represents: days and microseconds only show up when they are non-zero. For example, take a look at what happens when you try to print different <code>timedelta</code>s.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">from</span> datetime <span class="hljs-keyword">import</span> timedelta

<span class="hljs-meta">&gt;&gt;&gt; </span>timedelta(seconds=<span class="hljs-number">123</span>)
datetime.timedelta(seconds=<span class="hljs-number">123</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span>str(timedelta(seconds=<span class="hljs-number">123</span>))
<span class="hljs-string">'0:02:03'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>str(timedelta(seconds=<span class="hljs-number">123456</span>))
<span class="hljs-string">'1 day, 10:17:36'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>str(timedelta(seconds=<span class="hljs-number">1234.56</span>))
<span class="hljs-string">'0:20:34.560000'</span>
</code></pre>
<p>With that in mind, the question is: how can we have a more consistent format?</p>
<p>Sadly, we don’t have many options other than implementing a formatting function ourselves. The good thing is, that’s not so hard. </p>
<p>Suppose we want to print the <code>timedelta</code> in this format: <code>[N days] %H:%M:%S</code>. One way to do that is using Python’s f-strings.</p>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">format_timedelta</span>(<span class="hljs-params">delta: timedelta</span>) -&gt; str:</span>
    <span class="hljs-string">"""Formats a timedelta duration to [N days] %H:%M:%S format"""</span>
    seconds = int(delta.total_seconds())

    secs_in_a_day = <span class="hljs-number">86400</span>
    secs_in_a_hour = <span class="hljs-number">3600</span>
    secs_in_a_min = <span class="hljs-number">60</span>

    days, seconds = divmod(seconds, secs_in_a_day)
    hours, seconds = divmod(seconds, secs_in_a_hour)
    minutes, seconds = divmod(seconds, secs_in_a_min)

    time_fmt = <span class="hljs-string">f"<span class="hljs-subst">{hours:<span class="hljs-number">02</span>d}</span>:<span class="hljs-subst">{minutes:<span class="hljs-number">02</span>d}</span>:<span class="hljs-subst">{seconds:<span class="hljs-number">02</span>d}</span>"</span>

    <span class="hljs-keyword">if</span> days &gt; <span class="hljs-number">0</span>:
        suffix = <span class="hljs-string">"s"</span> <span class="hljs-keyword">if</span> days &gt; <span class="hljs-number">1</span> <span class="hljs-keyword">else</span> <span class="hljs-string">""</span>
        <span class="hljs-keyword">return</span> <span class="hljs-string">f"<span class="hljs-subst">{days}</span> day<span class="hljs-subst">{suffix}</span> <span class="hljs-subst">{time_fmt}</span>"</span>

    <span class="hljs-keyword">return</span> time_fmt
</code></pre>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>format_timedelta(timedelta(hours=<span class="hljs-number">23</span>, seconds=<span class="hljs-number">3809</span>))
<span class="hljs-string">'1 day 00:03:29'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>format_timedelta(timedelta(hours=<span class="hljs-number">23</span>))
<span class="hljs-string">'23:00:00'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>format_timedelta(timedelta(hours=<span class="hljs-number">25</span>))
<span class="hljs-string">'1 day 01:00:00'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>format_timedelta(timedelta(hours=<span class="hljs-number">48</span>, seconds=<span class="hljs-number">3700</span>))
<span class="hljs-string">'2 days 01:01:40'</span>
</code></pre>
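<p>One caveat: <code>format_timedelta</code> above assumes a positive duration. Because <code>divmod</code> rounds the quotient down, a negative value such as <code>timedelta(hours=-1)</code> ends up with <code>days == -1</code> and is printed as <code>'23:00:00'</code>. If negative durations can show up in your code, one possible variation is sketched here. The name <code>format_signed_timedelta</code> and the leading minus sign are my own choices for illustration, not a standard format.</p>

```python
from datetime import timedelta


def format_signed_timedelta(delta):
    """Format a duration as [-][N days] HH:MM:SS, including negative ones."""
    total = int(delta.total_seconds())
    # Remember the sign, then do all the arithmetic on the absolute value.
    sign = "" if total >= 0 else "-"

    days, rest = divmod(abs(total), 86400)
    hours, rest = divmod(rest, 3600)
    minutes, seconds = divmod(rest, 60)

    time_fmt = f"{hours:02d}:{minutes:02d}:{seconds:02d}"

    if days > 0:
        suffix = "s" if days > 1 else ""
        return f"{sign}{days} day{suffix} {time_fmt}"

    return f"{sign}{time_fmt}"


print(format_signed_timedelta(timedelta(hours=-1)))
print(format_signed_timedelta(timedelta(hours=-25)))
print(format_signed_timedelta(timedelta(hours=48, seconds=3700)))
```

<p>For positive durations the output matches the original function.</p>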
<h2 id="heading-conclusion">Conclusion</h2>
<p>That’s it for today, folks! I hope you’ve learned something different and useful. Knowing how to perform date calculations such as addition and subtraction is very important. The <code>timedelta</code> object is good enough for most situations, but if you need more complex operations, go for the <code>dateutil</code> library. </p>
<p>Other posts you may like:</p>
<ul>
<li><a target="_blank" href="https://miguendes.me/73-examples-to-help-you-master-pythons-f-strings">73 Examples to Help You Master Python's f-strings</a></li>
<li><a target="_blank" href="https://miguendes.me/how-to-check-if-an-exception-is-raised-or-not-with-pytest">How to Check if an Exception Is Raised (or Not) With pytest</a></li>
<li><a target="_blank" href="https://miguendes.me/3-ways-to-test-api-client-applications-in-python">3 Ways to Unit Test REST APIs in Python</a></li>
<li><a target="_blank" href="https://miguendes.me/everything-you-need-to-know-about-pythons-namedtuples">Everything You Need to Know About Python's Namedtuples</a></li>
<li><a target="_blank" href="https://miguendes.me/the-best-way-to-compare-two-dictionaries-in-python">The Best Way to Compare Two Dictionaries in Python</a></li>
<li><a target="_blank" href="https://miguendes.me/python-compare-lists">The Best Ways to Compare Two Lists in Python</a></li>
<li><a target="_blank" href="https://miguendes.me/python-compare-strings">How to Compare Two Strings in Python (in 8 Easy Ways)</a></li>
</ul>
<p>See you next time!</p>
<p>This post was originally published at <a target="_blank" href="https://miguendes.me/how-to-use-datetimetimedelta-in-python-with-examples">https://miguendes.me</a></p>
]]></content:encoded></item><item><title><![CDATA[Python F-String: 73 Examples to Help You Master It]]></title><description><![CDATA[Python f-strings are impressive! 
Did you know you can use f-strings to string format almost anything in Python? 
You can use them to format floats, multiline strings, decimal places, objects and even use if-else conditionals within them.
In this pos...]]></description><link>https://miguendes.me/73-examples-to-help-you-master-pythons-f-strings</link><guid isPermaLink="true">https://miguendes.me/73-examples-to-help-you-master-pythons-f-strings</guid><category><![CDATA[Python]]></category><category><![CDATA[Tutorial]]></category><category><![CDATA[Beginner Developers]]></category><category><![CDATA[100DaysOfCode]]></category><dc:creator><![CDATA[Miguel Brito]]></dc:creator><pubDate>Sat, 07 Nov 2020 10:16:55 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1604744163627/06ePVZIjo.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Python f-strings are impressive! </p>
<p>Did you know you can use f-strings to string format almost anything in Python? </p>
<p>You can use them to format floats, multiline strings, decimal places, objects and even use if-else conditionals within them.</p>
<p>In this post, I’ll show you at least 73 examples on how to format strings using Python 3's f-strings. You'll see the many ways you can take advantage of this powerful feature.</p>
<p>By the end of this guide, you'll have mastered:</p>
<ul>
<li>how to use f-strings to format <strong>float numbers</strong></li>
<li>how to format <strong>multiline strings</strong></li>
<li>how to define <strong>decimal places</strong> in an f-string</li>
<li>how to fix invalid syntax errors such as <strong><em>"SyntaxError: f-string: unmatched '['"</em></strong> or <strong><em>"f-string: unmatched '('"</em></strong></li>
<li>how to use an <strong>if-else conditional</strong> in an f-string</li>
<li>basic string <strong>interpolation formatting</strong> using f-strings </li>
<li>how to <strong>print f-strings</strong></li>
<li>how to effectively add <strong>padding</strong> using f-strings</li>
</ul>
<p>Let's go!</p>
<h2 id="heading-table-of-contents">Table of Contents</h2>
<ol>
<li><a class="post-section-overview" href="#what-are-python-f-strings-aka-literal-string-interpolation">What Are Python 3's F-Strings - a.k.a Literal String Interpolation?</a></li>
<li><a class="post-section-overview" href="#how-to-format-strings-in-python-3-the-basics">How to Format Strings in Python 3 - The Basics</a></li>
<li><a class="post-section-overview" href="#limitations">Limitations</a></li>
<li><a class="post-section-overview" href="#how-to-format-an-expression">How to Format an Expression</a></li>
<li><a class="post-section-overview" href="#how-to-use-f-strings-to-debug-your-code">How to Use F-Strings to Debug Your Code</a></li>
<li><a class="post-section-overview" href="#how-to-format-a-multiline-f-string-dealing-with-new-lines-and-variables">How to Format a Multiline F-String (Dealing with New Lines and Variables)</a></li>
<li><a class="post-section-overview" href="#how-to-fix-f-strings-invalid-syntax-error">How to Fix F-String's Invalid Syntax Error</a></li>
<li><a class="post-section-overview" href="#how-to-fix-formatting-a-regular-string-which-could-be-a-f-string">How to Fix "formatting a regular string which could be a f-string"</a></li>
<li><a class="post-section-overview" href="#how-to-format-numbers-in-different-bases">How to Format Numbers in Different Bases</a></li>
<li><a class="post-section-overview" href="#how-to-print-formatted-objects-with-f-strings">How to Print Formatted Objects With F-Strings</a></li>
<li><a class="post-section-overview" href="#how-to-use-f-strings-to-format-a-float">How to Use F-Strings to Format a Float</a></li>
<li><a class="post-section-overview" href="#how-to-format-a-number-as-percentage">How to Format a Number as Percentage</a></li>
<li><a class="post-section-overview" href="#how-to-justify-or-add-padding-to-a-f-string">How to Justify or Add Padding to a F-String</a></li>
<li><a class="post-section-overview" href="#how-to-escape-characters-with-f-string">How to Escape Characters With f-string</a></li>
<li><p><a class="post-section-overview" href="#how-to-add-a-thousand-separator">How to Add a Thousand Separator</a></p>
<p>   15.1. <a class="post-section-overview" href="#how-to-format-a-number-with-commas-as-decimal-separator">How to Format a Number With Commas as Decimal Separator</a></p>
<p>   15.2. <a class="post-section-overview" href="#how-to-format-a-number-with-spaces-as-decimal-separator">How to Format a Number With Spaces as Decimal Separator</a></p>
</li>
<li><a class="post-section-overview" href="#how-to-format-a-number-in-scientific-notation-exponential-notation">How to Format a Number in Scientific Notation (Exponential Notation)</a></li>
<li><a class="post-section-overview" href="#using-if-else-conditional-in-a-f-string">Using <code>if-else</code> Conditional in a F-String</a></li>
<li><a class="post-section-overview" href="#how-to-use-f-string-with-a-dictionary">How to Use F-String With a Dictionary</a></li>
<li><a class="post-section-overview" href="#how-to-concatenate-f-strings">How to Concatenate F-Strings</a></li>
<li><a class="post-section-overview" href="#how-to-format-a-date-with-f-strings">How to Format a Date With F-String</a></li>
<li><a class="post-section-overview" href="#how-to-add-leading-zeros">How to Add Leading Zeros</a></li>
<li><a class="post-section-overview" href="#conclusion">Conclusion</a></li>
</ol>
<h2 id="heading-what-are-python-f-strings-aka-literal-string-interpolation">What Are Python F-Strings - a.k.a Literal String Interpolation?</h2>
<p>String formatting has evolved quite a bit in the history of Python. Before Python 2.6, to format a string, one would use either the <code>%</code> operator or the <code>string.Template</code> class. Some time later, the <code>str.format</code> method came along and added to the language a more flexible and robust way of formatting a string.</p>
<p>Old string formatting with <code>%</code>:</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>msg = <span class="hljs-string">'hello world'</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'msg: %s'</span> % msg
<span class="hljs-string">'msg: hello world'</span>
</code></pre>
<p>Using <code>str.format</code>:</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>msg = <span class="hljs-string">'hello world'</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">'msg: {}'</span>.format(msg)
<span class="hljs-string">'msg: hello world'</span>
</code></pre>
<p>To simplify formatting even further, in 2015, Eric Smith proposed <a target="_blank" href="https://www.python.org/dev/peps/pep-0498/">PEP 498 -- Literal String Interpolation</a>, a new way to format strings in Python 3.</p>
<p>PEP 498 presented this new string interpolation as a simple and easy-to-use alternative to <code>str.format</code>. The only thing required is to put an 'f' before a string. And if you're new to the language, that's what the <strong>f</strong> prefix means in Python: it's a syntax for creating formatted strings.</p>
<p>Using f-strings:</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>msg = <span class="hljs-string">'hello world'</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f'msg: <span class="hljs-subst">{msg}</span>'</span>
<span class="hljs-string">'msg: hello world'</span>
</code></pre>
<p>And that was it! No need to use <code>str.format</code> or <code>%</code>. However, f-strings don’t replace <code>str.format</code> completely. In this guide I’ll show you an example where they are not suitable.</p>
<h2 id="heading-how-to-format-strings-in-python-3-the-basics">How to Format Strings in Python 3 - The Basics</h2>
<p>As I have shown in the previous section, formatting strings in Python using f-strings is quite straightforward. The sole requirement is that each pair of braces contains a valid expression. f-strings can also start with a capital <code>F</code>, and you can combine them with raw strings to produce a formatted output. However, you cannot mix them with bytes literals (<code>b""</code>) or the <code>u""</code> prefix.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1603916687914/zA7xXR7UF.png" alt="fig_5.png" /></p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>book = <span class="hljs-string">"The dog guide"</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>num_pages = <span class="hljs-number">124</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"The book <span class="hljs-subst">{book}</span> has <span class="hljs-subst">{num_pages}</span> pages"</span>
<span class="hljs-string">'The book The dog guide has 124 pages'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>F<span class="hljs-string">"The book {book} has {num_pages} pages"</span>
<span class="hljs-string">'The book The dog guide has 124 pages'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>print(F<span class="hljs-string">r"The book {book} has {num_pages} pages\n"</span>)
The book The dog guide has <span class="hljs-number">124</span> pages\n

<span class="hljs-meta">&gt;&gt;&gt; </span>print(FR<span class="hljs-string">"The book {book} has {num_pages} pages\n"</span>)
The book The dog guide has <span class="hljs-number">124</span> pages\n

<span class="hljs-meta">&gt;&gt;&gt; </span>print(<span class="hljs-string">f"The book <span class="hljs-subst">{book}</span> has <span class="hljs-subst">{num_pages}</span> pages\n"</span>)
The book The dog guide has <span class="hljs-number">124</span> pages
</code></pre>
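<p>As a quick sanity check of the prefix rules, the sketch below builds the same raw f-string with different prefix spellings and then uses the built-in <code>compile</code> to test, without crashing the script, whether a given literal is valid syntax. The helper name <code>is_valid_literal</code> is made up for this example.</p>

```python
book = "The dog guide"

# Any capitalization and order of the f and r prefixes is accepted.
variants = [
    fr"{book}\n",
    rf"{book}\n",
    FR"{book}\n",
    Rf"{book}\n",
]


def is_valid_literal(source):
    """Return True if the given source text compiles as an expression."""
    try:
        compile(source, "demo", "eval")
    except SyntaxError:
        return False
    return True


# All four spellings produce the same string.
print(len(set(variants)))

# Combining f with b or u is rejected by the parser.
print(is_valid_literal('f"ok"'))
print(is_valid_literal('bf"no"'))
print(is_valid_literal('uf"no"'))
```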
<p>And that's pretty much it! In the next section, I'll show you several examples of everything you can do - and cannot do - with f-strings.</p>
<h2 id="heading-limitations">Limitations</h2>
<p>Even though f-strings are very convenient, they don't replace <code>str.format</code> completely. f-strings evaluate expressions in the context where they appear. According to <a target="_blank" href="https://www.python.org/dev/peps/pep-0498/">PEP 498</a>, this means the expression has full access to local and global variables. They're also expressions evaluated at runtime. If the expression used inside the <code>{ &lt;expr&gt; }</code> cannot be evaluated, the interpreter will raise an exception.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{name}</span>"</span>
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
&lt;ipython-input<span class="hljs-number">-1</span>-f0acc441190f&gt; <span class="hljs-keyword">in</span> &lt;module&gt;
----&gt; <span class="hljs-number">1</span> <span class="hljs-string">f"<span class="hljs-subst">{name}</span>"</span>

NameError: name <span class="hljs-string">'name'</span> <span class="hljs-keyword">is</span> <span class="hljs-keyword">not</span> defined
</code></pre>
<p>This is not a problem for the <code>str.format</code> method, as you can define the template string and then call <code>.format</code> to pass on the context.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>s = <span class="hljs-string">"{name}"</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>s.format(name=<span class="hljs-string">"Python"</span>)
<span class="hljs-string">'Python'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>print(s)
{name}
</code></pre>
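<p>In practice this matters when the template has to be defined before the values exist, for example as a module-level constant. Here is a minimal sketch of both options. The names <code>TEMPLATE</code>, <code>render</code> and <code>render_with_f_string</code> are hypothetical.</p>

```python
# A str.format template can be defined once and filled in later.
TEMPLATE = "Hello, {name}! You have {count} new messages."


def render(name, count):
    return TEMPLATE.format(name=name, count=count)


# The f-string equivalent has to live inside a function, so that the
# names it refers to exist at the moment the string is evaluated.
def render_with_f_string(name, count):
    return f"Hello, {name}! You have {count} new messages."


print(render("Python", 3))
print(render_with_f_string("Python", 3))
```

<p>Both calls return the same text. The difference is that <code>TEMPLATE</code> is plain data that could be stored or passed around, while the f-string is code.</p>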
<p>Another limitation is that, before Python 3.12, you cannot use inline comments inside an f-string.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"My name is <span class="hljs-subst">{name #name}</span>!"</span>
  File <span class="hljs-string">"&lt;ipython-input-37-0ae1738dd871&gt;"</span>, line <span class="hljs-number">1</span>
    <span class="hljs-string">f"My name is <span class="hljs-subst">{name #name}</span>!"</span>
    ^
SyntaxError: f-string expression part cannot include <span class="hljs-string">'#'</span>
</code></pre>
<h2 id="heading-how-to-format-an-expression">How to Format an Expression</h2>
<p>If you don't want to define variables, you can use literals inside the braces. Python will evaluate the expression and display the final result.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"4 * 4 is <span class="hljs-subst">{<span class="hljs-number">4</span> * <span class="hljs-number">4</span>}</span>"</span>
<span class="hljs-string">'4 * 4 is 16'</span>
</code></pre>
<p>Or if you prefer...</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>n = <span class="hljs-number">4</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"4 * 4 is <span class="hljs-subst">{n * n}</span>"</span>
<span class="hljs-string">'4 * 4 is 16'</span>
</code></pre>
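<p>The expression is not limited to arithmetic. As a small illustration with made-up values, function calls and method calls work the same way:</p>

```python
prices = [20, 5, 3]
name = "python"

# Built-in functions can be called directly inside the braces.
summary = f"{len(prices)} items, total {sum(prices)}, max {max(prices)}"

# Method calls work too.
title = f"{name.title()} has {len(name)} letters"

print(summary)
print(title)
```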
<h2 id="heading-how-to-use-f-strings-to-debug-your-code">How to Use F-Strings to Debug Your Code</h2>
<p>One of the most frequent uses of f-strings is debugging. Before Python 3.8, many people would do <code>hello = 42; f"hello = {hello}"</code>, but this is very repetitive. As a result, Python 3.8 brought a new feature. You can rewrite that expression as <code>f"{hello=}"</code> and Python will display <code>hello=42</code>. The following example illustrates this using a function, but the principle is the same.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">magic_number</span>():</span>
     ...:     <span class="hljs-keyword">return</span> <span class="hljs-number">42</span>
     ...: 

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{magic_number() = }</span>"</span>
<span class="hljs-string">'magic_number() = 42'</span>
</code></pre>
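<p>The <code>=</code> specifier can also be combined with a format spec or a conversion flag. The sketch below, which needs Python 3.8 or newer, shows that the value is displayed with <code>repr</code> by default, and how to change that.</p>

```python
pi = 3.14159
name = "py"

# Without anything else, the value is shown using repr().
plain = f"{pi=}"
as_repr = f"{name=}"

# A format spec after the "=" is applied to the value.
rounded = f"{pi=:.2f}"

# The !s conversion switches from repr() to str().
as_str = f"{name=!s}"

print(plain)
print(as_repr)
print(rounded)
print(as_str)
```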
<h2 id="heading-how-to-format-a-multiline-f-string-dealing-with-new-lines-and-variables">How to Format a Multiline F-String (Dealing with New Lines and Variables)</h2>
<p>You can use the newline character <code>\n</code> with f-strings to print a string in multiple lines.</p>
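<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>color = {<span class="hljs-string">"R"</span>: <span class="hljs-number">123</span>, <span class="hljs-string">"G"</span>: <span class="hljs-number">145</span>, <span class="hljs-string">"B"</span>: <span class="hljs-number">255</span>}

<span class="hljs-meta">&gt;&gt;&gt; </span>multi_line = <span class="hljs-string">f'R: <span class="hljs-subst">{color[<span class="hljs-string">"R"</span>]}</span>\nG: <span class="hljs-subst">{color[<span class="hljs-string">"G"</span>]}</span>\nB: <span class="hljs-subst">{color[<span class="hljs-string">"B"</span>]}</span>\n'</span>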

<span class="hljs-meta">&gt;&gt;&gt; </span>multi_line
<span class="hljs-string">'R: 123\nG: 145\nB: 255\n'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>print(multi_line)
R: <span class="hljs-number">123</span>
G: <span class="hljs-number">145</span>
B: <span class="hljs-number">255</span>
</code></pre>
<p>As an alternative, you can use triple quotes to represent the multiline string with variables. Triple quotes let you type not only line breaks but also <code>TAB</code> characters directly in the string.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>other = <span class="hljs-string">f"""R: <span class="hljs-subst">{color[<span class="hljs-string">"R"</span>]}</span>
    ...: G: <span class="hljs-subst">{color[<span class="hljs-string">"G"</span>]}</span>
    ...: B: <span class="hljs-subst">{color[<span class="hljs-string">"B"</span>]}</span>
    ...: """</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>print(other)
R: <span class="hljs-number">123</span>
G: <span class="hljs-number">145</span>
B: <span class="hljs-number">255</span>
</code></pre>
<p>Example with <code>TAB</code>s.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>other = <span class="hljs-string">f'''
    ...: this is an example
    ...: 
    ...: 	of color <span class="hljs-subst">{color[<span class="hljs-string">"R"</span>]}</span>
    ...:     
    ...: '''</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>other
<span class="hljs-string">'\nthis is an example\n\n\tof color 123\n    \n'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>print(other)

this <span class="hljs-keyword">is</span> an example

    of color <span class="hljs-number">123</span>



</code></pre>
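<p>One practical annoyance with triple-quoted f-strings is that the indentation of your code becomes part of the string. A common way around it is <code>textwrap.dedent</code> from the standard library. This is just a sketch, and the function name <code>describe</code> is made up.</p>

```python
import textwrap

color = {"R": 123, "G": 145, "B": 255}


def describe(color):
    # The f-string is indented to match the surrounding code.
    text = f"""
        R: {color["R"]}
        G: {color["G"]}
        B: {color["B"]}
    """
    # dedent() removes the common leading whitespace from every line,
    # and strip() drops the blank first and last lines.
    return textwrap.dedent(text).strip()


print(describe(color))
```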
<h2 id="heading-how-to-fix-f-strings-invalid-syntax-error">How to Fix F-String's Invalid Syntax Error</h2>
<p>If not used correctly, f-strings can raise a <code>SyntaxError</code>. In Python versions before 3.12, the most common cause is using double quotes inside a double-quoted f-string. The same is also true for single quotes.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>color = {<span class="hljs-string">"R"</span>: <span class="hljs-number">123</span>, <span class="hljs-string">"G"</span>: <span class="hljs-number">145</span>, <span class="hljs-string">"B"</span>: <span class="hljs-number">255</span>}

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{color[<span class="hljs-string">"R"</span>]}</span>"</span>
  File <span class="hljs-string">"&lt;ipython-input-43-1a7f5d512400&gt;"</span>, line <span class="hljs-number">1</span>
    <span class="hljs-string">f"<span class="hljs-subst">{color[<span class="hljs-string">"R"</span>]}</span>"</span>
    ^
SyntaxError: f-string: unmatched <span class="hljs-string">'['</span>


<span class="hljs-comment"># using only single quotes</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f'<span class="hljs-subst">{color[<span class="hljs-string">'R'</span>]}</span>'</span>
  File <span class="hljs-string">"&lt;ipython-input-44-3499a4e3120c&gt;"</span>, line <span class="hljs-number">1</span>
    <span class="hljs-string">f'<span class="hljs-subst">{color[<span class="hljs-string">'R'</span>]}</span>'</span>
    ^
SyntaxError: f-string: unmatched <span class="hljs-string">'['</span>
</code></pre>
<p>This error happens not only with <code>'['</code>, but also with <code>'('</code>. The cause is the same: the quote inside the expression closes the string prematurely.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>print(<span class="hljs-string">f"price: <span class="hljs-subst">{format(round(<span class="hljs-number">12.345</span>), <span class="hljs-string">","</span>)}</span>"</span>)
  File <span class="hljs-string">"&lt;ipython-input-2-1ae6f786bc4d&gt;"</span>, line <span class="hljs-number">1</span>
    print(<span class="hljs-string">f"price: <span class="hljs-subst">{format(round(<span class="hljs-number">12.345</span>), <span class="hljs-string">","</span>)}</span>"</span>)
                                           ^
SyntaxError: f-string: unmatched <span class="hljs-string">'('</span>
</code></pre>
<p>To fix that, use a different kind of quote inside the braces than the one that delimits the f-string.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>print(<span class="hljs-string">f"price: <span class="hljs-subst">{format(round(<span class="hljs-number">12.345</span>), <span class="hljs-string">','</span>)}</span>"</span>)
price: <span class="hljs-number">12</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>color = {<span class="hljs-string">"R"</span>: <span class="hljs-number">123</span>, <span class="hljs-string">"G"</span>: <span class="hljs-number">145</span>, <span class="hljs-string">"B"</span>: <span class="hljs-number">255</span>}

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{color[<span class="hljs-string">'R'</span>]}</span>"</span>
<span class="hljs-string">'123'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f'<span class="hljs-subst">{color[<span class="hljs-string">"R"</span>]}</span>'</span>
<span class="hljs-string">'123'</span>
</code></pre>
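<p>To summarize the options, here is a short sketch of three ways to avoid the clash. All of them should also work on Python versions older than 3.12.</p>

```python
color = {"R": 123, "G": 145, "B": 255}

# Option 1: use the other kind of quote inside the braces.
mixed_quotes = f"red is {color['R']}"

# Option 2: look the value up first, so no quotes are needed inside.
red = color["R"]
no_inner_quotes = f"red is {red}"

# Option 3: triple quotes on the outside leave both quote kinds free.
triple = f"""red is {color["R"]}"""

print(mixed_quotes)
print(no_inner_quotes)
print(triple)
```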
<p>Another common case is to use f-strings in older versions of Python. f-strings were introduced in Python 3.6. If you use it in an older version, the interpreter will raise a <code>SyntaxError: invalid syntax</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"this is an old version"</span>
  File <span class="hljs-string">"&lt;stdin&gt;"</span>, line <span class="hljs-number">1</span>
    <span class="hljs-string">f"this is an old version"</span>
                            ^
SyntaxError: invalid syntax
</code></pre>
<p>If you see <code>invalid syntax</code>, make sure to double-check the Python version you are running. In my case, I tested it on Python 2.7, and you can find the version by printing <code>sys.version</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> sys; print(sys.version)
<span class="hljs-number">2.7</span><span class="hljs-number">.18</span> (default, Apr <span class="hljs-number">20</span> <span class="hljs-number">2020</span>, <span class="hljs-number">19</span>:<span class="hljs-number">27</span>:<span class="hljs-number">10</span>) 
[GCC <span class="hljs-number">8.3</span><span class="hljs-number">.0</span>]
</code></pre>
<h2 id="heading-how-to-fix-formatting-a-regular-string-which-could-be-a-f-string">How to Fix "formatting a regular string which could be a f-string"</h2>
<p>This message appears because pylint detects one of the older ways of formatting a string, such as the <code>%</code> operator or the <code>str.format</code> method.</p>
<pre><code>C0209: Formatting a regular string which could be a f-string (consider-using-f-string)
</code></pre>
<p>To fix that, you can either:</p>
<ul>
<li>replace the old formatting method with an f-string</li>
<li>ignore the pylint error by disabling the check</li>
</ul>
<p>In <a target="_blank" href="https://miguendes.me/pylint-consider-using-f-string">this post</a>, I explain how to fix this issue step-by-step.</p>
<h3 id="heading-replacing-with-f-strings">Replacing with f-strings</h3>
<p>The following examples illustrate how to convert the old methods to an f-string.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>ip_address = <span class="hljs-string">"127.0.0.1"</span>

<span class="hljs-comment"># pylint complains if we use the methods below</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">"http://%s:8000/"</span> % ip_address
<span class="hljs-string">'http://127.0.0.1:8000/'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">"http://{}:8000/"</span>.format(ip_address)
<span class="hljs-string">'http://127.0.0.1:8000/'</span>

<span class="hljs-comment"># Replace it with a f-string</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"http://<span class="hljs-subst">{ip_address}</span>:8000/"</span>
<span class="hljs-string">'http://127.0.0.1:8000/'</span>
</code></pre>
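<p>The conversion works the same way when there are several values and format specs involved. In this sketch the variables <code>port</code> and <code>ratio</code> are made up for illustration. Note that the format spec moves inside the braces, after a colon.</p>

```python
host = "127.0.0.1"
port = 8000
ratio = 0.5

# Old styles: values are passed separately from the template.
old_percent = "http://%s:%d/ (%.1f%%)" % (host, port, ratio * 100)
old_format = "http://{}:{}/ ({:.1f}%)".format(host, port, ratio * 100)

# f-string: each expression sits where its value will appear.
new_f_string = f"http://{host}:{port}/ ({ratio * 100:.1f}%)"

print(old_percent)
print(old_format)
print(new_f_string)
```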
<h3 id="heading-disable-pylint">Disable <code>pylint</code></h3>
<p>Alternatively, you can silence the warning on a single line by adding a <code># pylint: disable</code> comment with the message code.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>ip_address = <span class="hljs-string">"127.0.0.1"</span>

<span class="hljs-comment"># pylint complains if we use the methods below, so we can disable them</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">"http://%s:8000/"</span> % ip_address  <span class="hljs-comment"># pylint: disable=C0209</span>
<span class="hljs-string">'http://127.0.0.1:8000/'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">"http://{}:8000/"</span>.format(ip_address)  <span class="hljs-comment"># pylint: disable=C0209</span>
<span class="hljs-string">'http://127.0.0.1:8000/'</span>
</code></pre>
<p>Another way of disabling that error is to add it to the <code>.pylintrc</code> file.</p>
<pre><code><span class="hljs-comment"># .pylintrc</span>
disable=
    <span class="hljs-keyword">...</span>
    consider-using-f-string,
    <span class="hljs-keyword">...</span>
</code></pre><p>There's yet another way of disabling it, which is by placing a comment at the top of the file, like so:</p>
<pre><code class="lang-python"> <span class="hljs-comment"># pylint: disable=consider-using-f-string</span>

<span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">your_function</span>(<span class="hljs-params">fun</span>):</span>
    <span class="hljs-string">"""Your code below"""</span>
    ...
</code></pre>
<h2 id="heading-how-to-format-numbers-in-different-bases">How to Format Numbers in Different Bases</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1603918242247/zSDHIu-F8.png" alt="fig_6.png" /></p>
<p>f-strings also allow you to display an integer in different bases. For example, you can display an <code>int</code> as binary without converting it by using the <code>b</code> option.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f'<span class="hljs-subst">{<span class="hljs-number">7</span>:b}</span>'</span>
<span class="hljs-string">'111'</span>
</code></pre>
<p>In summary, you can use f-strings to format an <code>int</code> as: </p>
<ul>
<li>binary, with <code>b</code></li>
<li>hex, with <code>x</code></li>
<li>octal, with <code>o</code></li>
<li>HEX (where all chars are capitalized), with <code>X</code></li>
</ul>
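<p>As a quick illustration of these options, here's a minimal sketch. The variable names and the value <code>255</code> are just picked for this example. It also shows the <code>#</code> flag, which adds the base prefix to the output.</p>
<pre><code class="lang-python">number = 255

as_binary = f"{number:b}"      # '11111111'
as_octal = f"{number:o}"       # '377'
as_hex = f"{number:x}"         # 'ff'
as_upper_hex = f"{number:X}"   # 'FF'

# The '#' flag adds the base prefix (0b, 0o or 0x)
with_prefix = f"{number:#x}"   # '0xff'

print(as_binary, as_octal, as_hex, as_upper_hex, with_prefix)
</code></pre>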
<p>The following example uses the padding feature and the base formatting to create a table that displays an <code>int</code> in other bases.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>bases = {
       <span class="hljs-string">"b"</span>: <span class="hljs-string">"bin"</span>, 
       <span class="hljs-string">"o"</span>: <span class="hljs-string">"oct"</span>, 
       <span class="hljs-string">"x"</span>: <span class="hljs-string">"hex"</span>, 
       <span class="hljs-string">"X"</span>: <span class="hljs-string">"HEX"</span>, 
       <span class="hljs-string">"d"</span>: <span class="hljs-string">"decimal"</span>
}
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">for</span> n <span class="hljs-keyword">in</span> range(<span class="hljs-number">1</span>, <span class="hljs-number">21</span>):
     ...:     <span class="hljs-keyword">for</span> base, desc <span class="hljs-keyword">in</span> bases.items():
     ...:         print(<span class="hljs-string">f"<span class="hljs-subst">{n:<span class="hljs-number">5</span>{base}</span>}"</span>, end=<span class="hljs-string">' '</span>)
     ...:     print()

    <span class="hljs-number">1</span>     <span class="hljs-number">1</span>     <span class="hljs-number">1</span>     <span class="hljs-number">1</span>     <span class="hljs-number">1</span> 
   <span class="hljs-number">10</span>     <span class="hljs-number">2</span>     <span class="hljs-number">2</span>     <span class="hljs-number">2</span>     <span class="hljs-number">2</span> 
   <span class="hljs-number">11</span>     <span class="hljs-number">3</span>     <span class="hljs-number">3</span>     <span class="hljs-number">3</span>     <span class="hljs-number">3</span> 
  <span class="hljs-number">100</span>     <span class="hljs-number">4</span>     <span class="hljs-number">4</span>     <span class="hljs-number">4</span>     <span class="hljs-number">4</span> 
  <span class="hljs-number">101</span>     <span class="hljs-number">5</span>     <span class="hljs-number">5</span>     <span class="hljs-number">5</span>     <span class="hljs-number">5</span> 
  <span class="hljs-number">110</span>     <span class="hljs-number">6</span>     <span class="hljs-number">6</span>     <span class="hljs-number">6</span>     <span class="hljs-number">6</span> 
  <span class="hljs-number">111</span>     <span class="hljs-number">7</span>     <span class="hljs-number">7</span>     <span class="hljs-number">7</span>     <span class="hljs-number">7</span> 
 <span class="hljs-number">1000</span>    <span class="hljs-number">10</span>     <span class="hljs-number">8</span>     <span class="hljs-number">8</span>     <span class="hljs-number">8</span> 
 <span class="hljs-number">1001</span>    <span class="hljs-number">11</span>     <span class="hljs-number">9</span>     <span class="hljs-number">9</span>     <span class="hljs-number">9</span> 
 <span class="hljs-number">1010</span>    <span class="hljs-number">12</span>     a     A    <span class="hljs-number">10</span> 
 <span class="hljs-number">1011</span>    <span class="hljs-number">13</span>     b     B    <span class="hljs-number">11</span> 
 <span class="hljs-number">1100</span>    <span class="hljs-number">14</span>     c     C    <span class="hljs-number">12</span> 
 <span class="hljs-number">1101</span>    <span class="hljs-number">15</span>     d     D    <span class="hljs-number">13</span> 
 <span class="hljs-number">1110</span>    <span class="hljs-number">16</span>     e     E    <span class="hljs-number">14</span> 
 <span class="hljs-number">1111</span>    <span class="hljs-number">17</span>     f     F    <span class="hljs-number">15</span> 
<span class="hljs-number">10000</span>    <span class="hljs-number">20</span>    <span class="hljs-number">10</span>    <span class="hljs-number">10</span>    <span class="hljs-number">16</span> 
<span class="hljs-number">10001</span>    <span class="hljs-number">21</span>    <span class="hljs-number">11</span>    <span class="hljs-number">11</span>    <span class="hljs-number">17</span> 
<span class="hljs-number">10010</span>    <span class="hljs-number">22</span>    <span class="hljs-number">12</span>    <span class="hljs-number">12</span>    <span class="hljs-number">18</span> 
<span class="hljs-number">10011</span>    <span class="hljs-number">23</span>    <span class="hljs-number">13</span>    <span class="hljs-number">13</span>    <span class="hljs-number">19</span> 
<span class="hljs-number">10100</span>    <span class="hljs-number">24</span>    <span class="hljs-number">14</span>    <span class="hljs-number">14</span>    <span class="hljs-number">20</span>
</code></pre>
<h2 id="heading-how-to-print-formatted-objects-with-f-strings">How to Print Formatted Objects With F-Strings</h2>
<p>You can print custom objects using f-strings. By default, when you pass an object instance to an f-string, it will display what the <code>__str__</code> method returns. However, you can also use the <a target="_blank" href="https://www.python.org/dev/peps/pep-3101/#explicit-conversion-flag">explicit conversion flag</a> to display the <code>__repr__</code>.</p>
<pre><code>!r - converts the value to a string using repr().
!s - converts the value to a string using str().
</code></pre><pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Color</span>:</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, r: float = <span class="hljs-number">255</span>, g: float = <span class="hljs-number">255</span>, b: float = <span class="hljs-number">255</span></span>):</span>
        self.r = r
        self.g = g
        self.b = b

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__str__</span>(<span class="hljs-params">self</span>) -&gt; str:</span>
        <span class="hljs-keyword">return</span> <span class="hljs-string">"A RGB color"</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__repr__</span>(<span class="hljs-params">self</span>) -&gt; str:</span>
        <span class="hljs-keyword">return</span> <span class="hljs-string">f"Color(r=<span class="hljs-subst">{self.r}</span>, g=<span class="hljs-subst">{self.g}</span>, b=<span class="hljs-subst">{self.b}</span>)"</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>c = Color(r=<span class="hljs-number">123</span>, g=<span class="hljs-number">32</span>, b=<span class="hljs-number">255</span>)

<span class="hljs-comment"># When no option is passed, the __str__ result is printed</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{c}</span>"</span>
<span class="hljs-string">'A RGB color'</span>

<span class="hljs-comment"># When `obj!r` is used, the __repr__ output is printed</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{c!r}</span>"</span>
<span class="hljs-string">'Color(r=123, g=32, b=255)'</span>

<span class="hljs-comment"># Same as the default</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{c!s}</span>"</span>
<span class="hljs-string">'A RGB color'</span>
</code></pre>
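<p>Python also allows us to <a target="_blank" href="https://www.python.org/dev/peps/pep-3101/#controlling-formatting-on-a-per-type-basis">control the formatting on a per-type basis</a> through the <code>__format__</code> method. The following example defines three custom format codes (<code>v</code>, <code>vv</code> and <code>vvv</code>) alongside <code>s</code> and <code>r</code>, and raises a <code>ValueError</code> for anything else.</p>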
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">Color</span>:</span>
    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span>(<span class="hljs-params">self, r: float = <span class="hljs-number">255</span>, g: float = <span class="hljs-number">255</span>, b: float = <span class="hljs-number">255</span></span>):</span>
        self.r = r
        self.g = g
        self.b = b

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__str__</span>(<span class="hljs-params">self</span>) -&gt; str:</span>
        <span class="hljs-keyword">return</span> <span class="hljs-string">"A RGB color"</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__repr__</span>(<span class="hljs-params">self</span>) -&gt; str:</span>
        <span class="hljs-keyword">return</span> <span class="hljs-string">f"Color(r=<span class="hljs-subst">{self.r}</span>, g=<span class="hljs-subst">{self.g}</span>, b=<span class="hljs-subst">{self.b}</span>)"</span>

    <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__format__</span>(<span class="hljs-params">self, format_spec: str</span>) -&gt; str:</span>
        <span class="hljs-keyword">if</span> <span class="hljs-keyword">not</span> format_spec <span class="hljs-keyword">or</span> format_spec == <span class="hljs-string">"s"</span>:
            <span class="hljs-keyword">return</span> str(self)

        <span class="hljs-keyword">if</span> format_spec == <span class="hljs-string">"r"</span>:
            <span class="hljs-keyword">return</span> repr(self)

        <span class="hljs-keyword">if</span> format_spec == <span class="hljs-string">"v"</span>:
            <span class="hljs-keyword">return</span> <span class="hljs-string">f"Color(r=<span class="hljs-subst">{self.r}</span>, g=<span class="hljs-subst">{self.g}</span>, b=<span class="hljs-subst">{self.b}</span>) - A nice RGB thing."</span>

        <span class="hljs-keyword">if</span> format_spec == <span class="hljs-string">"vv"</span>:
            <span class="hljs-keyword">return</span> (
                <span class="hljs-string">f"Color(r=<span class="hljs-subst">{self.r}</span>, g=<span class="hljs-subst">{self.g}</span>, b=<span class="hljs-subst">{self.b}</span>) "</span>
                <span class="hljs-string">f"- A more verbose nice RGB thing."</span>
            )

        <span class="hljs-keyword">if</span> format_spec == <span class="hljs-string">"vvv"</span>:
            <span class="hljs-keyword">return</span> (
                <span class="hljs-string">f"Color(r=<span class="hljs-subst">{self.r}</span>, g=<span class="hljs-subst">{self.g}</span>, b=<span class="hljs-subst">{self.b}</span>) "</span>
                <span class="hljs-string">f"- A SUPER verbose nice RGB thing."</span>
            )

        <span class="hljs-keyword">raise</span> ValueError(
            <span class="hljs-string">f"Unknown format code '<span class="hljs-subst">{format_spec}</span>' "</span> <span class="hljs-string">"for object of type 'Color'"</span>
        )

<span class="hljs-meta">&gt;&gt;&gt; </span>c = Color(r=<span class="hljs-number">123</span>, g=<span class="hljs-number">32</span>, b=<span class="hljs-number">255</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f'<span class="hljs-subst">{c:v}</span>'</span>
<span class="hljs-string">'Color(r=123, g=32, b=255) - A nice RGB thing.'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f'<span class="hljs-subst">{c:vv}</span>'</span>
<span class="hljs-string">'Color(r=123, g=32, b=255) - A more verbose nice RGB thing.'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f'<span class="hljs-subst">{c:vvv}</span>'</span>
<span class="hljs-string">'Color(r=123, g=32, b=255) - A SUPER verbose nice RGB thing.'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f'<span class="hljs-subst">{c}</span>'</span>
<span class="hljs-string">'A RGB color'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f'<span class="hljs-subst">{c:s}</span>'</span>
<span class="hljs-string">'A RGB color'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f'<span class="hljs-subst">{c:r}</span>'</span>
<span class="hljs-string">'Color(r=123, g=32, b=255)'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f'<span class="hljs-subst">{c:j}</span>'</span>
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
&lt;ipython-input<span class="hljs-number">-20</span><span class="hljs-number">-1</span>c0ee8dd74be&gt; <span class="hljs-keyword">in</span> &lt;module&gt;
----&gt; <span class="hljs-number">1</span> <span class="hljs-string">f'<span class="hljs-subst">{c:j}</span>'</span>

&lt;ipython-input<span class="hljs-number">-15</span><span class="hljs-number">-985</span>c4992e957&gt; <span class="hljs-keyword">in</span> __format__(self, format_spec)
     <span class="hljs-number">29</span>                 <span class="hljs-string">f"- A SUPER verbose nice RGB thing."</span>
     <span class="hljs-number">30</span>             )
---&gt; <span class="hljs-number">31</span>         <span class="hljs-keyword">raise</span> ValueError(
     <span class="hljs-number">32</span>             <span class="hljs-string">f"Unknown format code '<span class="hljs-subst">{format_spec}</span>' "</span> <span class="hljs-string">"for object of type 'Color'"</span>
     <span class="hljs-number">33</span>         )

ValueError: Unknown format code <span class="hljs-string">'j'</span> <span class="hljs-keyword">for</span> object of type <span class="hljs-string">'Color'</span>
</code></pre>
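<p>A custom <code>__format__</code> doesn't have to parse the spec by hand. Here's a small sketch, using a made-up <code>Celsius</code> class that is not part of the example above, that hands the spec over to the built-in <code>format</code> function so the usual float options keep working. It also shows that <code>format()</code> and <code>str.format</code> go through the same method.</p>
<pre><code class="lang-python">class Celsius:
    """Made-up class, used only for this illustration."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __format__(self, format_spec):
        # Let float's own formatter handle the spec, then append the unit
        return format(self.degrees, format_spec) + " C"


temp = Celsius(21.456)

one_decimal = f"{temp:.1f}"
via_builtin = format(temp, ".1f")
via_str_format = "{:.1f}".format(temp)

print(one_decimal)  # 21.5 C
</code></pre>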
<p>Lastly, there's also the <code>!a</code> conversion flag, which calls <a target="_blank" href="https://docs.python.org/3/library/functions.html#ascii"><code>ascii()</code></a> on the value and escapes non-ASCII chars.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>utf_str = <span class="hljs-string">"Áeiöu"</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{utf_str!a}</span>"</span>
<span class="hljs-string">"'\\xc1ei\\xf6u'"</span>
</code></pre>
<h2 id="heading-how-to-use-f-strings-to-format-a-float">How to Use F-Strings to Format a Float</h2>
<p>f-strings let you format floats the same way the <code>str.format</code> method does. To do that, add a <code>:</code> (colon) followed by a <code>.</code> (dot) and the number of decimal places, with an <code>f</code> suffix. </p>
<p>For instance, you can round a float to 2 decimal places and print the variable just like this:</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>num = <span class="hljs-number">4.123956</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"num rounded to 2 decimal places = <span class="hljs-subst">{num:<span class="hljs-number">.2</span>f}</span>"</span>
<span class="hljs-string">'num rounded to 2 decimal places = 4.12'</span>
</code></pre>
<p>If you don't specify a format, the float is displayed with all its digits, exactly as <code>str</code> would show it.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>print(<span class="hljs-string">f'<span class="hljs-subst">{num}</span>'</span>)
<span class="hljs-number">4.123956</span>
</code></pre>
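<p>The precision doesn't need to be hard-coded either. As a small sketch (the <code>places</code> variable is just a name chosen for this example), you can nest a second pair of brackets inside the format spec, and you can put a width in front of the dot to pad the result.</p>
<pre><code class="lang-python">num = 4.123956
places = 3

# The precision is read from the `places` variable
dynamic = f"{num:.{places}f}"   # '4.124'

# A width in front of the dot pads the result to 10 chars
padded = f"{num:10.2f}"         # '      4.12'

print(dynamic)
print(padded)
</code></pre>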
<h2 id="heading-how-to-format-a-number-as-percentage">How to Format a Number as Percentage</h2>
<p>Python f-strings have a very convenient way of formatting percentages. The rules are similar to float formatting, except that you append a <code>%</code> instead of <code>f</code>. It multiplies the number by 100 and displays it in fixed-point format, followed by a percent sign. You can also specify the precision; when you don't, 6 decimal places are used.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>total = <span class="hljs-number">87</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>true_pos = <span class="hljs-number">34</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>perc = true_pos / total

<span class="hljs-meta">&gt;&gt;&gt; </span>perc
<span class="hljs-number">0.39080459770114945</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"Percentage of true positive: <span class="hljs-subst">{perc:%}</span>"</span>
<span class="hljs-string">'Percentage of true positive: 39.080460%'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"Percentage of true positive: <span class="hljs-subst">{perc:<span class="hljs-number">.2</span>%}</span>"</span>
<span class="hljs-string">'Percentage of true positive: 39.08%'</span>
</code></pre>
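<p>If you format percentages in several places, it may be handy to wrap this in a function. The helper below is only a sketch, and <code>as_percentage</code> is a name I made up for it. It reuses the numbers from the example above and takes the precision as a parameter.</p>
<pre><code class="lang-python">def as_percentage(part, whole, places=1):
    """Return part / whole formatted as a percentage string."""
    return f"{part / whole:.{places}%}"


one_decimal = as_percentage(34, 87)             # '39.1%'
no_decimals = as_percentage(34, 87, places=0)   # '39%'

print(one_decimal)
print(no_decimals)
</code></pre>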
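<h2 id="heading-how-to-justify-or-add-padding-to-a-f-string">How to Justify or Add Padding to an F-String</h2>
<p>You can justify a string quite easily using the <code>&lt;</code> or <code>&gt;</code> characters followed by the total width: <code>&lt;</code> aligns the text to the left and <code>&gt;</code> aligns it to the right.</p>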
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1604743387727/6iBt6HyIw.png" alt="how to justify or add padding to a string in python" /></p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>greetings = <span class="hljs-string">"hello"</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"She says <span class="hljs-subst">{greetings:&gt;<span class="hljs-number">10</span>}</span>"</span>
<span class="hljs-string">'She says      hello'</span>

<span class="hljs-comment"># Right-align within a field of 10 chars (the padding goes on the left)</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{greetings:&gt;<span class="hljs-number">10</span>}</span>"</span>
<span class="hljs-string">'     hello'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{greetings:&lt;<span class="hljs-number">10</span>}</span>"</span>
<span class="hljs-string">'hello     '</span>

<span class="hljs-comment"># Strings are left-aligned by default, so you can omit the &lt;</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{greetings:<span class="hljs-number">10</span>}</span>"</span>
<span class="hljs-string">'hello     '</span>
</code></pre>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1603829974543/sBUN1XIqv.png" alt="fig_2.png" /></p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>a = <span class="hljs-string">"1"</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>b = <span class="hljs-string">"21"</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>c = <span class="hljs-string">"321"</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>d = <span class="hljs-string">"4321"</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>print(<span class="hljs-string">"\n"</span>.join((<span class="hljs-string">f"<span class="hljs-subst">{a:&gt;<span class="hljs-number">10</span>}</span>"</span>, <span class="hljs-string">f"<span class="hljs-subst">{b:&gt;<span class="hljs-number">10</span>}</span>"</span>, <span class="hljs-string">f"<span class="hljs-subst">{c:&gt;<span class="hljs-number">10</span>}</span>"</span>, <span class="hljs-string">f"<span class="hljs-subst">{d:&gt;<span class="hljs-number">10</span>}</span>"</span>)))
         <span class="hljs-number">1</span>
        <span class="hljs-number">21</span>
       <span class="hljs-number">321</span>
      <span class="hljs-number">4321</span>
</code></pre>
<h2 id="heading-how-to-escape-characters-with-f-string">How to Escape Characters With f-string</h2>
<p>In case you want to display the variable name surrounded by curly brackets instead of rendering its value, you can escape the brackets by doubling them: <code>{{&lt;expr&gt;}}</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>hello = <span class="hljs-string">"world"</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"{{hello}} = <span class="hljs-subst">{hello}</span>"</span>
<span class="hljs-string">'{hello} = world'</span>
</code></pre>
<p>Now, if you want to escape a double quote, you can use the backslash <code>\"</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{hello}</span> = \"hello\""</span>
<span class="hljs-string">'world = "hello"'</span>
</code></pre>
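<p>Two cases that tend to come up in practice are sketched below: wrapping a rendered value in literal brackets, and leaving a placeholder in place so that <code>str.format</code> can fill it in later. The variable names are only illustrative.</p>
<pre><code class="lang-python">hello = "world"

# The doubled brackets are literal, the inner pair is still evaluated
wrapped = f"{{{hello}}}"   # '{world}'

# {{name}} survives as {name}, so it can be filled in later
template = f"{{name}} says hello {hello}"
later = template.format(name="Python")

print(wrapped)
print(template)
print(later)
</code></pre>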
<h2 id="heading-how-to-center-a-string">How to Center a String</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1603564554049/19ujHtjzT.png" alt="fig_1.png" /></p>
<p>Centering a string can be achieved by using <code>var:^N</code>, where <code>var</code> is the variable you want to display and <code>N</code> is the total width of the resulting string. If <code>N</code> is smaller than the length of <code>var</code>, Python displays the whole string.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>hello = <span class="hljs-string">"world"</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{hello:^<span class="hljs-number">11</span>}</span>"</span>
<span class="hljs-string">'   world   '</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{hello:*^<span class="hljs-number">11</span>}</span>"</span>
<span class="hljs-string">'***world***'</span>

<span class="hljs-comment"># Extra padding is added to the right</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{hello:*^<span class="hljs-number">10</span>}</span>"</span>
<span class="hljs-string">'**world***'</span>

<span class="hljs-comment"># N shorter than len(hello)</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{hello:^<span class="hljs-number">2</span>}</span>"</span>
<span class="hljs-string">'world'</span>
</code></pre>
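<p>One place where centering is useful is when printing section banners. This is just a sketch with names picked for the example. It takes the width from a variable by nesting it inside the format spec and uses <code>=</code> as the fill char.</p>
<pre><code class="lang-python">title = " Report "
width = 30

# '=' is the fill char, '^' centers, and the width comes from a variable
banner = f"{title:=^{width}}"

print(banner)
</code></pre>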
<h2 id="heading-how-to-add-a-thousand-separator">How To Add a Thousand Separator</h2>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1603872626078/l1AR3uXzj.png" alt="fig_4.png" /></p>
<p>f-strings also allow us to customize how numbers are displayed. One common operation is to use an underscore as a thousands separator.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>big_num = <span class="hljs-number">1234567890</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{big_num:_}</span>"</span>
<span class="hljs-string">'1_234_567_890'</span>
</code></pre>
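<p>The underscore can be combined with a precision and with the base options shown earlier. Here's a short sketch with values chosen only for illustration. Note that for binary, octal and hex output, the underscore is inserted every 4 digits instead of every 3.</p>
<pre><code class="lang-python">price = 1234567.891

grouped_float = f"{price:_.2f}"       # '1_234_567.89'

# For 'b', 'o', 'x' and 'X' the digits are grouped in fours
grouped_hex = f"{0xDEADBEEF:_x}"      # 'dead_beef'
grouped_bin = f"{255:_b}"             # '1111_1111'

print(grouped_float, grouped_hex, grouped_bin)
</code></pre>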
<h3 id="heading-how-to-format-a-number-with-commas-as-decimal-separator">How to Format a Number With Commas as Thousands Separator</h3>
<p>The underscore isn’t the only option. The format specification also accepts a comma as the thousands separator.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>big_num = <span class="hljs-number">1234567890</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{big_num:,}</span>"</span>
<span class="hljs-string">'1,234,567,890'</span>
</code></pre>
<p>You can also format a float with commas and set the precision in one go.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>num = <span class="hljs-number">2343552.6516251625</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{num:,<span class="hljs-number">.3</span>f}</span>"</span>
<span class="hljs-string">'2,343,552.652'</span>
</code></pre>
<h3 id="heading-how-to-format-a-number-with-spaces-as-decimal-separator">How to Format a Number With Spaces as Thousands Separator</h3>
<blockquote>
<p>What about using spaces instead?</p>
</blockquote>
<p>Well, this one is a bit “hacky” but it works. You can use the <code>,</code> as the separator, then replace it with a space.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>big_num = <span class="hljs-number">1234567890</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{big_num:,}</span>"</span>.replace(<span class="hljs-string">','</span>, <span class="hljs-string">' '</span>)
<span class="hljs-string">'1 234 567 890'</span>
</code></pre>
<p>Another option is to set the locale of your environment to one that uses spaces as a thousands separator, such as <code>pl_PL</code>. For more info, see this thread on <a target="_blank" href="https://stackoverflow.com/a/17484665">Stack Overflow</a>.</p>
<h2 id="heading-how-to-format-a-number-in-scientific-notation-exponential-notation">How to Format a Number in Scientific Notation (Exponential Notation)</h2>
<p>Formatting a number in scientific notation is possible with the <code>e</code> or <code>E</code> option.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>num = <span class="hljs-number">2343552.6516251625</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{num:e}</span>"</span>
<span class="hljs-string">'2.343553e+06'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{num:E}</span>"</span>
<span class="hljs-string">'2.343553E+06'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{num:<span class="hljs-number">.2</span>e}</span>"</span>
<span class="hljs-string">'2.34e+06'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{num:<span class="hljs-number">.4</span>E}</span>"</span>
<span class="hljs-string">'2.3436E+06'</span>
</code></pre>
<h2 id="heading-using-if-else-conditional-in-a-f-string">Using <code>if-else</code> Conditional in an F-String</h2>
<p>f-strings can also evaluate more complex expressions, such as an inline <code>if/else</code>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>a = <span class="hljs-string">"this is a"</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>b = <span class="hljs-string">"this is b"</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{a <span class="hljs-keyword">if</span> <span class="hljs-number">10</span> &gt; <span class="hljs-number">5</span> <span class="hljs-keyword">else</span> b}</span>"</span>
<span class="hljs-string">'this is a'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{a <span class="hljs-keyword">if</span> <span class="hljs-number">10</span> &lt; <span class="hljs-number">5</span> <span class="hljs-keyword">else</span> b}</span>"</span>
<span class="hljs-string">'this is b'</span>
</code></pre>
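<p>A typical use for this is picking between singular and plural. The <code>describe</code> function below is a hypothetical helper I wrote just to show the idea.</p>
<pre><code class="lang-python">def describe(count):
    """Hypothetical helper that adds an 's' unless count is exactly 1."""
    return f"{count} item{'' if count == 1 else 's'}"


print(describe(1))   # 1 item
print(describe(3))   # 3 items
</code></pre>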
<h2 id="heading-how-to-use-f-string-with-a-dictionary">How to Use F-String With a Dictionary</h2>
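<p>You can use dictionaries in an f-string. The only requirement, on Python versions before 3.12, is to use a different quotation mark for the key than the one enclosing the f-string. From Python 3.12 onwards, the same quotes can be reused. </p>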
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>color = {<span class="hljs-string">"R"</span>: <span class="hljs-number">123</span>, <span class="hljs-string">"G"</span>: <span class="hljs-number">145</span>, <span class="hljs-string">"B"</span>: <span class="hljs-number">255</span>}

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{color[<span class="hljs-string">'R'</span>]}</span>"</span>
<span class="hljs-string">'123'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f'<span class="hljs-subst">{color[<span class="hljs-string">"R"</span>]}</span>'</span>
<span class="hljs-string">'123'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"RGB = (<span class="hljs-subst">{color[<span class="hljs-string">'R'</span>]}</span>, <span class="hljs-subst">{color[<span class="hljs-string">'G'</span>]}</span>, <span class="hljs-subst">{color[<span class="hljs-string">'B'</span>]}</span>)"</span>
<span class="hljs-string">'RGB = (123, 145, 255)'</span>
</code></pre>
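<p>A dictionary lookup can also take a format spec, like any other expression. As a sketch, the snippet below turns the same <code>color</code> dictionary into a hex code by formatting each value as a two-digit hex number. The <code>hex_code</code> name is just for this example.</p>
<pre><code class="lang-python">color = {"R": 123, "G": 145, "B": 255}

# '02x' means: hex, at least 2 digits, padded with zeros
hex_code = f"#{color['R']:02x}{color['G']:02x}{color['B']:02x}"

print(hex_code)   # #7b91ff
</code></pre>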
<h2 id="heading-how-to-concatenate-f-strings">How to Concatenate F-Strings</h2>
<p>Concatenating f-strings is like concatenating regular strings. You can do that implicitly, or explicitly by applying the <code>+</code> operator or using the <code>str.join</code> method.</p>
<pre><code class="lang-python"><span class="hljs-comment"># Implicit string concatenation</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{<span class="hljs-number">123</span>}</span>"</span> <span class="hljs-string">" = "</span> <span class="hljs-string">f"<span class="hljs-subst">{<span class="hljs-number">100</span>}</span>"</span> <span class="hljs-string">" + "</span> <span class="hljs-string">f"<span class="hljs-subst">{<span class="hljs-number">20</span>}</span>"</span> <span class="hljs-string">" + "</span> <span class="hljs-string">f"<span class="hljs-subst">{<span class="hljs-number">3</span>}</span>"</span>
<span class="hljs-string">'123 = 100 + 20 + 3'</span>

<span class="hljs-comment"># Explicit concatenation using '+' operator</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{<span class="hljs-number">12</span>}</span>"</span> + <span class="hljs-string">" != "</span> + <span class="hljs-string">f"<span class="hljs-subst">{<span class="hljs-number">13</span>}</span>"</span>
<span class="hljs-string">'12 != 13'</span>

<span class="hljs-comment"># string concatenation using `str.join`</span>
<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">" "</span>.join((<span class="hljs-string">f"<span class="hljs-subst">{<span class="hljs-number">13</span>}</span>"</span>, <span class="hljs-string">f"<span class="hljs-subst">{<span class="hljs-number">45</span>}</span>"</span>))
<span class="hljs-string">'13 45'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">"#"</span>.join((<span class="hljs-string">f"<span class="hljs-subst">{<span class="hljs-number">13</span>}</span>"</span>, <span class="hljs-string">f"<span class="hljs-subst">{<span class="hljs-number">45</span>}</span>"</span>))
<span class="hljs-string">'13#45'</span>
</code></pre>
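<p>Here is a small script version of the same three approaches, as a sketch rather than a recipe. The values <code>name</code> and <code>year</code> are made up for illustration. Keep in mind that implicit concatenation only applies to adjacent string literals; strings stored in variables need <code>+</code> or <code>str.join</code>.</p>

```python
# Hypothetical values, used only for illustration.
name = "Ada"
year = 1843

# Implicit concatenation: adjacent literals are merged into one string.
implicit = f"{name} wrote " f"notes in {year}"

# Explicit concatenation with the + operator gives the same result.
explicit = f"{name} wrote " + f"notes in {year}"

# str.join is handy when the pieces come from an iterable.
joined = ", ".join(f"{n:02}" for n in (1, 2, 3))

print(implicit)  # Ada wrote notes in 1843
print(explicit)  # Ada wrote notes in 1843
print(joined)    # 01, 02, 03
```

<p>For a handful of pieces any of the three reads fine. When the number of pieces is not known in advance, <code>str.join</code> is usually the most convenient choice.</p>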
<h2 id="heading-how-to-format-a-date-with-f-string">How to Format a Date With F-String</h2>
<p>f-strings also support the formatting of <code>datetime</code> objects. The process is very similar to how <code>str.format</code> formats dates: everything after the colon is a format string made of the same codes that <code>strftime</code> accepts. For more info about the supported formats, check this <a target="_blank" href="https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes">table</a> in the official docs.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1603958577730/LILz0KDf2.png" alt="fig_7.png" /></p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-keyword">import</span> datetime

<span class="hljs-meta">&gt;&gt;&gt; </span>now = datetime.datetime.now()

<span class="hljs-meta">&gt;&gt;&gt; </span>ten_days_ago = now - datetime.timedelta(days=<span class="hljs-number">10</span>)

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f'<span class="hljs-subst">{ten_days_ago:%Y-%m-%d %H:%M:%S}</span>'</span>
<span class="hljs-string">'2020-10-13 20:24:17'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f'<span class="hljs-subst">{now:%Y-%m-%d %H:%M:%S}</span>'</span>
<span class="hljs-string">'2020-10-23 20:24:17'</span>
</code></pre>
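<p>Because the output of <code>datetime.now()</code> changes on every run, the sketch below uses a fixed, made-up timestamp (<code>moment</code>) so the results are reproducible. It also shows that the f-string produces the same text as calling <code>strftime</code> with the same format codes.</p>

```python
import datetime

# A fixed, made-up timestamp so the output is the same on every run.
moment = datetime.datetime(2020, 10, 23, 20, 24, 17)

# The text after the colon is handed to the datetime object as a format string.
iso_like = f"{moment:%Y-%m-%d %H:%M:%S}"
via_strftime = moment.strftime("%Y-%m-%d %H:%M:%S")

# Any combination of the documented format codes works.
day_first = f"{moment:%d/%m/%Y}"
time_only = f"{moment:%H:%M}"

print(iso_like)                  # 2020-10-23 20:24:17
print(iso_like == via_strftime)  # True
print(day_first)                 # 23/10/2020
print(time_only)                 # 20:24
```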
<h2 id="heading-how-to-add-leading-zeros">How to Add Leading Zeros</h2>
<p>You can add leading zeros by using the format <code>{expr:0len}</code>, where <code>len</code> is the minimum length of the returned string, sign included. You can also include a <em>sign</em> option. In this instance, <code>+</code> means the sign should be shown for both positive and negative numbers. The <code>-</code> means the sign is shown only for negative numbers, which is the default behavior. For more info, check the <a target="_blank" href="https://docs.python.org/3/library/string.html#format-specification-mini-language">string format specification page</a>.</p>
<pre><code class="lang-python"><span class="hljs-meta">&gt;&gt;&gt; </span>num = <span class="hljs-number">42</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{num:<span class="hljs-number">05</span>}</span>"</span>
<span class="hljs-string">'00042'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f'<span class="hljs-subst">{num:+<span class="hljs-number">010</span>}</span>'</span>
<span class="hljs-string">'+000000042'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f'<span class="hljs-subst">{num:<span class="hljs-number">-010</span>}</span>'</span>
<span class="hljs-string">'0000000042'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f"<span class="hljs-subst">{num:<span class="hljs-number">010</span>}</span>"</span>
<span class="hljs-string">'0000000042'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span>num = <span class="hljs-number">-42</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f'<span class="hljs-subst">{num:+<span class="hljs-number">010</span>}</span>'</span>
<span class="hljs-string">'-000000042'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f'<span class="hljs-subst">{num:<span class="hljs-number">010</span>}</span>'</span>
<span class="hljs-string">'-000000042'</span>

<span class="hljs-meta">&gt;&gt;&gt; </span><span class="hljs-string">f'<span class="hljs-subst">{num:<span class="hljs-number">-010</span>}</span>'</span>
<span class="hljs-string">'-000000042'</span>
</code></pre>
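<p>The width does not have to be hard-coded. As a sketch, the example below reads it from a variable through a nested replacement field; the name <code>width</code> and its value are assumptions made for illustration. It also shows that the width is a minimum: longer numbers are never truncated.</p>

```python
num = 42
width = 8  # hypothetical target width, chosen for illustration

# Hard-coded width.
fixed = f"{num:08}"

# Same result with the width taken from a variable (nested field).
dynamic = f"{num:0{width}}"

# The sign counts toward the total width.
negative = f"{-num:0{width}}"
signed = f"{num:+0{width}}"

# The width is a minimum, so longer numbers are left untouched.
too_long = f"{123456:04}"

# Zero padding also combines with a float precision.
padded_float = f"{3.14159:08.3f}"

print(fixed)         # 00000042
print(dynamic)       # 00000042
print(negative)      # -0000042
print(signed)        # +0000042
print(too_long)      # 123456
print(padded_float)  # 0003.142
```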
<h2 id="heading-conclusion">Conclusion</h2>
<p>That’s it for today, folks! I hope you’ve learned something new and useful. Knowing how to make the most out of f-strings can make our lives so much easier. In this post, I showed the most common tricks I use on a day-to-day basis.</p>
<p>Other posts you may like:</p>
<ul>
<li><a target="_blank" href="https://miguendes.me/how-to-check-if-an-exception-is-raised-or-not-with-pytest">How to Check if an Exception Is Raised (or Not) With pytest</a></li>
<li><a target="_blank" href="https://miguendes.me/everything-you-need-to-know-about-pythons-namedtuples">Everything You Need to Know About Python's Namedtuples</a></li>
<li><a target="_blank" href="https://miguendes.me/the-best-way-to-compare-two-dictionaries-in-python">The Best Way to Compare Two Dictionaries in Python</a></li>
<li><a target="_blank" href="https://miguendes.me/useful-resources-to-learn-pythons-internals-from-scratch">11 Useful Resources To Learn Python's Internals From Scratch</a></li>
<li><a target="_blank" href="https://miguendes.me/7-pytest-features-and-plugins-that-will-save-you-tons-of-time">7 pytest Features and Plugins That Will Save You Tons of Time</a></li>
</ul>
<p>See you next time!</p>
]]></content:encoded></item></channel></rss>