Have you ever had a bug that took ages to fix and made no sense at all?
If the answer is yes, then keep reading. If you program in Python, chances are you will run into one of these surprising behaviors sooner or later.
In this post, I’m going to show you 5 things that have great potential to drive you mad in Python. Some of them are very subtle and others are not obvious at all. By learning about them in advance, you can save hours of debugging time.
Here's what we're gonna cover:
- How implicit string concatenation can be dangerous
- Be careful when using the walrus operator like this
- Don't use the += operator on lists
- Why mutable default arguments in functions are the most common bug in Python
- When comparing values, don't be clever!
Without further ado, brace yourself and let’s go!
Implicit String Concatenation
I confess that this one has cost me several hours of my life. When used correctly it is very nice, but when it sneaks in by accident, it’s a headache.
In Python you can concatenate strings not only with the + operator but also implicitly. The following snippets illustrate how this leads to a very common bug.
In : "french " + "bulldog"
Out: 'french bulldog'
<string> + <string> generates a new <string>. But what you may not know is that you can leave + out and Python will still concatenate the strings.
In : "french " "bulldog"
Out: 'french bulldog'
When is this a problem, then? It'll be an issue when you want a list of strings and forget a comma.
In : dogs = ["poodle" "french bulldog", "pit bull", "american bully"]
In : dogs
Out: ['poodlefrench bulldog', 'pit bull', 'american bully']
Oh, it can get worse. Imagine you have a function that accepts two strings, but the second one is optional.
In : from typing import Optional
In : def print_pair(a: str, b: Optional[str] = None):
...:     print("a: ", a, "b: ", b)
...:
In : print_pair(
...:     "First string"
...:     "Second string"
...: )
a:  First stringSecond string b:  None
You see? The function runs just fine, and that’s exactly the problem: no error is raised, so the bug stays hidden. The lesson here is clear: be careful when passing strings to functions or using them as list items.
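One habit that may make a missing comma easier to spot is writing one item per line with a trailing comma after each. The sketch below only illustrates the idea; the names dogs_buggy and dogs_fixed are made up for this example.

```python
# Adjacent string literals are joined by the parser, so a missing comma
# silently merges two items into one.
dogs_buggy = [
    "poodle"  # <- missing comma here
    "french bulldog",
    "pit bull",
    "american bully",
]

# One item per line, each followed by a comma.
dogs_fixed = [
    "poodle",
    "french bulldog",
    "pit bull",
    "american bully",
]

print(len(dogs_buggy))  # 3
print(dogs_buggy[0])    # poodlefrench bulldog
print(len(dogs_fixed))  # 4
```

With this layout a missing comma shows up as a line that ends differently from its neighbors, which is easier to catch in review.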
Be Careful When Using the Walrus Operator
In 2019, Python 3.8 introduced the walrus operator. This new feature generated a lot of controversy. Some people loved it, whereas others hated it. The goal of this post is not to debate that, so I’ll dive right into what makes the walrus confusing.
Before Python 3.8, you could not assign a value to a variable and test if it was “truthy” in the same expression. Consider the following example, which reads data from a socket until an empty string is read. It is inspired by one described in the PEP.
data = sock.recv(4096)
while data:
    clean_data = clean(data)
    print("Received data:", data)
    data = sock.recv(4096)
With the walrus operator, the same loop becomes:
while data := sock.recv(4096):
    clean_data = clean(data)
    print("Received data:", data)
That is, you do both the assignment and the check in the same line by using :=.
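If you want to try the pattern without opening a socket, here is a minimal runnable sketch. It assumes an in-memory io.BytesIO object as a stand-in for the socket; its read() method returns an empty bytes object once the data is exhausted, which ends the loop.

```python
import io

# BytesIO stands in for the socket in this sketch; read() returns b""
# when there is nothing left, which is falsy and stops the loop.
stream = io.BytesIO(b"hello walrus")

chunks = []
while data := stream.read(4):
    chunks.append(data)

print(chunks)  # [b'hell', b'o wa', b'lrus']
```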
In Python we can use tuples to assign values to more than one variable in the same line.
In : a, b = 2, 3
In : a
Out: 2
In : b
Out: 3
Hum... we probably can do the same using walrus, right?
In : (a, b := 16, 19)
Out: (2, 16, 19)
Yeah, a 3-tuple is returned!
I originally claimed that b takes precedence and gets assigned the tuple 16, 19, as if the expression were (a, (b := 16, 19)). That is wrong.
Thanks to @ForceBru, who kindly corrected me: what gets assigned to b is not a tuple, but only the first element after :=, which is 16. As a result, (a, b := 16, 19) is the same as (a, (b := 16), 19). And since a had already been bound to 2, that explains why the 3-tuple (2, 16, 19) is returned.
You can verify that by printing the AST.
import ast
print(ast.dump(ast.parse("(a, b:= 16, 19)")))
Which produces the following output:
Module(
    body=[
        Expr(
            value=Tuple(
                elts=[
                    Name(id="a", ctx=Load()),
                    NamedExpr(
                        target=Name(id="b", ctx=Store()),
                        value=Constant(value=16, kind=None),
                    ),
                    Constant(value=19, kind=None),
                ],
                ctx=Load(),
            )
        )
    ],
    type_ignores=[],
)
As you can see, the thing is confusing!
What happens if a is not defined?
In : (a, b := 16, 19)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-14-bd6265d8ca84> in <module>
----> 1 (a, b := 16, 19)

NameError: name 'a' is not defined
When the first variable is not defined, a NameError will be raised. Now you know how to avoid that!
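If the intent really is to capture a whole pair with the walrus, one option is to parenthesize the tuple on the right-hand side and unpack it afterwards. This is just a sketch; the names pair, first and second are illustrative.

```python
a = 2

# The walrus binds only the expression right after it, so b becomes 16
# and 19 is simply a third tuple element.
result = (a, b := 16, 19)
print(result)  # (2, 16, 19)
print(b)       # 16

# Parenthesizing the tuple makes the walrus capture the whole pair,
# which can then be unpacked as usual.
first, second = (pair := (16, 19))
print(pair)           # (16, 19)
print(first, second)  # 16 19
```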
Be Careful When Using += With Lists
Lists in Python are incredibly nice. You can perform all sorts of stuff like:
- concatenating multiple lists using +
- generating a repeated list by using the * operator
- concatenating and assigning lists using +=
Let’s look at an example of how the + operator works with the list object. I know you may be tired of such toy examples, but please, bear with me.
In : lst = [3, 4, 5, 6, 7]
In : lst_copy = lst
In : lst = lst + [8, 9]
In : lst
Out: [3, 4, 5, 6, 7, 8, 9]
In : lst_copy
Out: [3, 4, 5, 6, 7]
Cool, we created a list called lst, then we created a second name, lst_copy, by pointing it to lst. Then we changed lst by assigning it the result of lst + [8, 9]. As expected, the + operator built a new, longer list for lst, and lst_copy remained the same.
In Python one can shorten expressions like a = a + 1 as a += 1. As I mentioned in the beginning, you can also use the += operator with lists. So, let’s give it a shot and rewrite our example.
In : lst = [3, 4, 5, 6, 7]
In : lst_copy = lst
In : lst += [8, 9]
In : lst
Out: [3, 4, 5, 6, 7, 8, 9]
In : lst_copy
Out: [3, 4, 5, 6, 7, 8, 9]
WAAATTT!? What happened here?
The reason for this behavior is that, like other Python operators, the implementation of += is defined by the class that implements it. That is, to define +=, the list class has defined an object.__iadd__(self, other) magic method. And the way it works is the same as list.extend: it modifies the list in place instead of creating a new one.
So why has lst_copy been modified?
Because it is not an actual copy of lst; it points to the same object in memory.
In : lst
Out: [3, 4, 5, 6, 7, 8, 9]
In : lst_copy
Out: [3, 4, 5, 6, 7, 8, 9]

In : lst = [3, 4, 5, 6, 7]
In : lst_copy = lst
In : lst.extend([8, 9])
In : lst
Out: [3, 4, 5, 6, 7, 8, 9]
In : lst_copy
Out: [3, 4, 5, 6, 7, 8, 9]
The key takeaway is, don't blindly assume operators will have the same semantics across different classes.
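To see the difference for yourself, you can compare object identity with the is operator and take a real copy with list.copy(). The sketch below uses the illustrative names alias and snapshot, which are not part of the examples above.

```python
lst = [3, 4, 5, 6, 7]
alias = lst            # a second name for the same list object
snapshot = lst.copy()  # an independent (shallow) copy

lst += [8, 9]          # in place: mutates the object both names refer to
print(alias is lst)    # True
print(alias)           # [3, 4, 5, 6, 7, 8, 9]
print(snapshot)        # [3, 4, 5, 6, 7]

lst = lst + [10]       # builds a new list and rebinds the name lst
print(alias is lst)    # False
print(alias)           # [3, 4, 5, 6, 7, 8, 9]
```

If you need a list that later in-place changes cannot touch, take the copy explicitly instead of just assigning a new name.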
Mutable Default Arguments
I understand that this one might not be new to you. However, it’s unquestionably one of the most dangerous. The case I’m talking about is the usage of mutable default arguments on functions. If you don’t know what this is all about, take a look at the following example.
In : def add_fruit(fruit: str, basket: list = []) -> list:
...:     basket.append(fruit)
...:     return basket
...:
In : b = add_fruit("banana")
In : b
Out: ['banana']
In : c = add_fruit("apple")
In : c
Out: ['banana', 'apple']
As you can see, we call the function twice without passing a list to it, and yet the second call returns a list with two items. How did that happen?
The reason for this behavior is that when the interpreter defines the function, it also creates the default argument. Then, it binds the object created to the function argument.
In our problem, Python allocated an empty list and bound it to the argument basket. To make things simpler to follow, let’s look at a visual example made with Python Tutor.
As you can see, the default for basket is created once and the function points to it during its entire lifetime. The only exception is when you pass another list to it, but that won’t change the default. Whenever you call the function again without passing a list to it, it will use the one created when the function was defined.
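You can also observe the shared default directly: function objects store their default values in the __defaults__ attribute. The snippet below is a small sketch that redefines the buggy add_fruit (without type hints, for brevity) and inspects that attribute between calls.

```python
def add_fruit(fruit, basket=[]):  # deliberately buggy mutable default
    basket.append(fruit)
    return basket

# The default list is created once, when the function is defined.
print(add_fruit.__defaults__)  # ([],)

add_fruit("banana")
print(add_fruit.__defaults__)  # (['banana'],)

add_fruit("apple")
print(add_fruit.__defaults__)  # (['banana', 'apple'],)

# Passing our own list leaves the stored default untouched.
own = add_fruit("kiwi", [])
print(own)                     # ['kiwi']
print(add_fruit.__defaults__)  # (['banana', 'apple'],)
```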
How can we avoid this, then?
To avoid this, you must set the argument to None and create a list if none is passed.
In : def add_fruit(fruit: str, basket: Optional[list] = None):
...:     if basket is None:
...:         basket = []
...:     basket.append(fruit)
...:     return basket
...:
In : b = add_fruit("banana")
In : b
Out: ['banana']
In : c = add_fruit("apple")
In : c
Out: ['apple']
Great! Now we create a list whenever no argument is passed to the function, which fixes the bug.
Chained Operations Gone Wrong
Chained operations are an exceptional feature. They make the code terse without sacrificing readability. I discuss them in more detail in another blog post, but to give you a bit of context, let’s see them in action.
In : x = 10
In : 20 == x == 0
Out: False
In : 25 > x <= 15
Out: True
Let's pay close attention to the first example. If Python didn't have this feature, that statement could be re-written as:
In : 20 == x and x == 0
Out: False
Now, what happens if we add parentheses to enforce some kind of precedence?
In : 20 == x == 0
Out: False
In : (20 == x) == 0
Out: True
Wait? What on earth has just happened?
When we added the parentheses, (20 == x) was evaluated to False. However, the problem is that False is then compared to 0. I originally wrote that the comparison returns True because 0 is considered a "Falsy" value, but that is not the reason.
As pointed out by @alexmojaki, False == 0 is True because bool is a subclass of int. For instance, other "Falsy" values such as [] and "" are not equal to False.
In : False == 0
Out: True
In : bool(0)
Out: False
In : False == ""
Out: False
In : False == []
Out: False
The lesson here is, be careful when using parentheses in chained operations.
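The three forms can be put side by side to make the difference explicit. This is a small sketch that reuses x = 10 from the example above; the names chained, expanded and grouped are only illustrative.

```python
x = 10

chained = 20 == x == 0             # same as (20 == x) and (x == 0)
expanded = (20 == x) and (x == 0)  # the explicit spelling of the chain
grouped = (20 == x) == 0           # compares False with 0, which is True

print(chained, expanded, grouped)  # False False True

# bool is a subclass of int, so False behaves like the integer 0.
print(issubclass(bool, int))       # True
print(False + 1)                   # 1
```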
That’s it for today, folks! I hope you’ve learned something new and useful.
Python has amazing features, but we must use some of them with caution. If we’re not mindful, we may lose tons of time debugging our code. By learning the common pitfalls, we are much better prepared to spot these bugs and avoid them altogether.
See you next time!