TL;DR: Quick overview of Python’s pass-by-name model and things to be aware of.
In contrast to, e.g., C, where we have pass-by-value and (via pointers) pass-by-reference when we pass and assign variables, python uses what is often referred to as pass-by-name. This has certain implications and might lead to unexpected side effects and hard-to-identify bugs. Here, I want to quickly summarize the model to hopefully help avoid some of the pitfalls. See here for a colab workbook to try things yourself.
How python works
The often cited official way
python passes data is
“Object references are passed by value.”
This statement is somewhat useless if one does not know how
python is ultimately working. Without dwelling on unnecessary details, almost everything in
python is an object. Now, whenever you encounter a statement of the form
var1 = "I am a string"
then what is happening is that python creates a name var1 for the memory block that holds the string "I am a string"; strictly speaking it is an object, but that is not important here. Suppose now that we execute
var2 = var1
Then python creates another name var2. This name neither (1) points at the memory where the name var1 is stored, nor (2) refers to a copy of the string "I am a string". Instead, var2 is a second name for the very same string "I am a string". The way of thinking about this is that whenever a name appears on the right of = it is resolved to the object it names. So we now have two names for the same memory block.
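One way to check this claim yourself is to compare object identities. The following is a minimal sketch; the helper names same_object and same_id are only chosen for illustration.

```python
var1 = "I am a string"
var2 = var1  # second name for the same object, no copy is made

# `is` compares object identity, whereas `==` would compare values.
same_object = var1 is var2
# id() returns an integer that is unique to an object during its lifetime.
same_id = id(var1) == id(var2)

print(same_object)  # True
print(same_id)      # True
```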
Now with strings in python it is very hard to see how this is different from other languages, as strings cannot be modified in place. What do I mean by that?
Consider the following example:
var1 = "I am a string"
var2 = var1
print(var1)
print(var2)
var2 = var2 + "!!"
print(var1)
print(var2)
We get the output:
I am a string
I am a string
I am a string
I am a string!!
So var2 changed but var1 did not? Not quite. What happens in the statement var2 = var2 + "!!" is that on the right-hand side we create a new string object "I am a string!!" and var2 is then reassigned to be the name of this new string object. This is because strings in python are immutable, so string operations result in new objects. As such, var1 and var2 no longer refer to the same memory block.
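The rebinding can be made visible with an identity check before and after the concatenation. This is a small sketch; the names shared_before and shared_after are hypothetical and only used here.

```python
var1 = "I am a string"
var2 = var1
shared_before = var2 is var1   # both names refer to the same object

var2 = var2 + "!!"             # builds a new str object and rebinds var2 to it
shared_after = var2 is var1    # the names now refer to different objects

print(shared_before)  # True
print(shared_after)   # False
print(var1)           # I am a string
print(var2)           # I am a string!!
```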
Lists are lists of names
Let us now compare this to lists in
python. Consider the example:
list1 = [1,2,3]
print(list1)
list2 = list1
list2[0] = 2
print(list2)
print(list1)
We get the output:
[1, 2, 3]
[2, 2, 3]
[2, 2, 3]
Confusing? Remember that variables are “names for memory blocks”. So we first execute list1 = [1,2,3], i.e., we create a list object [1,2,3] and give it the name list1. Then we create a second name list2 for the same memory block with list2 = list1. Then we write into the first entry of the memory block referenced by list2 with list2[0] = 2. However, this is the same memory block as the one of list1, so list1[0] is now also equal to 2, as seen in the output.
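If a second name should not share the memory block, one option is to create an explicit copy. The sketch below contrasts a plain second name with list.copy(), which returns a new (shallow) list; the names alias and clone are hypothetical.

```python
list1 = [1, 2, 3]
alias = list1          # another name for the same list object
clone = list1.copy()   # a new list object holding the same entries (shallow copy)

alias[0] = 2           # writes into the shared object, visible through list1
clone[1] = 9           # only affects the copy

print(list1)           # [2, 2, 3]
print(alias)           # [2, 2, 3]
print(clone)           # [1, 9, 3]
print(alias is list1)  # True
print(clone is list1)  # False
```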
Passing values to functions
Let us look at another example involving passing arguments to functions:
def changeVal(varX):
    varX[0] = 2
    varX = 7
    return varX

var1 = [1,2,3]
print(var1)
var2 = changeVal(var1)
print(var1)
print(var2)
What output do we expect?
[1, 2, 3]
[2, 2, 3]
7
Again following the above principle of thinking of variables as names for memory, we can understand what is going on. Before the function call, var1 is [1, 2, 3]. We then call the function changeVal, inside of which we have the variable varX. Note that varX is a new name that points to the same memory block as var1, i.e., when we pass a variable to a function, we pass the name of that memory block by value. This is exactly what the ominous phrase “Object references are passed by value.” refers to. Within the function we then modify the first entry of the memory block with the name varX, i.e., we write into the block referenced by varX via varX[0] = 2. As varX and var1 are up to here two different names for the same memory block, var1[0] is now also equal to 2, hence the change of the first entry from 1 to 2. However, then in the function we execute varX = 7, which makes varX the name of a new memory block that holds the integer 7. From then on, varX and var1 are the names of two different memory locations, and the returned 7 is what var2 ends up naming.
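The two steps inside the function can be traced with identity checks. The following sketch uses a hypothetical variant change_val that takes the caller's list a second time (parameter original) purely so it can report whether var_x still names the same object.

```python
def change_val(var_x, original):
    # Hypothetical variant of changeVal that also reports object identity.
    before = var_x is original   # True: var_x names the caller's list
    var_x[0] = 2                 # mutates the shared list object
    var_x = 7                    # rebinds the local name only
    after = var_x is original    # False: var_x now names the integer 7
    return var_x, before, after

var1 = [1, 2, 3]
var2, shared_before, shared_after = change_val(var1, var1)

print(var1)           # [2, 2, 3]
print(var2)           # 7
print(shared_before)  # True
print(shared_after)   # False
```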
In order to consistently resolve the seemingly two different types of assignments, it is helpful to understand that an assignment of the form varX[0] = 2 is also a reassignment of a name, however of the name varX[0] rather than of varX: varX is a name for a memory block that holds a list of names varX[0], varX[1], … .
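That list entries behave like names can be seen when two entries name the same object. This is an illustrative sketch with hypothetical names inner and outer.

```python
inner = [0]
outer = [inner, inner, 5]   # entries 0 and 1 are two names for the same object

outer[0].append(1)          # mutate the object named by entry 0
print(outer)                # [[0, 1], [0, 1], 5]

outer_id = id(outer)
outer[0] = 2                # rebind entry 0 only, the inner list is untouched
print(outer)                # [2, [0, 1], 5]
print(inner)                # [0, 1]

same_outer = id(outer) == outer_id
print(same_outer)           # True: outer still names the same list object
```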
To provide another illustration of this, consider:
list1 = [1,2,3]
print(list1)
list2 = list1
list2.append(4)
print(list1)
print(list2)
We get the output:
[1, 2, 3]
[1, 2, 3, 4]
[1, 2, 3, 4]
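A related pitfall, sketched here as an aside, is that for lists the augmented assignment += modifies the object in place, whereas the form list2 = list2 + [...] creates a new list and rebinds the name. The helper names shared_after_iadd and shared_after_add are hypothetical.

```python
list1 = [1, 2, 3]
list2 = list1

list2 += [4]                         # in-place extension of the shared list
shared_after_iadd = list2 is list1   # still the same object
print(list1)                         # [1, 2, 3, 4]

list2 = list2 + [5]                  # new list object, list2 is rebound
shared_after_add = list2 is list1    # now two different objects
print(list1)                         # [1, 2, 3, 4]
print(list2)                         # [1, 2, 3, 4, 5]
```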
Check whether a function/operation creates a copy or not
Consider the following example:
list1 = [1,2,3,4]
print(list1)
list2 = list1 + [5]
list3 = list1.append(5)
print(list1)
print(list2)
print(list3)
What output do you expect?
[1, 2, 3, 4]
[1, 2, 3, 4, 5]
[1, 2, 3, 4, 5]
None
Both list2 = list1 + [5] as well as list1.append(5) add the element 5 to the list. However, there is a significant difference: the expression list1 + [5] evaluates to [1, 2, 3, 4, 5] and returns a new object, which gets the name list2. On the other hand, list1.append(5) appends the element 5 to list1 but does not return anything, i.e., the return value is None. This is because list1.append does not create a copy but modifies list1 in place. In order to figure out how a function operates it usually suffices to check the specification of the return value. Here are two examples:
For the replace method of a string, str.replace(), the documentation reads:
str.replace(old, new[, count])
Return a copy of the string with all occurrences of substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.
For the append method of a list, list.append(), it reads:
list.append(x)
Add an item to the end of the list. Equivalent to a[len(a):] = [x].
The first description explicitly promises a copy, while the second mentions no return value and describes a modification of the list itself.
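Both documented behaviours can be checked directly. The sketch below uses made-up sample values; the names text, replaced, a, b and append_result are only for illustration.

```python
text = "a-b-c"
replaced = text.replace("-", "+", 1)  # count=1: only the first occurrence
print(text)      # a-b-c  (the original string is unchanged)
print(replaced)  # a+b-c  (a new string object)

a = [1, 2]
b = a                        # second name for the same list
a[len(a):] = [3]             # slice assignment, the documented equivalent of a.append(3)
print(b)                     # [1, 2, 3]

append_result = a.append(4)  # modifies the list in place
print(append_result)         # None
print(b)                     # [1, 2, 3, 4]
```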