Object References, Mutability and Recycling
Variables are not Boxes
- Python variables are like reference variables in Java.
- A better metaphor is to think of variables as labels with names attached to objects, rather than as boxes.
- b = a doesn’t copy the contents of a; instead it attaches b as another label to the same object.
- Objects are created before assignment.
class Gizmo:
    def __init__(self):
        print(f"Gizmo id: {id(self)}")

x = Gizmo()  # Gizmo id: 4301489152
y = Gizmo() * 10
# Gizmo id: 4301489432  <--- printed, so the object was created first
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
# TypeError: unsupported operand type(s) for *: 'Gizmo' and 'int'
# Notice that the second Gizmo was created before the multiplication was attempted.
# The variable y was never bound, because the right-hand side raised an exception.
Identity, Equality and Aliases
charles = {'name': 'Charles L. Dodgson', 'born': 1832}
lewis = charles  # alias
lewis is charles  # True
id(lewis) == id(charles)  # True
lewis['balance'] = 950
charles  # updated: {'name': 'Charles L. Dodgson', 'born': 1832, 'balance': 950}
alex = {'name': 'Charles L. Dodgson', 'born': 1832, 'balance': 950}
alex == charles  # True (the contents are compared, via dict.__eq__)
alex is charles  # False (they are distinct objects)
An object’s identity never changes once it has been created; you can think of it as the object’s address in memory. The is operator compares the identities of two objects, and id() returns an integer representing an object’s identity.
Choosing Between == and is
If you are comparing a variable to a singleton, it makes sense to use is. The most common case is checking whether a variable is bound to None: write x is None rather than x == None.
We can compare sentinel objects using is as well.
END_OF_DATA = object()
# ... many lines
def traverse(...):
    # ... more lines
    if node is END_OF_DATA:
        return
    # etc.
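- The is operator is faster than == because it cannot be overloaded: Python does not have to look up and call a special method to evaluate it, and the work is just a comparison of two integer ids. In contrast, a == b is syntactic sugar for a.__eq__(b).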
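The difference matters in practice because a class can make == say anything it likes. The following is a minimal sketch; AlwaysEqual is a hypothetical class invented for illustration, not something from the notes above.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with anything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()

# == dispatches to AlwaysEqual.__eq__, so the answer is misleading.
equal_to_none = (obj == None)  # noqa: E711

# "is" compares identities and cannot be overloaded.
is_none = (obj is None)

print(equal_to_none, is_none)
```

This is one reason x is None is preferred over x == None: the identity check cannot be fooled by a custom __eq__.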
The Relative Immutability of Tuples
- Tuples are containers: they hold references to objects. If the referenced objects are mutable, they may change even if the tuple itself doesn’t.
- So the immutability of a tuple only extends to the references it holds (its physical contents), not to the objects those references point to.
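A small sketch of what this means, using made-up values: the tuple keeps holding the same list object, but the value of that list (and so the value of the tuple) can change.

```python
t1 = (1, 2, [30, 40])
t2 = (1, 2, [30, 40])

equal_before = (t1 == t2)      # the two tuples start out equal
list_id_before = id(t1[-1])    # identity of the list held by t1

t1[-1].append(99)              # mutate the list in place, through the tuple

same_list = (id(t1[-1]) == list_id_before)  # the reference did not change
equal_after = (t1 == t2)       # but the tuples no longer compare equal

print(t1, equal_before, same_list, equal_after)
```

This is also why some tuples are unhashable: a tuple that holds a mutable item cannot be used as a dict key.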
Copies are Shallow by Default
- The easiest way to copy a list (or most built-in mutable collections) is to use the built-in constructor for the type itself.
- We could also use l2 = l1[:] to make a shallow copy.
- Remember that only the outermost container is duplicated; the copy is filled with references to the same items held by the original container. This saves memory and causes no problems if all items are immutable. If any items are mutable, a change made through one container is visible through the other, so a shallow copy may not be what you want.
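The points above can be sketched with a list that mixes immutable and mutable items (the values are illustrative only):

```python
l1 = [3, [55, 44], (7, 8, 9)]
l2 = list(l1)   # shallow copy via the constructor
l3 = l1[:]      # shallow copy via slicing

copies_are_equal = (l1 == l2 == l3)
copies_are_distinct = (l2 is not l1) and (l3 is not l1)
inner_list_shared = (l2[1] is l1[1]) and (l3[1] is l1[1])

l1.append(100)      # affects only the outer list l1
l1[1].remove(55)    # affects the inner list shared by l1, l2 and l3

print(l1)
print(l2)
```

Appending to l1 leaves l2 alone because the outer lists are separate objects, while removing 55 shows up in every copy because they all reference the same inner list.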
Deep and Shallow Copies of Arbitrary Objects
- To copy arbitrary objects, use the copy module, which provides the deepcopy (deep copy) and copy (shallow copy) functions.
- Note: making deep copies is not a simple matter in the general case. Objects may have cyclic references that would cause a naive algorithm to enter an infinite loop, but the deepcopy function remembers the objects already copied, so it handles cyclic references gracefully.
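As a rough illustration of both functions, here is a sketch with two lists that reference each other; the names a, b and c are arbitrary.

```python
import copy

a = [10, 20]
b = [a, 30]
a.append(b)             # a now contains b, and b contains a: a cycle

c = copy.deepcopy(a)    # completes despite the cycle

deep_is_new = (c is not a) and (c[2] is not b)
cycle_preserved = (c[2][0] is c)   # the copy has its own equivalent cycle

shallow = copy.copy(b)  # new outer list, same inner objects
shallow_shares_items = (shallow is not b) and (shallow[0] is a)

print(deep_is_new, cycle_preserved, shallow_shares_items)
```

The deep copy reproduces the cyclic structure among the new objects instead of pointing back into the originals.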
Function Parameters as References
The only mode of parameter passing in Python is call by sharing. That is the same mode used in most object-oriented languages, including JavaScript, Ruby, and Java (for Java this applies to reference types; primitive types use call by value).
- Call by sharing means that each formal parameter of the function gets a copy of each reference in the arguments. In other words, the parameters inside the function become aliases of the actual arguments.
- The result is that a function may change any mutable object passed as a parameter, but it cannot change the identity of those objects; that is, it cannot replace an object with another one in the caller’s scope.
def f(a, b):
    a += b
    return a
x = 1
y = 2
f(x, y) # 3
x,y # (1,2)
# now
a = [1, 2]
b = [3, 4]
f(a,b) # [1,2,3,4]
a,b # ([1,2,3,4], [3,4])
t = (10, 20)
u = (30, 40)
f(t, u) # (10, 20, 30, 40)
t,u # ((10, 20), (30, 40))
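The distinction between mutating an argument and rebinding a parameter can be sketched with two small functions. The names rebind and mutate are hypothetical helpers made up for this example.

```python
def rebind(items):
    # Builds a new list and rebinds the local name only;
    # the caller's list is left untouched.
    items = items + [99]
    return items


def mutate(items):
    # Changes the caller's list in place through the alias.
    items.append(99)
    return items


original = [1, 2]

rebound = rebind(original)
snapshot_after_rebind = list(original)    # still [1, 2]
rebound_is_new = (rebound is not original)

mutated = mutate(original)
mutated_is_same = (mutated is original)   # same object, now changed

print(snapshot_after_rebind, original)
```

This mirrors the behavior of f above: a += b mutates a list in place, but for numbers and tuples it creates a new object and rebinds the local name.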
Mutable Types as Parameter Defaults: Bad Idea
class HauntedBus:
    """A bus model haunted by ghost passengers"""

    def __init__(self, passengers=[]):
        self.passengers = passengers

    def pick(self, name):
        self.passengers.append(name)

    def drop(self, name):
        self.passengers.remove(name)
# Output
bus1 = HauntedBus(['Alice', 'Bill'])
bus1.passengers # ['Alice', 'Bill']
bus1.pick('Charlie')
bus1.drop('Alice')
bus1.passengers # ['Bill', 'Charlie']
# the horror begins when the default is used
bus2 = HauntedBus()
bus2.pick('Carrie')
bus2.passengers # ['Carrie']
bus3 = HauntedBus()
bus3.passengers # ['Carrie']
bus3.pick('Dave')
bus2.passengers # ['Carrie', 'Dave']
bus2.passengers is bus3.passengers # True
bus1.passengers # ['Bill', 'Charlie']
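Why this happens: a default value is evaluated once, when the def statement runs, and is stored on the function object. Every call that omits the argument receives that same list. A minimal sketch that inspects the stored default through the function's __defaults__ attribute (the class is repeated here in shortened form so the snippet runs on its own):

```python
class HauntedBus:
    """A bus model haunted by ghost passengers"""

    def __init__(self, passengers=[]):
        self.passengers = passengers

    def pick(self, name):
        self.passengers.append(name)


# The default list lives on the function object and starts out empty.
default_len_before = len(HauntedBus.__init__.__defaults__[0])

bus2 = HauntedBus()
bus2.pick('Carrie')

# The stored default itself has been mutated ...
default_after = HauntedBus.__init__.__defaults__[0]
# ... because bus2.passengers is an alias for it.
is_alias = (default_after is bus2.passengers)

print(default_len_before, default_after, is_alias)
```

Any later HauntedBus() created without arguments starts with whatever the shared default list contains at that moment.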
Defensive Programming with Mutable Parameters
When you are coding a function that receives a mutable parameter, you should carefully consider whether the caller expects the argument passed to be changed.
For example, if your function receives a dict and needs to modify it while processing it, should this side effect be visible outside of the function or not? It depends on the context. It’s really a matter of aligning the expectations of the coder of the function and those of the caller.
basketball_team = ['Sue', 'Tina', 'Maya', 'Diana', 'Pat']
bus = TwilightBus(basketball_team)
bus.drop('Tina')
bus.drop('Pat')
basketball_team # ['Sue', 'Maya', 'Diana']
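The TwilightBus class below violates the Principle of Least Astonishment, a best practice of interface design: dropping passengers from the bus also removes them from the caller’s basketball_team list.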
class TwilightBus:
    """A bus model that makes passengers vanish"""

    def __init__(self, passengers=None):
        if passengers is None:
            self.passengers = []
        else:
            self.passengers = passengers  # aliases the caller's list instead of copying it

    def pick(self, name):
        self.passengers.append(name)

    def drop(self, name):
        self.passengers.remove(name)
- When drop is called, it operates on the object that self.passengers refers to, which is the very list the caller passed in.
- The correct approach is to make a copy of the list passed in.
def __init__(self, passengers=None):
    if passengers is None:
        self.passengers = []
    else:
        self.passengers = list(passengers)
- This makes a copy of the list and, as a bonus, the passengers argument may now be a tuple or any other iterable.
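Putting the corrected __init__ into a complete class gives something like the sketch below. The name SafeBus is hypothetical, chosen here only to keep it apart from TwilightBus.

```python
class SafeBus:
    """Hypothetical bus model that copies the passenger list it is given"""

    def __init__(self, passengers=None):
        if passengers is None:
            self.passengers = []
        else:
            self.passengers = list(passengers)  # defensive copy

    def pick(self, name):
        self.passengers.append(name)

    def drop(self, name):
        self.passengers.remove(name)


basketball_team = ['Sue', 'Tina', 'Maya', 'Diana', 'Pat']
bus = SafeBus(basketball_team)
bus.drop('Tina')
bus.drop('Pat')

# Any iterable works now, because list() accepts iterables.
tuple_bus = SafeBus(('Ann', 'Bob'))
tuple_bus.pick('Cy')

print(basketball_team)
print(bus.passengers, tuple_bus.passengers)
```

The caller's list keeps all five names, while the bus manages its own independent list.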
del and Garbage Collection
The first strange fact about del is that it’s not a function, it’s a statement. We write del x and not del(x), although the latter also works, but only because the expressions x and (x) usually mean the same thing in Python.
The second surprising fact is that del deletes references, not objects. Python’s garbage collector may discard an object from memory as an indirect result of del, if the deleted variable was the last reference to the object.
a = [1, 2]  # bind a to the list
b = a       # b is another label for the same list
del a       # deletes the name a; [1, 2] survives because b still references it
b           # [1, 2]
b = [3]     # rebinding b drops the last reference to [1, 2], so it can now be garbage collected
In CPython, the primary algorithm for garbage collection is reference counting. Essentially, each object keeps a count of how many references point to it. As soon as that refcount reaches zero, the object is immediately destroyed.
You can use weakref.finalize to register a callback function to be called when an object is destroyed.
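A minimal sketch of weakref.finalize follows. Resource is a hypothetical placeholder class, and the timing shown relies on CPython's reference counting; other interpreters may run the callback later.

```python
import weakref


class Resource:
    """Hypothetical placeholder; instances of plain classes support weak references."""


log = []


def on_destroy():
    # Must not hold a reference to the object being watched.
    log.append('destroyed')


r1 = Resource()
r2 = r1                       # second label for the same object
finalizer = weakref.finalize(r1, on_destroy)

del r1                        # removes one reference; r2 still holds the object
alive_after_del = finalizer.alive

r2 = None                     # last reference gone; CPython destroys the object now
alive_after_rebind = finalizer.alive

print(alive_after_del, alive_after_rebind, log)
```

The finalizer holds only a weak reference to the object, so registering it does not keep the object alive.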
Tricks Python Plays with Immutables
- For a tuple t, t[:] doesn’t make a copy but returns a reference to the same object. You also get a reference to the same tuple if you write tuple(t).
- The same behaviour can be observed with instances of str, bytes, and frozenset. (Note: a frozenset is not a sequence, so fs[:] doesn’t work; fs.copy() has the same effect.)
- The sharing of string literals is an optimization technique called interning. CPython uses a similar technique with small integers to avoid unnecessary duplication of numbers that appear frequently in programs, like 0, 1, -1, etc. Note that CPython does not intern all strings or integers, and the criteria it uses to do so are an undocumented implementation detail.
- The tricks discussed in this section, including the behavior of frozenset.copy(), are harmless “lies” that save memory and make the interpreter faster. Do not worry about them; they should not give you any trouble because they only apply to immutable types.
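These behaviors can be observed with a few identity checks. The sketch below reflects CPython implementation details, so the results are not guaranteed by the language and may differ on other interpreters; a list is included for contrast.

```python
t1 = (1, 2, 3)
t2 = tuple(t1)      # CPython returns the same tuple object
t3 = t1[:]          # so does a full slice
same_tuple = (t2 is t1) and (t3 is t1)

fs = frozenset([1, 2, 3])
same_frozenset = (fs.copy() is fs)   # copy() returns the same frozenset

lst = [1, 2, 3]
list_is_copied = (lst[:] is not lst)  # mutable types really are copied

print(same_tuple, same_frozenset, list_is_copied)
```

Code should never depend on these identities; compare immutable values with == and reserve is for singletons and sentinels.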