Class Metaprogramming
Everyone knows that debugging is twice as hard as writing a program in the first place. So if you’re as clever as you can be when you write it, how will you ever debug it? - Brian W. Kernighan and PJ Plauger, The Elements of Programming Style
- Class metaprogramming is the art of creating or customizing classes at runtime.
- Classes are first-class objects in Python, so a function can create a new class at any time, without using the class keyword.
- Class decorators are also functions, but they are designed to inspect, change, and even replace the decorated class with another class.
- Metaclasses are the most advanced tool for class metaprogramming: they let you create whole new categories of classes with special traits, such as abstract base classes.
- Metaclasses are powerful, but hard to justify and even harder to get right. Class decorators solve many of the same problems and are easier to understand.
Classes as Objects
- Classes are objects in Python. Every class has a number of attributes defined in the Python Data Model. Three of those attributes have appeared several times in this book already: __class__, __name__, and __mro__. Other class attributes are:
  - cls.__bases__: the tuple of base classes of the class.
  - cls.__qualname__: the qualified name of a class or function, which is a dotted path from the global scope of the module to the class definition.
  - cls.__subclasses__(): this method returns a list of the immediate subclasses of the class. The implementation uses weak references to avoid circular references between the superclass and its subclasses, which hold a strong reference to their superclasses in their __bases__ attribute. NOTE: The method lists only subclasses currently in memory. Subclasses in modules not yet imported will not appear in the result.
  - cls.mro(): the interpreter calls this method when building a class to obtain the tuple of superclasses stored in the __mro__ attribute of the class. A metaclass can override this method to customize the method resolution order of the class under construction.
None of the attributes mentioned in this section are listed by the dir(…) function.
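As a quick check of the attributes listed above, here is a minimal sketch. The classes Base, Child, and Nested are made up for illustration; only the attribute names come from the notes.

```python
class Base:
    pass


class Child(Base):
    class Nested:  # nested class, to show a dotted __qualname__
        pass


# __bases__ holds the direct superclasses as a tuple.
bases = Child.__bases__

# __qualname__ is the dotted path from the module scope to the definition.
qualname = Child.Nested.__qualname__

# __subclasses__() lists immediate subclasses that are currently in memory.
subclasses = Base.__subclasses__()

# mro() returns a list; __mro__ stores the same sequence as a tuple.
same_order = Child.mro() == list(Child.__mro__)

# These names live on the metaclass (type), so dir() on the class omits them.
hidden = [name for name in ('__mro__', '__bases__', '__qualname__')
          if name not in dir(Child)]

print(bases, qualname, subclasses, same_order, hidden)
```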
type: The Built-in Class Factory
- We usually think of type as a function that returns the class of an object, because that’s what type(my_object) does: it returns my_object.__class__.
- However, type is a class that creates a new class when invoked with three arguments.
- Using the type constructor, we can create a class such as MyClass at runtime.
- When Python reads a class statement, it calls type to build the class object with these parameters: name (the class name, here MyClass), bases (a tuple of superclasses), and dict (a mapping of attribute names to values; callables become methods).
- The type class is a metaclass: a class that builds classes. Instances of the type class are classes.
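A minimal sketch of the three-argument call to type described above. MySuperClass, MyMixin, and the attributes x and x2 are placeholder names chosen for illustration.

```python
class MySuperClass:
    pass


class MyMixin:
    pass


# Roughly equivalent to this class statement:
#
#     class MyClass(MySuperClass, MyMixin):
#         x = 42
#         def x2(self):
#             return self.x * 2
MyClass = type(
    'MyClass',                                  # name
    (MySuperClass, MyMixin),                    # bases
    {'x': 42, 'x2': lambda self: self.x * 2},   # dict: callables become methods
)

obj = MyClass()
print(obj.x2())                 # 84
print(type(MyClass) is type)    # True: MyClass is an instance of type
```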
A Class Factory Function
- The standard library has a class factory function that appears several times in this book: collections.namedtuple. We also saw typing.NamedTuple and @dataclass.
- Here we build a super simple factory for classes of mutable objects: the simplest possible replacement for @dataclass.
class Dog:
    def __init__(self, name, weight, owner):
        self.name = name
        self.weight = weight
        self.owner = owner
- The boilerplate above is repetitive (each field name appears three times), and it doesn’t even buy us a nice repr.
- Let’s create a record_factory that simplifies this:
Dog = record_factory('Dog','name weight owner')
rex = Dog('Rex', 30, 'Bob')
rex
# Dog(name='Rex', weight=30, owner='Bob')
name, weight, _ = rex
rex.weight = 32
rex
# Dog(name='Rex', weight=32, owner='Bob')
Dog.__mro__
# (<class 'factories.Dog'>, <class 'object'>)
The code for record_factory:
from typing import Union, Any
from collections.abc import Iterable, Iterator

FieldNames = Union[str, Iterable[str]]  # a single string or an iterable of strings

def record_factory(cls_name: str, field_names: FieldNames) -> type[tuple]:  # same first two parameters as collections.namedtuple; returns a class
    slots = parse_identifiers(field_names)  # build a tuple of attribute names

    def __init__(self, *args, **kwargs) -> None:  # assign positional and keyword arguments to the slots
        attrs = dict(zip(self.__slots__, args))
        attrs.update(kwargs)
        for name, value in attrs.items():
            setattr(self, name, value)

    def __iter__(self) -> Iterator[Any]:  # yield the field values in __slots__ order
        for name in self.__slots__:
            yield getattr(self, name)

    def __repr__(self):  # nice representation, built from __slots__ and self
        values = ', '.join(f'{name}={value!r}'
                           for name, value in zip(self.__slots__, self))
        cls_name = self.__class__.__name__
        return f'{cls_name}({values})'

    cls_attrs = dict(  # assemble the dictionary of class attributes
        __slots__=slots,
        __init__=__init__,
        __iter__=__iter__,
        __repr__=__repr__,
    )

    return type(cls_name, (object,), cls_attrs)  # build and return the class by calling the type constructor

def parse_identifiers(names: FieldNames) -> tuple[str, ...]:
    if isinstance(names, str):
        names = names.replace(',', ' ').split()  # split names separated by spaces or commas into a list of str
    if not all(s.isidentifier() for s in names):
        raise ValueError('names must all be valid identifiers')
    return tuple(names)
Introducing __init_subclass__
- Both __init_subclass__ and __set_name__ (covered in the previous chapter) were proposed in PEP 487 (Simpler customization of class creation).
- Both typing.NamedTuple and @dataclass let programmers use the class statement to specify attributes for a new class, which the class builder then enhances by automatically adding methods like __init__, __repr__, __eq__, etc.
- Both of these class builders read type hints in the user’s class statement to enhance the class. Those type hints also allow static type checkers to validate code that sets or gets those attributes. However, NamedTuple and @dataclass do not take advantage of the type hints for attribute validation at runtime. We will implement a Checked class that does:
from collections.abc import Callable  # for the Callable type hint
from typing import Any, NoReturn, get_type_hints

class Field:  # descriptor that converts and validates values
    def __init__(self, name: str, constructor: Callable) -> None:  # minimal Callable type hint: any callable, with return type Any
        if not callable(constructor) or constructor is type(None):  # runtime check; NoneType is callable but useless as a constructor
            raise TypeError(f'{name!r} type hint must be callable')
        self.name = name
        self.constructor = constructor

    def __set__(self, instance: Any, value: Any) -> None:
        if value is ...:  # Checked.__init__ passes Ellipsis for missing arguments, so call the constructor with no arguments
            value = self.constructor()
        else:
            try:
                value = self.constructor(value)  # otherwise call the constructor with the given value
            except (TypeError, ValueError) as e:
                type_name = self.constructor.__name__
                msg = f'{value!r} is not compatible with {self.name}:{type_name}'
                raise TypeError(msg) from e
        instance.__dict__[self.name] = value  # store the value in the instance dictionary

class Checked:
    @classmethod
    def _fields(cls) -> dict[str, type]:
        return get_type_hints(cls)

    def __init_subclass__(subclass) -> None:  # called when a subclass of the current class is defined
        super().__init_subclass__()  # not strictly necessary, but cooperates with other classes that implement __init_subclass__
        for name, constructor in subclass._fields().items():  # for each field name and constructor
            setattr(subclass, name, Field(name, constructor))  # bind a Field descriptor, parameterized with name and constructor, to that name on the subclass

    def __init__(self, **kwargs: Any) -> None:
        for name in self._fields():
            value = kwargs.pop(name, ...)  # take the value out of kwargs; ... distinguishes arguments given as None from arguments that were not given
            setattr(self, name, value)  # calls Checked.__setattr__
        if kwargs:  # any items left in kwargs are unknown attributes, so __init__ fails
            self.__flag_unknown_attrs(*kwargs)

    def __setattr__(self, name: str, value: Any) -> None:  # intercepts all attempts to set an instance attribute
        if name in self._fields():  # if the attribute name is known, fetch the corresponding descriptor
            cls = self.__class__
            descriptor = getattr(cls, name)
            descriptor.__set__(self, value)  # explicit call is needed because this __setattr__ intercepts every assignment, even for Field descriptors
        else:  # unknown attribute
            self.__flag_unknown_attrs(name)

    def __flag_unknown_attrs(self, *names: str) -> NoReturn:  # rare use of NoReturn: this method always raises AttributeError
        plural = 's' if len(names) > 1 else ''
        extra = ', '.join(f'{name!r}' for name in names)
        cls_name = repr(self.__class__.__name__)
        raise AttributeError(f'{cls_name} object has no attribute{plural} {extra}')

    def _asdict(self) -> dict[str, Any]:  # create a dict from the Field attributes of the instance
        return {
            name: getattr(self, name)
            for name, attr in self.__class__.__dict__.items()
            if isinstance(attr, Field)
        }

    def __repr__(self) -> str:  # implement a nice __repr__
        kwargs = ', '.join(
            f'{key}={value!r}' for key, value in self._asdict().items()
        )
        return f'{self.__class__.__name__}({kwargs})'

class Movie(Checked):
    title: str
    year: int
    box_office: float

movie = Movie(title='The Godfather', year=1972, box_office=137)
movie.title  # 'The Godfather'
movie  # Movie(title='The Godfather', year=1972, box_office=137.0)
Movie(title="Life of Brian")
# Movie(title='Life of Brian', year=0, box_office=0.0)
# NOTE: defaults come from calling each type with no arguments:
# int(), float(), bool(), str(), list(), dict(), set()
# (0, 0.0, False, '', [], {}, set())
blockbuster = Movie(title='Avatar', year=2009, box_office='billions')
# TypeError: 'billions' is not compatible with box_office:float
Why __init_subclass__ Cannot Configure __slots__
The __slots__ attribute is only effective if it is one of the entries in the class namespace passed to type.__new__. Adding __slots__ to an existing class has no effect. Python invokes __init_subclass__ only after the class is built; by then it’s too late to configure __slots__. A class decorator can’t configure __slots__ either, because it is applied even later than __init_subclass__.
To configure __slots__ at runtime, your own code must build the class namespace passed as the last argument of type.__new__. To do that, you can write a class factory function, like record_factory.py, or you can take the nuclear option and implement a metaclass. We will see how to dynamically configure __slots__ in “Metaclasses 101”.
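The difference can be seen in a small experiment. This is only a sketch: the class names Late and Early are invented, and the point is just to contrast setting __slots__ after the class exists with passing it in the namespace given to type.

```python
class Late:
    pass


# Setting __slots__ on an existing class only creates a plain class attribute.
Late.__slots__ = ('x',)
late = Late()
late.y = 1  # still allowed: instances keep their __dict__

# Passing __slots__ in the namespace makes it effective.
Early = type('Early', (), {'__slots__': ('x',)})
early = Early()
early.x = 1

rejected = False
try:
    early.y = 2  # no slot named 'y' and no instance __dict__
except AttributeError:
    rejected = True

print(hasattr(late, '__dict__'), hasattr(early, '__dict__'), rejected)
```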
Enhancing Classes with a Class Decorator
A class decorator is a callable that behaves similarly to a function decorator: it receives the decorated class as an argument and should return a class to replace it, usually the same class after modification.
Probably the most common reason to choose a class decorator over the simpler __init_subclass__ is to avoid interfering with other class features, such as inheritance and metaclasses.
def checked(cls: type) -> type:
    for name, constructor in _fields(cls).items():  # _fields is a top-level function
        setattr(cls, name, Field(name, constructor))  # replace each attribute with a Field descriptor
    cls._fields = classmethod(_fields)  # type: ignore  # build a class method from _fields and add it to the decorated class
    instance_methods = (  # module-level functions that will become instance methods of the decorated class
        __init__,
        __repr__,
        __setattr__,
        _asdict,
        __flag_unknown_attrs,
    )
    for method in instance_methods:  # add each instance method to the class
        setattr(cls, method.__name__, method)
    return cls

- Methods to be injected into the decorated class:

def _fields(cls: type) -> dict[str, type]:
    return get_type_hints(cls)

def __init__(self: Any, **kwargs: Any) -> None:
    for name in self._fields():
        value = kwargs.pop(name, ...)
        setattr(self, name, value)
    if kwargs:
        self.__flag_unknown_attrs(*kwargs)

def __setattr__(self: Any, name: str, value: Any) -> None:
    if name in self._fields():
        cls = self.__class__
        descriptor = getattr(cls, name)
        descriptor.__set__(self, value)
    else:
        self.__flag_unknown_attrs(name)

def __flag_unknown_attrs(self: Any, *names: str) -> NoReturn:
    plural = 's' if len(names) > 1 else ''
    extra = ', '.join(f'{name!r}' for name in names)
    cls_name = repr(self.__class__.__name__)
    raise AttributeError(f'{cls_name} has no attribute{plural} {extra}')

def _asdict(self: Any) -> dict[str, Any]:
    return {
        name: getattr(self, name)
        for name, attr in self.__class__.__dict__.items()
        if isinstance(attr, Field)
    }

def __repr__(self: Any) -> str:
    kwargs = ', '.join(
        f'{key}={value!r}' for key, value in self._asdict().items()
    )
    return f'{self.__class__.__name__}({kwargs})'
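The checked decorator depends on Field and several helpers, so the core mechanism may be easier to see in isolation. The sketch below is not part of the book's example: add_repr and Point are hypothetical names, and the decorator only injects a __repr__ before returning the same class.

```python
def add_repr(cls: type) -> type:
    """Hypothetical class decorator: inject a __repr__ built from instance attributes."""

    def __repr__(self) -> str:
        args = ', '.join(f'{key}={value!r}' for key, value in vars(self).items())
        return f'{type(self).__name__}({args})'

    cls.__repr__ = __repr__  # methods can be added after the class is built
    return cls               # return the same class, modified in place


@add_repr
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


print(Point(1, 2))  # Point(x=1, y=2)
```

Because the decorator touches only the class it is applied to, Point needs no special base class or metaclass.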
What Happens When: Import Time Versus Runtime
- Python programmers talk about “import time” versus “runtime,” but the terms are not strictly defined and there is a gray area between them.
- At import time, the interpreter:
  - Parses the source code of a .py module in one pass from top to bottom. This is when a SyntaxError may occur.
  - Compiles the bytecode to be executed.
  - Executes the top-level code of the compiled module.
- If there is an up-to-date .pyc file available in the local __pycache__, parsing and compiling are skipped because the bytecode is ready to run.
- Parsing and compiling are clearly “import time” activities, but executable statements can also run at that time and change the state of the program.
- In particular, the import statement is not merely a declaration: it actually runs all the top-level code of a module when it is imported for the first time in the process. Further imports of the same module will use a cache, and then the only effect will be binding the imported objects to names in the client module.
- The import statement can therefore trigger all sorts of “runtime” behavior. Conversely, “import time” can also happen deep inside runtime, because the import statement and the __import__() built-in can be used inside any regular function.
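The caching behavior described above can be observed with a throwaway module. This is a sketch under assumptions: the module name importdemo_mod and its one-line body are invented, and the module is written to a temporary directory so the experiment cleans up after itself.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path

MODULE_NAME = 'importdemo_mod'  # hypothetical module name for this sketch


def load_demo():
    # "Import time" happening inside a regular function call.
    return importlib.import_module(MODULE_NAME)


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, f'{MODULE_NAME}.py').write_text(
        "print('@ top-level code running')\n"
    )
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    captured = io.StringIO()
    try:
        with contextlib.redirect_stdout(captured):
            first = load_demo()   # runs the module's top-level code
            second = load_demo()  # served from the sys.modules cache
    finally:
        sys.path.remove(tmp)
        sys.modules.pop(MODULE_NAME, None)

runs = captured.getvalue().count('@ top-level code running')
print(runs, first is second)  # 1 True
```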
Evaluation Time Experiments
Consider an evaldemo.py script that uses a class decorator, a descriptor, and a class builder based on __init_subclass__, all defined in a builderlib.py module.
print('@ builderlib module start')

class Builder:
    print('@ Builder body')

    def __init_subclass__(cls):
        print(f'@ Builder.__init_subclass__({cls!r})')

        def inner_0(self):
            print(f'@ SuperA.__init_subclass__:inner_0({self!r})')

        cls.method_a = inner_0

    def __init__(self):
        super().__init__()
        print(f'@ Builder.__init__({self!r})')

def deco(cls):  # class decorator
    print(f'@ deco({cls!r})')

    def inner_1(self):  # method injected into the decorated class
        print(f'@ deco:inner_1({self!r})')

    cls.method_b = inner_1
    return cls

class Descriptor:
    print('@ Descriptor body')

    def __init__(self):
        print(f'@ Descriptor.__init__({self!r})')

    def __set_name__(self, owner, name):
        args = (self, owner, name)
        print(f'@ Descriptor.__set_name__{args!r}')

    def __set__(self, instance, value):
        args = (self, instance, value)
        print(f'@ Descriptor.__set__{args!r}')

    def __repr__(self):
        return '<Descriptor instance>'

print('@ builderlib module end')
Now, in the Python console, run import builderlib and watch which lines are printed.
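A condensed version of the experiment, as a sketch: instead of printing, each hook appends to a list named events (an invented name), which makes the order easy to inspect. Klass is a hypothetical client class, standing in for what evaldemo.py would define.

```python
events = []  # records the order in which the hooks run


class Builder:
    def __init_subclass__(cls):
        events.append('Builder.__init_subclass__')


class Descriptor:
    def __init__(self):
        events.append('Descriptor.__init__')

    def __set_name__(self, owner, name):
        events.append(f'Descriptor.__set_name__({name!r})')


def deco(cls):  # class decorator
    events.append('deco')
    return cls


@deco
class Klass(Builder):
    events.append('Klass body')  # the class body runs first, top to bottom
    attr = Descriptor()


for event in events:
    print(event)
```

No instance of Klass is created here: every recorded event happens while the class statement is being evaluated.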
Metaclasses 101
[Metaclasses] are deeper magic than 99% of users should ever worry about. If you wonder whether you need them, you don’t (the people who actually need them know with certainty that they need them, and don’t need an explanation about why).
Tim Peters, inventor of the Timsort algorithm and prolific Python contributor
A metaclass is a class factory! A metaclass is a class whose instances are classes.
- We know classes are objects; therefore, each class must be an instance of some other class.
- By default, Python classes are instances of type. In other words, type is the metaclass for most built-in and user-defined classes.
- To avoid infinite regress, the class of type is type.
- NOTE: str and LineItem are not subclasses of type. Both str and LineItem are instances of type. They are all subclasses of object.
The classes object and type have a unique relationship: object is an instance of type, and type is a subclass of object. This relationship is “magic”: it cannot be expressed in Python because either class would have to exist before the other could be defined. The fact that type is an instance of itself is also magical.
The next snippet shows that the class of collections.abc.Iterable is abc.ABCMeta. Note that Iterable is an abstract class, but ABCMeta is a concrete class; after all, Iterable is an instance of ABCMeta:
>>> from collections.abc import Iterable
>>> Iterable.__class__
<class 'abc.ABCMeta'>
>>> import abc
>>> from abc import ABCMeta
>>> ABCMeta.__class__
<class 'type'>
- Ultimately, the class of ABCMeta is also type. Every class is an instance of type, directly or indirectly, but only metaclasses are also subclasses of type.
How a Metaclass Customizes a Class
- To use a metaclass, it’s crucial to understand how __new__ works on any class.
- To process a class statement for a class Klass with base SuperKlass and metaclass MetaKlass, Python calls MetaKlass.__new__ with these arguments:
  - meta_cls: the metaclass itself (MetaKlass).
  - cls_name: the string 'Klass'.
  - bases: the single-element tuple (SuperKlass,), with more elements in the case of multiple inheritance.
  - cls_dict: the mapping that represents the namespace of the class under construction.
- When you implement MetaKlass.__new__, you can inspect and change those arguments before passing them to super().__new__, which will eventually call type.__new__ to create the new class object.
- While building the class, type.__new__ calls __set_name__ on any descriptors in the namespace and then calls SuperKlass.__init_subclass__, passing the new class, so both hooks have already run by the time super().__new__ returns.
- After super().__new__ returns, you can also apply further processing to the newly created class before returning it to Python. Python then applies a class decorator to it, if one is present. Finally, Python binds the class object to its name in the surrounding namespace, usually the global namespace of a module, if the class statement was a top-level statement.
- The most common processing made in a metaclass __new__ is to add or replace items in cls_dict. For instance, before calling super().__new__, you can inject methods in the class under construction by adding functions to cls_dict. However, note that adding methods can also be done after the class is built, which is why we were able to do it using __init_subclass__ or a class decorator.
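The sequence can be traced with a small metaclass. This is a sketch, not the book's experiment: the events list and the injected describe method are made up, while MetaKlass, SuperKlass, and Klass follow the names used above.

```python
events = []  # records what runs, in order


class MetaKlass(type):
    def __new__(meta_cls, cls_name, bases, cls_dict):
        events.append(f'MetaKlass.__new__ start: {cls_name}')

        def describe(self):  # hypothetical method, injected via cls_dict
            return f'instance of {cls_name}'

        cls_dict['describe'] = describe  # change the namespace before the class exists
        cls = super().__new__(meta_cls, cls_name, bases, cls_dict)
        events.append(f'MetaKlass.__new__ end: {cls_name}')
        return cls


class SuperKlass(metaclass=MetaKlass):
    def __init_subclass__(cls):
        events.append(f'__init_subclass__: {cls.__name__}')


def deco(cls):
    events.append(f'deco: {cls.__name__}')
    return cls


@deco
class Klass(SuperKlass):
    pass


for event in events:
    print(event)
print(Klass().describe())  # instance of Klass
```

The trace shows __init_subclass__ running between the start and end markers, that is, inside the call to super().__new__, with the decorator applied last.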
A Nice Metaclass Example
This example comes from the book Python in a Nutshell (3rd edition), Chapter 4.
class MetaBunch(type):  # to create a metaclass, inherit from type
    def __new__(meta_cls, cls_name, bases, cls_dict):  # __new__ works as a class method; here the class is a metaclass
        defaults = {}  # maps attribute names to their default values

        def __init__(self, **kwargs):  # injected into the new class
            for name, default in defaults.items():  # set each attribute from kwargs, falling back to its default
                setattr(self, name, kwargs.pop(name, default))
            if kwargs:  # leftover items have no slot to go in; fail fast instead of silently ignoring them
                extra = ', '.join(kwargs)
                raise AttributeError(f'No slots left for: {extra!r}')

        def __repr__(self):  # show only the attributes that differ from their defaults
            rep = ', '.join(f'{name}={value!r}'
                            for name, default in defaults.items()
                            if (value := getattr(self, name)) != default)
            return f'{cls_name}({rep})'

        new_dict = dict(__slots__=[], __init__=__init__, __repr__=__repr__)  # namespace for the new class

        for name, value in cls_dict.items():  # iterate over the namespace of the user's class
            if name.startswith('__') and name.endswith('__'):  # dunder name: copy it to the new namespace unless it is already there, which prevents users from overwriting __init__, __repr__, etc.
                if name in new_dict:
                    raise AttributeError(f"Can't set {name!r} in {cls_name!r}")
                new_dict[name] = value
            else:  # not a dunder: append the name to __slots__ and save its value as the default
                new_dict['__slots__'].append(name)
                defaults[name] = value
        return super().__new__(meta_cls, cls_name, bases, new_dict)

class Bunch(metaclass=MetaBunch):  # provide a base class, so users don’t need to see MetaBunch
    pass
Metaclass Evaluation Time Experiment
A Metaclass Solution for Checked
read in book
Metaclasses in the Real World
Modern Features Simplify or Replace Metaclasses
Over time, several common use cases of metaclasses were made redundant by new language features:
- Class decorators: simpler to understand than metaclasses, and less likely to cause conflicts with base classes and metaclasses.
- __set_name__: avoids the need for custom metaclass logic to automatically set the name of a descriptor.
- __init_subclass__: provides a way to customize class creation that is transparent to the end user and even simpler than a decorator, but may introduce conflicts in a complex class hierarchy.
- Built-in dict preserving key insertion order: eliminated the #1 reason to use __prepare__, which was to provide an OrderedDict to store the namespace of the class under construction. Python only calls __prepare__ on metaclasses, so if you needed to process the class namespace in the order it appears in the source code, you had to use a metaclass before Python 3.6.
I keep advocating these features because I see too much unnecessary complexity in our profession, and metaclasses are a gateway to complexity.
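As one illustration of the list above, __set_name__ lets a descriptor learn the name it is bound to without any metaclass. The sketch below is an assumption-laden example: the Positive descriptor and the weight and price fields are invented, and LineItem is only borrowed as a class name.

```python
class Positive:
    """Hypothetical descriptor that learns its attribute name via __set_name__."""

    def __set_name__(self, owner, name):
        # Called while the owner class is being built, once per descriptor.
        self.storage_name = name

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f'{self.storage_name} must be > 0')
        instance.__dict__[self.storage_name] = value  # no __get__ needed for reads


class LineItem:
    weight = Positive()
    price = Positive()

    def __init__(self, weight, price):
        self.weight = weight
        self.price = price


item = LineItem(2, 3.5)
print(item.weight, item.price)  # 2 3.5

try:
    LineItem(0, 1.0)
except ValueError as exc:
    error_message = str(exc)
    print(error_message)  # weight must be > 0
```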
Metaclasses Are Stable Language Features
Metaclasses were introduced in Python 2.2 in 2002, together with so-called “new-style classes,” descriptors, and properties.
It is remarkable that the MetaBunch example, first posted by Alex Martelli in July 2002, still works in Python 3.9. The only change is the way to specify the metaclass to use, which in Python 3 is done with the syntax class Bunch(metaclass=MetaBunch):.
A Class Can Only Have One Metaclass
If your class declaration involves two or more metaclasses, you will see this puzzling error message:
TypeError: metaclass conflict: the metaclass of a derived class
must be a (non-strict) subclass of the metaclasses of all its bases
This may happen even without multiple inheritance. For example, a declaration of a Record class that inherits from abc.ABC and also names PersistentMeta as its metaclass could trigger that TypeError.
We saw that abc.ABC is an instance of the abc.ABCMeta metaclass. If that PersistentMeta metaclass is not itself a subclass of abc.ABCMeta, you get a metaclass conflict.
There are two ways of dealing with that error:
- Find some other way of doing what you need to do, while avoiding at least one of the metaclasses involved.
- Write your own PersistentABCMeta metaclass as a subclass of both abc.ABCMeta and PersistentMeta, using multiple inheritance, and use that as the only metaclass for Record.
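Both the conflict and the second workaround can be reproduced in a few lines. In this sketch PersistentMeta is an empty stand-in, since the notes do not show its implementation; only the class names come from the text.

```python
import abc


class PersistentMeta(type):
    """Stand-in metaclass; a real one would add persistence features."""


conflict_message = ''
try:
    # abc.ABC brings in abc.ABCMeta, which is unrelated to PersistentMeta.
    class Record(abc.ABC, metaclass=PersistentMeta):
        pass
except TypeError as exc:
    conflict_message = str(exc)

print(conflict_message)


class PersistentABCMeta(abc.ABCMeta, PersistentMeta):
    """Combined metaclass: a subclass of the metaclasses of all the bases."""


class Record(abc.ABC, metaclass=PersistentABCMeta):
    pass


print(type(Record).__name__)  # PersistentABCMeta
```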
Metaclasses Should Be Implementation Details
Besides type, there are only six metaclasses in the entire Python 3.9 standard library. The better-known metaclasses are probably abc.ABCMeta, typing.NamedTupleMeta, and enum.EnumMeta. None of them are intended to appear explicitly in user code. We may consider them implementation details.
In recent years, some metaclasses in the Python standard library were replaced by other mechanisms, without breaking the public API of their packages. The simplest way to future-proof such APIs is to offer a regular class that users subclass to access the functionality provided by the metaclass, as we’ve done in our examples.
Wrapping Up
Metaclasses, as well as class decorators and __init_subclass__, are useful for:
- Subclass registration
- Subclass structural validation
- Applying decorators to many methods at once
- Object serialization
- Object-relational mapping
- Object-based persistence
- Implementing special methods at the class level
- Implementing class features found in other languages, such as traits and aspect-oriented programming
Class metaprogramming can also help with performance issues in some cases, by performing tasks at import time that otherwise would execute repeatedly at runtime.
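The first two items in the list, subclass registration and structural validation, can be sketched with __init_subclass__ alone. All names here (Plugin, registry, run, Csv, Json, Broken) are hypothetical and chosen only for illustration.

```python
class Plugin:
    """Hypothetical base class that validates and registers its subclasses."""

    registry: dict[str, type] = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if not callable(getattr(cls, 'run', None)):  # structural validation
            raise TypeError(f'{cls.__name__} must define a run() method')
        Plugin.registry[cls.__name__.lower()] = cls  # registration, done once at class creation


class Csv(Plugin):
    def run(self):
        return 'csv'


class Json(Plugin):
    def run(self):
        return 'json'


validation_error = ''
try:
    class Broken(Plugin):  # no run() method, so class creation fails
        pass
except TypeError as exc:
    validation_error = str(exc)

print(sorted(Plugin.registry))  # ['csv', 'json']
print(validation_error)         # Broken must define a run() method
```

The check and the registration run once per subclass, when its class statement is evaluated, rather than on every instantiation.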
To wrap up, let’s recall Alex Martelli’s final advice from his essay “Waterfowl and ABCs”:
And, don’t define custom ABCs (or metaclasses) in production code… if you feel the urge to do so, I’d bet it’s likely to be a case of “all problems look like a nail”-syndrome for somebody who just got a shiny new hammer—you (and future maintainers of your code) will be much happier sticking with straightforward and simple code, eschewing such depths.
Those powerful tools exist primarily to support library and framework development. Applications naturally should use those tools, as provided by the Python standard library or external packages. But implementing them in application code is often premature abstraction.
Good frameworks are extracted, not invented. - David Heinemeier Hansson, creator of Ruby on Rails