Class Metaprogramming

Everyone knows that debugging is twice as hard as writing a program in the first place. So if you’re as clever as you can be when you write it, how will you ever debug it? - Brian W. Kernighan and PJ Plauger, The Elements of Programming Style

  • Class metaprogramming is the art of creating or customizing classes at runtime.
  • Classes are first-class objects in Python, so a function can create a new class at any time, without the class keyword.
  • Class decorators are also functions, but they are designed to inspect, change, and even replace the decorated class with another class.
  • Metaclasses are the most advanced tool for class metaprogramming: they let you create whole new categories of classes with special traits, such as abstract base classes.
  • Metaclasses are powerful, but hard to justify and even harder to get right. Class decorators solve many of the same problems and are easier to understand.

Classes as Objects

  • Classes are objects in Python. Every class has a number of attributes defined in the Python Data Model. Three of those attributes have appeared several times in this book already: __class__, __name__, and __mro__. Other class attributes are:
  • cls.__bases__ : The tuple of base classes of the class.
  • cls.__qualname__ : The qualified name of a class or function, which is a dotted path from the global scope of the module to the class definition.
  • cls.__subclasses__() : This method returns a list of the immediate subclasses of the class. The implementation uses weak references to avoid circular references between the superclass and its subclasses, which hold strong references to their superclasses in their __bases__ attribute.

    NOTE: The method lists subclasses currently in memory. Subclasses in modules not yet imported will not appear in the result.

  • cls.mro() : The interpreter calls this method when building a class to obtain the tuple of superclasses stored in the __mro__ attribute of the class. A metaclass can override this method to customize the method resolution order of the class under construction.

None of the attributes mentioned in this section are listed by the dir(…) function.
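To make the attributes above concrete, here is a minimal sketch. Animal, Sound, Dog, and Cat are hypothetical classes invented for this illustration, and the comments assume the code runs at the top level of a module.

```python
class Animal:
    class Sound:  # nested class, to show a dotted __qualname__
        pass


class Dog(Animal):
    pass


class Cat(Animal):
    pass


bases = Dog.__bases__                    # (Animal,)
nested_name = Animal.Sound.__qualname__  # dotted path ending in 'Animal.Sound'
subclasses = Animal.__subclasses__()     # immediate subclasses, in definition order
mro = Dog.mro()                          # [Dog, Animal, object]

# These four names live on the metaclass (type), so dir() does not list them.
names = ('__bases__', '__qualname__', '__subclasses__', 'mro')
not_listed = [name for name in names if name not in dir(Dog)]

print(bases, nested_name, subclasses, mro, not_listed, sep='\n')
```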

type: The Built-in Class Factory

  • We usually think of type as a function that returns the class of an object, because that’s what type(my_object) does: it returns my_object.__class__.
  • However, type is a class that creates a new class when invoked with three arguments. Consider this simple class:
class MyClass(MySuperClass, MyMixin):
  x = 42
  def x2(self):
    return self.x * 2
  • Using the type constructor, we can create MyClass at runtime with this code:
MyClass = type('MyClass',
               (MySuperClass, MyMixin),
               {'x': 42, 'x2': lambda self: self.x * 2},
               )
  • When Python reads a class statement, it calls type to build the class object with these parameters: name (the string 'MyClass'), bases (the tuple of superclasses), and dict (a mapping of attribute names to values; callables become methods).
  • The type class is a metaclass: a class that builds classes. Instances of the type class are classes.
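The snippets above leave MySuperClass and MyMixin undefined. Here is a runnable sketch of the same three-argument call, with two hypothetical stand-in base classes added only so the example executes:

```python
class MySuperClass:  # hypothetical base class, for illustration only
    def greet(self):
        return 'hello'


class MyMixin:  # hypothetical mixin, for illustration only
    def shout(self):
        return self.greet().upper()


# name, bases, dict: the same three arguments a class statement provides
MyClass = type(
    'MyClass',
    (MySuperClass, MyMixin),
    {'x': 42, 'x2': lambda self: self.x * 2},
)

obj = MyClass()
print(obj.x2())               # 84: the lambda became a method
print(obj.shout())            # HELLO: inherited through the bases tuple
print(type(MyClass) is type)  # True: the new class is an instance of type
```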

A Class Factory Function

  • The standard library has a class factory function that appears several times in the book: collections.namedtuple. We also saw typing.NamedTuple and @dataclass.
  • Here is a super simple factory for classes of mutable objects, the simplest possible replacement for @dataclass. Suppose we start with this class:
class Dog:
  def __init__(self, name, weight, owner):
    self.name = name
    self.weight = weight
    self.owner = owner
  • The boilerplate above is repetitive (each field name appears three times), and it doesn’t even buy us a nice repr:
rex = Dog('Rex', 30, 'Bob')
rex
# <__main__.Dog object at 0x2865..>
  • Let’s create a record_factory that removes the boilerplate:
Dog = record_factory('Dog','name weight owner')
rex = Dog('Rex', 30, 'Bob')
rex
# Dog(name='Rex', weight=30, owner='Bob')
name, weight, _ = rex
rex.weight = 32
rex
# Dog(name='Rex', weight=32, owner='Bob')
Dog.__mro__
# (<class 'factories.Dog'>, <class 'object'>)

The code for record_factory:

from typing import Union, Any
from collections.abc import Iterable, Iterator

FieldNames = Union[str, Iterable[str]] # single string or list of string

def record_factory(cls_name: str, field_names: FieldNames) -> type[tuple]:  # same first two parameters as collections.namedtuple; returns a type, i.e. a class

    slots = parse_identifiers(field_names)  # build a tuple of attribute names

    def __init__(self, *args, **kwargs) -> None:  # set attributes from positional and keyword arguments
        attrs = dict(zip(self.__slots__, args))
        attrs.update(kwargs)
        for name, value in attrs.items():
            setattr(self, name, value)

    def __iter__(self) -> Iterator[Any]:  # yield the field values in __slots__ order
        for name in self.__slots__:
            yield getattr(self, name)

    def __repr__(self): # representation
        values = ', '.join(f'{name}={value!r}'
            for name, value in zip(self.__slots__, self))
        cls_name = self.__class__.__name__
        return f'{cls_name}({values})'

    cls_attrs = dict(  # assemble a dictionary of class attributes
        __slots__=slots,
        __init__=__init__,
        __iter__=__iter__,
        __repr__=__repr__,
    )

    return type(cls_name, (object,), cls_attrs)  # build the class with the type constructor and return it


def parse_identifiers(names: FieldNames) -> tuple[str, ...]:
    if isinstance(names, str):
        names = names.replace(',', ' ').split() # convert names separated by spaces or commas to list of str
    if not all(s.isidentifier() for s in names):
        raise ValueError('names must all be valid identifiers')
    return tuple(names)

Introducing __init_subclass__

  • Both __init_subclass__ and __set_name__ (covered in the previous chapter) were introduced in PEP 487, “Simpler customisation of class creation”.
  • Both typing.NamedTuple and @dataclass let programmers use the class statement to specify attributes for a new class, which is then enhanced by the class builder with the automatic addition of methods such as __init__, __repr__, and __eq__.
  • Both of these class builders read type hints in the user’s class statement to enhance the class. Those type hints also allow static type checkers to validate code that sets or gets those attributes. However, NamedTuple and @dataclass do not take advantage of the type hints to validate attributes at runtime. We will implement a Checked class that does:
from collections.abc import Callable  # Type hints on callable
from typing import Any, NoReturn, get_type_hints


class Field:    # descriptor
    def __init__(self, name: str, constructor: Callable) -> None:  # minimal Callable hint: parameter and return types are implicitly Any
        if not callable(constructor) or constructor is type(None):  # reject hints that are not callable, and NoneType
            raise TypeError(f'{name!r} type hint must be callable')
        self.name = name
        self.constructor = constructor

    def __set__(self, instance: Any, value: Any) -> None:
        if value is ...:  # if Checked.__init__ set the value to ... (the Ellipsis object), call the constructor with no arguments
            value = self.constructor()
        else:
            try:
                value = self.constructor(value) # otherwise call with the value
            except (TypeError, ValueError) as e:
                type_name = self.constructor.__name__
                msg = f'{value!r} is not compatible with {self.name}:{type_name}'
                raise TypeError(msg) from e
        instance.__dict__[self.name] = value # store values in instance dictionary

class Checked:
    @classmethod
    def _fields(cls) -> dict[str, type]:
        return get_type_hints(cls)

    def __init_subclass__(subclass) -> None: # is called when subclass of the current class is defined
        super().__init_subclass__() # not strictly necessary but should handle classes that invoke .__init_subclass__()
        for name, constructor in subclass._fields().items(): # for each field name and constructor
            setattr(subclass, name, Field(name, constructor)) # create an attribute on subclass with that name bound to a Field descriptor parameterized with name and constructor

    def __init__(self, **kwargs: Any) -> None:
        for name in self._fields():
            value = kwargs.pop(name, ...)  # get each field's value from kwargs, removing it; the ... default distinguishes arguments given as None from arguments that were not given
            setattr(self, name, value)  # calls Checked.__setattr__
        if kwargs:  # any items left in kwargs are unknown attributes, so __init__ fails
            self.__flag_unknown_attrs(*kwargs)

    def __setattr__(self, name: str, value: Any) -> None:  # intercepts all attempts to set an instance attribute
        if name in self._fields():  # if the attribute name is known, fetch the corresponding descriptor
            cls = self.__class__
            descriptor = getattr(cls, name)
            descriptor.__set__(self, value)  # explicit call: __setattr__ intercepts every assignment, so the descriptor would not be triggered otherwise
        else:  # unknown attribute
            self.__flag_unknown_attrs(name)

    def __flag_unknown_attrs(self, *names: str) -> NoReturn:  # NoReturn: this helper always raises AttributeError
        plural = 's' if len(names) > 1 else ''
        extra = ', '.join(f'{name!r}' for name in names)
        cls_name = repr(self.__class__.__name__)
        raise AttributeError(f'{cls_name} object has no attribute{plural} {extra}')

    def _asdict(self) -> dict[str, Any]: # create dict from the attributes of a Movie object
        return {
            name: getattr(self, name)
            for name, attr in self.__class__.__dict__.items()
            if isinstance(attr, Field)
        }

    def __repr__(self) -> str: # implement nice __repr__
        kwargs = ', '.join(
            f'{key}={value!r}' for key, value in self._asdict().items()
        )
        return f'{self.__class__.__name__}({kwargs})'


class Movie(Checked):
    title: str
    year: int
    box_office: float


movie = Movie(title='The Godfather', year=1972, box_office=137)
movie.title # The Godfather
movie # Movie(title='The Godfather', year=1972, box_office=137.0)

Movie(title="Life of Brian")
# Movie(title='Life of Brian', year=0, box_office=0.0)

# NOTE : How defaults are picked up based on type
# int(), float(), bool(), str(), list(), dict(), set()
# (0, 0.0, False, '', [], {}, set())

blockbuster = Movie(title='Avatar', year=2009, box_office='billions')
# TypeError: 'billions' is not compatible with box_office:float

Why __init_subclass__ Cannot Configure __slots__

The __slots__ attribute is only effective if it is one of the entries in the class namespace passed to type.__new__. Adding __slots__ to an existing class has no effect. Python invokes __init_subclass__ only after the class is built—by then it’s too late to configure __slots__. A class decorator can’t configure __slots__ either, because it is applied even later than __init_subclass__.

To configure __slots__ at runtime, your own code must build the class namespace passed as the last argument of type.__new__. To do that, you can write a class factory function, like record_factory.py, or you can take the nuclear option and implement a metaclass. We will see how to dynamically configure __slots__ in “Metaclasses 101”.
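A quick sketch of the difference, using two hypothetical classes named Late and Early: assigning __slots__ after the class exists changes nothing, while passing it in the namespace given to type takes effect.

```python
class Late:
    pass


Late.__slots__ = ('x',)  # too late: the class object was already built
late = Late()
late.y = 1               # still allowed, because instances keep their __dict__
late_has_dict = hasattr(late, '__dict__')

# __slots__ in the namespace passed to type is honored.
Early = type('Early', (), {'__slots__': ('x',)})
early = Early()
early.x = 1
try:
    early.y = 2          # no slot named 'y' and no instance __dict__
except AttributeError:
    blocked = True
else:
    blocked = False

print(late_has_dict, hasattr(early, '__dict__'), blocked)
```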

Enhancing Classes with a Class Decorator

A class decorator is a callable that behaves similarly to a function decorator.

Probably the most common reason to choose a class decorator over the simpler __init_subclass__ is to avoid interfering with other class features, such as inheritance and metaclasses.
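Before the full example, a much smaller sketch may help show the mechanics. The add_repr decorator and the Point class are hypothetical names used only here: the decorator receives the finished class, injects a method, and returns the same class.

```python
def add_repr(cls):
    """Hypothetical class decorator: inject a __repr__ built from instance attributes."""

    def __repr__(self):
        args = ', '.join(f'{key}={value!r}' for key, value in vars(self).items())
        return f'{type(self).__name__}({args})'

    cls.__repr__ = __repr__  # patch the class in place
    return cls               # a decorator may also return a different class


@add_repr
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


print(Point(1, 2))  # Point(x=1, y=2)
```

Point does not need any special base class or metaclass, which is why this approach does not interfere with inheritance.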

def checked(cls: type)-> type:
    for name, constructor in _fields(cls).items():  # _fields is a top-level function
        setattr(cls, name, Field(name, constructor))  # replace each annotated attribute with a Field descriptor

    cls._fields = classmethod(_fields)  # type: ignore # build class method from _fields, add to decorated class.

    instance_methods = ( # module level functions that will become instance methods of decorated class.
        __init__,
        __repr__,
        __setattr__,
        _asdict,
        __flag_unknown_attrs,
    )
    for method in instance_methods:  # add each function to the class as an instance method
        setattr(cls, method.__name__, method)

    return cls
  • Methods to be injected into the decorated class:
def _fields(cls: type) -> dict[str, type]:
    return get_type_hints(cls)

def __init__(self: Any, **kwargs: Any) -> None:
    for name in self._fields():
        value = kwargs.pop(name, ...)
        setattr(self, name, value)
    if kwargs:
        self.__flag_unknown_attrs(*kwargs)

def __setattr__(self: Any, name: str, value: Any) -> None:
    if name in self._fields():
        cls = self.__class__
        descriptor = getattr(cls, name)
        descriptor.__set__(self, value)
    else:
        self.__flag_unknown_attrs(name)

def __flag_unknown_attrs(self: Any, *names: str) -> NoReturn:
    plural = 's' if len(names) > 1 else ''
    extra = ', '.join(f'{name!r}' for name in names)
    cls_name = repr(self.__class__.__name__)
    raise AttributeError(f'{cls_name} has no attribute{plural} {extra}')

def _asdict(self: Any) -> dict[str, Any]:
    return {
        name: getattr(self, name)
        for name, attr in self.__class__.__dict__.items()
        if isinstance(attr, Field)
    }

def __repr__(self: Any) -> str:
    kwargs = ', '.join(
        f'{key}={value!r}' for key, value in self._asdict().items()
    )
    return f'{self.__class__.__name__}({kwargs})'

What Happens When: Import Time vs Runtime

  • Python programmers talk about “import time” versus “runtime,” but the terms are not strictly defined and there is a gray area between them.
  • At import time the interpreter:
    • Parses the source code of a .py module in one pass from top to bottom. This is when a SyntaxError may occur.
    • Compiles the bytecode to be executed.
    • Executes the top-level code of the compiled module.
  • If there is an up-to-date .pyc file available in the local __pycache__, parsing and compiling are skipped because the bytecode is ready to run.
  • Parsing and compiling are definitely “import time” activities, but other things may happen at that time, because almost every statement in Python is executable: it can run user code and change the state of the program.
  • In particular, the import statement is not merely a declaration: it actually runs all the top-level code of a module when it is imported for the first time in the process. Further imports of the same module will use a cache, and then the only effect will be binding the imported objects to names in the client module.
  • So the import statement can trigger all sorts of “runtime” behavior. Conversely, “import time” can also happen deep inside runtime, because the import statement and the __import__() built-in can be used inside any regular function.
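The caching behavior described above can be observed with a small experiment. This is only a sketch: sidefx_demo is a hypothetical throwaway module written to a temporary directory, and import_twice is a helper invented for this illustration.

```python
import contextlib
import importlib
import io
import sys
import tempfile
from pathlib import Path


def import_twice():
    """Import a throwaway module twice from inside a function; capture its output."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical module whose top-level code has a visible side effect.
        Path(tmp, 'sidefx_demo.py').write_text("print('@ sidefx_demo top level')\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        captured = io.StringIO()
        try:
            with contextlib.redirect_stdout(captured):
                def lazy_import():
                    import sidefx_demo  # "import time" happening at runtime
                    return sidefx_demo

                first = lazy_import()   # runs the module's top-level code
                second = lazy_import()  # served from the sys.modules cache
        finally:
            sys.path.remove(tmp)
            sys.modules.pop('sidefx_demo', None)  # clean up the cache entry
    return captured.getvalue(), first is second


output, same_object = import_twice()
print(repr(output), same_object)
```

The top-level print runs once even though the import statement executes twice, and both imports return the same module object.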

Evaluation Time Experiments

Consider an evaldemo.py script that uses a class decorator, a descriptor, and a class builder based on __init_subclass__, all defined in a builderlib.py module.

print('@ builderlib module start')

class Builder:
    print('@ Builder body')

    def __init_subclass__(cls):
        print(f'@ Builder.__init_subclass__({cls!r})')

        def inner_0(self):
            print(f'@ SuperA.__init_subclass__:inner_0({self!r})')

        cls.method_a = inner_0

    def __init__(self):
        super().__init__()
        print(f'@ Builder.__init__({self!r})')


def deco(cls): # class decorator
    print(f'@ deco({cls!r})')

    def inner_1(self): # print the class
        print(f'@ deco:inner_1({self!r})')

    cls.method_b = inner_1
    return cls

class Descriptor:
    print('@ Descriptor body')

    def __init__(self):
        print(f'@ Descriptor.__init__({self!r})')

    def __set_name__(self, owner, name):
        args = (self, owner, name)
        print(f'@ Descriptor.__set_name__{args!r}')

    def __set__(self, instance, value):
        args = (self, instance, value)
        print(f'@ Descriptor.__set__{args!r}')

    def __repr__(self):
        return '<Descriptor instance>'


print('@ builderlib module end')

Now open a Python console and type import builderlib. The output is:

@ builderlib module start
@ Builder body
@ Descriptor body
@ builderlib module end
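The evaldemo.py script is not reproduced in these notes. As a stand-in, here is a compressed, self-contained sketch (not the book's script) that records events in a list instead of printing, to show the order in which the hooks fire when a class statement uses all three tools. The events list and the Klass class are assumptions made for this illustration.

```python
events = []  # hypothetical log, used here instead of print calls


class Descriptor:
    def __init__(self):
        events.append('Descriptor.__init__')

    def __set_name__(self, owner, name):
        events.append(f'Descriptor.__set_name__({owner.__name__}, {name!r})')


class Builder:
    def __init_subclass__(cls):
        super().__init_subclass__()
        events.append(f'Builder.__init_subclass__({cls.__name__})')


def deco(cls):  # class decorator
    events.append(f'deco({cls.__name__})')
    return cls


@deco
class Klass(Builder):
    events.append('Klass body start')
    attr = Descriptor()
    events.append('Klass body end')


events.append('after class statement')

for event in events:
    print(event)
```

The class body runs first, then type.__new__ calls __set_name__ on the descriptors, then the superclass __init_subclass__ runs, and the decorator is applied last.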

Metaclasses 101

[Metaclasses] are deeper magic than 99% of users should ever worry about. If you wonder whether you need them, you don’t (the people who actually need them know with certainty that they need them, and don’t need an explanation about why).

Tim Peters, inventor of the Timsort algorithm and prolific Python contributor

A metaclass is a class factory! A metaclass is a class whose instances are classes.

  • We know classes are objects, therefore each class must be an instance of some other class.
  • Python classes are instances of type. In other words type is the metaclass for most built-in and user-defined classes.
str.__class__
# <class 'type'>
type.__class__
# <class 'type'>
  • To avoid infinite regress, the class of type is type.
  • NOTE: str and LineItem are not subclasses of type. Both str and LineItem are instances of type. They are all subclasses of object.

UML class diagrams with object and type relationships.

The classes object and type have a unique relationship: object is an instance of type, and type is a subclass of object. This relationship is “magic”: it cannot be expressed in Python because either class would have to exist before the other could be defined. The fact that type is an instance of itself is also magical.

The next snippet shows that the class of collections.abc.Iterable is abc.ABCMeta. Note that Iterable is an abstract class, but ABCMeta is a concrete class. After all, Iterable is an instance of ABCMeta.

>>> from collections.abc import Iterable
>>> Iterable.__class__
<class 'abc.ABCMeta'>
>>> import abc
>>> from abc import ABCMeta
>>> ABCMeta.__class__
<class 'type'>
  • Ultimately, the class of ABCMeta is also type. Every class is an instance of type, directly or indirectly, but only metaclasses are also subclasses of type.
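The instance-of and subclass-of relationships described in this section can be checked directly with isinstance and issubclass. This is just a sketch that collects the claims in a dictionary named checks, a name chosen for this illustration:

```python
import abc
from collections.abc import Iterable

checks = {
    'str is an instance of type': isinstance(str, type),
    'str is a subclass of type': issubclass(str, type),  # the only False here
    'str is a subclass of object': issubclass(str, object),
    'object is an instance of type': isinstance(object, type),
    'type is a subclass of object': issubclass(type, object),
    'type is an instance of type': isinstance(type, type),
    'Iterable is an instance of ABCMeta': isinstance(Iterable, abc.ABCMeta),
    'Iterable is (indirectly) an instance of type': isinstance(Iterable, type),
    'ABCMeta is a subclass of type': issubclass(abc.ABCMeta, type),
}

for claim, result in checks.items():
    print(f'{claim}: {result}')
```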

UML class diagrams with Iterable and ABCMeta relationships.

How a Metaclass Customizes a Class

  • To use a metaclass, it’s crucial to understand how __new__ works on any class.
class Klass(SuperKlass, metaclass=MetaKlass):
    x = 42
    def __init__(self, y):
        self.y = y
  • To process that class statement, Python calls MetaKlass.__new__ with these arguments:
    • meta_cls : The metaclass itself (MetaKlass), because __new__ works as a class method.
    • cls_name : The string 'Klass'.
    • bases : The single-element tuple (SuperKlass,), with more elements in the case of multiple inheritance.
    • cls_dict : A mapping like:
{'x': 42, '__init__': <function __init__ at 0x1009c4040>}
  • When you implement MetaKlass.__new__, you can inspect and change those arguments before passing them to super().__new__, which will eventually call type.__new__ to create the new class object.
  • After super().__new__ returns, you can also apply further processing to the newly created class before returning it to Python. Python then calls SuperKlass.__init_subclass__, passing the class you created, and then applies a class decorator to it, if one is present. Finally, Python binds the class object to its name in the surrounding namespace—usually the global namespace of a module, if the class statement was a top-level statement.
  • The most common processing made in a metaclass __new__ is to add or replace items in the cls_dict—the mapping that represents the namespace of the class under construction. For instance, before calling super().__new__, you can inject methods in the class under construction by adding functions to cls_dict. However, note that adding methods can also be done after the class is built, which is why we were able to do it using __init_subclass__ or a class decorator.
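The flow described above can be sketched with a minimal metaclass. MetaKlass, SuperKlass, and Klass follow the names used in this section; the describe method and the seen_by_meta attribute are hypothetical additions, included only to show processing before and after super().__new__.

```python
class MetaKlass(type):
    def __new__(meta_cls, cls_name, bases, cls_dict):
        # Before the class exists: inspect the arguments...
        user_names = sorted(key for key in cls_dict if not key.startswith('__'))
        # ...and change the namespace, here by injecting a method.
        cls_dict['describe'] = lambda self: f'{cls_name} with y={self.y}'
        cls = super().__new__(meta_cls, cls_name, bases, cls_dict)
        # After the class exists: further processing before returning it.
        cls.seen_by_meta = (cls_name, bases, user_names)
        return cls


class SuperKlass:
    pass


class Klass(SuperKlass, metaclass=MetaKlass):
    x = 42

    def __init__(self, y):
        self.y = y


print(type(Klass))           # the metaclass, not type
print(Klass.seen_by_meta)    # name, bases, and non-dunder names in the class body
print(Klass(7).describe())   # Klass with y=7
```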

A Nice Metaclass Example

This example is from Chapter 4 of the book Python in a Nutshell, 3rd edition.

class MetaBunch(type): # to create a metaclass inherit from type
    def __new__(meta_cls, cls_name, bases, cls_dict):  # __new__ works as a class method: the first argument is the metaclass

        defaults = {}  # maps attribute names to their default values

        def __init__(self, **kwargs):  # will be injected into the new class
            for name, default in defaults.items():  # set each attribute from kwargs, falling back to its default
                setattr(self, name, kwargs.pop(name, default))
            if kwargs:  # leftover items have no slot to hold them; fail fast instead of silently ignoring them
                extra = ', '.join(kwargs)
                raise AttributeError(f'No slots left for: {extra!r}')

        def __repr__(self):  # string representation
            rep = ', '.join(f'{name}={value!r}'
                            for name, default in defaults.items()
                            if (value := getattr(self, name)) != default)
            return f'{cls_name}({rep})'

        new_dict = dict(__slots__=[], __init__=__init__, __repr__=__repr__)  # namespace for new class

        for name, value in cls_dict.items():  # iterate over the namespace of the user's class
            if name.startswith('__') and name.endswith('__'):  # dunder name: copy it to the new namespace, unless it is already there, so users cannot overwrite __slots__, __init__, or __repr__
                if name in new_dict:
                    raise AttributeError(f"Can't set {name!r} in {cls_name!r}")
                new_dict[name] = value
            else:  # not a dunder: append the name to __slots__ and save its value as the default
                new_dict['__slots__'].append(name)
                defaults[name] = value
        return super().__new__(meta_cls, cls_name, bases, new_dict)  # build and return the new class


class Bunch(metaclass=MetaBunch):  # provide a base class, so users don’t need to see MetaBunch.
    pass

Metaclass Evaluation Time Experiment

A Metaclass Solution for Checked

Read this section in the book.

Metaclasses in the Real World

Modern Features Simplify or Replace Metaclasses

Over time, several common use cases of metaclasses were made redundant by new language features:

  • Class decorators : Simpler to understand than metaclasses, and less likely to cause conflicts with base classes and metaclasses.
  • __set_name__ : Avoids the need for custom metaclass logic to automatically set the name of a descriptor.
  • __init_subclass__ : Provides a way to customize class creation that is transparent to the end user and even simpler than a decorator—but may introduce conflicts in a complex class hierarchy.
  • Built-in dict preserving key insertion order

    Eliminated the #1 reason to use __prepare__: to provide an OrderedDict to store the namespace of the class under construction. Python only calls __prepare__ on metaclasses, so if you needed to process the class namespace in the order it appears in the source code, you had to use a metaclass before Python 3.6.
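As a sketch of the last point, the following code reads the fields of a class in source order without any metaclass, relying only on the ordered class namespace and __init_subclass__. Ordered, Row, and field_order are hypothetical names used for this illustration.

```python
class Ordered:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # cls.__dict__ preserves the order of the class body since Python 3.6,
        # so no __prepare__ returning an OrderedDict is needed.
        cls.field_order = [name for name in cls.__dict__ if not name.startswith('_')]


class Row(Ordered):
    zebra = 1
    apple = 2
    mango = 3


print(Row.field_order)  # ['zebra', 'apple', 'mango']
```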

I keep advocating these features because I see too much unnecessary complexity in our profession, and metaclasses are a gateway to complexity.

Metaclasses Are Stable Language Features

Metaclasses were introduced in Python 2.2 in 2002, together with so-called “new-style classes,” descriptors, and properties.

It is remarkable that the MetaBunch example, first posted by Alex Martelli in July 2002, still works in Python 3.9—the only change being the way to specify the metaclass to use, which in Python 3 is done with the syntax class Bunch(metaclass=MetaBunch):.

A Class Can Only Have One Metaclass

If your class declaration involves two or more metaclasses, you will see this puzzling error message:

TypeError: metaclass conflict: the metaclass of a derived class
must be a (non-strict) subclass of the metaclasses of all its bases

This may happen even without multiple inheritance. For example, a declaration like this could trigger that TypeError:

class Record(abc.ABC, metaclass=PersistentMeta):
    pass

We saw that abc.ABC is an instance of the abc.ABCMeta metaclass. If the PersistentMeta metaclass is not itself a subclass of abc.ABCMeta, you get a metaclass conflict.

There are two ways of dealing with that error:

  • Find some other way of doing what you need to do, while avoiding at least one of the metaclasses involved.
  • Write your own PersistentABCMeta metaclass as a subclass of both abc.ABCMeta and PersistentMeta, using multiple inheritance, and use that as the only metaclass for Record
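Here is a sketch of the conflict and of the second workaround. PersistentMeta is a hypothetical metaclass that only sets a flag; a real persistence metaclass would do more, and combining metaclasses this way assumes that both call super().__new__ cooperatively.

```python
import abc


class PersistentMeta(type):
    """Hypothetical metaclass, for illustration only: marks classes as persistent."""

    def __new__(meta_cls, cls_name, bases, cls_dict):
        cls = super().__new__(meta_cls, cls_name, bases, cls_dict)
        cls.persistent = True
        return cls


try:
    class BadRecord(abc.ABC, metaclass=PersistentMeta):  # two unrelated metaclasses
        pass
except TypeError as exc:
    conflict = str(exc)


class PersistentABCMeta(abc.ABCMeta, PersistentMeta):
    """Subclass of both metaclasses, so it satisfies every base class."""


class Record(abc.ABC, metaclass=PersistentABCMeta):
    @abc.abstractmethod
    def save(self):
        """Subclasses must implement this."""


print(conflict)
print(type(Record).__name__, Record.persistent)
```

Record gets the behavior of both metaclasses: it carries the persistent flag and still cannot be instantiated while save is abstract.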

Metaclasses Should Be Implementation Details

Besides type, there are only six metaclasses in the entire Python 3.9 standard library. The better known metaclasses are probably abc.ABCMeta, typing.NamedTupleMeta, and enum.EnumMeta. None of them are intended to appear explicitly in user code. We may consider them implementation details.

In recent years, some metaclasses in the Python standard library were replaced by other mechanisms, without breaking the public API of their packages. The simplest way to future-proof such APIs is to offer a regular class that users subclass to access the functionality provided by the metaclass, as we’ve done in our examples.

Wrapping Up

Metaclasses, as well as class decorators and __init_subclass__, are useful for:

  • Subclass registration
  • Subclass structural validation
  • Applying decorators to many methods at once
  • Object serialization
  • Object-relational mapping
  • Object-based persistence
  • Implementing special methods at the class level
  • Implementing class features found in other languages, such as traits and aspect-oriented programming
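As an example of the first item, subclass registration needs only a few lines with __init_subclass__. Plugin, CsvExporter, and JsonExporter are hypothetical names used for this sketch.

```python
class Plugin:
    registry = {}  # maps class names to subclasses, filled in as they are defined

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Plugin.registry[cls.__name__] = cls


class CsvExporter(Plugin):
    pass


class JsonExporter(Plugin):
    pass


print(sorted(Plugin.registry))  # ['CsvExporter', 'JsonExporter']
```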

Class metaprogramming can also help with performance issues in some cases, by performing tasks at import time that otherwise would execute repeatedly at runtime.

To wrap up, let’s recall Alex Martelli’s final advice from his essay “Waterfowl and ABCs”:

And, don’t define custom ABCs (or metaclasses) in production code… if you feel the urge to do so, I’d bet it’s likely to be a case of “all problems look like a nail”-syndrome for somebody who just got a shiny new hammer—you (and future maintainers of your code) will be much happier sticking with straightforward and simple code, eschewing such depths.

Those powerful tools exist primarily to support library and framework development. Applications naturally should use those tools, as provided by the Python standard library or external packages. But implementing them in application code is often premature abstraction.

Good frameworks are extracted, not invented. - David Heinemeier Hansson, creator of Ruby on Rails