Type Hints in Functions

Type hints are the biggest change in the history of Python since the unification of types and classes in Python 2.2, released in 2001. However, type hints do not benefit all Python users equally. That’s why they should always be optional.

The goal of type hints is to help developer tools find bugs in Python codebases via static analysis, i.e., without actually running the code through tests.

About Gradual Typing

PEP 484 introduced a gradual type system to Python. The Mypy type checker itself started as a language: a gradually typed dialect of Python with its own interpreter. Guido van Rossum convinced the creator of Mypy, Jukka Lehtosalo, to make it a tool for checking annotated Python code.

A gradual type system:

  • Is optional
    • By default, the type checker should not emit warnings for code that has no type hints. Instead, the type checker assumes the Any type when it cannot determine the type of an object. The Any type is considered compatible with all other types.
  • Does not catch type errors at runtime
    • Type hints are used by static type checkers, linters, and IDEs to raise warnings. They do not prevent inconsistent values from being passed to functions at runtime.
  • Does not enhance performance
    • Type annotations provide data that could, in theory, allow optimizations in the generated bytecode, but the standard CPython interpreter does not use them for that.
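A minimal sketch of the second point above, using a hypothetical function named repeat_twice that is not part of the original notes. The annotation says int, yet the interpreter runs the call with a str without complaint. A static checker such as Mypy would be expected to flag that call.

```python
def repeat_twice(count: int) -> int:
    """Hypothetical example: annotated for int, but nothing enforces it at runtime."""
    return count * 2


# Annotations are stored as plain metadata on the function object.
print(repeat_twice.__annotations__)

# A static type checker should reject this call. At runtime it simply works,
# because str also supports the * operator.
result = repeat_twice("ab")  # type: ignore
print(result)
```

The `# type: ignore` comment only silences the checker on that line. It has no effect on what the interpreter does.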

The best usability feature of gradual typing is that annotations are always optional.

Gradual Typing in Practice

def show_count(count, word):
    if count == 1:
        return f'1 {word}'
    count_str = str(count) if count else 'no'
    return f'{count_str} {word}s'

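Install Mypy with pip install mypy.

Running mypy file.py on this code reports no errors, because by default Mypy does not check functions that have no annotations. To make it strict, use --disallow-untyped-defs, which flags every function definition that lacks type hints for its parameters and return value.

For the first steps with gradual typing, a gentler option is --disallow-incomplete-defs, which flags only functions that are partially annotated.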

def show_count(count: int, word: str) -> str:

Next, let the caller provide an optional plural parameter, for words whose plural is not formed by appending 's':

def show_count(count: int, singular: str, plural: str = '') -> str:
    if count == 1:
        return f'1 {singular}'
    count_str = str(count) if count else 'no'
    if not plural:
        plural = singular + 's'
    return f'{count_str} {plural}'

Using None as a Default

If the optional parameter expects a mutable type, then None is the only sensible default.

from typing import Optional

def show_count(count: int, singular: str, plural: Optional[str] = None) -> str:
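The notes give only the signature here. As a sketch, the body below is adapted from the earlier version of show_count, with the check changed to `plural is None` (my assumption, not something the notes specify) so that it matches the new default.

```python
from typing import Optional


def show_count(count: int, singular: str, plural: Optional[str] = None) -> str:
    if count == 1:
        return f'1 {singular}'
    count_str = str(count) if count else 'no'
    if plural is None:
        # No explicit plural given: fall back to the regular "add an s" rule.
        plural = singular + 's'
    return f'{count_str} {plural}'


print(show_count(3, 'bird'))
print(show_count(2, 'child', 'children'))
print(show_count(0, 'mouse', 'mice'))
```

Note that Optional[str] describes the type of the parameter. It does not make the parameter optional: the `= None` default does that.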

Types are defined by supported operations

In practice, it’s more useful to consider the set of supported operations as the defining characteristic of a type.

def double(x):
    return x * 2

The x parameter type may be numeric (int, complex, Fraction, numpy.uint32, etc.) but it may also be a sequence (str, tuple, list, array), an N-dimensional numpy.array, or any other type that implements or inherits a __mul__ method that accepts an int argument.
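A quick sketch of that claim, calling the same untyped double with a few of the types listed above. The sample values are arbitrary.

```python
from fractions import Fraction


def double(x):
    return x * 2


# Each of these types supports multiplication by an int.
for value in (21, 1.5, Fraction(1, 3), 'ab', [1, 2], (0,)):
    print(type(value).__name__, double(value))

# A type without a suitable __mul__ fails only when the call actually runs.
try:
    double(None)
except TypeError as exc:
    print('TypeError:', exc)
```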

However, consider this annotated version of double:

from collections import abc

def double(x: abc.Sequence):
    return x * 2

A type checker will reject this code, because __mul__ is not implemented by abc.Sequence. At runtime, however, the code works fine with concrete sequences such as str, tuple, list, and array, as well as with numbers.

In a gradual type system, we have the interplay of two different views of types:

  • Duck Typing
    • Objects have types, but variables (including parameters) are untyped. In practice, it doesn’t matter what the declared type of the object is, only what operations it supports. If I can invoke birdie.quack(), then birdie is a duck in this context.
    • The view adopted by Smalltalk, JavaScript, and Ruby.
  • Nominal Typing
    • Objects and variables have types. But objects only exist at runtime, and the type checker sees only the source code, where variables (including parameters) are annotated with type hints.
    • If Duck is a subclass of Bird, you can assign a Duck instance to a parameter annotated as birdie: Bird. But in the body of the function, the type checker considers the call birdie.quack() illegal, because birdie is nominally a Bird, and that class does not provide the .quack() method. It doesn’t matter if the actual argument at runtime is a Duck, because nominal typing is enforced statically.
    • The view adopted by C++, Java, and C#, supported by annotated Python.
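The Bird and Duck scenario above can be sketched as follows. The function names alert, alert_duck, and alert_bird are illustrative choices, and the comments describe what a checker such as Mypy would be expected to report, not output captured from a real run.

```python
class Bird:
    pass


class Duck(Bird):
    def quack(self) -> str:
        return 'Quack!'


def alert(birdie):
    # No type hints: the checker ignores this body (duck typing only).
    return birdie.quack()


def alert_duck(birdie: Duck) -> str:
    # Fine for the checker: Duck declares quack().
    return birdie.quack()


def alert_bird(birdie: Bird) -> str:
    # Nominal typing: a checker should flag this call, since Bird has no quack().
    return birdie.quack()  # type: ignore


daffy = Duck()
print(alert(daffy), alert_duck(daffy), alert_bird(daffy))

# At runtime only the actual object matters, so a plain Bird fails here.
try:
    alert_bird(Bird())
except AttributeError as exc:
    print('AttributeError:', exc)
```

The checker complains about alert_bird even though the call with a Duck works, and the runtime fails on a plain Bird even though the annotation allows it. Both views are needed to reason about the code.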

Types Usable in Annotations

The Any Type

  • Also known as the dynamic type.
  • When a type checker sees an untyped function like double above, it assumes the following:
def double(x: Any) -> Any:
    return x * 2

Contrast Any with object

def double(x: object) -> object:

This function accepts arguments of every type, because every type is a subtype of object.

However, a type checker rejects the body x * 2, because object does not implement __mul__.

Simple Types and Classes

Simple types like int, float, str, and bytes may be used directly in type hints. Concrete classes from the standard library can also be used in type hints.

Abstract base classes are also useful in type hints.

Among classes, consistent-with is defined like subtype-of: a subclass is consistent-with all its superclasses.

However, “practicality beats purity,” so there is an important exception: PEP 484 declares that int is consistent-with float, and float is consistent-with complex, even though int is not a subclass of float.

Optional and Union Types

The construct Optional[str] is actually a shortcut for Union[str, None], which means the type of plural may be str or None.

We can write str | bytes instead of Union[str, bytes] since Python 3.10.

plural: Optional[str] = None    # before
plural: str | None = None       # after

Union[] requires at least two types. Nested Union types have the same effect as a flattened Union. So this type hint:

Union[A, B, Union[C, D, E]]

is the same as:

Union[A, B, C, D, E]
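Both equivalences can be checked at runtime, because typing normalizes Union expressions when they are built. This sketch substitutes concrete built-in types for the placeholder names A to E.

```python
from typing import Optional, Union

nested = Union[int, str, Union[bytes, float]]
flat = Union[int, str, bytes, float]

# Nested unions are flattened when the type expression is evaluated.
print(nested == flat)

# Optional[X] is shorthand for Union[X, None].
print(Optional[str] == Union[str, None])
```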

Generic Collections

  • Most Python collections are heterogeneous.
  • Generic types can be declared with type parameters to specify the type of the items they can handle.
def tokenize(text: str) -> list[str]:
    return text.upper().split()
  • PEP 585 rolled out this notation in stages. Python 3.7 introduced from __future__ import annotations to enable the use of standard library classes as generics with list[str] notation.
  • Python 3.9 made that behavior the default: list[str] now works without the future import.
  • The redundant generic types in the typing module are deprecated. The Python interpreter does not issue deprecation warnings for them, because type checkers should flag the deprecated types when the checked program targets Python 3.9 or newer.
  • The plan is to remove those redundant generic types in the first version of Python released five years after Python 3.9. At the current cadence, that could be Python 3.14, a.k.a. Python Pi.

Tuple Types

Three ways to annotate tuple types:

  • Tuples as records
    • If you’re using a tuple as a record, use the tuple built-in and declare the types of the fields within [].
    • e.g. tuple[str, float, str]
  • Tuples as records with named fields
    • To annotate a tuple with many fields, or specific types of tuple your code uses in many places, use NamedTuple
from typing import NamedTuple

from geolib import geohash as gh  # type: ignore

PRECISION = 9

class Coordinate(NamedTuple):
    lat: float
    lon: float

def geohash(lat_lon: Coordinate) -> str:
    return gh.encode(*lat_lon, PRECISION)
  • Tuples as immutable sequences
    • To annotate tuples of unspecified length that are used as immutable lists, you must specify a single type, followed by a comma and ..., e.g. tuple[int, ...].
    • The ellipsis indicates that any number of elements, including zero, is acceptable. There is no way to specify fields of different types for tuples of arbitrary length.
    • The annotations stuff: tuple[Any, ...] and stuff: tuple mean the same thing: stuff is a tuple of unspecified length with objects of any type.
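A short sketch contrasting the record style with the immutable-sequence style. The functions describe and total and the sample data are made up for illustration.

```python
def describe(city: tuple[str, float, str]) -> str:
    # Record: a fixed number of fields, each with its own type.
    name, population, country = city
    return f'{name}, {country}: {population} million'


def total(values: tuple[int, ...]) -> int:
    # Immutable sequence: any length, every item an int.
    return sum(values)


print(describe(('Shanghai', 24.28, 'China')))
print(total((1, 2, 3)))
print(total(()))  # the empty tuple is also a valid tuple[int, ...]
```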

Generic Mappings

Generic mapping types are annotated as MappingType[KeyType, ValueType]. The built-in dict and the mapping types in collections and collections.abc accept that notation in Python ≥ 3.9.

  • dict[str, set[str]]
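As an illustration of dict[str, set[str]], here is a small sketch with a hypothetical function index_by_initial that maps each initial letter to the set of words starting with it.

```python
def index_by_initial(words: list[str]) -> dict[str, set[str]]:
    """Hypothetical example: group words by their first letter."""
    index: dict[str, set[str]] = {}
    for word in words:
        if not word:
            continue  # skip empty strings, which have no first letter
        index.setdefault(word[0].lower(), set()).add(word)
    return index


print(index_by_initial(['apple', 'avocado', 'banana']))
```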

Abstract Base Classes

Be conservative in what you send, be liberal in what you accept. Postel’s law, a.k.a. the Robustness Principle.

from collections.abc import Mapping

def name2hex(name: str, color_map: Mapping[str, int]) -> str:
  • The signature above allows the caller to provide an instance of dict, defaultdict, ChainMap, a UserDict subclass, or any other type that is a subtype-of Mapping.
def name2hex(name: str, color_map: dict[str, int]) -> str:
  • This signature limits color_map to dict or one of its subtypes, such as defaultdict or OrderedDict. A UserDict subclass would not pass the type check.
  • Therefore, in general it’s better to use abc.Mapping or abc.MutableMapping in parameter type hints, instead of dict.

Postel’s law also tells us to be conservative in what we send. The return value of a function is always a concrete object, so the return type hint should be a concrete type.
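The notes show only the signature of name2hex. The body below is a hypothetical stand-in, written only to show that a Mapping parameter accepts several mapping types while the return hint stays concrete.

```python
from collections import ChainMap
from collections.abc import Mapping


def name2hex(name: str, color_map: Mapping[str, int]) -> str:
    # Hypothetical body: format the mapped integer as a six-digit hex color.
    return f'#{color_map[name]:06x}'


base = {'red': 0xFF0000, 'green': 0x00FF00}
extended = ChainMap({'teal': 0x008080}, base)

print(name2hex('red', base))       # a plain dict is a Mapping
print(name2hex('teal', extended))  # so is a ChainMap
```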

The Fall of the Numeric Tower

The numbers package defines the so-called numeric tower, a linear hierarchy of ABCs with Number at the top:

  • Number
  • Complex
  • Real
  • Rational
  • Integral

The “Numeric Tower” section of PEP 484 rejects the numbers ABCs and dictates that the built-in types complex, float, and int should be treated as special cases.

Iterable

The typing.List documentation recommends Sequence and Iterable for function parameter type hints.

from collections.abc import Iterable

FromTo = tuple[str, str]    # type alias

def zip_replace(text: str, changes: Iterable[FromTo]) -> str:   # Iterable[tuple[str, str]]
    for from_, to in changes:
        text = text.replace(from_, to)
    return text
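A usage sketch for zip_replace with made-up sample data. Because changes is annotated as Iterable, both a list and a generator expression are acceptable arguments.

```python
from collections.abc import Iterable

FromTo = tuple[str, str]  # type alias


def zip_replace(text: str, changes: Iterable[FromTo]) -> str:
    for from_, to in changes:
        text = text.replace(from_, to)
    return text


pairs = [('hello', 'goodbye'), ('world', 'moon')]

# A list works...
print(zip_replace('hello world', pairs))

# ...and so does a generator, which is never materialized as a list.
print(zip_replace('hello world', ((old, new.upper()) for old, new in pairs)))
```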

Since Python 3.10, the preferred way to declare a type alias is with an explicit typing.TypeAlias annotation (PEP 613):

from typing import TypeAlias

FromTo: TypeAlias = tuple[str, str]

abc.Iterable versus abc.Sequence

Both math.fsum and replacer.zip_replace must iterate over the entire Iterable arguments to return a result. Given an endless iterable such as the itertools.cycle generator as input, these functions would consume all memory and crash the Python process. Despite this potential danger, it is fairly common in modern Python to offer functions that accept an Iterable input even if they must process it completely to return a result. That gives the caller the option of providing input data as a generator instead of a prebuilt sequence, potentially saving a lot of memory if the number of input items is large.

Parameterized Generics and TypeVar

  • A parameterized generic is a generic type, written as list[T], where T is a type variable that will be bound to a specific type with each usage. This allows a parameter type to be reflected on the result type.

Example: a mode function that returns the most common item in a collection:

from collections import Counter
from collections.abc import Iterable

def mode(data: Iterable[float]) -> float:
    pairs = Counter(data).most_common(1)
    if len(pairs) == 0:
        raise ValueError('no mode for empty data')
    return pairs[0][0]
  • Many uses of mode involve int or float values, but Python has other numerical types, so it is desirable for the return type to follow the element type of the given Iterable.
from collections.abc import Iterable
from typing import TypeVar

T = TypeVar('T')

def mode(data: Iterable[T]) -> T:
  • When it first appears in the signature, the type parameter T can be any type. The second time it appears, it will mean the same type as the first.
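Putting the generic signature together with the body shown earlier gives a runnable sketch. The sample inputs are arbitrary and chosen to avoid ties.

```python
from collections import Counter
from collections.abc import Iterable
from fractions import Fraction
from typing import TypeVar

T = TypeVar('T')


def mode(data: Iterable[T]) -> T:
    pairs = Counter(data).most_common(1)
    if len(pairs) == 0:
        raise ValueError('no mode for empty data')
    return pairs[0][0]


# For each call, a checker binds T to the item type, so the result type follows it.
print(mode([1, 1, 2, 3]))                                       # T is int
print(mode([Fraction(1, 2), Fraction(1, 2), Fraction(1, 3)]))   # T is Fraction
print(mode(['red', 'blue', 'red']))                             # T is str
```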

Restricted TypeVar

TypeVar accepts extra positional arguments to restrict the type parameter. We can improve the signature of mode to accept specific number types, like this:

from collections.abc import Iterable
from decimal import Decimal
from fractions import Fraction
from typing import TypeVar

NumberT = TypeVar('NumberT', float, Decimal, Fraction)

def mode(data: Iterable[NumberT]) -> NumberT:

Bounded TypeVar

from collections.abc import Iterable, Hashable

def mode(data: Iterable[Hashable]) -> Hashable:
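  • The signature above accepts any hashable items, but its return type is only Hashable, so the checker loses track of the specific item type. A bounded TypeVar fixes that.
  • A restricted type variable will be set to one of the types named in the TypeVar declaration.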
  • A bounded type variable will be set to the inferred type of the expression—as long as the inferred type is consistent-with the boundary declared in the bound= keyword argument of TypeVar.
  • The typing module includes a predefined TypeVar named AnyStr. It’s defined like this:

    AnyStr = TypeVar('AnyStr', bytes, str)
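The bullets above describe the bound= keyword, but the notes do not show it in use. A minimal sketch, assuming a type variable named HashableT (the name is my choice):

```python
from collections import Counter
from collections.abc import Hashable, Iterable
from typing import TypeVar

# Any type consistent-with Hashable is allowed, and the specific type is kept.
HashableT = TypeVar('HashableT', bound=Hashable)


def mode(data: Iterable[HashableT]) -> HashableT:
    pairs = Counter(data).most_common(1)
    if len(pairs) == 0:
        raise ValueError('no mode for empty data')
    return pairs[0][0]


print(mode(['a', 'b', 'a']))    # a checker infers str, not just Hashable
print(mode([(1, 2), (1, 2)]))   # tuples of hashable items are hashable too
```

Compared with the Iterable[Hashable] -> Hashable signature, the return type here follows the argument: a checker infers str for the first call instead of the less useful Hashable.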
    

Static Protocols

In OOPs, the concept of a protocol as an info

Callable

  • To annotate callback parameters or callable objects returned by higher-order functions, the collections.abc module provides the Callable type, also available from the typing module for Python versions before 3.9. A Callable type is parameterized like this:
Callable[[ParamType1, ParamType2], ReturnType]
def repl(input_fn: Callable[[Any], str] = input) -> None:
  • During normal usage, the repl function uses Python’s input built-in to read expressions from the user. However, for automated testing or for integration with other input sources, repl accepts an optional input_fn parameter: a Callable with the same parameter and return types as input.
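The same pattern in a self-contained sketch. Here ask_name is a hypothetical stand-in for repl, and fake_input is a made-up test double. Only the shape of the input_fn parameter comes from the notes.

```python
from collections.abc import Callable
from typing import Any


def ask_name(input_fn: Callable[[Any], str] = input) -> str:
    """Hypothetical stand-in for repl: read one line through input_fn."""
    name = input_fn('Name: ')
    return f'Hello, {name.strip()}!'


def fake_input(prompt: Any = '') -> str:
    # Test double with the same parameter and return types as input().
    return '  Ada '


print(ask_name(fake_input))
```

Passing the callable in as a parameter is what makes the function testable without touching standard input.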

Variance in Callable types

  • It’s OK to provide a callback that returns an int when the code expects a callback that returns a float, because an int value can always be used where a float is expected.
  • We say that Callable[[], int] is subtype-of Callable[[], float], because int is subtype-of float. This means that Callable is covariant on the return type: the subtype-of relationship of the types int and float is in the same direction as the relationship of the Callable types that use them as return types.
  • But it’s a type error to provide a callback that takes an int argument when a callback that handles a float is required.
  • Callable[[int], None] is not a subtype-of Callable[[float], None]. Although int is subtype-of float, in the parameterized Callable type the relationship is reversed: Callable[[float], None] is subtype-of Callable[[int], None]. Therefore we say that Callable is contravariant on the declared parameter types.
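A sketch of both rules, with hypothetical callbacks that record to a list named log so the effect is easy to inspect. The comments state what a type checker is expected to report.

```python
from collections.abc import Callable

log: list[str] = []


def update(probe: Callable[[], float], display: Callable[[float], None]) -> None:
    temperature = probe()
    display(temperature)


def probe_ok() -> int:
    # Covariant return: an int is acceptable where a float is expected.
    return 42


def display_ok(temperature: complex) -> None:
    # Contravariant parameter: accepting complex covers every float.
    log.append(str(temperature))


def display_wrong(temperature: int) -> None:
    # Expects an int, so it cannot handle every float: hex() needs an int.
    log.append(hex(temperature))


update(probe_ok, display_ok)     # OK for the checker
update(probe_ok, display_wrong)  # type: ignore  (checker should reject this)
print(log)
```

The rejected call happens to run here only because probe_ok returns an int. With a probe that returns an actual float, hex() would raise TypeError, which is the failure the checker is warning about.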

NoReturn

  • A special type used only to annotate the return type of functions that never return.
  • Usually, such functions exist to raise exceptions.
  • For example, sys.exit() raises SystemExit to terminate the Python process. Its signature in the typeshed stub file is:
def exit(__status: object = ...) -> NoReturn: ...
  • The __status parameter is positional-only, and it has a default value.
  • Stub files don’t spell out the default values; they use ... instead.
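A small sketch of NoReturn in user code, with hypothetical helpers fail and parse_port. Because fail is declared as never returning, a checker can tell that parse_port does not fall off the end and return None by accident.

```python
from typing import NoReturn


def fail(message: str) -> NoReturn:
    """Hypothetical helper that always raises."""
    raise RuntimeError(message)


def parse_port(text: str) -> int:
    if text.isdigit():
        return int(text)
    # Control never continues past this call.
    fail(f'invalid port: {text!r}')


print(parse_port('8080'))
try:
    parse_port('http')
except RuntimeError as exc:
    print('RuntimeError:', exc)
```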

Annotating Positional Only and Variadic Parameters

The earlier tag example had this signature:

def tag(name, /, *content, class_=None, **attrs):

Here is tag fully annotated, written in several lines - a common convention for long signatures.

from typing import Optional

def tag(
    name: str,
    /,
    *content: str,  # inside the function, content is a tuple[str, ...]
    class_: Optional[str] = None,
    **attrs: str,  # inside the function, attrs is a dict[str, str]
) -> str:
    ...

With **attrs: float, the type of attrs inside the function would be dict[str, float]. To accept values of different types, use a Union or Any.
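To see those local types at work, here is a sketch with a hypothetical function join_fields, much simpler than tag, that uses the same parameter kinds.

```python
def join_fields(sep: str, /, *fields: str, **widths: int) -> str:
    """Hypothetical example: pad each field to an optional width, then join."""
    # The hints name the item types; the containers are implied:
    # fields is a tuple[str, ...] and widths is a dict[str, int].
    padded = [field.ljust(widths.get(field, 0)) for field in fields]
    return sep.join(padded)


print(join_fields('|', 'a', 'bb', a=3))

# sep is positional-only, so sep='|' lands in **widths and sep itself is missing.
try:
    join_fields(sep='|')  # type: ignore
except TypeError as exc:
    print('TypeError:', exc)
```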

Imperfect Typing and Strong Testing

Maintainers of large corporate codebases report that many bugs are found by static type checkers and fixed more cheaply than if the bugs were discovered only after the code is running in production. However, it’s essential to note that automated testing was standard practice and widely adopted long before static typing was introduced in the companies.

Even in the contexts where they are most beneficial, static typing cannot be trusted as the ultimate arbiter of correctness. It’s not hard to find:

  • False positives

    Tools report type errors on code that is correct.

  • False negatives

    Tools don’t report type errors on code that is incorrect.

Also, if we are forced to type check everything, we lose some of the expressive power of Python:

  • Some handy features can’t be statically checked; for example, argument unpacking like config(**settings).
  • Advanced features like properties, descriptors, metaclasses, and metaprogramming in general are poorly supported or beyond comprehension for type checkers.
  • Type checkers lag behind Python releases, rejecting or even crashing while analyzing code with new language features—for more than a year in some cases.

Common data constraints cannot be expressed in the type system—even simple ones.
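As a sketch of such a constraint, take a hypothetical function order_total whose quantity must be greater than zero. The hint int cannot say that, so the rule has to live in a runtime check backed by tests.

```python
def order_total(quantity: int, unit_price: float) -> float:
    """Hypothetical example: the type hint cannot express 'quantity > 0'."""
    if quantity <= 0:
        raise ValueError('quantity must be greater than zero')
    return quantity * unit_price


print(order_total(3, 2.5))

# This call satisfies the type hints, so only the runtime check catches it.
try:
    order_total(0, 2.5)
except ValueError as exc:
    print('ValueError:', exc)
```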