Type Hints in Functions
Type hints are the biggest change in the history of Python since the unification of types and classes in Python 2.2, released in 2001. However, type hints do not benefit all Python users equally. That’s why they should always be optional.
The goal of their introduction is to help developer tools find bugs in Python codebases via static analysis, i.e., without actually running the code through tests.
About Gradual Typing
PEP 484 introduced a gradual type system to Python. The Mypy type checker itself started as a language: a gradually typed dialect of Python with its own interpreter. Guido van Rossum convinced the creator of Mypy, Jukka Lehtosalo, to make it a tool for checking annotated Python code instead.
A gradual type system:
- Is optional
  - By default, the type checker should not emit warnings for code that has no type hints. Instead, it assumes the Any type when it cannot determine the type of an object. The Any type is considered compatible with every other type.
- Does not catch type errors at runtime
  - Type hints are used by static type checkers, linters, and IDEs to raise warnings. They do not prevent inconsistent values from being passed to a function at runtime.
- Does not enhance performance
  - Type hints provide data that could, in theory, allow optimizations in the generated bytecode, but CPython does not implement such optimizations.
The best usability feature of gradual typing is that annotations are always optional.
Gradual Typing in Practice
def show_count(count, word):
    if count == 1:
        return f'1 {word}'
    count_str = str(count) if count else 'no'
    return f'{count_str} {word}s'
Install Mypy with pip install mypy.
Running mypy file.py on the untyped show_count gives no error, because by default Mypy ignores functions without type hints. To make it strict, use --disallow-untyped-defs.
For the first steps with gradual typing, a gentler option is --disallow-incomplete-defs, which only flags functions that are partially annotated.
Allowing the caller to provide an optional plural parameter:
def show_count(count: int, singular: str, plural: str = '') -> str:
    if count == 1:
        return f'1 {singular}'
    count_str = str(count) if count else 'no'
    if not plural:
        plural = singular + 's'
    return f'{count_str} {plural}'
Using None as a Default
If the optional parameter expects a mutable type, then None is the only sensible default.
from typing import Optional

def show_count(count: int, singular: str, plural: Optional[str] = None) -> str:
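    # Body not shown in the original notes; a minimal sketch assuming the same
    # logic as the previous version (None is falsy, so the default plural kicks in):
    if count == 1:
        return f'1 {singular}'
    count_str = str(count) if count else 'no'
    if not plural:
        plural = singular + 's'
    return f'{count_str} {plural}'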
Types are defined by supported operations
In practice, it’s more useful to consider the set of supported operations as the defining characteristic of a type.
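Fluent Python illustrates this with a double function; untyped, it works with any object that supports multiplication by an int:

def double(x):
    return x * 2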
The x parameter type may be numeric (int, complex, Fraction, numpy.uint32, etc.), but it may also be a sequence (str, tuple, list, array), an N-dimensional numpy.array, or any other type that implements or inherits a __mul__ method that accepts an int argument.
However, consider an annotated double, shown below. A type checker will reject its body because __mul__ is not implemented by abc.Sequence, even though at runtime the code works fine with concrete sequences such as str, tuple, list, and array, as well as with numbers, since annotations are not enforced.
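A sketch of that annotated version, following the book's example:

from collections import abc

def double(x: abc.Sequence):
    return x * 2    # flagged by type checkers: Sequence does not declare __mul__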
In a gradual type system, we have the interplay of two different views of types:
- Duck typing
  - Objects have types, but variables (including parameters) are untyped. In practice, it doesn't matter what the declared type of an object is, only what operations it supports. If I can invoke birdie.quack(), then birdie is a duck in this context.
  - The view adopted by Smalltalk, JavaScript, and Ruby.
- Nominal typing
  - Objects and variables have types, but objects only exist at runtime, and the type checker only cares about the source code, where variables and parameters are annotated with type hints.
  - If Duck is a subclass of Bird, you can assign a Duck instance to a parameter annotated as birdie: Bird. But in the body of the function, the type checker considers the call birdie.quack() illegal, because birdie is nominally a Bird, and that class does not provide the .quack() method. It doesn't matter if the actual argument at runtime is a Duck, because nominal typing is enforced statically (see the sketch after this list).
  - The view adopted by C++, Java, and C#, and supported by annotated Python.
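A small sketch along the lines of the book's birds example, contrasting what a type checker accepts under each view:

class Bird:
    pass

class Duck(Bird):
    def quack(self):
        print('Quack!')

def alert(birdie):              # no hint: the checker assumes Any, so anything goes
    birdie.quack()

def alert_duck(birdie: Duck) -> None:
    birdie.quack()              # OK: Duck declares .quack()

def alert_bird(birdie: Bird) -> None:
    birdie.quack()              # flagged: Bird has no .quack(), even if a Duck is passed at runtime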
Types Usable in Annotations
The Any Type
- Also known as the dynamic type.
- When a type checker sees an untyped function such as def double(x): ..., it assumes the equivalent of def double(x: Any) -> Any: and raises no complaints about the body.
Contrast Any with object. A function annotated as double(x: object) -> object accepts arguments of every type, because every type is a subtype of object. However, the type checker rejects its body, because object does not implement __mul__ (see the sketch below).
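A sketch of the contrast, reusing the double example (the suffixes are added here only to keep the two variants apart):

from typing import Any

def double_any(x: Any) -> Any:
    return x * 2        # OK: Any supports every operation

def double_object(x: object) -> object:
    return x * 2        # flagged: object does not support __mul__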
Simple Types and Classes
Simple types like int, float, str, and bytes may be used directly in type hints. Concrete classes from the standard library, as well as user-defined classes, can also be used in type hints.
Abstract base classes are also useful in type hints.
Among classes, consistent-with is defined like subtype-of: a subclass is consistent-with all its superclasses.
However, "practicality beats purity," so there is an important exception: per PEP 484, int is consistent-with float, and float is consistent-with complex, even though there are no actual subclass relationships among those built-in types.
Optional and Union Types
The construct Optional[str] is actually a shortcut for Union[str, None], which means the type of plural may be str or None.
Since Python 3.10 we can write str | bytes instead of Union[str, bytes].
Union[] requires at least two types. Nested Union types have the same effect as a flattened Union, so the type hint Union[A, B, Union[C, D, E]] is the same as Union[A, B, C, D, E].
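A small example of a function that genuinely needs a Union return type, in the spirit of the book's token-parsing example (the function name is illustrative):

from typing import Union

def parse_token(token: str) -> Union[str, float]:
    try:
        return float(token)
    except ValueError:
        return token

# Equivalent spelling in Python >= 3.10:
# def parse_token(token: str) -> str | float: ...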
Generic Collections
- Most Python collections are heterogeneous.
- Generic types can be declared with type parameters to specify the type of the items they can handle (see the example after this list).
- PEP 585 set out a multiyear plan for generics in the standard collections:
  - Introduce from __future__ import annotations in Python 3.7 to enable the use of standard library classes as generics with list[str] notation.
  - Make that behavior the default in Python 3.9: list[str] now works without the future import.
  - Deprecate all the redundant generic types from the typing module. Deprecation warnings will not be issued by the Python interpreter, because type checkers should flag the deprecated types when the checked program targets Python 3.9 or newer.
  - Remove those redundant generic types in the first version of Python released five years after Python 3.9. At the current cadence, that could be Python 3.14, a.k.a. Python Pi.
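For example, along the lines of the book's tokenize example, a parameterized list says that the function returns a list in which every item is a str:

def tokenize(text: str) -> list[str]:
    return text.upper().split()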
Tuple Types
Three ways to annotate tuple types:
- Tuples as records
  - If you're using a tuple as a record, use the tuple built-in and declare the types of the fields within [].
  - e.g. tuple[str, float, str]
- Tuples as records with named fields
  - To annotate a tuple with many fields, or specific types of tuple your code uses in many places, use typing.NamedTuple:
from typing import NamedTuple
from geolib import geohash as gh  # type: ignore

PRECISION = 9

class Coordinate(NamedTuple):
    lat: float
    lon: float

def geohash(lat_lon: Coordinate) -> str:
    return gh.encode(*lat_lon, PRECISION)
- Tuples as immutable sequences
  - To annotate tuples of unspecified length that are used as immutable lists, you must specify a single type, followed by a comma and ...
  - The ellipsis indicates that any number of elements >= 1 is acceptable. There is no way to specify fields of different types for tuples of arbitrary length.
  - The annotations stuff: tuple[Any, ...] and stuff: tuple mean the same thing: stuff is a tuple of unspecified length with objects of any type.
Generic Mappings
Generic mapping types are annotated as MappingType[KeyType, ValueType]. The built-in dict and the mapping types in collections and collections.abc accept that notation in Python ≥ 3.9.
For example, a function that builds an inverted index of Unicode character names can be annotated to return dict[str, set[str]].
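A self-contained sketch of such a function (simplified from the book's character-name index example; the range defaults here are arbitrary):

import unicodedata

def name_index(start: int = 32, end: int = 128) -> dict[str, set[str]]:
    index: dict[str, set[str]] = {}
    for char in (chr(i) for i in range(start, end)):
        name = unicodedata.name(char, '')   # '' for unnamed code points
        for word in name.split():
            index.setdefault(word, set()).add(char)
    return index

# name_index(32, 65)['SIGN'] -> {'#', '$', '%', '+', '<', '=', '>'}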
Abstract Base Classes
"Be conservative in what you send, be liberal in what you accept." (Postel's law, a.k.a. the Robustness Principle)
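To see what this means for parameter annotations, compare two possible signatures for the book's name2hex example (bodies omitted; the second is an alternative, more restrictive spelling):

from collections.abc import Mapping

def name2hex(name: str, color_map: Mapping[str, int]) -> str: ...   # liberal parameter type

def name2hex(name: str, color_map: dict[str, int]) -> str: ...      # restrictive parameter type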
- The Mapping version above allows the caller to provide an instance of dict, defaultdict, ChainMap, a UserDict subclass, or any other type that is a subtype-of Mapping.
- The dict version limits the argument to dict and its subclasses (which excludes UserDict, since it does not subclass dict).
- Therefore, in general it's better to use abc.Mapping or abc.MutableMapping in parameter type hints, instead of dict.
Postel's law also tells us to be conservative in what we send. The return value of a function is always a concrete object, so the return type hint should be a concrete type.
The fall of the numeric tower
The numbers package defines the so-called numeric tower, a linear hierarchy of ABCs from most general to most specific:
- Number
- Complex
- Real
- Rational
- Integral
The "Numeric Tower" section of PEP 484 rejects the numbers ABCs and dictates that the built-in types complex, float, and int should be treated as special cases instead.
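In practice, that special-casing means int is consistent-with float, and float is consistent-with complex. A minimal illustration (the function is made up for this note):

def add_half(x: float) -> float:
    return x + 0.5

add_half(2)      # accepted by type checkers: int is consistent-with float
add_half(2.0)    # obviously fine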
Iterable
The typing.List documentation recommends Sequence and Iterable for function parameter type hints.
from collections.abc import Iterable

FromTo = tuple[str, str]  # type alias, so Iterable[FromTo] == Iterable[tuple[str, str]]

def zip_replace(text: str, changes: Iterable[FromTo]) -> str:
    for from_, to in changes:
        text = text.replace(from_, to)
    return text
From Python 3.10, we should use typing.TypeAlias to make such type aliases explicit:
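# Explicit alias declaration per PEP 613 (available since Python 3.10):
from typing import TypeAlias

FromTo: TypeAlias = tuple[str, str]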
abc.Iterable versus abc.Sequence
Both math.fsum and replacer.zip_replace must iterate over the entire Iterable argument to return a result. Given an endless iterable such as the itertools.cycle generator as input, these functions would consume all memory and crash the Python process. Despite this potential danger, it is fairly common in modern Python to offer functions that accept an Iterable input even if they must process it completely to return a result. That gives the caller the option of providing input data as a generator instead of a prebuilt sequence, potentially saving a lot of memory if the number of input items is large.
Parameterized Generics and TypeVar
- A parameterized generic is a generic type, written as list[T], where T is a type variable that will be bound to a specific type with each usage. This allows a parameter type to be reflected in the result type.
Example: a mode function that returns the most common data point:
from collections import Counter
from collections.abc import Iterable

def mode(data: Iterable[float]) -> float:
    pairs = Counter(data).most_common(1)
    if len(pairs) == 0:
        raise ValueError('no mode for empty data')
    return pairs[0][0]
- Many uses of mode involve float or int values, but Python has other numerical types, so it is desirable that the return type follow the element type of the Iterable that was given. That is what TypeVar enables:
from collections.abc import Iterable
from typing import TypeVar

T = TypeVar('T')

def mode(data: Iterable[T]) -> T:
    ...  # body is the same as before
- When it first appears in the signature, the type parameter T can be any type. The second time it appears, it will mean the same type as the first.
Restricted TypeVar
TypeVar accepts extra positional arguments to restrict the type parameter. We can improve the signature of mode to accept specific number types, like this:
from collections.abc import Iterable
from decimal import Decimal
from fractions import Fraction
from typing import TypeVar

NumberT = TypeVar('NumberT', float, Decimal, Fraction)

def mode(data: Iterable[NumberT]) -> NumberT:
    ...  # body is the same as before
Bounded TypeVar
- A restricted type variable will be set to one of the types named in the TypeVar declaration.
- A bounded type variable will be set to the inferred type of the expression, as long as the inferred type is consistent-with the boundary declared in the bound= keyword argument of TypeVar.
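A sketch of a bounded version of mode, following the book's use of Hashable as the bound (elements counted by Counter must be hashable):

from collections import Counter
from collections.abc import Hashable, Iterable
from typing import TypeVar

HashableT = TypeVar('HashableT', bound=Hashable)

def mode(data: Iterable[HashableT]) -> HashableT:
    pairs = Counter(data).most_common(1)
    if len(pairs) == 0:
        raise ValueError('no mode for empty data')
    return pairs[0][0]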
The typing module includes a predefined TypeVar named AnyStr. It's defined like this:
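# the actual definition inside the typing module: a restricted TypeVar over bytes and str
AnyStr = TypeVar('AnyStr', bytes, str)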
Static Protocols
In OOP, the concept of a protocol as an informal interface is as old as Smalltalk, and it has been part of Python from the beginning. In the context of type hints, however, a protocol is a subclass of typing.Protocol, as defined in PEP 544: a class that declares the methods some type must implement, which static checkers verify structurally.
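The book's running example for this topic is a top function whose elements must support the < operator; a sketch using a protocol as a bound:

from collections.abc import Iterable
from typing import Any, Protocol, TypeVar

class SupportsLessThan(Protocol):
    def __lt__(self, other: Any) -> bool: ...

LT = TypeVar('LT', bound=SupportsLessThan)

def top(series: Iterable[LT], length: int) -> list[LT]:
    ordered = sorted(series, reverse=True)
    return ordered[:length]

# top([4, 1, 5, 2, 6, 7, 3], 3) -> [7, 6, 5]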
Callable
- To annotate callback parameters or callable objects returned by higher-order functions, the collections.abc module provides the Callable type (available in the typing module for Python < 3.9).
- During normal usage, the repl function uses Python's input built-in to read expressions from the user. However, for automated testing or integration with other input sources, repl accepts an optional input_fn parameter: a Callable with the same parameter and return types as input (see the sketch after this list).
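A sketch of such a signature, following the book's repl example (the body here is only a stand-in to make the sketch runnable):

from collections.abc import Callable
from typing import Any

def repl(input_fn: Callable[[Any], str] = input) -> None:
    while True:
        line = input_fn('> ')      # input_fn must accept one argument and return a str
        if not line:
            break
        print(line.upper())        # stand-in body; a real repl would evaluate the line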
Variance in Callable types
- It's OK to provide a callback that returns an int when the code expects a callback that returns a float, because an int value can always be used where a float is expected.
- We say that Callable[[], int] is subtype-of Callable[[], float], since int is subtype-of float. This means that Callable is covariant on the return type, because the subtype-of relationship of the types int and float points in the same direction as the relationship of the Callable types that use them as return types.
- But it's a type error to provide a callback that takes an int argument when a callback that handles a float is required. Callable[[int], None] is not a subtype-of Callable[[float], None]. Although int is subtype-of float, in the parameterized Callable type the relationship is reversed: Callable[[float], None] is subtype-of Callable[[int], None]. Therefore we say that Callable is contravariant on the declared parameter types (see the sketch after this list).
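A sketch illustrating both rules, along the lines of the book's update example:

from collections.abc import Callable

def update(probe: Callable[[], float], display: Callable[[float], None]) -> None:
    temperature = probe()
    display(temperature)

def probe_ok() -> int:              # returns int where float is expected: accepted (covariant return)
    return 42

def display_wrong(temperature: int) -> None:   # takes int where float is expected: rejected (contravariant params)
    print(temperature)

update(probe_ok, display_wrong)     # a type checker flags display_wrong, not probe_ok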
NoReturn
- A special type used to annotate the return type of functions that never return normally.
- Usually they exist to raise exceptions.
- e.g. sys.exit() raises SystemExit to terminate the Python process; its stub is sketched after this list.
- The __status parameter is positional-only, and it has a default value.
- Stub files don't spell out the default values; they use ... instead.
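At the time Fluent Python was written, the typeshed stub for sys declared exit roughly like this (the leading double underscore marks the parameter as positional-only in stubs written before the / syntax):

from typing import NoReturn

def exit(__status: object = ...) -> NoReturn: ...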
Annotating Positional Only and Variadic Parameters
The tag example from a previous chapter has the signature def tag(name, /, *content, class_=None, **attrs):. Here is tag fully annotated, written in several lines, a common convention for long signatures:
from typing import Optional

def tag(
    name: str,
    /,
    *content: str,              # arbitrary positional args; inside tag, content is a tuple[str, ...]
    class_: Optional[str] = None,
    **attrs: str,               # arbitrary keyword args; inside tag, attrs is a dict[str, str]
                                # (use float for dict[str, float]; mixed value types need Union or Any)
) -> str:
    ...                         # body omitted in these notes
Imperfect Typing and Strong Testing
Maintainers of large corporate codebases report that many bugs are found by static type checkers and fixed more cheaply than if the bugs were discovered only after the code was running in production. However, it's essential to note that automated testing was standard practice and widely adopted at those companies long before static typing was introduced.
Even in the contexts where they are most beneficial, static typing cannot be trusted as the ultimate arbiter of correctness. It’s not hard to find:
- False positives: tools report type errors on code that is correct.
- False negatives: tools don't report type errors on code that is incorrect.
Also, if we are forced to type check everything, we lose some of the expressive power of Python:
- Some handy features can't be statically checked; for example, argument unpacking like config(**settings).
- Advanced features like properties, descriptors, metaclasses, and metaprogramming in general are poorly supported or beyond comprehension for type checkers.
- Type checkers lag behind Python releases, rejecting or even crashing while analyzing code with new language features—for more than a year in some cases.
- Common data constraints cannot be expressed in the type system, even simple ones; for example, that a quantity must be an integer greater than 0, or that a label must be a string of 6 to 12 ASCII characters.