Jelle Zijlstra

Implementing PEP 695

PEP 695, Type Parameter Syntax, is a major change to Python’s syntax and scoping rules. This document is a discussion of how I implemented this change at runtime in PR #103764. It is not a discussion of whether the PEP is a good idea, or a deep dive into the type system concepts that the PEP supports. Instead, I will discuss how the PEP’s new syntax behaves at runtime, why it behaves that way, and how we are implementing it. To motivate that behavior, I will also briefly discuss the typing constructs for which the PEP provides syntax.

I’ll first recapitulate the new syntax added by the PEP, then go over each of the major areas that the implementation touches:

The parser: How do we make the new syntax work in a backward-compatible way and with good error messages?
Scoping: How do we make the new constructs behave in an intuitive way that is consistent with the rest of the language?
Lazy evaluation: How do we avoid the pitfalls with eager evaluation of type annotations for the new syntax?
The interpreter: How does the new syntax work at runtime?
Runtime objects: How are TypeVar, Generic, etc. represented now that they are created from C code rather than Python code?

User-visible semantics

The PEP’s motivation is to improve the syntax for generic functions, classes, and type aliases in Python.

Generic functions

A generic function, broadly, is one where there is a relation between the types of the arguments and the return type, or between multiple of the arguments.

As an example, let’s consider the Python builtin max function:

>>> max(["a", "b"])
'b'
>>> max([1, 2])
2
>>> max([1.0, 2.0])
2.0

It takes an iterable of values of the same type, let’s call it T, and returns one of them, also of type T.

Under PEP 695, we could write this signature as:

def max[T](args: Iterable[T]) -> T:
    ...

(The real signature is more complicated, but we’ll leave that to typeshed.)

Here, T is a type parameter that parameterizes the type of the max function. When a type checker sees a usage of this function, it essentially replaces the type parameter with a more concrete type, such as int or str.

Generic classes

In addition to functions, classes can be generic. Generic classes are often containers of some sort: they contain elements, and to describe them in the type system we want to say what kinds of elements they contain. For example, [1, 2, 3] and ["a", "b", "c"] are both lists, but one contains ints and the other strings. We write the two types as list[int] and list[str].

If we were to define the list class in Python, we could write it (simplified) like this using the PEP’s syntax:

class list[T]:
    def __getitem__(self, index: int, /) -> T:
        ...

    def append(self, element: T) -> None:
        ...

Type aliases

Last, type aliases can be generic. A type alias is an alternative name for a complex type. For example, if you often have functions that take either lists or sets containing some particular type, you might write:

type ListOrSet[T] = list[T] | set[T]

The PEP also introduces syntax for non-generic aliases. For example, many functions accept either a str or a path-like object representing a path. We can represent this type as:

type StrPath = str | os.PathLike[str]

`TypeVarTuple` and `ParamSpec`

So far we have only seen the simplest kind of type parameter: an unconstrained type variable (or TypeVar). The Python type system supports a few more kinds of type parameters. In addition to plain TypeVar, there are TypeVarTuple and ParamSpec, and TypeVar (but not the other two) can have bounds and constraints.

TypeVarTuple is created by writing a single asterisk (*) before the name of a type parameter (e.g., def func[*Ts](): ...). This object was introduced in PEP 646 and is intended primarily for representing the types of multidimensional arrays, such as numpy arrays. ParamSpec is created with two asterisks (**), e.g. def func[**P](): .... It was introduced by PEP 612 and is most useful for typing complex decorators. Both of these have very subtle behavior in the type system, but their runtime behavior is quite simple, so we won’t cover them in much detail here.

Bounds and constraints

Bounds and constraints exist because many generics don’t work with all types, but only with some specific set of types. For example, one of the ways the above signature for max is incomplete is that max only works on types that support comparison:

>>> max([2j, 3j])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '>' not supported between instances of 'complex' and 'complex'

If we have a type SupportsRichComparison that represents types that support operators like >, we can type max as:

def max[T: SupportsRichComparison](args: Iterable[T]) -> T:
    ...

Now type checkers will reject calls like max([2j, 3j]).

In addition to bounds, type variables support constraints, which express that a type variable can only represent some specific set of types:

def add[T: (int, str, bytes)](a: T, b: T) -> T:
    return a + b

Parser and AST

Grammar

The PEP gives the word type a specific syntactic meaning: it introduces a type alias. Before Python 3.9, this would have been very disruptive: we would have had to make type into a full-fledged keyword, and break all of the existing code that uses the type() builtin. However, Python’s new PEG-based parser has great support for soft keywords: keywords that apply only in specific contexts. Python already had a few soft keywords to support pattern matching, and this PEP adds type.

Python’s parser is written in a nice, declarative way, so the new syntax only required a few dozen lines of new code: a new case in the simple_stmt rule to represent type aliases, a new optional type parameters list on the grammar rules for classes, functions, and async functions, and new grammar production to parse aliases and type parameter lists.

Still, there are a few subtleties. First, let’s talk about the new statement rule:

simple_stmt[stmt_ty] (memo):
    | assignment
    | &"type" type_alias
    | e=star_expressions { _PyAST_Expr(e, EXTRA) }
    | &'return' return_stmt

(Many more options omitted.) The &"type" here is a performance optimization: it tells the parser to only bother looking at this option if the next token is "type". Notice that we have "type" in double quotes but 'return' in single quotes: that is because type is a soft keyword and return is a hard keyword.

The order of the rules also matters. We might want to put type aliases further down the list because they are expected to be relatively rare compared to say return statements, but the grammar doesn’t work out if we put the type alias rule below the star_expressions rule, because that rule will throw an error on type alias syntax before the parser gets a chance to try out the type_alias rule. That is why type_alias is in the second position in the list.

Improving error messages has been a major emphasis in recent Python releases. I tried to produce informative errors for some plausible mistakes:

>>> def f[*Ts: int](): pass
  File "<stdin>", line 1
    def f[*Ts: int](): pass
             ^^^^^
SyntaxError: cannot use bound with TypeVarTuple

In the future, we may add more specific error rules for other cases as we learn more about common mistakes users make with this syntax.

AST

The output of Python’s parser is an abstract syntax tree (AST). Implementing the PEP required the creation of a few new AST nodes to represent type aliases and type parameters. There was one area of disagreement here: should the name of a type alias be represented as a plain identifier in the AST, or as a full-fledged ast.Name node? The former is simpler and represents the grammar more closely, in that the name is only allowed to be a single identifier, but the latter allows the AST to retain precise location information about the name, which makes the job of some static analysis tools easier. Therefore, we chose to go with ast.Name, and the AST representation of type aliases is TypeAlias(expr name, typeparam* typeparams, expr value).

Distinguishing bounds and constraints

The definition of a TypeVar in the AST is as follows: TypeVar(identifier name, expr? bound). But as we discussed above, TypeVars can also have constraints, represented syntactically as a tuple of types.

How do we distinguish between them? One option could be to push the distinction all the way to the runtime: we evaluate the value, and if it is a tuple, we treat it as a set of constraints, and otherwise we treat it as a bound. This option would lead to a subtle incompatibility between static and runtime type checking:

types = (bytes, str)

def func[T: types](): ...

A static type checker would reject this code, because types looks like a bound but is not a valid type. However, tool that looks at func at runtime would have no good way to distinguish this invalid function from a valid declaration such as def func[T: (bytes, str)](): ....

Therefore, we want to distinguish between bounds and constraints at compile time, without evaluating the expression. One way to do that would be to push the distinction to the grammar and have separate grammar rules for bounds and constraints. I didn’t try to implement that, but I expect it would be relatively complex, as we’d have to re-implement tuple parsing just for this case.

Instead, we distinguish between the two cases later, in the compiler: if the expression in the bound field is syntactically a tuple (an instance of ast.Tuple), we treat it as a set of constraints, otherwise we treat it as a bound.

Lazy evaluation

When annotations were introduced in Python 3.0, they were evaluated eagerly in the scope in which they are defined. This is easy to understand and implement, but it caused problems when annotations became widely used for typing. Users with large codebases saw that a big chunk of their import time was spent evaluating type annotations that they’d never look at again, because they were used only for static typing. In addition, type annotations often refer to names that haven’t yet been defined when the annotation is evaluated. For example, users expect to be able to use the name of a class as a type annotation within the class itself, but when the class body executes, the name of the class is not yet defined. The workaround for this problem was to write the class name in a string: "Class".

To fix these problems, PEP 563 introduced a mechanism that turns all annotations into strings in the compiler, so that annotations become much cheaper and users don’t have to worry about whether names they use in annotations are defined at runtime. This was scheduled to become the only behavior in Python 3.10, but it was belatedly discovered that the change would break some runtime uses of type annotations, which had become unexpectedly popular. An alternative solution, PEP 649 was introduced. It essentially wraps all annotations in a function that is evaluated only on demand. This gets us the best of both worlds: we don’t pay the cost to evaluate all the annotations at import time, we can use forward references without issues, and we can still introspect type annotations at runtime. This behavior is slated to be implemented in Python 3.13.

PEP 695 introduces two syntactic contexts that are similar to annotations in that they are expected to contain types: the value of type aliases and the bounds or constraints of type variables. Initially, the PEP proposed to evaluate these eagerly, with a special-cased mechanism to support evaluation of recursive type aliases. However, when I started implementing the PEP, I felt this risked repeating the same mistakes that we made with type annotations, and those mistakes are expected to take many more years to fully fix. By this time, it was clear that some variation of PEP 649 was likely to be the long-term future for type annotations, so it made sense to use PEP 649-like semantics for these new syntactic constructs from the beginning.

For both type aliases and bounds, there are common use cases where forward references are required. Most obviously, type aliases may be recursive:

type Expr = int | Add[Expr, Expr] | Subtract[Expr, Expr]

The original version of the PEP handled this with a special case, where the name of the type alias was already defined during the evaluation of the value. However, this trick would not work if there are multiple mutually recursive type aliases:

from typing import Literal

type BinOp = Literal["+", "-"]
type LeftParen = Literal["("]
type RightParen = Literal[")"]
type SimpleExpr = int | Parenthesized
type Parenthesized = tuple[LeftParen, Expr, RightParen]
type Expr = SimpleExpr | tuple[SimpleExpr, BinOp, Expr]

This set of types represents a simple grammar supporting integers literals, binary operators (a + b and a - b), and parenthesized expressions (a - (b - c)).

Forward references have historically been very common in TypeVar bounds to represent methods that return instances of the current class. The Self special form from PEP 673 has made many of these bounds obsolete, but forward references in bounds remain occasionally useful. For example, consider interconnected Connection and Cursor classes implementing a database engine:

class Connection:
    # Allow the user to provide a subclass of Cursor to control cursor creation
    def cursor[CursorT: Cursor](self, cursor_class: type[CursorT]) -> CursorT:
        ...

class Cursor:
    @property
    def connection(self) -> Connection:
        ...

Both classes have type annotations that refer to each other, and lazy evaluation of the bound and the type annotations leaves the programmer free to order the two classes in whatever way they prefer.

The initial implementation of lazy evaluation is relatively simple: when a type alias object is created, we create a function that encapsulates the evaluation of the value of the alias. This function is stored on the type alias object, and called when the user requests the value of the type alias (e.g., by accessing the .__value__ attribute). The same idea applies to TypeVar bounds and constraints.

PEP 649 introduces more advanced mechanisms for evaluating annotations that deal with cases where some of the names may not be defined and with use cases that prefer a stringified version of the annotations. When PEP 649 is implemented, we will add similar evaluation mechanisms to PEP 695’s lazily evaluated values, so that users can treat them in the same way as they treat annotations.

Scoping

Requirements

The new scoping rules are the most subtle aspect of the runtime implementation of the PEP. Indeed, the PEP says that “The lexical scope introduced by the new type parameter syntax is unlike traditional scopes”. The scoping semantics are motivated by several important use cases:

Multiple objects in the same module must be able to use the same type parameter names without interfering with each other. If we define both def max[T](...): ... and def min[T](...): ... in one module, the two type parameters T should be independent of each other and scoped only to the two function declarations. This helps users understand the semantics that type checkers use for type parameters.
Type parameters must be usable within the scope of the generic. A function that is generic over T must be able to use T not only in its type annotations, but also in annotations for its local variables or for nested functions. A class that is generic over T may use T in its list of bases, in the annotations for class variables and methods, and for local variables and nested functions within its methods.
As discussed above, the values of type aliases and the bounds/constraints of type variables should be lazily evaluated, but evaluating them later should behave as expected.
Type parameters must be usable at runtime: code within the lexical scope of a type parameter must be able to use the parameter’s name and get a usable object back. This is important for tools that access types at runtime.

I wrote a new section of the PEP, Scoping Behavior, that describes the intended scoping semantics in detail. It was informed heavily by the implementation strategy I ended up adopting, which is what we’ll discuss next.

Implementation strategies

While implementing the PEP, I went through several possible implementation strategies before I landed on one that worked well:

The PEP itself suggests treating type parameters as an “overlay” over the scope in which they are defined, which temporarily adds some new names that are only visible within the “overlay” scope. I had trouble getting this to work because to be visible in inner scopes, the names needed to be cell variables, and the overlay scope didn’t provide an obvious place to put the cell variables.
So perhaps instead we should put the type parameters in the enclosing scope, but mangle their names so that they are unique. I tried this, but because of the design of Python’s symbol table implementation, it proved difficult to get the mangling right in all nested scopes. There are also edge cases where I’m not even sure this could be implemented correctly, such as if a name is conditionally defined in a class namespace.

Lambda lifting

I eventually landed on a technique where we use a special, immediately evaluated function to define the type parameters. I later learned this technique is called “lambda lifting”. Essentially, we desugar:

def func[T](arg: T): ...

Into:

def __generic_parameters_of_func():
    T = TypeVar("T")
    def func(arg: T): ...
    return func
func = __generic_parameters_of_func()

This is not an exact equivalence: there are a few differences between the behavior of these functions and normal functions. I am naming the new function-like scopes “annotation scopes”, because in the future (with PEP 649) they will also be used for annotations. The behavior of annotation scopes is described in the PEP and also outlined below.

The implementation of the PEP uses annotation scopes in a number of contexts:

The type parameters of generic functions. Here, the annotations of the function (but not the defaults or the decorators) are evaluated inside the annotation scope.
The type parameters of generic classes. The class’s bases and keyword arguments to the metaclass are evaluated inside the annotation scope, but the decorators are not.
The type parameters of generic type aliases.
The lazily evaluated values: TypeVar bounds and constraints and type alias values. Unlike the other annotation scopes, these are not immediately called upon creation.

For the most part, this gives us the semantics we need without the introduction of complex new concepts into the symtable and compiler: the semantics for function scopes and nonlocals already work in the way that we want scopes for type parameters to work.

However, there was one big wrinkle and a few smaller ones. Let’s start with the big one: class scopes.

Class scopes

Class scopes are special in Python: names defined in a class scope are not visible in other scopes nested within them. For example, this fails:

>>> class X:
...     T = int
...     def f(self): print(T)
...
>>> X().f()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in f
NameError: name 'T' is not defined. Did you mean: 'self.T'?

This creates a problem for our implementation, because we rely so much on hidden functions. For example, this code would fail:

class Outer:
    class Nested:
        pass

    type Alias = Nested  # NameError when evaluated

    def meth1[T: Nested](self): pass  # NameError when the bound of T is evaluated

    def meth2[T](self, arg: Nested): pass  # NameError when the annotations are evaluated

It’s a bit of an edge case, but people really do use nested classes and type aliases defined at class scope, so we have to make the above code work and resolve the name Nested to the nested class. Therefore, we have to make annotation scopes work a little differently from normal functions: if an annotation scope is immediately within a class, it needs to have access to names defined within the class scope. An analogous problem occurs with lazily evaluated annotations under PEP 649, so Larry Hastings had already thought about this problem, and the eventual solution owes a lot to his ideas and Carl Meyer’s.

Implementing this change requires modifications in three places:

In the symbol table (symtable.c), where we track the names defined in each scope and figure out whether each name is a global, a local, or something else, we mark the scopes that need this special treatment with a flag, ste_type_params_in_class.
In the compiler (compile.c), we emit special opcodes if we encounter free variables in ste_type_params_in_class scopes. These opcodes first check the class dictionary for a name, and if the name is not present there, they check the globals.
In the bytecode interpreter (ceval.c), we add implementations of the new opcodes.

But wait! We said that we would look in the class dictionary first. How do we get to that class dictionary? There is no existing way to get to it from an arbitrary function. Larry’s initial idea was to add a new field, __locals__, to the function. Then when we create a function that needs access to this extra dict, we do the equivalent of func.__locals__ = locals(), and inside the implementation of the new opcodes we look at func.__locals__ to resolve names. This works and I implemented it, but we realized there was a problem. What should the following code do?

class Cls:
    T = "before"
    type Alias = T
Cls.T = "after"
print(Cls.Alias.__value__)

Recall that the __value__ of a type alias is lazily evaluated, so when we access Cls.Alias.__value__, we resolve the name T for the first time. You might expect, then, that this will print “after”: that’s what’s in the class namespace at the time the alias is evaluated.

In fact, this would have printed “before” in this initial implementation, for a subtle reason: the class namespace that exists while the class body is evaluated is not the same as the namespace that exists when the class is fully created. The default metaclass, type, creates a copy of the class namespace that it then uses as the __dict__ for the newly created class. I believe this was done as a result of the addition of the __prepare__ method (PEP 3115), which allows the class namespace to be some sort of non-dict mapping. Whatever the origins of this behavior, we decided we had to work around it to make class namespaces behave in a consistent manner.

Happily, Larry and Carl also suggested a solution, and even more happily, this solution turned out to be simpler than the previous one! Instead of an attribute on the function object, we use a new cell variable, __classdict__. When the class body executes, we set this cell variable to the current class namespace (in pseudocode: __classdict__ = locals()). Inside the annotation scopes, we make name accesses use the mapping from the __classdict__ cell. Roughly speaking, we implement loads as __classdict__.get(NAME, globals().get(NAME)). To point the __classdict__ to the namespace of the real class after it is created, we have code in the implementation of type that updates the value in the __classdict__ cell.

If you’re familiar with CPython internals, you might have noticed that this is similar to how the __class__ cell works, which is what powers zero-argument super(). Like __classdict__, this cell is created by the compiler when it encounters certain function scopes inside a class (in this case, scopes that use the name super). However, __class__ holds a reference to the class itself, not to the class’s dict, and it is set only when the class is fully evaluated, not while the class namespace is executing. Therefore, we were not quite able to reuse the __class__ cell, but a lot of the implementation is similar between the two cells.

A few more edge cases came up later. One concerned names that are bound in the class body, but used before they are bound:

x = "global"
def outer():
    x = "nonlocal"
    class Inner:
        print(x)
        x = "class"
        print(x)
outer()

This prints “global”, then “class”: if the name is ever bound in the class namespace, we ignore any bindings in enclosing scopes and go straight to the globals. I find this behavior odd enough that I am tempted to call it a bug, but it is probably too late to change it for existing classes. My initial implementation of annotation scopes would instead have looked in the class scope and then in the outer cell variable, so I had to make a change to the symtable to make annotation scopes follow the same rule as existing classes.

I also belatedly realized that it is legal to use global and nonlocal statements in class bodies. I don’t know if anyone has ever actually used these statements in a class body outside of testing code, but I had to support them to maintain the principle that name resolution within an annotation scope is exactly as if the code was directly within the enclosing class body. Fortunately, this was also fixable with some relatively small changes to the compiler.

Below, we’ll talk more about how the class scope behavior is implemented at runtime in the interpreter.

Other issues

There are a few more subtleties to deal with. First, what should the following code do?

type T = (yield 3)

def func[T: (x := 4)](arg: (y := 5)):
    pass

async def async_func():
    def inner[T: await asyncio.sleep(1)]():
        pass

Nobody wants to have to think about that, so we disallow yield, yield from, await, and := (the walrus operator) in all annotation scopes.

That did lead to some implementation complexity, however. We want to give good error messages for all these cases, without using an obscure term like “annotation scope” (or “hidden function”, the term I was using at the time):

>>> type T = (yield 3)
  File "<stdin>", line 1
SyntaxError: 'yield expression' can not be used within a type alias
>>> def func[T: (x:=3)](): pass
...
  File "<stdin>", line 1
SyntaxError: 'named expression' can not be used within a TypeVar bound

To do that, I had to create separate scope types for type alias values, generic parameters, and TypeVar bounds, so that the error messages can distinguish between these three cases. Apart from the error messages, all these scopes behave identically.

After the initial implementation I realized that comprehensions could also be abused to inject names into the annotation scope, because uses of the walrus operator inside a comprehension bind a name in the enclosing scope. Therefore, I also had to disallow usages of the walrus operator within comprehensions within annotation scopes.

Second, if def func[T](): pass now actually creates func within a hidden nested function, its qualified name would now become something like <generic parameters of func>.<locals>.func. We don’t want that, so we adjust the compiler mechanism to compute the qualname so that it produces simply func.

Last, the PEP specifies that the nonlocal statement cannot be used to rebind type parameters. In other words, this is a syntax error:

def func[T]():
    nonlocal T
    T = "not a TypeVar any more"

I personally might have preferred to allow this at runtime: while it doesn’t make any sense to rebind a type parameter using nonlocal, it also doesn’t break any invariants in the runtime, and in Tim Peters’s words, “special cases aren’t special enough to break the rules”. Nevertheless, this rule was in the PEP as accepted, so I set out to implement it. This required changes in symtable.c, where we process scoping information. My initial implementation had a few problems, but fortunately Carl Meyer suggested a better approach.

Interpreter

Python is a compiled language, at least when run through CPython. When we run Python code, we are really running bytecode that is interpreted by an interpreter. This interpreter is a simple stack machine, executing a series of instructions like “put a constant on top of the stack” or “add together the two objects at the top of the stack”. Implementing PEP 695 required a number of new interpreter instructions, or opcodes.

New opcodes

There are three new top-level instructions, all of which have to do with class scopes:

LOAD_LOCALS, which loads the current locals() onto the stack
LOAD_CLASSDICT_OR_GLOBAL, which loads a name from either a local dict or the globals
LOAD_CLASSDICT_OR_DEREF, which loads a name from either a local dict or a cell variable

The two new LOAD_* opcodes are analogous to the existing LOAD_NAME and LOAD_CLASSDEREF opcodes, but while the existing opcodes look in the frame’s locals directly, the new opcodes look in a mapping that is present on the stack. In practice, the current implementation always loads the mapping from the __classdict__ cell.

This example shows why we need the two different opcodes:

>>> dis.dis("""
... class Cls[T]:
...     type Alias = T | int
... """)
<snip>
Disassembly of <code object Alias at 0x101699c60, file "<dis>", line 3>:
              0 COPY_FREE_VARS           2

  3           2 RESUME                   0
              4 LOAD_DEREF               1 (__classdict__)
              6 LOAD_CLASSDICT_OR_DEREF     0 (T)
              8 LOAD_DEREF               1 (__classdict__)
             10 LOAD_CLASSDICT_OR_GLOBAL     0 (int)
             12 BINARY_OP                7 (|)
             16 RETURN_VALUE

When Alias is evaluated, we’ll expect to resolve T from a cell variable (created by the annotation scope for the class’s type parameters), so we use LOAD_CLASSDICT_OR_DEREF, and int from the builtins, so we use LOAD_CLASSDICT_OR_GLOBAL. But in both cases, we need to account for the possibility that the class namespace was modified to inject a value for T or int before we evaluate the value of the type alias; that is why we need to first look at __classdict__.

Carl Meyer observed that LOAD_NAME could be replaced with LOAD_LOCALS; LOAD_CLASSDICT_OR_GLOBAL, and LOAD_CLASSDEREF with LOAD_LOCALS; LOAD_CLASSDICT_OR_DEREF. We didn’t make this change in Python 3.12 for stability reasons, but we can think about this again in the future. The case is stronger for LOAD_CLASSDEREF, which is only used in relatively rare circumstances (classes nested within functions), than for LOAD_NAME, which is used for module-level name lookups.

Intrinsics

In addition to the top-level instructions, there are eight new intrinsics. Intrinsics are a new concept in the Python 3.12 interpreter. There are two general instructions, CALL_INTRINSIC_1 and CALL_INTRINSIC_2, that call a function with either one or two arguments. The functions are determined by the argument to the opcode. They are used for certain rare operations that need to be executed by the interpreter, but that aren’t worth the cost of a separate top-level instruction. Every new top-level instruction has some cost, because it increases the size of the main interpreter loop, which is a huge function of very hot code that is sensitive to slight shifts in the behavior of the C compiler. For example, the unary + operator is implemented as an intrinsic, because nobody ever uses it:

>>> dis.dis('+a')
  0           0 RESUME                   0

  1           2 LOAD_NAME                0 (a)
              4 CALL_INTRINSIC_1         5 (INTRINSIC_UNARY_POSITIVE)
              6 RETURN_VALUE

Intrinsics are a good fit for many PEP 695 operations, because in the vast majority of cases, type aliases and type parameters will only be created once, when a module is imported, so we don’t gain too much from speeding it up. We add the following one-argument intrinsics:

INTRINSIC_TYPEVAR (create a TypeVar without bounds or constraints; the argument is the name)
INTRINSIC_PARAMSPEC (create a ParamSpec)
INTRINSIC_TYPEVARTUPLE (create a TypeVarTuple)
INTRINSIC_SUBSCRIPT_GENERIC (return typing.Generic[params], where params is a tuple of type parameters)
INTRINSIC_TYPEALIAS (create a TypeAlias; the argument is a tuple of the alias’s name, type parameters, and lazily evaluated value)

As well as these two-argument intrinsics:

INTRINSIC_TYPEVAR_WITH_BOUND (the arguments are the TypeVar name and the lazily evaluated bound)
INTRINSIC_TYPEVAR_WITH_CONSTRAINTS (same, but for constraints)
INTRINSIC_SET_FUNCTION_TYPE_PARAMS (sets the __type_params__ attribute on a function; the arguments are the function and the type parameters tuple)

Bytecode examples

This section shows the disassembly for a few pieces of sample code using the new syntax with explanatory comments.

A non-generic type alias:

>>> dis.dis("type T = int")
  0           0 RESUME                   0

  1           2 LOAD_CONST               0 ('T')
              4 LOAD_CONST               1 (None)  # the type parameters; there aren't any
              6 LOAD_CONST               2 (<code object T at 0x1016984f0, file "<dis>", line 1>)
              8 MAKE_FUNCTION            0  # create the lazy evaluation function for the type alias
             10 BUILD_TUPLE              3  # INTRINSIC_TYPEALIAS takes a 3-tuple (name, type params, value)
             12 CALL_INTRINSIC_1        11 (INTRINSIC_TYPEALIAS)
             14 STORE_NAME               0 (T)
             16 RETURN_CONST             1 (None)

# Evaluation function for the type alias value
Disassembly of <code object T at 0x1016984f0, file "<dis>", line 1>:
  1           0 RESUME                   0
              2 LOAD_GLOBAL              0 (int)
             12 RETURN_VALUE

A generic type alias:

>>> dis.dis("type Alias[T] = list[T]")
  0           0 RESUME                   0

# Code executed in the global scope
# - create the function for the annotation scope
# - call it
# - store the result in the name "Alias"
  1           2 PUSH_NULL
              4 LOAD_CONST               0 (<code object <generic parameters of Alias> at 0x101666640, file "<dis>", line 1>)
              6 MAKE_FUNCTION            0
              8 CALL                     0
             16 STORE_NAME               0 (Alias)
             18 RETURN_CONST             1 (None)

# Annotation scope for executing the type parameters
Disassembly of <code object <generic parameters of Alias> at 0x101666640, file "<dis>", line 1>:
              0 MAKE_CELL                0 (T)

  1           2 RESUME                   0
            # Argument 1 to TypeAlias: the name
              4 LOAD_CONST               0 ('Alias')
            # Argument 2 to TypeAlias: the type parameters
              6 LOAD_CONST               1 ('T')
              8 CALL_INTRINSIC_1         7 (INTRINSIC_TYPEVAR)
             10 COPY                     1  # One copy for storing a variable, another for the type_params tuple
            # Store "T" in a cell variable so that the inner scope for the type alias value can access it
             12 STORE_DEREF              0 (T)
             14 BUILD_TUPLE              1
            # Argument 3 to TypeAlias: the evaluation function for the value
             16 LOAD_CLOSURE             0 (T)  # Bookkeeping so that the function has access to the closure variable
             18 BUILD_TUPLE              1
             20 LOAD_CONST               2 (<code object Alias at 0x1016984f0, file "<dis>", line 1>)
             22 MAKE_FUNCTION            8 (closure)
            # Now we're ready to create the type alias
             24 BUILD_TUPLE              3
             26 CALL_INTRINSIC_1        11 (INTRINSIC_TYPEALIAS)
             28 RETURN_VALUE

# Evaluation function for the type alias value
Disassembly of <code object Alias at 0x1016984f0, file "<dis>", line 1>:
              0 COPY_FREE_VARS           1

  1           2 RESUME                   0
              4 LOAD_GLOBAL              0 (list)
             14 LOAD_DEREF               0 (T)
             16 BINARY_SUBSCR
             20 RETURN_VALUE

A generic function:

>>> dis.dis("def func[T](a: T, b: int = 1): pass")
  0           0 RESUME                   0

  1           2 PUSH_NULL
            # The defaults for the function, which must be executed outside the annotation scope
              4 LOAD_CONST               3 ((1,))
              6 LOAD_CONST               1 (<code object <generic parameters of func> at 0x101628480, file "<dis>", line 1>)
              8 MAKE_FUNCTION            0
             10 SWAP                     2
            # One argument: We pass the defaults tuple as an argument to the annotation scope's function
             12 CALL                     1
             20 STORE_NAME               0 (func)
             22 RETURN_CONST             2 (None)

Disassembly of <code object <generic parameters of func> at 0x101628480, file "<dis>", line 1>:
  1           0 RESUME                   0
              2 LOAD_CONST               0 ('T')
              4 CALL_INTRINSIC_1         7 (INTRINSIC_TYPEVAR)
              6 COPY                     1
            # Unlike in the previous example, we use STORE_FAST here because the TypeVar is not
            # used in any inner scope.
              8 STORE_FAST               1 (T)
             10 BUILD_TUPLE              1
            # The defaults tuple was passed an argument to this function, so here we load it from the locals.
             12 LOAD_FAST                0 (.defaults)
            # Build up the __annotations__ tuple. (It gets turned into a dict only later.)
             14 LOAD_CONST               1 ('a')
             16 LOAD_FAST                1 (T)
             18 LOAD_CONST               2 ('b')
             20 LOAD_GLOBAL              0 (int)
             30 BUILD_TUPLE              4
             32 LOAD_CONST               3 (<code object func at 0x1016b6880, file "<dis>", line 1>)
            # Create the function, consuming the defaults and annotations tuples off the stack.
             34 MAKE_FUNCTION            5 (defaults, annotations)
             36 SWAP                     2
             38 CALL_INTRINSIC_2         4 (INTRINSIC_SET_FUNCTION_TYPE_PARAMS)
             40 RETURN_VALUE

# The actual function body, which does nothing.
Disassembly of <code object func at 0x1016b6880, file "<dis>", line 1>:
  1           0 RESUME                   0
              2 RETURN_CONST             0 (None)

A generic class:

>>> dis.dis("class Cls[T](Base): pass")
  0           0 RESUME                   0

  1           2 PUSH_NULL
              4 LOAD_CONST               0 (<code object <generic parameters of Cls> at 0x10175e7a0, file "<dis>", line 1>)
              6 MAKE_FUNCTION            0
              8 CALL                     0
             16 STORE_NAME               0 (Cls)
             18 RETURN_CONST             1 (None)

Disassembly of <code object <generic parameters of Cls> at 0x10175e7a0, file "<dis>", line 1>:
              0 MAKE_CELL                2 (.type_params)

  1           2 RESUME                   0
              4 LOAD_CONST               0 ('T')
              6 CALL_INTRINSIC_1         7 (INTRINSIC_TYPEVAR)
              8 COPY                     1
             10 STORE_FAST               0 (T)
             12 BUILD_TUPLE              1
            # We store the tuple (T,) in a cell variable here because we'll need it again
            # in the class body.
             14 STORE_DEREF              2 (.type_params)
             16 PUSH_NULL
            # Creating a class is done by calling __build_class__. Here, we pass four arguments:
            # the class body code, the name, the "Base" base class, and Generic[T].
             18 LOAD_BUILD_CLASS
            # Argument 1: Class body
             20 LOAD_CLOSURE             2 (.type_params)
             22 BUILD_TUPLE              1
             24 LOAD_CONST               1 (<code object Cls at 0x1016984f0, file "<dis>", line 1>)
             26 MAKE_FUNCTION            8 (closure)
            # Argument 2: Class name
             28 LOAD_CONST               2 ('Cls')
            # Argument 3 and 4: bases. We first create Generic[T], then store it temporarily
            # in a local, then move it to the right place in the bases list. This makes for
            # the most straightforward implementation in the compiler, but the generated code
            # could be better.
             30 LOAD_DEREF               2 (.type_params)
             32 CALL_INTRINSIC_1        10 (INTRINSIC_SUBSCRIPT_GENERIC)
             34 STORE_FAST               1 (.generic_base)
             36 LOAD_GLOBAL              0 (Base)
             46 LOAD_FAST                1 (.generic_base)
            # Now we can finally call __build_class__ and create the class.
             48 CALL                     4
             56 RETURN_VALUE

# Code object representing the execution of the class body. Empty classes
# aren't really empty!
Disassembly of <code object Cls at 0x1016984f0, file "<dis>", line 1>:
              0 COPY_FREE_VARS           1

  1           2 RESUME                   0
              4 LOAD_NAME                0 (__name__)
              6 STORE_NAME               1 (__module__)
              8 LOAD_CONST               0 ('Cls')
             10 STORE_NAME               2 (__qualname__)
            # This holds (T,), as created in the annotation scope for the generic parameters.
             12 LOAD_CLASSDEREF          0 (.type_params)
            # Makes it so we'll be able to use Cls.__type_params__ when the class is created.
             14 STORE_NAME               3 (__type_params__)
             16 RETURN_CONST             1 (None)

Runtime objects

Now that we have syntax to create type parameters and generics, we need to make these objects accessible from the core interpreter, which generally means implementing them in C. Concretely, that means:

Type aliases are instances of a new typing.TypeAliasType, also implemented in C.
Generic classes must inherit from typing.Generic, so it also needs a C implementation.
typing.TypeVar, ParamSpec, and TypeVarTuple need to be implemented in C.

All of these are implemented in a new file Objects/typevarobject.c and exposed in the existing _typing extension module. Previously, this module only held an accelerator for NewType.__call__.

TypeAliasType

TypeAliasType is a new type, which made my life easier: there is no backward compatibility to consider. For the most part, the implementation is straightforward. (At least, to the extent any C code is straightforward: getting the interactions with the garbage collector right is always is a little tricky.)

We did find one tricky question: How should TypeAliasType.__repr__ behave? Guido van Rossum recommended having the repr() include the value of the type alias:

>>> type T = int
>>> T
<type alias T: int>

That looks nice enough, but I realized that it can quickly become unreadable with complex aliases:

>>> from typing import Literal
>>> type BinOp = Literal["+", "-"]
>>> type LeftParen = Literal["("]
>>> type RightParen = Literal[")"]
>>> type SimpleExpr = int | Parenthesized
>>> type Parenthesized = tuple[LeftParen, Expr, RightParen]
>>> type Expr = SimpleExpr | tuple[SimpleExpr, BinOp, Expr]
>>>
>>> Expr
<type alias Expr: <type alias SimpleExpr: int | <type alias Parenthesized: tuple[<type alias LeftParen: typing.Literal['(']>, ..., <type alias RightParen: typing.Literal[')']>]>> | tuple[<type alias SimpleExpr: int | <type alias Parenthesized: tuple[<type alias LeftParen: typing.Literal['(']>, ..., <type alias RightParen: typing.Literal[')']>]>>, <type alias BinOp: typing.Literal['+', '-']>, ...]>

Therefore, I decided to go back to showing only the alias name in repr(), similar to the behavior for TypeVar:

>>> Expr
Expr
>>> Expr.__value__
SimpleExpr | tuple[SimpleExpr, BinOp, Expr]

This should make type aliases more readable when displayed in places like Sphinx documentation, but it looks a bit odd when a single alias is displayed in isolation. Another option could be to follow what classes do:

>>> Expr
<type alias 'Expr'>
>>> Expr.__value__
<type alias 'SimpleExpr'> | tuple[<type alias 'SimpleExpr'>, <type alias 'BinOp'>, <type alias 'Expr'>]

I went with the simpler solution of just returning the name as the repr() for now, but I would be open to changing it if the community consensus is that a different repr() is more useful.

Initially, TypeAliasType provided no Python-accessible constructor, so it could only be created through the type statement. After users reported that it might be useful to backport TypeAliasType to older versions, I decided to add a way to construct TypeAliasType from Python, so that users who want it can construct type aliases programmatically.

Generic

Generic classes were previously created by subclassing typing.Generic, and for backward compatibility we wanted it to stay that way. That meant we had to reimplement Generic in C, but that turned out to be problematic: the implementation of Generic is intricately connected to the rest of typing.py. To avoid referring to Python code from within it, we’d have to rewrite not only Generic in C, but also Protocol, typing._GenericAlias (not to be confused with types.GenericAlias), typing._TypedDictMeta, and virtually every other special form. Not only would that have been a lot of work, it would also have made future maintenance of the typing module a lot harder.

Therefore, we decided to write only the skeleton of the class in C, and delegate the rest of the work to Python. For Generic, that meant the __class_getitem__ and __init_subclass__ methods. Therefore, creating a generic class still involves some Python code in typing.py.

C types in the core are usually written as static types, meaning that there is a static C struct representing the type. It proved difficult to use a static type and support multiple inheritance, which is essential for Generic, so I switched to a heap type, which works more like a Python class.

TypeVar, ParamSpec, and TypeVarTuple

I also reimplemented TypeVar, ParamSpec, and TypeVarTuple in C. This was mostly straightforward if tedious, and as with Generic, in a few cases I had to delegate to Python code. The Python code mostly involves parameter substitution, which happens when a generic is subscripted and we have to match up the type parameters with the arguments.

A few interesting issues came up:

A test turned out to rely on setting an attribute blah on a TypeVar, apparently to test some behavior related to reference cycles. To accommodate potential user code that sets attributes on TypeVars, I gave them a __dict__ with the Py_TPFLAGS_MANAGED_DICT flag. This also required making the type into a heap type, or at least it was a lot easier to get working after I made it a heap type.
I discovered that ParamSpec, but not TypeVarTuple, supports the __or__ operator, though in the type system, it makes sense for neither of them. For the sake of compatibility, I preserved that behavior.
At first I used _Py_union_type_or to implement TypeVar.__or__, the same function used by types and builtin generic aliases. However, this failed a test that did T | "ForwardRef": the builtin union operator, for whatever reason, doesn’t support string forward references, while the ones implemented in typing.py do. Therefore, I made TypeVar.__or__ delegate to the Python version too, so now using | on a TypeVar still produces a typing.Union, not a types.Union.

Future work

PEP 695 was accepted not long before the deadline for new features in Python 3.12, so I had limited time to polish the implementation. Therefore, I focused on getting a robust, working implementation, leaving possible performance improvements for later.

Here are some of the possible changes we can make in 3.13:

PEP 649 is expected to be implemented in 3.13. This will provide a more complete framework for lazy evaluation, including different formats to evaluate annotations for different needs. We should expand TypeAlias and TypeVar to provide similar APIs.
PEP 709 was also accepted late in the Python 3.12 release cycle. It implements an innovative mechanism for inlining comprehensions into their containing scopes without overwriting that scope’s variables. We could perhaps use the same technique to inline the hidden functions used for evaluating type parameters.
PEP 649 is expected to provide lazy evaluation without actually creating function objects for every annotation at import time, which is relatively slow. Instead, we’ll store the raw code object with some other necessary information, and create the function only on demand. This technique could also be used for PEP 695’s lazily evaluated values.
Creating a generic class now calls into several typing.py functions implemented in Python, which would likely make it quite slow. We should look into reimplementing a few more pieces of typing.py in C so that generic classes can be created without a need for calling into Python code. Perhaps typing._GenericAlias could be rewritten in C and merged with types.GenericAlias.
I hope that PEP 696 will eventually be accepted, which will require a few more syntax changes and new lazily evaluated scopes, very similar to the existing support for TypeVar bounds.
More improvements could be made to the bytecode. For example, we could replace the rare LOAD_CLASSDEREF opcode, freeing up a bit of space in the interpreter. In some cases, we could generate more efficient bytecode by removing redundant instructions.

Acknowledgments

Eric Traut wrote PEP 695 and implemented an initial prototype. He distilled all the discussion in the typing community into a concrete proposal. He also contributed the key insight that we don’t need to be able to specify TypeVar variance with this new syntax, because type checkers can infer variance. As a result, I haven’t had to mention “variance” at all in this document. Guido van Rossum sponsored the PEP and guided the discussion and implementation. Larry Hastings contributed valuable insights into scoping from his PEP 649 work. Carl Meyer reviewed the whole implementation and suggested many improvements. Many others contributed to the discussion that led to the PEP, reviewed parts of the implementation, or served as sounding boards at PyCon while I was thinking about how to implement the PEP.