Jelle Zijlstra

Improving the ast module in Python 3.13 and beyond

The Python ast module is a useful tool for any application that needs to interact with Python code programmatically, such as a linter or formatter. However, its implementation is not always intuitive and ergonomic. Starting with Python 3.13, I pushed for a few changes to make the module easier to use. This article describes these behaviors and what we’re doing to improve the situation.

Constructing AST nodes

The most common way to get access to an AST node is by parsing some Python code with ast.parse. But it is also possible to construct AST nodes manually:

>>> ast.Call(func=ast.Name(id="print", ctx=ast.Load()), args=[ast.Constant(value="Hello World")], keywords=[])
<ast.Call object at 0x104a0bf10>
>>> ast.unparse(_)
"print('Hello World')"

Old behavior

But what happens if you omit an argument, like that empty list of keywords? It seems to work fine:

>>> ast.Call(func=ast.Name(id="print", ctx=ast.Load()), args=[ast.Constant(value="Hello World")])
<ast.Call object at 0x104a1d390>

Until you try to do something with this AST node:

>>> ast.unparse(_)
Traceback (most recent call last):
  ...
  File ".../lib/python3.12/ast.py", line 1565, in visit_Call
    for e in node.keywords:
             ^^^^^^^^^^^^^
AttributeError: 'Call' object has no attribute 'keywords'

Maybe you know about this problem and you tried to pass all arguments, but you didn’t get the spelling quite right:

>>> ast.Call(func=ast.Name(id="print", ctx=ast.Load()), args=[ast.Constant(value="Hello World")], key_words=[])
<ast.Call object at 0x104a1d510>

Once again, this seems to work, but if you call ast.unparse(), you’ll once again get an AttributeError for the node.keywords attribute. However, Python dutifully created our misspelled attribute:

>>> _.key_words
[]

Why change this?

The old behavior is very permissive: you can pass virtually anything to the constructor without getting an exception. That might seem nice: no stack traces, everything works. But in practice, this sort of behavior mostly defers errors and makes bugs appear in places where they are harder to debug.

I have run into these problems myself working on tools like pyanalyze, ast_decompiler, and flake8-pyi, so I am motivated to fix them.

The new behavior

In Python 3.13, I made a change to make the AST constructor stricter, so that problems are caught earlier. For compatibility reasons, we still allow omitting required parameters and adding additional keyword arguments, but we now raise a DeprecationWarning, and in Python 3.15 we expect to turn these conditions into errors.

Here are some examples:

>>> import ast
>>> ast.Name()
<python-input-1>:1: DeprecationWarning: Name.__init__ missing 1 required positional argument: 'id'. This will become an error in Python 3.15.
  ast.Name()
<ast.Name object at 0x101660740>
>>> ast.Name(id="print", random_attribute="x")
<python-input-2>:1: DeprecationWarning: Name.__init__ got an unexpected keyword argument 'random_attribute'. Support for arbitrary keyword arguments is deprecated and will be removed in Python 3.15.
  ast.Name(id="print", random_attribute="x")
<ast.Name object at 0x10167f210>

But some AST nodes take a lot of arguments, and it can be tedious to pass every argument. To make that easier, we added default values in some common cases:

>>> print(ast.Return().value)
None
>>> print(ast.Tuple().elts)
[]
>>> print(ast.Name(id="hi").ctx)
<ast.Load object at 0x1014ab6c0>

ast.dump()

The ast.dump() function is an indispensable tool for checking out the structure of an AST node, but its output can be overly verbose:

>>> ast.dump(ast.parse("class X: pass"))
"Module(body=[ClassDef(name='X', bases=[], keywords=[], body=[Pass()], decorator_list=[], type_params=[])], type_ignores=[])"

This adds five different empty lists for language features that this snippet doesn’t use. These add up to make the AST much harder to understand.

To simplify this, I proposed and Nikita Sobolev implemented a change to omit empty fields. In Python 3.13, you will now see this:

>>> ast.dump(ast.parse("class X: pass"))
"Module(body=[ClassDef(name='X', body=[Pass()])])"

Future changes

There is more we can do to make the ast module even more pleasant to work with.

repr()

Currently, AST objects do not have a custom repr(), so they inherit Python’s default:

>>> ast.parse("x")
<ast.Module object at 0x103ddb030>
>>> ast.parse("x = 1")
<ast.Module object at 0x103be6c20>
>>> ast.parse("class X: pass")
<ast.Module object at 0x103a92810>

Wouldn’t it be much nicer if the repr() looked a bit more like the output of ast.dump() and told us something more useful than the memory address?

This change is being tracked in issue 116022, and I hope we’ll get it done for Python 3.14.

Comparing AST nodes

Just like AST nodes don’t have a useful repr(), they don’t implement any comparison functionality:

>>> ast.Load() == ast.Load()
False

This is a longstanding feature request that I think we should fix somehow, either by providing an __eq__ method or by adding standalone functionality for comparing AST nodes.