ref: 68f15d65942da7e30cf9cbae7362f778fe5da1d2
dir: /sys/src/cmd/python/Doc/lib/libast.tex/
% XXX Label can't be _ast?
% XXX Where should this section/chapter go?
\chapter{Abstract Syntax Trees\label{ast}}

\sectionauthor{Martin v. L\"owis}{[email protected]}
\versionadded{2.5}

The \code{_ast} module helps Python applications process trees of the
Python abstract syntax grammar. The Python compiler currently provides
read-only access to such trees: applications can create a tree for a
given piece of Python source code, but generating byte code from a
(potentially modified) tree is not supported. The abstract syntax itself
might change with each Python release; this module helps to find out
programmatically what the current grammar looks like.

An abstract syntax tree can be generated by passing
\code{_ast.PyCF_ONLY_AST} as a flag to the \function{compile} builtin
function. The result will be a tree of objects whose classes all inherit
from \code{_ast.AST}.

The actual classes are derived from the \code{Parser/Python.asdl} file,
which is reproduced below. There is one class defined for each left-hand
side symbol in the abstract grammar (for example, \code{_ast.stmt} or
\code{_ast.expr}). In addition, there is one class defined for each
constructor on the right-hand side; these classes inherit from the
classes for the left-hand side trees. For example, \code{_ast.BinOp}
inherits from \code{_ast.expr}. For production rules with alternatives
(aka ``sums''), the left-hand side class is abstract: only instances of
specific constructor nodes are ever created.

Each concrete class has an attribute \code{_fields} which gives the
names of all child nodes.

Each instance of a concrete class has one attribute for each child node,
of the type defined in the grammar. For example, \code{_ast.BinOp}
instances have an attribute \code{left} of type \code{_ast.expr}.
Instances of \code{_ast.expr} and \code{_ast.stmt} subclasses also have
\code{lineno} and \code{col_offset} attributes.
The \code{lineno} is the line number of the source text (1-indexed, so
the first line is line 1) and the \code{col_offset} is the UTF-8 byte
offset of the first token that generated the node. The UTF-8 offset is
recorded because the parser uses UTF-8 internally.

If a child attribute is marked as optional in the grammar (using a
question mark), its value might be \code{None}. If an attribute can have
zero or more values (marked with an asterisk), the values are
represented as a Python list.

\section{Abstract Grammar}

The module defines a string constant \code{__version__} which is the
decimal Subversion revision number of the file shown below.

The abstract grammar is currently defined as follows:

\verbatiminput{../../Parser/Python.asdl}
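
As a rough illustration of the position attributes and of the \code{?}
and \code{*} markers, the sketch below inspects a few small trees. It
assumes that the grammar in use declares \code{Return} with an optional
\code{value}, \code{Assign} with a repeated \code{targets} field and
\code{Call} with a repeated \code{args} field, and that the interpreter
accepts non-ASCII characters in a source string passed to
\function{compile}. The helper \code{parse} is hypothetical and not part
of \code{_ast}.

```python
import _ast

def parse(source):
    # Hypothetical helper (not part of _ast): build a tree for a module.
    return compile(source, "<example>", "exec", _ast.PyCF_ONLY_AST)

# Optional field (marked with ?): a bare return has no value node.
func = parse("def f():\n    return\n").body[0]
return_value = func.body[0].value

# Repeated field (marked with *): a chained assignment has two targets.
assign = parse("a = b = 1").body[0]
target_count = len(assign.targets)

# A repeated field with no entries is an empty list.
call = parse("f()").body[0].value
call_args = call.args

# col_offset counts UTF-8 bytes: the accented character in the string
# literal occupies two bytes, so y sits at byte 11, not character 10.
binop = parse('s = "\u00e9" + y').body[0].value
y_line = binop.right.lineno
y_offset = binop.right.col_offset
```

The difference between the byte offset and the character index matters
when \code{col_offset} is used to slice the original source text.
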