Two ASTs?
Tip
SOLP currently has two forms of AST: AST1 and AST2. If you’re wondering which one you should use, the short of it is AST2.
Preface
Simply put, AST2 gives you a more consistent and easier-to-use set of nodes, but it has a few limitations:
You need to load the full project into SOLP, which might not be possible (e.g., missing dependencies) or desirable (e.g., too large).
It takes longer to create as it requires extra builder passes.
In rare cases, the builder can fail due to an unexpected error.
It’s not as closely linked to the original source code as AST1 (e.g., some synthetic nodes are generated that don’t have source location data linked to them).
If your specific project needs to be able to modify the exact source code, check out this guide, which provides tips on working on Solidity source-code transformers.
The rest of this document highlights some features specific to AST2 to show the reason why it exists and the ways in which it’s different to AST1.
TopLevelUnits vs SourceUnits
All nodes have a parent attribute, right? So let’s say we have a function, modifier, or event definition (among others), and we want to get the contract it was declared in. We just take the parent and use it like it’s a contract.
Hold on! There are a couple of assumptions there. Consider the following valid Solidity code:
pragma solidity ^0.8.0;
error SafeCast__NegativeValue();
library IntHelper {
    function toUint256(int256 value) internal pure returns (uint256) {
        if (value < 0) revert SafeCast__NegativeValue();
        return uint256(value);
    }
}
There are two things to note here.
1. The Parent of toUint256 Is a LibraryDefinition
This point is solved by changing our mental model slightly. Does it really matter whether the parent is a library or a contract? Usually not. Instead, AST2 generalizes the parent of something like a function or a state variable to a TopLevelUnit. Without needing to know the specific definition type, we can:
get type information (as_type())
find contract parts/members (find_named_parts())
check the inheritance hierarchy (is_subttype_of(), get_supers(), get_subtypes())
get common data (descriptor(), name)
Look at the list of TopLevelUnits:
These make sense. All of these Solidity types usually contain related parts all grouped together. Additionally, none of them are marked as ContractParts (see below), meaning they can’t be nested inside other TopLevelUnits; they are top-level nodes (parentless).
The AST1 equivalents are SourceUnits, which are defined by the allowable Solidity grammar rules.
Solidity allows free-floating definitions for functions and events as well as nesting (e.g., putting a library inside of a contract). This makes traversing AST1 nodes more difficult, as there is no guarantee whether a given SourceUnit is a root node or part of another SourceUnit.
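To make the benefit of a shared parent type concrete, here is a minimal Python sketch. It does not use SOLP: the classes TopLevelUnit, ContractDefinition, LibraryDefinition and Part below are toy stand-ins that only borrow the names from this document, and the add() and qualified_name() helpers are hypothetical. The point is that code working against the common base type never needs to ask which concrete definition it is holding.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class TopLevelUnit:
    """Toy stand-in for the AST2 concept; not SOLP's real class."""
    name: str
    parts: List["Part"] = field(default_factory=list)

    def add(self, part: "Part") -> "Part":
        # Attach the part and record this unit as its parent.
        part.parent = self
        self.parts.append(part)
        return part


class ContractDefinition(TopLevelUnit):
    pass


class LibraryDefinition(TopLevelUnit):
    pass


@dataclass(eq=False)
class Part:
    """Toy contract part (function, error, state variable, ...)."""
    name: str
    parent: Optional[TopLevelUnit] = field(default=None, repr=False)


def qualified_name(part: Part) -> str:
    # No isinstance checks: every TopLevelUnit exposes `name`.
    if part.parent is None:
        raise ValueError(f"{part.name} has no enclosing unit")
    return f"{part.parent.name}.{part.name}"


library = LibraryDefinition("IntHelper")
to_uint256 = library.add(Part("toUint256"))
contract = ContractDefinition("MyContract")
add_to_variable = contract.add(Part("addToVariable"))

print(qualified_name(to_uint256))       # IntHelper.toUint256
print(qualified_name(add_to_variable))  # MyContract.addToVariable
```

The same qualified_name() call works for a part of a library and a part of a contract, which is the practical effect of generalizing the parent.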
2. FileDefinitions Can Contain ContractParts
Ask the virtual file system to load and parse the file above. You’ll get a list of source units:
PragmaDirective(name=Ident(text='solidity'), value='^0.8.0')
ErrorDefinition(name=Ident(text='SafeCast__NegativeValue'), parameters=[])
LibraryDefinition(name=Ident(text='IntHelper'), parts=[...])
See how SafeCast__NegativeValue acts as a SourceUnit rather than a pure ContractPart? That’s because it was declared at the top level of the file. As a result, the parent of SafeCast__NegativeValue is None.
In AST2, a FileDefinition is created as a kind of pseudo-contract to hold free-floating contract parts like the error definition.
Compare the source units above to the output of get_top_level_units() from the AST2 builder:
FileDefinition(source_unit_name='FloatingFunc.sol', name=Ident(text='FloatingFunc.sol'), parts=[ErrorDefinition(name=Ident(text='SafeCast__NegativeValue'), inputs=[])])
LibraryDefinition(source_unit_name='FloatingFunc.sol', name=Ident(text='IntHelper'), parts=[...])
The error can now be referenced like any other contract part, with a base (the file definition) and a name. For example, in the AST2 function.code for toUint256, the revert node is this:
RevertWithError(error=<REF(FloatingFunc.sol.SafeCast__NegativeValue)>, args=[])
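The regrouping described in this section can be sketched in a few lines of Python. This is an illustration only, not SOLP's builder: SourceUnit, TopLevelUnit, the kind strings and to_top_level_units() are all hypothetical names, and the choice to always emit the file unit first is an assumption of the sketch rather than a documented SOLP behaviour.

```python
from dataclasses import dataclass, field
from typing import List

# Kinds that, in this toy model, must live inside some top-level unit.
PART_KINDS = {"error", "function", "event"}
# Kinds the toy builder consumes without emitting a node of their own.
DIRECTIVE_KINDS = {"pragma", "import", "using"}


@dataclass
class SourceUnit:
    """Toy AST1-like entry: just a kind and a name."""
    kind: str
    name: str


@dataclass
class TopLevelUnit:
    """Toy AST2-like unit that owns its parts."""
    kind: str
    name: str
    parts: List[SourceUnit] = field(default_factory=list)


def to_top_level_units(file_name: str, source_units: List[SourceUnit]) -> List[TopLevelUnit]:
    # The pseudo-contract that adopts every free-floating part of the file.
    file_def = TopLevelUnit("file", file_name)
    others = []
    for unit in source_units:
        if unit.kind in DIRECTIVE_KINDS:
            continue  # handled elsewhere, no node emitted
        if unit.kind in PART_KINDS:
            file_def.parts.append(unit)
        else:
            others.append(TopLevelUnit(unit.kind, unit.name))
    return [file_def] + others


units = to_top_level_units(
    "FloatingFunc.sol",
    [
        SourceUnit("pragma", "solidity"),
        SourceUnit("error", "SafeCast__NegativeValue"),
        SourceUnit("library", "IntHelper"),
    ],
)
print([u.kind for u in units])               # ['file', 'library']
print([p.name for p in units[0].parts])      # ['SafeCast__NegativeValue']
```

With the error adopted by the file unit, a reference to it can be written as a base plus a name, in the same shape as the REF shown above.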
Imports, Pragmas, Usings
AST1 has a bunch of SourceUnit subclasses such as PragmaDirective, ImportDirective, and UsingDirective. We don’t see them in AST2; what’s going on?
These constructs in Solidity require compiler support for the Solidity code to make sense. For example,
Imports need to be resolved using path resolution rules.
Pragmas influence the compiler version.
Using statements change which members are available for a type in a given scope.
These are complicated details that aren’t useful to most people who need to use the AST; they just want a simple AST interface that lets them easily navigate the Solidity code.
The AST2 builder handles these complications and embeds them into the AST2 nodes.
Consider the following two contracts (the line numbers are referenced below):
 1  // AdderLib.sol
 2  pragma solidity ^0.8.0;
 3
 4  library Adder {
 5      function add(uint256 a, uint256 b) public pure returns (uint256) {
 6          return a + b;
 7      }
 8  }
 9
10  // MyContract.sol
11  pragma solidity ^0.8.0;
12
13  import "AdderLib.sol";
14
15  contract MyContract {
16      Adder private adder;
17      uint256 public myVariable;
18
19      function addToVariable(uint256 value) public {
20          myVariable = adder.add(myVariable, value);
21      }
22
23      function notALibraryCall() public {
24          addToVariable(50);
25      }
26  }
Import Resolution
The import on line 13 is removed in AST2. The LibraryDefinition generated from AdderLib.sol is directly referenced on line 16 as a ResolvedUserType, which, as the name suggests, is a Type containing a reference to the library definition. In contrast, the AST1 UserType only knows the textual name of the type used in the Solidity source code.
# AST1
StateVariableDeclaration(name=Ident(text='adder'), var_type=UserType(name=Ident(text='Adder')), modifiers=[...])
# AST2, Adder is a Ref[LibraryDefinition]
StateVariableDeclaration(name=Ident(text='adder'), ttype=ResolvedUserType(Adder), modifiers=[...])
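The difference between a textual type name and a resolved reference can be shown with a small Python sketch. It is not SOLP code: the three classes reuse names from this document purely for readability, and the resolve() function and the plain dict used as a scope are assumptions made for the illustration. SOLP's real resolution follows Solidity's import and scoping rules, which this sketch does not attempt to model.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class UserType:
    """Toy AST1-like type: only the name as written in the source."""
    name: str


@dataclass(frozen=True)
class LibraryDefinition:
    """Toy definition node."""
    source_unit_name: str
    name: str


@dataclass(frozen=True)
class ResolvedUserType:
    """Toy AST2-like type: holds a reference to the definition itself."""
    definition: LibraryDefinition


def resolve(user_type: UserType, visible: Dict[str, LibraryDefinition]) -> ResolvedUserType:
    # `visible` stands in for whatever names the imports made available.
    try:
        return ResolvedUserType(visible[user_type.name])
    except KeyError:
        raise LookupError(f"unresolved type: {user_type.name}") from None


adder_lib = LibraryDefinition("AdderLib.sol", "Adder")
scope = {"Adder": adder_lib}  # what importing AdderLib.sol would expose

resolved = resolve(UserType("Adder"), scope)
print(resolved.definition.source_unit_name)  # AdderLib.sol
```

Once the reference is stored on the node, a consumer can go from the state variable straight to the library definition without repeating the import lookup.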
Using Directives
In a similar vein, the library call on line 20 is made explicit in AST2. As shown by the code_str of the node below, the call that previously took two arguments now takes the base as its first argument, matching the signature of add as defined in the library.
Adder.add(this.adder, this.myVariable, value)
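The rewrite from a member-style call to an explicit library call can be sketched as follows. This is a toy model in Python, not SOLP's implementation: MemberCall, LibraryCall and make_explicit() are hypothetical, expressions are represented as plain strings, and the code_str() method here merely mimics the name used above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MemberCall:
    """Toy form of `base.member(args...)` as written in the source."""
    base: str
    member: str
    args: List[str]


@dataclass
class LibraryCall:
    """Toy form of `Library.function(args...)` with the base made explicit."""
    library: str
    function: str
    args: List[str]

    def code_str(self) -> str:
        return f"{self.library}.{self.function}({', '.join(self.args)})"


def make_explicit(call: MemberCall, library: str) -> LibraryCall:
    # The original base becomes the first argument of the library function.
    return LibraryCall(library, call.member, [call.base, *call.args])


source_call = MemberCall("this.adder", "add", ["this.myVariable", "value"])
explicit = make_explicit(source_call, "Adder")
print(explicit.code_str())  # Adder.add(this.adder, this.myVariable, value)
```

The sketch assumes the target library is already known; deciding which library a member call binds to is the part that depends on the using directives in scope.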
Final Words
This document aimed to clarify why SOLP has two forms of AST. They look similar, but there are important details that make AST2 better for most developers.
There is a lot more you can do with SOLP’s ASTs; other components and use cases of SOLP will be documented in the future. In the meantime, check out the API reference to see what types are available.