Many source code tools (linters, formatters, and others) work with a CST (concrete syntax tree), a tree-structured representation of source code (like an AST, but it also retains nodes such as whitespace and comments). This library is a wrapper around such trees, designed for convenient iterative traversal and replacement of nodes.
You can install cstvis with pip:
pip install cstvis
You can also use instld to quickly try this package and others without installing them.
This package is built on top of libcst.
The basic workflow is very simple:
- Create an object of the Changer class.
- Register converter functions with the @<changer object>.converter decorator. Each function converts one CST node type into another. It takes a node object as its first argument, and that argument must have a type annotation that tells the system which node types the converter should be applied to.
- If needed, register filters to prevent changes to certain nodes.
- Iterate over individual changes and apply them as needed.
Let me show you a simple example:
from pathlib import Path

from libcst import Subtract, Add
from cstvis import Changer, Context

# Content of the file:
# a = 4 + 5
# b = 15 - a
# c = b + a # kek
changer = Changer(Path('tests/some_code/simple_sum.py').read_text())

# Replace every Add operator with Subtract, keeping the surrounding whitespace.
@changer.converter
def change_add(node: Add, context: Context):
    return Subtract(
        whitespace_before=node.whitespace_before,
        whitespace_after=node.whitespace_after,
    )

# Each coordinate describes one possible change; apply them one at a time.
for x in changer.iterate_coordinates():
    print(x)
    print(changer.apply_coordinate(x))
#> Coordinate(file=None, class_name='Add', start_line=1, start_column=6, end_line=1, end_column=7)
#> a = 4 - 5
#> b = 15 - a
#> c = b + a # kek
#>
#> Coordinate(file=None, class_name='Add', start_line=3, start_column=6, end_line=3, end_column=7)
#> a = 4 + 5
#> b = 15 - a
#> c = b - a # kek
The key part of this example is the final loop, where we iterate over the coordinates. What does that mean? Every code change made by this library happens in two stages: first identify the coordinates of the change, then apply it. This separation makes it possible to distribute the work across multiple threads or even multiple machines. However, this design also has a limitation: once you apply the change at one coordinate, the resulting code differs from the original and the remaining coordinates are no longer valid for it. You can only apply one change at a time.
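To make the two-stage idea concrete, here is a minimal sketch that uses only the standard library, not cstvis itself. The names Coord, find_plus_coordinates and apply_coordinate are hypothetical stand-ins invented for this illustration, and the tokenizer-based search is an assumption about how the idea can be modeled, not a description of how the library works internally.

```python
import io
import tokenize
from typing import List, NamedTuple


class Coord(NamedTuple):
    """Hypothetical stand-in for a coordinate: 1-based line, 0-based column span."""
    line: int
    start_column: int
    end_column: int


def find_plus_coordinates(source: str) -> List[Coord]:
    """Stage 1: locate every '+' operator token without changing anything."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [
        Coord(tok.start[0], tok.start[1], tok.end[1])
        for tok in tokens
        if tok.type == tokenize.OP and tok.string == '+'
    ]


def apply_coordinate(source: str, coord: Coord, replacement: str) -> str:
    """Stage 2: return a copy of the ORIGINAL source with one span replaced."""
    lines = source.splitlines(keepends=True)
    line = lines[coord.line - 1]
    lines[coord.line - 1] = (
        line[:coord.start_column] + replacement + line[coord.end_column:]
    )
    return ''.join(lines)


source = 'a = 4 + 5\nb = 15 - a\nc = b + a # kek\n'

# Stage 1 runs once; stage 2 could then run anywhere, in any order.
coordinates = find_plus_coordinates(source)

# Every variant is built from the untouched original, one change per variant.
variants = [apply_coordinate(source, c, '-') for c in coordinates]
```

Because each coordinate refers to positions in the original text, every variant is produced from the original rather than from a previously modified copy. This is the same reason the library applies only one change at a time.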
A filter is a special function with the same signature as a converter, registered with the @<changer object>.filter decorator. It decides whether a specific CST node should be changed, and returns True if yes, or False if no. The filter applies to all nodes if the node parameter has no type annotation, or if the parameter is annotated as Any or CSTNode. If you specify a node type in the annotation, the filter will be applied only to nodes of that type. Any other annotations are not allowed.
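The annotation rules above can be modeled with a short sketch. This is not the library's code: CSTNode, Add and Subtract below are empty stand-in classes, filter_applies_to is a hypothetical helper, and matching with isinstance (so subclasses also match) is an assumption made for the illustration.

```python
import inspect
from typing import Any, Callable


class CSTNode:
    """Stand-in for the libcst base node class."""


class Add(CSTNode):
    """Stand-in for a specific node type."""


class Subtract(CSTNode):
    """Stand-in for another specific node type."""


def filter_applies_to(function: Callable[..., bool], node: CSTNode) -> bool:
    """Decide from the first parameter's annotation whether a filter sees this node."""
    parameters = list(inspect.signature(function).parameters.values())
    annotation = parameters[0].annotation

    # No annotation, Any, or the base class: the filter applies to every node.
    if annotation is inspect.Parameter.empty or annotation is Any or annotation is CSTNode:
        return True

    # A concrete node type: the filter applies only to nodes of that type.
    if isinstance(annotation, type) and issubclass(annotation, CSTNode):
        return isinstance(node, annotation)

    # Anything else is rejected, mirroring "any other annotations are not allowed".
    raise TypeError(f'Unsupported annotation for the node parameter: {annotation!r}')


def every_node(node, context) -> bool:
    return True


def only_adds(node: Add, context) -> bool:
    return True


def wrong_annotation(node: int, context) -> bool:
    return True
```

Under these assumptions, every_node is consulted for any node, only_adds is consulted for Add nodes but not for Subtract nodes, and wrong_annotation is rejected with a TypeError.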
Let's look at another example (part of the code is omitted):
count_adds = 0

# Allow only the first Add node to be changed; reject all later ones.
@changer.filter
def only_first(node: Add, context: Context) -> bool:
    global count_adds
    count_adds += 1
    return count_adds <= 1

for x in changer.iterate_coordinates():
    print(x)
    print(changer.apply_coordinate(x))
#> Coordinate(file=None, class_name='Add', start_line=1, start_column=6, end_line=1, end_column=7)
#> a = 4 - 5
#> b = 15 - a
#> c = b + a # kek
You see? Now the iteration yields only the first possible change. The rest are filtered out automatically because the filter returns False for them.
At this point, the basic usage should be clear. But what is the context parameter passed to converters and filters? It has two fields and one useful method:
- coordinate with fields start_line: int, start_column: int, end_line: int, end_column: int and some others. This identifies the current location in the code.
- comment - the comment on the first line of the node, if there is one, without the leading #, or None if there is no comment.
- get_metacodes(key: Union[str, List[str]]) -> List[ParsedComment] - a method that returns a list of parsed comments in metacode format associated with this line of code.
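As an illustration of how the comment field might be used, here is a sketch of a filter that leaves a node untouched when its line carries a particular comment. FakeContext is a hypothetical stand-in that exposes only the comment field, the marker word keep is invented for this example, and stripping whitespace from the comment is a precaution because the exact spacing of the stored text is assumed, not confirmed.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeContext:
    """Hypothetical stand-in exposing only the `comment` field described above."""
    comment: Optional[str]


def skip_marked_lines(node: Any, context: Any) -> bool:
    """Return False (do not change the node) when its line is commented with 'keep'."""
    comment = context.comment
    # None means the line has no comment, so the change is allowed.
    return comment is None or comment.strip() != 'keep'


# Try the filter logic on its own, without the library:
allowed_without_comment = skip_marked_lines(object(), FakeContext(comment=None))
allowed_with_other_comment = skip_marked_lines(object(), FakeContext(comment=' kek'))
allowed_with_marker = skip_marked_lines(object(), FakeContext(comment=' keep'))
```

With the real library, a function like this would be registered with the @<changer object>.filter decorator and would receive a real Context object instead of the stand-in.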