Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upRe-enable AST_COMPARE #17
Conversation
| @@ -21,5 +21,4 @@ before_install: | |||
| - pyenv install -s 3.6.3 | |||
| - pyenv global 2.7.11 3.6.3 | |||
| script: | |||
| - yarn lint | |||
This comment has been minimized.
This comment has been minimized.
| "test": "jest" | ||
| "lint": "jest -c jest.eslint.config.js", | ||
| "test": "jest", | ||
| "prettier": "prettier --plugin=. --parser=python" |
This comment has been minimized.
This comment has been minimized.
azz
Jan 3, 2018
Author
Member
Useful for testing. yarn prettier test.py or echo 'foo' | yarn prettier
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
| "test": "jest" | ||
| "lint": "jest -c jest.eslint.config.js", | ||
| "test": "jest", | ||
| "prettier": "prettier --plugin=. --parser=python" |
This comment has been minimized.
This comment has been minimized.
| const parent = path.getParentNode(); | ||
| // This is overly cautious, we may want to revisit and normalize more | ||
| // cases here. | ||
| return /[ex.]/i.test(parent.source) ? parent.source : `${n.n}`; |
This comment has been minimized.
This comment has been minimized.
taion
Jan 3, 2018
Contributor
there is probably no reason to ever normalize these. consider e.g. PEP 515 section dividers, plus in general you can't represent all python numbers exactly with the JS number type anyway
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
taion
Jan 4, 2018
Contributor
probably just apply this check for Num then, and don't bother with the special-case number handling
e.g. you can't differentiate 1.0j from 1j in the AST anyway, but those should not be reprinted
This comment has been minimized.
This comment has been minimized.
| @@ -305,7 +305,7 @@ function genericPrint(path, options, print) { | |||
| const parent = path.getParentNode(); | |||
This comment has been minimized.
This comment has been minimized.
taion
Jan 4, 2018
Contributor
Instead of reaching to the parent here, just drop visit_field_Num_n from astexport, then just print path.source in the handler for Num. As noted above, this is the right thing to do for complex values too.
In general it's probably best not to try to normalize numbers in Python. People do lots of numerical work, and if I'm writing out something like 1.00, it's probably for a good reason – like maybe it's np.array([0.99, 1.00, 1.01]) or something.
This comment has been minimized.
This comment has been minimized.
azz
Jan 4, 2018
•
Author
Member
Can't quite do that as complex is not JSON serializable. Having the source type information may be useful too, so I'll just coerce them to str on the Python side (assuming that is safe?).
|
modulo that one comment on |
| @@ -87,18 +87,18 @@ def visit_field_Num_n(self, val): | |||
| if isinstance(val, int): | |||
This comment has been minimized.
This comment has been minimized.
taion
Jan 4, 2018
Contributor
ah, okay – this should probably just be
def visit_field_Num_n(val):
return str(val)this value won't actually be used in the printer (which as with the JS printer will only look at the raw string value), so anything JSON-serializable that will work for the AST comparison will be fine
This comment has been minimized.
This comment has been minimized.
azz
Jan 4, 2018
Author
Member
You lose a bit of information in doing this, does it hurt to keep around the source type?
This comment has been minimized.
This comment has been minimized.
taion
Jan 4, 2018
Contributor
i don't think you do, actually, relative to what's in the AST:
In [1]: str(1)
Out[1]: '1'
In [2]: str(1.)
Out[2]: '1.0'
In [3]: str(0xDEADBEEF)
Out[3]: '3735928559'
In [4]: str(1_000_001)
Out[4]: '1000001'
In [5]: str(5j)
Out[5]: '5j'
In [6]: str(3-4j)
Out[6]: '(3-4j)'
there might be some edge cases on float rounding, but you'd already encounter them with the above
| let ppastMassaged; | ||
| let pperr = null; | ||
| try { | ||
| const ppast = parse( | ||
| prettyprint(source, path, mergedOptions), | ||
| mergedOptions | ||
| ); | ||
| ppastMassaged = massageAST(ppast); | ||
| ppastMassaged = massageAST(ppast, normalizedOptions); |
This comment has been minimized.
This comment has been minimized.
taion
Jan 4, 2018
Contributor
so this is definitely a value-add, but it's not an ideal API. for purposes of comparison, given a python AST, you can call
ast.dump(tree, annotate_fields=False)this will give you a string representation of the AST – not the most useful thing for what we're formatting, but you can then just do string equality to compare.
e.g. for this python 2 example:
In [15]: print(s)
with A() as a, B() as b:
pass
In [16]: print(rt)
with A() as a:
with B() as b:
pass
In [17]: print(ast.dump(ast.parse(s), annotate_fields=False))
Module([With(Call(Name('A', Load()), [], [], None, None), Name('a', Store()), [With(Call(Name('B', Load()), [], [], None, None), Name('b', Store()), [Pass()])])])
In [18]: print(ast.dump(ast.parse(rt), annotate_fields=False))
Module([With(Call(Name('A', Load()), [], [], None, None), Name('a', Store()), [With(Call(Name('B', Load()), [], [], None, None), Name('b', Store()), [Pass()])])])
(the two dumped AST strings above are identical)
the difference being in this case that we'd be delegating everything to the ground truth AST handler, such that if there were a bug with the AST -> JSON conversion, we would know
the sort of essential problem from a correctness perspective is that we're not really dealing with a canonical python AST here
funny enough you can't actually assert abstract equality between ASTs in python without dumping them – at not least until python/cpython#1368 lands
This comment has been minimized.
This comment has been minimized.
taion
Jan 4, 2018
Contributor
tl;dr it would be a better upstream API to have a parseAndMassage hook instead of a massageAST hook. this is because there's a built-in way to canonicalize a python AST to a string that can be used for equality comparisons, but the functionality to convert the python AST to JSON is code under test here
This comment has been minimized.
This comment has been minimized.
taion
Jan 4, 2018
Contributor
Actually, astor.strip_tree will clean up Python ASTs such that equality checks on them in Python are possible, but I guess there's not a good way to pretty-print the diff for test failures.
This comment has been minimized.
This comment has been minimized.
azz
Jan 4, 2018
Author
Member
The nested with case is something we wouldn't change in Prettier for JS. e.g. we would never change:
if (foo) {
if (bar) {
}
}to
if (foo && bar) {
}There's a reason we massage the AST after we have printed it, as the cleanup often removes properties or nodes that are critical information required to print the code.
This comment has been minimized.
This comment has been minimized.
taion
Jan 4, 2018
Contributor
that was an illustrative example, showing how this process produces identical dumped ASTs even when the source is quite different in terms of concrete syntax
as it happens, for with, the python 2 AST doesn't distinguish at all between those cases – hence the heuristic in https://github.com/prettier/prettier-python/pull/15/files#diff-ab7972325670f1753462f9b0e6c57900R138
like literally those two things above are identical from the perspective of the parser in python 2 (though not so in python 3)
|
If you want to use |
|
It wouldn't be straightforward to use |
|
Not sure I'm following, why would it be difficult to send two strings (input and output) to Python to run parse and |
|
I mean with |
|
Correct. If we want this to work for both |
|
Yes – that's what I meant in prettier/prettier#3610 (comment). Though it turns out the AST here does include the full original source code (usually many times over), but we should probably remove that behavior: 59dcdbe#r26632592. |
azz commentedJan 3, 2018
•
edited
Printers can now specify custom clean functions for AST massaging.
Fixes #12