Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-41887: omit leading whitespaces on ast.literal_eval #22469

Merged
merged 5 commits into from Oct 4, 2020

Conversation

isidentical
Copy link
Sponsor Member

@isidentical isidentical commented Sep 30, 2020

Lib/ast.py Outdated Show resolved Hide resolved
Doc/library/functions.rst Outdated Show resolved Hide resolved
Doc/library/ast.rst Outdated Show resolved Hide resolved
@pablogsal
Copy link
Member

pablogsal commented Oct 1, 2020

I am curious if there is a performance impact due to the need to create a substring. This may be non trivial for large springs...

@gvanrossum
Copy link
Member

gvanrossum commented Oct 1, 2020

Only if there's whitespace, and even then I doubt that you can even measure the time -- surely the parser takes much more time looking at the characters one at a time than copying a string, even if the input is just a single long string literal (note that this would construct another copy for sure).

Copy link
Member

@gvanrossum gvanrossum left a comment

I actually discovered something a little disturbing. Some forms of whitespace cause syntax errors -- trailing for ast.literal_eval(), and both leading and trailing for eval()`. Example:

>>> eval("\n 12+12")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 2
    12+12
IndentationError: unexpected indent
>>> ast.literal_eval("123\n ")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ast.py", line 62, in literal_eval
    node_or_string = parse(node_or_string, mode='eval')
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/ast.py", line 50, in parse
    return compile(source, filename, mode, flags,
  File "<unknown>", line 2
    
SyntaxError: unexpected EOF while parsing
>>> 

Doc/library/ast.rst Outdated Show resolved Hide resolved
Doc/library/functions.rst Outdated Show resolved Hide resolved
Lib/ast.py Outdated Show resolved Hide resolved
@isidentical
Copy link
Sponsor Member Author

isidentical commented Oct 1, 2020

Some forms of whitespace cause syntax errors

Isn't this expected, we only strip away the initial spaces/tabs to be compatible with eval. Do you propose to strip all forms of whitespace (including newlines) in both eval and literal_eval for both leading and trailing way?

Copy link
Member

@gvanrossum gvanrossum left a comment

Oh, another comment

@gvanrossum
Copy link
Member

gvanrossum commented Oct 1, 2020

Some forms of whitespace cause syntax errors

Isn't this expected, we only strip away the initial spaces/tabs to be compatible with eval. Do you propose to strip all forms of whitespace (including newlines) in both eval and literal_eval for both leading and trailing way?

No, but I would not want to mislead in the docs. Can we make literal_eval use the same algorithm as eval?

@iritkatriel
Copy link
Member

iritkatriel commented Oct 1, 2020

Can we make literal_eval use the same algorithm as eval?

Re differences between eval and literal_eval, that came up recently here as well: https://bugs.python.org/issue17490 .

The documentation says that literal_eval only understands "strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None." But the reality is that it doesn't understand all python numbers.

@gvanrossum
Copy link
Member

gvanrossum commented Oct 1, 2020

@iritkatriel Can you either open a new issue with a specific example of a number (not an expression!) that you think it should understand, or at least add it to bpo-17490? This PR is purely about how we treat leading whitespace.

@iritkatriel
Copy link
Member

iritkatriel commented Oct 1, 2020

@iritkatriel Can you either open a new issue with a specific example of a number (not an expression!) that you think it should understand, or at least add it to bpo-17490? This PR is purely about how we treat leading whitespace.

No, I can't. I was referring to an expression. Point taken about this being off topic here.

@gvanrossum
Copy link
Member

gvanrossum commented Oct 1, 2020

Hm... There's a slight problem with the way this currently works. It strips leading spaces and tabs, but no other whitespace characters. That's the same as eval(), but it's not what's usually considered "whitespace" -- that refers to a larger category of characters, including newline, FF, and a host of weird Unicode things.

I propose to just change the docs to say "strips leading spaces and tabs" everywhere.

Lib/test/test_ast.py Show resolved Hide resolved
Doc/library/functions.rst Outdated Show resolved Hide resolved
Copy link
Member

@gvanrossum gvanrossum left a comment

One nit, I'll apply myself.

Doc/library/ast.rst Outdated Show resolved Hide resolved
@gvanrossum
Copy link
Member

gvanrossum commented Oct 4, 2020

Will merge when the tests pass.

@gvanrossum gvanrossum merged commit e799aa8 into python:master Oct 4, 2020
3 checks passed
@bedevere-bot
Copy link

bedevere-bot commented Oct 4, 2020

@gvanrossum: Please replace # with GH- in the commit message next time. Thanks!

@miss-islington
Copy link
Contributor

miss-islington commented Oct 4, 2020

Thanks @isidentical for the PR, and @gvanrossum for merging it 🌮🎉.. I'm working now to backport this PR to: 3.9.
🐍🍒🤖

@miss-islington
Copy link
Contributor

miss-islington commented Oct 4, 2020

Thanks @isidentical for the PR, and @gvanrossum for merging it 🌮🎉.. I'm working now to backport this PR to: 3.8.
🐍🍒🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Oct 4, 2020
Also document that eval() does this (the same way).
(cherry picked from commit e799aa8)

Co-authored-by: Batuhan Taskaya <batuhanosmantaskaya@gmail.com>
@bedevere-bot
Copy link

bedevere-bot commented Oct 4, 2020

GH-22533 is a backport of this pull request to the 3.9 branch.

@miss-islington
Copy link
Contributor

miss-islington commented Oct 4, 2020

Sorry, @isidentical and @gvanrossum, I could not cleanly backport this to 3.8 due to a conflict.
Please backport using cherry_picker on command line.
cherry_picker e799aa8b92c195735f379940acd9925961ad04ec 3.8

xzy3 pushed a commit to xzy3/cpython that referenced this pull request Oct 18, 2020
Also document that eval() does this (the same way).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants