|
msg127783 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2011-02-03 05:02 |
You don't want to know why I was thinking about this...
$ ./python
Python 3.2rc2+ (py3k:88302, Feb 1 2011, 19:02:10)
[GCC 4.4.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import ast
>>> e = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
>>> e.operand = e
>>> compile(ast.Expression(e), "<test>", "eval")
Segmentation fault
|
|
msg127797 - (view) |
Author: Alexander Belopolsky (belopolsky) *  |
Date: 2011-02-03 17:00 |
Looks like a stack overflow caused by an infinite recursion. I am not sure if it is possible to add cycle detection code without sacrificing performance or setting some arbitrary limits.
I wonder: Why ast nodes need to be mutable?
|
|
msg127798 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2011-02-03 17:08 |
2011/2/3 Alexander Belopolsky <report@bugs.python.org>:
>
> Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:
>
> Looks like a stack overflow caused by an infinite recursion. I am not sure if it is possible to add cycle detection code without sacrificing performance or setting some arbitrary limits.
Yes, it's definitely low priority. It's probably easier to crash the
interpreter by producing differently malformed ast anyway.
>
> I wonder: Why ast nodes need to be mutable?
So people can change them.
|
|
msg127799 - (view) |
Author: Alexander Belopolsky (belopolsky) *  |
Date: 2011-02-03 17:21 |
On Thu, Feb 3, 2011 at 12:08 PM, Benjamin Peterson
<report@bugs.python.org> wrote:
..
>> I wonder: Why ast nodes need to be mutable?
>
> So people can change them.
Well, they are hashable, so this needs to be done carefully. Is this
necessary for AST-based optimizations? Does Python actually change
AST after it has been created? Note that for some optimizations it
may be more appropriate to build a new tree rather than mutate the old
one. Depending on the algorithm, you may or may not need to change
the nodes after they have been created in the process.
|
|
msg127800 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2011-02-03 17:27 |
2011/2/3 Alexander Belopolsky <report@bugs.python.org>:
>
> Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:
>
> On Thu, Feb 3, 2011 at 12:08 PM, Benjamin Peterson
> <report@bugs.python.org> wrote:
> ..
>>> I wonder: Why ast nodes need to be mutable?
>>
>> So people can change them.
>
> Well, they are hashable, so this needs to be done carefully. Is this
> necessary for AST-based optimizations? Does Python actually change
> AST after it has been created? Note that for some optimizations it
> may be more appropriate to build a new tree rather than mutate the old
> one. Depending on the algorithm, you may or may not need to change
> the nodes after they have been created in the process.
Other people are, though. The hash is by identity anyway.
|
|
msg127801 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2011-02-03 17:28 |
2011/2/3 Alexander Belopolsky <report@bugs.python.org>:
>
> Alexander Belopolsky <belopolsky@users.sourceforge.net> added the comment:
>
> On Thu, Feb 3, 2011 at 12:08 PM, Benjamin Peterson
> <report@bugs.python.org> wrote:
> ..
>>> I wonder: Why ast nodes need to be mutable?
>>
>> So people can change them.
>
> Well, they are hashable, so this needs to be done carefully. Is this
> necessary for AST-based optimizations? Does Python actually change
> AST after it has been created? Note that for some optimizations it
> may be more appropriate to build a new tree rather than mutate the old
> one. Depending on the algorithm, you may or may not need to change
> the nodes after they have been created in the process.
Other people are, though. The hash is by identity anyway.
|
|
msg127816 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2011-02-03 20:14 |
Alex: If the node attributes were not mutable, it would be extremely awkward, not to say inefficient, to mutate an already existing AST as returned by ast.parse().
The AST objects in the _ast module aren't what Python works with internally, anyway. When calling ast.parse(), the AST is converted to Python objects (these are defined in Python-ast.c), and compile()ing such an object converts them back to the internal tree representation. This conversion is where recursions would need to be handled.
|
|
msg155978 - (view) |
Author: Gregory P. Smith (gregory.p.smith) *  |
Date: 2012-03-16 00:33 |
i haven't confirmed if it is this exact bug but I believe a coworker just ran into something similar. he wrote code to use the ast to remove docstrings from code before passing it to compile() (as that saves a noticable amount of memory). in the case the ast for code like:
def foo():
"""this is a docstring."""
Removing the docstring and passing such a thing to compile triggers a problem. A workaround was to add a pass in such cases:
if (node.body and isinstance(node.body[0], ast.Expr) and
isinstance(node.body[0].value, ast.Str)):
docstring = node.body.pop(0)
if len(node.body) == 0:
# An empty body will sometimes provoke a segfault when you call
# compile on the code object.
node.body.append(ast.Pass(lineno=docstring.lineno,
col_offset=docstring.col_offset))
regardless, it'd be better if compile() *never* crashed on strange input.
|
|
msg155986 - (view) |
Author: Benjamin Peterson (benjamin.peterson) *  |
Date: 2012-03-16 01:20 |
Have him try on 3.3. This should be less of an issue there where there is an AST validator. It doesn't fix this bug, but it does fix most accidental AST construction bugs.
|
|
msg358952 - (view) |
Author: Batuhan Taskaya (BTaskaya) *  |
Date: 2019-12-28 17:41 |
We can probably implement something like this to prevent this happening
diff --git a/Parser/asdl_c.py b/Parser/asdl_c.py
index daac0966f5..f9da52da7f 100755
--- a/Parser/asdl_c.py
+++ b/Parser/asdl_c.py
@@ -559,6 +559,11 @@ class Obj2ModVisitor(PickleVisitor):
self.emit("asdl_seq_SET(%s, i, val);" % field.name, depth+2)
self.emit("}", depth+1)
else:
+ self.emit("if (obj == tmp) {", depth+1)
+ self.emit("PyErr_SetString(PyExc_RuntimeError, \"Recursing fields "
+ "are not supported for AST nodes.\");", depth+2, reflow=False)
+ self.emit("goto failed;", depth+2)
+ self.emit("}", depth+1)
self.emit("res = obj2ast_%s(tmp, &%s, arena);" %
(field.type, field.name), depth+1)
self.emit("if (res != 0) goto failed;", depth+1)
|
|
msg359122 - (view) |
Author: (ppperry) |
Date: 2019-12-31 19:06 |
What about indirect cycles like below:
>>> e = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
>>> f = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
>>> e.operand = f
>>> f.operand = e
>>> compile(ast.Expression(e), "<test>", "eval")
(I tested, this also crashes)
|
|
msg373078 - (view) |
Author: Terry J. Reedy (terry.reedy) *  |
Date: 2020-07-06 08:40 |
With 3.9 on Windows, using Benjamin's example, I do not get the Windows equivalent of a seg fault. However, execution stops at compile with no exception, including SystemExit.
These examples amount to limited fuzz testing of compile(). I think it should raise something like "SyntaxError: recursive ast" or even 'bad ast' if malformed non-recursive asts have the same issue.
|
|
msg373084 - (view) |
Author: Batuhan Taskaya (BTaskaya) *  |
Date: 2020-07-06 09:31 |
> With 3.9 on Windows, using Benjamin's example, I do not get the Windows equivalent of a seg fault. However, execution stops at compile with no exception, including SystemExit.
I can still reproduce on Linux,
$ python
Python 3.10.0a0 (heads/bpo-xxxxx:f2947e354c, May 21 2020, 18:54:57)
[GCC 9.2.1 20191008] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ast
>>> e = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
>>> e.operand = e
>>> compile(ast.Expression(e), "<test>", "eval")
[1] 15320 segmentation fault (core dumped) python
> These examples amount to limited fuzz testing of compile(). I think it should raise something like "SyntaxError: recursive ast" or even 'bad ast' if malformed non-recursive asts have the same issue.
I dont think it would be easy to locate such errors before they happen, instead I propose (actually already proposed in PR 20594) to add recursion guards to places where this might happen. This can prevent crashes on both direct and indirect cycles
>>> import ast
>>> e = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
>>> e.operand = e
>>> compile(ast.Expression(e), "<test>", "eval")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RecursionError: maximum recursion depth exceeded while traversing 'expr' node
>>> e = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
>>> f = ast.UnaryOp(op=ast.Not(), lineno=0, col_offset=0)
>>> e.operand = f
>>> f.operand = e
>>> compile(ast.Expression(e), "<test>", "eval")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RecursionError: maximum recursion depth exceeded while traversing 'expr' node
|
|
| Date |
User |
Action |
Args |
| 2020-09-19 19:02:53 | georg.brandl | set | nosy:
- georg.brandl
|
| 2020-07-06 09:31:57 | BTaskaya | set | messages:
+ msg373084 |
| 2020-07-06 08:40:47 | terry.reedy | set | nosy:
+ terry.reedy
messages:
+ msg373078 versions:
+ Python 3.10, - Python 2.7, Python 3.9 |
| 2020-06-02 10:20:26 | BTaskaya | set | keywords:
+ patch stage: test needed -> patch review pull_requests:
+ pull_request19824 |
| 2019-12-31 19:06:17 | ppperry | set | nosy:
+ ppperry messages:
+ msg359122
|
| 2019-12-28 17:41:29 | BTaskaya | set | versions:
+ Python 3.9, - Python 3.2, Python 3.3 |
| 2019-12-28 17:41:12 | BTaskaya | set | nosy:
+ BTaskaya messages:
+ msg358952
|
| 2012-03-16 03:24:55 | eric.snow | set | nosy:
+ eric.snow
|
| 2012-03-16 01:20:34 | benjamin.peterson | set | messages:
+ msg155986 |
| 2012-03-16 00:33:04 | gregory.p.smith | set | nosy:
+ gregory.p.smith messages:
+ msg155978
|
| 2011-12-24 18:53:33 | ezio.melotti | set | stage: test needed components:
+ None versions:
- Python 3.1 |
| 2011-08-12 04:29:27 | meador.inge | set | nosy:
+ meador.inge
|
| 2011-02-03 20:14:53 | georg.brandl | set | nosy:
+ georg.brandl messages:
+ msg127816
|
| 2011-02-03 17:28:06 | benjamin.peterson | set | messages:
+ msg127801 |
| 2011-02-03 17:27:13 | benjamin.peterson | set | messages:
+ msg127800 |
| 2011-02-03 17:21:42 | belopolsky | set | messages:
+ msg127799 |
| 2011-02-03 17:08:02 | benjamin.peterson | set | messages:
+ msg127798 |
| 2011-02-03 17:00:32 | belopolsky | set | nosy:
+ belopolsky messages:
+ msg127797
|
| 2011-02-03 05:02:39 | benjamin.peterson | create | |