
Bytecode interpreter section: provide a full specification of each opcode #1078

Open
Christopher-Chianelli opened this issue Apr 17, 2023 · 1 comment

@Christopher-Chianelli

Describe the enhancement or feature you'd like
The documentation for the dis module provides a summary of each opcode. However, the summary is not detailed enough to fully understand what each opcode actually does. For instance, here is the documentation for SEND:

(https://docs.python.org/3/library/dis.html#opcode-SEND)

Sends None to the sub-generator of this generator. Used in yield from and await statements.
  • It is not obvious how many values are pushed to or popped from the stack.
  • It is not obvious that SEND has an oparg and branches depending on the sub-generator's state.
  • It is not obvious how one would use SEND to implement yield from (i.e. the context in which SEND is used).
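To see the gaps listed above first-hand, one option is to disassemble a small generator and inspect the delegation instruction. This is a minimal sketch: `delegating` is a hypothetical function written only for illustration, and the opcode emitted depends on the CPython version (SEND from 3.11 onward, YIELD_FROM before that).

```python
import dis


def delegating():
    # Hypothetical generator, used only to produce bytecode to inspect.
    yield from range(3)


instructions = list(dis.get_instructions(delegating))
opnames = [ins.opname for ins in instructions]

# SEND exists from CPython 3.11 onward; older versions emit YIELD_FROM.
delegation_ops = [
    ins for ins in instructions if ins.opname in ("SEND", "YIELD_FROM")
]

for ins in delegation_ops:
    # arg is the raw oparg. On versions where dis treats the opcode as a
    # relative jump, argval is the resolved target offset.
    print(ins.offset, ins.opname, ins.arg, ins.argval)
```

The printed oparg shows that the instruction carries a jump target, which the one-line summary in the dis documentation does not mention.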

I propose a full spec be given in a format that looks like this:

Opcode Name

Stack Prior: ... [expected stack state]
Stack After: ... [new stack state]

Description of Opcode

Example sources that generate the opcode

For the SEND opcode, it would look like this:

SEND(target_delta)

Stack Prior:                            ... subgenerator, sent_value
Stack if subgenerator is not exhausted: ... subgenerator, yielded_value
Stack if subgenerator is exhausted:     ... subgenerator

Pops the top of the stack and sends it to the sub-generator of this generator. If the sub-generator is
not exhausted, the yielded value is pushed onto the stack. Otherwise, execution jumps forward by
target_delta, leaving subgenerator on the stack. Used to implement yield from and await statements.

Example Sources:
# yield from subgenerator is implemented as the following loop
# (with None initially at the top of the stack)
#
# SEND (sends the top of stack to the subgenerator)
# YIELD_VALUE (returns the yielded value to the caller)
# RESUME
# JUMP_BACKWARD_NO_INTERRUPT (to SEND)
# POP_TOP (target of SEND)
#
# Before the loop, GET_YIELD_FROM_ITER is used to get the generator
# that will act as the subgenerator
yield from subgenerator
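The loop described in the comments above can be approximated in pure Python. This is a rough sketch, not CPython's implementation: `delegate` and `echo` are hypothetical names, and the handling of throw() and close() that real `yield from` performs is deliberately left out.

```python
def delegate(subiterable):
    """Approximate `result = yield from subiterable`, ignoring throw()/close()."""
    sub = iter(subiterable)   # role of GET_YIELD_FROM_ITER
    sent_value = None         # None is initially the value to send
    while True:
        try:
            # Role of SEND: pass the value on to the sub-generator.
            if sent_value is None:
                yielded = next(sub)
            else:
                yielded = sub.send(sent_value)
        except StopIteration as stop:
            # Sub-generator exhausted: leave the loop (the jump forward).
            return stop.value
        # Role of YIELD_VALUE / RESUME: hand the value to the caller and
        # receive the next value to send.
        sent_value = yield yielded


def echo():
    # Hypothetical sub-generator: echoes what it is sent until told to stop.
    received = yield "ready"
    while received != "stop":
        received = yield received
    return "finished"


gen = delegate(echo())
first = next(gen)         # value of the first yield in echo()
second = gen.send("a")    # echoed back through the delegating loop
try:
    gen.send("stop")
except StopIteration as stop:
    final = stop.value    # return value of echo(), surfaced by delegate()
```

Each iteration of the `while` loop corresponds to one pass through SEND, YIELD_VALUE, RESUME, and the backward jump.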

This is similar to how the Java virtual machine documents its opcodes (https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html), with an additional section describing sources from which the opcode is emitted.

Describe alternatives you've considered

  • Add a full specification for each opcode to the documentation for dis instead. Arguably, since a full specification needs to dive deep into specific details, some CPython internals would leak into an otherwise user-readable doc. Additionally, it would increase the maintenance burden on dis's documentation (which currently only needs to list each opcode with a brief description).

Additional context
For the majority of CPython 3.11 bytecodes, I have already written documentation using the above format (in AsciiDoc): https://github.com/Christopher-Chianelli/optapy/blob/jpyinterpreter-docs/jpyinterpreter-docs/src/modules/ROOT/pages/opcodes/opcodes.adoc . I can convert the documentation to reStructuredText and create a PR to this repo if this issue is accepted.

@encukou
Member

encukou commented Apr 18, 2023

IMO, this is changing way too fast to be documented here. The devguide is too version-independent.

AFAIK the stack effect info is nowadays in bytecodes.c, as (inputs -- outputs), as documented with the code.
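For a quick look at stack effects from Python itself, `dis.stack_effect` reports the net change in stack depth for an opcode, with a `jump` keyword to distinguish the branch-taken and fall-through cases. The sketch below assumes CPython 3.8 or later; `stack_effects` is a hypothetical helper, and the numbers it returns for SEND vary between versions, so none are asserted here.

```python
import dis


def stack_effects(opname, oparg=0):
    """Return net stack effects for opname, or None if the opcode is unknown."""
    opcode = dis.opmap.get(opname)
    if opcode is None:
        return None  # this interpreter version has no such opcode
    if opcode < dis.HAVE_ARGUMENT:
        oparg = None  # opcodes below HAVE_ARGUMENT take no argument
    return {
        "no_jump": dis.stack_effect(opcode, oparg, jump=False),
        "jump": dis.stack_effect(opcode, oparg, jump=True),
    }


pop_top_effects = stack_effects("POP_TOP")
send_effects = stack_effects("SEND")  # None on versions without SEND
print("POP_TOP:", pop_top_effects)
print("SEND:", send_effects)
```

This gives only the net depth change, not the named inputs and outputs, so it complements rather than replaces the (inputs -- outputs) notation.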
