refactor: allow multiline and comments in python expressions #18
/// Since we use `exec()` instead of `eval()`, Python naturally handles:
/// - Comments (they're ignored by the parser)
/// - Newlines (they're part of normal Python syntax)
pub fn extract_comments(source: &str) -> Result<Vec<Comment>, String> {
Unfortunately, Ruff's Python AST doesn't capture comments. So I had to parse them manually by searching for # and treating everything from # to the end of the line as a comment.
But we need to skip any # that appears inside a string. And Python strings aren't simply anything between two quotes, because they can contain escaped quotes, like " this is \" escaped".
So this function scans the source and tracks when it is inside a string literal, while being mindful of escaped quotes. Only a # found outside of a string is captured as a comment.
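The scanning approach described above could be sketched in Python roughly as follows. This is an illustration only, not the Rust code from this PR: the name `extract_comments` is reused for readability, the `(start, end, text)` tuple shape is an assumption, it indexes by characters rather than bytes, and the handling of triple-quoted strings is my own addition.

```python
def extract_comments(source: str) -> list:
    """Return (start, end, text) for each comment found outside string literals."""
    comments = []
    quote = None  # active string delimiter (', ", ''' or """), or None when outside a string
    i = 0
    n = len(source)
    while i < n:
        ch = source[i]
        if quote is not None:
            if ch == "\\":
                # Skip the backslash and the character it escapes (e.g. \").
                i += 2
            elif source.startswith(quote, i):
                # Closing delimiter: leave the string.
                i += len(quote)
                quote = None
            else:
                i += 1
        elif ch in "'\"":
            # Opening delimiter: prefer the triple-quoted form when present.
            triple = ch * 3
            quote = triple if source.startswith(triple, i) else ch
            i += len(quote)
        elif ch == "#":
            # Outside a string, a comment runs from # to the end of the line.
            end = source.find("\n", i)
            if end == -1:
                end = n
            comments.append((i, end, source[i:end]))
            i = end
        else:
            i += 1
    return comments
```

A `#` inside a string, including one that follows an escaped quote, is skipped; only the trailing comment is captured.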
};

while i < bytes.len() {
    let ch = bytes[i] as char;
The logic for extracting comments goes character by character and changes the state based on whether we're inside a string, entering/leaving a string, etc.
/// Then we adjust the range at the end to `byte range 7..8` and add the text at that range:
///
/// `Parse error: Expected an expression or a ']' at byte range 7..8: ')'`
fn adjust_error_ranges(error_msg: &str, wrap_prefix_len: usize, source: &str) -> String {
Because the Python expression is parsed by the Ruff Python AST parser, it raises a different error message than what Python would have raised if the expression was evaluated with eval().
So if we have an expression (1, 2], in Python we get an error:
>>> (1, 2]
  File "<stdin>", line 1
    (1, 2]
         ^
SyntaxError: closing parenthesis ']' does not match opening parenthesis '('
But the AST parser will raise an error message like so:
Parse error: Expected an expression or a ']' at byte range 7..8
Now, to make these error messages easier to interpret, this function modifies the error messages to also show the exact syntax that caused the error:
Parse error: Expected an expression or a ']' at byte range 7..8: ')'
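To illustrate the idea of appending the offending text, here is a small Python sketch. It assumes the message always contains the phrase `byte range M..N` as in the examples above; the helper name `append_source_snippet` and the regex are hypothetical and are not taken from the Rust implementation.

```python
import re

# Assumed message format, based on the examples above: "... at byte range 7..8"
RANGE_RE = re.compile(r"byte range (\d+)\.\.(\d+)")


def append_source_snippet(error_msg: str, source: str) -> str:
    """Append the source text found at each reported byte range."""
    data = source.encode("utf-8")  # the ranges are byte offsets, not character offsets

    def add_snippet(match):
        start, end = int(match.group(1)), int(match.group(2))
        snippet = data[start:end].decode("utf-8", errors="replace")
        return f"{match.group(0)}: {snippet!r}"

    return RANGE_RE.sub(add_snippet, error_msg)
```

For the source `(1, 2]`, a message ending in `byte range 5..6` would gain the suffix `: ']'`.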
This function also compensates for how the expression is passed to Ruff's Python AST parser: the expression is wrapped in extra parentheses and newlines, (\n...\n).
The wrapping is necessary to allow Python expressions to be defined across multiple lines, e.g. with a comment on a separate line before or after the actual expression:
{% component "table"
  data=(
    # bla bla
    [1, 2, 3]
  )
/ %}
But the side effect is that all line and col positions are shifted.
And when there's a syntax error, the Ruff Python AST parser also reports the start/end indices of the problematic syntax.
So this function also fixes the indices so that they refer to positions in the original expression, and NOT in the one wrapped in (\n...\n).
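The index correction can be sketched in Python like this. It is a simplified illustration under the same assumed `byte range M..N` message format; `wrap` and `shift_ranges` are hypothetical names, and clamping at zero is my own choice for ranges that fall inside the added prefix.

```python
import re

WRAP_PREFIX = "(\n"
WRAP_SUFFIX = "\n)"
RANGE_RE = re.compile(r"byte range (\d+)\.\.(\d+)")


def wrap(source: str) -> str:
    """Wrap the expression so that newlines and comments are legal inside it."""
    return WRAP_PREFIX + source + WRAP_SUFFIX


def shift_ranges(error_msg: str, prefix_len: int) -> str:
    """Move reported byte ranges back by the length of the added prefix."""

    def shift(match):
        # Clamp at 0 in case the range points into the prefix itself.
        start = max(int(match.group(1)) - prefix_len, 0)
        end = max(int(match.group(2)) - prefix_len, 0)
        return f"byte range {start}..{end}"

    return RANGE_RE.sub(shift, error_msg)
```

With the source `(1, 2]`, the `]` sits at bytes 7..8 of the wrapped text and at bytes 5..6 of the original, so subtracting the prefix length of 2 restores the original position.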
# Actually execute the code
eval_func = eval(compiled_code, eval_namespace, {})
# Execute the function definition
exec(lambda_code, eval_namespace, eval_locals)
Another change needed to let Python expressions span multiple lines was switching from eval() to exec(), because eval() doesn't allow newlines.
One good thing about eval() was that it allows only a single Python expression. An "expression" is something that can be assigned to a variable. "Statements", on the other hand, are syntax that cannot be assigned to variables (e.g. import). So eval() rejected code with statements or multiple expressions, and something like these would NOT be allowed:
# Statement
eval("""
from os import abc
class X:
...
""")
# Multiple expressions
eval("""
2 + 2
my_var
""")
However, the logic is still safe, because the same safety measure is implemented on the level of Ruff's Python AST parser:
- There, when walking the AST, we raise an error if we come across statements, effectively banning them.
- The Ruff AST parser also has a setting to parse the code as an "expression", which raises when multiple expressions or statements are encountered. See parse_expression.
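As a rough Python-only analogue of this "validate first, then exec()" flow (not the Ruff-based code in this PR), the standard library's `ast.parse(..., mode="eval")` also accepts exactly one expression. The helper `evaluate_expression` and the inner function name `_expr` are hypothetical, chosen only for this sketch.

```python
import ast


def evaluate_expression(source: str, variables: dict):
    """Validate `source` as a single expression, then run it through exec()."""
    # Wrap in parentheses and newlines so comments and line breaks are legal.
    wrapped = "(\n" + source + "\n)"

    # mode="eval" accepts exactly one expression; statements or multiple
    # expressions raise SyntaxError before anything is executed.
    ast.parse(wrapped, mode="eval")

    # Inside parentheses indentation is ignored, so the wrapped source can be
    # placed after `return` in a function body as-is.
    func_source = "def _expr():\n    return " + wrapped + "\n"
    namespace = dict(variables)  # names in the expression resolve against this dict
    exec(func_source, namespace)
    return namespace["_expr"]()
```

The validation step is what keeps exec() from running arbitrary statements: input such as `import os`, or two expressions on separate lines, fails in `ast.parse` and never reaches `exec()`.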
Follow up to #17
This adds support for multiline Python expressions and comments within these expressions: