Conversation
mathics/builtin/files_io/files.py
Outdated
    result = result[:-1]

    for res in result:
        print("show", String(res).value)
There was a problem hiding this comment.
Sigh.
I often look at the PR inside GitHub's interface to spot stuff like this.
There was a problem hiding this comment.
Sorry, I usually run `git diff origin/master | grep 'print('`, but this one escaped my notice...
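A plain `grep 'print('` over a diff also matches removed and context lines. As a minimal sketch of a stricter check (the function name `find_added_prints` is hypothetical and not part of the Mathics3 repository), the following reports only lines that the diff adds:

```python
def find_added_prints(diff_text):
    """Return (file, line_text) pairs for added lines that contain print(."""
    hits = []
    current_file = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            # Header naming the new version of a file, e.g. "+++ b/path.py".
            path = line[4:]
            current_file = path[2:] if path.startswith("b/") else path
        elif line.startswith("+") and "print(" in line:
            # An added line; drop the leading "+" and surrounding whitespace.
            hits.append((current_file, line[1:].strip()))
    return hits
```

The text to scan would be the output of `git diff origin/master`, captured however is convenient.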
    )

    ascii_operator_to_symbol = NAMED_CHARACTERS_COLLECTION["ascii-operator-to-symbol"]
    CHARACTER_TO_NAME = {
There was a problem hiding this comment.
Please move this to Mathics3-scanner. Thanks.
There was a problem hiding this comment.
I will, but first I wanted to be sure that this is the right approach. Thoughts?
There was a problem hiding this comment.
I find this approach of putting in a PR, then discussing whatever it is that happens to be in it (you figure it out), really hard to follow.
Many people start with a problem and then go to a solution, instead of writing some code based on something that feels wrong (is this what is meant by "vibe" coding?) and then looking at what's been created and discussing that.
If that's the way you have to work, well. okay. But maybe after all the vibe coding, we can have a discussion (independent of the code) about what's wrong. Then discuss ways to address that.
I had thought we were going to start to do that for #1735, which I had imagined was taking that code and breaking it up into pieces. You know, like option information from a built-in (`CharacterEncoding`, of `ToString`) is not filtering down to rendering routines. How do we do that? Do we add `**kwargs` parameters to the methods or split out the relevant ones (like `encoding`)?
I admit that there are bigger issues we want to solve, but I offer this as a specific example of something where we can break off a small, isolated problem (independent of the larger issue) and create a PR for that.
Or decide to hold off on that until the bigger picture is decided.
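To make the two alternatives in that question concrete, here is a minimal sketch. All function names are hypothetical and none of this is the Mathics3 rendering API; `_encode` is a stand-in that only shows an encoding option reaching the point where text is produced.

```python
def _encode(text, encoding):
    # Stand-in for a rendering step: characters the target encoding
    # cannot represent are replaced by backslash escapes.
    return text.encode(encoding, errors="backslashreplace").decode(encoding)


def render_with_kwargs(text, **kwargs):
    # Alternative 1: pass every option down as **kwargs and let each
    # routine pick out what it understands.
    return _encode(text, kwargs.get("encoding", "UTF-8"))


def render_with_explicit_option(text, *, encoding="UTF-8"):
    # Alternative 2: split out the relevant option as a named parameter.
    return _encode(text, encoding)
```

One trade-off this exposes: with `**kwargs`, a misspelled option name is silently ignored, whereas the explicit signature raises `TypeError`. On the other hand, `**kwargs` lets new options pass through intermediate routines without changing their signatures.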
Instead, we are now on to a related topic with code that is outside of #1735.
So be it.
Okay. Now that you've come across this other thing and written some code so you might be able to understand something about it, can we just forget about the code (for now), and describe what the problem is in human language, and then what the approaches for handling this are?
There was a problem hiding this comment.
> Okay. Now that you've come across this other thing and written some code so you might be able to understand something about it, can we just forget about the code (for now), and describe what the problem is in human language, and then what the approaches for handling this are?
I get from the PR comment (probably written after the code) that we should add the option ShowSpecialCharacters, which is used in Style and StyleBox.
Instead of the code, though, describe in human language what the issues or approaches are and what implications those might have.
(I write "human language" because I understand English may be awkward for you (as it is for me)).
If you want to think and describe in Spanish, that's okay, I'll use Google Translate. The main thing is to express the idea independent of specific code.
There was a problem hiding this comment.
> I find this approach of putting in a PR, then discussing whatever it is that happens to be in it (you figure it out), really hard to follow.
OK, the wordy version of this PR would be: we need conversion tables to
- Take any str in the Mathics3 internal encoding and convert it into an ASCII representation in an invertible way. This is required for FullForm.
- Take any str in the Mathics3 internal encoding and convert it into an ASCII representation that is visually close to the character that the internal character represents.
I believe I mentioned this earlier; if not, my apologies.
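As an illustration of the difference between these two kinds of table, here is a minimal sketch. The table names and their three entries are made up for the example; the real data would come from the Mathics3-scanner named-character tables. The `\[Name]` notation is the Wolfram Language named-character syntax.

```python
# Illustrative subset only; both tables are hypothetical.
CHAR_TO_NAMED_ESCAPE = {
    "\u03b1": "Alpha",       # Greek small letter alpha
    "\u2192": "RightArrow",  # rightwards arrow
    "\u2260": "NotEqual",    # not-equal sign
}
CHAR_TO_LOOKALIKE = {
    "\u2192": "->",
    "\u2260": "!=",
}


def to_named_ascii(text):
    # Invertible direction: each known character becomes \[Name].
    # Characters missing from the table pass through unchanged.
    return "".join(
        f"\\[{CHAR_TO_NAMED_ESCAPE[ch]}]" if ch in CHAR_TO_NAMED_ESCAPE else ch
        for ch in text
    )


def to_lookalike_ascii(text):
    # Visual direction: each known character becomes a similar-looking
    # ASCII sequence; this loses information.
    return "".join(CHAR_TO_LOOKALIKE.get(ch, ch) for ch in text)
```

The second mapping cannot be inverted: a string that already contains `!=` and one that contains the not-equal sign both come out as `!=`.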
> Many people start with a problem and then go to a solution, instead of writing some code based on something that feels wrong (is this what is meant by "vibe" coding?) and then looking at what's been created and discussing that.
> If that's the way you have to work, well. okay. But maybe after all the vibe coding, we can have a discussion (independent of the code) about what's wrong. Then discuss ways to address that.
Again, I thought we had already had that discussion. Now I am just proposing an implementation for it. And for that, I find it easier to show the code of the implementation than to try to figure out how to translate from Physicist-Spanish to Computer-Science English.
> I had thought we were going to start to do that for #1735, which I had imagined was taking that code and breaking it up into pieces. You know, like option information from a built-in (`CharacterEncoding`, of `ToString`) is not filtering down to rendering routines. How do we do that? Do we add `**kwargs` parameters to the methods or split out the relevant ones (like `encoding`)?
I am doing the work of splitting this into pieces. Now I have put up another of these pieces, related to MathMLForm. More are coming.
> I admit that there are bigger issues we want to solve, but I offer this as a specific example of something where we can break off a small, isolated problem (independent of the larger issue) and create a PR for that.
What this PR tries to achieve is FullForm output that can be copied from any front end, pasted into another front end, and produce exactly the same code. Then, in tests, we can compare results against expected results regardless of the encoding.
> Or decide to hold off on that until the bigger picture is decided.
That is the bigger picture.
> Instead, we are now on to a related topic with code that is outside of #1735.
> So be it.
> Okay. Now that you've come across this other thing and written some code so you might be able to understand something about it, can we just forget about the code (for now), and describe what the problem is in human language, and then what the approaches for handling this are?
I am going to update the PR description to focus more on its central aspect.
mathics/format/form/outputform.py
Outdated
    return value
    # value = expr.value
    # return value
    kwargs["System`ShowStringCharacters"] = SymbolTrue
There was a problem hiding this comment.
Is setting this to True unconditionally correct? Can't it be overwritten from outside via kwargs?
There was a problem hiding this comment.
This is because we do not pass through boxes here. The alternative would be to add the quotes and then let the render function remove them.
There was a problem hiding this comment.
OK, I have now added a long comment explaining how this path works. I also handle the case where kwargs["System`ShowStringCharacters"] was already set to True.
mathics/core/convert/op.py
Outdated
    )
    # These characters are used in encoding
    # in WMA, and differs from what we have
    # in Mathics3-scanner tables:
There was a problem hiding this comment.
Not totally accurate. As mentioned before, is listed as the Wolfram-language encoding.
A number of these we choose not to use by default for input and output because they would need special code pages set up by users, which is generally not done. So instead, we often pick a Unicode symbol that is equivalent and commonly available to users.
However, we always note the corresponding WL unicode, and that too is available in JSON tables.
If the scanner is not accepting as an acceptable character for DifferentialD, that can easily be fixed. In fact, we've done something like that recently.
There was a problem hiding this comment.
The point is that this character appears in some places. This is the kind of thing I would like to fix before moving this to Mathics3-scanner, to avoid making coordinated changes before this is clear.
There was a problem hiding this comment.
Sure - that's fine and probably a good idea. But then, let's describe the problem and ideas without reference to specific code. (It's fine for you to have written this for yourself to gain some idea. Just do not hold too tightly, though, to the exact code until we've gone over the problem and ideas at a high level first.)
There was a problem hiding this comment.
OK, this part is not strictly needed for this PR. I will propose this for another round.
mathics/builtin/box/layout.py
Outdated
    https://reference.wolfram.com/language/ref/ShowSpecialCharacters.html</url>
    <dl>
    <dt>'ShowSpecialCharacters'
    <dd>is an option for 'Style' and 'Cell' that directs whether non-ANSI characters must be shown as special characters or by escaped sequences.
There was a problem hiding this comment.
I usually mix them in my head: ASCII==7bits and ANSI==8bits, right?
There was a problem hiding this comment.
See https://stackoverflow.com/questions/701882/what-is-ansi-format and decide what it is that you want to convey. Then pick the word that is appropriate. If you decide it is ANSI, then I think you would need to elaborate more on the code page aspect.
Also, quoting from the link:
> The name "ANSI" is a misnomer, since it doesn't correspond to any actual ANSI standard, but the name has stuck. ANSI is not the same as UTF-8.
Do you mean UTF-8?
There was a problem hiding this comment.
> I usually mix them in my head: ASCII==7bits and ANSI==8bits, right?
While there is 8-bit ASCII, I am now getting the impression you mean either UTF-8 or Unicode.
There was a problem hiding this comment.
What I mean is the "default" (American-centric) interpretation of an 8-bit character. But OK, the right name is ASCII.
mathics/builtin/box/layout.py
Outdated
    </ul>
    """

    summary_text = "cell option directing whether show special characters in a reversible ANSI format."
Renamed function to reflect ASCII handling and updated docstring for clarity.
Do I have this right that this PR:
Is the round-trip or invertibility aspect you find desirable for testing? Or is there a user impact as well (cutting and pasting output from x to feed into y, for some x and y)? You indicate that it might also be what WMA does, but here I'd be grateful for some simple examples or documentation that show this.
ShowSpecialCharacters and ShowStringCharacters options and round-trip FullForm output
@mmatera I have edited both the PR title and the description. I make mistakes and might not have gotten this correct. Please check. Getting these to be accurate is very helpful to me, especially now, as I will be going over PRs and change logs in order to get release notes done. (I have been putting this off, but I need to start doing it now.)
LGTM. I am not 100% certain everything here is as it should be, but I guess, let's go with it for now, with the understanding that in the future we may need to make other adjustments or change things slightly.
Note:
@rocky, thanks for the review and the patience!
This PR adds support for the options `ShowSpecialCharacters` and `ShowStringCharacters` used in StyleBox, Style, and Cell builtin functions. These options control how strings are rendered.
In WMA, when the `ShowSpecialCharacters` option is set to `False`, and `ShowStringCharacters` is set to `True`, strings are rendered using an ASCII representation in which any non-ASCII characters are represented by their character names. This provides an "invertible" representation of the internal original String. In WMA, this representation is used in `FullForm`.
This would also provide better grounds for #1735
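The round-trip property described here can be sketched as follows. This is an illustration under stated assumptions, not the code in this PR: `render_string`, `parse_rendered`, and the two-entry name table are hypothetical, and the `\:xxxx` and `\|xxxxxx` fallbacks for characters without a name are the Wolfram Language hexadecimal escapes.

```python
import re

# Hypothetical two-entry table; real names would come from Mathics3-scanner.
NAME_BY_CHAR = {"\u03b1": "Alpha", "\u03c0": "Pi"}
CHAR_BY_NAME = {name: ch for ch, name in NAME_BY_CHAR.items()}

# Matches \" and \\, \[Name], \:xxxx and \|xxxxxx.
ESCAPE = re.compile(r'\\(?:(["\\])|\[(\w+)\]|:([0-9a-f]{4})|\|([0-9a-f]{6}))')


def render_string(text, show_special_characters=True, show_string_characters=False):
    pieces = []
    for ch in text:
        if show_string_characters and ch in ('"', "\\"):
            pieces.append("\\" + ch)  # escape quote and backslash
        elif show_special_characters or ord(ch) < 128:
            pieces.append(ch)  # shown as is
        elif ch in NAME_BY_CHAR:
            pieces.append(f"\\[{NAME_BY_CHAR[ch]}]")
        elif ord(ch) <= 0xFFFF:
            pieces.append(f"\\:{ord(ch):04x}")
        else:
            pieces.append(f"\\|{ord(ch):06x}")
    body = "".join(pieces)
    return f'"{body}"' if show_string_characters else body


def parse_rendered(rendered):
    # Inverse of render_string(text, False, True).
    if len(rendered) < 2 or rendered[0] != '"' or rendered[-1] != '"':
        raise ValueError("expected a quoted string")

    def replace(match):
        quoted, name, hex4, hex6 = match.groups()
        if quoted:
            return quoted
        if name:
            return CHAR_BY_NAME[name]
        return chr(int(hex4 or hex6, 16))

    return ESCAPE.sub(replace, rendered[1:-1])
```

With both options at the values named in the description, the rendered text is pure ASCII and `parse_rendered(render_string(s, False, True))` gives back `s`, which is what lets a test compare results independently of the encoding. In this sketch the inverse is only defined for that combination, since without the surrounding quotes a literal backslash is not escaped.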