@@ -5,9 +5,11 @@ title: "XMIR, a Quick Tour"
55author : yegor256
66---
77
8+ _ Last updated at: 17.04.2025_
9+
810XMIR is a dialect of [ XML] ( https://en.wikipedia.org/wiki/XML ) ,
911which we use to represent a parsed
10- [ EO] ( https://www.eolang.org ) program . It is a pretty simple format,
12+ [ EO] ( https://www.eolang.org ) object . It is a pretty simple format,
1113which has a few
1214important tricks, which I share below in this blog post. You may
1315also want to check our [ schema] ( https://en.wikipedia.org/wiki/XML_schema ) :
@@ -17,9 +19,10 @@ which may be more readable for some of you).
1719
1820<!-- more-->
1921
20- Consider this simple EO program that prints ` "Hello, world!" ` :
22+ Consider this simple EO object that prints ` "Hello, world!" ` :
2123
2224```
25+ # App.
2326[] > app
2427 [x] > foo
2528 QQ.io.stdout > @
@@ -34,113 +37,119 @@ If we parse it using `EoSyntax` class from [eo-parser],
3437we will get this XMIR (or very similar):
3538
3639``` xml
37- <program xmlns : xsi =" http://www.w3.org/2001/XMLSchema-instance"
38- dob =" 2024-12-27T11:00:08" ms =" 98" name =" app" revision =" 27abe8b"
39- source =" app.eo" time =" 2025-01-13T09:32:04.455112Z" version =" 0.50.0"
40- xsi : noNamespaceSchemaLocation =" https://www.eolang.org/xsd/XMIR-0.50.0.xsd" >
41- <listing ># Simple app.
40+ <object
41+ xmlns : xsi =" http://www.w3.org/2001/XMLSchema-instance"
42+ dob =" 2024-12-27T11:00:08"
43+ ms =" 98"
44+ revision =" 27abe8b"
45+ time =" 2025-04-17T09:32:04.455112Z"
46+ version =" 0.56.0"
47+ xsi : noNamespaceSchemaLocation =" https://www.eolang.org/xsd/XMIR-0.56.0.xsd" >
48+ <listing ># App.
4249[] > app
4350 [x] > foo
4451 QQ.io.stdout > @
45- QQ.txt.sprintf
52+ QQ.txt.sprintf *1
4653 "Hello, %s\n"
47- * x
54+ x
4855 foo > @
4956 "world!"
5057</listing >
51- <objects >
52- <o line =" 2" name =" app" pos =" 0" >
53- <o line =" 3" name =" foo" pos =" 2" >
54- <o base =" ∅" line =" 3" name =" x" pos =" 3" />
55- <o base =" .stdout" line =" 4" name =" @" pos =" 9" >
56- <o base =" .io" line =" 4" pos =" 6" >
57- <o base =" QQ" line =" 4" pos =" 4" />
58+ <o line =" 2" name =" app" pos =" 0" >
59+ <o line =" 3" name =" foo" pos =" 2" >
60+ <o base =" ∅" line =" 3" name =" x" pos =" 3" />
61+ <o base =" .stdout" line =" 4" name =" @" pos =" 9" >
62+ <o base =" .io" line =" 4" pos =" 6" >
63+ <o base =" QQ" line =" 4" pos =" 4" />
64+ </o >
65+ <o base =" .sprintf" line =" 5" pos =" 12" >
66+ <o base =" .txt" line =" 5" pos =" 8" >
67+ <o base =" QQ" line =" 5" pos =" 6" />
5868 </o >
59- <o base =" .sprintf" line =" 5" pos =" 12" >
60- <o base =" .txt" line =" 5" pos =" 8" >
61- <o base =" QQ" line =" 5" pos =" 6" />
62- </o >
63- <o base =" string" line =" 6" pos =" 8" >48-65-6C-6C-6F-2C-20-25-73-0A</o >
64- <o base =" tuple" line =" 7" pos =" 8" >
65- <o base =" .empty" >
66- <o base =" tuple" />
67- </o >
68- <o base =" x" line =" 7" pos =" 10" />
69+ <o base =" string" line =" 6" pos =" 8" >48-65-6C-6C-6F-2C-20-25-73-0A</o >
70+ <o base =" tuple" line =" 7" pos =" 8" >
71+ <o base =" .empty" >
72+ <o base =" tuple" />
6973 </o >
74+ <o base =" x" line =" 7" pos =" 10" />
7075 </o >
7176 </o >
7277 </o >
73- <o base =" foo" line =" 8" name =" @" pos =" 2" >
74- <o base =" string" line =" 9" pos =" 4" >77-6F-72-6C-64-21</o >
75- </o >
7678 </o >
77- </objects >
78- </program >
79+ <o base =" foo" line =" 8" name =" @" pos =" 2" >
80+ <o base =" string" line =" 9" pos =" 4" >77-6F-72-6C-64-21</o >
81+ </o >
82+ </o >
83+ </object >
7984```
8085
81- The ` <program > ` is the root element, it will always be there, with
86+ The ` <object > ` is the root element, it will always be there, with
8287a few mandatory attributes:
8388
84- * ` ms ` is how much time in milliseconds it took to parse the program
89+ * ` ms ` is how much time in milliseconds it took to parse the object
8590and generate this XMIR file,
86- * ` name ` is the name of the program, as it was given to the parser,
8791* ` time ` is the time in [ ISO 8601] format when the file was generated,
8892* ` version ` is the version of the parser.
8993
90- The ` <listing> ` element contains the source code of the EO program ,
91- which was parsed, without any modifiations , "as is."
94+ The ` <listing> ` element contains the source code of the EO object ,
95+ which was parsed, without any modifications , "as is."
9296
9397## Errors and Warnings
9498
9599The ` <errors> ` element may have a list of problems discovered by the
96- parser or any other optimizers, as ` <error> ` elements.
100+ parser or any other optimizers, as ` <error> ` elements. If there are no
101+ errors, the ` <errors> ` element should not exist in ` <object> ` .
97102For example, it may look like this:
98103
99104``` xml
100- <program >
101- [..]
105+ <object >
106+ [... ]
102107 <errors >
103108 <error severity =" warning" line =" 3" >There is an extra bracket</error >
104109 <error severity =" error" line =" 12" >The object 'x' is not found</error >
110+ [...]
105111 </errors >
106- </program >
112+ </object >
107113```
108114
109115The errors with the ` warning ` severity may more or less safely be ignored. The
110- errors with the ` error ` severity will lead to failures in further compilation
111- and processing. There could also be elements with the ` critical ` severity,
112- which must stop the processing of the document immediately.
116+ errors with the ` error ` severity will lead to failures in further compilation
117+ and processing. There could also be elements with the ` critical ` severity,
118+ which must stop the processing of the document immediately.
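
The three severities described above can be sketched in a few lines of Python. This is only an illustration, not part of the EO toolchain: the function names `errors_by_severity` and `verdict` are hypothetical, and the mapping of severities to the outcomes `"stop"`, `"fail"`, and `"ok"` is an assumption derived from the paragraph above.

```python
import xml.etree.ElementTree as ET

# A trimmed-down XMIR with the two errors from the example above.
XMIR = """
<object>
  <errors>
    <error severity="warning" line="3">There is an extra bracket</error>
    <error severity="error" line="12">The object 'x' is not found</error>
  </errors>
</object>
"""


def errors_by_severity(xmir: str) -> dict:
    """Group (line, message) pairs by the value of the 'severity' attribute."""
    root = ET.fromstring(xmir.strip())
    grouped = {}
    # The <errors> element may be absent; findall() then returns an empty list.
    for err in root.findall("./errors/error"):
        item = (int(err.get("line")), err.text)
        grouped.setdefault(err.get("severity"), []).append(item)
    return grouped


def verdict(grouped: dict) -> str:
    """Hypothetical policy: 'critical' stops processing, 'error' fails later."""
    if "critical" in grouped:
        return "stop"
    if "error" in grouped:
        return "fail"
    return "ok"


if __name__ == "__main__":
    print(verdict(errors_by_severity(XMIR)))
```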
113119
114120## Sheets
115121
116- The ` <sheets> ` element will rarely be empty. It contains a list of all
117- post-processors that were applied to the document after is parsing.
118- We process our XMIR documents using dozens of XSL stylesheets. That's why
119- the name of the XML element. You may find something like this over there:
122+ The ` <sheets> ` element contains a list of all
123+ post-processors that were applied to the document after its parsing.
124+ We process our XMIR documents using dozens of XSL stylesheets, hence
125+ the name of the XML element. You may find something like this over there:
120126
121127``` xml
122- <program >
123- [..]
128+ <object >
129+ [... ]
124130 <sheets >
125- <sheet >not-empty-atoms </sheet >
126- <sheet >middle-varargs </sheet >
127- <sheet >duplicate-names </sheet >
128- <sheet >many-free-attributes </sheet >
131+ <sheet >move-voids-up </sheet >
132+ <sheet >const-to-dataized </sheet >
133+ <sheet >stars-to-tuples </sheet >
134+ <sheet >wrap-method-calls </sheet >
129135 [...]
130136 </sheets >
131- </program >
137+ </object >
132138```
133139
134140The names you see in the ` <sheet> ` elements are the names of the files.
135- For example, ` not-empty-atoms ` represents the
136- [ ` not-empty-atoms.xsl ` ] file
137- in the [ objectionary/eo] ( https://github.com/objectionary/eo ) GitHub repository.
141+ For example, ` wrap-method-calls ` represents the
142+ [ ` wrap-method-calls.xsl ` ] file
143+ in the [ objectionary/eo] ( https://github.com/objectionary/eo ) GitHub repository.
144+
145+ If no XSL stylesheets are applied to XMIR, the ` <sheets> ` element should not exist
146+ in ` <object> ` .
138147
139148## Metas
140149
141150There may be an optional element ` <metas> ` with a list of ` <meta> ` elements.
142- For example, if my source code would have this meta at the 3rd
143- line of the source file:
151+ For example, if my source code had this meta on the 3rd
152+ line of the source file:
144153
145154```
146155+alias foo com.example.foo
@@ -149,77 +158,89 @@ There may be an optional element `<metas>` with a list of `<meta>` elements.
149158We would see the following in the XMIR:
150159
151160``` xml
152- <program >
153- [..]
154- <metas >
161+ <object >
162+ [... ]
163+ <metas >
155164 <meta line =" 3" >
156165 <head >alias</head >
157- <tail >foo com.example.foo</tail >
166+ <tail >foo Q. com.example.foo</tail >
158167 <part >foo</part >
159- <part >com.example.foo</part >
168+ <part >Q. com.example.foo</part >
160169 </meta >
161- [..]
170+ [... ]
162171 </metas >
163- </program >
172+ </object >
164173```
165174
166175Each ` <meta> ` element contains parts of the meta. The ` <head> `
167- contains everything that goes after the ` + ` until the first space.
168- The ` <tail> ` contains everything after the first space. There could
169- be a number of ` <part> ` elements, each of which containing the parts
170- of the ` <tail> ` separated by spaces.
176+ contains everything that goes after the ` + ` until the first space.
177+ The ` <tail> ` contains everything after the first space. There could
178+ be a number of ` <part> ` elements, each containing one part
179+ of the ` <tail> ` , split by spaces.
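
The splitting rule above can be sketched like this in Python. The function `split_meta` is a hypothetical helper, not something the EO parser exposes, and it does only the plain text split: any normalization the real parser performs (such as the `Q.` prefix visible in the XMIR example) is not reproduced here.

```python
def split_meta(line: str) -> dict:
    """Split a meta line into head, tail, and parts, as described above."""
    if not line.startswith("+"):
        raise ValueError("a meta line must start with '+'")
    # Head is everything after '+' until the first space; tail is the rest.
    head, _, tail = line[1:].partition(" ")
    # Parts are the pieces of the tail, separated by spaces.
    return {"head": head, "tail": tail, "parts": tail.split()}


if __name__ == "__main__":
    print(split_meta("+alias foo com.example.foo"))
```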
171180
172181## Objects
173182
174- The ` <objects/> ` element contains object, as they were found in the source
175- code, where each object is represented by the ` <o/> ` element.
176- Each ` <o/> ` element may have a few optional attributes:
183+ The ` <object> ` element must contain only one ` <o/> ` element which represents an
184+ object being parsed. The ` <o/> ` element may have a few optional attributes:
177185
178186* ` line ` and ` pos ` are the number of the line where the object
179187was found by the parser and the position in the line;
180188* ` name ` is the name of the object, if the object has it;
181189* ` base ` may refer to object formation that is being copied;
182- * ` loc ` may contain a "locator" of the object.
190+ * ` as ` is the name of the attribute to which the current object is bound during
191+ application.
183192
184193There could be no other attributes.
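
A rough way to check this rule is to walk all `<o>` elements and report any attribute outside the allowed set. The sketch below assumes the five attributes listed above are the complete set; `unexpected_attributes` is a hypothetical name, and the official XSD remains the authoritative check.

```python
import xml.etree.ElementTree as ET

# Attributes listed above as allowed on <o>; assumed to be the full set.
ALLOWED = {"line", "pos", "name", "base", "as"}


def unexpected_attributes(xmir: str) -> list:
    """Return (name, extra attributes) for every <o> with unknown attributes."""
    root = ET.fromstring(xmir)
    found = []
    # iter("o") visits the root too, if the root itself is an <o> element.
    for o in root.iter("o"):
        extra = sorted(set(o.attrib) - ALLOWED)
        if extra:
            found.append((o.get("name"), extra))
    return found


if __name__ == "__main__":
    print(unexpected_attributes('<o name="app"><o base="x" loc="Φ.app"/></o>'))
```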
185194
186- ## Data Objects
187-
188- Data literals found in the source code are presented with ` <o/> ` XML elements
189- that contain text, for example:
195+ ## Special cases
190196
197+ 1 . An ` <o/> ` element that has a nested ` <o> ` element whose ` name ` is
198+ ` λ ` is an ** atom** . Atoms must not have the ` base ` attribute:
191199``` xml
192- <o base =" string" line =" 6" pos =" 8" >48-65-6C-6C-6F-2C-20-25-73-0A</o >
200+ <o name =" try" >
201+ <o name =" λ" />
202+ </o >
193203```
194204
195- The value of the ` base ` attribute is the "type" of the data found in the
196- sources. It may be one of the following three:
197- ` string ` , ` number ` , and ` bytes ` .
198-
199- ## Locators
200-
201- If you apply [ ` set-locators.xsl ` ] optimization XSL stylesheet to the following
202- XMIR document:
203-
205+ 2 . The ` <o/> ` elements whose ` base ` attribute is ` ∅ ` are ** void** attributes.
206+ Void attributes must also have the ` name ` attribute:
204207``` xml
205- <o base =" .times" name =" x" >
206- <o base =" a" />
207- <o base =" b" />
208+ <o name =" foo" >
209+ <o name =" bar" base =" ∅" />
208210</o >
209211```
210212
211- You will get additional attribute ` loc ` added to each ` <o> ` element:
213+ 3 . ** Data literals** found in the source code are presented with nested ` <o/> ` XML elements
214+ that contain text. Only elements with ` base ` attribute equal to ` Q.org.eolang.bytes ` may contain
215+ nested ` <o> ` element with text.
212216
213217``` xml
214- ```xml
215- <o base =" .times" name =" x" loc =" Φ.x" >
216- <o base =" a" loc =" Φ.x.ρ" />
217- <o base =" b" loc =" Φ.x.α0" />
218+ <o base =" Q.org.eolang.bytes" line =" 6" pos =" 8" >
219+ <o >48-65-6C-6C-6F-2C-20-25-73-0A</o >
218220</o >
219221```
220222
221- Locators are absolute and unique coordinates of any object
222- in the entire object "Universe."
223+ 4 . The ` name ` attribute of an ` <o/> ` element may be ** auto-generated** by the EO parser.
224+ In such a case, it looks like this:
225+ ``` xml
226+ <o name =" a🌵104" />
227+ ```
228+
229+ Such ` name ` consists of several parts:
230+ - char ` a ` (ascii 97) that stands for "auto-generated"
231+ - char ` 🌵 ` that is just a pretty character prohibited by EO grammar
232+ - number ` 104 ` which is joined line and position of the place where
233+ the object is found.
234+
235+ Such names are unique throughout the entire XMIR.
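
Recognizing such a name takes one regular expression. The sketch below only extracts the numeric suffix. It does not try to split the suffix back into line and position, since the text above does not say how the two are joined. The names `AUTO_NAME` and `auto_suffix` are hypothetical.

```python
import re

# 'a', then the cactus character, then one or more digits.
AUTO_NAME = re.compile(r"a🌵(\d+)")


def auto_suffix(name: str):
    """Return the numeric suffix of an auto-generated name, or None."""
    match = AUTO_NAME.fullmatch(name)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    print(auto_suffix("a🌵104"))
    print(auto_suffix("foo"))
```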
236+
237+ 5 . If an object is bound to an attribute not by name but by position, the
238+ ` as ` attribute may look like this:
239+ ``` xml
240+ <o base =" Q.org.eolang.number" as =" α2" />
241+ ```
242+ Here the first character is ` α ` (alpha), the number ` 2 ` is the position of the
243+ attribute.
223244
224245<hr />
225246