Generating Mujoco MJCF XSD schema from the parser #3237
julien-blanchon wants to merge 5 commits into google-deepmind:main
Five local refactors to the MJCF XSD generator; all preserve the schema semantics (same validating behaviour, same set of declared types).
1. Move the ~600-line kMjXAttrTable (and the mjXAttr/mjXAttrKind type declarations) into xml_native_reader.cc / xml_native_reader.h, next to MJCF[] and the enum maps. This reverts the earlier `extern const mjMap` sprinkling: the maps go back to internal linkage, and the schema generator consumes kMjXAttrTable only through its declarations. xml_native_schema.cc shrinks from 1653 to 532 lines.
2. Drop the manual kGeomTypesSz / kNDYN / etc. shadow constants and reference the authoritative mjNGEOMTYPES / mjNDYN / ... enum values directly from mjmodel.h / user_composite.h / user_flexcomp.h. No more silent drift.
3. Replace the O(N) linear Lookup() over kMjXAttrTable with a file-static unordered_map<(string_view, string_view), const mjXAttr*>, populated on first call. Schema emission drops from O(N^2) to O(N).
4. Replace the ListTypeName / EmitListType string round-tripping with a ListType struct (kind + size). EmitListType takes the struct directly, with no stoi/find_last_of parsing of type names.
5. Deduplicate <xs:simpleType> enum declarations by mjMap* identity: the first (element, attr) pair that references a given map sets the canonical enum type name, and every subsequent attribute using the same map reuses it. The generated XSD shrinks from 183 enum simpleTypes (many byte-identical copies of bool / interp / enable enums) down to 39 distinct ones.
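Items 3 and 5 can be illustrated with a small language-agnostic sketch, written here in Python rather than the C++ of the actual generator. Everything in it is an assumption made for illustration: the `Attr` class, the four-row `ATTR_TABLE`, the two enum tuples, and the `element_attr_enum` naming scheme are hypothetical stand-ins for mjXAttr, kMjXAttrTable, the mjMap arrays, and whatever type names the generator really emits.

```python
from dataclasses import dataclass
from functools import lru_cache
from typing import Optional

# Hypothetical stand-ins for the parser's enum maps (mjMap arrays in C++).
BOOL_MAP = ("false", "true")
ENABLE_MAP = ("disable", "enable")


@dataclass(frozen=True)
class Attr:
    element: str
    name: str
    enum_map: Optional[tuple] = None  # None for non-enum attributes


# Hypothetical miniature table; the real kMjXAttrTable is far larger.
ATTR_TABLE = [
    Attr("compiler", "autolimits", BOOL_MAP),
    Attr("compiler", "usethread", BOOL_MAP),
    Attr("flag", "gravity", ENABLE_MAP),
    Attr("geom", "size"),
]


@lru_cache(maxsize=None)
def attr_index():
    """Build the (element, attribute) -> entry index once, on first call."""
    return {(a.element, a.name): a for a in ATTR_TABLE}


def lookup(element, name):
    """Constant-time lookup instead of scanning the whole table."""
    return attr_index().get((element, name))


def canonical_enum_names(table):
    """Name each distinct map after the first (element, attr) that uses it."""
    names = {}  # id(map object) -> canonical type name
    for a in table:
        if a.enum_map is not None and id(a.enum_map) not in names:
            names[id(a.enum_map)] = f"{a.element}_{a.name}_enum"
    return names
```

Keying the deduplication on object identity rather than on contents mirrors the mjMap* comparison described above: two attributes share a type only when they point at the same map, so maps that merely happen to list the same words are not merged by accident.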
Drop the plain python3 shebang in favor of `uv run --script`, add the inline script metadata block pinning requires-python >=3.10, mark the file executable, and update the pipeline snippets in CONTRIBUTING.md and xml_native_schema.cc to invoke it directly. With this change the only prerequisite for regenerating doc/mjcf.xsd is uv on PATH — uv provisions a Python interpreter that actually has expat, which the Homebrew python3 builds lack.
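As a rough sketch of what such a script header can look like, the block below combines a `uv run --script` shebang with an inline metadata block that pins requires-python to >=3.10. The exact shebang line and the body are assumptions, not the contents of the script in this PR; the body only exercises expat, the module the commit message says is missing from some interpreters.

```python
#!/usr/bin/env -S uv run --script
# /// script
# requires-python = ">=3.10"
# ///
"""Sketch: confirm that the interpreter can parse XML through expat."""
import xml.parsers.expat


def count_elements(xml_text: str) -> int:
    """Count start tags using the expat parser directly."""
    count = 0

    def on_start(name, attrs):
        nonlocal count
        count += 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)  # True marks the final chunk of input
    return count


if __name__ == "__main__":
    print(count_elements("<mujoco><worldbody><geom/></worldbody></mujoco>"))
```

On an interpreter built without expat, the `import xml.parsers.expat` line fails immediately, which makes this a quick check before running the regeneration pipeline.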
You can find the resulting schema here if you want to test it directly in your IDE: https://raw.githubusercontent.com/julien-blanchon/mujoco/refs/heads/schema-autogen-xsd/doc/mjcf.xsd For VS Code you need https://github.com/redhat-developer/vscode-xml + in the header
This is a useful direction. I can help test the generated schema against a broader set of MJCFs, especially
Broader testing would be greatly appreciated!
Any follow-up on this? @devshahofficial, do not hesitate to test, especially on edge cases and minor MuJoCo features. I've already done some testing on all the ./model MuJoCo models from this repo, and on some of the Menagerie models. About committing the
Adds an auto-generated MJCF XSD, built from the parser tables in `src/xml/` and enriched with defaults and docs from `doc/XMLreference.rst`.
This is intended for IDE/LLM editing support (elements, attributes, enums), not as a replacement for compiler validation.
What’s included
- … (`mj_printSchemaXSD`) based on `MJCF[]` + attribute-type table
- … (`sample/xmlschema.cc`)
- … (`--strict`)

Closes #6. Refs julien-blanchon/mujoco-schema#1.
Regeneration
Test plan
- … (`doc/mjcf.xsd`) validates against XMLSchema (valid schema)
- `./model/**/*.xml` validate against the schema

Decisions
Where to serve `mjcf.xsd`?
Should we store it in the repo at `doc/mjcf.xsd`, or as a GitHub release artifact? Or something else?
Adding CI?
Should we add a lightweight CI job to:
- … `xmlschema`
- … `--strict`
- … (… `google-deepmind/mujoco_menagerie` model too)
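One possible shape for the model-checking part of such a CI job is sketched below. It is an assumption, not part of this PR: `failing_models` walks a directory recursively and returns every XML file a validator rejects. The validator is passed in as a callable so the sketch stays stdlib-only; `well_formed` is a hypothetical stand-in that only checks that a file parses, and does no schema validation.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path
from typing import Callable, List


def well_formed(path: Path) -> bool:
    """Stand-in validator: only checks that the file parses as XML."""
    try:
        ET.parse(path)
        return True
    except ET.ParseError:
        return False


def failing_models(root: Path, is_valid: Callable[[Path], bool]) -> List[Path]:
    """Return every *.xml under root (recursively) that the validator rejects."""
    return sorted(p for p in root.rglob("*.xml") if not is_valid(p))


def demo() -> List[str]:
    """Run the check on a throwaway directory with one good and one bad file."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "sub").mkdir()
        (root / "good.xml").write_text("<mujoco><worldbody/></mujoco>")
        (root / "sub" / "bad.xml").write_text("<mujoco><worldbody></mujoco>")
        return [p.name for p in failing_models(root, well_formed)]
```

In an actual job the callable could wrap a real schema check, for example the `is_valid` method of an `xmlschema.XMLSchema` object loaded from `doc/mjcf.xsd`, and the job would fail whenever the returned list is non-empty.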