[feat] Add {{mustache}} rendering (prompt unification WP-B3) #4393
junaway wants to merge 19 commits
📝 Walkthrough (summary by CodeRabbit)
This PR introduces comprehensive RFC and implementation documentation for the prompt-runtime unification effort, including the overarching problem space (RFC), the WP-B3 mustache rendering workstream specification, research foundation, multi-phase implementation plan, QA strategy, and status tracking, plus updates to the main documentation to cross-link and clarify the rollout behavior.
Pull request overview
Adds the WP-B3 design workspace and updates the prompt-runtime-unification documentation set to describe introducing a mustache template mode (nested-only dotted lookup + brace escaping) as the default for newly created apps/prompt configs, while preserving legacy curly behavior for existing configs.
Changes:
- Added a new WP-B3 documentation workspace (RFC, research notes, plan, QA plan, status tracking).
- Added/updated the parent prompt-runtime-unification RFC to include `mustache` semantics and rollout sequencing.
- Updated the prompt-runtime-unification README index to reference the new WP-B3 workspace and clarify new-app default behavior.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/README.md | Entry point for the WP-B3 workspace and its scope/files. |
| docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/status.md | Tracks current decisions, open questions, next steps, and validation commands. |
| docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/rfc.md | Defines proposed mustache subset semantics, escaping, compatibility, and dependency evaluation plan. |
| docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/research.md | Maps current runtime touchpoints and evaluates Mustache library options. |
| docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/plan.md | Phased implementation plan for adding mustache support and defaults. |
| docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/qa.md | Test strategy covering renderer behavior, compatibility, and call-site adoption. |
| docs/design/prompt-runtime-unification/rfc.md | Parent RFC describing the broader runtime/frontend unification effort and template-format semantics. |
| docs/design/prompt-runtime-unification/README.md | Index updates to reference WP-B3 and refine wording around defaults/curly visibility. |
Comments suppressed due to low confidence (6)
docs/design/prompt-runtime-unification/rfc.md:46
- The referenced implementation paths here look outdated for this repo checkout: `PromptTemplate` lives in `sdks/python/agenta/sdk/utils/types.py`, and the judge helper `_format_with_template` is in `sdks/python/agenta/sdk/engines/running/handlers.py` (there is no `api/sdk/agenta/...` path). Updating these links would keep the RFC actionable for readers.
* Config lives under `parameters.prompt`: `messages`, `template_format`, `input_keys`, and `llm_config`.
* Rendering goes through `PromptTemplate.format(**inputs)` in `api/sdk/agenta/sdk/types.py`, which supports `curly`, `fstring`, and `jinja2`.
* Completion exposes top-level `inputs` keys as variables. Chat exposes the same keys except `messages`, which is appended as typed messages after rendering (not exposed as a template variable).
**LLM-as-a-judge** is close in behavior but uses a separate runtime path.
* Config is a flat evaluator shape: `prompt_template`, `model`, `response_type`, `json_schema`, `correct_answer_key`, `threshold`, `version`, optional `template_format`.
* Renders messages through `_format_with_template` in `api/sdk/agenta/sdk/workflows/handlers.py`. It supports the same three formats as `PromptTemplate.format`; the default depends on evaluator `version` — `fstring` for v2, `curly` for v3+.
docs/design/prompt-runtime-unification/rfc.md:58
- This “Current State” bullet about rendering behavior is no longer accurate in the current code: the judge path renders `json_schema` via `render_json_like(...)` and `_format_with_template` doesn’t ‘return original content with a warning’ for supported formats. Please update these bullets to match the current runtime behavior so the RFC doesn’t contradict the implementation.
* **Provider/model resolution.** Chat and completion use workflow provider settings; the judge manually extracts a fixed provider-key set and therefore cannot reliably use custom or self-hosted models configured in the UI.
* **Rendering.** Each service has different rendering behavior:
* `PromptTemplate.format` raises on Jinja errors; `_format_with_template` returns the original content with a warning.
* Chat and completion recursively render `llm_config.response_format`. The judge builds `response_format` from `response_type` / `json_schema` and does not render variables inside `json_schema`.
docs/design/prompt-runtime-unification/rfc.md:93
- Markdown formatting issue: there are extra `**` characters at the end of this bold sentence, which will render incorrectly. It should likely be a single bold span followed by a period.
The basic rule should be: **native JSON stays native until template rendering****.**
docs/design/prompt-runtime-unification/rfc.md:162
- Several typos/grammar issues in this section reduce clarity: “All services (chat, completion, chat)” repeats chat; “providers settings” should be “provider settings”; and “The should all support …” is missing a subject (likely “They”).
* All services (chat, completion, chat) should resolve providers settings using the same path. As such:
* The should all support custom/self-hosted models configured in the UI
docs/design/prompt-runtime-unification/rfc.md:163
- This bullet has mismatched parentheses and is missing a closing parenthesis after “explicitly set”, which makes it hard to read. Consider rephrasing/splitting the sentence to avoid nested parentheses.
* LLM-as-a-judge must not inject unsupported optional parameters such as `temperature` (the default should be None unless explicitly set (just like we currently do in chat/completion).
docs/design/prompt-runtime-unification/rfc.md:176
- Minor punctuation/spacing issues: there’s an extra space before the semicolon in “acceptable ;” and the sentence ends with “welcome..”. Cleaning this up will improve readability.
* The variables panel (right side of the playground) shows:
* variables discovered from the prompt
* variables available from the current testcase or trace context, labeled with source and type
* The prompt editor provides autocomplete for available variables. A degraded solution with only top-level autocomplete is definitely acceptable ; a full solution with full nested autocompletion is surely welcome..
Actionable comments posted: 4
🧹 Nitpick comments (6)
docs/design/prompt-runtime-unification/rfc.md (2)
199-290: ⚡ Quick win: Clarify status of "JP's notes" sections.
The document contains four "JP's notes" sections (at lines 199-215, 225-232, 245-253, 276-290) that appear to be implementation details or developer notes. These inline notes may not belong in the formal RFC, or should be clearly marked as non-normative implementation guidance to distinguish them from requirements.
Consider either:
- Moving these notes to a separate implementation guide or the WP-B3 planning documents
- Clearly marking them as "Non-normative implementation notes" if they should remain
- Removing them if they've served their purpose during draft review
144-145: ⚡ Quick win: Strengthen caveat about partial Mustache implementation.
The RFC mentions that mustache "do not implement the full mustache spec (no sections or partials)" but this critical limitation is embedded in a long sentence. Given that the WP-B3 RFC (lines 31-48) explicitly lists all unsupported features, users might be surprised if they expect standard Mustache behavior.
Consider adding a prominent callout or note immediately after introducing the `mustache` format to highlight that this is "Mustache-compatible variable substitution" rather than full Mustache. This aligns with the WP-B3 RFC's "Agenta's Mustache-compatible variable substitution mode, not full Mustache" language.
docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/rfc.md (1)
57-58: 💤 Low value: Clarify why chevron is "too old" for SDK dependency.
Line 57 states "Do not use `noahmorrison/chevron` directly. It is too old for a new SDK runtime dependency." While the directive is clear, explaining why it's too old (unmaintained? incompatible? security issues?) would help future maintainers understand the decision.
Consider adding a brief explanation, such as:
Do not use `noahmorrison/chevron` directly. The package is unmaintained (last release 2016) and not suitable for a new SDK runtime dependency.
docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/research.md (1)
79-114: ⚡ Quick win: Consider flagging the dependency decision more prominently.
The evaluation recommends `langchain_core.utils.mustache` (line 98), but the dependency note (line 114) is somewhat buried. Since adding `langchain-core` to the SDK is a significant architectural decision, consider:
- Adding a decision box or callout at the top of the "Library Evaluation" section.
- Specifying concrete acceptance criteria for the Phase 2 spike (package size threshold, import time threshold, transitive dependency count).
- Defining a clear fallback plan if langchain-core is rejected (e.g., "implement local tokenizer using the reference patterns from langchain-core source").
This ensures reviewers and implementers understand the dependency is conditional and has an escape hatch.
docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/plan.md (1)
16-27: ⚡ Quick win: Consider whether Phase 2 timing is optimal.
Phase 2 (library spike) is scheduled after Phase 1 (resolver implementation). This ordering could lead to rework if the langchain_core evaluation reveals that its tokenizer or renderer has incompatible behavior that affects resolver design.
Two options:
- Move Phase 2 before Phase 1 - Evaluate the library first, then design resolvers around the chosen tokenizer.
- Keep current order - Design resolvers independently, then adapt the library integration to match Agenta's resolver contract (this is what the research doc recommends at line 98-106).
The current order is defensible if the resolver semantics are non-negotiable product requirements (which they appear to be). If so, consider adding a note in Phase 2 explicitly stating: "Resolver semantics from Phase 1 are fixed requirements; library integration must adapt to them, not vice versa."
docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/status.md (1)
50-57: ⚡ Quick win: Consider adding tentative recommendations for open questions.
The open questions are well-identified, but some could benefit from tentative recommendations to guide Phase 2 evaluation:
Line 53 (langchain-core acceptability): Add tentative threshold, e.g., "Proceed if package adds <5MB and <10 transitive deps; otherwise implement local tokenizer."
Line 54 (unsupported constructs): Add tentative direction, e.g., "Preferred: raise explicit `UnsupportedConstructError` with helpful message pointing to jinja2 for advanced logic."
Line 55 (`{{.}}` vs `{{$}}`): The note already recommends `{{$}}` and keeping `{{.}}` invalid - consider promoting this to the Decisions section if it's settled.
This would make Phase 2 (library spike) more deterministic without requiring another design review cycle.
📒 Files selected for processing (8)
- docs/design/prompt-runtime-unification/README.md
- docs/design/prompt-runtime-unification/rfc.md
- docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/README.md
- docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/plan.md
- docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/qa.md
- docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/research.md
- docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/rfc.md
- docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/status.md
Addresses three PR #4393 review findings (WPB3-018/019/020):
- chatPrompts.ts: extractVariablesFromText missed mustache/curly/jinja2 tags with inner whitespace ({{ name }}). Mustache treats {{ name }} and {{name}} as equivalent, so the {{ }} patterns now allow optional spaces.
- TokenPlugin.tsx: the default-branch comment overstated coverage by claiming an "fstring fallback"; the {{ }} regexes do not match fstring's {...} placeholders. Comment corrected to state reality.
- types.py: PromptTemplate.template_format defaults to `curly`, but the field description called mustache the default. Reworded so the model default (curly, legacy compat) is distinct from the mustache default that app-creation flows/interfaces set explicitly.

Tests: whitespace token-extraction cases added to chatPromptsMustache.test.ts (via the public extractPromptTemplateContext). entity-ui vitest 13 passed; @agenta/shared + @agenta/ui types:check clean; entity-ui lint clean; ruff format + check clean on types.py.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
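The whitespace fix in the first bullet can be illustrated with a small Python sketch. The real change lives in TypeScript (`chatPrompts.ts`); the pattern and the `extract_variables` helper below are hypothetical stand-ins, and they only cover plain dotted names, not sections or `{{$...}}` tags.

```python
import re
from typing import List

# Hypothetical pattern (not the project's actual regex):
# `{{`, optional whitespace, a dotted name, optional whitespace, `}}`.
VARIABLE_TAG = re.compile(r"\{\{\s*([A-Za-z_][\w.]*)\s*\}\}")


def extract_variables(text: str) -> List[str]:
    """Return variable names in first-seen order, treating {{ name }} like {{name}}."""
    seen: List[str] = []
    for match in VARIABLE_TAG.finditer(text):
        name = match.group(1)
        if name not in seen:
            seen.append(name)
    return seen


# Both spellings of `name` collapse to a single entry.
print(extract_variables("Hi {{ name }}, {{name}} from {{ user.city }}"))
```

Without the two `\s*` groups, `{{ name }}` would be skipped, which is the kind of miss the commit describes.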
Addresses two PR #4393 review findings (WPB3-021/022):
- _mustache-templates.mdx: the value-coercion table claimed dict/list render as compact JSON with "no extra whitespace" (e.g. {"x": 1}). The renderer uses json.dumps(ensure_ascii=False) with default separators, so the real output is {"x": 1, "y": 2} (spaces after : and ,). Reworded the row to match; renderer unchanged (curly-matching behavior is intended).
- Parent RFC + README: the {{$...}} description still used the superseded "pre-rendered as JSONPath ... then the resulting template is rendered" framing, implying JSONPath results are fed back through the engine. WPB3-010 fixed only the wp-b3 doc set; this extends the same correction to the parent docs. Reworded every occurrence (rfc.md / README.md) to shield -> render -> substitute-last (inert data, never re-parsed); also tightened wp-b3 rfc.md.

Verification: render-helper + structured-rendering suites 185 passed (covers all four modes incl. jinja2's shared-JSONPath path); ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
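The separator behavior described in the first bullet is standard `json.dumps` behavior and can be checked directly. The snippet below only exercises the standard library; the variable names are illustrative and nothing here is taken from the renderer itself.

```python
import json

value = {"x": 1, "y": 2}

# Default separators are (", ", ": "), so the output has a space after each
# comma and colon. This matches the corrected documentation row.
default_output = json.dumps(value, ensure_ascii=False)

# Truly compact JSON needs explicit separators, which the commit says the
# renderer does not pass.
compact_output = json.dumps(value, ensure_ascii=False, separators=(",", ":"))

# ensure_ascii=False keeps non-ASCII characters as-is instead of \u escapes.
accented = json.dumps({"name": "José"}, ensure_ascii=False)

print(default_output)
print(compact_output)
print(accented)
```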
Make explicit in summary.md and web-handoff.md that adding mustache touched
the other renderers via a shared {{$...}} JSONPath helper:
- curly: functionally equivalent (output unchanged; now the reference behavior).
- jinja2: refactored onto the shared helper, behavior preserved.
- fstring: untouched.
- error-contract change spans all formats but is only newly observable for
mustache/jinja2: the "Unreplaced variables in <format> template" message now
interpolates the real format instead of the hardcoded "curly". curly wording
is identical to before; fstring never raises this error so the branch is
dormant for it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
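A minimal sketch of the error-contract point, under stated assumptions: only the `Unreplaced variables in <format> template` prefix comes from the commit message; the `unresolved_message` helper and the text after the prefix are hypothetical. The second function shows why the branch is dormant for `fstring`: `str.format` raises `KeyError` on a missing key before any such message could be built.

```python
from typing import Dict, Iterable


def unresolved_message(template_format: str, names: Iterable[str]) -> str:
    """Hypothetical helper: label the error with the real format, not a fixed 'curly'."""
    return f"Unreplaced variables in {template_format} template: {sorted(names)}"


def render_fstring(template: str, context: Dict[str, object]) -> str:
    """fstring mode as described in the PR: plain str.format, no JSONPath."""
    return template.format(**context)


print(unresolved_message("mustache", ["$.user.name"]))

try:
    render_fstring("Hi {name}", {})
except KeyError as exc:
    # A missing key surfaces as KeyError, never as an "Unreplaced variables" error.
    print(f"KeyError: {exc}")
```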
    // Default: {{ }} variable tokens only. Covers "curly" and "mustache" —
    // mustache shares curly's {{name}} delimiters for plain variables, so it
    // tokenizes through this path. (fstring also falls through to here, but its
    // {...} single-brace placeholders are NOT matched by these {{ }} regexes.)
    const full = /\{\{[^{}]*\}\}/
    const input = /\{\{[^{}]*$/
    const exact = /^\{\{[^{}]*\}\}$/
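To make the comment's claim concrete, here are rough Python ports of the three patterns. The regex bodies are copied from the snippet above; the constant names and helper functions are illustrative only and do not exist in `TokenPlugin.tsx`.

```python
import re

# Ports of the `full`, `input`, and `exact` patterns from the TypeScript snippet.
FULL = re.compile(r"\{\{[^{}]*\}\}")    # a complete {{...}} token anywhere in the text
INPUT = re.compile(r"\{\{[^{}]*$")      # an unclosed {{... still being typed at the end
EXACT = re.compile(r"^\{\{[^{}]*\}\}$")  # the whole string is exactly one token


def has_token(text: str) -> bool:
    return FULL.search(text) is not None


def is_typing_token(text: str) -> bool:
    return INPUT.search(text) is not None


def is_exact_token(text: str) -> bool:
    return EXACT.match(text) is not None


print(has_token("Hello {{name}}"))   # double braces are matched
print(has_token("Hello {name}"))     # fstring-style single braces are not
print(is_typing_token("Hello {{na"))
print(is_exact_token("{{ name }}"))
```

Because `[^{}]*` accepts spaces, `{{ name }}` and `{{name}}` are tokenized the same way, while a single-brace `{name}` never matches.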
    except Exception as exc:
        raise MustacheTemplateError(
            f"Mustache template error in content: '{template}'. Error: {exc}"
        ) from exc
Actionable comments posted: 3
⛔ Files ignored due to path filters (1)
- `api/uv.lock` is excluded by `!**/*.lock`
📒 Files selected for processing (14)
- api/oss/src/resources/evaluators/evaluators.py
- docs/design/prompt-runtime-unification/README.md
- docs/design/prompt-runtime-unification/rfc.md
- docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/README.md
- docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/escape-analysis.md
- docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/findings.md
- docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/plan.md
- docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/qa.md
- docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/research.md
- docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/rfc.md
- docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/status.md
- docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/summary.md
- docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/web-handoff.md
- docs/docs/prompt-engineering/integrating-prompts/_mustache-templates.mdx
✅ Files skipped from review due to trivial changes (2)
- docs/design/prompt-runtime-unification/wp-b3-mustache-rendering/research.md
- docs/docs/prompt-engineering/integrating-prompts/_mustache-templates.mdx
Update (2026-05-22): all findings (WPB3-001..017) are fixed and Closed; there are no open findings, and all PR #4393 review threads are resolved. WPB3-014 (escape behavior) was closed via Option 3 — document the per-format reality now (delimiter swap / `{% raw %}` / none for curly; no backslash escape), defer a `\{{` escape pending real demand; full evidence and the tested `\{{` vs `\{\{` result are in `escape-analysis.md`, and the user-facing how-to gained an "Escaping" section. WPB3-017 (frontend `extractTemplateVariables` JSDoc omitted mustache) was fixed. 270 across the four focused suites pass; ruff clean. GitHub: 14 solved-by-content threads resolved (the WPB3-015 RFC cluster `3280747520`/`3280759652`/`3280767036`/`3280770190`/`3280772919`/`3280776786`/`3280781711`/`3280782719`/`3280579226`, the WPB3-016 how-to `3280800719`, and the scope/PR-title threads `3280751193`/`3280761180`/`3280794168`/`3281567723`); the only 3 left unresolved are the escape threads (`3280753760`, `3280788530`, `3280579221`) mapped to the open WPB3-014.

Finding lineage: WPB3-001..004 from the first scan; WPB3-005..007 from the 2026-05-22 re-scan; WPB3-008..011 (doc-only prose fixes — docstring/qa/pre-render-framing/`+++` markers); WPB3-012..013 (P2 cross-format error-contract bugs, fixed with tests); WPB3-014 (escape, OPEN); WPB3-015 (RFC library/deviation/requirement/security consolidation, `langchain_core` recorded as considered-and-rejected); WPB3-016 (draft `_mustache-templates.mdx` how-to). WPB3-008..016 are all from the PR #4393 sync.
Resolve contradictory status statements in the findings summary.
This section says all findings are closed and all review threads are resolved, but also says three threads remain unresolved and mapped to WPB3-014. Please align this status block with the final WPB3-014 state in the document.
- `cd sdks/python && uv run ruff format` + `uv run ruff check` — clean.
- `cd sdks/python && uv run pytest oss/tests/pytest/unit -q` — green (mustache coverage across JSONPath resolution, sections, value coercion, partial/empty/JSON-Pointer/NUL rejection, cross-format `{{$...}}` parity, `PromptTemplate`, and LLM-as-a-judge).
Use the documented lint command with `--fix` for SDK/API change flows.
The validation section says `uv run ruff check` but the repo guideline requires `uv run ruff check --fix` after formatting. Please align the summary command to avoid drift between docs and expected workflow.
Suggested doc fix
-- `cd sdks/python && uv run ruff format` + `uv run ruff check` — clean.
+- `cd sdks/python && uv run ruff format` + `uv run ruff check --fix` — clean.
As per coding guidelines: "Run ruff format and ruff check --fix within the SDK or API folder after making changes to the API or SDK (or from repo root: ruff format then ruff check)".
| | Before | After |
|---|---|---|
| Default for new chat / completion / LLM apps | `curly` | `mustache` (all three `*_v0` interfaces) |
| Picker options offered | curly, fstring, jinja2 | **mustache, jinja2** (legacy formats hidden unless already selected) |
| LLM-as-a-judge evaluator | version `4`, no explicit format | version `5`, `template_format: "mustache"` |
| Engine (Python) | `mystace>=1,<2` added | |
Fix the Engine (Python) before/after table row.
The row currently places `mystace>=1,<2` under Before and leaves After blank, which reads as the opposite of the intended change.
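The LLM-as-a-judge row can be read together with the version defaults stated in the PR description (v2 uses `fstring`, v3/v4 use `curly`, v5 uses `mustache`, and an explicit `template_format` always wins). The sketch below is one way to express that rule; the function name, the integer version type, and the error raised for unlisted versions are assumptions, not the actual `handlers.py` code.

```python
from typing import Dict, Optional

# Per-version defaults as stated in the PR description. Versions not listed
# there are deliberately left out rather than guessed.
VERSION_DEFAULTS: Dict[int, str] = {
    2: "fstring",
    3: "curly",
    4: "curly",
    5: "mustache",
}


def resolve_judge_template_format(version: int, explicit: Optional[str] = None) -> str:
    """Hypothetical resolver: an explicit template_format beats the version default."""
    if explicit is not None:
        return explicit
    if version not in VERSION_DEFAULTS:
        # Assumption: unknown versions are rejected in this sketch.
        raise ValueError(f"no default template_format known for version {version}")
    return VERSION_DEFAULTS[version]


print(resolve_judge_template_format(4))            # old revision keeps curly
print(resolve_judge_template_format(5))            # new judges default to mustache
print(resolve_judge_template_format(5, "jinja2"))  # explicit choice wins
```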
Summary
Implements WP-B3 of the prompt runtime unification RFC: adds `mustache` as the fourth prompt `template_format` and makes it the default for newly created apps, prompt configs, and LLM-as-a-judge evaluators. It builds on the low-level renderer from WP-B1 and the structured renderer from WP-B2.

`mustache` is real Mustache (via the `mystace` engine) plus the one Agenta extension every format already carries: tags that start with `{{$` are resolved as JSONPath against the render context. Existing `curly`, `fstring`, and `jinja2` prompts are untouched: old apps keep their declared format, and only new creation paths write `mustache`.

This is primarily a backend/SDK package. The frontend changes are the minimal type-and-picker surface needed to load, preserve, and select `mustache`; the larger playground/native-JSON work stays in the frontend follow-up packages (WP-F2/F3).

What's in it

SDK rendering (`sdks/python/agenta/sdk/utils/`)
- `templating.py`: new `_render_mustache(...)`; `TemplateMode` widened to include `"mustache"`. Rendering follows the same shield-and-substitute model the other formats use: `{{$...}}` JSONPath tags are shielded from the engine, the rest is rendered by `mystace`, and the resolved JSONPath values are substituted into the output last, as inert text that is never re-parsed. Partials (`{{>...}}`), empty placeholders, JSON-Pointer tags, NUL bytes, and engine parse errors fail clearly.
- `types.py`: `PromptTemplate` accepts `mustache` and keeps its public `TemplateFormatError` surface for chat/completion callers.
- `rendering.py`: type-widening only; `render_messages(...)` / `render_json_like(...)` work unchanged once the mode is accepted.

Effect on the other renderers (`curly` / `jinja2` / `fstring`)
Adding `mustache` was done by extracting one shared `{{$...}}` JSONPath helper (`_render_with_jsonpath`) rather than a mustache-only path, so the other formats are touched to varying degrees:
- `curly`: functionally equivalent. Its output is unchanged: it already resolved `{{$...}}` as inert data, and `resolvers.py` has zero diff. It is now the reference behavior the other two `{{ }}` formats match, rather than a special case.
- `jinja2`: refactored onto the shared helper, behavior preserved. `_render_jinja2` no longer renders directly; it routes through `_render_with_jsonpath`, so `{{$...}}` is shielded from Jinja, the engine runs, and resolved values are substituted last as inert data (`{% raw %}` / `{# #}` spans are skipped and left to Jinja). Same rendered output, now sharing curly's JSONPath contract.
- `fstring`: untouched. Still `template.format(**context)`; no JSONPath, no change.
- Error contract: the `TemplateFormatError` message for unresolved variables now interpolates the actual `template_format` instead of the hardcoded literal `"curly"` (`types.py`, both the chat/completion and structured paths). For `curly` the wording is identical to before ("…in curly template…"). What changed is that, after the JSONPath unification, an unresolved `{{$...}}` tag can now raise `UnresolvedVariablesError` from mustache and jinja2 too, so the interpolation is what keeps their error message correctly labeled (previously they would have been mislabeled "curly"). `fstring` never raises this error (it uses `str.format`, surfacing `KeyError`), so for fstring the branch is dormant: the change applies to it in principle but is not currently triggerable.

Engine config (`sdks/python/agenta/sdk/engines/running/`)
- `interfaces.py`: the mustache default lands here for all three workflow types:
  - `llm_v0_interface`: the `template_format` schema scalar widens its enum to `["mustache", "curly", "fstring", "jinja2"]` and flips `default` from `curly` to `mustache` (this is what new LLM/completion apps inherit, and the dropdown default).
  - `chat_v0_interface` and `completion_v0_interface`: built-in default config flips `"template_format"` from `curly` to `mustache`.
- `handlers.py`: `auto_ai_critique_v0` learns a v5 default of `mustache` (v2 → `fstring`, v3/v4 → `curly` unchanged). An explicit `template_format` always wins over the version default; old judge revisions keep their original behavior.
- `builtin.py`: the built-in `auto_ai_critique` template bumps to version `5` / `template_format="mustache"`.

Backend resource (`api/oss/src/resources/evaluators/evaluators.py`)
- LLM-as-a-judge evaluators are now at version `5` and carry an explicit hidden `template_format: "mustache"` field, so newly created judges render with mustache.

Error contract
- `MustacheTemplateError`: unsupported partial, empty placeholder, JSON-Pointer tag, NUL byte, or `mystace` parse error.
- `UnresolvedVariablesError`: an unresolved `curly` placeholder or a failed `{{$...}}` JSONPath tag, in any of `mustache` / `jinja2` / `curly` (cross-format parity).
- `TemplateFormatError`: the public `PromptTemplate` surface, preserved.

Frontend (type + picker surface only)
- `template_format` unions widened to include `"mustache"` across the editor token plugin, chat-message components, prompt schema control, and the shared chat-prompt extractor. `mustache` shares curly's `{{name}}` extraction/highlighting path.
- `templateFormatOptions.ts`: the picker now offers only `mustache` and `jinja2` to new prompts. `curly` / `fstring` are legacy: hidden from the picker, but a prompt that already stores one keeps it visible and selectable (no silent coercion). Restores hiding that had regressed; pinned by a unit test.
- `resolveTemplateFormat(...)` is reused in the workflow molecule so `mustache` is preserved instead of coerced.

Docs
- `rfc.md`: dependency choice (`mystace` vs `chevron`, with `langchain_core` considered and rejected), the three intentional Mustache deviations, the JSONPath compatibility requirement, and the security note (narrow context, never-re-parse).
- `_mustache-templates.mdx`: draft how-to (variables, sections, `{{$...}}`, value coercion, what's unsupported, and escaping literal `{{ }}`).
- `escape-analysis.md`: standalone analysis of the escape question raised in review: no backslash escape exists in `mystace` or `langchain_core` / `chevron`; the canonical literal-brace mechanism is the Mustache delimiter swap (and `{% raw %}` for jinja2). Decision: document now, defer a backslash escape unless real demand appears for literal `{{` in curly.
- `findings.md`, `research.md`, `plan.md`, `qa.md`, `status.md`, `README.md`: design workspace and review-findings record.

Compatibility
- Existing `curly` / `fstring` / `jinja2` behavior is unchanged.
- Only new creation paths write `mustache`. Old judge revisions keep their per-version default.

Validation
- `cd sdks/python && uv run ruff format` + `uv run ruff check`: clean.
- `cd sdks/python && uv run pytest oss/tests/pytest/unit -q`: green (mustache coverage across JSONPath resolution, sections, value coercion, partial/empty/JSON-Pointer/NUL rejection, cross-format `{{$...}}` parity, `PromptTemplate`, and LLM-as-a-judge).
- `pnpm --filter @agenta/entity-ui test`: picker and mustache-extraction regression pins pass.
- `pnpm lint-fix` + `types:check` on the touched web packages: clean.
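The shield → render → substitute-last model described under "SDK rendering" can be sketched in a few lines. This is an illustration under stated assumptions, not the PR's `_render_with_jsonpath`: `toy_engine` stands in for `mystace`/Jinja, `resolve_path` handles only `$.a.b` dotted paths rather than real JSONPath, and the NUL-delimited placeholder is a guess at why NUL bytes are rejected in templates.

```python
import json
import re
from typing import Any, Callable, Dict, List

JSONPATH_TAG = re.compile(r"\{\{\s*(\$[^{}]*?)\s*\}\}")
PLAIN_TAG = re.compile(r"\{\{\s*([A-Za-z_]\w*)\s*\}\}")
PLACEHOLDER = re.compile(r"\x00(\d+)\x00")


def resolve_path(path: str, context: Dict[str, Any]) -> Any:
    """Stand-in resolver: only `$.a.b` dotted lookups, not full JSONPath."""
    value: Any = context
    for key in path.lstrip("$").strip(".").split("."):
        if key:
            value = value[key]
    return value


def toy_engine(template: str, context: Dict[str, Any]) -> str:
    """Stand-in for the real engine: plain {{name}} substitution only."""
    return PLAIN_TAG.sub(lambda m: str(context[m.group(1)]), template)


def render_with_jsonpath(
    template: str,
    context: Dict[str, Any],
    engine: Callable[[str, Dict[str, Any]], str] = toy_engine,
) -> str:
    if "\x00" in template:
        # Assumption: NUL is reserved for placeholders, so reject it up front.
        raise ValueError("NUL bytes are not allowed in templates")
    resolved: List[str] = []

    def shield(match: "re.Match[str]") -> str:
        value = resolve_path(match.group(1), context)
        text = value if isinstance(value, str) else json.dumps(value, ensure_ascii=False)
        resolved.append(text)
        return f"\x00{len(resolved) - 1}\x00"

    shielded = JSONPATH_TAG.sub(shield, template)  # 1. hide {{$...}} from the engine
    rendered = engine(shielded, context)           # 2. engine renders everything else
    # 3. put resolved values back last, as inert text that is never re-parsed
    return PLACEHOLDER.sub(lambda m: resolved[int(m.group(1))], rendered)


context = {"name": "Ada", "payload": {"note": "{{name}}"}}
print(render_with_jsonpath("Hi {{name}}: {{$.payload.note}}", context))
```

The example context is chosen so that the JSONPath result itself looks like a tag. Because substitution happens after the engine has run, the output keeps the literal `{{name}}` from the data instead of expanding it a second time.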