Add Response Model Support for Model Adapters (#49)
Merged
## Description
MASEval's simulators (tool, user, agentic user) relied on hand-rolled JSON extraction and manual retry loops to coerce LLM outputs into structured formats. This was fragile: each simulator reimplemented the same parse-validate-retry logic, and failures from malformed JSON were a recurring source of flaky benchmark runs.
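For context, the pattern being removed looked roughly like the following. This is a hypothetical sketch, not the actual MASEval code; the function and parameter names are illustrative:

```python
import json

def call_with_retries(llm, prompt, validate, max_retries=3):
    """Sketch of the pre-PR pattern: call the LLM, scrape a JSON object
    out of the raw reply text, validate it, and retry on any failure."""
    last_err = None
    for _ in range(max_retries):
        raw = llm(prompt)
        # Hand-rolled extraction: hope the reply contains a JSON object.
        start, end = raw.find("{"), raw.rfind("}")
        if start == -1 or end <= start:
            last_err = ValueError("no JSON object found in reply")
            continue
        try:
            data = json.loads(raw[start:end + 1])
            validate(data)  # schema check; raises ValueError on mismatch
            return data
        except ValueError as err:  # includes json.JSONDecodeError
            last_err = err         # malformed or invalid -> try again
    raise RuntimeError(f"LLM never produced valid JSON: {last_err}")
```

Every simulator carried a variant of this loop; the PR collapses them into a single validated `chat()` call.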
This PR introduces the instructor library as a core dependency and threads its `response_model` support through the entire model adapter stack. Key changes:
- `ModelAdapter.chat()` gains a `response_model` parameter. Pass any Pydantic `BaseModel` class and get a validated instance back in `ChatResponse.structured_response`. All four provider adapters (OpenAI, Anthropic, Google GenAI, LiteLLM) implement `_structured_chat()` using instructor-patched clients.
- Simulators are simplified. `LLMSimulator.__call__` now delegates structured output handling to instructor via the adapter's `response_model` support, replacing ~150 lines of manual JSON parsing and retry logic with a single `chat()` call. Each simulator declares its expected schema as a class-level `_response_model` attribute.
- New `maseval.core.instructor` module provides `create_instructor_client()` for wrapping provider SDK clients, and `flatten_model_schema()` for generating provider-compatible JSON schemas from Pydantic models (used by Tau2's tool parameter generation, replacing its own `_flatten_schema()`).

## Type of Change
## Checklist

### Contribution

### Documentation

- `docs/` (if applicable)

### Changelog

- `CHANGELOG.md` under `[Unreleased]` section.
  Example: `- Support for multi-agent tracing (PR: #123)`

### Architecture (if applicable)

- `maseval/core/` do NOT import from `maseval/interface/`

### Additional Notes
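To illustrate the `response_model` contract the adapters now expose, here is a minimal, self-contained sketch. All names are stand-ins: a plain dataclass substitutes for the Pydantic `BaseModel`, and a fake adapter substitutes for the real provider adapters, which route through `_structured_chat()` and instructor-patched clients rather than parsing JSON directly:

```python
import json
from dataclasses import dataclass
from typing import Any, Optional, Type

@dataclass
class ChatResponse:
    """Stand-in for MASEval's chat response carrying the validated object."""
    text: str
    structured_response: Optional[Any] = None

class FakeAdapter:
    """Sketch of the new ModelAdapter.chat(response_model=...) contract."""

    def __init__(self, canned_reply: str):
        self.canned_reply = canned_reply

    def chat(self, messages, response_model: Optional[Type] = None) -> ChatResponse:
        if response_model is None:
            return ChatResponse(text=self.canned_reply)
        # Real adapters delegate this step to an instructor-patched client;
        # here we just parse the canned reply and instantiate the schema.
        data = json.loads(self.canned_reply)
        return ChatResponse(
            text=self.canned_reply,
            structured_response=response_model(**data),
        )

@dataclass
class UserReply:
    """Example of a simulator's class-level _response_model schema."""
    message: str
    done: bool

adapter = FakeAdapter('{"message": "thanks, bye", "done": true}')
resp = adapter.chat(
    [{"role": "user", "content": "hi"}],
    response_model=UserReply,
)
```

Callers get a typed object back in `structured_response` instead of re-parsing raw text, which is what lets each simulator shrink to a single `chat()` call.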