feat(intrinsics): introduce Adapter/Identity/IOContract/WeightsBinding scaffolding (Epic #929 Phase 0) by planetf1 · Pull Request #1158 · generative-computing/mellea

planetf1 · 2026-05-26T13:54:46Z

Summary

Closes #1134 — Epic #929 Phase 0, Wave 1.

Why

Today's adapter hierarchy in Mellea is Adapter → LocalHFAdapter → IntrinsicAdapter → EmbeddedIntrinsicAdapter / CustomIntrinsicAdapter. The class that a backend gets handed encodes where the weights live (locally, embedded in the model artefact, server-mediated), so every caller that wants to load, activate, or unload an adapter ends up doing isinstance branching to figure out which storage reality applies. Seven recent fix-up commits trace directly to that structure, and adding new realities (e.g. the work blocked by #1018) means new branches in every caller.

The epic's resolution is to break the adapter into three independent concerns — identity (name, type, role), I/O contract (how to build the prompt and parse the output), and weights binding (how the bytes get loaded). Each lives behind its own type, and the storage-reality variation moves from the class hierarchy into a pluggable WeightsBinding. Callers stop branching; new realities become a new binding subclass.

This PR does only the type scaffolding — the new dataclasses and ABCs, plus three NotImplementedError stubs for the three storage realities. No existing class is modified, no existing caller is migrated. That keeps this PR small and reviewable; the migration is the next phase's work.

Where this fits in Epic #929

This PR is Phase 0, Wave 1 of Epic #929 — paired with #1135 (PR #1157, catalogue revision pinning), which lands in parallel. See #929 for the full phase plan.

Deferred from this PR (do not flag as missing):

Caller migration — issue 1.A, blocked on this PR. Existing IntrinsicAdapter callers are not yet rewritten to construct the new Adapter shape.
Real binding implementations — Phase 2. The three WeightsBinding subclasses raise NotImplementedError here; download/activate/deactivate/release land later.
Shim removal — Phase 4. The legacy Adapter ABC in adapter.py and the Intrinsic → AdapterBasedComponent alias both stay until all callers are migrated.

Reviewers should expect scaffolding only: ABCs that raise on instantiation, three subclasses that raise NotImplementedError, and one alias module that re-exports Intrinsic under its new name. Anything that uses this scaffolding to actually load weights is deliberately out of scope.

What changed

File	Contents
`mellea/backends/adapters/_core.py`	New: `Adapter` dataclass; `Identity` (frozen, role-warning); `IOContract` ABC (`build_prompt`, `parse`); `WeightsBinding` ABC (`prepare`, `activate`, `deactivate`, `release`); `LocalFileBinding`, `EmbeddedBinding`, `ServerMediatedBinding` stubs; `AdapterSchemaMismatchError` (picklable).
`mellea/backends/adapters/roles.py`	New: `KNOWN_ROLES` advisory frozenset, derived from `_INTRINSICS_CATALOG_ENTRIES` so it cannot drift.
`mellea/stdlib/components/adapter_based_component/__init__.py`	New: re-exports `Intrinsic` as `AdapterBasedComponent`. The chosen name; the old import path remains valid.
`mellea/backends/adapters/__init__.py`	Updated: re-exports all new public names.
`test/backends/test_adapters/test_core_types.py`	New: 23 unit tests covering construction, ABC enforcement, frozen / hashable behaviour, stub `NotImplementedError`, exception format and pickle round-trip, role-warning behaviour.
`test/stdlib/components/test_adapter_based_component.py`	New: identity test (`AdapterBasedComponent is Intrinsic`) and old-import-path test.

Naming-collision resolution

The issue called out a potential collision between the existing Adapter ABC (mellea/backends/adapters/adapter.py:24) and the new Adapter dataclass.

In practice there is none: the existing ABC was never exported from mellea/backends/adapters/__init__.py — only AdapterMixin, AdapterType, IntrinsicAdapter, LocalHFAdapter, fetch_intrinsic_metadata, and get_adapter_for_intrinsic were. The new Adapter dataclass lives in _core.py and is the first thing to be exported under that name on the public surface. Both classes coexist inside the package until shim removal in Phase 4; no caller's import resolves differently.

Deviations from the literal issue spec

The review pass surfaced a few quality concerns. I tightened the spec rather than ship the looser version:

Identity and Adapter are @dataclass(frozen=True) — the issue just said "dataclass". Frozen makes both hashable (so they can key into adapter caches) and stops __post_init__ validation being bypassed by post-construction assignment. Phase 0 is the cheapest moment to lock this down.
IOContract.parse -> dict[str, object] — the issue said -> dict. Bare dict is dict[Any, Any] and violates AGENTS.md §5 strict-typing on a public ABC.
IOContract.build_prompt(**kwargs: object) — the issue used (...). Same typing reason.
Stub NotImplementedError raises now name the class and point to Phase 2 ("LocalFileBinding is a Phase 0 stub; implementation lands in Epic #929 Phase 2.") so a confused user gets a useful traceback.
The adapter_based_component docstring asserts AdapterBasedComponent as the chosen name rather than hedging about a pending IBM rename. The "placeholder" framing in feat(intrinsics): introduce Adapter/Identity/IOContract/WeightsBinding scaffolding (Epic #929 Phase 0) #1134 is design-tracker language only, not real ambiguity; if IBM eventually picks a different upstream term, the rename is scheduled in Phase 4. Issue feat(intrinsics): introduce Adapter/Identity/IOContract/WeightsBinding scaffolding (Epic #929 Phase 0) #1134 has a correction note at the top documenting this.

Acceptance criteria

Testing

uv run pytest test/backends/test_adapters/test_core_types.py test/stdlib/components/test_adapter_based_component.py -v

25 tests pass. The full non-qualitative suite (uv run pytest -m "not qualitative") shows no regressions caused by this PR — the only failures are the pre-existing Ollama-connectivity tests that need a running server.

Attribution

AI coding assistants used: Claude Code

…ng (generative-computing#1134) Fixes flagged by the multi-reviewer pass on PR generative-computing#1158: - `AdapterSchemaMismatchError` now passes structured fields (not the formatted message) to `Exception.__init__`, so `pickle.dumps(err)` round-trips through worker / process boundaries. `__str__` is overridden to keep the existing user-visible message format. Adds a `pickle` round-trip test. - `IOContract.parse` now returns `dict[str, object]` (was bare `dict`) and `IOContract.build_prompt` typed as `**kwargs: object` to satisfy the AGENTS.md §5 strict-typing rule on the public ABC. - `Identity` and `Adapter` are now `@dataclass(frozen=True)` so the `__post_init__` validation cannot be bypassed by post-construction assignment, and both are hashable (usable as dict keys / set members). New tests cover the frozen and hashable behaviour. - `WeightsBinding` docstring now documents the lifecycle as an informal state machine (prepare → activate → deactivate → release) so Phase 2 implementations share a contract. - Stub bindings (`LocalFileBinding`, `EmbeddedBinding`, `ServerMediatedBinding`) now raise `NotImplementedError` with a message pointing to Epic generative-computing#929 Phase 2. - `KNOWN_ROLES` is now derived from `_INTRINSICS_CATALOG_ENTRIES` rather than hand-copied, eliminating the silent drift risk. - `Identity.__post_init__` carries an inline comment explaining why the runtime check is needed alongside the `Literal` annotation. - The placeholder docstring on `adapter_based_component/__init__.py` no longer hedges — the module asserts `AdapterBasedComponent` as the chosen name. - `test_stub_binding_subclasses_raise_not_implemented` parametrises over verbs for clearer per-verb failure attribution. - `test_identity_known_role_no_warning` no longer uses `simplefilter("error")`, which would have failed on any unrelated `DeprecationWarning` from imports. All 25 unit tests pass; ruff and mypy clean on the new files. Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>

…g scaffolding (generative-computing#1134) Implements Epic generative-computing#929 Phase 0 (Wave 1): - `mellea/backends/adapters/_core.py`: new composable `Adapter` dataclass, `Identity`, `IOContract` ABC, `WeightsBinding` ABC, three stub bindings (`LocalFileBinding`, `EmbeddedBinding`, `ServerMediatedBinding`), and `AdapterSchemaMismatchError` exception. - `mellea/backends/adapters/roles.py`: `KNOWN_ROLES` advisory registry (frozenset of known role strings seeded from the intrinsics catalog). - `mellea/stdlib/components/adapter_based_component/__init__.py`: placeholder module re-exporting `Intrinsic` as `AdapterBasedComponent`; old import path remains valid. - `mellea/backends/adapters/__init__.py`: all new public names exported. Naming-collision resolution: the existing `Adapter` ABC in `adapter.py` was already absent from `__init__.py__'s public surface. The new `Adapter` dataclass is introduced in `_core.py` and re-exported, so no namespace collision exists on the public API. Both coexist until shim removal in 4.1. Closes generative-computing#1134 Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>

…ng (generative-computing#1134) Fixes flagged by the multi-reviewer pass on PR generative-computing#1158: - `AdapterSchemaMismatchError` now passes structured fields (not the formatted message) to `Exception.__init__`, so `pickle.dumps(err)` round-trips through worker / process boundaries. `__str__` is overridden to keep the existing user-visible message format. Adds a `pickle` round-trip test. - `IOContract.parse` now returns `dict[str, object]` (was bare `dict`) and `IOContract.build_prompt` typed as `**kwargs: object` to satisfy the AGENTS.md §5 strict-typing rule on the public ABC. - `Identity` and `Adapter` are now `@dataclass(frozen=True)` so the `__post_init__` validation cannot be bypassed by post-construction assignment, and both are hashable (usable as dict keys / set members). New tests cover the frozen and hashable behaviour. - `WeightsBinding` docstring now documents the lifecycle as an informal state machine (prepare → activate → deactivate → release) so Phase 2 implementations share a contract. - Stub bindings (`LocalFileBinding`, `EmbeddedBinding`, `ServerMediatedBinding`) now raise `NotImplementedError` with a message pointing to Epic generative-computing#929 Phase 2. - `KNOWN_ROLES` is now derived from `_INTRINSICS_CATALOG_ENTRIES` rather than hand-copied, eliminating the silent drift risk. - `Identity.__post_init__` carries an inline comment explaining why the runtime check is needed alongside the `Literal` annotation. - The placeholder docstring on `adapter_based_component/__init__.py` no longer hedges — the module asserts `AdapterBasedComponent` as the chosen name. - `test_stub_binding_subclasses_raise_not_implemented` parametrises over verbs for clearer per-verb failure attribution. - `test_identity_known_role_no_warning` no longer uses `simplefilter("error")`, which would have failed on any unrelated `DeprecationWarning` from imports. All 25 unit tests pass; ruff and mypy clean on the new files. Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>

jakelorocco

@planetf1, I also think that we should just have one big feature branch in generative-computing/mellea that these are merged into. I think it's super helpful to be able to review individual PRs that correspond to phases, but I don't think we want unfinished versions of these things in the code base.

jakelorocco · 2026-05-27T15:26:44Z

+    def build_prompt(self, **kwargs: object) -> Component:
+        """Build the prompt component for this adapter.
+
+        Args:
+            **kwargs: Adapter-specific keyword arguments (e.g. ``documents=...``,
+                ``requirement=...``). Concrete subclasses define the keys they
+                accept.
+
+        Returns:
+            Component: The constructed prompt component.
+        """
+        ...


I think this is actually the wrong signature. When processing an intrinsic, we take in an Intrinsic + Context. Then, we convert the context into a series of chat messages. Finally, we utilize the intrinsic's name / role to transform those chat messages into a modified series of chat messages.

OpenAI => pass these messages directly to the sdk like we usually would

HuggingFace => utilize an intrinsics specific function that formats the messages into a format the transformers sdk can use

I think this raises two problems:

we need to be able to support this current flow (ie change messages and add kwargs)

we need to be able to process different types of adapters; meaning, we currently only support utilizing adapters through this granite-common / intrinsic formatters code path. Our design should be able to accommodate adapters that don't need this.

I think the IO Contract is the correct place for this information, ie how to process the input and output. However, I struggle to see how the backends can support this with the current structure. This may not actually be an issue though.

You're right that build_prompt(**kwargs) → Component doesn't capture the full flow — the real input is a Context, not loose kwargs. This is a deliberate Phase 0 deferral (same principle as the earlier discussion): establish the right abstraction point now, pin the signature in #1136 when the call_intrinsic rewrite gives us the implementation context to get it right.

Direction for #1136: build_prompt(context: ChatContext, **kwargs) -> Component, with backend-specific formatting (the in-tree granite_formatters / IntrinsicsRewriter for HF, pass-through for OpenAI) staying in the backend layer — that's what keeps an adapter portable.

The second point — adapters that skip intrinsic formatters entirely — falls out of the same separation naturally: IOContract declares what the adapter needs, the backend decides how to format it.

(Edit: replaced an earlier reference to "granite-common" — the actual library mellea uses internally is the in-tree granite_formatters module.)

Okay, I think I'm in agreement on this. I believe we may need to add some additional complexity to the backend to handle how different adapters actually "get generated from" but it seems like this design approach can accommodate that.

jakelorocco · 2026-05-27T15:31:08Z

+    def prepare(self) -> None:
+        raise NotImplementedError(
+            _PHASE_2_NOT_IMPLEMENTED.format(cls="EmbeddedBinding")
+        )
+
+    def activate(self) -> None:
+        raise NotImplementedError(
+            _PHASE_2_NOT_IMPLEMENTED.format(cls="EmbeddedBinding")
+        )
+
+    def deactivate(self) -> None:
+        raise NotImplementedError(
+            _PHASE_2_NOT_IMPLEMENTED.format(cls="EmbeddedBinding")
+        )
+
+    def release(self) -> None:
+        raise NotImplementedError(
+            _PHASE_2_NOT_IMPLEMENTED.format(cls="EmbeddedBinding")
+        )


I believe I mentioned this in the design document, but I'd like more clarification on how this works for EmbeddedBindings. It makes sense to me that prepare / release will grab / delete just the io.yaml file for these embedded adapters.

For activate / deactivate, it seems like they will just be no-ops. But how then do we indicate that to utilize the embedded adapter we need to pass along a special chat template kwarg? Would that live in the IOContract? If so, then the IOContract and EmbeddedBinding are linked in some way and not quite as mix-and-matchable as it originally seems. Is that expected?

Good question — activate()/deactivate() for EmbeddedBinding are indeed no-ops. The chat template kwarg (intrinsic_name injected into chat_template_kwargs) is already handled by the backend, not by IOContract. Today the backend dispatches on adapter shape (isinstance(adapter, EmbeddedIntrinsicAdapter) in openai.py); in the new architecture that maps to the WeightsBinding type — i.e. EmbeddedBinding — and the backend stays the glue between IOContract and WeightsBinding, so the two don't need to know about each other.

It's the same separation we discussed above for build_prompt: IOContract declares what the adapter needs, the backend decides how to format it. Mix-and-matchability holds at the contract level.

(Edit: corrected an earlier claim that the dispatch keys off Identity.adapter_type == "alora" — it's actually keyed off the embedded-adapter shape, which maps to EmbeddedBinding in the new architecture. Both LoRA and aLoRA can be embedded.)

Okay, so when a new adapter type is introduced, all backends (that support adapters or that specific type of adapter) need to get updated to handle that new adapter? Is that the contract / process we are buying into here?

jakelorocco · 2026-05-27T15:34:17Z

+"""Advisory registry of known adapter roles.
+
+:data:`KNOWN_ROLES` is a frozenset of role strings derived from the intrinsics
+catalog. It is advisory only: callers are warned (not rejected) when a role
+outside this set is used, so that custom adapters and pre-release intrinsics
+are not blocked.
+
+Deriving from the catalog (rather than hand-copying) keeps the two registries
+in sync automatically — adding a new entry to ``catalog.py`` automatically
+registers it as a known role.
+"""
+
+from .catalog import _INTRINSICS_CATALOG_ENTRIES
+
+KNOWN_ROLES: frozenset[str] = frozenset(e.name for e in _INTRINSICS_CATALOG_ENTRIES)


I think it's worth explicitly mentioning then that the names of the intrinsic catalog entrees are then the defacto standard for the names of the roles.

I also think this is somewhat binding. I wonder if it's worthwhile to specifically give each catalog entry a role so that their name can differ from the role? This would be helpful when the names of our own IBM adapters change (like what we saw with "requirement-check" vs "requirement_check").

Another way to put this: there is no standard set of roles. Mellea should just create a standard way to declare the role and force others to follow (or consult us for changes) since nobody is doing so already.

Our top-level wrapper functions like check_requirement(...) are the accepted standard for what a given "role" should input/output at a given point in time (a Mellea release).

By giving all catalog entries an explicit role as well, we allow for teams we rely on to change their naming without us having to update our code. (For example, we had to make several changes to our code when "requirement-check" changed since we hardcoded off the name).

I am now going on a bit of a tangent (and feel free to correct me if we've already rejected these ideas) but I think it may actually aid our design: we may want to change the field name from "role" to adapter function or functionality. Then, when generating from an AdapterBasedComponent, we could actually allow the user to specify distinct options:

The user specifically wants an adapter with the given name; if it's not that adapter, we fail

The user wants any adapter that fulfills a given functionality, they don't care about the specific version / implementation

Good points across this thread. Agreed that deriving KNOWN_ROLES directly from catalog entry names creates an implicit coupling we've already been burned by (the requirement-check / requirement_check situation is a perfect example).

On naming — "capability" feels cleaner than "role" or "functionality". It doesn't imply exact matching, it's already an established term in software design, and it naturally supports the idea of multiple adapters sharing a capability. So the proposal is:

Rename Identity.role → Identity.capability (while it's still in Phase 0 scaffolding and nothing downstream depends on it)

Add capability: str | None to CatalogEntry, defaulting to name when not set

KNOWN_ROLES becomes KNOWN_CAPABILITIES, derived from e.capability or e.name

For now this is advisory only — name-based resolution continues as-is. The capability field is metadata that says "this adapter does X", which sets up the right vocabulary without changing any resolution behaviour.

On the exact vs capability matching idea — that's the part that needs a dedicated design, specifically around precedence and clash resolution (what wins when multiple adapters claim the same capability?). That feels like a Phase 1.5 / Phase 2 sub-issue under #929 rather than something to fold into the current scaffolding.

Does the naming and phasing sound right before we move forward?

This sounds correct. It might change slightly in the context of the larger versioning discussion, but @psschwei and I should have a clear proposal cemented very soon.

psschwei · 2026-05-28T01:11:48Z

@planetf1, I also think that we should just have one big feature branch in generative-computing/mellea that these are merged into. I think it's super helpful to be able to review individual PRs that correspond to phases, but I don't think we want unfinished versions of these things in the code base.

Are the things we're implementing here stand-alone enough that they could live in contribs until they're finished? Would be a nice way to dogfood the idea of contribs for experiments (+ eventual migration to core), but could also be way more trouble than it's worth (I haven't really thought it through so feel free to just say no)

planetf1 · 2026-05-28T20:34:58Z

A feature branch does add some complexity — you'd want automated builds/CI running on it, a rebase strategy, and if it's a shared branch you need cross-team coordination that's not unlike managing main itself. Then there's the question of when to promote. I'm not sure what it actually buys here.

Feature branches for a specific feature make sense when the work is long-running and owned by multiple people — reviewed as a complete unit before the merge. But that requires a genuine workstream with enough people engaged to ensure quality before a big lump lands on main.

The purpose of breaking this refactor into phases is to slide changes into main incrementally — not breaking anything, and hopefully sized sensibly for review. Not too small, not too large (which brings its own problems with conflicts). This is essentially what trunk-based development calls branch by abstraction, and it's how LlamaIndex handled their v0.10 refactor — 300+ integrations extracted to main incrementally behind compatibility shims.

If a PR doesn't regress the code and is clearly positioned, it feels like an effective way to get things in. In the case of intrinsics, the wiring-up comes later — you do the enablement pieces first, carefully, then activate the new function. Feature flags are an option where needed.

planetf1 · 2026-05-28T20:36:18Z

On contribs — I think for that to genuinely work you'd need real plug points in the core: interfaces, registries, something that lets a contrib slot in cleanly without touching core code. Without that, it's essentially a different folder with the same coupling and the same review overhead, plus an extra migration step when it graduates to core.

The adapter scaffolding this PR is introducing is part of building those extension points — so it's a bit circular to suggest housing it in contribs. Once the lifecycle interfaces are in place, contribs becomes a much more viable home for experimental adapters that want to develop independently before being considered for inclusion.

planetf1 · 2026-05-28T21:04:24Z

One reflection — the obvious alternative was a single big PR. But to do that well you'd need all the design decisions locked down upfront, and a diff that large tends to get a cursory pass rather than a close read. The phased approach was deliberate for exactly this reason: catching things like the capability naming and IOContract signature early, when course-correcting is cheap. A long-lived feature branch arguably makes this worse, not better — changes pile up out of sight and the quality gate moves to the end. This thread is a good example of why incremental works.

jakelorocco · 2026-05-29T12:46:10Z

One reflection — the obvious alternative was a single big PR. But to do that well you'd need all the design decisions locked down upfront, and a diff that large tends to get a cursory pass rather than a close read. The phased approach was deliberate for exactly this reason: catching things like the capability naming and IOContract signature early, when course-correcting is cheap. A long-lived feature branch arguably makes this worse, not better — changes pile up out of sight and the quality gate moves to the end. This thread is a good example of why incremental works.

My argument was that we can still evaluate PRs into the feature branch and review as we are doing now. I don't think that requires much more testing (probably just a manual job every PR). As long as there are no API changes until the whole epic is complete, I would be okay with incremental merge as well.

github-actions Bot added the enhancement New feature or request label May 26, 2026

planetf1 added 2 commits May 26, 2026 17:43

planetf1 force-pushed the worktree-issue-1134 branch from 92b9e7e to 8ecefaf Compare May 26, 2026 16:44

planetf1 mentioned this pull request May 26, 2026

fix(intrinsics): pin catalogue entries to HF revision SHAs + deduplicate requirement_check (#1135) #1157

Open

7 tasks

planetf1 marked this pull request as ready for review May 26, 2026 17:16

planetf1 requested a review from a team as a code owner May 26, 2026 17:16

planetf1 requested review from AngeloDanducci, jakelorocco, nrfulton and psschwei May 26, 2026 17:16

jakelorocco reviewed May 27, 2026

View reviewed changes

planetf1 mentioned this pull request May 28, 2026

Epic: Fix Intrinsic Adapter Lifecycle & Consistency in Mellea #929

Open

Conversation

planetf1 commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why

Where this fits in Epic #929

What changed

Naming-collision resolution

Deviations from the literal issue spec

Acceptance criteria

Testing

Attribution

Uh oh!

jakelorocco left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

planetf1 May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

planetf1 May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jakelorocco May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

psschwei commented May 28, 2026

Uh oh!

planetf1 commented May 28, 2026

Uh oh!

planetf1 commented May 28, 2026

Uh oh!

planetf1 commented May 28, 2026

Uh oh!

jakelorocco commented May 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

planetf1 commented May 26, 2026 •

edited

Loading

planetf1 May 28, 2026 •

edited

Loading

planetf1 May 28, 2026 •

edited

Loading

jakelorocco May 29, 2026 •

edited

Loading