DEV-185072: introduce context specification mcp tools #77

Open
martin-anderson-collibra wants to merge 8 commits into main from feature/DEV-185072-introduce-context-specification-mcp-tools

Conversation

@martin-anderson-collibra

@martin-anderson-collibra martin-anderson-collibra commented Jun 16, 2026

🎯 What does this PR do?

This PR introduces a new experimental feature gate, context-specifications, which adds three new Model Context Protocol (MCP) tools. These tools integrate with Collibra's Semantic Blueprint and Context Engine APIs to allow LLMs to discover, inspect, and generate structured YAML context for assets.
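As a rough illustration of how such a gate can keep tools hidden until a user opts in, here is a minimal Go sketch. Only the names `knownExperimentalFeatures`, `context-specifications`, `get_asset_details`, and the three tool names come from this PR; the functions `parseExperimentalFeatures` and `toolsToRegister`, the comma-separated opt-in format, and the map-based representation are assumptions made for the example and may not match the code in cmd/chip/experimental.go.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// knownExperimentalFeatures reuses the name from the PR description.
// Representing it as a set is an assumption of this sketch.
var knownExperimentalFeatures = map[string]bool{
	"context-specifications": true,
}

// parseExperimentalFeatures (hypothetical helper) turns a comma-separated
// opt-in list into a set and rejects names that are not known features.
func parseExperimentalFeatures(raw string) (map[string]bool, error) {
	enabled := make(map[string]bool)
	for _, name := range strings.Split(raw, ",") {
		name = strings.TrimSpace(name)
		if name == "" {
			continue // tolerate empty input and stray commas
		}
		if !knownExperimentalFeatures[name] {
			return nil, fmt.Errorf("unknown experimental feature %q", name)
		}
		enabled[name] = true
	}
	return enabled, nil
}

// toolsToRegister (hypothetical helper) returns the sorted tool names to
// expose. The context tools are added only when the gate is enabled.
func toolsToRegister(enabled map[string]bool) []string {
	tools := []string{"get_asset_details"} // stands in for the always-on tools
	if enabled["context-specifications"] {
		tools = append(tools,
			"list_context_specifications",
			"get_context_specification",
			"get_context",
		)
	}
	sort.Strings(tools)
	return tools
}
```

With this shape, passing an empty opt-in list registers only the always-on tools, while passing `context-specifications` adds the three context tools, and a misspelled feature name fails fast instead of being silently ignored.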

🚀 Key Changes

  • New Experimental Feature Gate: Added context-specifications to knownExperimentalFeatures in cmd/chip/experimental.go, so these tools are not exposed unless the user explicitly opts in.

  • API Client Implementations (pkg/clients):

      • semantic_blueprint_client.go: Handles fetching paged context specifications (ListContextSpecifications) and single specification details (GetContextSpecification).

      • context_engine_client.go: Handles context generation (GenerateContext), allowing responses in both raw application/yaml and full JSON envelope format (application/json) based on the includeMetadata flag.

      • error.go: Introduces a specialized executeCollibraRequest error wrapper that extracts machine-readable error codes and human-readable user messages from the collibraStandardError envelope, helping downstream LLMs understand exact API failures.

  • New MCP Tools Added:

      1. list_context_specifications: Primary discovery tool used to find specs matching an asset ID or asset type.
      2. get_context_specification: Inspection tool that fetches specific blueprint mapping YAML configurations.
      3. get_context: Execution tool that processes the specification against an asset and surfaces structured semantic context.

  • Documentation: Updated the README.md to track and document the new experimental context tools.
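The two client behaviours described above (choosing the response format from includeMetadata, and surfacing error codes and user messages from an error envelope) could look roughly like the Go sketch below. It is an illustration only: the function names `acceptHeaderFor` and `errorFromResponse`, the `APIError` type, and the JSON field names `errorCode` and `userMessage` are assumptions, since the PR description does not show the actual shape of collibraStandardError or executeCollibraRequest.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// acceptHeaderFor (hypothetical helper) picks the response format:
// the JSON envelope when metadata is requested, raw YAML otherwise.
func acceptHeaderFor(includeMetadata bool) string {
	if includeMetadata {
		return "application/json"
	}
	return "application/yaml"
}

// standardError is an assumed envelope shape; the real field names in
// collibraStandardError may differ.
type standardError struct {
	ErrorCode   string `json:"errorCode"`
	UserMessage string `json:"userMessage"`
}

// APIError (hypothetical type) carries both the machine-readable code and
// the human-readable message so a caller, or an LLM, can act on either.
type APIError struct {
	Status  int
	Code    string
	Message string
}

func (e *APIError) Error() string {
	return fmt.Sprintf("collibra API error (status %d, code %s): %s", e.Status, e.Code, e.Message)
}

// errorFromResponse (hypothetical helper) returns nil for 2xx responses and
// otherwise tries to decode the envelope, falling back to the raw body when
// the body is not a recognisable envelope.
func errorFromResponse(status int, body []byte) error {
	if status >= 200 && status < 300 {
		return nil
	}
	var env standardError
	if err := json.Unmarshal(body, &env); err != nil || env.ErrorCode == "" {
		return &APIError{Status: status, Code: "unknown", Message: string(body)}
	}
	return &APIError{Status: status, Code: env.ErrorCode, Message: env.UserMessage}
}
```

Keeping the fallback path means a non-JSON failure, such as a gateway error page, still reaches the caller with its status and body rather than as a decoding error.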

✅ Checklist

  • My code follows the style guidelines of this project.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have made corresponding changes to the documentation (if needed).
  • My commit messages follow the Conventional Commits standard.

@martin-anderson-collibra martin-anderson-collibra requested a review from a team as a code owner June 16, 2026 16:09
@svc-snyk-github-jira

svc-snyk-github-jira commented Jun 16, 2026

Snyk checks have passed. No issues have been found so far.

| Status | Scan Engine | Critical | High | Medium | Low | Total (0) |
|---|---|---|---|---|---|---|
| | Open Source Security | 0 | 0 | 0 | 0 | 0 issues |
| | Licenses | 0 | 0 | 0 | 0 | 0 issues |
| | Code Security | 0 | 0 | 0 | 0 | 0 issues |

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

@martin-anderson-collibra martin-anderson-collibra changed the title from "Feature/dev 185072 introduce context specification mcp tools" to "DEV-185072: introduce context specification mcp tools" Jun 16, 2026
@nevers

nevers commented Jun 17, 2026

Contributor

Hey @martin-anderson-collibra, thank you for contributing.
Could you also have a look at how this tool can be better contextualized in the skills in pkg/skills/files/collibra?
This may require a new skill or at least an improvement to the discovery skill.

@martin-anderson-collibra

Author

@nevers thanks, added a new skill. Let me know if you see any issues with it

return &chip.Tool[Input, Output]{
    Name: "get_context",
    Title: "Get Context",
    Description: "Context generation execution tool. Executes a Context Specification against a specific asset and returns the generated context as structured YAML. This is the final step in the context workflow: use list_context_specifications to discover specs, optionally use get_context_specification to inspect the mapping, then call this tool to generate the context.",

Contributor

@martin-anderson-collibra any reason this could not live as a functional parameter for get_asset_details? Context is a very broad term and I worry an LLM will get confused figuring out when to use these tools vs others

Author

I've discussed with our PM and we agree that it doesn't make sense to combine it with get_asset_details, but we came up with a more specific name - get_asset_context_from_specification. Is it okay?

We also have a new skill describing in more detail how these tools should be used, so that will hopefully help as well.

Contributor

I worry about sprawl of LLM solving user problems. What is the difference between details and context? I would prefer, prior to release of impactful tools such as this, performance tests be conducted to ensure we understand behavior of tools. Can you explain a bit more the reasoning for the separation?

As it stands, I still believe this should sit as an available option for get_asset_details. Happy to chat with you and/or your PM to work through this, and I can be convinced if performance testing bears out the approach of having multiple tools of this nature.

return &chip.Tool[Input, Output]{
    Name: "get_context_specification",
    Title: "Get Context Specification",
    Description: "Inspection tool. Returns the full Context Specification including the mappingYaml configuration, so you can understand which fields and metrics will be populated before executing get_context. Use this after list_context_specifications to examine a specific spec before running it against an asset.",

Contributor

@martin-anderson-collibra this should be more specific. What is a context specification in the context of Collibra? What types of fields and metrics?

Author

I've updated the description to be more specific

return &chip.Tool[Input, Output]{
    Name: "list_context_specifications",
    Title: "List Context Specifications",
    Description: "Primary discovery tool for Context Specifications. Returns all Context Specifications applicable to a given asset or asset type. Use assetId to find specs that match the type of a specific asset, or assetTypePublicId to filter by a known asset type. Call this first to discover which Context Specifications are available before calling get_context.",

Contributor

@martin-anderson-collibra Same comment - please be specific about what a 'Context Specification' is

Author

Same as above

|---|---|---|---|
| `listContextSpecifications` | Discover which Context Specifications (Knowledge Graph blueprints) are available for an asset or asset type | List of spec names, descriptions, and IDs | Always, entry point |
| `getContextSpecification` | Inspect a spec's blueprint: which relations it defines, what fields it extracts from the Knowledge Graph, what transforms it applies | Complete YAML mapping and spec metadata | Optional, only when user asks what a spec covers |
| `getContext` | Execute a spec's blueprint against an asset to extract and shape its governed metadata subset | Structured metadata (JSON, YAML, etc.) shaped for the target system | Always, output step |

Contributor

Is getContext supposed to be the renamed get_asset_context_from_specification here?

Author

Indeed, looks like I missed those. Fixed and applied a few other changes suggested by AI

@martin-anderson-collibra martin-anderson-collibra force-pushed the feature/DEV-185072-introduce-context-specification-mcp-tools branch from c5b057f to b604dc6 Compare June 24, 2026 16:33
bobby-smedley
bobby-smedley previously approved these changes Jun 24, 2026
return &chip.Tool[Input, Output]{
    Name: "list_context_specifications",
    Title: "List Context Specifications",
    Description: "Retrieve a list of available Context Specifications. A Context Specification defines how to extract governed metadata from Collibra. Starting from an asset (e.g., a Data Product), it specifies which relations to traverse, what fields to pull (name, status, description), and what shape to return for a target system (Snowflake, Databricks, custom for AI agents). Use this to discover which Contexts are available for querying metadata about specific asset or asset type.",

Contributor

An LLM may struggle with how this extract is different than keyword search or get asset details or some other mechanism for getting metadata.

Specifically, I would be careful with this sentence, even though there is more detail after it. Please be very specific:

"A Context Specification defines how to extract governed metadata from Collibra"

5 participants