feat(mcp): add search_memories to Python MCP server #7550

Merged
mdmohsin7 merged 12 commits into BasedHardware:main from aryaminus:feat/mcp-python-search-memories
Jun 1, 2026

Conversation

Contributor

@aryaminus aryaminus commented May 30, 2026

Summary

  • Adds search_memories semantic search tool to the Python MCP server (mcp/src/mcp_server_omi/server.py), enabling Claude Desktop, Cursor, ChatGPT, and other MCP clients to search the user's memories by meaning.
  • Adds GET /v1/mcp/memories/search REST endpoint to backend/routers/mcp.py, returning memories ranked by relevance with relevance_score.

The gap

The SSE MCP server (mcp_sse.py) already has search_memories (added in PR #6572, fixed in #6609, requested in #6555). The Python MCP server — which is what developers connect from Claude Desktop, Cursor, and ChatGPT — only has get_memories (paginated list) and search_conversations, but no semantic memory search.

This PR closes that gap by wiring the same vector_db.find_similar_memories backend into the Python MCP path.

Before/After

| MCP Client | Before | After |
|---|---|---|
| Claude Desktop | get_memories only (paginated) | search_memories("machine learning") returns ranked results |
| Cursor, ChatGPT | No way to find memories by topic | Semantic search with relevance scores |

Implementation

  • Backend route follows the exact same pattern as the existing GET /v1/mcp/conversations/search in the same file
  • Uses vector_db.find_similar_memories(uid, query, threshold=0.0, limit=limit) — same function the SSE server uses
  • Returns SearchedMemory with id, content, category, relevance_score
  • Filters locked memories (consistent with existing get_memories behavior)
  • Caps limit at 20 to match the SSE server behavior
  • Python MCP server mirrors the existing search_conversations tool pattern exactly
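
As a rough illustration of the response shape listed above, the sketch below models SearchedMemory with a plain dataclass. The real route presumably uses a Pydantic model; the dataclass, the `to_searched_memory` helper, and the sample field values are assumptions made for this example only.

```python
from dataclasses import dataclass, asdict


@dataclass
class SearchedMemory:
    """Stand-in for the response model; field names come from the PR description."""
    id: str
    content: str
    category: str
    relevance_score: float


def to_searched_memory(doc: dict, score: float) -> SearchedMemory:
    # Hypothetical helper: combine a stored memory document with its
    # similarity score from the vector search.
    return SearchedMemory(
        id=doc["id"],
        content=doc.get("content", ""),
        category=doc.get("category", ""),
        relevance_score=float(score),
    )


if __name__ == "__main__":
    doc = {"id": "m1", "content": "Likes gradient boosting", "category": "work"}
    print(asdict(to_searched_memory(doc, 0.87)))
```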

Files

| File | Change |
|---|---|
| backend/routers/mcp.py | New SearchedMemory model and GET /v1/mcp/memories/search route |
| mcp/src/mcp_server_omi/server.py | New SearchMemories model, search_memories function, tool registration, and handler |

Related

🤖 Generated with OpenCode

Contributor

greptile-apps Bot commented May 30, 2026

Greptile Summary

This PR closes the feature gap between the SSE MCP server (which already had search_memories) and the Python MCP server used by Claude Desktop, Cursor, and ChatGPT. It also adds Pinecone vector sync to the create/edit/delete memory mutations so the index stays consistent.

  • GET /v1/mcp/memories/search — new backend route that calls vector_db.find_similar_memories with a 3× over-fetch strategy to absorb locked/rejected memory filtering, re-sorts by relevance_score, and returns a SearchedMemory list capped at 20. Limit is clamped with max(1, min(limit, 20)).
  • Vector sync: create_memory, edit_memory, and delete_memory now wrap upsert_memory_vector / delete_memory_vector in try/except so Firestore writes are never blocked by a Pinecone failure.
  • Python MCP server: SearchMemories model, search_memories HTTP function, and call_tool handler added following the exact search_conversations pattern; empty-query guard included.
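
The client side of this flow might look roughly like the sketch below, which only builds the request and never sends it. The base URL, the `query` and `limit` parameter names, and the bearer-token header are assumptions for illustration; only the route path and the empty-query guard come from the summary above.

```python
from urllib.parse import urlencode

API_BASE = "https://api.example.invalid"  # hypothetical base URL


def build_search_request(api_key: str, query: str, limit: int = 10):
    """Return (url, headers) for a memory search, rejecting empty queries."""
    # Empty-query guard: fail before any HTTP call is made.
    if not query or not query.strip():
        raise ValueError("query must not be empty")
    # Parameter names are assumed, not taken from the actual route.
    params = urlencode({"query": query.strip(), "limit": limit})
    url = f"{API_BASE}/v1/mcp/memories/search?{params}"
    headers = {"Authorization": f"Bearer {api_key}"}  # assumed auth scheme
    return url, headers


if __name__ == "__main__":
    print(build_search_request("my-key", "machine learning", 5))
```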

Confidence Score: 5/5

Safe to merge — the new search route and vector-sync mutations are additive, auth-gated by the existing MCP API key dependency, and don't touch any shared state beyond what was already modified by other memory endpoints.

The implementation correctly maps Pinecone memory_id metadata to Firestore document id fields. The 3x over-fetch strategy correctly absorbs locked/rejected filtering. Limit clamping, empty-query guard, and try/except vector-sync wrappers are all in place. No routing conflicts exist.

No files require special attention.

Important Files Changed

| Filename | Overview |
|---|---|
| backend/routers/mcp.py | Adds SearchedMemory model, GET /v1/mcp/memories/search route with 3x-candidate fetch + relevance re-sort, and vector sync (upsert/delete) for create/edit/delete memory mutations. Limit clamped to max(1, min(limit,20)). Auth and logging are consistent with existing routes. |
| mcp/src/mcp_server_omi/server.py | Adds SearchMemories Pydantic model, search_memories HTTP function (mirrors search_conversations pattern with raise_for_status), SEARCH_MEMORIES enum value, Tool registration, and call_tool handler with empty-query guard. URL construction is correct. |
| backend/tests/unit/test_mcp_search_memories.py | Comprehensive test suite covering empty results, ranked results, locked/rejected memory filtering, limit clamping (0, negative, >20), 3x fetch_limit calculation, missing memory_id skip, sorting order, and vector sync for edit/delete. |
| mcp/tests/test_search_conversations.py | Replaces fragile count-based test with an explicit set equality check; adds test_search_memories_in_enum. More robust against future tool additions. |
| backend/test.sh | Adds test_mcp_search_memories.py to the CI test script. |

Sequence Diagram

sequenceDiagram
    participant Client as MCP Client
    participant MCP as Python MCP Server
    participant API as FastAPI Backend
    participant Pinecone as Pinecone
    participant Firestore as Firestore

    Client->>MCP: call_tool search_memories
    MCP->>MCP: validate api_key, guard empty query
    MCP->>API: GET /v1/mcp/memories/search
    API->>API: clamp limit, compute fetch_limit
    API->>Pinecone: find_similar_memories
    Pinecone-->>API: memory_id + score matches
    API->>Firestore: get_memories_by_ids
    Firestore-->>API: memory docs
    API->>API: filter, sort, slice
    API-->>MCP: SearchedMemory list
    MCP-->>Client: JSON text response

Reviews (2): Last reviewed commit: "fix(mcp): overfetch Pinecone candidates ..."


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ca1d18fc57

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread backend/routers/mcp.py Outdated
Comment thread backend/routers/mcp.py Outdated
Comment thread backend/routers/mcp.py Outdated
@aryaminus aryaminus force-pushed the feat/mcp-python-search-memories branch from ca1d18f to 28628dc May 30, 2026 16:10
aryaminus added 2 commits May 30, 2026 09:12
- MCP create_memory now calls upsert_memory_vector so newly created
  memories are immediately searchable (previously they were invisible
  to search_memories until a backfill ran).
- Added max(1, ...) guard on limit to prevent 0 or negative values
  reaching Pinecone.
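
The limit guard described in this commit can be sketched as a small pure function. The `clamp_limit` name and the `MAX_LIMIT` constant are illustrative; the expression itself is the one quoted in the review summary, max(1, min(limit, 20)).

```python
MAX_LIMIT = 20  # cap mentioned in the PR description


def clamp_limit(limit: int) -> int:
    """Keep the requested limit within [1, MAX_LIMIT]."""
    # min() enforces the upper cap; max() stops 0 or negative values
    # from being passed on to the vector store.
    return max(1, min(limit, MAX_LIMIT))


if __name__ == "__main__":
    for requested in (-5, 0, 7, 50):
        print(requested, "->", clamp_limit(requested))
```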

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 24ea6f653e

Comment thread backend/routers/mcp.py Outdated
aryaminus and others added 2 commits May 30, 2026 09:16
MCP delete_memory only removed the Firestore record, leaving a stale
Pinecone vector. Now that search_memories is exposed over MCP, those
stale vectors would occupy top-K slots and push valid results out.
Mirror the canonical /v3/memories delete path (delete_memory_vector in
a try/except). Adds tests covering the vector-delete call and the
Firestore-deleted-but-vector-failed path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
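
A minimal sketch of the best-effort pattern this commit describes, assuming the record delete and vector delete are passed in as callables. The function name, its signature, and the boolean return value are inventions for this example, not the route's actual code.

```python
import logging
from typing import Callable

logger = logging.getLogger(__name__)


def delete_memory_best_effort(
    uid: str,
    memory_id: str,
    delete_record: Callable[[str, str], None],
    delete_vector: Callable[[str, str], None],
) -> bool:
    """Delete the stored record, then try to delete its vector.

    Returns True if the vector was also removed, False if only the
    record delete succeeded.
    """
    # The primary delete is not wrapped: if it fails, the caller sees the error.
    delete_record(uid, memory_id)
    try:
        delete_vector(uid, memory_id)
    except Exception as exc:  # a vector-store outage must not fail the request
        logger.warning("vector delete failed for %s: %s", memory_id, exc)
        return False
    return True
```

Returning a flag is one way to let a test assert the "Firestore deleted but vector failed" path without inspecting logs.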

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 275d77550c

Comment thread mcp/src/mcp_server_omi/server.py
Adding SEARCH_MEMORIES made OmiTools 8 members, breaking the
len(OmiTools) == 7 assertion. Replace the brittle count with an
explicit expected-members set and add a SEARCH_MEMORIES membership test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
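
The difference between the two test styles can be sketched as follows. The enum here is an abbreviated, hypothetical stand-in with three members; the real OmiTools has eight, and only the member names that appear in this PR are used.

```python
from enum import Enum


class OmiTools(str, Enum):
    # Abbreviated stand-in: the real enum has more members.
    GET_MEMORIES = "get_memories"
    SEARCH_CONVERSATIONS = "search_conversations"
    SEARCH_MEMORIES = "search_memories"


def test_tool_names_match_expected_set():
    # An explicit set names every tool, so a failure shows exactly which
    # member was added or removed, unlike a bare len() comparison.
    expected = {"get_memories", "search_conversations", "search_memories"}
    assert {tool.value for tool in OmiTools} == expected


def test_search_memories_in_enum():
    assert OmiTools.SEARCH_MEMORIES.value == "search_memories"


if __name__ == "__main__":
    test_tool_names_match_expected_set()
    test_search_memories_in_enum()
```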

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 3c72eb5763

Comment thread backend/routers/mcp.py
get_memories_by_ids returns raw Firestore docs without applying the
user_review filter that memories_db.get_memories enforces. A memory
the user explicitly rejected (user_review=False) never had its vector removed,
so it could surface in semantic search results. Mirror the DB-layer
filter: skip any memory where user_review is False before appending to
results. Unreviewed (field absent/None) memories are still included.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
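
The three-state check described here (rejected, approved, unreviewed) is easy to get wrong with a plain truthiness test, so a short sketch may help. The `passes_review_filter` name is hypothetical, and memories are modelled as plain dicts.

```python
def passes_review_filter(memory: dict) -> bool:
    """Exclude only memories the user explicitly rejected."""
    # `is not False` keeps both approved (True) and unreviewed
    # (missing or None) memories; `not memory.get(...)` would not.
    return memory.get("user_review") is not False


if __name__ == "__main__":
    samples = [{}, {"user_review": None}, {"user_review": True}, {"user_review": False}]
    print([passes_review_filter(m) for m in samples])
```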

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 8ac995e9d4

Comment thread backend/routers/mcp.py Outdated
MCP edit_memory only updated Firestore, leaving the Pinecone vector
pointing at the old content. Searching for the new text would miss the
memory; searching for the old text could still rank it. Re-use the
category already fetched by _validate_mcp_memory (edit doesn't change
category) and call upsert_memory_vector with the new plaintext value,
matching the pattern used in create_memory and delete_memory.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: 1f147bbe70

Comment thread backend/routers/mcp.py Outdated
Truncating content for locked hits still leaks existence and short
content via MCP API keys. Both reference search implementations
(utils/retrieval/tool_services/memories.py and
utils/retrieval/tools/memory_tools.py) filter locked hits out
entirely. Align search_memories to that policy: skip any memory
where is_locked is True before appending to results, and update
the test that previously asserted truncation behaviour.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Reviewed commit: f9701655d0

Comment thread backend/routers/mcp.py Outdated
Fetching exactly limit candidates from Pinecone meant that skipped
hits (locked or user_review=False) consumed the entire result budget.
With limit=1, a locked top hit returned an empty list even though a
valid memory was the next Pinecone match.

Fetch fetch_limit = min(limit * 3, 60) candidates, apply all
visibility filters, then trim to limit after sorting. The 3x
multiplier covers typical filter rates while keeping the Pinecone
top_k within a reasonable bound.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
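
The over-fetch, filter, sort, and trim steps from this commit can be sketched end to end. The vector query and the document lookup are injected as callables here, and the hit and document shapes (`memory_id`, `score`, `is_locked`, `user_review`) follow the names used in this thread; everything else, including the function names, is an assumption for illustration.

```python
from typing import Callable, Dict, List


def fetch_limit_for(limit: int) -> int:
    # Over-fetch 3x so filtered hits do not consume the result budget,
    # bounded at 60 candidates.
    return min(limit * 3, 60)


def search_visible_memories(
    limit: int,
    find_similar: Callable[[int], List[Dict]],
    get_docs_by_ids: Callable[[List[str]], List[Dict]],
) -> List[Dict]:
    limit = max(1, min(limit, 20))
    hits = find_similar(fetch_limit_for(limit))
    # Skip hits whose metadata lacks a memory_id.
    ids = [h["memory_id"] for h in hits if h.get("memory_id")]
    docs = {d["id"]: d for d in get_docs_by_ids(ids)}

    results = []
    for hit in hits:
        doc = docs.get(hit.get("memory_id"))
        if doc is None:
            continue  # no memory_id, or the stored record no longer exists
        if doc.get("is_locked") or doc.get("user_review") is False:
            continue  # locked or explicitly rejected: never returned
        results.append(
            {"id": doc["id"], "content": doc["content"], "relevance_score": hit["score"]}
        )

    results.sort(key=lambda r: r["relevance_score"], reverse=True)
    return results[:limit]
```

With limit=1 and a locked top hit, this returns the next visible match instead of an empty list, which is the case the commit message calls out.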
Collaborator

@kodjima33 kodjima33 left a comment

Backend+MCP feature: adds search_memories to Python MCP server, mirrors existing SSE pattern. Approving (backend → approve only per policy).

@mdmohsin7
Member

@greptile-apps re-review

@mdmohsin7 mdmohsin7 merged commit ddf9cce into BasedHardware:main Jun 1, 2026
2 checks passed
3 participants