feat(mcp): add search_memories to Python MCP server by aryaminus · Pull Request #7550 · BasedHardware/omi

aryaminus · 2026-05-30T16:02:00Z

Summary

Adds a search_memories semantic search tool to the Python MCP server (mcp/src/mcp_server_omi/server.py), enabling Claude Desktop, Cursor, ChatGPT, and other MCP clients to search the user's memories by meaning.
Adds a GET /v1/mcp/memories/search REST endpoint to backend/routers/mcp.py, returning memories ranked by relevance, each with a relevance_score.

The gap

The SSE MCP server (mcp_sse.py) already has search_memories (added in PR #6572, fixed in #6609, requested in #6555). The Python MCP server — which is what developers connect from Claude Desktop, Cursor, and ChatGPT — only has get_memories (paginated list) and search_conversations, but no semantic memory search.

This PR closes that gap by wiring the same vector_db.find_similar_memories backend into the Python MCP path.

Before/After

| MCP Client | Before | After |
| --- | --- | --- |
| Claude Desktop | `get_memories` only (paginated) | `search_memories("machine learning")` returns ranked results |
| Cursor, ChatGPT | No way to find memories by topic | Semantic search with relevance scores |

Implementation

Backend route follows the exact same pattern as the existing GET /v1/mcp/conversations/search in the same file
Uses vector_db.find_similar_memories(uid, query, threshold=0.0, limit=limit) — same function the SSE server uses
Returns SearchedMemory with id, content, category, relevance_score
Filters locked memories (consistent with existing get_memories behavior)
Caps limit at 20 to match the SSE server behavior
Python MCP server mirrors the existing search_conversations tool pattern exactly

Files

| File | Change |
| --- | --- |
| `backend/routers/mcp.py` | New `SearchedMemory` model and `GET /v1/mcp/memories/search` route |
| `mcp/src/mcp_server_omi/server.py` | New `SearchMemories` model, `search_memories` function, tool registration, and handler |

Related

PR feat(mcp): add search_memories and search_conversations tools #6572 — added search_memories to the SSE MCP server
PR fix(mcp): search_memories 500 — wrong key name from vector_db #6609 — fixed the SSE search_memories 500 bug
Issue MCP server: add search/query support for memories and conversations #6555 — original feature request for MCP search
Issue MCP: search_conversations should search transcripts (and return match snippets) #6621 — open request for search_conversations to search transcripts (separate concern)

🤖 Generated with OpenCode

greptile-apps · 2026-05-30T16:05:27Z

Greptile Summary

This PR closes the feature gap between the SSE MCP server (which already had search_memories) and the Python MCP server used by Claude Desktop, Cursor, and ChatGPT. It also adds Pinecone vector sync to the create/edit/delete memory mutations so the index stays consistent.

GET /v1/mcp/memories/search — new backend route that calls vector_db.find_similar_memories with a 3× over-fetch strategy to absorb locked/rejected memory filtering, re-sorts by relevance_score, and returns a SearchedMemory list capped at 20. Limit is clamped with max(1, min(limit, 20)).
Vector sync — create_memory, edit_memory, and delete_memory now wrap upsert_memory_vector / delete_memory_vector in try/except so Firestore writes are never blocked by a Pinecone failure.
Python MCP server — SearchMemories model, search_memories HTTP function, and call_tool handler added following the exact search_conversations pattern; empty-query guard included.
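
The MCP-side half of this flow can be sketched as below. This is a minimal illustration, not the code from the PR: `build_search_request`, `BASE_URL`, and the query parameter names `query` and `limit` are assumptions made for the example. Only the route path `/v1/mcp/memories/search` and the empty-query guard come from the summary above.

```python
from urllib.parse import urlencode

# Hypothetical base URL; the real server reads its own configuration.
BASE_URL = "https://api.example.com"


def build_search_request(query: str, limit: int = 10) -> str:
    """Return the URL an MCP tool handler might call for search_memories."""
    # Empty-query guard: reject blank input before any HTTP call is made.
    if not query or not query.strip():
        raise ValueError("query must not be empty")
    # Parameter names are assumed for illustration.
    params = urlencode({"query": query.strip(), "limit": limit})
    return f"{BASE_URL}/v1/mcp/memories/search?{params}"
```

Keeping URL construction separate from the HTTP call makes the guard and the encoding easy to unit-test without a network.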

Confidence Score: 5/5

Safe to merge — the new search route and vector-sync mutations are additive, auth-gated by the existing MCP API key dependency, and don't touch any shared state beyond what was already modified by other memory endpoints.

The implementation correctly maps Pinecone memory_id metadata to Firestore document id fields. The 3x over-fetch strategy correctly absorbs locked/rejected filtering. Limit clamping, empty-query guard, and try/except vector-sync wrappers are all in place. No routing conflicts exist.
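
A rough sketch of the id mapping described above, under stated assumptions: vector matches are taken to be dicts with `memory_id` and `score` keys, stored documents are dicts keyed by `id`, and `join_matches` is a hypothetical helper name. The dataclass stands in for the Pydantic `SearchedMemory` model, using the four fields listed in the PR description.

```python
from dataclasses import dataclass


@dataclass
class SearchedMemory:
    # Field names taken from the PR description; the real model is Pydantic.
    id: str
    content: str
    category: str
    relevance_score: float


def join_matches(matches: list[dict], docs: list[dict]) -> list[SearchedMemory]:
    """Pair vector matches with stored documents by id."""
    docs_by_id = {doc["id"]: doc for doc in docs}
    results = []
    for match in matches:
        memory_id = match.get("memory_id")
        # Skip matches without a memory_id or without a stored document.
        if not memory_id or memory_id not in docs_by_id:
            continue
        doc = docs_by_id[memory_id]
        results.append(
            SearchedMemory(
                id=doc["id"],
                content=doc["content"],
                category=doc["category"],
                relevance_score=match["score"],
            )
        )
    return results
```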

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| backend/routers/mcp.py | Adds SearchedMemory model, GET /v1/mcp/memories/search route with 3x-candidate fetch + relevance re-sort, and vector sync (upsert/delete) for create/edit/delete memory mutations. Limit clamped to max(1, min(limit,20)). Auth and logging are consistent with existing routes. |
| mcp/src/mcp_server_omi/server.py | Adds SearchMemories Pydantic model, search_memories HTTP function (mirrors search_conversations pattern with raise_for_status), SEARCH_MEMORIES enum value, Tool registration, and call_tool handler with empty-query guard. URL construction is correct. |
| backend/tests/unit/test_mcp_search_memories.py | Comprehensive test suite covering empty results, ranked results, locked/rejected memory filtering, limit clamping (0, negative, >20), 3x fetch_limit calculation, missing memory_id skip, sorting order, and vector sync for edit/delete. |
| mcp/tests/test_search_conversations.py | Replaces fragile count-based test with an explicit set equality check; adds test_search_memories_in_enum. More robust against future tool additions. |
| backend/test.sh | Adds test_mcp_search_memories.py to the CI test script. |

Sequence Diagram

sequenceDiagram
    participant Client as MCP Client
    participant MCP as Python MCP Server
    participant API as FastAPI Backend
    participant Pinecone as Pinecone
    participant Firestore as Firestore

    Client->>MCP: call_tool search_memories
    MCP->>MCP: validate api_key, guard empty query
    MCP->>API: GET /v1/mcp/memories/search
    API->>API: clamp limit, compute fetch_limit
    API->>Pinecone: find_similar_memories
    Pinecone-->>API: memory_id + score matches
    API->>Firestore: get_memories_by_ids
    Firestore-->>API: memory docs
    API->>API: filter, sort, slice
    API-->>MCP: SearchedMemory list
    MCP-->>Client: JSON text response

Reviews (2): Last reviewed commit: "fix(mcp): overfetch Pinecone candidates ..."

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ca1d18fc57

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

- MCP create_memory now calls upsert_memory_vector so newly created memories are immediately searchable (previously they were invisible to search_memories until a backfill ran).
- Added max(1, ...) guard on limit to prevent 0 or negative values reaching Pinecone.
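
The limit guard amounts to a two-sided clamp. A minimal sketch follows; `clamp_limit` and `MAX_LIMIT` are illustrative names, and the bound of 20 is the cap stated in the PR description.

```python
MAX_LIMIT = 20  # upper bound stated in the PR description


def clamp_limit(limit: int) -> int:
    """Clamp a caller-supplied limit into the range 1..MAX_LIMIT."""
    return max(1, min(limit, MAX_LIMIT))
```

With this, `clamp_limit(0)` and `clamp_limit(-5)` both give 1, and `clamp_limit(50)` gives 20.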

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 24ea6f653e

MCP delete_memory only removed the Firestore record, leaving a stale Pinecone vector. Now that search_memories is exposed over MCP, those stale vectors would occupy top-K slots and push valid results out. Mirror the canonical /v3/memories delete path (delete_memory_vector in a try/except). Adds tests covering the vector-delete call and the Firestore-deleted-but-vector-failed path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
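
The best-effort pattern described in this commit might look roughly like the sketch below. It is not the PR's code: `delete_record` and `delete_vector` are injected stand-ins for the Firestore and Pinecone calls, and the boolean return value is added only to make the behaviour observable in the example.

```python
import logging

logger = logging.getLogger(__name__)


def delete_memory(uid, memory_id, delete_record, delete_vector) -> bool:
    """Delete the stored record, then best-effort delete its vector.

    Returns True if the vector was removed, False if that step failed.
    """
    delete_record(uid, memory_id)  # primary write; errors propagate
    try:
        delete_vector(uid, memory_id)
        return True
    except Exception:
        # A vector-store failure must not fail the request; log and continue.
        logger.exception("vector delete failed for memory %s", memory_id)
        return False
```

The record delete stays outside the try block on purpose, so a failure in the primary store still surfaces to the caller.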

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 275d77550c

Adding SEARCH_MEMORIES made OmiTools 8 members, breaking the len(OmiTools) == 7 assertion. Replace the brittle count with an explicit expected-members set and add a SEARCH_MEMORIES membership test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
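
To illustrate the difference between a count assertion and a set-equality assertion, here is a small sketch. `ExampleTools` is a hypothetical three-member stand-in, not the real `OmiTools` enum (which has 8 members), and the member values are assumed to equal the tool names.

```python
from enum import Enum


class ExampleTools(str, Enum):
    # Hypothetical subset; the real OmiTools enum has 8 members.
    GET_MEMORIES = "get_memories"
    SEARCH_CONVERSATIONS = "search_conversations"
    SEARCH_MEMORIES = "search_memories"


def test_tool_names_match_expected_set():
    # Set equality names every expected member, so a failure shows exactly
    # which tool was added or removed instead of only a changed count.
    expected = {"get_memories", "search_conversations", "search_memories"}
    assert {tool.value for tool in ExampleTools} == expected


def test_search_memories_in_enum():
    assert ExampleTools.SEARCH_MEMORIES.value == "search_memories"
```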

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3c72eb5763

get_memories_by_ids returns raw Firestore docs without applying the user_review filter that memories_db.get_memories enforces. A memory the user explicitly rejected (user_review=False) had no vector removed, so it could surface in semantic search results. Mirror the DB-layer filter: skip any memory where user_review is False before appending to results. Unreviewed (field absent/None) memories are still included.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
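
The three-state check described here (rejected, unreviewed, approved) can be sketched as a single predicate. `is_visible` is an illustrative name, and memories are assumed to be plain dicts.

```python
def is_visible(memory: dict) -> bool:
    """Return False only for memories the user explicitly rejected."""
    # `is not False` distinguishes an explicit rejection from a missing or
    # None value, which means the memory is still unreviewed.
    return memory.get("user_review") is not False
```

A plain truthiness test (`if memory.get("user_review")`) would wrongly hide unreviewed memories, which is why the identity comparison matters.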

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8ac995e9d4

MCP edit_memory only updated Firestore, leaving the Pinecone vector pointing at the old content. Searching for the new text would miss the memory; searching for the old text could still rank it. Re-use the category already fetched by _validate_mcp_memory (edit doesn't change category) and call upsert_memory_vector with the new plaintext value, matching the pattern used in create_memory and delete_memory.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1f147bbe70

Truncating content for locked hits still leaks existence and short content via MCP API keys. Both reference search implementations (utils/retrieval/tool_services/memories.py and utils/retrieval/tools/memory_tools.py) filter locked hits out entirely. Align search_memories to that policy: skip any memory where is_locked is True before appending to results, and update the test that previously asserted truncation behaviour.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f9701655d0

Fetching exactly limit candidates from Pinecone meant that skipped hits (locked or user_review=False) consumed the entire result budget. With limit=1, a locked top hit returned an empty list even though a valid memory was the next Pinecone match. Fetch fetch_limit = min(limit * 3, 60) candidates, apply all visibility filters, then trim to limit after sorting. The 3x multiplier covers typical filter rates while keeping the Pinecone top_k within a reasonable bound.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
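
The over-fetch, filter, sort, and trim sequence from this commit can be sketched end to end as follows. This is an illustration under assumptions, not the route's actual code: `find_similar(top_k)` and `get_by_ids(ids)` are injected stand-ins for the vector search and document lookup, and matches are assumed to be dicts with `memory_id` and `score` keys.

```python
def search_visible(limit, find_similar, get_by_ids):
    """Over-fetch candidates, drop hidden ones, then sort and trim."""
    limit = max(1, min(limit, 20))
    fetch_limit = min(limit * 3, 60)  # 3x over-fetch, bounded at 60
    matches = find_similar(fetch_limit)
    docs = {d["id"]: d for d in get_by_ids([m["memory_id"] for m in matches])}
    results = []
    for match in matches:
        doc = docs.get(match["memory_id"])
        # Skip missing, locked, or explicitly rejected memories.
        if doc is None or doc.get("is_locked") or doc.get("user_review") is False:
            continue
        results.append({"id": doc["id"], "relevance_score": match["score"]})
    results.sort(key=lambda r: r["relevance_score"], reverse=True)
    return results[:limit]
```

With `limit=1` and a locked top hit, the function requests three candidates and returns the next visible match instead of an empty list, which is the failure case the commit message describes.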

kodjima33

Backend+MCP feature: adds search_memories to Python MCP server, mirrors existing SSE pattern. Approving (backend → approve only per policy).

mdmohsin7 · 2026-06-01T09:38:26Z

@greptile-apps re-review

chatgpt-codex-connector Bot reviewed May 30, 2026

Comment thread backend/routers/mcp.py Outdated

greptile-apps Bot reviewed May 30, 2026

Comment thread backend/routers/mcp.py Outdated

Comment thread backend/routers/mcp.py Outdated

aryaminus added 3 commits May 30, 2026 09:10

feat(mcp): add search_memories endpoint for REST MCP API

dd012d7

feat(mcp): add search_memories tool to Python MCP server

0e7cf7c

test(mcp): add unit tests for search_memories endpoint

28628dc

aryaminus force-pushed the feat/mcp-python-search-memories branch from ca1d18f to 28628dc May 30, 2026 16:10

aryaminus added 2 commits May 30, 2026 09:12

test(mcp): add limit clamping tests for search_memories

24ea6f6

chatgpt-codex-connector Bot reviewed May 30, 2026

Comment thread backend/routers/mcp.py Outdated

aryaminus and others added 2 commits May 30, 2026 09:16

test(mcp): add search_memories unit test to test.sh

5636152

chatgpt-codex-connector Bot reviewed May 30, 2026

Comment thread mcp/src/mcp_server_omi/server.py

chatgpt-codex-connector Bot reviewed May 30, 2026

Comment thread backend/routers/mcp.py

chatgpt-codex-connector Bot reviewed May 30, 2026

Comment thread backend/routers/mcp.py Outdated

chatgpt-codex-connector Bot reviewed May 30, 2026

Comment thread backend/routers/mcp.py Outdated

chatgpt-codex-connector Bot reviewed May 30, 2026

Comment thread backend/routers/mcp.py Outdated

kodjima33 approved these changes May 31, 2026

mdmohsin7 approved these changes Jun 1, 2026

mdmohsin7 merged commit ddf9cce into BasedHardware:main Jun 1, 2026
2 checks passed
