feat: add meta-search client #426

Merged
e06084 merged 2 commits into MigoXLab:dev from e06084:dev on Jun 10, 2026
e06084 commented on Jun 10, 2026

Collaborator

No description provided.

gemini-code-assist (bot) left a comment

Contributor

Code Review

This pull request introduces a new meta_search backend (MetaSearchClient) for Sciverse metadata search, refactoring the existing AgenticSearchClient to share common logic, and updates the CLI and configuration to support new parameters like sorting, freshness boosts, and filters. Feedback on these changes highlights a high-risk substring check in query formatting that could silently corrupt search queries, a vulnerability in filter normalization where malformed filters produce nonsense outputs, and minor code quality issues in the MetaSearchClient constructor regarding an unused parameter and unnecessary argument forwarding.

Comment thread dingo/retrieval/mteb_adapter.py Outdated
Comment on lines +338 to +342
    if "field" in item and "value" in item:
        item = dict(item)
        item.setdefault("operator", MetaSearchClient._default_filter_operator(item))
        normalized.append(item)
        continue

Contributor

medium

If an explicit filter is malformed (e.g., it contains "field" but is missing "value", or vice versa), it will currently fall through to the shortcut normalization loop. This treats "field" or "value" as a custom field name, producing a nonsense filter like {"field": "field", "operator": "FILTER_OP_EQ", "value": "some_field_name"}.

To prevent this, ensure that if either "field" or "value" is present in the item, it is treated as an explicit filter attempt and skipped if incomplete.

Suggested change
- if "field" in item and "value" in item:
-     item = dict(item)
-     item.setdefault("operator", MetaSearchClient._default_filter_operator(item))
-     normalized.append(item)
-     continue
+ if "field" in item or "value" in item:
+     if "field" in item and "value" in item:
+         item = dict(item)
+         item.setdefault("operator", MetaSearchClient._default_filter_operator(item))
+         normalized.append(item)
+     continue
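
The guarded behaviour described in this comment can be tried in isolation with a minimal sketch. The function name `normalize_filters`, the shortcut form (a dict such as `{"year": 2024}` expanding to one equality filter per key), and the use of `FILTER_OP_EQ` as the default operator are assumptions for illustration; they are not the actual code in dingo/retrieval/mteb_adapter.py.

```python
from __future__ import annotations

from typing import Any

# Assumed default operator, borrowed from the example filter quoted in the review.
DEFAULT_OP = "FILTER_OP_EQ"


def normalize_filters(items: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Hypothetical stand-in for MetaSearchClient._normalize_filters."""
    normalized: list[dict[str, Any]] = []
    for item in items:
        # Any item that mentions "field" or "value" is an explicit-filter attempt.
        if "field" in item or "value" in item:
            if "field" in item and "value" in item:
                explicit = dict(item)  # copy so the caller's dict is not mutated
                explicit.setdefault("operator", DEFAULT_OP)
                normalized.append(explicit)
            # Incomplete explicit filters are dropped here instead of
            # falling through to the shortcut expansion below.
            continue
        # Shortcut form (assumed): each key/value pair becomes one equality filter.
        for name, value in item.items():
            normalized.append({"field": name, "operator": DEFAULT_OP, "value": value})
    return normalized
```

With this shape, `normalize_filters([{"field": "year"}])` yields an empty list rather than a filter on a field literally named "field". A stricter variant could raise `ValueError` for incomplete filters instead of skipping them silently; which is preferable depends on how filters reach the client.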

Comment on lines +291 to +304
    def __init__(
        self,
        *args,
        search_type: str = "paper",
        sort_by: str | None = None,
        freshness_boost: str | None = None,
        filters: list[dict[str, Any]] | dict[str, Any] | None = None,
        **kwargs: Any,
    ) -> None:
        self.search_type = search_type
        self.sort_by = sort_by
        self.freshness_boost = freshness_boost
        self.filters = self._normalize_filters(filters)
        super().__init__(*args, **kwargs)

Contributor

medium

There are two improvement opportunities in this constructor:

  1. Unused Parameter: search_type is stored as self.search_type but is never used anywhere in MetaSearchClient (e.g., in _build_public_payload or _build_local_payload). If the Sciverse meta-search API expects a search type parameter, it should be included in the payload. Otherwise, consider removing it to avoid dead code.
  2. Unnecessary *args: Forwarding *args to super().__init__ is unnecessary and can be error-prone. Since search clients are typically instantiated via keyword arguments (e.g., in create_client), it is cleaner and more idiomatic to use **kwargs and forward them directly.

Suggested change
- def __init__(
-     self,
-     *args,
-     search_type: str = "paper",
-     sort_by: str | None = None,
-     freshness_boost: str | None = None,
-     filters: list[dict[str, Any]] | dict[str, Any] | None = None,
-     **kwargs: Any,
- ) -> None:
-     self.search_type = search_type
-     self.sort_by = sort_by
-     self.freshness_boost = freshness_boost
-     self.filters = self._normalize_filters(filters)
-     super().__init__(*args, **kwargs)
+ def __init__(
+     self,
+     search_type: str = "paper",
+     sort_by: str | None = None,
+     freshness_boost: str | None = None,
+     filters: list[dict[str, Any]] | dict[str, Any] | None = None,
+     **kwargs: Any,
+ ) -> None:
+     self.search_type = search_type
+     self.sort_by = sort_by
+     self.freshness_boost = freshness_boost
+     self.filters = self._normalize_filters(filters)
+     super().__init__(**kwargs)
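
The keyword-only forwarding pattern recommended in this comment can be sketched on its own. `BaseSearchClient` and its `base_url` and `timeout` parameters are hypothetical, since the real parent class is not shown in this review, and `search_type` and `filters` are left out for brevity. The bare `*` goes one step further than the suggestion above by rejecting positional arguments outright.

```python
from __future__ import annotations

from typing import Any


class BaseSearchClient:
    """Hypothetical parent class; the real base class is not shown in this review."""

    def __init__(self, *, base_url: str = "http://localhost", timeout: float = 30.0) -> None:
        self.base_url = base_url
        self.timeout = timeout


class MetaSearchClient(BaseSearchClient):
    """Sketch of a subclass that accepts and forwards keyword arguments only."""

    def __init__(
        self,
        *,  # bare star: every argument must be passed by keyword
        sort_by: str | None = None,
        freshness_boost: str | None = None,
        **kwargs: Any,
    ) -> None:
        self.sort_by = sort_by
        self.freshness_boost = freshness_boost
        # Keywords not consumed above are passed to the parent unchanged.
        super().__init__(**kwargs)
```

A call such as `MetaSearchClient(sort_by="date", timeout=5.0)` sets `sort_by` on the subclass and lets the parent handle `timeout`, while `MetaSearchClient("paper")` raises `TypeError` instead of silently binding the value to whichever parameter happens to come first.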

@e06084 e06084 merged commit eeda720 into MigoXLab:dev Jun 10, 2026
2 checks passed