feat: add meta-search client#426
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a new meta_search backend (MetaSearchClient) for Sciverse metadata search, refactoring the existing AgenticSearchClient to share common logic, and updates the CLI and configuration to support new parameters like sorting, freshness boosts, and filters. Feedback on these changes highlights a high-risk substring check in query formatting that could silently corrupt search queries, a vulnerability in filter normalization where malformed filters produce nonsense outputs, and minor code quality issues in the MetaSearchClient constructor regarding an unused parameter and unnecessary argument forwarding.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| if "field" in item and "value" in item: | ||
| item = dict(item) | ||
| item.setdefault("operator", MetaSearchClient._default_filter_operator(item)) | ||
| normalized.append(item) | ||
| continue |
There was a problem hiding this comment.
If an explicit filter is malformed (e.g., it contains "field" but is missing "value", or vice versa), it will currently fall through to the shortcut normalization loop. This treats "field" or "value" as a custom field name, producing a nonsense filter like {"field": "field", "operator": "FILTER_OP_EQ", "value": "some_field_name"}.
To prevent this, ensure that if either "field" or "value" is present in the item, it is treated as an explicit filter attempt and skipped if incomplete.
| if "field" in item and "value" in item: | |
| item = dict(item) | |
| item.setdefault("operator", MetaSearchClient._default_filter_operator(item)) | |
| normalized.append(item) | |
| continue | |
| if "field" in item or "value" in item: | |
| if "field" in item and "value" in item: | |
| item = dict(item) | |
| item.setdefault("operator", MetaSearchClient._default_filter_operator(item)) | |
| normalized.append(item) | |
| continue |
| def __init__( | ||
| self, | ||
| *args, | ||
| search_type: str = "paper", | ||
| sort_by: str | None = None, | ||
| freshness_boost: str | None = None, | ||
| filters: list[dict[str, Any]] | dict[str, Any] | None = None, | ||
| **kwargs: Any, | ||
| ) -> None: | ||
| self.search_type = search_type | ||
| self.sort_by = sort_by | ||
| self.freshness_boost = freshness_boost | ||
| self.filters = self._normalize_filters(filters) | ||
| super().__init__(*args, **kwargs) |
There was a problem hiding this comment.
There are two improvement opportunities in this constructor:
- Unused Parameter:
search_typeis stored asself.search_typebut is never used anywhere inMetaSearchClient(e.g., in_build_public_payloador_build_local_payload). If the Sciverse meta-search API expects a search type parameter, it should be included in the payload. Otherwise, consider removing it to avoid dead code. - Unnecessary
*args: Forwarding*argstosuper().__init__is unnecessary and can be error-prone. Since search clients are typically instantiated via keyword arguments (e.g., increate_client), it is cleaner and more idiomatic to use**kwargsand forward them directly.
| def __init__( | |
| self, | |
| *args, | |
| search_type: str = "paper", | |
| sort_by: str | None = None, | |
| freshness_boost: str | None = None, | |
| filters: list[dict[str, Any]] | dict[str, Any] | None = None, | |
| **kwargs: Any, | |
| ) -> None: | |
| self.search_type = search_type | |
| self.sort_by = sort_by | |
| self.freshness_boost = freshness_boost | |
| self.filters = self._normalize_filters(filters) | |
| super().__init__(*args, **kwargs) | |
| def __init__( | |
| self, | |
| search_type: str = "paper", | |
| sort_by: str | None = None, | |
| freshness_boost: str | None = None, | |
| filters: list[dict[str, Any]] | dict[str, Any] | None = None, | |
| **kwargs: Any, | |
| ) -> None: | |
| self.search_type = search_type | |
| self.sort_by = sort_by | |
| self.freshness_boost = freshness_boost | |
| self.filters = self._normalize_filters(filters) | |
| super().__init__(**kwargs) |
No description provided.