Conversation
Working on "Code Review": ✅ Workflow completed successfully.

🚅 Deployed to the QueryWeaver-pr-386 environment in queryweaver

Important: Review skipped. Auto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI.
Dependency Review: ✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found. Scanned files: none.
```python
except Exception as e:  # pylint: disable=broad-except
    error_msg = str(e)
    logging.warning("%s API key validation failed: %s", vendor.capitalize(), error_msg)
```
Check failure (Code scanning / CodeQL): Log Injection (High)
Copilot Autofix (AI, 23 days ago)
In general, to fix log injection you should sanitize or validate any user-provided values before logging them, removing or escaping newline and other control characters that could break log structure. Ideally, you also constrain such values to an expected set (e.g., known vendor identifiers) and log a safe fallback if the input is unexpected.
For this specific case, the minimal, non-breaking fix is to sanitize vendor (or a copy used for logging) by stripping or replacing newline and carriage-return characters (and optionally other control characters) before using it in the log message. This keeps all existing behavior (functionality and log content) the same for normal, valid inputs and just normalizes malicious or malformed values into a single-line safe string.
Concretely:

- Introduce a small helper inside `validate_api_key` to sanitize a string for logging, or just sanitize `vendor` right before logging.
- For example, create a local variable `safe_vendor_for_log` from `vendor` that replaces `\r` and `\n` with empty strings (or spaces), and maybe trims leading/trailing whitespace.
- Use `safe_vendor_for_log.capitalize()` in the `logging.warning` call on line 76 instead of `vendor.capitalize()`.

All changes occur within api/routes/settings.py in the shown function. No new imports are required since we can use built-in string methods.
```diff
@@ -73,7 +73,9 @@
     except Exception as e:  # pylint: disable=broad-except
         error_msg = str(e)
-        logging.warning("%s API key validation failed: %s", vendor.capitalize(), error_msg)
+        # Sanitize vendor before logging to prevent log injection via control characters
+        safe_vendor_for_log = vendor.replace("\r", "").replace("\n", "").strip()
+        logging.warning("%s API key validation failed: %s", safe_vendor_for_log.capitalize(), error_msg)

         # Check for common error messages
         if "invalid" in error_msg.lower() or "authentication" in error_msg.lower():
```
```python
            )
        else:
            return JSONResponse(
                content={"valid": False, "error": f"Failed to validate API key: {error_msg}"},
```
Check warning (Code scanning / CodeQL): Information exposure through an exception (Medium)
Copilot Autofix (AI, 23 days ago)
In general, to fix this class of issue the server should never send raw exception messages or stack traces to clients. Instead, it should (a) log the full error message and stack trace on the server side for diagnostics, and (b) return a generic, user-friendly error string or at most a coarse error category.
For this specific function, the best fix without changing behavior for known cases is:

- Keep logging the detailed `error_msg` (and ideally a stack trace) to the server logs.
- Preserve the existing tailored responses for "invalid/authentication" and "quota/rate" error patterns.
- For all other errors (the `else` branch), stop including `error_msg` in the JSON response. Replace it with a generic message like `"Failed to validate API key due to an internal error"` or similar, so the client no longer sees provider/internal details.
- Optionally, enhance logging to include the full stack trace with `logging.exception` so that removing detail from the response does not degrade debuggability.
Concretely, in api/routes/settings.py:

- In the `except Exception as e` block, replace the final `JSONResponse` (lines 90–92) so that the `error` field is a fixed generic string and does not interpolate `error_msg`.
- Optionally, also change the logging call to `logging.exception` to record a stack trace, while not affecting the response.

No new imports are strictly necessary; we can reuse the existing logging module.
```diff
@@ -73,7 +73,7 @@
     except Exception as e:  # pylint: disable=broad-except
         error_msg = str(e)
-        logging.warning("%s API key validation failed: %s", vendor.capitalize(), error_msg)
+        logging.exception("%s API key validation failed: %s", vendor.capitalize(), error_msg)

         # Check for common error messages
         if "invalid" in error_msg.lower() or "authentication" in error_msg.lower():
@@ -88,6 +88,9 @@
             )
         else:
             return JSONResponse(
-                content={"valid": False, "error": f"Failed to validate API key: {error_msg}"},
+                content={
+                    "valid": False,
+                    "error": "Failed to validate API key due to an internal server error",
+                },
                 status_code=500
             )
```
Pull request overview
This PR adds configurable AI model usage to QueryWeaver, allowing users to select and validate their preferred AI provider (OpenAI, Google Gemini, or Anthropic) through the settings UI. The custom API keys and models are stored in memory only and passed to all LLM agents for query processing.
Changes:
- Added frontend settings context and UI for configuring AI vendor, API key, and model name with validation
- Implemented backend API endpoint for validating API keys across multiple providers
- Updated all AI agents to accept and use custom API keys and models
- Enhanced configuration logic to support multiple AI providers with automatic fallback
Reviewed changes
Copilot reviewed 19 out of 20 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| app/src/types/api.ts | Added optional custom API configuration fields to ChatRequest interface |
| app/src/services/chat.ts | Modified to include custom API settings in chat requests with vendor-specific prefixing |
| app/src/pages/Settings.tsx | Added comprehensive AI model configuration UI with validation and provider selection |
| app/src/contexts/SettingsContext.tsx | Created new context for managing AI provider settings in memory |
| app/src/components/modals/SettingsModal.tsx | Completely refactored to support AI model configuration instead of query rules |
| app/src/components/layout/Sidebar.tsx | Changed settings icon from Settings to Sliders |
| app/src/components/chat/ChatInterface.tsx | Integrated custom AI settings into chat query requests |
| app/src/App.tsx | Added SettingsProvider to application context hierarchy |
| api/routes/settings.py | Created API endpoint for validating AI provider API keys |
| api/core/text2sql.py | Updated to extract and pass custom API settings to all agents |
| api/config.py | Enhanced to support multiple AI providers with priority-based fallback logic |
| api/app_factory.py | Registered new settings router |
| api/agents/utils.py | Updated BaseAgent to accept custom API key and model parameters |
| api/agents/response_formatter_agent.py | Modified to use custom API settings when provided |
| api/agents/relevancy_agent.py | Modified to use custom API settings when provided |
| api/agents/follow_up_agent.py | Modified to use custom API settings when provided |
| api/agents/analysis_agent.py | Modified to use custom API settings when provided |
| README.md | Updated documentation to reflect multi-provider support |
| .env.example | Updated with comprehensive AI provider configuration examples |
Files not reviewed (1)
- app/package-lock.json: Language not supported
```tsx
export const SettingsProvider: React.FC<SettingsProviderProps> = ({ children }) => {
  const [vendor, setVendor] = useState<AIVendor>('openai');
  const [apiKey, setApiKey] = useState<string | null>(null);
  const [modelName, setModelName] = useState<string>('gpt-4o-mini');
```
The default model name 'gpt-4o-mini' is inconsistent with the example models shown elsewhere in the codebase (e.g., 'gpt-4.1' in AI_VENDORS). This inconsistency could confuse users about which model names are valid.
```diff
-  const [modelName, setModelName] = useState<string>('gpt-4o-mini');
+  const [modelName, setModelName] = useState<string>('gpt-4.1');
```
```tsx
setRules("");
setTempVendor('openai');
setTempApiKey('');
setTempModelName('gpt-4o');
```
The handleClear function sets the model name to 'gpt-4o', which is inconsistent with other default model names used throughout the codebase ('gpt-4.1', 'gpt-4o-mini'). Standardize on a single valid default model name.
```diff
-setTempModelName('gpt-4o');
+setTempModelName('gpt-4.1');
```
```python
custom_api_key = chat_data.custom_api_key if hasattr(chat_data, 'custom_api_key') else None
custom_model = chat_data.custom_model if hasattr(chat_data, 'custom_model') else None
```
Using hasattr for Pydantic model attributes is unnecessary since ChatRequest already defines these fields with default None values. Access them directly as chat_data.custom_api_key and chat_data.custom_model for cleaner code.
```diff
-custom_api_key = chat_data.custom_api_key if hasattr(chat_data, 'custom_api_key') else None
-custom_model = chat_data.custom_model if hasattr(chat_data, 'custom_model') else None
+custom_api_key = chat_data.custom_api_key
+custom_model = chat_data.custom_model
```
```python
custom_api_key = confirm_data.custom_api_key if hasattr(confirm_data, 'custom_api_key') else None
custom_model = confirm_data.custom_model if hasattr(confirm_data, 'custom_model') else None
```
Using hasattr for Pydantic model attributes is unnecessary since ConfirmRequest defines these fields with default None values. Access them directly for cleaner code.
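To see why the `hasattr` guards are redundant on both request models, consider this minimal sketch (the field names come from the diff; the full class definitions are assumptions). Pydantic fields declared with a `None` default always exist on every instance, so direct attribute access cannot raise `AttributeError`.

```python
from typing import Optional

from pydantic import BaseModel


class ChatRequest(BaseModel):
    """Sketch; the real model's other fields are omitted."""
    custom_api_key: Optional[str] = None
    custom_model: Optional[str] = None


chat_data = ChatRequest()
assert hasattr(chat_data, "custom_api_key")  # always True for declared fields
custom_api_key = chat_data.custom_api_key    # None when the client omitted it
custom_model = chat_data.custom_model
```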
Findings summary:

- Importance counts: Major (9)

Key themes:

- Custom model/credential handling is inconsistent: the confirm flow drops user-provided keys/models, and both the settings modal and page overwrite saved model names when opened.
- Vendor prefix logic double-prefixes Gemini (and other) models in the UI and chat service, so validated models never reach the backend intact.
- The new configuration paths have unsafe defaults: Anthropic-only setups fall back to Azure embeddings without credentials, and the `/api/validate-api-key` endpoint is unauthenticated and unthrottled.
Next steps:
- Restore per-request credentials end-to-end (types, ChatInterface, ChatService) and ensure UI state preserves user-entered models instead of resetting to defaults.
- Fix vendor-prefix construction so values already containing a provider aren’t rewritten, and align the documented model format with what the backend expects.
- Harden backend configuration: only select embeddings when credentials exist, and require authentication + rate limiting before invoking litellm for key validation.
```python
    EMBEDDING_MODEL_NAME = "voyage/voyage-3"
else:
    # Anthropic has no native embeddings, fall back to Azure embeddings
    EMBEDDING_MODEL_NAME = "azure/text-embedding-ada-002"
```
[major]: When only ANTHROPIC_API_KEY is configured, the new fallback forces EMBEDDING_MODEL_NAME = "azure/text-embedding-ada-002" (lines 96‑104) even if no Azure credentials are present. That makes the documented Anthropic-only setup impossible: every embedding call will now hit Azure without a key and fail at runtime, despite the user explicitly choosing Anthropic. Before selecting the Azure embedding model, ensure the Azure env vars exist, or fall back to a provider that actually has credentials (e.g., Voyage only when VOYAGE_API_KEY is set).
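One way to make the fallback credential-aware, as the comment asks, is to check for the relevant environment variables before committing to an embedding provider. A sketch assuming LiteLLM-style variable names (`VOYAGE_API_KEY`, `AZURE_API_KEY`); the variables actually read in api/config.py may differ.

```python
import os

if os.getenv("VOYAGE_API_KEY"):
    # Prefer Voyage embeddings only when its key is actually configured
    EMBEDDING_MODEL_NAME = "voyage/voyage-3"
elif os.getenv("AZURE_API_KEY"):
    # Anthropic has no native embeddings; use Azure only when it is usable
    EMBEDDING_MODEL_NAME = "azure/text-embedding-ada-002"
else:
    # Fail fast at startup instead of at the first embedding call
    raise RuntimeError(
        "No embedding provider configured: set VOYAGE_API_KEY or Azure credentials"
    )
```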
```tsx
use_user_rules: useRulesFromDatabase, // Backend fetches from DB when true
customApiKey: isApiKeyValid ? apiKey : undefined,
customModel: isApiKeyValid ? modelName : undefined,
customVendor: isApiKeyValid ? vendor : undefined,
```
[major]: The confirm flow never forwards the user’s custom credentials. streamConfirmOperation is still called without customApiKey/customModel/customVendor, and the ConfirmRequest type doesn’t have those fields, so destructive queries and confirmation retries run with the server default key/model even when the initial chat used a custom provider. To keep confirmation consistent (and avoid executing under the wrong account or failing because the vendor/model mismatch), extend ConfirmRequest, ChatInterface, and ChatService to pass through the optional custom credentials just like the initial chat request.
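On the backend side, the shape of that fix is to mirror the chat request's optional credential fields on `ConfirmRequest` and forward them unchanged. A minimal sketch assuming the snake_case field names used elsewhere in the PR; the real model's other fields are omitted.

```python
from typing import Optional

from pydantic import BaseModel


class ConfirmRequest(BaseModel):
    """Sketch: confirm payload carrying the same optional
    credentials as the initial chat request."""
    custom_api_key: Optional[str] = None
    custom_model: Optional[str] = None


# The confirm route would then pass confirm_data.custom_api_key and
# confirm_data.custom_model to the agents exactly as the chat route does.
```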
```python
@settings_router.post("/validate-api-key")
async def validate_api_key(request: Request, data: ValidateKeyRequest):
```
[major]: /api/validate-api-key is exposed without any authentication guard, so unauthenticated traffic can trigger arbitrary LiteLLM calls and abuse your infrastructure. The rest of the settings surface requires a token/session, but this route skips token_required (or any equivalent dependency). Please enforce authentication before issuing upstream validation requests to prevent anonymous DoS and keep the endpoint consistent with the rest of the API.
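A minimal way to bring the route in line with the rest of the API is to attach the existing auth guard as a dependency. This sketch assumes `token_required` (named in this review) is a FastAPI-compatible dependency; its import path and the `ValidateKeyRequest` stub below are hypothetical.

```python
from fastapi import APIRouter, Depends, Request
from pydantic import BaseModel

from api.auth import token_required  # hypothetical import path

settings_router = APIRouter()


class ValidateKeyRequest(BaseModel):
    """Stub of the request body; the real fields live in api/routes/settings.py."""
    vendor: str
    api_key: str
    model: str


# The dependency makes FastAPI reject unauthenticated calls before any
# upstream LiteLLM validation request is issued.
@settings_router.post("/validate-api-key", dependencies=[Depends(token_required)])
async def validate_api_key(request: Request, data: ValidateKeyRequest):
    ...
```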
```python
    )
    # Note: 'gemini' is accepted as vendor (Google's LiteLLM prefix)

    try:
```
[major]: Every POST to /api/validate-api-key results in a LiteLLM completion with user‑supplied credentials, but there’s no throttling or caching. An attacker can hammer this unauthenticated endpoint to burn your upstream quota or tie up the server doing proxy calls. Please add per-user/IP rate limiting (or reuse your existing quota middleware) and short‑circuit repeated failures so validation attempts can’t be abused for free resource exhaustion.
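If the existing quota middleware cannot be reused here, even a small in-process, per-IP limiter closes the hole. A self-contained sketch (fixed window, standard library only; the thresholds are arbitrary and the state does not persist across worker processes):

```python
import time
from collections import defaultdict, deque

from fastapi import HTTPException, Request

_WINDOW_SECONDS = 60
_MAX_ATTEMPTS = 5
_attempts: dict[str, deque] = defaultdict(deque)


async def rate_limit_validation(request: Request) -> None:
    """FastAPI dependency: allow at most _MAX_ATTEMPTS calls per IP per window."""
    ip = request.client.host if request.client else "unknown"
    now = time.monotonic()
    window = _attempts[ip]
    # Drop timestamps that have aged out of the window
    while window and now - window[0] > _WINDOW_SECONDS:
        window.popleft()
    if len(window) >= _MAX_ATTEMPTS:
        raise HTTPException(status_code=429, detail="Too many validation attempts")
    window.append(now)


# Usage:
# @settings_router.post("/validate-api-key",
#                       dependencies=[Depends(rate_limit_validation)])
```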
```tsx
const vendorConfig = getVendorConfig(tempVendor);
// Auto-update model name when vendor changes
if (vendorConfig) {
  setTempModelName(vendorConfig.exampleModel);
```
[major]: The useEffect tied to tempVendor always runs setTempModelName(vendorConfig.exampleModel) (lines 58‑62), so opening the modal or focusing the vendor select overwrites any previously saved custom model value with the default example. As a result, users can’t persist their own model names—saving after reopening will always revert to the default. Only auto-fill when the user actually changes vendors and the model field is empty, or track the previous vendor to avoid clobbering existing selections.
```tsx
try {
  // Map vendor to LiteLLM prefix (google -> gemini)
  const vendorPrefix = vendorType === 'google' ? 'gemini' : vendorType;
```
[MAJOR]: vendorPrefix rewrites Google to gemini, yet the modal already asks users to enter the full LiteLLM model name (e.g. gemini/gemini-3-pro-preview). Saving or validating with a Gemini key therefore produces vendor: "gemini" but model: "gemini/gemini-…", which the backend rejects because the provider segment is duplicated. Either stop overriding the vendor when the model already contains a prefix or split the input so you don’t double-prepend gemini/. As written, Gemini validations always fail.
```tsx
useEffect(() => {
  const vendorConfig = AI_VENDORS.find(v => v.value === tempVendor);
  if (vendorConfig) {
    setTempModelName(vendorConfig.exampleModel);
```
[MAJOR]: This effect resets tempModelName to the vendor’s example model whenever tempVendor changes, including when the component simply loads the persisted settings. If a user previously saved gpt-4o-mini and returns later, the state change from openai→openai still reruns the effect and overwrites the field before they click Save. The practical result is that any custom model is silently lost. Only initialize to the example when no user value is present or when the user explicitly switches vendors.
```tsx
setValidationStatus(null);

try {
  const vendorPrefix = vendorType === 'google' ? 'gemini' : vendorType;
```
[MAJOR]: The validation call forces const vendorPrefix = vendorType === 'google' ? 'gemini' : vendorType and then sends model: tempModelName. Because the UI instructs users to enter the full LiteLLM model string (already prefixed with gemini/), every Gemini validation ends up posting gemini/gemini-… to the backend, which doesn’t match the provider-derived vendor. That mismatch makes all Google validations fail even with correct keys. Either accept provider-less suffixes in the UI or detect existing prefixes and avoid double-prepending gemini/.
```ts
// Map vendor to LiteLLM prefix (google -> gemini)
const vendorPrefix = request.customVendor === 'google' ? 'gemini' : request.customVendor;
// Format model name with vendor prefix for LiteLLM
requestBody.custom_model = `${vendorPrefix}/${request.customModel}`;
```
[MAJOR]: custom_model is built as ${vendorPrefix}/${request.customModel} even though the Settings UI already captures the full LiteLLM identifier (e.g. openai/gpt-4o-mini). This produces openai/openai/… for OpenAI and gemini/gemini/… for Google, so the backend rejects the model and silently falls back to defaults. Please send the user-provided model verbatim when it already contains /, or change the UI to collect only the suffix so you can safely prepend the prefix once.
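Whichever way the frontend is corrected, the backend can also defend itself by normalizing the incoming model string rather than trusting a single prefix convention. A hedged sketch; the helper name and the vendor-to-prefix map are illustrative, not the PR's code.

```python
# Map UI vendor identifiers to LiteLLM provider prefixes
_VENDOR_PREFIX = {"openai": "openai", "google": "gemini", "anthropic": "anthropic"}


def normalize_model_name(vendor: str, model: str) -> str:
    """Prepend the provider prefix only when the model is not already prefixed."""
    prefix = _VENDOR_PREFIX.get(vendor, vendor)
    if "/" in model:
        # Already a full LiteLLM identifier such as "gemini/gemini-3-pro-preview";
        # pass it through verbatim to avoid "gemini/gemini/..." duplication.
        return model
    return f"{prefix}/{model}"


assert normalize_model_name("google", "gemini/gemini-3-pro-preview") == "gemini/gemini-3-pro-preview"
assert normalize_model_name("openai", "gpt-4.1") == "openai/gpt-4.1"
```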
|
@copilot add description to the PR
|
@galshubeli I've opened a new pull request, #403, to work on those changes. Once the pull request is ready, I'll request review from you.
|
Good feature idea — runtime model selection is valuable.

Must fix:

Should fix:
No description provided.