feat: Add Rate Limiting Middleware to Prevent LLM API Overuse #56
Open
SandeepChauhan00 wants to merge 3 commits into INCF:main from
Conversation
Collaborator
Clean implementation — slowapi middleware with configurable RATE_LIMIT env var, proper 429 handler with retry_after, and useful request/response logging. Dependency added correctly to pyproject.toml. Good understanding of the codebase.
Collaborator
This PR has merge conflicts with the current main branch, likely due to recent merges touching main.py. Please rebase against main and resolve the conflicts — the implementation itself is good and will be merged once clean.
This was referenced Mar 12, 2026
Summary

Adds rate limiting middleware to the `/api/chat` endpoint using slowapi to prevent uncontrolled LLM API usage, protect against abuse, and improve production readiness.

Closes #55
Problem

The `/api/chat` endpoint in `backend/main.py` had no rate limiting. Any user could send unlimited concurrent requests, driving uncontrolled LLM API usage and leaving the service open to abuse.

Changes Made
`backend/main.py`
- Added a `slowapi` rate limiter with per-IP tracking
- Added the `@limiter.limit(RATE_LIMIT)` decorator to the `/api/chat` endpoint
- Added a `429 Too Many Requests` exception handler with a user-friendly message
- Added a `Request` parameter to `chat_endpoint` (required by slowapi)
- Made the limit configurable via the `RATE_LIMIT` environment variable (default: `10/minute`)

`pyproject.toml`
- Added `"slowapi>=0.1.9"` to project dependencies

`.env.template`
- Added the `RATE_LIMIT=10/minute` configuration variable

How It Works
Requests under the configured limit pass through unchanged; once a client's IP exceeds it, the endpoint returns a `429` with a user-friendly JSON response. The limit itself is read from `.env`.

Testing
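The user-friendly JSON body mentioned above might look like the following; the field names and wording here are assumptions for illustration, not taken from the PR:

```python
import json

def too_many_requests_body(limit: str, retry_after: int) -> str:
    """Build a hypothetical 429 payload; field names are illustrative."""
    return json.dumps({
        "error": f"Rate limit exceeded ({limit}). Please try again later.",
        "retry_after": retry_after,  # seconds until the window resets
    })
```

Including a machine-readable `retry_after` alongside the human-readable message lets well-behaved clients back off automatically, which is likely why the reviewer called it out.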
- Verified that `slowapi` imports correctly in `main.py`

Configuration
`RATE_LIMIT` (default: `10/minute`). Supports any slowapi limit format, e.g. `5/second`, `100/hour`, `1000/day`.

Acceptance Criteria
- Rate limiting applied to the `/api/chat` endpoint in `backend/main.py`
- Limit configurable via the `.env` file
- `429` response with a user-friendly error message
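The `count/period` strings listed under Configuration follow the convention slowapi accepts. A small hypothetical parser shows how such a string maps to a request count and a window length in seconds (this is a sketch of the format, not slowapi's own parser):

```python
def parse_rate(rate: str) -> tuple[int, int]:
    """Parse a 'count/period' limit string into (count, window_seconds)."""
    periods = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}
    count, _, period = rate.partition("/")
    return int(count), periods[period]
```

For example, the default `10/minute` maps to at most 10 requests per 60-second window.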