
feat: Add Rate Limiting Middleware to Prevent LLM API Overuse#56

Open
SandeepChauhan00 wants to merge 3 commits into INCF:main from SandeepChauhan00:feature/rate-limiting

Conversation


@SandeepChauhan00 SandeepChauhan00 commented Feb 9, 2026

Summary

Adds rate limiting middleware to the /api/chat endpoint using slowapi to prevent uncontrolled LLM API usage, protect against abuse, and improve production readiness.

Closes #55

Problem

The /api/chat endpoint in backend/main.py had no rate limiting. Any user could send unlimited concurrent requests, leading to:

  • Uncontrolled Google Gemini/Vertex AI API costs
  • Unhandled rate limit errors from LLM providers
  • Vulnerability to bot abuse or accidental request loops

Changes Made

backend/main.py

  • Added slowapi rate limiter with per-IP tracking
  • Added @limiter.limit(RATE_LIMIT) decorator to /api/chat endpoint
  • Added custom 429 Too Many Requests exception handler with user-friendly message
  • Added Request parameter to chat_endpoint (required by slowapi)
  • Added structured logging for incoming requests, responses, and errors
  • Rate limit is configurable via RATE_LIMIT environment variable (default: 10/minute)

pyproject.toml

  • Added "slowapi>=0.1.9" to project dependencies

.env.template

  • Added RATE_LIMIT=10/minute configuration variable

How It Works

  • Each client IP is tracked independently
  • When limit is exceeded, returns 429 with a user-friendly JSON response
  • All requests are logged with client IP, session ID, and processing time
  • Rate limit is fully configurable without code changes via .env

Testing

  • Verified slowapi imports correctly
  • Verified rate limiting code present in main.py
  • No breaking changes to existing endpoints or functionality

Configuration

Variable     Default      Description
RATE_LIMIT   10/minute    Maximum requests per IP per time window

The value accepts any slowapi-style limit string, e.g. 5/second, 100/hour, 1000/day
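A limit string like 10/minute splits into a count and a time unit. A hypothetical parser sketch for illustration only (slowapi actually delegates this to the limits package, whose grammar is more featureful):

```python
import re

# Window lengths in seconds for each supported unit.
UNITS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}


def parse_rate_limit(value: str) -> tuple[int, int]:
    """Parse 'N/unit' into (max_requests, window_seconds)."""
    match = re.fullmatch(r"(\d+)/(second|minute|hour|day)", value.strip())
    if match is None:
        raise ValueError(f"invalid rate limit: {value!r}")
    count, unit = match.groups()
    return int(count), UNITS[unit]
```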

Acceptance Criteria

  • Rate limiting middleware added to /api/chat endpoint in backend/main.py
  • Limit is configurable via .env file
  • Returns clear 429 response with user-friendly error message
  • Basic request count logging added for monitoring
  • Existing tests still pass after integration

@QuantumByte-01
Collaborator

Clean implementation — slowapi middleware with configurable RATE_LIMIT env var, proper 429 handler with retry_after, and useful request/response logging. Dependency added correctly to pyproject.toml. Good understanding of the codebase.

@QuantumByte-01
Collaborator

This PR has merge conflicts with the current main branch, likely due to recent merges touching main.py. Please rebase against main and resolve the conflicts — the implementation itself is good and will be merged once clean.
