This repository was archived by the owner on Sep 3, 2025. It is now read-only.

fix(ai): improves handling of llm context window limit #5849

Merged
mvilanova merged 2 commits into main from fix/llm-content-length-exceeded
Mar 20, 2025

Conversation

@mvilanova
Contributor

Summary of Changes to AI Service Module

Key Improvements

1. Model Token Limit Management

  • Added get_model_token_limit() function that returns maximum token capacity for different LLM models
  • Included support for latest models (GPT-4o, Claude 3.5/3.7 Sonnet)
  • Implemented a safety buffer parameter (default: 5%) to prevent hitting absolute limits
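A minimal sketch of what `get_model_token_limit()` might look like, assuming a simple lookup table; the model names, token counts, and fallback default below are illustrative assumptions, not the actual values from this PR:

```python
# Illustrative limits only -- real values come from the model providers.
DEFAULT_TOKEN_LIMIT = 128_000

MODEL_TOKEN_LIMITS = {
    "gpt-4o": 128_000,
    "claude-3-5-sonnet": 200_000,
    "claude-3-7-sonnet": 200_000,
}


def get_model_token_limit(model_name: str, safety_buffer: float = 0.05) -> int:
    """Return the usable token capacity for a model.

    A safety buffer (default 5%) is subtracted so prompts never sit
    right at the model's absolute context window limit.
    """
    limit = MODEL_TOKEN_LIMITS.get(model_name, DEFAULT_TOKEN_LIMIT)
    return int(limit * (1 - safety_buffer))
```

Unknown models fall back to a conservative default rather than raising, so adding a new model is a one-line table change.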

2. Dynamic Token Truncation

  • Updated truncate_prompt() to use model-specific limits instead of fixed values
  • Enhanced function signature to accept the model token limit as a parameter
  • Improved logging to show the actual token limit being applied
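The updated `truncate_prompt()` signature could look like the sketch below. Whitespace splitting stands in for the model's real tokenizer here, and the log message is illustrative, not the PR's actual wording:

```python
import logging

logger = logging.getLogger(__name__)


def truncate_prompt(prompt: str, token_limit: int) -> str:
    """Truncate a prompt to fit the model-specific token limit.

    Splitting on whitespace approximates tokenization for illustration;
    a real implementation would count tokens with the model's tokenizer.
    """
    tokens = prompt.split()
    if len(tokens) <= token_limit:
        return prompt
    logger.warning("Prompt exceeds token limit %d; truncating.", token_limit)
    return " ".join(tokens[:token_limit])
```

Passing the limit in as a parameter, instead of hardcoding it, is what lets each caller apply its own model's limit.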

3. Smarter Prompt Processing

  • Modified prompt processing in signal and incident summary generation
  • Each function now dynamically calculates appropriate token limits based on the model in use
  • Applied consistent approach across different generation functions
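The flow described above might be wired together as in this sketch. The `generate_incident_summary` wrapper and the stub helpers are hypothetical stand-ins for the real functions:

```python
# Hypothetical sketch of the per-model flow in a summary-generation function.
MODEL_TOKEN_LIMITS = {"gpt-4o": 128_000, "claude-3-5-sonnet": 200_000}


def get_model_token_limit(model: str, safety_buffer: float = 0.05) -> int:
    return int(MODEL_TOKEN_LIMITS.get(model, 128_000) * (1 - safety_buffer))


def truncate_prompt(prompt: str, token_limit: int) -> str:
    words = prompt.split()
    return prompt if len(words) <= token_limit else " ".join(words[:token_limit])


def generate_incident_summary(incident_text: str, model: str = "gpt-4o") -> str:
    # Derive the limit dynamically from the model in use, then truncate.
    token_limit = get_model_token_limit(model)
    prompt = truncate_prompt(incident_text, token_limit)
    # ... hand `prompt` to the LLM client here ...
    return prompt
```

Signal and incident summary generation would each follow this same pattern, so the token-limit logic lives in one place.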

Benefits

  • Model-Specific Handling: Each model gets appropriate token limits based on its capabilities
  • Safety Margins: Buffer prevents errors from edge cases or slight miscalculations
  • Maintainability: Centralized token limit management makes updates easier
  • Future-Proofing: Simple to add new models as they become available

These changes provide a more robust approach to managing context windows across different LLM models while maximizing available context space.

@mvilanova mvilanova added the bug Something isn't working label Mar 20, 2025
@mvilanova mvilanova requested review from a user and whitdog47 March 20, 2025 18:57
@mvilanova mvilanova self-assigned this Mar 20, 2025
@mvilanova mvilanova merged commit d6ea58a into main Mar 20, 2025
9 checks passed
@mvilanova mvilanova deleted the fix/llm-content-length-exceeded branch March 20, 2025 19:05