Problem
LLM API calls can fail due to:
- Rate limiting (429 errors)
- Temporary network issues
- Service unavailability (503 errors)
- Timeout errors
Currently, these failures cause immediate workflow termination without retry attempts.
Proposed Solution
Implement retry logic with exponential backoff for transient failures:
# flo_ai/llm/retry.py
import time
import random
import logging
from typing import Callable, TypeVar, Optional
from functools import wraps

logger = logging.getLogger(__name__)

T = TypeVar('T')


# Placeholder exception types for transient failures; in practice these would
# wrap or map to the provider SDKs' own error classes.
class RateLimitError(Exception):
    pass


class ServiceUnavailableError(Exception):
    pass


class RetryConfig:
    def __init__(
        self,
        max_retries: int = 3,
        initial_delay: float = 1.0,
        max_delay: float = 60.0,
        exponential_base: float = 2.0,
        jitter: bool = True,
    ):
        self.max_retries = max_retries
        self.initial_delay = initial_delay
        self.max_delay = max_delay
        self.exponential_base = exponential_base
        self.jitter = jitter


def with_retry(config: Optional[RetryConfig] = None):
    """Decorator for retrying LLM API calls with exponential backoff"""
    if config is None:
        config = RetryConfig()

    def decorator(func: Callable[..., T]) -> Callable[..., T]:
        @wraps(func)
        def wrapper(*args, **kwargs) -> T:
            last_exception = None
            for attempt in range(config.max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except (RateLimitError, TimeoutError, ServiceUnavailableError) as e:
                    last_exception = e
                    if attempt == config.max_retries:
                        raise
                    # Calculate delay with exponential backoff
                    delay = min(
                        config.initial_delay * (config.exponential_base ** attempt),
                        config.max_delay,
                    )
                    # Add jitter to prevent thundering herd
                    if config.jitter:
                        delay *= (0.5 + random.random() * 0.5)
                    logger.warning(
                        f"Attempt {attempt + 1}/{config.max_retries} failed: {e}. "
                        f"Retrying in {delay:.2f}s..."
                    )
                    time.sleep(delay)
            raise last_exception

        return wrapper

    return decorator
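With the defaults above, the pre-jitter delays are 1 s, 2 s, and 4 s, capped at max_delay. If flo_ai's LLM clients also expose async call paths, an async-aware counterpart would be needed so retries don't block the event loop. A minimal sketch under that assumption, reusing the names defined above (not part of the current codebase):

import asyncio

def with_retry_async(config: Optional[RetryConfig] = None):
    """Async counterpart of with_retry; waits with asyncio.sleep between attempts."""
    if config is None:
        config = RetryConfig()

    def decorator(func):
        @wraps(func)
        async def wrapper(*args, **kwargs):
            for attempt in range(config.max_retries + 1):
                try:
                    return await func(*args, **kwargs)
                except (RateLimitError, TimeoutError, ServiceUnavailableError) as e:
                    if attempt == config.max_retries:
                        raise
                    delay = min(
                        config.initial_delay * (config.exponential_base ** attempt),
                        config.max_delay,
                    )
                    if config.jitter:
                        delay *= (0.5 + random.random() * 0.5)
                    logger.warning(
                        f"Attempt {attempt + 1}/{config.max_retries} failed: {e}. "
                        f"Retrying in {delay:.2f}s..."
                    )
                    await asyncio.sleep(delay)

        return wrapper

    return decorator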
Usage
# In LLM client
class OpenAIClient:
    @with_retry(RetryConfig(max_retries=3, initial_delay=1.0))
    def generate(self, prompt: str) -> str:
        return self.client.chat.completions.create(...)
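The exception tuple used by with_retry is a placeholder, so each provider integration needs to translate its SDK's transient errors into retryable ones. A rough sketch of such a mapping follows; the class names assume recent openai and anthropic Python SDKs and should be checked against the versions flo_ai pins, and the Gemini client would need its own equivalents:

# Sketch: collect provider-specific transient errors in one place.
# These imports assume the openai and anthropic packages are installed;
# in flo_ai they would likely be guarded as optional dependencies.
import openai
import anthropic

RETRYABLE_EXCEPTIONS = (
    openai.RateLimitError,       # HTTP 429
    openai.APITimeoutError,      # request timed out
    openai.APIConnectionError,   # transient network failure
    anthropic.RateLimitError,
    anthropic.APITimeoutError,
    anthropic.APIConnectionError,
)

A natural refinement would be letting RetryConfig (or the decorator) accept such a tuple instead of hard-coding exception types, so each client can register its own retryable errors.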
YAML Configuration
agents:
  - name: "my_agent"
    model: "gpt-4"
    retry:
      enabled: true
      max_retries: 3
      initial_delay: 1.0
      max_delay: 60.0
      exponential_base: 2.0
      jitter: true
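To wire this block into the decorator, the agent loader would translate the retry mapping into a RetryConfig. A minimal sketch, assuming the agent entry has already been parsed into a dict (retry_config_from_yaml is a hypothetical helper, not an existing flo_ai function):

from typing import Optional

def retry_config_from_yaml(agent_spec: dict) -> Optional[RetryConfig]:
    """Build a RetryConfig from an agent's parsed YAML entry, or None if disabled."""
    retry = agent_spec.get("retry") or {}
    if not retry.get("enabled", False):
        return None
    return RetryConfig(
        max_retries=retry.get("max_retries", 3),
        initial_delay=retry.get("initial_delay", 1.0),
        max_delay=retry.get("max_delay", 60.0),
        exponential_base=retry.get("exponential_base", 2.0),
        jitter=retry.get("jitter", True),
    )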
Benefits
- ✅ Improved reliability for production workflows
- ✅ Automatic recovery from transient failures
- ✅ Configurable per-agent
- ✅ Prevents cascading failures
- ✅ Better user experience (no manual retries)
Implementation Checklist
- Create retry decorator with exponential backoff
- Add retry configuration to agent schema
- Integrate with all LLM clients (OpenAI, Anthropic, Gemini)
- Add retry metrics/logging
- Update documentation
- Add tests for retry logic (see the test sketch below)
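For the testing item above, one simple approach is to decorate a function that fails a fixed number of times before succeeding and to stub out time.sleep so the test runs instantly. A sketch using pytest, assuming the module layout proposed in this issue:

# tests/test_retry.py (sketch)
import flo_ai.llm.retry as retry_module
from flo_ai.llm.retry import RetryConfig, with_retry, RateLimitError


def test_retries_until_success(monkeypatch):
    monkeypatch.setattr(retry_module.time, "sleep", lambda _s: None)  # skip real waiting
    calls = {"count": 0}

    @with_retry(RetryConfig(max_retries=3, initial_delay=0.01))
    def flaky():
        calls["count"] += 1
        if calls["count"] < 3:
            raise RateLimitError("429")
        return "ok"

    assert flaky() == "ok"
    assert calls["count"] == 3  # two transient failures, then success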
Related Issues
- [BUG] Gemini agents are looping multiple times, even after getting tool response #172 (Gemini looping) - Retry logic could help with transient failures
- Add support for max_tool_count and max iterations from yaml #141 (YAML configuration) - Retry config should be YAML-configurable