1 change: 1 addition & 0 deletions docs.json
@@ -210,6 +210,7 @@
"browsers/faq",
"info/concepts",
"info/pricing",
"info/rate-limiting",
"info/support",
"info/unikernels"
]
78 changes: 78 additions & 0 deletions info/rate-limiting.mdx
@@ -0,0 +1,78 @@
---
title: "Rate Limiting"
---

Kernel enforces rate limits on API requests to ensure fair usage and platform stability. When you exceed the rate limit for an endpoint, the API returns a `429 Too Many Requests` response.

## Rate limits by plan

Rate limits are applied per organization, per endpoint. Limits are expressed in requests per minute (RPM):

| Plan | Requests per minute |
| --- | --- |
| Developer (free) | 10 |
| Hobbyist | 25 |
| Start-Up | 100 |
| Enterprise | 250 |

Organizations on a trial use Start-Up rate limits regardless of the selected plan.

## Rate-limited endpoints

The following endpoints enforce rate limits:

| Endpoint | Method |
| --- | --- |
| `/browsers` | `POST` |

Additional endpoints (`POST /browser-pools`, `PUT /browser-pools/:id`, `POST /invocations`) have rate-limiting infrastructure in place, and limits on them may be enforced in the future.

## Response headers

When a rate-limited endpoint is called, the API includes these headers in the response:

| Header | Description | Example |
| --- | --- | --- |
| `X-RateLimit-Limit` | Maximum number of requests allowed (burst capacity) | `100` |
| `X-RateLimit-Remaining` | Number of requests currently available (tokens left in the bucket) | `47` |

These headers are included on both successful and rate-limited responses for rate-limited endpoints.

When a request is rejected, the response also includes:

| Header | Description | Example |
| --- | --- | --- |
| `Retry-After` | Seconds to wait before retrying | `3` |

## How Retry-After is calculated

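Kernel uses a token bucket algorithm for rate limiting. Each organization gets a bucket for each rate-limited endpoint, with capacity equal to the plan's RPM limit. The bucket refills at a steady rate of capacity / 60 tokens per second. For example, a 100 RPM limit refills at about 1.67 tokens per second.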

The `Retry-After` value is the number of seconds until enough tokens have refilled to allow the request, with a minimum of 1 second.
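
The calculation described above can be sketched roughly as follows. This is an illustrative model, not Kernel's actual implementation: the `TokenBucket` class, the assumption that the bucket starts full, and rounding up to whole seconds are all assumptions made for the example. Only the capacity, refill rate, and 1-second minimum come from this page.

```python
import math


class TokenBucket:
    """Illustrative token bucket; a hypothetical model, not Kernel's code."""

    def __init__(self, rpm: int, now: float = 0.0) -> None:
        self.capacity = float(rpm)  # bucket capacity equals the RPM limit
        self.tokens = float(rpm)    # assumption: the bucket starts full
        self.updated_at = now

    def _refill(self, now: float) -> None:
        # Refill at capacity / 60 tokens per second, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        refilled = elapsed * self.capacity / 60.0
        self.tokens = min(self.capacity, self.tokens + refilled)
        self.updated_at = now

    def try_acquire(self, now: float) -> int:
        """Return 0 if the request is allowed, else a Retry-After in seconds."""
        self._refill(now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return 0
        missing = 1.0 - self.tokens
        seconds = missing * 60.0 / self.capacity
        # Assumption: round up to whole seconds; the 1-second minimum is documented.
        return max(1, math.ceil(seconds))
```

Under these assumptions, a 10 RPM bucket refills one token every 6 seconds, so a request against an empty bucket would see a wait of 6 seconds. At 100 RPM the raw wait is 0.6 seconds, so the 1-second minimum applies.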

## Example rate-limited response

```
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 3
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0

{
  "code": "rate_limit_exceeded",
  "message": "Rate limit exceeded. Please retry later."
}
```

## Handling rate limits

When you receive a `429` response:

1. Read the `Retry-After` header to determine how long to wait
2. Wait for the specified number of seconds
3. Retry the request
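
The steps above could be implemented along these lines. This is a minimal sketch, not an official client: `call_with_retry`, `retry_after_seconds`, the `(status, headers)` response shape, and the attempt cap are hypothetical names and choices for illustration. The `send` callable stands in for whatever HTTP client you use.

```python
import time
from typing import Callable, Mapping, Tuple

# Hypothetical response shape for this sketch: (status_code, headers).
Response = Tuple[int, Mapping[str, str]]


def retry_after_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Parse Retry-After as seconds, falling back to `default` if absent or invalid."""
    # Note: a plain dict is case-sensitive; real HTTP clients usually are not.
    try:
        return max(0.0, float(headers.get("Retry-After", default)))
    except ValueError:
        return default


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send`; after each 429, wait for Retry-After seconds and try again."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers = send()
        # Return on any non-429 response, or give up after the last attempt.
        if status != 429 or attempt == max_attempts - 1:
            return status, headers
        sleep(retry_after_seconds(headers))
    raise AssertionError("unreachable")
```

Capping the number of attempts is a design choice for this sketch, so a persistently throttled caller gets the final `429` back and does not loop forever.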

For sustained workloads, use the `X-RateLimit-Remaining` header to proactively throttle requests before hitting the limit.
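
One way to throttle proactively is to pause whenever the remaining count gets low, for as long as the bucket needs to refill. The sketch below is an assumption-laden illustration: `throttle_delay` and its `reserve` parameter are hypothetical, and the refill interval (60 / limit seconds per token) is derived from the refill rate described on this page.

```python
from typing import Mapping


def throttle_delay(headers: Mapping[str, str], reserve: int = 1) -> float:
    """Seconds to pause before the next request, based on rate limit headers."""
    try:
        limit = int(headers["X-RateLimit-Limit"])
        remaining = int(headers["X-RateLimit-Remaining"])
    except (KeyError, ValueError):
        return 0.0  # headers missing or malformed: do not throttle
    if limit <= 0 or remaining > reserve:
        return 0.0
    # One token refills every 60 / limit seconds (capacity / 60 tokens per second).
    tokens_needed = reserve - remaining + 1
    return tokens_needed * 60.0 / limit
```

For example, with `X-RateLimit-Limit: 100` and `X-RateLimit-Remaining: 1`, this returns 0.6, the time for one token to refill at 100 RPM. A larger `reserve` keeps more headroom for bursts from other workers in the same organization.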

If you need higher rate limits, contact us about the Enterprise plan or request a custom override.