Commit d0f43c4 (1 parent: 5c0adce)
Author: Jonathan Visser

Update documentation on nginx ratelimiting

File tree

2 files changed: +90 -42 lines

docs/hypernode-platform/nginx/how-to-resolve-rate-limited-requests-429-too-many-requests.md

Lines changed: 89 additions & 41 deletions
@@ -12,7 +12,7 @@ redirect_from:
 
 # How to Resolve Rate Limited Requests (429 Too Many Requests)
 
-To protect your Hypernode from all kinds of attacks, bots, brute forces, and scriptkiddies causing downtime, we've implemented several layers of rate limiting.
+To protect your Hypernode from all kinds of attacks, bots, brute forces, and script kiddies causing downtime, we've implemented several layers of rate limiting.
 
 Most of these rate-limit methods only apply to bots. Still, to avoid FPM worker depletion, we [implemented a rate-limiting mechanism per IP](https://changelog.hypernode.com/release-4735-upper-limit-active-php-requests-per-ip/) to prevent one single IP from exhausting the available FPM workers.
 
@@ -25,16 +25,16 @@ On Hypernode we currently differentiate between two rate limiting methods and th
 - Rate limiting based on User Agents and requests per second (zone `bots`)
 - Rate limiting based on requests per IP address (zone `zoneperip`)
 
-Both methods are implemented using [NginX's limit_req module](http://nginx.org/en/docs/http/ngx_http_limit_req_module.html)
+Both methods are implemented using [Nginx's limit_req module](https://nginx.org/en/docs/http/ngx_http_limit_req_module.html).
 
 ### Determining the Applied Rate Limiting Method
 
-You can quickly determine which method of Rate Limiting was the cause of the request being 429'd since each time any of the rate-limiting methods are hit, a message with be logged in the Nginx error log.
+You can quickly determine which method of rate limiting caused a request to be 429'd: each time any of the rate-limiting methods is hit, a message will be logged in the Nginx error log.
 
 To look for rate limiting messages in the error log, you can run the following command:
 
 ```console
-$ grep limiting.requests /var/log/nginx/error.log
+$ grep -E 'limiting (requests|connections)' /var/log/nginx/error.log
 2020/06/07 13:33:37 [error] limiting requests, excess: 0.072 by zone "bots", client: 203.0.113.104, server: example.hypernode.io, request: "GET /api/ HTTP/2.0", host: "example.hypernode.io"
 2020/06/07 13:33:37 [error] limiting connections by zone "zoneperip", client: 198.51.100.69, server: example.hypernode.io, request: "POST /admin/ HTTP/2.0", host: "example.hypernode.io"
 ```
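Once you have grep'd these entries out of the error log, a small script can tally which zone and client each rejection belongs to. A minimal sketch, assuming the log line format shown above (`parse_rate_limit_line` is an illustrative helper, and field order can differ between Nginx versions):

```python
import re

# Matches "limiting requests ... by zone" and "limiting connections by zone"
# entries as shown in the sample log lines above.
LINE_RE = re.compile(
    r'limiting (?:requests|connections).*?'
    r'by zone "(?P<zone>[^"]+)", client: (?P<client>[\d.]+)'
)

def parse_rate_limit_line(line):
    """Return (zone, client) for a rate-limit log line, or None."""
    m = LINE_RE.search(line)
    return (m.group("zone"), m.group("client")) if m else None

sample = (
    '2020/06/07 13:33:37 [error] limiting requests, excess: 0.072 by zone '
    '"bots", client: 203.0.113.104, server: example.hypernode.io, '
    'request: "GET /api/ HTTP/2.0", host: "example.hypernode.io"'
)
print(parse_rate_limit_line(sample))  # ('bots', '203.0.113.104')
```

Counting the extracted tuples (for example with `collections.Counter`) quickly shows whether one zone or one client dominates the rejections.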
@@ -51,7 +51,7 @@ A log entry where the rate limit is applied per IP address (based on the `zonepe
 2020/06/07 13:33:37 [error] limiting connections by zone "zoneperip", client: 198.51.100.69, server: example.hypernode.io, request: "POST /admin/ HTTP/2.0", host: "example.hypernode.io"
 ```
 
-**Note: Per IP rate limiting only applies to requests handled by PHP and not to the static content.**
+**Note: Per-IP rate limiting only applies to requests handled by PHP and not to static content.**
 
 ## Rate Limiting for Bots and Crawlers
 
@@ -63,7 +63,7 @@ Since our goal is not to block bots but to rate limit them nicely, we must be ca
 
 ### How to Configure the Bot Rate Limiter
 
-Some bots are default exempt from rate limitings, like Google, Bing, and several monitoring systems. These bots never get rate limited since they usually abide by the robots.txt. However, some bots don't follow the instructions given in robots.txt or are used by abusive crawlers. These bots will be rate limited at one request per second. Any requests over this limit will then return a 429 error. If you want, you can override the system-wide configuration on who gets blocked and who does not. To get started, place the following in a config file called `/data/web/nginx/http.ratelimit`:
+Some bots are exempt from rate limiting by default, like Google, Bing, and several monitoring systems. These bots never get rate limited since they usually abide by the robots.txt. However, some bots don't follow the instructions given in robots.txt or are used by abusive crawlers. These bots will be rate limited at one request per second. Any requests over this limit will then return a 429 error. If you want, you can override the system-wide configuration on who gets blocked and who does not. To get started, place the following in a config file called `/data/web/nginx/http.ratelimit`:
 
 ```nginx
 map $http_user_agent $limit_bots {
@@ -77,8 +77,8 @@ map $http_user_agent $limit_bots {
 
 As you can see, this sorts all visitors into two groups:
 
-- On the first line, the allowlist, you find the keywords that are exempt from the rate liming, like: `google`, `bing`, `heartbeat`, or `magereport.com`.
-- The second line, contains keywords for generic and abusive bots and crawlers, which can trigger the ratelimiter, like `crawler`, `spider`, or `bot`
+- On the first line, the allowlist, you find the keywords that are exempt from rate limiting, like `google`, `bing`, `heartbeat`, or `magereport.com`.
+- The second line contains keywords for generic and abusive bots and crawlers, which can trigger the rate limiter, like `crawler`, `spider`, or `bot`.
 
 The keywords are separated by `|` characters since it is a regular expression.
 
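The first-match-wins behaviour of the `map` above can be sketched outside Nginx. This is a simplified illustration, not the platform's full keyword lists; `ALLOW`, `BOTS`, and `limit_bots` are names invented for the sketch:

```python
import re

# The first matching regex wins, matching is case-insensitive (Nginx's `~*`
# modifier), and unmatched agents fall through to the default (no limiting).
ALLOW = re.compile(r'google|bing|heartbeat|magereport\.com', re.IGNORECASE)
BOTS = re.compile(r'crawler|spider|bot|search', re.IGNORECASE)

def limit_bots(user_agent):
    if ALLOW.search(user_agent):
        return ''      # allowlisted: exempt from rate limiting
    if BOTS.search(user_agent):
        return 'bot'   # rate limited at one request per second
    return ''          # regular visitors: not limited by this zone

print(limit_bots('Googlebot/2.1'))   # '' (allowlisted first, despite "bot")
print(limit_bots('SomeSpider/1.0'))  # 'bot'
```

Note how `Googlebot` lands in the allowlist even though it also contains `bot`: in an Nginx `map`, regexes are tested in order of appearance, so the allowlist line must come first.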
@@ -97,14 +97,14 @@ In the example above you can see that a bot with the User Agent `SpecialSnowflak
 ```nginx
 map $http_user_agent $limit_bots {
     default '';
-    ~*(specialsnowflakecrawler|google|bing|heartbeat|uptimerobot|shoppimon|facebookexternal|monitis.com|Zend_Http_Client|magereport.com|SendCloud/|Adyen|ForusP|contentkingapp|node-fetch|Hipex) '';
+    ~*(specialsnowflakecrawler|google|bing|heartbeat|uptimerobot|shoppimon|facebookexternal|monitis.com|Zend_Http_Client|magereport.com|SendCloud/|Adyen|ForusP|contentkingapp|node-fetch|Hipex|xCore|Mollie) '';
     ~*(http|crawler|spider|bot|search|Wget|Python-urllib|PHPCrawl|bGenius|MauiBot|aspiegel) 'bot';
 }
 ```
 
-Instead of adding the complete User Agent to the regex, it’s often better to limit it to just an identifying keyword, as shown above. The reason behind this is that the string is evaluated as a Regular Expression, which means that extra care needs to be taken when adding anything other than alphanumeric characters. Also as user agents might change slightly over time, this may this bot will no longer be allowlisted over time.
+Instead of adding the complete User Agent to the regex, it’s often better to limit it to just an identifying keyword, as shown above. The reason behind this is that the string is evaluated as a regular expression, which means that extra care needs to be taken when adding anything other than alphanumeric characters. Also, as user agents might change slightly over time, an overly specific string may stop matching and the bot will no longer be allowlisted.
 
-### Known Rate Limited Plugins and Service Provider
+### Known Rate Limited Plugins and Service Providers
 
 There are a couple of plugins and service providers that tend to hit the blacklisted keywords in the `http.ratelimit` snippet and, therefore, may need to be excluded individually. Below we have listed them and their User Agents for your convenience.
 
@@ -115,74 +115,122 @@ There are a couple of plugins and service providers that tend to hit the blackli
 - Mollie - `Mollie.nl HTTP client/1.0`
 - Screaming - `Screaming Frog SEO Spider`
 
-Besides the above-known plugins that will hit the blacklisted keyword, `http.ratelimit` we know that Picqer will also hit the rate limiter because of being blocked by "**zoneperip**". Please find [here](https://picqer.com/files/ip-addresses.txt) the IP addresses of Picqer. You can exclude those IP addressess from hitting the rate limiter if you follow the [instructions](#known-rate-limited-plugins-and-service-provider).
+Besides the plugins above that hit a blacklisted keyword in `http.ratelimit`, we know that Picqer will also hit the rate limiter because it is blocked by the "**zoneperip**" zone. You can find the IP addresses of Picqer [here](https://picqer.com/files/ip-addresses.txt) and exclude those IP addresses from hitting the rate limiter by following the [instructions](#known-rate-limited-plugins-and-service-providers).
 
 ## Rate Limiting per IP Address
 
 To prevent a single IP from using all the FPM workers available simultaneously, leaving no workers available for other visitors, we implemented a per-IP rate limit mechanism. This mechanism sets the maximum number of PHP-FPM workers that can be used by one IP to 20. This way, one single IP address cannot deplete all the available FPM workers, leaving other visitors with an error page or a non-responding site.
 
 **Please note:** if [Hypernode Managed Vhosts](hypernode-managed-vhosts.md) is enabled, only add the `http.ratelimit` file in the Nginx root. Don't add it to the specific vhost as well, as this may cause conflicts.
 
-### Exclude IP Addresses from the per IP Rate Limiting
+### How Per-IP Limiting Works (What You Can Influence)
 
-In some cases, it might be necessary to exclude specific IP addresses from the per IP rate limiting. If you wish to exclude an IP address, you can do so by creating a config file called `/data/web/nginx/http.ratelimit` with the following content:
+The platform manages the global per-IP limiter (its zone and limits). You control only the key variable used for counting connections: `$limit_conn_per_ip`. If this variable is an empty string, the per-IP limiter is effectively disabled for that request; if it contains the client IP, the request is counted towards that IP's limit.
+
+### Exclude IP Addresses from the Per-IP Rate Limiting
+
+In some cases, it might be necessary to exclude specific IP addresses from the per-IP rate limiting. Define an allowlist and compose the effective key using a geo→map chain in `/data/web/nginx/http.ratelimit`:
 
 ```nginx
-geo $limit_conn_per_ip {
-    default $remote_addr;
-    198.51.100.69 '';
+# 1) Mark IPs/CIDRs that should be exempt from per-IP limiting
+geo $limit_conn_ip_allow {
+    default 1;   # 1 = enforce the limit
+    1.2.3.4 0;   # 0 = exempt
+}
+
+# 2) Build the base key used for per-IP limiting. If exempt, the empty key disables per-IP limiting for this request
+map $limit_conn_ip_allow $limit_conn_per_ip_base {
+    0 '';
+    1 $remote_addr;
+}
+
+# 3) Exclude additional URLs from per-IP limiting
+map $request_uri $limit_conn_per_ip {
+    default $limit_conn_per_ip_base;
+    # ~^/rest/V1/example-call/ '';
+    # ~^/elasticsearch\.php$ '';
+    # ~^/graphql$ '';
 }
 ```
 
-In this example, we have excluded the IP address **198.51.100.69** by setting an empty value in the form of `''`.
+In this example, we have excluded the IP address **1.2.3.4** by emitting an empty key; no URL exclusions are active in the example above.
 
-In addition to excluding a single IP address, it is also possible to allow a whole range of IP addresses. You can do this by using the so-called CIDR notation (e.g., 198.51.100.0/24 to whitelist all IP addresses within the range 198.51.100.0 to 198.51.100.255). In that case, you can use the following snippet in `/data/web/nginx/http.ratelimit` instead:
+In addition to excluding a single IP address, it is also possible to exclude a whole range of IP addresses. You can do this by using CIDR notation (e.g., 198.51.100.0/24 to allowlist all IP addresses within the range 198.51.100.0 to 198.51.100.255). Extend the `geo` block accordingly:
 
 ```nginx
-geo $limit_conn_per_ip {
-    default $remote_addr;
-    198.51.100.0/24 '';
+geo $limit_conn_ip_allow {
+    default 1;
+    1.2.3.1 0;
+    1.2.3.0/24 0;
 }
 ```
 
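The effect of the `geo` allowlist can be emulated to check which clients end up exempt. A sketch using Python's `ipaddress` module, with networks mirroring the examples above (`limit_conn_key` and `EXEMPT_NETWORKS` are hypothetical names for this illustration, not part of the platform):

```python
import ipaddress

# Exempt networks produce an empty per-IP key (limiter skipped);
# everyone else is counted under their own address.
EXEMPT_NETWORKS = [
    ipaddress.ip_network('1.2.3.4/32'),   # single address from the example
    ipaddress.ip_network('1.2.3.0/24'),   # CIDR range from the example
]

def limit_conn_key(remote_addr):
    ip = ipaddress.ip_address(remote_addr)
    if any(ip in net for net in EXEMPT_NETWORKS):
        return ''            # empty key: per-IP limiter skipped
    return remote_addr       # counted towards this IP's connection limit

print(limit_conn_key('1.2.3.200'))    # '' (inside 1.2.3.0/24)
print(limit_conn_key('203.0.113.9'))  # '203.0.113.9'
```

A `/24` covers 256 addresses, which is why `1.2.3.200` is exempt even though it is not listed individually.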
-### Disable per IP Rate Limiting
+### Disable Per-IP Rate Limiting
 
 When your shop performance is very poor, it’s possible all your FPM workers are busy just serving regular traffic. Handling a request takes so much time that all workers are continuously depleted by a small number of visitors. We highly recommend optimizing your shop for speed and a temporary upgrade to a bigger plan if this situation arises. Disabling the rate limit will not fix this problem but only change the error message from a `Too many requests` error to a timeout error.
 
-For debugging purposes, however, it could be helpful to disable the per-IP connection limit for all IP’s. With the following snippet in `/data/web/nginx/http.ratelimit` , it is possible to altogether disable IP based rate limiting:
+For debugging purposes, however, it can be helpful to disable the per-IP connection limit for all IPs. With the following snippet in `/data/web/nginx/http.ratelimit`, it is possible to disable per-IP rate limiting entirely by emitting an empty key for all requests:
 
 ```nginx
-geo $limit_conn_per_ip {
+map $request_uri $limit_conn_per_ip {
     default '';
 }
 ```
 
-**Warning: Only use this setting for debugging purposed! Using this setting on production Hypernodes is highly discouraged, as your shop can be easily taken offline by a single IP using slow and/or flood attacks.**
+**Warning: Only use this setting for debugging purposes! Using this setting on production Hypernodes is highly discouraged, as your shop can easily be taken offline by a single IP using slow and/or flood attacks.**
 
-### Exclude Specific URLs from the per IP Rate Limiting Mechanism
+### Exclude Specific URLs from the Per-IP Rate Limiting Mechanism
 
-To exclude specific URLs from being rate-limited you can create a file `/data/web/nginx/server.ratelimit` with the following content:
+To exclude specific URLs from being rate limited, use the `map $request_uri $limit_conn_per_ip` block you added above in `/data/web/nginx/http.ratelimit` and add or uncomment entries like:
 
 ```nginx
-set $ratelimit_request_url "$remote_addr";
-if ($request_uri ~ ^\/(.*)\/rest\/V1\/example-call\/(.*) ) {
-    set $ratelimit_request_url '';
-}
-
-if ($request_uri ~ ^\/elasticsearch.php$ ) {
-    set $ratelimit_request_url '';
+map $request_uri $limit_conn_per_ip {
+    default $limit_conn_per_ip_base;
+    ~^/rest/V1/example-call/ '';
+    ~^/elasticsearch\.php$ '';
+    ~^/graphql$ '';
 }
 ```
 
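The patterns in this map are matched against `$request_uri`, which includes the query string. A quick sketch to sanity-check which URIs a pattern would exclude (`excluded` is an illustrative helper for this check, not platform code):

```python
import re

# The URI patterns from the map above, matched case-sensitively
# (no `~*` modifier in the Nginx config).
PATTERNS = [r'^/rest/V1/example-call/', r'^/elasticsearch\.php$', r'^/graphql$']

def excluded(uri):
    return any(re.search(p, uri) for p in PATTERNS)

print(excluded('/graphql'))                 # True
print(excluded('/graphql?query=x'))         # False: the query string is part of $request_uri
print(excluded('/rest/V1/example-call/1'))  # True
```

The second case is a common gotcha: an anchored pattern like `^/graphql$` will not match requests that carry query parameters, so drop the trailing `$` if those should be excluded too.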
-In the example above, the URLs `*/rest/V1/example-call/*` and `/elasticsearch.php` are the ones that have to be excluded. You now have to use the `$ratelimit_request` variable as a default value in the file `/data/web/nginx/http.ratelimit` (see below) to exclude these URLs from the rate limiter and make sure that bots and crawlers will still be rate limited based on their User Agent.
+With these entries, the URLs `*/rest/V1/example-call/*`, `/elasticsearch.php`, and `/graphql` are excluded from per-IP limiting. The platform’s global limiter will pick up `$limit_conn_per_ip` implicitly. You can also combine this with a regular allowlist, as described above.
 
-```nginx
-geo $limit_conn_per_ip {
-    default $ratelimit_request_url;
-}
-```
+### Debugging Per-IP Rate Limiting
+
+Define a custom JSON log format that records the effective per-IP key, and enable an access log that uses it. First, ensure the log directory exists:
+
+```bash
+mkdir -p /data/web/log
+```
+
+Then configure the JSON log format and enable the access log in `/data/web/nginx/http.ratelimit`:
+
+```nginx
+log_format custom escape=json '{'
+    '"time":"$time_iso8601", '
+    '"remote_addr":"$remote_addr", '
+    '"host":"$http_host", '
+    '"request":"$request", '
+    '"status":"$status", '
+    '"request_time":"$request_time", '
+    '"user_agent":"$http_user_agent", '
+    '"limit_conn_per_ip":"$limit_conn_per_ip"'
+'}';
+access_log /data/web/log/nginx-custom custom;
+```
 
-You can also combine this with a regular allowlist, and exclude IP Addresses as described above.
+How to read it:
+
+- An empty `limit_conn_per_ip` means the per-IP limiter was disabled for that request (allowlisted IP/CIDR or URL exclusion).
+- Rejections from the per-IP limiter are logged to the error log, not the access log.
+
+Inspect recent rejections and correlate them with the keys seen in the access log:
+
+```bash
+grep -E 'limiting (requests|connections)' /var/log/nginx/error.log | tail -n 50
+tail -n 200 /data/web/log/nginx-custom | jq -r '"\(.remote_addr) \(.request) \(.limit_conn_per_ip)"' | tail -n 50
+```
 
 ### How to Serve a Custom Static Error Page to Rate Limited IP Addresses
 
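Because each access-log line is a JSON object, it is also easy to summarise outside of `jq`. A sketch that counts requests per client and flags which clients carried an empty per-IP key (`summarise` is an illustrative helper; the sample lines only include two of the logged fields):

```python
import json
from collections import Counter

def summarise(lines):
    """Count requests per client and collect clients with an empty per-IP key."""
    counts, exempt = Counter(), set()
    for line in lines:
        entry = json.loads(line)
        counts[entry["remote_addr"]] += 1
        if entry["limit_conn_per_ip"] == "":
            exempt.add(entry["remote_addr"])
    return counts, exempt

sample = [
    '{"remote_addr":"203.0.113.9","limit_conn_per_ip":"203.0.113.9"}',
    '{"remote_addr":"1.2.3.4","limit_conn_per_ip":""}',
    '{"remote_addr":"203.0.113.9","limit_conn_per_ip":"203.0.113.9"}',
]
counts, exempt = summarise(sample)
print(counts["203.0.113.9"], sorted(exempt))  # 2 ['1.2.3.4']
```

Running this over `/data/web/log/nginx-custom` shows at a glance whether a heavy-hitting client is being counted by the limiter or is slipping through an exemption.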
docs/hypernode-platform/php/what-limits-apply-to-active-php-requests-per-ip.md

Lines changed: 1 addition & 1 deletion
@@ -20,4 +20,4 @@ When one IP uses up most or all of the available workers, this causes a processi
 
 Previously we configured a limit of `vCPUs * 5 - 2`. For our largest Hypernode plans, this could theoretically mean one IP using up to 99% of the available workers. This is undesired behaviour, but it can happen when a lot of people are accessing the admin pages from one office IP. This is why we have chosen to set the limit at 30 workers per IP.
 
-Users that overstep this limit will be served a [429 too many requests status code](../nginx/how-to-resolve-rate-limited-requests-429-too-many-requests.md). You can always circumvent this per IP rate-limiting by [whitelisting IP's in the NGINX config](../nginx/how-to-resolve-rate-limited-requests-429-too-many-requests.md#exclude-ip-addresses-from-the-per-ip-rate-limiting).
+Users that overstep this limit will be served a [429 Too Many Requests status code](../nginx/how-to-resolve-rate-limited-requests-429-too-many-requests.md). You can always circumvent this per-IP rate limiting by [allowlisting IPs in the Nginx config](../nginx/how-to-resolve-rate-limited-requests-429-too-many-requests.md#rate-limiting-per-ip-address).
