docs/hypernode-platform/nginx/how-to-resolve-rate-limited-requests-429-too-many-requests.md
Lines changed: 39 additions & 31 deletions
@@ -27,28 +27,30 @@ On Hypernode we currently differentiate between two rate limiting methods and th
27
27
- Rate limiting based on User Agents and requests per second (zone `bots`)
28
28
- Rate limiting based on requests per IP address (zone `zoneperip`)
29
29
30
-
Both methods are implemented using [this module](http://nginx.org/en/docs/http/ngx_http_limit_req_module.html)
30
+
Both methods are implemented using [Nginx's `limit_req` module](http://nginx.org/en/docs/http/ngx_http_limit_req_module.html).
31
31
32
32
### Determining the Applied Rate Limiting Method
33
33
34
34
You can quickly determine which rate limiting method caused a request to be answered with a 429, since a message is logged in the Nginx error log each time one of the rate limiting methods is hit.
35
35
36
-
To do so you first look up the request in the access logs, which can be done using the hypernode-parse-nginx-logs (**pnl**) command: `pnl --today --fields time,status,remote_addr,request --filter status=429`
36
+
To look for rate limiting messages in the error log, you can run the following command:
2020/06/07 13:33:37 [error] 7492#7492: *1590770 limiting connections by zone "zoneperip", client: 198.51.100.69, server: example.hypernode.io, request: "POST /admin/ HTTP/2.0", host: "example.hypernode.io"
37
41
38
-
Copy the IP address from the output generated by this command and look up the corresponding log entry in the aforementioned Nginx error log with `cat /var/log/nginx/error.log | grep "1.2.3.4"`
39
-
40
-
These entries look as follows:
42
+
```
41
43
42
44
A log entry where rate limit is applied to user-agents and requests per second (based on the `bots` zone):
**Note: do not remove the heartbeat entry! As this will break the monitoring of your Hypernode**
78
79
79
80
As you can see, this sorts all visitors into two groups:
80
81
81
-
- On the first (whitelist) line, you find the keywords that are exempt from the rate liming, like: ‘google’, ‘bing’, ‘heartbeat’, or ‘monitis.com’
82
-
-On the second (blacklist) line, you will find the keyword for generic and abusive bots and crawlers, which will always be rate limited, like crawler, spider, bot
82
+
- On the first line, the allowlist, you find the keywords that are exempt from rate limiting, like `google`, `bing`, `heartbeat`, or `magereport.com`.
83
+
- The second line contains keywords for generic and abusive bots and crawlers, which can trigger the rate limiter, like `crawler`, `spider`, or `bot`.
83
84
84
85
The keywords are separated by `|` characters since it is a regular expression.
85
86
86
-
### Whitelisting Additional User Agents
87
+
### Allowlisting Additional User Agents
87
88
88
-
To extend the whitelist, first determine what user agent you wish to add. Use the access log files to see what bots get blocked and which user agent identification it uses. Say the bot we want to add has the User Agent `SpecialSnowflakeCrawler 3.1.4`. Which contains the word ‘crawler’, so it matches the second regular expression and is labeled as a bot. Since the whitelist line overrules the blacklist line, the best way to allow this bot is to add their user agent to the whitelist instead of removing ‘crawler’ from the blacklist:
89
+
To extend the allowlist, first determine what user agent you wish to add. Use the access log files to see what bots get blocked and which user agent identification it uses. To find the user agent, you can use the following command:
2020-06-07T13:33:37+00:00 429 203.0.113.104 GET /api/ HTTP/2.0 SpecialSnowflakeCrawler 3.1.4
93
+
2020-06-07T13:35:37+00:00 429 203.0.113.104 GET /api/ HTTP/2.0 SpecialSnowflakeCrawler 3.1.4
94
+
```
89
95
90
-
```nginx
96
+
In the example above you can see that a bot with the User Agent `SpecialSnowflakeCrawler 3.1.4` triggered the rate limiter. As it contains the word ‘crawler’, it matches the second regular expression and is labeled as a bot. Since the allowlist line overrules the denylist line, the best way to allow this bot is to add its user agent to the allowlist instead of removing ‘crawler’ from the denylist:
Instead of adding the complete User Agent to the regex, it’s often better to limit it to just an identifying keyword, as shown above. The reason behind this is that the string is evaluated as a Regular Expression, which means that extra care needs to be taken when adding anything other than alphanumeric characters.
106
+
Instead of adding the complete User Agent to the regex, it’s often better to limit it to just an identifying keyword, as shown above. The reason behind this is that the string is evaluated as a Regular Expression, which means that extra care needs to be taken when adding anything other than alphanumeric characters. Also, as user agents might change slightly over time, a complete User Agent string may stop matching, so the bot would no longer be allowlisted.
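As a rough illustration of the keyword approach, the sketch below adds only `specialsnowflakecrawler` to the allowlist line and escapes the dot in `magereport.com`. It is not the platform's actual configuration: the variable name `$limit_bots`, the returned values, and the shortened keyword lists are assumptions made for this example, so keep the lists from your existing file and only append the new keyword.

```nginx
# Hypothetical sketch: $limit_bots, the returned values and the shortened
# keyword lists are assumptions, not the platform's real configuration.
map $http_user_agent $limit_bots {
    default '';

    # Allowlist line. Nginx checks regular expressions in the order in which
    # they appear, so a match here wins over the line below.
    # "specialsnowflakecrawler" is the newly added keyword. The escaped dot in
    # "magereport\.com" matches only a literal dot, not any character.
    ~*(google|bing|heartbeat|magereport\.com|specialsnowflakecrawler) '';

    # Second line: generic bot keywords that can trigger the rate limiter.
    ~*(crawler|spider|bot) 'bot';
}
```

The `~*` prefix makes the match case-insensitive, so the lowercase keyword also matches `SpecialSnowflakeCrawler 3.1.4`. A pattern that contains whitespace, `{`, `}` or `;` has to be wrapped in quotes, which is one more reason to prefer a short alphanumeric keyword.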
100
107
101
108
### Known Rate Limited Plugins and Service Provider
102
109
@@ -115,28 +122,28 @@ Besides the above-known plugins that will hit the blacklisted keyword, `http.rat
115
122
116
123
To prevent a single IP from using all the FPM workers available simultaneously, leaving no workers available for other visitors, we implemented a per IP rate limit mechanism. This mechanism sets a maximum amount of PHP-FPM workers that can be used by one IP to 20. This way, one single IP address cannot deplete all the available FPM workers, leaving other visitors with an error page or a non-responding site.
117
124
118
-
**Please note:** if [Hypernode Managed Vhosts](hypernode-managed-vhosts.md) is enabled, only add the `http.conn_ratelimit` file in the Nginx root. Don't add it to the specific vhost as well, as these files will cancel each other out.
125
+
**Please note:** if [Hypernode Managed Vhosts](hypernode-managed-vhosts.md) is enabled, only add the `http.ratelimit` file in the Nginx root. Don't add it to the specific vhost as well, as this may cause conflicts.
119
126
120
127
### Exclude IP Addresses from the per IP Rate Limiting
121
128
122
-
In some cases, it might be necessary to exclude specific IP addresses from the per IP rate limiting. If you wish to exclude an IP address, you can do so by creating a config file called `/data/web/nginx/http.conn_ratelimit` with the following content:
129
+
In some cases, it might be necessary to exclude specific IP addresses from the per IP rate limiting. If you wish to exclude an IP address, you can do so by creating a config file called `/data/web/nginx/http.ratelimit` with the following content:
123
130
124
131
```nginx
125
132
geo $conn_limit_map {
126
133
default $remote_addr;
127
-
1.2.3.4 '';
134
+
198.51.100.69 '';
128
135
}
129
136
130
137
```
131
138
132
-
In this example, we have excluded the IP address **1.2.3.4** by setting an empty value in the form of `''`.
139
+
In this example, we have excluded the IP address **198.51.100.69** by setting an empty value in the form of `''`.
133
140
134
-
In addition to whitelisting one single IP address, it is also possible to whitelist a whole range of IP addresses. You can do this by using the so-called CIDR notation (e.g., 10.0.0.0/24 to whitelist all IP addresses within the range 10.0.0.0 to 10.0.0.255). In that case, you can use the following snippet in `/data/web/nginx/http.conn_ratelimit` instead:
141
+
In addition to excluding a single IP address, it is also possible to exclude a whole range of IP addresses. You can do this by using the so-called CIDR notation (e.g., 198.51.100.0/24 to allow all IP addresses within the range 198.51.100.0 to 198.51.100.255). In that case, you can use the following snippet in `/data/web/nginx/http.ratelimit` instead:
135
142
136
143
```nginx
137
144
geo $conn_limit_map {
138
145
default $remote_addr;
139
-
10.0.0.0/24 '';
146
+
198.51.100.0/24 '';
140
147
}
141
148
142
149
```
@@ -145,7 +152,7 @@ geo $conn_limit_map {
145
152
146
153
When your shop performance is very poor, it’s possible all your FPM workers are busy just serving regular traffic. Handling a request takes so much time that all workers are continuously depleted by a small number of visitors. We highly recommend optimizing your shop for speed and a temporary upgrade to a bigger plan if this situation arises. Disabling the rate limit will not fix this problem but only change the error message from a `Too many requests` error to a timeout error.
147
154
148
-
For debugging purposes, however, it could be helpful to disable the per-IP connection limit for all IP’s. With the following snippet in `/data/web/nginx/http.conn_ratelimit` , it is possible to altogether disable IP based rate limiting:
155
+
For debugging purposes, however, it could be helpful to disable the per-IP connection limit for all IP addresses. With the following snippet in `/data/web/nginx/http.ratelimit`, it is possible to disable IP-based rate limiting altogether:
149
156
150
157
```nginx
151
158
geo $conn_limit_map {
@@ -158,7 +165,7 @@ geo $conn_limit_map {
158
165
159
166
### Exclude Specific URLs from the per IP Rate Limiting Mechanism
160
167
161
-
To exclude specific URLs from being rate-limited you can create a file `/data/web/nginx/before_redir.ratelimit_exclude` with the following content (this could also be done in a http.\* file):
168
+
To exclude specific URLs from being rate-limited you can create a file `/data/web/nginx/server.ratelimit` with the following content:
162
169
163
170
```nginx
164
171
set $ratelimit_request_url "$remote_addr";
@@ -172,14 +179,15 @@ if ($request_uri ~ ^\/elasticsearch.php$ ) {
172
179
173
180
```
174
181
175
-
In the example above, the URLs `*/rest/V1/example-call/*` and `/elasticsearch.php` are the ones that have to be excluded. You can now use the `$ratelimit_request` variable in the file `/data/web/nginx/http.conn_ratelimit` (see the example below) to exclude these URLs from the rate limiter and make sure that bots and crawlers will still be rate limited based on their User Agent.
182
+
In the example above, the URLs `*/rest/V1/example-call/*` and `/elasticsearch.php` are the ones that have to be excluded. You now have to use the `$ratelimit_request_url` variable as the default value in the file `/data/web/nginx/http.ratelimit` (see below) to exclude these URLs from the rate limiter and make sure that bots and crawlers will still be rate limited based on their User Agent.
176
183
177
184
```nginx
178
185
geo $conn_limit_map {
179
186
default $ratelimit_request_url;
180
187
}
181
-
182
188
```
189
+
You can also combine this with a regular allowlist and exclude IP addresses as described above.
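A combined configuration could look like the sketch below. It assumes that `$ratelimit_request_url` is set as in the URL exclusion example and reuses the example address from this article; the range `203.0.113.0/24` is only a placeholder for your own network.

```nginx
# Sketch for /data/web/nginx/http.ratelimit combining both techniques.
# The addresses are documentation placeholders; replace them with your own.
geo $conn_limit_map {
    # Default: use the value prepared per request, which is empty for the
    # excluded URLs and the client address otherwise.
    default $ratelimit_request_url;

    # Single IP address that is excluded from the per IP rate limit.
    198.51.100.69 '';

    # Whole range excluded using CIDR notation.
    203.0.113.0/24 '';
}
```

Define the `geo $conn_limit_map` block only once: put the default and all excluded addresses in the same block rather than in separate files.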
190
+
183
191
184
192
### How to Serve a Custom Static Error Page to Rate Limited IP Addresses