How to Handle Rate Limiting in n8n Workflows (Complete Pattern Guide)
The first time you run a workflow against a real API with real data, you will probably hit a rate limit: a 429 Too Many Requests, a 403 Quota Exceeded, or a silent slowdown where every other request starts failing. n8n does not handle rate limits automatically; you have to design your workflow to respect them. Here are the patterns that actually work.
Understand the Rate Limit First
Before designing anything, read the API docs. Rate limits come in three flavors: per-second (e.g., 10 requests per second), per-minute (e.g., 60 requests per minute), and per-day quotas (e.g., 10,000 requests per day). Some APIs have all three simultaneously. Your pattern differs depending on which limit is binding.
Pattern 1: Built-in Retry with Backoff
The simplest pattern. On the HTTP Request or API node, enable "Retry On Fail" with 3 to 5 tries and a few seconds of wait between tries. For transient 429s, this often resolves by itself because the retry delay lets the rate limit window reset. Note that n8n's built-in retry uses a fixed wait between tries; true exponential backoff requires a custom error loop or a Code node.
Drawbacks: only handles transient limits. If you are persistently over the rate, retries just keep failing. Good for bursty workflows that occasionally spike over limit, bad for workflows that systematically exceed the limit.
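If you do build your own backoff loop in a Code node, the delay schedule is a few lines. This is an illustrative sketch, not n8n's internal retry logic; baseMs and capMs are assumed values you would tune:

```javascript
// Exponential backoff schedule: the delay doubles each attempt, plus a
// little random jitter so parallel executions do not retry in lockstep.
// baseMs and capMs are illustrative defaults, not n8n settings.
function backoffDelay(attempt, baseMs = 1000, capMs = 30000) {
  const delay = baseMs * 2 ** attempt;         // 1s, 2s, 4s, 8s, ...
  const jitter = Math.random() * 0.1 * delay;  // up to 10% extra
  return Math.min(delay + jitter, capMs);      // never wait longer than the cap
}
```

Feed the computed delay into a Wait node and loop back until the request succeeds or attempts run out.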
Pattern 2: Wait Node Between Requests
For per-second limits, insert a Wait node between each request. If the limit is 10 per second, wait 110ms between requests. This keeps you safely under the limit with headroom for timing jitter.
Best for small item counts where sequential processing is acceptable. Breaks down when you have 10,000 items because the workflow takes too long.
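Picking the Wait node's interval is simple arithmetic. A sketch, assuming a 10 percent headroom factor (waitMsForLimit is a hypothetical helper, not an n8n setting):

```javascript
// Convert a per-second limit into a per-request wait in milliseconds,
// with a headroom fraction on top so jitter cannot push you over.
function waitMsForLimit(requestsPerSecond, headroom = 0.1) {
  return Math.round((1000 / requestsPerSecond) * (1 + headroom));
}
```

For a 10-per-second limit this yields the 110ms wait from the example above.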
Pattern 3: Split In Batches + Wait
For per-minute or per-hour limits. Use Split In Batches to process N items at a time, then a Wait node, then loop back. If the limit is 60 per minute, batches of 10 with 10-second waits sits exactly at the limit; stretch the wait slightly to leave margin.
This is the workhorse pattern for bulk processing. Tune batchSize and wait interval based on your specific rate limit. Leave 20 percent headroom so occasional latency spikes do not push you over.
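The tuning can be reduced to one formula. A sketch with the 20 percent headroom baked in (batchWaitSeconds is a hypothetical helper for planning, not an n8n parameter):

```javascript
// Given a per-minute limit and a chosen batch size, compute the Wait node
// interval (in seconds) that keeps the effective rate below the limit,
// leaving a headroom fraction as a safety margin.
function batchWaitSeconds(limitPerMinute, batchSize, headroom = 0.2) {
  const effectiveLimit = limitPerMinute * (1 - headroom); // e.g. 60 -> 48/min
  return Math.ceil((batchSize / effectiveLimit) * 60);    // seconds between batches
}
```

For a 60-per-minute limit and batches of 10, this yields a 13-second wait, i.e. roughly 46 requests per minute.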
[Table: Rate Limit Pattern by Volume and Limit Type]
Pattern 4: Respect Retry-After Header
Well-designed APIs return a Retry-After header on 429 responses telling you how long to wait. Most n8n HTTP nodes ignore this by default. Use a Code node after the HTTP Request to check for 429, read the Retry-After header (seconds or HTTP date), and Wait the specified duration before retrying.
This is the most API-friendly pattern. It adapts dynamically to the API's current load rather than using static timeouts.
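The parsing is the fiddly part, since the header comes in two forms. A Code-node sketch, assuming the HTTP Request node is configured to return response headers; the injectable now parameter exists only for testability:

```javascript
// Parse a Retry-After header value, which may be delta-seconds ("120")
// or an HTTP date, into a wait duration in milliseconds.
function retryAfterMs(headerValue, now = Date.now()) {
  if (headerValue == null) return 0;
  const seconds = Number(headerValue);
  if (!Number.isNaN(seconds)) return Math.max(0, seconds * 1000); // delta-seconds form
  const date = Date.parse(headerValue);                           // HTTP-date form
  return Number.isNaN(date) ? 0 : Math.max(0, date - now);
}
```

Feed the result into a Wait node, then loop back to the HTTP Request node.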
Pattern 5: Token Bucket with External Storage
For distributed n8n setups (queue mode with multiple workers), per-worker rate limiting is not enough because all workers share the same API limit. Use a shared token bucket: Redis, a database counter, or an external rate-limiting service. Before each request, check and decrement a counter. If the counter is empty, wait.
Overhead is real (extra round-trip per request), but it is the only way to correctly enforce limits across multiple workers. Use only when needed.
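The core check can be sketched with a fixed-window counter, a simpler cousin of the token bucket that maps directly onto Redis INCR and EXPIRE. Here store is injectable for illustration; in production it would be a Redis client, where INCR is atomic across all workers:

```javascript
// Shared fixed-window rate limiter: all workers increment the same counter
// for the current window. Returns true if the request may proceed,
// false if the caller should wait for the next window.
async function acquire(store, key, limit, windowSeconds) {
  const window = Math.floor(Date.now() / 1000 / windowSeconds);
  const windowKey = `${key}:${window}`;
  const count = await store.incr(windowKey);             // atomic in Redis
  if (count === 1) await store.expire(windowKey, windowSeconds * 2); // auto-cleanup
  return count <= limit;
}
```

A true token bucket smooths bursts better, but the fixed window is usually good enough and far easier to operate.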
Pattern 6: Queue-Based Decoupling
For workflows triggered by user actions (e.g., form submission triggers a CRM sync), decouple the user-facing workflow from the rate-limited API call. The trigger workflow writes a job to a queue (Redis, database table, external queue service). A separate scheduled workflow reads from the queue and processes jobs at a rate that respects the API limit.
Benefits: the user gets instant acknowledgement, the API respects limits regardless of spike volume, failures can be retried without involving the user. Drawbacks: more architecture to maintain.
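The moving parts are small. An in-memory sketch, with an array standing in for the real queue (a Redis list or a database table in practice):

```javascript
// In-memory stand-in for a job queue. enqueue runs in the trigger workflow
// and returns immediately; drainBatch runs in a scheduled workflow and
// takes at most maxPerRun jobs per execution, which caps the API rate.
const queue = [];

function enqueue(job) {
  queue.push({ ...job, enqueuedAt: Date.now() }); // instant acknowledgement
  return { accepted: true };
}

function drainBatch(maxPerRun) {
  return queue.splice(0, maxPerRun); // jobs to process this run
}
```

With the scheduled workflow running every minute, maxPerRun becomes your effective requests-per-minute ceiling.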
Handling Daily Quotas
If the API limit is per-day (e.g., 10,000 requests per day), monitor your usage and throttle if you are approaching the cap. A simple pattern: at the start of each workflow execution, query your usage database for today's count. If you are above a safety threshold (e.g., 90 percent of daily quota), defer the work to tomorrow or alert an operator.
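The decision logic is a few lines in a Code node. A sketch using the 90 percent threshold from above; quotaDecision is a hypothetical helper, and the usage count would come from your own tracking database:

```javascript
// Decide whether to proceed, defer to tomorrow, or stop outright,
// given today's request count and the API's daily quota.
function quotaDecision(usedToday, dailyQuota, thresholdRatio = 0.9) {
  if (usedToday >= dailyQuota) return 'stop';                  // hard cap reached
  if (usedToday >= dailyQuota * thresholdRatio) return 'defer'; // in the danger zone
  return 'proceed';
}
```

Route the three outcomes with a Switch node: proceed continues, defer re-queues the items, stop alerts an operator.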
Caching to Reduce Request Volume
The best rate limit fix is not making the request at all. For lookups that do not change often (company data, product catalogs, reference data), cache the result in a local database or Redis for hours or days. A 1-hour cache hit rate of 80 percent means you are making 5x fewer API calls.
Use the Redis nodes in n8n or a simple Airtable cache table. Even a crude cache dramatically reduces rate limit pressure.
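The cache-aside pattern itself is tiny. A sketch with an in-memory Map standing in for Redis or the Airtable table; fetcher is whatever makes the real API call, and the now parameter is injectable for testing:

```javascript
// Cache-aside lookup with a TTL: return the cached value if it is fresh,
// otherwise call the fetcher once and store the result.
const cache = new Map();

async function cachedLookup(key, ttlMs, fetcher, now = Date.now()) {
  const hit = cache.get(key);
  if (hit && now - hit.storedAt < ttlMs) return hit.value; // cache hit: no API call
  const value = await fetcher(key);                        // miss: one real request
  cache.set(key, { value, storedAt: now });
  return value;
}
```

The same shape works against the Redis nodes: GET, and on a miss, the API call followed by SET with an expiry.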
[Chart: Impact of Each Pattern on Rate Limit Compliance]
Monitoring Rate Limit Usage
Log every API call with timestamp and endpoint. Build a dashboard showing requests per minute for each external API. When you approach a limit, you see it before hitting it and can adjust. Without visibility, you only learn about rate limits when workflows start failing.
Many APIs return current usage in response headers (X-RateLimit-Remaining, X-RateLimit-Reset). Capture these and alert when usage approaches thresholds. Cheaper than waiting for 429s.
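Capturing those headers takes one small function. A sketch assuming the HTTP node exposes lowercased header keys; the exact header names vary by API, so treat these as placeholders:

```javascript
// Read X-RateLimit-Limit / X-RateLimit-Remaining from response headers
// and flag when usage crosses an alert threshold.
function rateLimitStatus(headers, alertRatio = 0.8) {
  const limit = Number(headers['x-ratelimit-limit']);
  const remaining = Number(headers['x-ratelimit-remaining']);
  if (!Number.isFinite(limit) || !Number.isFinite(remaining)) {
    return { known: false }; // API did not report usage
  }
  const usedRatio = (limit - remaining) / limit;
  return { known: true, usedRatio, alert: usedRatio >= alertRatio };
}
```

Wire the alert flag to a Slack or email node so you hear about approaching limits before the 429s start.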
When to Negotiate a Higher Limit
If your workflows consistently bump against rate limits and you are already using efficient patterns, contact the API provider and ask for a higher tier. Many providers have informal or negotiable rate limits above the public defaults. A one-line email sometimes doubles your quota at no cost.