I am facing an issue where our application is receiving 429 Too Many Requests responses from the NHS API Gateway even though we’re making what seems to be a very low number of requests (under 10 per minute).
We’ve implemented standard headers, authentication is working fine, and we’re not seeing this behavior in our test environment. But in production, we get throttled intermittently without clear cause.
We’ve checked for background jobs, retry storms, and other unexpected calls, but logs show consistent behavior well under any documented limit.
Is there any undocumented nuance around per-IP, per-user, or even per-resource throttling we might be missing? Could caching or some shared infrastructure be a factor?
Checked https://digital.nhs.uk/services/api-platform-CCSP Training guide related to this and found it quite informative.
Any insight into rate-limiting rules or ways to get clearer diagnostics from the API gateway would be a big help. Also, is there any recommended way to monitor our rate limit usage in real-time?
Welcome to the NHS England API Developer community.
The NHSE API Platform enforces rate limiting to protect services and ensure fair usage. There are some nuances, however. Rate limits may vary by:
Environment (Sandbox, INT, Live/Production)
Client Application (based on client ID or ASID)
API Type (e.g., PDS, EPS, MESH, etc.)
Access Mode (Application-Restricted vs. User-Restricted)
Even if your traffic is low (e.g., <10 requests/min), you can still hit a 429 response if:
You share infrastructure with another system using the same client credentials or IP, and their usage pushes the total over the limit.
You’re sending bursty traffic (e.g., 10 requests within 1 second).
There is hidden retry logic (e.g., HTTP clients doing automatic retries).
Throttling is often applied per ASID or client ID, not always per IP, so every system using your credentials counts towards the same limit.
Some APIs may have stricter per-resource or per-endpoint limits, especially FHIR endpoints like GET /Patient or GET /Appointment.
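To rule out bursty traffic on the client side, one option is to space outgoing requests evenly instead of sending them back to back. The following is a minimal Python sketch, not a platform-provided tool: `RequestPacer` is a hypothetical helper, and the figure of 2 requests per second is only an illustrative assumption, not a documented limit for any NHS API.

```python
import time


class RequestPacer:
    """Keeps consecutive requests at least 1 / max_per_second seconds apart.

    Hypothetical helper for illustration; the rate you pass in is your own
    choice and is not taken from any documented gateway limit.
    """

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next request may start

    def wait(self):
        """Block until the next request is allowed; return the seconds waited."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return delay
```

Call `pacer.wait()` immediately before each API request. Because the clock and sleep functions are injectable, the pacing logic can be checked without real delays.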
Understanding the 429 Error: The 429 status code (“Too Many Requests”) is triggered when your request count exceeds the configured threshold in the given interval. Given that you’ve received a 429 error despite staying well under the limit, it’s possible that:
SpikeArrest is enforced in smaller increments: If the SpikeArrest policy is applied here, the per-minute rate might be effectively divided across seconds, which would restrict bursts within any single second.
Setting correct values: If you want to enforce a rate of, for example, 120 requests per minute, the SpikeArrest value should not simply be 120pm. Instead, Apigee documentation advises setting SpikeArrest to 2ps (2 per second, since 120 / 60 = 2).
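The effect of this smoothing can be illustrated with a simplified model. The Python sketch below is not Apigee's actual implementation; it only assumes that a per-minute rate is turned into a minimum gap between accepted requests (60,000 ms divided by the rate). The function names and the sample timestamps are made up for illustration.

```python
def smoothed_interval_ms(requests_per_minute):
    """Minimum gap between accepted requests under the simplified model."""
    return 60_000 / requests_per_minute


def count_rejected(timestamps_ms, requests_per_minute):
    """Count requests that arrive too soon after the last accepted one."""
    interval = smoothed_interval_ms(requests_per_minute)
    rejected = 0
    last_accepted = None
    for t in sorted(timestamps_ms):
        if last_accepted is not None and t - last_accepted < interval:
            rejected += 1  # would receive a 429 in this model
        else:
            last_accepted = t
    return rejected


# Ten requests sent 100 ms apart (a burst within one second).
burst = [i * 100 for i in range(10)]
# Ten requests sent 6 seconds apart (the same count spread over a minute).
spread = [i * 6_000 for i in range(10)]

print(count_rejected(burst, 120))   # 8
print(count_rejected(spread, 120))  # 0
```

In this model, ten requests in one second are mostly rejected at a nominal 120 per minute, while the same ten requests spread over the minute all pass. That matches the pattern of intermittent 429 responses at low overall volume.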
Recommendations:
Verify Policy Configurations: Check the manifest settings for both the Quota and SpikeArrest policies, ensuring they align with your desired rate (e.g., 2ps for 120 requests per minute).
Consider Apigee Documentation Guidance: Apigee suggests configuring SpikeArrest in terms of per-second intervals rather than aggregating the total per minute.
If this doesn’t solve the issue, additional configuration tuning might be required, particularly in distinguishing between Quota and SpikeArrest limits in your manifest. Enable full request/response logging, including headers, especially during spikes to help further with diagnostics.
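As one way to capture the logging described above, the Python sketch below builds a timestamped record of each exchange and writes it to the log when the status is 429. It is only an illustration under assumptions: the function names, the logger name and the list of headers to redact are hypothetical, and whether the gateway returns a `Retry-After` header (a standard HTTP header that may accompany a 429) should be confirmed from your own logs.

```python
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("api_throttle")  # hypothetical logger name

# Header names (lower case) whose values should never be written to logs.
# This list is an assumption; extend it to match your own setup.
SENSITIVE_HEADERS = {"authorization", "apikey", "cookie"}


def redact(headers):
    """Return a copy of the headers with sensitive values masked."""
    return {
        name: "<redacted>" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


def build_log_record(method, url, status, request_headers, response_headers):
    """Build a timestamped, JSON-serialisable record of one exchange."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "method": method,
        "url": url,
        "status": status,
        "request_headers": redact(request_headers),
        "response_headers": redact(response_headers),
    }


def log_if_throttled(method, url, status, request_headers, response_headers):
    """Log the full record at WARNING level when the response is a 429."""
    record = build_log_record(method, url, status, request_headers, response_headers)
    if status == 429:
        logger.warning(json.dumps(record))
    return record
```

Records like these give the timestamped sample of 429 responses, with headers, that is useful for diagnostics, without exposing credentials.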
We could investigate further from our side if you can provide the following details:
Your Client ID or ASID
A timestamped log sample of the 429 responses
The API you are calling and environment (INT or PROD)
Hope the above helps.
Thanks,
NHS England API Platform team
Please note: The API Platform team can only address queries relevant to the NHS England API platform, including security, rate limiting, logging, monitoring and alerting. For any API-specific queries, please reach out to the relevant API teams.