make check
to catch common errors. Fixed any that came up.
Description:
This PR introduces request queueing to the concurrency limiter.
The concurrency limiter was designed to limit, per instance, the number of requests that are processed concurrently for each integration key, service, and/or user.
The present-day implementation limits concurrency, effectively isolating busy consumers, but all "pending" requests are then processed in random order. Under load, this results in latency spikes (per individual key) and 502 responses due to "backend timeout".
This change replaces the randomly ordered group of pending requests with a FIFO queue, so pending requests are processed in arrival order.
Rejected requests are still scoped per integration key/service/user as before, but it is now possible to distinguish requests that actually hung from requests that were throttled.
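For reviewers, the queueing behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class name `FifoConcurrencyLimiter`, the parameters `max_concurrent` and `max_queued`, and the choice to keep one queue per key are all assumptions made for the example.

```python
from collections import defaultdict, deque


class FifoConcurrencyLimiter:
    """Hypothetical sketch: per-key concurrency limit with a FIFO pending queue."""

    def __init__(self, max_concurrent, max_queued):
        self.max_concurrent = max_concurrent  # assumed per-key concurrency limit
        self.max_queued = max_queued          # assumed per-key queue capacity
        self.running = defaultdict(int)       # key -> requests currently in flight
        self.pending = defaultdict(deque)     # key -> waiting request ids, oldest first

    def submit(self, key, request_id):
        """Return "run", "queued", or "throttled" for a new request."""
        if self.running[key] < self.max_concurrent:
            self.running[key] += 1
            return "run"
        if len(self.pending[key]) < self.max_queued:
            self.pending[key].append(request_id)
            return "queued"
        # Queue is full: the caller would answer 429 here instead of
        # letting the request wait until a backend timeout turns into a 502.
        return "throttled"

    def finish(self, key):
        """Mark one request for `key` as done and return the next one to start, if any."""
        if self.pending[key]:
            # The freed slot is handed to the oldest waiter, so the
            # in-flight count for this key stays the same.
            return self.pending[key].popleft()
        self.running[key] -= 1
        return None
```

With `max_concurrent=1` and `max_queued=2`, a fourth simultaneous request for the same key is throttled, while a request for a different key still runs immediately. Completed requests release waiters in arrival order.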
Describe any introduced API changes:
Throttled requests will now return status code 429 instead of timing out and returning 502.