-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
core: fix unrecoverable freezes of rabbit's consumer #10594
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
❗ Your organization needs to install the Codecov GitHub app to enable full functionality. Additional details and impacted files@@ Coverage Diff @@
## dev #10594 +/- ##
==========================================
- Coverage 81.93% 81.93% -0.01%
==========================================
Files 1079 1079
Lines 107380 107376 -4
Branches 737 737
==========================================
- Hits 87984 87978 -6
- Misses 19356 19358 +2
Partials 40 40
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
06726c5
to
ee9d8aa
Compare
b9647fc
to
c8bd86f
Compare
54ad82d
to
17f2e86
Compare
Hard fix (kind of) to kill the process (and let orchestrator restart) if: * rabbit shuts down (triggering consumer's ShutdownCallback) * or an exception before starting basicConsume() goes up to main() (even if other threads may run) Bump amqp-client on the way as it doesn't hurt Signed-off-by: Pierre-Etienne Bougué <[email protected]>
From hand-tests, shutdown is already covered by the System.exit in App.java::main(). Signed-off-by: Pierre-Etienne Bougué <[email protected]>
…llback Hard fix (kind of) to kill the process (and let orchestrator restart) if an exception goes all the way up to the DeliverCallback. For example when not able to reach editoast for infra reload. This will release unacked messages and move them back to ready (instead of keeping them unacked until the worker exits). Signed-off-by: Pierre-Etienne Bougué <[email protected]>
3ecda5e
to
e6c5f8c
Compare
A third commit was pushed, and the main comment updated (and all rebased on No more work is planned on this, please read the main comment, any feedback is welcome, and we should be good to go 🙏 |
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for handling this!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work!
Improve the expressiveness when handling exceptions in rabbit worker. After #10594 and #9439. Also minor code/log improvements. Signed-off-by: Pierre-Etienne Bougué <[email protected]>
Improve the expressiveness when handling exceptions in rabbit worker. After #10594 and #9439. Also minor code/log improvements. Signed-off-by: Pierre-Etienne Bougué <[email protected]>
Hard fix (kind of) to kill the process if a thread's exception goes up to
main()
, or if rabbit's cancel/shutdown notification is received (even if other threads may run) and let orchestrator restart it.Bump amqp-client on the way as it doesn't hurt.
Fixes #8621
Also reproduced and fixed the case that leads to the following (different) core logs:
Hand-tested
core-req-all
queue to initiate cancel notification: ✅core logs "consumer cancelled" then stops (should cover Core message consumer fails and blocks the scenario #8621).Understanding of previous and current work
Previous:
Current:
catch
in themain()
, then the finalreturn
) and it may be an idea to use shutdown notification to exit cleanlyLooks like some improvements may be done (to be explored later, sorted by ROI)
isRecoverable
, cleanup consumer logs, properly close channels and connections)