-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unblocks helix thread for online segment #15246
base: master
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #15246 +/- ##
============================================
+ Coverage 61.75% 63.60% +1.85%
- Complexity 207 1459 +1252
============================================
Files 2436 2773 +337
Lines 133233 156263 +23030
Branches 20636 23985 +3349
============================================
+ Hits 82274 99390 +17116
- Misses 44911 49384 +4473
- Partials 6048 7489 +1441
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
_allowConsumptionDuringCommit = !_realtimeTableDataManager.isPartialUpsertEnabled() ? true | ||
: _tableConfig.getUpsertConfig().isAllowPartialUpsertConsumptionDuringCommit(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this not checking for dedup?
Problem
Consider a scenario for Realtime Tables where there is a slow server and a fast server where fast server commits multiple segments but slow server is still consuming an old segment.
Slow server might run into a situation like below:

(since segments are not consumed in strict order based on sequence number, some later segment's consuming thread can block Helix transition thread of an older segment.)
Solution
Change semaphore acquire logic such that if a Helix threads fails to acquire the semaphore, check if the fast server has already committed the segment. If yes, simply return as eventually the slow server will receive a consuming -> online state transition message for the same segment.
(Currently dedup and partial-upsert enabled tables are being ignored for this problem because we want to ensures that new segment is not downloaded until the previous consuming segment releases the semaphore. (Although there is an existing bug in dedup and upsert where strict order of segment consumption is not guaranteed)).