Skip to content

Conversation

@tarun11Mavani
Copy link
Contributor

@tarun11Mavani tarun11Mavani commented Dec 11, 2025

Problem

As discussed in #17337, Segments are incorrectly deleted when any single replica reported zero valid documents, causing permanent data loss during server restarts and async state transitions where replicas had inconsistent validDocIds.

Solution

Added consensus requirement: only delete/compact segments when ALL replicas agree on validDoc counts
Prevents race conditions during OFFLINE→ONLINE transitions and tie-breaking logic divergence

Changes

  • New hasValidDocConsensus() method to verify replica agreement before operation to select segment for deletion or compaction

Tests

Deployed this change in a test cluster and see this warning message about skipping few segments due to validDocId mismatch.


2025-12-16T06:50:01+0530 [phx51] [phx51-2u] Inconsistent validDoc counts across replicas for segment: rta_tarunm_index__101__181__20251215T2107Z. Expected: 607695, but found: 607791
2025-12-16T06:50:01+0530 [phx51] [phx51-2u] Inconsistent validDoc counts across replicas for segment: rta_tarunm_index__53__142__20251215T0114Z. Expected: 423539, but found: 423645
2025-12-16T06:50:01+0530 [phx51] [phx51-2u] Skipping segment rta_tarunm_index__121__155__20251215T0859Z for table rta_tarunm_index_REALTIME - no consensus on validDoc counts across replicas
2025-12-16T06:50:01+0530 [phx51] [phx51-2u] Inconsistent validDoc counts across replicas for segment: rta_tarunm_index__75__184__20251215T2158Z. Expected: 595070, but found: 594974
2025-12-16T06:50:01+0530 [phx51] [phx51-2u] Inconsistent validDoc counts across replicas for segment: rta_tarunm_index__19__171__20251215T1626Z. Expected: 569283, but found: 569224
2025-12-16T06:50:01+0530 [phx51] [phx51-2u] CRC mismatch for segment: rta_tarunm_index__12__188__20251215T2329Z, (segmentZKMetadata=450178614, validDocIdsMetadata=3330760987)
2025-12-16T06:50:01+0530 [phx51] [phx51-2u] Skipping segment rta_tarunm_index__27__148__20251215T0504Z for table rta_tarunm_index_REALTIME - no consensus on validDoc counts across replicas
2025-12-16T06:50:01+0530 [phx51] [phx51-2u] Segment rta_tarunm_index__26__164__20251215T1301Z marked for deletion - all replicas agree it has zero valid documents
2025-12-16T06:50:01+0530 [phx51] [phx51-2u] Skipping segment rta_tarunm_index__25__156__20251215T0922Z for table rta_tarunm_index_REALTIME - no consensus on validDoc counts across replicas

@codecov-commenter
Copy link

codecov-commenter commented Dec 11, 2025

Codecov Report

❌ Patch coverage is 40.00000% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.22%. Comparing base (d47ea27) to head (ee87a38).

Files with missing lines Patch % Lines
...che/pinot/plugin/minion/tasks/MinionTaskUtils.java 42.85% 9 Missing and 3 partials ⚠️
...tcompactmerge/UpsertCompactMergeTaskGenerator.java 14.28% 4 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##             master   #17352   +/-   ##
=========================================
  Coverage     63.22%   63.22%           
- Complexity     1475     1477    +2     
=========================================
  Files          3147     3147           
  Lines        187575   187597   +22     
  Branches      28712    28717    +5     
=========================================
+ Hits         118590   118614   +24     
+ Misses        59794    59792    -2     
  Partials       9191     9191           
Flag Coverage Δ
custom-integration1 100.00% <ø> (ø)
integration 100.00% <ø> (ø)
integration1 100.00% <ø> (ø)
integration2 0.00% <ø> (ø)
java-11 63.20% <40.00%> (+0.01%) ⬆️
java-21 63.16% <40.00%> (-0.02%) ⬇️
temurin 63.22% <40.00%> (+<0.01%) ⬆️
unittests 63.22% <40.00%> (+<0.01%) ⬆️
unittests1 55.65% <100.00%> (+0.02%) ⬆️
unittests2 33.87% <40.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

List<ValidDocIdsMetadataInfo> replicaMetadataList = validDocIdsMetadataInfoMap.get(segmentName);

// Check consensus across all replicas before proceeding with any operations
if (!hasValidDocConsensus(segmentName, replicaMetadataList)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will reduce the chances but is it still possible that a subsequent restart/repalce and doSnapshot leads to a segment with 0 validDocIds to become non zero again?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do this check in controller and immediately fire the delete call for selected segments. Since this is happening inside the same method, chances of a server restart and segment coming up with non-zero validDocIds between this period is very very low.

@tarun11Mavani
Copy link
Contributor Author

@tibrewalpratik17 @xiangfu0 can you please take a look at this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants