Bug 2008333 - IrrelevantDataRemoval may stop early depending on repository order #9155
When the IrrelevantDataRemoval strategy encounters an empty repository, it currently skips the repository that follows it (https://bugzilla.mozilla.org/show_bug.cgi?id=2008333).
This PR updates the strategy so that it iterates through every candidate repository without skipping any.
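To illustrate the kind of failure described above, here is a minimal sketch. It is not the Treeherder code: `clean_buggy`, `clean_fixed`, and the `repos` mapping are hypothetical names, and the shared-iterator mistake is only one assumed way an empty repository could cause the next one to be skipped.

```python
from typing import Dict, List


def clean_buggy(repos: Dict[str, List[int]]) -> List[str]:
    """Hypothetical buggy loop: an empty repository also drops the next one."""
    cleaned = []
    names = iter(repos)
    for name in names:
        if not repos[name]:
            # Assumed bug pattern, for illustration only: advancing the
            # shared iterator here silently consumes the following repository.
            next(names, None)
            continue
        cleaned.append(name)
    return cleaned


def clean_fixed(repos: Dict[str, List[int]]) -> List[str]:
    """Visit every candidate repository; an empty one is just a no-op."""
    cleaned = []
    for name, rows in repos.items():
        if not rows:
            continue  # nothing to remove here, move on to the next repository
        cleaned.append(name)
    return cleaned


# Hypothetical repositories mapped to the rows eligible for removal.
repos = {"a": [1], "empty": [], "b": [2], "c": [3]}

print(clean_buggy(repos))  # "b" is missing because it follows "empty"
print(clean_fixed(repos))  # every non-empty repository is visited
```

With the fixed loop, the result no longer depends on where the empty repositories fall in the iteration order.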
However, before landing this PR, we should first finish the work to improve data cycling performance. Once this bug is fixed, the previously skipped repositories will start to be cleaned up. There are about 130 such repositories, and if each one takes around 2 to 3 minutes, the total runtime could increase significantly.
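As a rough back-of-the-envelope estimate of that increase, the sketch below multiplies the figures quoted above. It assumes the repositories are processed one after another and that the per-repository time stays in the 2 to 3 minute range; both are assumptions, not measurements.

```python
# Figures taken from the description above; both are approximate.
SKIPPED_REPOSITORIES = 130
MINUTES_PER_REPOSITORY = (2, 3)  # assumed lower and upper bound

# Assumes strictly sequential processing with no parallelism.
low_minutes = SKIPPED_REPOSITORIES * MINUTES_PER_REPOSITORY[0]
high_minutes = SKIPPED_REPOSITORIES * MINUTES_PER_REPOSITORY[1]

print(f"extra runtime: {low_minutes}-{high_minutes} minutes "
      f"(about {low_minutes / 60:.1f}-{high_minutes / 60:.1f} hours)")
```

Under those assumptions the extra work is on the order of several hours per run.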
In the meantime, the skipped data does not accumulate indefinitely: MainRemovalStrategy eventually removes it, and the amount of data this strategy targets is small compared to the others.
It therefore seems reasonable to delay landing this PR until the data cycling improvements are complete.