MLE-26420 Can now perform incremental writes #1865
Copyright Validation Results
❌ Failed Files
⏭️ Skipped (Excluded) Files
✅ Valid Files
🛠️ Guidance: Follow these steps to fix the failed files:
Pull request overview
This PR implements incremental write functionality for MarkLogic document batches, allowing documents to be skipped if their content hasn't changed. The implementation uses hash-based content comparison stored in a configurable MarkLogic field, with support for both Optic and eval-based hash retrieval strategies.
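The hash-based skip described above can be sketched in a few lines. This is a minimal, self-contained illustration, not the PR's `IncrementalWriteFilter`: the class name `IncrementalHashSketch`, the choice of SHA-256, and the use of plain maps in place of a MarkLogic field lookup are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; the PR's real hash algorithm and storage are not shown here.
class IncrementalHashSketch {

    // Hex-encoded SHA-256 of the document content (algorithm is an assumption).
    static String hash(String content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] bytes = digest.digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : bytes) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Keeps only documents whose hash differs from the stored hash for the same URI.
    // storedHashes stands in for values read from the configured hash field.
    static Map<String, String> filterUnchanged(Map<String, String> incoming,
                                               Map<String, String> storedHashes) {
        Map<String, String> toWrite = new LinkedHashMap<>();
        for (Map.Entry<String, String> doc : incoming.entrySet()) {
            String newHash = hash(doc.getValue());
            if (!newHash.equals(storedHashes.get(doc.getKey()))) {
                toWrite.put(doc.getKey(), doc.getValue());
            }
        }
        return toWrite;
    }

    public static void main(String[] args) {
        Map<String, String> stored = new HashMap<>();
        stored.put("/a.json", hash("{\"a\":1}"));

        Map<String, String> incoming = new LinkedHashMap<>();
        incoming.put("/a.json", "{\"a\":1}"); // unchanged, so it is skipped
        incoming.put("/b.json", "{\"b\":2}"); // no stored hash, so it is written

        System.out.println(filterUnchanged(incoming, stored).keySet());
    }
}
```

In the PR itself the stored hashes come from MarkLogic via either an Optic query or an eval call; the map lookup here only stands in for that step.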
Key changes:
- Introduced `DocumentWriteSetFilter` interface for pre-write document set modification
- Implemented `IncrementalWriteFilter` with builder pattern for customizable incremental write behavior
- Added comprehensive test coverage including JSON canonicalization scenarios
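To illustrate why JSON canonicalization matters for hash comparison: two documents with the same fields in a different key order should produce the same hash. The sketch below is a deliberately simplified stand-in, assuming flat objects with string values and no escaping; the class name `CanonicalJsonSketch` is hypothetical, and the PR relies on an added dependency for real canonicalization rather than code like this.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical sketch: canonical text for flat objects with string values only.
// It does not escape special characters or handle nesting, numbers, or arrays.
class CanonicalJsonSketch {

    // Sorts keys so that insertion order no longer affects the serialized form.
    static String canonicalize(Map<String, String> fields) {
        return new TreeMap<>(fields).entrySet().stream()
            .map(e -> "\"" + e.getKey() + "\":\"" + e.getValue() + "\"")
            .collect(Collectors.joining(",", "{", "}"));
    }

    public static void main(String[] args) {
        Map<String, String> first = new LinkedHashMap<>();
        first.put("b", "2");
        first.put("a", "1");

        Map<String, String> second = new LinkedHashMap<>();
        second.put("a", "1");
        second.put("b", "2");

        // Both maps serialize to the same text, so a hash of that text would match.
        System.out.println(canonicalize(first).equals(canonicalize(second)));
    }
}
```

Hashing the canonical text instead of the raw input means a reordered but otherwise identical document would still be treated as unchanged.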
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| marklogic-client-api/build.gradle | Added dependencies for JSON canonicalization and hash generation |
| marklogic-client-api/src/main/java/com/marklogic/client/datamovement/DocumentWriteSetFilter.java | New interface for filtering document write sets before writing |
| marklogic-client-api/src/main/java/com/marklogic/client/datamovement/WriteBatcher.java | Added withDocumentWriteSetFilter method to enable filter integration |
| marklogic-client-api/src/main/java/com/marklogic/client/datamovement/impl/WriteBatcherImpl.java | Integrated filter support into batch writing workflow |
| marklogic-client-api/src/main/java/com/marklogic/client/datamovement/impl/BatchWriter.java | Applied filter to document write sets before writing |
| marklogic-client-api/src/main/java/com/marklogic/client/datamovement/impl/BatchWriteSet.java | Implemented Context interface and added method for updating filtered write sets |
| marklogic-client-api/src/main/java/com/marklogic/client/datamovement/filter/IncrementalWriteFilter.java | Core abstract implementation for incremental write filtering with hash-based comparison |
| marklogic-client-api/src/main/java/com/marklogic/client/datamovement/filter/IncrementalWriteOpticFilter.java | Optic-based implementation for retrieving existing hash values |
| marklogic-client-api/src/main/java/com/marklogic/client/datamovement/filter/IncrementalWriteEvalFilter.java | Eval-based implementation for retrieving existing hash values |
| marklogic-client-api/src/main/java/com/marklogic/client/impl/okhttp/RetryIOExceptionInterceptor.java | Added handling for MarkLogicIOException in retry logic |
| marklogic-client-api/src/test/java/com/marklogic/client/datamovement/WriteNakedPropertiesTest.java | Moved test to datamovement package and simplified |
| marklogic-client-api/src/test/java/com/marklogic/client/datamovement/filter/IncrementalWriteTest.java | Integration tests for incremental write functionality |
| marklogic-client-api/src/test/java/com/marklogic/client/datamovement/filter/IncrementalWriteFilterTest.java | Unit tests for metadata handling in incremental writes |
| marklogic-client-api/src/test/java/com/marklogic/client/test/datamovement/IncrementalWriteTest.java | Removed old test file (moved to new location) |
| test-app/src/main/ml-config/databases/content-database.json | Added field and range index configuration for incremental write hash |
| { | ||
| "field-name": "incrementalWriteHash", |
Copilot
AI
Dec 30, 2025
Inconsistent indentation: this block uses tabs while the surrounding code uses spaces. Should use spaces to match the existing style.
| { | ||
| "scalar-type": "string", |
Copilot
AI
Dec 30, 2025
Inconsistent indentation: this block uses tabs while the surrounding code uses spaces. Should use spaces to match the existing style.
| doc2 = IncrementalWriteFilter.addHashToMetadata(doc2, "theField", "abc123"); | ||
|
|
||
| assertEquals(metadata, doc1.getMetadata(), "doc1 should stillhave the original metadata object"); |
Copilot
AI
Dec 30, 2025
Corrected spelling of 'stillhave' to 'still have'.
| assertEquals(metadata, doc1.getMetadata(), "doc1 should stillhave the original metadata object"); | |
| assertEquals(metadata, doc1.getMetadata(), "doc1 should still have the original metadata object"); |
| for (DocumentWriteOperation doc : context.getDocumentWriteSet()) { | ||
| if (!DocumentWriteOperation.OperationType.DOCUMENT_WRITE.equals(doc.getOperationType())) { | ||
| newWriteSet.add(doc); |
Copilot
AI
Dec 30, 2025
Logic error: non-DOCUMENT_WRITE operations are added to newWriteSet but then processing continues for all documents. This causes non-DOCUMENT_WRITE operations to be processed for hashing when they should be skipped. The continue statement is missing after adding non-DOCUMENT_WRITE operations.
| newWriteSet.add(doc); | |
| newWriteSet.add(doc); | |
| continue; |
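The control flow that this suggestion aims for can be sketched in isolation. The example assumes, as the comment describes, that hashing logic follows the `if` block inside the same loop. `Op`, `OperationType`, and `filter` are hypothetical stand-ins, not the MarkLogic client types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for the real write-set types; only the loop shape matters.
class SkipNonWriteSketch {

    enum OperationType { DOCUMENT_WRITE, OTHER }

    static final class Op {
        final OperationType type;
        final String uri;

        Op(OperationType type, String uri) {
            this.type = type;
            this.uri = uri;
        }
    }

    // Passes non-write operations through untouched; records which URIs reach hashing.
    static List<Op> filter(List<Op> ops, List<String> hashedUris) {
        List<Op> newWriteSet = new ArrayList<>();
        for (Op doc : ops) {
            if (doc.type != OperationType.DOCUMENT_WRITE) {
                newWriteSet.add(doc);
                continue; // without this, the operation would fall through to hashing
            }
            hashedUris.add(doc.uri); // stand-in for hash computation and comparison
            newWriteSet.add(doc);
        }
        return newWriteSet;
    }

    public static void main(String[] args) {
        List<String> hashedUris = new ArrayList<>();
        List<Op> result = filter(Arrays.asList(
            new Op(OperationType.OTHER, "defaults"),
            new Op(OperationType.DOCUMENT_WRITE, "/a.json")), hashedUris);
        System.out.println(result.size() + " operations kept, hashed: " + hashedUris);
    }
}
```

Without the `continue`, a non-write operation in this sketch would be added to the write set twice and would also reach the hashing step.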
Added DocumentWriteSetFilter as a generic interface for modifying a DocumentWriteSet before it's written. IncrementalWriteFilter is then the entry point, with a Builder for customizing its behavior. Also started moving some tests into "com.marklogic.client.datamovement" so we can have unit tests that verify protected methods.
975a342 to 8ea32e4
Going to break this up