feat(services): add configurable batch size with file container rotation #189
Open
y-lakhdar wants to merge 6 commits into feat/configurable-batch-size-infrastructure from feat/configurable-batch-size-services
+1,202
−125
23deee5 feat(services): add configurable batch size with file container rotation (y-lakhdar)
5ea3977 merge (y-lakhdar)
de2271d docs: replace UPGRADE_NOTES with ConfigureBatchSize sample (y-lakhdar)
c6e17b9 docs: update doc (y-lakhdar)
7c18c02 run spotless (y-lakhdar)
45059b9 run spotless (y-lakhdar)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,169 @@ | ||
| # Configuration Guide | ||
|
|
||
| This document describes the available configuration options for the Coveo Push API Java Client. | ||
|
|
||
| ## Batch Size Configuration | ||
|
|
||
| The batch size controls how much data accumulates before the client creates a file container and pushes it to Coveo. The default is **5 MB**; the maximum is **256 MB** (the Stream API limit). | ||
|
|
||
| ### Configuration Methods | ||
|
|
||
| There are two ways to configure the batch size: | ||
|
|
||
| #### 1. System Property (Runtime Configuration) | ||
|
|
||
| Set the `coveo.push.batchSize` system property to configure the default batch size globally for all service instances: | ||
|
|
||
| **Java Command Line:** | ||
|
|
||
| ```bash | ||
| java -Dcoveo.push.batchSize=134217728 -jar your-application.jar | ||
| ``` | ||
|
|
||
| **Within Java Code:** | ||
|
|
||
| ```java | ||
| // Set before creating any service instances | ||
| System.setProperty("coveo.push.batchSize", "134217728"); // 128 MB in bytes | ||
| ``` | ||
|
|
||
| **Maven/Gradle Build** (both snippets set the property for the test JVM only): | ||
|
|
||
| ```xml | ||
| <!-- pom.xml --> | ||
| <properties> | ||
| <argLine>-Dcoveo.push.batchSize=134217728</argLine> | ||
| </properties> | ||
| ``` | ||
|
|
||
| ```groovy | ||
| // build.gradle | ||
| test { | ||
| systemProperty 'coveo.push.batchSize', '134217728' | ||
| } | ||
| ``` | ||
|
|
||
| **Example Values:** | ||
|
|
||
| - `5242880` = 5 MB (default) | ||
| - `10485760` = 10 MB | ||
| - `33554432` = 32 MB | ||
| - `67108864` = 64 MB | ||
| - `134217728` = 128 MB | ||
| - `268435456` = 256 MB (maximum) | ||
|
|
||
| #### 2. Constructor Parameter (Per-Instance Configuration) | ||
|
|
||
| Pass the `maxQueueSize` parameter (in bytes) when creating service instances: | ||
|
|
||
| ```java | ||
| // UpdateStreamService with custom 128 MB batch size | ||
| UpdateStreamService service = new UpdateStreamService( | ||
| catalogSource, | ||
| backoffOptions, | ||
| null, // userAgents (optional) | ||
| 128 * 1024 * 1024 // 128 MB in bytes | ||
| ); | ||
|
|
||
| // PushService with custom batch size | ||
| PushService pushService = new PushService( | ||
| pushEnabledSource, | ||
| backoffOptions, | ||
| 128 * 1024 * 1024 // 128 MB | ||
| ); | ||
|
|
||
| // StreamService with custom batch size | ||
| StreamService streamService = new StreamService( | ||
| streamEnabledSource, | ||
| backoffOptions, | ||
| null, // userAgents (optional) | ||
| 128 * 1024 * 1024 // 128 MB | ||
| ); | ||
| ``` | ||
|
|
||
| ### Configuration Priority | ||
|
|
||
| The effective batch size is resolved in this order: | ||
|
|
||
| 1. **Constructor parameter** takes precedence (if specified) | ||
| 2. **System property** is used as default (if set) | ||
| 3. **Built-in default** of 5 MB is used otherwise | ||
|
|
||
| ### Validation Rules | ||
|
|
||
| All batch size values are validated: | ||
|
|
||
| - ✅ **Maximum:** 256 MB (268,435,456 bytes), the API limit | ||
| - ✅ **Minimum:** 1 byte (the value must be greater than 0) | ||
| - ❌ Values exceeding 256 MB throw `IllegalArgumentException` | ||
| - ❌ Zero, negative, or otherwise invalid values throw `IllegalArgumentException` | ||
|
|
||
| ### Examples | ||
|
|
||
| #### Example 1: Using System Property | ||
|
|
||
| ```java | ||
| // Configure globally via system property | ||
| System.setProperty("coveo.push.batchSize", "134217728"); // 128 MB | ||
|
|
||
| // All services will use 128 MB by default | ||
| UpdateStreamService updateService = new UpdateStreamService(catalogSource, backoffOptions); | ||
| PushService pushService = new PushService(pushEnabledSource, backoffOptions); | ||
| StreamService streamService = new StreamService(streamEnabledSource, backoffOptions); | ||
| ``` | ||
|
|
||
| #### Example 2: Override Per Service | ||
|
|
||
| ```java | ||
| // Set global default to 128 MB | ||
| System.setProperty("coveo.push.batchSize", "134217728"); | ||
|
|
||
| // Update service uses global default (128 MB) | ||
| UpdateStreamService updateService = new UpdateStreamService(catalogSource, backoffOptions); | ||
|
|
||
| // Push service overrides with 64 MB | ||
| PushService pushService = new PushService(pushEnabledSource, backoffOptions, 64 * 1024 * 1024); | ||
|
|
||
| // Stream service uses global default (128 MB) | ||
| StreamService streamService = new StreamService(streamEnabledSource, backoffOptions); | ||
| ``` | ||
|
|
||
| ### When to Adjust Batch Size | ||
|
|
||
| **Use smaller batches (32-64 MB, relative to the 256 MB maximum) when:** | ||
|
|
||
| - Network bandwidth is limited | ||
| - Memory is constrained | ||
| - Processing many small documents | ||
| - You want more frequent progress updates | ||
|
|
||
| **Use larger batches (128-256 MB) when:** | ||
|
|
||
| - Network bandwidth is high | ||
| - Processing large documents or files | ||
| - You want to minimize API calls | ||
| - Maximum throughput is needed | ||
|
|
||
| **Keep default (5 MB) when:** | ||
|
|
||
| - You're unsure | ||
| - Memory is a concern | ||
| - You want predictable, frequent pushes | ||
|
|
||
| ### Configuration Property Reference | ||
|
|
||
| | Property Name | Description | Default Value | Valid Range | | ||
| | ---------------------- | --------------------------- | ---------------- | -------------- | | ||
| | `coveo.push.batchSize` | Default batch size in bytes | `5242880` (5 MB) | 1 to 268435456 | | ||
|
|
||
| ## Additional Configuration | ||
|
|
||
| ### Environment Variables | ||
|
|
||
| The following environment variables can be used for general configuration: | ||
|
|
||
| - `COVEO_API_KEY` - API key for authentication | ||
| - `COVEO_ORGANIZATION_ID` - Organization identifier | ||
| - `COVEO_PLATFORM_URL` - Custom platform URL (if needed) | ||
|
|
||
| Refer to the Coveo Platform documentation for complete environment configuration options. |
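The resolution order and validation rules described in the guide above can be sketched as a small helper. This is a minimal sketch under stated assumptions, not the client's actual implementation: the class `BatchSizeResolver` and its methods `resolve` and `validate` are hypothetical names introduced for illustration. Only the `coveo.push.batchSize` property name, the 5 MB default, and the 256 MB limit come from the guide.

```java
// Hypothetical helper (not part of the Coveo client) illustrating the
// documented priority: constructor value, then system property, then default.
final class BatchSizeResolver {
    static final String PROPERTY = "coveo.push.batchSize";
    static final int DEFAULT_BYTES = 5 * 1024 * 1024;   // 5 MB, per the guide
    static final int MAX_BYTES = 256 * 1024 * 1024;     // 256 MB, per the guide

    private BatchSizeResolver() {}

    /** Returns the effective batch size; pass null when no constructor value was given. */
    static int resolve(Integer constructorValue) {
        if (constructorValue != null) {
            return validate(constructorValue);
        }
        String raw = System.getProperty(PROPERTY);
        if (raw != null) {
            try {
                return validate(Long.parseLong(raw.trim()));
            } catch (NumberFormatException e) {
                // Assumption: a non-numeric property is reported the same way
                // as an out-of-range one.
                throw new IllegalArgumentException("Invalid " + PROPERTY + ": " + raw, e);
            }
        }
        return DEFAULT_BYTES;
    }

    /** Rejects values outside 1..MAX_BYTES. */
    static int validate(long bytes) {
        if (bytes <= 0 || bytes > MAX_BYTES) {
            throw new IllegalArgumentException(
                "Batch size must be between 1 and " + MAX_BYTES + " bytes, got " + bytes);
        }
        return (int) bytes;
    }
}
```

With the property unset, `BatchSizeResolver.resolve(null)` falls back to the 5 MB default, and `BatchSizeResolver.resolve(64 * 1024 * 1024)` returns the constructor value regardless of the property. Parsing the property as a `long` before validating keeps an oversized value from being misread through integer overflow.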
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,45 @@ | ||
| import com.coveo.pushapiclient.*; | ||
|
|
||
| import java.io.IOException; | ||
|
|
||
| /** | ||
| * Demonstrates how to configure the batch size for document uploads. | ||
| * | ||
| * The batch size controls how much data accumulates before automatically | ||
| * creating a file container and pushing to Coveo. Default is 5 MB, max is 256 MB. | ||
| */ | ||
| public class ConfigureBatchSize { | ||
|
|
||
| public static void main(String[] args) throws IOException, InterruptedException { | ||
|
|
||
| PlatformUrl platformUrl = new PlatformUrlBuilder() | ||
| .withEnvironment(Environment.PRODUCTION) | ||
| .withRegion(Region.US) | ||
| .build(); | ||
|
|
||
| CatalogSource catalogSource = CatalogSource.fromPlatformUrl( | ||
| "my_api_key", "my_org_id", "my_source_id", platformUrl); | ||
|
|
||
| // Option 1: Use default batch size (5 MB) | ||
| UpdateStreamService defaultService = new UpdateStreamService(catalogSource); | ||
|
|
||
| // Option 2: Configure batch size via constructor (50 MB) | ||
| int fiftyMegabytes = 50 * 1024 * 1024; | ||
| UpdateStreamService customService = new UpdateStreamService( | ||
| catalogSource, | ||
| new BackoffOptionsBuilder().build(), | ||
| null, | ||
| fiftyMegabytes); | ||
|
|
||
| // Option 3: Configure globally via system property (affects all services) | ||
| // Run with: java -Dcoveo.push.batchSize=52428800 ConfigureBatchSize | ||
| // This sets 50 MB for all service instances that don't specify a size | ||
|
|
||
| // Use the service | ||
| DocumentBuilder document = new DocumentBuilder("https://my.document.uri", "My document title") | ||
| .withData("these words will be searchable"); | ||
|
|
||
| customService.addOrUpdate(document); | ||
| customService.close(); | ||
| } | ||
| } |
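The sample above relies on the service pushing automatically once the accumulated data reaches the batch size. The accumulate-and-flush idea can be sketched independently of the client. Everything here is an assumption for illustration: `SizeBoundedBatcher` and `flushHandler` are hypothetical names, documents are represented as plain JSON strings, and the policy of flushing before a document that would exceed the limit is one plausible choice, not necessarily what the services do. In the real client the flush step would correspond to creating a file container and uploading the batch.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical size-bounded batcher (not part of the Coveo client).
final class SizeBoundedBatcher {
    private final int maxBytes;
    private final Consumer<List<String>> flushHandler;
    private final List<String> pending = new ArrayList<>();
    private int pendingBytes = 0;

    SizeBoundedBatcher(int maxBytes, Consumer<List<String>> flushHandler) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be greater than 0");
        }
        this.maxBytes = maxBytes;
        this.flushHandler = flushHandler;
    }

    /** Adds a document, flushing first if it would push the batch over the limit. */
    void add(String json) {
        int size = json.getBytes(StandardCharsets.UTF_8).length;
        if (size > maxBytes) {
            throw new IllegalArgumentException("Document exceeds the batch size");
        }
        // Sum as long so the comparison cannot overflow for large limits.
        if ((long) pendingBytes + size > maxBytes) {
            flush();
        }
        pending.add(json);
        pendingBytes += size;
    }

    /** Hands the pending documents to the handler and starts a new batch. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        flushHandler.accept(new ArrayList<>(pending));
        pending.clear();
        pendingBytes = 0;
    }
}
```

In this sketch a final explicit `flush()` is needed to send whatever remains in the last partial batch. The sample's `customService.close()` call is presumably where the client does the equivalent, though that is an assumption rather than something this diff confirms.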