feat(connectors): implement Redshift Sink Connector with S3 staging (#2540) #2557
base: master
Conversation
Implements Issue apache#2540 - Redshift Sink Connector with S3 staging support.

Features:

- S3 staging with automatic CSV file upload
- Redshift COPY command execution via the PostgreSQL wire protocol
- IAM role authentication (recommended) or access key credentials
- Configurable batch size and compression (gzip, lzop, bzip2, zstd)
- Automatic table creation with customizable schema
- Retry logic with exponential backoff for transient failures
- Automatic cleanup of staged S3 files

Configuration options:

- `connection_string`: Redshift cluster connection URL
- `target_table`: Destination table name
- `iam_role`: IAM role ARN for S3 access (recommended)
- `s3_bucket`/`s3_region`/`s3_prefix`: S3 staging location
- `batch_size`: Messages per batch (default: 10000)
- `compression`: COPY compression format
- `delete_staged_files`: Auto-cleanup toggle (default: true)
- `auto_create_table`: Create table if missing (default: true)

Closes apache#2540
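For orientation, here is a minimal sketch of the staging-and-COPY flow described above, assuming the rust-s3 (`s3`) and `sqlx` crates; the function name, column layout, S3 key format, and SQL text are illustrative, not the connector's actual API.

```rust
// Illustrative only: batch -> CSV -> S3 staging -> Redshift COPY -> cleanup.
use s3::{creds::Credentials, Bucket, Region};
use sqlx::postgres::PgPoolOptions;

async fn stage_and_copy(
    messages: &[(u64, String)], // (offset, payload) pairs; layout is assumed
    connection_string: &str,    // Redshift cluster URL (PostgreSQL wire protocol)
    iam_role: &str,             // IAM role ARN used by COPY to read from S3
    s3_bucket: &str,
    s3_region: &str,
    target_table: &str,
) -> Result<(), Box<dyn std::error::Error>> {
    // 1. Serialize the batch as CSV, doubling quotes in the payload.
    let mut csv = String::new();
    for (offset, payload) in messages {
        csv.push_str(&format!("{offset},\"{}\"\n", payload.replace('"', "\"\"")));
    }

    // 2. Upload the CSV to the S3 staging location.
    let bucket = Bucket::new(s3_bucket, s3_region.parse()?, Credentials::default()?)?;
    let s3_key = format!("staging/batch-{}.csv", chrono::Utc::now().timestamp_millis());
    bucket.put_object(&s3_key, csv.as_bytes()).await?;

    // 3. Issue the COPY command over the PostgreSQL wire protocol.
    let pool = PgPoolOptions::new().connect(connection_string).await?;
    let copy_sql = format!(
        "COPY {target_table} FROM 's3://{s3_bucket}/{s3_key}' \
         IAM_ROLE '{iam_role}' FORMAT AS CSV"
    );
    sqlx::query(&copy_sql).execute(&pool).await?;

    // 4. Clean up the staged file (the delete_staged_files = true case).
    bucket.delete_object(&s3_key).await?;
    Ok(())
}
```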
- Fix markdown lint issues in README.md (table formatting, blank lines, code fence language)
- Fix trailing newline in Cargo.toml
- Apply TOML formatting via taplo
- Add missing dependencies to DEPENDENCIES.md (rust-s3, rxml, rxml_validation, static_assertions)
Please write a true integration e2e test, similar to the Postgres one.
Ok, I'll update.
- Add Redshift sink integration test using PostgreSQL (Redshift-compatible) and LocalStack for S3
- Add `s3_endpoint` config option to support custom endpoints (LocalStack, MinIO)
- Add path-style S3 access for custom endpoints
- Add `localstack` feature to testcontainers-modules
- Create test configuration files for the Redshift connector
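A rough sketch of what pointing the uploader at a custom endpoint such as LocalStack can look like with rust-s3; the endpoint URL, bucket name, object key, and credentials below are placeholders, not values from the test setup.

```rust
// Sketch: upload to LocalStack via a custom endpoint with path-style addressing.
use s3::{creds::Credentials, Bucket, Region};

async fn upload_to_localstack(csv: &[u8]) -> Result<(), Box<dyn std::error::Error>> {
    let region = Region::Custom {
        region: "us-east-1".to_string(),
        endpoint: "http://localhost:4566".to_string(), // LocalStack default edge port
    };
    let credentials = Credentials::new(Some("test"), Some("test"), None, None, None)?;
    // Custom endpoints generally require path-style addressing
    // (http://host/bucket/key) rather than virtual-hosted-style.
    let bucket = Bucket::new("iggy-staging", region, credentials)?.with_path_style();
    bucket.put_object("staging/test.csv", csv).await?;
    Ok(())
}
```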
- Add `s3_endpoint: None` to `test_config()` in lib.rs (fixes E0063)
- Add endpoint parameter to S3Uploader tests in s3.rs
- Fix formatting for long line in `init_s3_uploader()`
- Add `iggy_connector_redshift_sink` to DEPENDENCIES.md
- Add maybe-async, md5, minidom to DEPENDENCIES.md
Pull request overview
This pull request implements a Redshift Sink Connector that loads data from Iggy streams into Amazon Redshift using S3 staging. The connector uploads batches of messages as CSV files to S3, executes Redshift COPY commands to load the data, and optionally cleans up staged files.
Changes:
- Adds new Redshift sink connector with S3 staging support, IAM/access key authentication, and configurable batching
- Implements retry logic with exponential backoff and automatic table creation (a minimal sketch of the backoff pattern follows this list)
- Adds integration tests using PostgreSQL as a Redshift simulator and LocalStack for S3
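A minimal sketch of the exponential-backoff pattern referenced above; the attempt count and delays are illustrative, not the connector's defaults.

```rust
// Generic retry-with-exponential-backoff helper; retries the operation,
// doubling the delay between attempts, until it succeeds or attempts run out.
use std::time::Duration;

async fn with_retries<T, E, F, Fut>(max_attempts: u32, mut op: F) -> Result<T, E>
where
    F: FnMut() -> Fut,
    Fut: std::future::Future<Output = Result<T, E>>,
{
    let mut delay = Duration::from_millis(500);
    let mut attempt = 1;
    loop {
        match op().await {
            Ok(value) => return Ok(value),
            Err(err) if attempt >= max_attempts => return Err(err),
            Err(_) => {
                tokio::time::sleep(delay).await;
                delay *= 2; // exponential backoff
                attempt += 1;
            }
        }
    }
}
```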
Reviewed changes
Copilot reviewed 13 out of 14 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| core/connectors/sinks/redshift_sink/src/lib.rs | Main connector implementation with batch processing, CSV generation, and COPY command execution |
| core/connectors/sinks/redshift_sink/src/config.rs | Configuration structure with validation for credentials, compression formats, and required fields |
| core/connectors/sinks/redshift_sink/src/s3.rs | S3 uploader for staging CSV files with path-style and virtual-hosted-style support |
| core/connectors/sinks/redshift_sink/Cargo.toml | Package definition with dependencies on sqlx, rust-s3, and chrono |
| core/connectors/sinks/redshift_sink/README.md | Comprehensive documentation with configuration reference and examples |
| core/integration/tests/connectors/redshift/mod.rs | Integration test setup with PostgreSQL and LocalStack containers |
| core/integration/tests/connectors/redshift/redshift_sink.rs | Basic integration test case |
| Cargo.toml | Adds redshift_sink to workspace members |
| DEPENDENCIES.md | Updates dependency list with new packages |
```rust
if self.config.delete_staged_files.unwrap_or(true)
    && let Err(e) = s3_uploader.delete_file(&s3_key).await
{
    warn!("Failed to delete staged file {}: {}", s3_key, e);
}

copy_result?;
```
Copilot AI, Jan 15, 2026
The cleanup logic has a critical flaw. If execute_copy fails, the function returns early at line 204 (copy_result?) without attempting to delete the staged S3 file. This will leave orphaned files in S3 when COPY operations fail. The cleanup should occur regardless of the COPY command result, or at minimum should be attempted even on failure to avoid accumulating staging files.
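A minimal sketch of the suggested ordering, reusing the names from the excerpt above; the `execute_copy` call and its arguments are assumptions, not the actual method signature.

```rust
// Sketch only: run the COPY, then always attempt cleanup before propagating
// the COPY result, so a failed COPY does not leave orphaned files in S3.
let copy_result = self.execute_copy(&s3_key).await;

if self.config.delete_staged_files.unwrap_or(true)
    && let Err(e) = s3_uploader.delete_file(&s3_key).await
{
    warn!("Failed to delete staged file {}: {}", s3_key, e);
}

copy_result?;
```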
```rust
// Validate compression if specified
if let Some(compression) = &self.compression {
    let valid = ["gzip", "lzop", "bzip2", "none", "zstd"];
```
Copilot AI, Jan 15, 2026
Redshift COPY command does not support 'zstd' compression format. According to AWS documentation, Redshift COPY only supports GZIP, LZOP, BZIP2, and ZSTD is not listed as a valid compression option. This will cause runtime errors when users specify 'zstd' compression. Remove 'zstd' from the valid compression formats list.
```diff
-let valid = ["gzip", "lzop", "bzip2", "none", "zstd"];
+let valid = ["gzip", "lzop", "bzip2", "none"];
```
| `csv_delimiter` | Char | No | `,` | CSV field delimiter |
| `csv_quote` | Char | No | `"` | CSV quote character |
| `max_errors` | Integer | No | `0` | Max errors before COPY fails |
| `compression` | String | No | `none` | Compression: `gzip`, `lzop`, `bzip2`, `zstd` |
Copilot AI, Jan 15, 2026
The documentation lists 'zstd' as a supported compression format, but Redshift COPY does not support ZSTD compression. This is inconsistent with AWS Redshift documentation and will mislead users. Remove 'zstd' from the supported compression formats list.
```diff
-| `compression` | String | No | `none` | Compression: `gzip`, `lzop`, `bzip2`, `zstd` |
+| `compression` | String | No | `none` | Compression: `gzip`, `lzop`, `bzip2` |
```
```rust
// Create the bucket via LocalStack S3 API using path-style URL
let client = reqwest::Client::new();
let create_bucket_url = format!("{s3_endpoint}/{bucket_name}");
let _ = client.put(&create_bucket_url).send().await;
```
Copilot AI, Jan 15, 2026
The bucket creation result is silently ignored with `let _ =`. If bucket creation fails, subsequent tests will fail in confusing ways. The error should be handled explicitly with `.expect()` to provide clear test failure messages, similar to how other container setup failures are handled in this function.
```diff
-let _ = client.put(&create_bucket_url).send().await;
+client
+    .put(&create_bucket_url)
+    .send()
+    .await
+    .expect("Failed to create S3 bucket in LocalStack");
```
```rust
};

// Escape quotes in payload
let escaped_payload = payload_str.replace(quote, &format!("{quote}{quote}"));
```
Copilot AI, Jan 15, 2026
Creating a new String via the `format!` macro for every message is inefficient. Since `quote` is always a single character, this can be optimized to use a string literal or a pre-allocated string. Consider using a match statement or caching the escaped quote string.
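One possible shape of that optimization, hoisting the allocation out of the per-message loop; the function, its signature, and the surrounding CSV building are hypothetical, inferred from the excerpt above.

```rust
// Sketch: compute the escaped quote once per batch instead of once per message.
fn to_csv_column(payloads: &[String], quote: char) -> String {
    let escaped_quote = format!("{quote}{quote}"); // allocated once per batch
    let mut csv = String::new();
    for payload in payloads {
        let escaped_payload = payload.replace(quote, &escaped_quote);
        csv.push_str(&format!("{quote}{escaped_payload}{quote}\n"));
    }
    csv
}
```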
Critical fixes:

- Change Rust edition from 2024 to 2021 in Cargo.toml
- Fix S3 cleanup to happen regardless of COPY result (prevents orphaned files)

Moderate fixes:

- Remove zstd from valid compression options (not supported by Redshift)
- Update README to remove zstd from compression list
- Handle bucket creation error in integration tests with `expect()`
- Log JSON serialization errors instead of silent `unwrap_or_default()`

Performance:

- Cache escaped quote string to avoid repeated `format!` allocations

Windows compatibility (for local testing):

- Add `#[cfg(unix)]` conditionals for Unix-specific code in sender/mod.rs
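For context, `#[cfg(unix)]` gating typically looks like the following; this is a generic illustration, not the actual sender/mod.rs change.

```rust
// Hypothetical example of compiling Unix-only code conditionally so the
// crate still builds on non-Unix targets such as Windows.
#[cfg(unix)]
use std::os::unix::fs::PermissionsExt;

fn is_executable(metadata: &std::fs::Metadata) -> bool {
    #[cfg(unix)]
    {
        metadata.permissions().mode() & 0o111 != 0
    }
    #[cfg(not(unix))]
    {
        let _ = metadata;
        false // no Unix permission bits on non-Unix targets
    }
}
```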
I'd greatly appreciate actual human interaction in this PR; otherwise, I'm closing it.
Fixes a clippy warning about the unused `runtime` field in the test setup struct. The field is kept for future test expansion.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Summary
Implements Issue #2540 - Redshift Sink Connector with S3 staging support.
Features

- S3 staging with automatic CSV file upload
- Redshift COPY command execution via the PostgreSQL wire protocol
- IAM role authentication (recommended) or access key credentials
- Configurable batch size and compression
- Automatic table creation with customizable schema
- Retry logic with exponential backoff for transient failures
- Automatic cleanup of staged S3 files

Configuration Options
- `connection_string`
- `target_table`
- `iam_role`
- `s3_bucket`
- `s3_region`
- `batch_size`
- `compression`
- `delete_staged_files`
- `auto_create_table`

*Either `iam_role` or `aws_access_key_id` + `aws_secret_access_key` is required.

Testing

- Lints pass with `-D warnings`

Closes #2540