Document EventBridge routing architecture and Lambda message processing #14
Open: drernie wants to merge 7 commits into main from eventbridge-routing
Customer (FL109) followed EventBridge docs but package creation not working. Analysis shows:

- EventBridge rule firing correctly
- File indexing works
- Package indexing broken
- Input transformer missing (sending raw CloudTrail format)
- PackagerQueue not subscribed to SNS

Documentation gaps identified for testing and resolution.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Based on customer interaction transcript:

- No Input Transformer was added (confirmed @ 52:20)
- PackagerQueue subscriptions handled automatically by Quilt
- SNS policy fix (events.amazonaws.com) was the only change needed
- Quilt processes raw CloudTrail events natively

Test plan created for quilt-staging environment:

- Bucket: aneesh-test-service (us-east-1)
- Tests EventBridge → SNS → SQS without Input Transformer
- Verifies SNS policy is the critical configuration
- Captures actual event format for documentation

Ready to execute test to confirm findings.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
## Summary

Customer's S3 events weren't reaching Quilt indexer via EventBridge. Root cause: EventBridge rule 'cloudtrail-to-sns' was disabled.

## Solution

Enabled the rule with one command:

```bash
aws events enable-rule --name cloudtrail-to-sns --region us-east-1
```

## Test Results

- ✅ EventBridge rule triggered: 1 event
- ✅ SNS published: 1 message
- ✅ SQS received and processed successfully

## Key Findings

- CloudTrail→EventBridge integration is automatic with event selectors
- Infrastructure was already correctly configured
- Always check rule states before investigating complex issues

## Changes

- Add SUCCESS-REPORT.md with complete resolution details
- Add config-quilt-eventbridge-test.toml (working configuration)
- Reorganize folder: backup-policies/, test-artifacts/, obsolete-reports/
- Rewrite README.md for concise, standalone reference
- Archive superseded investigation documents

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Files renamed to show investigation progression:

- 01-customer-issue-summary.md (2025-12-29 11:00)
- 02-local-test-setup.md (2025-12-29 11:01)
- 03-test-plan-staging.md (2025-12-29 11:08)
- 04-config-quilt-eventbridge-test.toml (2025-12-29 12:28)
- 05-ACTION-ITEMS.md (2025-12-29 12:43)
- 06-SUCCESS-REPORT.md (2025-12-29 13:03) - Initial fix (enabled rule)
- 07-README.md (2025-12-29 21:18) - Complete fix summary
- 08-FAILURE_REPORT.md (2025-12-29 22:17) - Deep dive into Lambda issue
- 09-documented-steps.md (2025-12-29 22:18) - Public documentation

Chronological order shows:

1. Customer report & initial investigation
2. Test planning & execution
3. First success (enabling EventBridge rule)
4. Discovery of deeper Lambda compatibility issue
5. Documentation of all findings
Major updates:

- 07-README.md: Remove false 'RESOLVED' status, clarify 3-layer problem
- 10-input-transformer-hypothesis.md: Complete analysis of Input Transformers

Key findings:

- Infrastructure fixes complete (EventBridge + SNS subscriptions)
- Application issue identified: ManifestIndexer lacks SNS unwrapping
- Input Transformers transform BEFORE SNS wrapping (insufficient alone)
- Lambda code fix required in Platform 1.66+

Lessons learned:

1. Metrics ≠ end-to-end success (intermediate success, final failure)
2. Input Transformers cannot eliminate SNS wrapping layer
3. Test with real workflows (package creation), not synthetic events
4. Two event sources caused flaky testing (S3 direct + EventBridge)
5. All Lambdas need consistent SNS message handling

Documentation includes:

- Complete Lambda code audit (4 Lambdas analyzed)
- Rigorous testing strategy (5 tests with isolation requirements)
- S3 Event Notification management (when/how to disable)
- Version-specific behavior (≤1.65 vs ≥1.66)
- Production deployment guidance with rollback plans
Comprehensive documentation of EventBridge → SNS → SQS → Lambda flow, replacing hypothesis-focused analysis with architectural guide.

Key content:

- Message transformation chain and SNS wrapping behavior
- Lambda processing patterns from code review (4 Lambda types)
- Two event sources problem and testing isolation requirements
- Three solution approaches with version-specific guidance
- Complete testing strategy with common mistakes to avoid
- S3 Event Notification management and production deployment

Core findings:

- ManifestIndexer (≤1.65) crashes due to missing SNS unwrapping
- Input Transformers help SearchHandler but don't solve SNS issue
- Testing requires isolating event sources to avoid false positives

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Summary
Comprehensive documentation of EventBridge → SNS → SQS → Lambda routing architecture, replacing hypothesis-focused analysis with an architectural guide focused on the core problem.
Key Changes:
10-input-transformer-hypothesis.md → 10-eventbridge-routing.md

Core Documentation Sections
Message Flow Architecture
Lambda Processing Patterns
Analyzed 4 Lambda types from actual code:
The Two Event Sources Problem
Why testing was flaky: production environments may have both:
Critical insight: Files appeared in search via direct S3 notifications, masking EventBridge routing failures.
Solutions & Approaches
Solution 1: Fix Lambda Code (Recommended)
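As a rough illustration of what "SNS unwrapping" means here, the sketch below parses an SQS record body and, if it looks like an SNS notification envelope, returns the JSON carried in its `Message` field. The function names `unwrap_sns` and `handler` are hypothetical and are not taken from the Quilt Lambda code; the actual fix in the platform may look different. The sketch also assumes the inner message is itself JSON, which holds for EventBridge events but not for arbitrary SNS payloads.

```python
import json


def unwrap_sns(sqs_body: str) -> dict:
    """Return the inner event carried by an SQS record body.

    Hypothetical helper: if the body is an SNS notification envelope,
    decode and return its "Message" payload; otherwise return the
    parsed body unchanged (for example, with raw message delivery).
    """
    payload = json.loads(sqs_body)
    # SNS envelopes delivered to SQS carry "Type": "Notification" and
    # the original message as a JSON-encoded string under "Message".
    if (
        isinstance(payload, dict)
        and payload.get("Type") == "Notification"
        and "Message" in payload
    ):
        # Assumption: the inner message is JSON (true for EventBridge events).
        return json.loads(payload["Message"])
    return payload


def handler(event, context):
    """Hypothetical SQS-triggered Lambda entry point."""
    return [unwrap_sns(record["body"]) for record in event["Records"]]
```

Passing through bodies that are not SNS envelopes keeps the same handler usable whether or not the queue sits behind an SNS topic.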
Solution 2: Input Transformers (Limited Use)
Solution 3: Dual Format Support (Comprehensive)
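One way to picture dual format support is a small normalizer that accepts either event shape and emits the same record. The sketch below is an assumption about what the two formats are: a direct S3 event notification (`Records[].s3`) and a CloudTrail API call delivered by EventBridge (`detail.requestParameters`). The function name `extract_objects` and the output keys are hypothetical and do not come from the Quilt code.

```python
from urllib.parse import unquote_plus


def extract_objects(event: dict) -> list:
    """Normalize either event shape to a list of {bucket, key, event_name}.

    Hypothetical helper; returns an empty list for unrecognized events.
    """
    # Shape 1: direct S3 event notification. Object keys arrive URL-encoded.
    if "Records" in event:
        return [
            {
                "bucket": r["s3"]["bucket"]["name"],
                "key": unquote_plus(r["s3"]["object"]["key"]),
                "event_name": r["eventName"],
            }
            for r in event["Records"]
            if "s3" in r
        ]
    # Shape 2: CloudTrail API call delivered via EventBridge.
    detail = event.get("detail") or {}
    params = detail.get("requestParameters") or {}
    if "bucketName" in params and "key" in params:
        return [
            {
                "bucket": params["bucketName"],
                "key": params["key"],
                "event_name": detail.get("eventName"),
            }
        ]
    return []
```

Note that the two sources name events differently (for example `ObjectCreated:Put` versus `PutObject`), so downstream code would still need to map event names if it branches on them.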
Testing Strategy
Critical Testing Principle: ALWAYS isolate event sources to avoid false positives
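A minimal sketch of one isolation check, assuming the second event source is a direct S3 Event Notification on the bucket: given the dictionary returned by boto3's `get_bucket_notification_configuration`, list any direct targets that could deliver events independently of EventBridge. The helper name `direct_notification_targets` is hypothetical.

```python
def direct_notification_targets(notification_config: dict) -> list:
    """List ARNs that receive S3 events directly from the bucket.

    Hypothetical helper. A non-empty result means an EventBridge test
    on this bucket is not isolated: events may also arrive directly.
    """
    # Section name in the boto3 response -> field holding the target ARN.
    arn_fields = {
        "TopicConfigurations": "TopicArn",
        "QueueConfigurations": "QueueArn",
        "LambdaFunctionConfigurations": "LambdaFunctionArn",
    }
    targets = []
    for section, arn_field in arn_fields.items():
        for config in notification_config.get(section, []):
            targets.append(config[arn_field])
    return targets


# Usage against a real bucket (requires AWS credentials; the bucket
# name below is a placeholder):
#   import boto3
#   cfg = boto3.client("s3").get_bucket_notification_configuration(
#       Bucket="example-bucket")
#   print(direct_notification_targets(cfg))
```

If the list is non-empty, a file showing up in search does not prove the EventBridge path works, since the direct notification may have delivered it.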
4 Test Scenarios:
Common Testing Mistakes:
Production Deployment
Version-Specific Recommendations:
Impact
This documentation:
Related Issues
🤖 Generated with Claude Code