
Conversation

@moizpgedge
Contributor

@moizpgedge moizpgedge commented Jan 9, 2026

Summary

This PR improves the stability of the Postgres image test flow by fixing runner/architecture config issues, cleaning up an invalid config, updating build/test dependencies, and making the Postgres readiness check more reliable.

Changes

  1. Fix the GitHub Actions runner labels and architecture names so that each requested architecture maps to a valid runner
  2. Remove invalid lolor.node configuration
  3. Update dependencies and adjust the Makefile accordingly
  4. Improve Postgres readiness checks to reduce flakiness and keep the container stable during tests

Testing

  1. Verified workflow runs on the intended runner/arch setup
  2. Confirmed Postgres readiness behavior is more reliable during container startup

Summary by CodeRabbit

  • Documentation

    • Added comprehensive testing docs to README with local test instructions, limitations, and multi-architecture guidance; added a test badge.
  • Tests

    • Improved CI test workflow to better map and validate architectures, produce clearer structured output, and abort on unknown architectures.
    • Refactored test runner for modular phases, stabilized container handling, clearer phase/output formatting, and updated Go tooling directive.
  • Chores

    • Minor Makefile cleanup.


@moizpgedge moizpgedge force-pushed the Feat/PLAT-362/Adding-tests-to-postgres-images-project branch from 2b1097d to 49271dc on January 9, 2026 10:38
Copilot AI left a comment

Pull request overview

This PR refactors the Postgres image test suite to improve reliability and code organization. The changes fix GitHub Actions runner configuration issues, remove invalid Postgres settings, and enhance the robustness of container readiness checks.

Key changes:

  • Refactored test execution flow by extracting helper functions for improved maintainability
  • Updated GitHub Actions workflow to correctly map architecture names to runner labels
  • Enhanced Postgres readiness checks with improved connection verification and stabilization logic

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
|---|---|
| tests/main.go | Major refactoring: extracted functions for parsing flags, printing headers/summaries, running tests, and waiting for containers; improved Postgres readiness verification; removed invalid lolor.node configuration |
| tests/go.mod | Simplified Go version configuration by consolidating go directive and toolchain |
| Makefile | Removed trailing newline |
| .github/workflows/test_images.yaml | Fixed architecture naming and runner label mapping to support both user-friendly and canonical architecture names |


tests/main.go Outdated
// Then verify we can actually connect and query
exitCode, _, err := r.exec("psql -U postgres -d testdb -t -A -c 'SELECT 1'")
if err == nil && exitCode == 0 {
// Give PostgreSQL a moment to fully stabilize
Copilot AI Jan 9, 2026

While the comment explains the sleep, it should clarify why a 2-second delay is necessary beyond the successful query execution. Consider documenting what PostgreSQL state might not be fully stable after a successful query, or reference specific initialization steps that may still be in progress.

Suggested change
- // Give PostgreSQL a moment to fully stabilize
+ // Give PostgreSQL a short grace period even after a successful readiness check.
+ // Although pg_isready and a trivial SELECT can succeed, background workers,
+ // extensions, and internal caches may still be initializing. Waiting a fixed
+ // 2 seconds helps ensure a stable database state and reduces test flakiness
+ // in subsequent operations that depend on a fully-initialized instance.

Member

@maqeel75 maqeel75 left a comment

LGTM

@moizpgedge moizpgedge changed the title from "Feat/plat 362/adding tests to postgres images project" to "Adding tests to postgres images project" on Jan 9, 2026
@moizpgedge moizpgedge requested a review from Copilot January 9, 2026 14:14
Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.



tests/main.go Outdated
Comment on lines 420 to 424
// Give PostgreSQL a short grace period even after a successful readiness check.
// Although pg_isready and a trivial SELECT can succeed, background workers,
// extensions, and internal caches may still be initializing. Waiting a fixed
// 2 seconds helps ensure a stable database state and reduces test flakiness
// in subsequent operations that depend on a fully-initialized instance.
Copilot AI Jan 9, 2026

The 2-second sleep is a magic number hardcoded in the comment. Consider making this duration configurable or defining it as a named constant (e.g., 'postgresStabilizationPeriod') to improve maintainability and make the value easier to tune if needed.
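A minimal sketch of the named-constant approach suggested here. The constant name `postgresStabilizationPeriod` and the 2-second value come from this thread; the helper `waitForStabilization` is a hypothetical name used only for illustration and is not taken from tests/main.go.

```go
package main

import "time"

// postgresStabilizationPeriod is the grace period applied after the readiness
// checks succeed. Keeping it in one named constant makes the value easy to
// find and tune.
const postgresStabilizationPeriod = 2 * time.Second

// waitForStabilization (hypothetical helper) blocks for the grace period.
// Callers no longer repeat the literal "2 seconds" in code or comments.
func waitForStabilization() {
	time.Sleep(postgresStabilizationPeriod)
}
```

If the delay ever needs to differ between local runs and CI, the constant could be replaced by a flag or environment variable without touching the call sites.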

  required: true
  architectures:
- description: "Comma-separated list of architectures to test (amd64,arm64)"
+ description: "Comma-separated list of architectures to test (x86,arm)"
Copilot AI Jan 9, 2026

The description mentions 'x86,arm' but the code also accepts 'amd64,arm64' as valid inputs (lines 68-71). Update the description to reflect all accepted values, e.g., 'Comma-separated list of architectures to test (x86/amd64, arm/arm64)'.

Suggested change
- description: "Comma-separated list of architectures to test (x86,arm)"
+ description: "Comma-separated list of architectures to test (x86/amd64, arm/arm64)"

@moizpgedge moizpgedge requested a review from Copilot January 9, 2026 14:22
Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.



@moizpgedge moizpgedge requested a review from Copilot January 9, 2026 14:24
Copilot AI left a comment

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.



- Use correct runner label and architecture names
- Remove invalid lolor.node config
- Update dependencies and Makefile
- Improve Postgres readiness check and container stability

docs: improve test documentation and fix duplicate summary heading

- Enhance PostgreSQL stabilization delay comment with detailed explanation
- Remove duplicate markdown heading in GitHub Actions summary step

refactor: extract magic number to constant and improve workflow docs

- Extract 2-second sleep to postgresStabilizationPeriod constant for better maintainability
- Update architecture description to include all accepted values (x86/amd64, arm/arm64)

fix: rename summary job to prevent duplicate heading in GitHub Actions UI

- Rename job from 'summary' to 'results-summary' to avoid conflict
- Simplify step name from 'Test Summary' to 'Summary'
- Fixes duplicate 'summary summary' display in GitHub Actions summary section

fix: revert architecture description to x86,arm format

The code accepts x86/amd64 and arm/arm64 internally, but the description
should only show the primary input values (x86,arm) to avoid confusion.

fix: change step name to avoid duplicate heading and add testing docs

- Change step name from 'Summary' to 'Output' to prevent duplication with job name
- Add comprehensive Testing section to README with local and CI/CD instructions
- Document architecture limitations for local testing

fix: rename job from results-summary to results to remove summary word
@moizpgedge moizpgedge force-pushed the Feat/PLAT-362/Adding-tests-to-postgres-images-project branch from 2288af5 to 519fcb5 on January 9, 2026 14:50
@coderabbitai

coderabbitai bot commented Jan 9, 2026

📝 Walkthrough

Refactors the Go test runner into modular phases, adds Patroni-specific tests and Docker stabilization logic, updates CI workflow architecture mapping and output formatting, and expands README testing documentation.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| CI/CD Workflow | .github/workflows/test_images.yaml | Architecture renaming/mapping: arm/arm64 → runner ubuntu-24.04-arm with arch_display: arm; x86/amd64 → runner ubuntu-latest with arch_display: x86; unknown arch now errors. Matrix entries use arch_display. "Test Summary" step renamed to "Output" and now prints a structured table and pass/fail message. |
| Test Runner Refactoring | tests/main.go | Major rewrite: CLI flags, Docker client setup, stabilization delay, modular helpers (printHeader, printPhaseHeader, runEntrypointTests, runExtensionTests, printSummary), consolidated test builders (getPostgreSQLTests, getCommonExtensionTests, getStandardOnlyTests), Patroni test phase with container lifecycle helpers, and waitForContainerCommand utility. |
| Documentation | README.md | Added CodeRabbit PR badge; new "Testing" section with local and Go-based run instructions, limitations (architecture notes), and CI/CD testing parameters. |
| Build Configuration | Makefile | Minor whitespace normalization: removed an empty line after cd tests in test-image target. |
| Go Module Management | tests/go.mod | Go version directive updated from go 1.24.0 to go 1.24.11; removed explicit toolchain line. |

Sequence Diagram(s)

sequenceDiagram
  participant DevTool as Test CLI
  participant Docker as Docker Engine
  participant Container as Test Container (Postgres/Patroni)
  participant Runner as Test Routines
  participant Output as Console/GHA Output

  DevTool->>Docker: setupDockerClient()
  DevTool->>Docker: start Container(s) (image, config)
  Docker-->>Container: create & run
  Runner->>Container: waitForContainerCommand / readiness checks
  Container-->>Runner: readiness OK
  Runner->>Container: execute test commands
  Container-->>Runner: test outputs / exit codes
  Runner->>DevTool: aggregate results
  DevTool->>Output: print structured table and pass/fail

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I hopped through code to tidy the trails,
Laid out the tests, set steady sails,
Docker and Patroni, a synchronized dance,
Architectures mapped, each runner’s chance,
Hooray — the pipeline leaps and prevails! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title refers to the general area of adding tests but does not capture the core improvements: fixing runner/architecture config, updating dependencies, improving Postgres readiness checks, and refactoring test flow for stability. |


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @tests/main.go:
- Around line 351-357: The call to r.cli.ContainerExecInspect is ignoring its
error so inspectResp.ExitCode may default to 0 and produce a false success;
change the inspect call to capture the error (e.g., inspectResp, err :=
r.cli.ContainerExecInspect(r.ctx, execID.ID)), check if err != nil and return or
propagate a descriptive error (or log and treat as failure) before inspecting
ExitCode; ensure execResp.Close() still runs and that you only print success and
return nil when err == nil and inspectResp.ExitCode == 0.
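The fix described above can be sketched in isolation as follows. This is an illustration under stated assumptions, not the code in tests/main.go: `execInspect` is a hypothetical stand-in for the one field read from the Docker SDK's inspect response, and the `inspect` callback stands in for the call to `r.cli.ContainerExecInspect`, so the logic can be read and tested without a Docker daemon.

```go
package main

import (
	"errors"
	"fmt"
)

// errInspect marks failures of the inspect call itself, as opposed to a
// command that ran and exited non-zero. (Hypothetical sentinel.)
var errInspect = errors.New("exec inspect failed")

// execInspect is a hypothetical stand-in for the inspect response; only the
// exit code is modeled here.
type execInspect struct {
	ExitCode int
}

// checkExecResult reports success only when the inspect call succeeded AND
// the exit code is 0. A failed inspect leaves ExitCode at its zero value,
// so the error must be checked before the exit code is trusted.
func checkExecResult(inspect func() (execInspect, error)) error {
	resp, err := inspect()
	if err != nil {
		return fmt.Errorf("%w: %v", errInspect, err)
	}
	if resp.ExitCode != 0 {
		return fmt.Errorf("command exited with code %d", resp.ExitCode)
	}
	return nil
}
```

Wrapping the sentinel lets a caller use `errors.Is(err, errInspect)` to decide whether to retry the inspection or report a genuine command failure.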
🧹 Nitpick comments (4)
.github/workflows/test_images.yaml (1)

137-141: Consider handling additional result states.

The condition only checks for success. The needs.test.result can also be skipped or cancelled, which would currently show as failed.

♻️ Suggested improvement
-          if [[ "${{ needs.test.result }}" == "success" ]]; then
+          result="${{ needs.test.result }}"
+          if [[ "$result" == "success" ]]; then
             echo "✅ **All tests passed!**" >> $GITHUB_STEP_SUMMARY
+          elif [[ "$result" == "skipped" ]]; then
+            echo "⏭️ **Tests were skipped.**" >> $GITHUB_STEP_SUMMARY
+          elif [[ "$result" == "cancelled" ]]; then
+            echo "🚫 **Tests were cancelled.**" >> $GITHUB_STEP_SUMMARY
           else
             echo "❌ **Some tests failed.** Check the job logs for details." >> $GITHUB_STEP_SUMMARY
           fi
tests/main.go (3)

49-68: Consider closing the Docker client.

The Docker client returned by setupDockerClient() should be closed when done to release resources.

♻️ Suggested fix
 func main() {
 	image, flavor := parseFlags()
 
 	printHeader(image, flavor)
 
 	cli, ctx := setupDockerClient()
+	defer cli.Close()
+
 	defaultRunner := &DefaultEntrypointRunner{
 		cli:   cli,
 		ctx:   ctx,
 		image: image,
 	}

166-173: Minor: Avoid rebuilding test suite for summary.

buildTestSuite() is called again just to count tests. Consider passing the test count as a parameter or caching the test suite.
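One way to follow this suggestion is to build the suite once and hand the count to the summary function. The sketch below assumes a simplified `testCase` type and hypothetical helper names (`runSuite`, `formatSummary`); the real runner's types and output format may differ.

```go
package main

import "fmt"

// testCase is a hypothetical, minimal stand-in for the runner's test type.
type testCase struct {
	Name string
}

// runSuite executes every test once and returns how many failed.
// The run callback reports whether a single test passed.
func runSuite(suite []testCase, run func(testCase) bool) int {
	failed := 0
	for _, tc := range suite {
		if !run(tc) {
			failed++
		}
	}
	return failed
}

// formatSummary takes the total as a parameter, so the caller passes
// len(suite) from the slice it already built instead of rebuilding the suite.
func formatSummary(total, failed int) string {
	return fmt.Sprintf("%d/%d tests passed", total-failed, total)
}
```

A caller would keep the slice returned by the suite builder in a local variable, pass it to `runSuite`, and then call `formatSummary(len(suite), failed)`.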


308-311: Consider logging cleanup errors for debugging.

Errors from ContainerStop and ContainerRemove are silently ignored. While this is acceptable for best-effort cleanup, logging could help debug issues in CI.

♻️ Suggested improvement
 func (r *DefaultEntrypointRunner) cleanupContainer(containerID string) {
-	r.cli.ContainerStop(r.ctx, containerID, container.StopOptions{})
-	r.cli.ContainerRemove(r.ctx, containerID, container.RemoveOptions{})
+	if err := r.cli.ContainerStop(r.ctx, containerID, container.StopOptions{}); err != nil {
+		log.Printf("Warning: error stopping container: %v", err)
+	}
+	if err := r.cli.ContainerRemove(r.ctx, containerID, container.RemoveOptions{}); err != nil {
+		log.Printf("Warning: error removing container: %v", err)
+	}
 }
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0a75566 and 5e509d5.

📒 Files selected for processing (5)
  • .github/workflows/test_images.yaml
  • Makefile
  • README.md
  • tests/go.mod
  • tests/main.go
💤 Files with no reviewable changes (1)
  • Makefile
🧰 Additional context used
🪛 LanguageTool
README.md

[uncategorized] ~162-~162: The official name of this software platform is spelled with a capital “H”.
Context: ...D Testing The GitHub Actions workflow (.github/workflows/test_images.yaml) can be tri...

(GITHUB)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Codacy Static Code Analysis
🔇 Additional comments (8)
README.md (1)

117-166: Well-structured testing documentation.

The new Testing section provides comprehensive guidance for both local and CI/CD testing, including architecture limitations and workflow parameters. The documentation aligns well with the test runner implementation and workflow configuration.

tests/main.go (5)

18-23: Good use of named constant with documentation.

The postgresStabilizationPeriod constant is well-documented, explaining why the delay is necessary. This improves maintainability over a magic number.


418-439: Robust readiness check with stabilization.

The two-phase readiness check (pg_isready followed by an actual query) combined with the stabilization period is a solid approach to reduce test flakiness.
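For readers without the diff at hand, the two-phase check can be sketched roughly as below. The `psql` command is the one quoted earlier in this thread; the exact `pg_isready` invocation, the `execFunc` type, and the function names are assumptions made for the sketch rather than the runner's actual code.

```go
package main

import (
	"fmt"
	"time"
)

// execFunc runs a shell command inside the container. It is a hypothetical
// type mirroring the (exit code, output, error) shape of r.exec.
type execFunc func(cmd string) (exitCode int, output string, err error)

// commandSucceeds reports whether cmd ran without error and exited with 0.
func commandSucceeds(exec execFunc, cmd string) bool {
	code, _, err := exec(cmd)
	return err == nil && code == 0
}

// waitForPostgres retries until both phases pass: first pg_isready, then a
// real query. After success it waits for the grace period before returning.
func waitForPostgres(exec execFunc, attempts int, interval, grace time.Duration) error {
	for i := 0; i < attempts; i++ {
		if commandSucceeds(exec, "pg_isready -U postgres") && // assumed invocation
			commandSucceeds(exec, "psql -U postgres -d testdb -t -A -c 'SELECT 1'") {
			time.Sleep(grace) // stabilization period
			return nil
		}
		time.Sleep(interval)
	}
	return fmt.Errorf("postgres not ready after %d attempts", attempts)
}
```

The query phase only runs once `pg_isready` has passed, so early attempts stay cheap while the server is still starting.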


458-466: Good defensive check for container state.

Checking if the container is still running before executing commands provides clearer error messages when containers crash unexpectedly during tests.
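As an illustration of this kind of guard, the sketch below checks a container's state before any command is executed in it. `containerState` and `ensureRunning` are hypothetical names; in the real runner the state would come from the Docker client's container inspection.

```go
package main

import "fmt"

// containerState is a hypothetical stand-in for the fields of interest in a
// container inspection result.
type containerState struct {
	Running bool
	Status  string // for example "running" or "exited"
}

// ensureRunning returns a descriptive error when the container cannot accept
// exec calls, so a crash surfaces as a clear message instead of a cryptic
// exec failure.
func ensureRunning(inspect func() (containerState, error)) error {
	state, err := inspect()
	if err != nil {
		return fmt.Errorf("inspect container: %w", err)
	}
	if !state.Running {
		return fmt.Errorf("container is not running (status: %s)", state.Status)
	}
	return nil
}
```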


530-536: Clean modular test suite construction.

Breaking tests into getPostgreSQLTests(), getCommonExtensionTests(), and getStandardOnlyTests() improves readability and maintainability.


693-698: LGTM.

The expectSuccess helper provides a clean default for tests that only need to verify the command succeeded without checking output.
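A possible shape for such a helper is sketched below. Only the name `expectSuccess` and its purpose come from the review; the `validator` signature and the contrasting `expectOutput` helper are assumptions introduced for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// validator inspects a finished command and returns an error on failure.
// (Hypothetical signature.)
type validator func(exitCode int, output string) error

// expectSuccess passes whenever the command exited with code 0 and ignores
// the output entirely.
func expectSuccess(exitCode int, output string) error {
	if exitCode != 0 {
		return fmt.Errorf("command failed with exit code %d", exitCode)
	}
	return nil
}

// expectOutput (hypothetical, shown for contrast) additionally requires the
// trimmed output to equal want.
func expectOutput(want string) validator {
	return func(exitCode int, output string) error {
		if err := expectSuccess(exitCode, output); err != nil {
			return err
		}
		if got := strings.TrimSpace(output); got != want {
			return fmt.Errorf("unexpected output: got %q, want %q", got, want)
		}
		return nil
	}
}
```

Tests that only care about the exit status can use `expectSuccess` directly as their validator, while stricter tests compose on top of it.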

.github/workflows/test_images.yaml (1)

65-77: Good architecture mapping with backward compatibility.

The logic correctly handles both naming conventions (x86/amd64 and arm/arm64), fails fast for unknown values, and uses the valid ubuntu-24.04-arm runner label for ARM architecture.
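The mapping itself lives in the workflow file, presumably as shell logic; the Go sketch below only restates the rules summarized in the walkthrough (x86/amd64 → ubuntu-latest, arm/arm64 → ubuntu-24.04-arm, anything else is an error). The type and function names, and the trimming and lowercasing of input, are assumptions of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// runnerTarget pairs a GitHub Actions runner label with the display name
// used in the job matrix.
type runnerTarget struct {
	Runner      string
	ArchDisplay string
}

// mapArchitecture accepts both naming conventions and fails fast on
// anything it does not recognize.
func mapArchitecture(arch string) (runnerTarget, error) {
	switch strings.ToLower(strings.TrimSpace(arch)) { // normalization is an assumption
	case "x86", "amd64":
		return runnerTarget{Runner: "ubuntu-latest", ArchDisplay: "x86"}, nil
	case "arm", "arm64":
		return runnerTarget{Runner: "ubuntu-24.04-arm", ArchDisplay: "arm"}, nil
	default:
		return runnerTarget{}, fmt.Errorf("unknown architecture %q", arch)
	}
}

// mapArchitectures handles the comma-separated workflow input.
func mapArchitectures(list string) ([]runnerTarget, error) {
	var targets []runnerTarget
	for _, arch := range strings.Split(list, ",") {
		target, err := mapArchitecture(arch)
		if err != nil {
			return nil, err
		}
		targets = append(targets, target)
	}
	return targets, nil
}
```

Failing on the first unknown value, rather than skipping it, matches the "abort on unknown architectures" behavior described in the summary.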

tests/go.mod (1)

3-3: No issues found. Go version 1.24.11 is a valid security and stability release (December 2, 2025) and correctly matches the version specified in the GitHub Actions workflow (test_images.yaml).

- Capture and check error from ContainerExecInspect before checking ExitCode
- If inspect fails, retry instead of treating default ExitCode 0 as success
- Prevents false positive results when container exec inspection fails
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @tests/main.go:
- Around line 325-366: In DefaultEntrypointRunner.waitForContainerCommand, don’t
treat ExitCode as final immediately after ContainerExecAttach/Close; after
calling r.cli.ContainerExecInspect(execID.ID) check inspectResp.Running and, if
true, sleep interval and continue the loop (only treat inspectResp.ExitCode == 0
as success when Running is false). This mirrors the safer behavior in the exec
helper (or alternatively read the attach response until EOF before inspecting),
and prevents the race where ExitCode is uninitialized while the exec is still
running.
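The race described above can be shown with a simplified polling loop. This sketch isolates only the `Running` check for a single exec; it is not the structure of `waitForContainerCommand`, which may retry the command itself. `execState` and `waitForExec` are hypothetical names, and the `inspect` callback stands in for the Docker SDK's exec inspection.

```go
package main

import (
	"fmt"
	"time"
)

// execState is a hypothetical stand-in for the two fields read from an
// exec inspection result.
type execState struct {
	Running  bool
	ExitCode int
}

// waitForExec polls inspect until the exec has finished. ExitCode is only
// trusted once Running is false; while the exec is still running the exit
// code holds its zero value and would look like success.
func waitForExec(inspect func() (execState, error), attempts int, interval time.Duration) error {
	for i := 0; i < attempts; i++ {
		state, err := inspect()
		if err == nil && !state.Running {
			if state.ExitCode == 0 {
				return nil
			}
			return fmt.Errorf("command exited with code %d", state.ExitCode)
		}
		// Inspect failed or the exec is still running: wait and retry.
		time.Sleep(interval)
	}
	return fmt.Errorf("command did not finish after %d attempts", attempts)
}
```

The alternative mentioned in the comment, reading the attach stream until EOF before inspecting, avoids polling altogether at the cost of holding the connection open.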
🧹 Nitpick comments (1)
tests/main.go (1)

534-702: Excellent test suite modularization.

The split into modular test builders (getPostgreSQLTests, getCommonExtensionTests, getStandardOnlyTests) improves code organization and makes it easier to maintain and extend the test suite. The expectSuccess helper effectively reduces duplication for tests that only need to verify successful execution.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5e509d5 and 8235d8a.

📒 Files selected for processing (1)
  • tests/main.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Codacy Static Code Analysis
🔇 Additional comments (5)
tests/main.go (5)

18-23: Good addition for test stability.

The stabilization period constant is well-documented and helps address test flakiness by allowing background workers and extensions to fully initialize after basic readiness checks pass.


49-192: Excellent modular refactoring.

The restructured main function and new helper functions improve code organization and maintainability. The separation into parseFlags, setupDockerClient, runEntrypointTests, runExtensionTests, and printSummary creates a clear test execution pipeline.


422-443: Excellent defensive approach to readiness checking.

The two-stage verification (pg_isready + connection test) followed by the stabilization period is a robust solution for reducing test flakiness. The inline comments clearly explain why the grace period is necessary even after basic readiness checks pass.

This aligns well with the PR objectives to improve reliability of readiness checks.


462-471: Good defensive check for container health.

The container inspection before attempting exec prevents cryptic errors when the container has stopped or crashed. The descriptive error message including the container status will help with debugging test failures.


241-323: Well-structured Patroni testing implementation.

The modular approach with dedicated helper functions (createPatroniTestConfig, startPatroniContainer, cleanupContainer) makes the Patroni test flow clear and maintainable. The REST API health check using curl is appropriate for verifying Patroni initialization.

Note: The reliability of this test depends on addressing the race condition in waitForContainerCommand (see separate comment on lines 325-366).

@moizpgedge moizpgedge requested a review from Copilot January 9, 2026 18:36
Copilot AI left a comment

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.



"-c", "track_commit_timestamp=on",
"-c", "max_replication_slots=10",
"-c", "max_wal_senders=10",
"-c", "snowflake.node=1",
Copilot AI Jan 9, 2026

The comment on line 380 mentions that these arguments are passed to postgres and handled by the entrypoint, but the removed lolor.node=1 configuration suggests some settings may be invalid. Verify that snowflake.node=1 is a valid PostgreSQL configuration parameter, as non-standard parameters could cause startup failures.

Suggested change
- "-c", "snowflake.node=1",

3 participants