Address Security issues in Connector and Agent #4380

rithin-pullela-aws · 2025-10-30T22:38:58Z

Description

This PR addresses some security issues:
Connector:

Prevent 500 errors when connector parameters such as action type, method, and URL are null.
Prevent user input reflection in the connector protocol.
Prevent 500-level errors when connector credentials and backend roles are not valid.

Agent:

Prevent echoing an unknown agent ID in error responses.
Prevent echoing an unknown trace/message ID in the traces API.
Prevent a 500 error when the agent ID is unknown.

Related Issues

Resolves #[Issue number to be closed when this PR is merged]

Check List

New functionality includes testing.
New functionality has been documented.
API changes companion pull request created.
Commits are signed per the DCO using --signoff.
Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Summary by CodeRabbit

Bug Fixes
- Improved validation for connector credentials and backend roles to enforce proper data types.
- Enhanced error messaging when agents or messages cannot be retrieved.
Improvements
- Connector action parsing now handles missing fields more gracefully.
- Stricter protocol validation enforces correct connector configuration.

mingshl

LGTM

rithin-pullela-aws · 2025-10-31T00:18:48Z

The CI run failed because of an unrelated error:

REPRODUCE WITH: ./gradlew ':opensearch-ml-plugin:test' --tests 'org.opensearch.ml.action.prediction.PredictionITTests.testPredictionWithDataFrame_FitRCF' -Dtests.seed=29336B35042075 -Dtests.security.manager=false -Dtests.locale=en-MV -Dtests.timezone=Europe/Samara -Druntime.java=24

PredictionITTests > testPredictionWithDataFrame_FitRCF FAILED
    CircuitBreakingException[Disk Circuit Breaker is open, please check your resources!]
        at __randomizedtesting.SeedInfo.seed([29336B35042075:981D7B7DE6199531]:0)
        at app//org.opensearch.ml.utils.MLNodeUtils.checkOpenCircuitBreaker(MLNodeUtils.java:138)
        at app//org.opensearch.ml.task.MLTaskRunner.checkCBAndExecute(MLTaskRunner.java:157)
        at app//org.opensearch.ml.task.MLTaskRunner.lambda$dispatchTask$0(MLTaskRunner.java:116)
        at app//org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82)
        at app//org.opensearch.ml.task.MLTaskDispatcher.dispatchTaskWithRoundRobin(MLTaskDispatcher.java:99)
        at app//org.opensearch.ml.task.MLTaskDispatcher.dispatchTaskWithRoundRobin(MLTaskDispatcher.java:177)
        at app//org.opensearch.ml.task.MLTaskDispatcher.dispatch(MLTaskDispatcher.java:69)
        at app//org.opensearch.ml.task.MLTaskRunner.dispatchTask(MLTaskRunner.java:111)
        at app//org.opensearch.ml.task.MLTaskRunner.run(MLTaskRunner.java:94)
        at app//org.opensearch.ml.action.training.TransportTrainingTaskAction.doExecute(TransportTrainingTaskAction.java:42)
        at app//org.opensearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:220)
        at app//org.opensearch.action.support.TransportAction.execute(TransportAction.java:190)
        at app//org.opensearch.action.support.TransportAction.execute(TransportAction.java:109)
        at app//org.opensearch.transport.client.node.NodeClient.executeLocally(NodeClient.java:113)
        at app//org.opensearch.transport.client.node.NodeClient.doExecute(NodeClient.java:100)
        at app//org.opensearch.transport.client.support.AbstractClient.execute(AbstractClient.java:501)
        at app//org.opensearch.transport.client.FilterClient.doExecute(FilterClient.java:83)
        at app//org.opensearch.transport.client.support.AbstractClient.execute(AbstractClient.java:501)
        at app//org.opensearch.transport.client.support.AbstractClient.execute(AbstractClient.java:488)
        at app//org.opensearch.ml.action.MLCommonsIntegTestCase.trainModel(MLCommonsIntegTestCase.java:263)
        at app//org.opensearch.ml.action.MLCommonsIntegTestCase.trainBatchRCFWithDataFrame(MLCommonsIntegTestCase.java:233)
        at app//org.opensearch.ml.action.prediction.PredictionITTests.setUp(PredictionITTests.java:77)

common/src/main/java/org/opensearch/ml/common/transport/connector/MLCreateConnectorInput.java

.../src/test/java/org/opensearch/ml/common/transport/connector/MLCreateConnectorInputTests.java

Signed-off-by: rithin-pullela-aws <rithinp@amazon.com>

coderabbitai · 2025-12-04T22:40:21Z

Walkthrough

Protocol values in connector tests updated from "testProtocol" to "http". Null-tolerant parsing added to ConnectorAction. Robust validation introduced for connector credentials, backend roles, and protocol with strict type checking. Error messages simplified in connector and agent components.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Test Protocol Updates `client/src/test/java/org/opensearch/ml/client/MachineLearning*Test.java` | Protocol value changed from "testProtocol" to "http" in createConnector test cases |
| Null-Tolerant Connector Parsing `common/src/main/java/org/opensearch/ml/common/connector/ConnectorAction.java` | String fields now use `textOrNull()` instead of `text()`, allowing missing or null fields without throwing exceptions |
| Connector Input Validation `common/src/main/java/org/opensearch/ml/common/transport/connector/MLCreateConnectorInput.java` | Added robust object-based credential parsing with string validation, strict backend roles type checking (string/number/null only), and explicit protocol validation using `validateProtocol()` |
| Connector Input Tests `common/src/test/java/org/opensearch/ml/common/transport/connector/MLCreateConnectorInputTests.java` | Added two new test methods for validation error cases: `testParse_BackendRolesWithJsonObject_ShouldThrowException` and `testParse_CredentialWithJsonObject_ShouldThrowException`; updated null protocol error message |
| Error Message Simplification `memory/src/main/java/org/opensearch/ml/memory/index/InteractionsIndex.java`, `ml-algorithms/src/main/java/org/opensearch/ml/engine/algorithms/agent/MLAgentExecutor.java` | Error messages simplified: interaction lookup changed to "Message ID not found"; agent lookup message no longer includes agentId; agent fetch now wraps failures in OpenSearchStatusException |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

- Validation logic in MLCreateConnectorInput requires careful review of credential and backend roles type checking to ensure proper exception handling and edge cases.
- Null-tolerant parsing in ConnectorAction needs verification that lenient parsing does not introduce unexpected runtime behavior downstream.
- Error handling changes in MLAgentExecutor should be verified to ensure the simplified messages and new exception wrapping align with error handling contracts.

Poem

🐰 From "testProtocol" we hop to "http" so bright,
Credentials now validated with strictness and might,
Null-tolerant parsing lets fields gently fall,
Error messages whisper their tales through it all,
Connector validation—robust, clean, and right! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00%, which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Address Security issues in Connector and Agent' directly aligns with the main changes: addressing security vulnerabilities in both Connector and Agent components across multiple files. |
| Description check | ✅ Passed | The description provides a clear overview of security fixes for both Connector and Agent components and follows the required template structure with the checklist present. |

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

common/src/test/java/org/opensearch/ml/common/transport/connector/MLCreateConnectorInputTests.java (1)

169-172: Consider the maintainability of hardcoded protocol lists.

The error message now includes an enumeration of valid protocols. While this improves clarity, it creates a maintenance point—whenever protocols are added or removed, this test expectation must be updated. The past review comment suggests using ConnectorProtocols.VALID_PROTOCOLS dynamically or using a contains() check instead of exact string matching.

Consider using a more flexible assertion:
-        assertEquals(
-            "Connector protocol is null. Please use one of [aws_sigv4, http, mcp_sse, mcp_streamable_http]",
-            exception.getMessage()
-        );
+        assertTrue(exception.getMessage().contains("Connector protocol is null"));
+        assertTrue(exception.getMessage().contains("aws_sigv4"));
+        assertTrue(exception.getMessage().contains("http"));
Or verify the error message is generated from ConnectorProtocols.VALID_PROTOCOLS if that's available in the implementation.

Based on learnings, past review suggested using ConnectorProtocols.VALID_PROTOCOLS or contains() API to avoid maintaining hardcoded protocol lists.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 88ee6e6 and 854d48a.

📒 Files selected for processing (7)

client/src/test/java/org/opensearch/ml/client/MachineLearningClientTest.java (1 hunks)
client/src/test/java/org/opensearch/ml/client/MachineLearningNodeClientTest.java (1 hunks)
common/src/main/java/org/opensearch/ml/common/connector/ConnectorAction.java (1 hunks)
common/src/main/java/org/opensearch/ml/common/transport/connector/MLCreateConnectorInput.java (4 hunks)
common/src/test/java/org/opensearch/ml/common/transport/connector/MLCreateConnectorInputTests.java (2 hunks)
memory/src/main/java/org/opensearch/ml/memory/index/InteractionsIndex.java (1 hunks)
ml-algorithms/src/main/java/org/opensearch/ml/engine/algorithms/agent/MLAgentExecutor.java (2 hunks)

🔇 Additional comments (11)

memory/src/main/java/org/opensearch/ml/memory/index/InteractionsIndex.java (1)

356-358: LGTM! Security improvement by removing user-provided ID from error message.

This change aligns with the PR objective to prevent printing unknown trace/message IDs in the traces API, avoiding potential reflection of user-controlled input in error responses.
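The idea can be sketched as follows. This is a minimal, self-contained illustration and not the code in `InteractionsIndex`: the class `InteractionLookupSketch`, the in-memory map, and the exception type are hypothetical stand-ins. Only the fixed message text "Message ID not found" comes from the change summary.

```java
import java.util.Map;

// Hypothetical stand-in for an interaction lookup; not the plugin's real class.
final class InteractionLookupSketch {
    // In-memory map used here in place of the real interactions index.
    private final Map<String, String> interactions;

    InteractionLookupSketch(Map<String, String> interactions) {
        this.interactions = interactions;
    }

    String getInteraction(String messageId) {
        String found = interactions.get(messageId);
        if (found == null) {
            // Fixed text only: the caller-supplied ID is deliberately not appended.
            throw new IllegalArgumentException("Message ID not found");
        }
        return found;
    }
}
```

Because the message is a constant, whatever a caller sends as the ID cannot appear in the response body.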

ml-algorithms/src/main/java/org/opensearch/ml/engine/algorithms/agent/MLAgentExecutor.java (2)

188-189: LGTM! Security improvement by using generic error message.

The original exception is logged for debugging while returning a sanitized error to the client, preventing potential information leakage about agents.
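A rough sketch of that log-then-sanitize pattern, under stated assumptions: `StatusException` is a hypothetical stand-in for a status-carrying exception such as `OpenSearchStatusException`, and the message text and the 404 status are illustrative choices, not values taken from the PR.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical exception carrying an HTTP-like status code.
final class StatusException extends RuntimeException {
    final int status;

    StatusException(String message, int status) {
        super(message);
        this.status = status;
    }
}

final class AgentFetchSketch {
    private static final Logger LOG = Logger.getLogger(AgentFetchSketch.class.getName());

    // Converts an internal failure into a generic client-facing error.
    static StatusException sanitize(Exception cause) {
        // The full detail stays in the server log for debugging.
        LOG.log(Level.SEVERE, "Failed to get agent", cause);
        // The client gets fixed text and no attached cause (message and status are assumptions).
        return new StatusException("Failed to find agent", 404);
    }
}
```

Not chaining the cause into the returned exception is the point of the design: a serializer that walks the cause chain then has nothing internal to expose.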

351-357: LGTM! Security improvement by removing user-provided agent ID from error message.

This prevents reflection of potentially untrusted user input in error responses.

common/src/main/java/org/opensearch/ml/common/connector/ConnectorAction.java (1)

164-203: LGTM! Null-tolerant parsing with deferred validation.

Using textOrNull() allows parsing to complete gracefully, deferring validation to the constructor where meaningful error messages are produced (lines 72-80). This prevents 500-level parsing errors and instead returns proper validation errors for null actionType, url, or method.
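The split between lenient parsing and strict construction might look like the sketch below. It is self-contained and does not use the real `XContentParser`: the `Map` input, the local `textOrNull` helper, the field names, and the error messages are all assumptions made for illustration.

```java
import java.util.Map;

// Hypothetical stand-in for a connector action; not the plugin's ConnectorAction.
final class ActionSketch {
    final String actionType;
    final String method;
    final String url;

    ActionSketch(String actionType, String method, String url) {
        // Validation lives in the constructor, so a missing field yields a
        // descriptive client error instead of a failure deep inside parsing.
        if (actionType == null) {
            throw new IllegalArgumentException("action type can't be null");
        }
        if (method == null) {
            throw new IllegalArgumentException("method can't be null");
        }
        if (url == null) {
            throw new IllegalArgumentException("url can't be null");
        }
        this.actionType = actionType;
        this.method = method;
        this.url = url;
    }

    // Lenient accessor: returns null for a missing or null field instead of throwing.
    static String textOrNull(Map<String, Object> fields, String key) {
        Object value = fields.get(key);
        return value == null ? null : value.toString();
    }

    // Parsing never fails on absent fields; it hands whatever it found to the constructor.
    static ActionSketch parse(Map<String, Object> fields) {
        return new ActionSketch(
            textOrNull(fields, "action_type"),
            textOrNull(fields, "method"),
            textOrNull(fields, "url")
        );
    }
}
```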

common/src/main/java/org/opensearch/ml/common/transport/connector/MLCreateConnectorInput.java (3)

110-110: LGTM! Improved protocol validation with informative error messages.

Using validateProtocol() provides clearer error messages that include the list of valid protocols when validation fails.
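An allowlist check of this kind could be sketched as below. The protocol names and the null-protocol message are taken from the test expectation quoted earlier in this review; the class name, the `TreeSet`, and the wording of the unsupported-protocol message are assumptions, and the real `validateProtocol()` may differ.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of allowlist-based protocol validation.
final class ProtocolValidationSketch {
    // A TreeSet keeps the list sorted so the rendered message is deterministic.
    static final Set<String> VALID_PROTOCOLS =
        new TreeSet<>(Set.of("aws_sigv4", "http", "mcp_sse", "mcp_streamable_http"));

    static void validateProtocol(String protocol) {
        if (protocol == null) {
            throw new IllegalArgumentException(
                "Connector protocol is null. Please use one of " + VALID_PROTOCOLS);
        }
        if (!VALID_PROTOCOLS.contains(protocol)) {
            // The rejected value is not echoed back; only the allowed list is shown.
            // This wording is an assumption, not the plugin's actual message.
            throw new IllegalArgumentException(
                "Unsupported connector protocol. Please use one of " + VALID_PROTOCOLS);
        }
    }
}
```

Building the message from the set also means a test can derive its expectation from `VALID_PROTOCOLS` rather than from a hardcoded string.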

184-198: LGTM! Robust credential parsing with type validation.

This prevents nested JSON objects in credentials and provides clear error messages. The token type (not value) is included in the exception message, which is safe from a security perspective.
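To illustrate reporting the type rather than the value, here is a small sketch that works on an already-decoded map instead of a streaming parser. The class name, the message wording, and the decision to reject null values are assumptions; the real code checks parser tokens and may treat nulls differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of string-only credential validation.
final class CredentialSketch {
    // Accepts only string values; `raw` stands in for a decoded JSON object.
    static Map<String, String> parseCredential(Map<String, Object> raw) {
        Map<String, String> credential = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : raw.entrySet()) {
            Object value = entry.getValue();
            if (!(value instanceof String)) {
                // Report what kind of value was found, never the value itself,
                // so secrets and attacker-chosen text stay out of the response.
                String kind = value == null ? "null" : value.getClass().getSimpleName();
                throw new IllegalArgumentException(
                    "Credential values must be strings, found " + kind);
            }
            credential.put(entry.getKey(), (String) value);
        }
        return credential;
    }
}
```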

206-218: LGTM! Backend roles validation with proper type checking.

This prevents JSON objects in the backend_roles array. The inclusion of VALUE_NUMBER allows numeric role identifiers.

Consider whether VALUE_NUMBER is intentional for backend roles. Typically backend roles are string identifiers. If numeric roles are not expected, this could be tightened:
-                        if (parser.currentToken() != XContentParser.Token.VALUE_STRING
-                            && parser.currentToken() != XContentParser.Token.VALUE_NUMBER
-                            && parser.currentToken() != XContentParser.Token.VALUE_NULL) {
+                        if (parser.currentToken() != XContentParser.Token.VALUE_STRING
+                            && parser.currentToken() != XContentParser.Token.VALUE_NULL) {
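The trade-off in that suggestion can be sketched with a flag that switches numeric roles on or off. This is an illustration over a decoded list, not the token-based check in the PR: the class name, the `allowNumbers` parameter, the message text, and the choice to keep null entries as nulls are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of backend role type checking.
final class BackendRolesSketch {
    static List<String> parseBackendRoles(List<?> raw, boolean allowNumbers) {
        List<String> roles = new ArrayList<>();
        for (Object item : raw) {
            boolean accepted = item == null
                || item instanceof String
                || (allowNumbers && item instanceof Number);
            if (!accepted) {
                // Objects and arrays are rejected up front with a client error.
                throw new IllegalArgumentException("Backend roles must be strings");
            }
            // Numbers are stored in their string form; nulls are passed through.
            roles.add(item == null ? null : item.toString());
        }
        return roles;
    }
}
```

With `allowNumbers` set to `false` the check matches the tightened variant in the suggested diff, at the cost of rejecting requests that currently pass numeric roles.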

client/src/test/java/org/opensearch/ml/client/MachineLearningClientTest.java (1)

469-469: LGTM! Test updated to use valid protocol.

The protocol change from "testProtocol" to "http" aligns with the new stricter validation in validateProtocol() that only accepts valid protocols.

client/src/test/java/org/opensearch/ml/client/MachineLearningNodeClientTest.java (1)

1036-1036: LGTM! Valid protocol value used in test.

Updating from "testProtocol" to "http" aligns with stricter protocol validation introduced in this PR.

common/src/test/java/org/opensearch/ml/common/transport/connector/MLCreateConnectorInputTests.java (2)

568-592: LGTM! Robust validation for backend roles.

This test ensures that backend roles containing JSON objects (instead of strings) are rejected with a clear error message, preventing potential 500-level errors mentioned in the PR objectives.

594-616: LGTM! Robust validation for credentials.

This test ensures that credential values containing JSON objects are rejected with a clear error message, preventing potential 500-level errors mentioned in the PR objectives.

rithin-pullela-aws requested review from HenryL27, Zhangxunmt, austintlee, b4sjoo, dhrubo-os, jngz-es, mingshl, model-collapse, pyek-bot, rbhavna, sam-herman, xinyual, ylwu-amzn and zane-neo as code owners October 30, 2025 22:39

pyek-bot added the backport 3.3 label Oct 30, 2025

rithin-pullela-aws temporarily deployed to ml-commons-cicd-env-require-approval October 30, 2025 22:41 — with GitHub Actions Inactive

rithin-pullela-aws had a problem deploying to ml-commons-cicd-env-require-approval October 30, 2025 22:41 — with GitHub Actions Error

rithin-pullela-aws had a problem deploying to ml-commons-cicd-env-require-approval October 30, 2025 22:41 — with GitHub Actions Failure

mingshl approved these changes Oct 30, 2025

rithin-pullela-aws had a problem deploying to ml-commons-cicd-env-require-approval October 31, 2025 00:20 — with GitHub Actions Failure

rithin-pullela-aws had a problem deploying to ml-commons-cicd-env-require-approval October 31, 2025 00:20 — with GitHub Actions Error

rithin-pullela-aws had a problem deploying to ml-commons-cicd-env-require-approval October 31, 2025 01:07 — with GitHub Actions Failure

ylwu-amzn approved these changes Oct 31, 2025

rithin-pullela-aws had a problem deploying to ml-commons-cicd-env-require-approval October 31, 2025 06:10 — with GitHub Actions Failure

rithin-pullela-aws had a problem deploying to ml-commons-cicd-env-require-approval October 31, 2025 06:10 — with GitHub Actions Error

akolarkunnu reviewed Oct 31, 2025

common/src/main/java/org/opensearch/ml/common/transport/connector/MLCreateConnectorInput.java

.../src/test/java/org/opensearch/ml/common/transport/connector/MLCreateConnectorInputTests.java

Address Security issues

854d48a

Signed-off-by: rithin-pullela-aws <rithinp@amazon.com>

b4sjoo force-pushed the security-fixx branch from caee2ef to 854d48a Compare December 4, 2025 22:39

b4sjoo temporarily deployed to ml-commons-cicd-env-require-approval December 4, 2025 22:41 — with GitHub Actions Inactive

b4sjoo had a problem deploying to ml-commons-cicd-env-require-approval December 4, 2025 22:41 — with GitHub Actions Failure

b4sjoo had a problem deploying to ml-commons-cicd-env-require-approval December 4, 2025 22:41 — with GitHub Actions Error

b4sjoo temporarily deployed to ml-commons-cicd-env-require-approval December 4, 2025 22:41 — with GitHub Actions Inactive

coderabbitai bot reviewed Dec 4, 2025
