@mingshl (Collaborator) commented Nov 27, 2025

Description

#4470

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Summary by CodeRabbit

  • Refactor
    • Streamlined internal deserialization flow to reduce intermediate processing steps while preserving validation rules and existing error logging/handling behavior.

coderabbitai bot commented Nov 27, 2025

Walkthrough

The change simplifies the deserialization path in ModelSerDeSer.java by replacing nested ByteArrayInputStream/ObjectInputStream construction with a single ValidatingObjectInputStream. Validation now occurs directly via ACCEPT_CLASS_PATTERNS and REJECT_CLASS_PATTERNS; error handling remains unchanged.

Changes

Cohort / File(s): Deserialization refactoring (ml-algorithms/src/main/java/org/opensearch/ml/engine/utils/ModelSerDeSer.java)
Summary: Replaced nested ByteArrayInputStream/ObjectInputStream usage with a single ValidatingObjectInputStream in deserialize(byte[]), ensuring objects are read through the validator and removing redundant stream wrapping.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Areas requiring attention:
    • Confirm ValidatingObjectInputStream enforces ACCEPT_CLASS_PATTERNS/REJECT_CLASS_PATTERNS in the new flow.
    • Test deserialization for previously accepted and rejected classes to ensure behavior is unchanged.
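
The two review points above can be illustrated with a minimal, JDK-only sketch. It is not the code in ModelSerDeSer.java: the class AllowListObjectInputStream is a hypothetical stand-in for ValidatingObjectInputStream, and the ACCEPT and REJECT patterns are invented for illustration, since the real ACCEPT_CLASS_PATTERNS and REJECT_CLASS_PATTERNS are not shown in this conversation. The sketch assumes the goal is that readObject() is always called on the validating stream itself, so every class descriptor passes through the accept/reject check.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.regex.Pattern;

class ValidatingDeserializeSketch {

    // Hypothetical patterns, for illustration only.
    static final List<Pattern> ACCEPT = List.of(
        Pattern.compile("java\\.lang\\..*"),
        Pattern.compile("java\\.util\\..*"));
    static final List<Pattern> REJECT = List.of(
        Pattern.compile("java\\.util\\.HashMap"));

    // Hypothetical stand-in for a validating stream: checks each class
    // descriptor against the accept and reject lists before resolving it.
    static final class AllowListObjectInputStream extends ObjectInputStream {
        AllowListObjectInputStream(InputStream in) throws IOException {
            super(in);
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            String name = desc.getName();
            boolean accepted = ACCEPT.stream().anyMatch(p -> p.matcher(name).matches());
            boolean rejected = REJECT.stream().anyMatch(p -> p.matcher(name).matches());
            if (!accepted || rejected) {
                throw new InvalidClassException(name, "class not allowed for deserialization");
            }
            return super.resolveClass(desc);
        }
    }

    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    // Single validating stream: the object is read from the same stream
    // that performs the check, with no second ObjectInputStream involved.
    static Object deserialize(byte[] data) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new AllowListObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        // An accepted class round-trips.
        Object list = deserialize(serialize(new ArrayList<>(List.of("a", "b"))));
        System.out.println("accepted: " + list);

        // A rejected class fails before any instance is created.
        try {
            deserialize(serialize(new HashMap<String, String>()));
            System.out.println("unexpected: HashMap was accepted");
        } catch (InvalidClassException e) {
            System.out.println("rejected: " + e.classname);
        }
    }
}
```

A unit test for the real change could follow the same shape: serialize one object of a previously accepted class and one of a previously rejected class, then check that the first round-trips and the second fails during deserialization.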

Suggested labels

backport 3.4

Suggested reviewers

  • jngz-es
  • model-collapse
  • rbhavna
  • Zhangxunmt
  • ylwu-amzn
  • dhrubo-os

Poem

🐰 Streams unwound, one path to see,
A single guard keeps mischief free.
Bytes hop in, checked at the gate,
Clean and safe — that's simply great! 🥕✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name: Description check
Status: ⚠️ Warning
Explanation: The PR description is largely incomplete. It references an issue but provides no meaningful description of what was changed or why. Required sections like a proper Description are missing or only contain a link.
Resolution: Add a detailed Description section explaining what was changed and why. Consider addressing the comment requesting a unit test before merging.
✅ Passed checks (2 passed)
Check name: Title check
Status: ✅ Passed
Explanation: The title 'Fix validation inputStream' is concise and directly relates to the main change in the PR, which simplifies the deserialization path by fixing how the validation input stream is wrapped.

Check name: Docstring Coverage
Status: ✅ Passed
Explanation: Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%.
📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1ff7f90 and d582f00.

📒 Files selected for processing (1)
  • ml-algorithms/src/main/java/org/opensearch/ml/engine/utils/ModelSerDeSer.java (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • ml-algorithms/src/main/java/org/opensearch/ml/engine/utils/ModelSerDeSer.java
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Build and Test MLCommons Plugin on linux (25)
  • GitHub Check: Build and Test MLCommons Plugin on linux (21)
  • GitHub Check: Build and Test MLCommons Plugin on Windows (25)
  • GitHub Check: Build and Test MLCommons Plugin on Windows (21)

@rithin-pullela-aws (Contributor)

The failing test appears to be caused by an invalid image URL, which fails on every CI run:

REPRODUCE WITH: ./gradlew ':opensearch-ml-plugin:integTest' --tests 'org.opensearch.ml.rest.RestMLRAGSearchProcessorIT.testBM25WithOpenAIWithConversationAndImage' -Dtests.seed=BCD4E4B7208A70FF -Dtests.security.manager=false -Dtests.locale=hr-Latn-HR -Dtests.timezone=PST -Druntime.java=24
RestMLRAGSearchProcessorIT > testBM25WithOpenAIWithConversationAndImage STANDARD_ERROR
    REPRODUCE WITH: ./gradlew ':opensearch-ml-plugin:integTest' --tests 'org.opensearch.ml.rest.RestMLRAGSearchProcessorIT.testBM25WithOpenAIWithConversationAndImage' -Dtests.seed=BCD4E4B7208A70FF -Dtests.security.manager=false -Dtests.locale=hr-Latn-HR -Dtests.timezone=PST -Druntime.java=24

RestMLRAGSearchProcessorIT > testBM25WithOpenAIWithConversationAndImage FAILED
    org.opensearch.client.ResponseException: method [POST], host [http://127.0.0.1:46855/], URI [/test/_search?size=5&search_pipeline=pipeline_test], status line [HTTP/1.1 400 Bad Request]
    {"error":{"root_cause":[{"type":"status_exception","reason":"Error from remote service: {\n  \"error\": {\n    \"message\": \"Error while downloading https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg.\",\n    \"type\": \"invalid_request_error\",\n    \"param\": null,\n    \"code\": \"invalid_image_url\"\n  }\n}"}],"type":"status_exception","reason":"Error from remote service: {\n  \"error\": {\n    \"message\": \"Error while downloading https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg.\",\n    \"type\": \"invalid_request_error\",\n    \"param\": null,\n    \"code\": \"invalid_image_url\"\n  }\n}"},"status":400}
        at __randomizedtesting.SeedInfo.seed([BCD4E4B7208A70FF:70CFD0D1639505B6]:0)
        at app//org.opensearch.client.RestClient.convertResponse(RestClient.java:501)
        at app//org.opensearch.client.RestClient.performRequest(RestClient.java:384)
        at app//org.opensearch.client.RestClient.performRequest(RestClient.java:359)
        at app//org.opensearch.ml.utils.TestHelper.makeRequest(TestHelper.java:199)
        at app//org.opensearch.ml.utils.TestHelper.makeRequest(TestHelper.java:172)
        at app//org.opensearch.ml.rest.RestMLRAGSearchProcessorIT.performSearch(RestMLRAGSearchProcessorIT.java:1450)
        at app//org.opensearch.ml.rest.RestMLRAGSearchProcessorIT.testBM25WithOpenAIWithConversationAndImage(RestMLRAGSearchProcessorIT.java:1075)

@dhrubo-os (Collaborator)

Can we have a unit test for this?

@dhrubo-os dhrubo-os temporarily deployed to ml-commons-cicd-env December 6, 2025 00:56 — with GitHub Actions Inactive
codecov bot commented Dec 6, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.23%. Comparing base (66964af) to head (d582f00).

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #4471      +/-   ##
============================================
+ Coverage     80.21%   80.23%   +0.01%     
- Complexity    10242    10255      +13     
============================================
  Files           858      858              
  Lines         44553    44552       -1     
  Branches       5158     5158              
============================================
+ Hits          35739    35746       +7     
+ Misses         6640     6633       -7     
+ Partials       2174     2173       -1     
Flag Coverage Δ
ml-commons 80.23% <100.00%> (+0.01%) ⬆️


Signed-off-by: Mingshi Liu <mingshl@amazon.com>
@mingshl mingshl force-pushed the main-fix-validate-modelinput branch from fce2020 to d582f00 Compare December 15, 2025 23:50
@mingshl mingshl temporarily deployed to ml-commons-cicd-env December 15, 2025 23:52 — with GitHub Actions Inactive
@mingshl mingshl temporarily deployed to ml-commons-cicd-env December 16, 2025 01:00 — with GitHub Actions Inactive
@mingshl mingshl had a problem deploying to ml-commons-cicd-env December 16, 2025 01:00 — with GitHub Actions Failure