add Yandex Cloud embeddings connector blueprint #4469
Signed-off-by: Mikhail Khludnev <mkhl@apache.org>
Walkthrough
Added documentation for a Yandex Cloud AI Studio connector embedding standard blueprint, plus an update to the standard blueprints README. The new doc describes connector creation, parameters/credentials, predict action payloads, model registration/deployment, testing examples, and placeholder/token guidance.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/remote_inference_blueprints/standard_blueprints/yandexcloud_connector_embedding_standard_blueprint.md (1 hunks)
🧰 Additional context used
🪛 LanguageTool
docs/remote_inference_blueprints/standard_blueprints/yandexcloud_connector_embedding_standard_blueprint.md
[grammar] ~120-~120: Ensure spelling is correct
Context: ...of life?" } } Sample response of Yadex Cloud AI Studio Embedding: json { ...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: spotless
🔇 Additional comments (1)
docs/remote_inference_blueprints/standard_blueprints/yandexcloud_connector_embedding_standard_blueprint.md (1)
45-46: Verify pre/post-processing functions are correct for Yandex Cloud.
The connector references `bedrock` pre/post-processing functions, but this blueprint is for Yandex Cloud. Verify whether these processing functions:
- are generic and work correctly with Yandex Cloud API responses, or
- should be replaced with Yandex-specific processing functions.
If these are not the correct functions for Yandex Cloud, update them accordingly.
Signed-off-by: Mikhail Khludnev <mkhl@apache.org>
Actionable comments posted: 1
🔇 Additional comments (1)
docs/remote_inference_blueprints/standard_blueprints/yandexcloud_connector_embedding_standard_blueprint.md (1)
45-46: Clarify why Bedrock pre/post-processing functions are used for Yandex Cloud.
The pre/post-processing functions reference `bedrock` for a Yandex Cloud connector. Clarify whether Yandex's request/response format is compatible with Bedrock's processing, or if Yandex-specific processing functions should be used instead.
If compatibility is intentional, add a brief comment explaining why Bedrock functions are appropriate here. If these should be Yandex-specific, update them accordingly.
```json
POST /_plugins/_ml/models/_register
{
  "name": "yc-embedding",
  "function_name": "remote",
  "model_group_id": "4THNtZoBdUNOOrVAzj_V",
  "description": "YC embedding model",
  "connector_id": "CTEou5oBdUNOOrVArUAU"
}
```

```json
POST /_plugins/_ml/models/_register
{
  "name": "YC text embedding model",
  "function_name": "remote",
  "description": "test model",
  "connector_id": "nzh9PIsBnGXNcxYpPEcv"
}
```

Sample response:
```json
{
  "task_id": "5THZtZoBdUNOOrVAEj_I",
  "status": "CREATED",
  "model_id": "CzEou5oBdUNOOrVA10Db"
}
```
Remove or clarify the duplicate model registration (lines 90–98).
The section shows two separate model registrations:
- Lines 79–87: `yc-embedding` with the connector created in section 2 (`CTEou5oBdUNOOrVArUAU`) and a `model_group_id`
- Lines 90–98: `YC text embedding model` with a hardcoded, unrelated `connector_id` (`nzh9PIsBnGXNcxYpPEcv`) and no `model_group_id`
The second registration uses a connector_id that does not match the one established earlier, making it inconsistent and confusing. Users will not know which registration to follow. The sample response and test section both reference the second registration's model_id, but the instructions don't explain why two registrations are shown.
Either remove the second registration if it's leftover code, or clearly explain when/why to use both variants.
If the second registration should be removed, apply this diff:

````diff
 ```json
 POST /_plugins/_ml/models/_register
 {
   "name": "yc-embedding",
   "function_name": "remote",
   "model_group_id": "4THNtZoBdUNOOrVAzj_V",
   "description": "YC embedding model",
   "connector_id": "CTEou5oBdUNOOrVArUAU"
 }
 ```
-```json
-POST /_plugins/_ml/models/_register
-{
-  "name": "YC text embedding model",
-  "function_name": "remote",
-  "description": "test model",
-  "connector_id": "nzh9PIsBnGXNcxYpPEcv"
-}
-```
 Sample response:
````

Then update the sample response and test section to use the first registration's model IDs consistently.
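As a rough illustration of the inconsistency described above, the following Python sketch builds both registration bodies from the blueprint and flags the one whose `connector_id` differs from the connector created earlier. The helper names `build_register_request` and `check_connector_ids` are hypothetical and not part of ml-commons; nothing is sent to a cluster, and only the request bodies are constructed and compared.

```python
# Hypothetical helpers for illustration only; they are not part of ml-commons.
REGISTER_PATH = "/_plugins/_ml/models/_register"


def build_register_request(name, connector_id, description, model_group_id=None):
    """Return (method, path, body) for a remote model registration request."""
    body = {
        "name": name,
        "function_name": "remote",
        "description": description,
        "connector_id": connector_id,
    }
    # model_group_id is only included when one is supplied.
    if model_group_id is not None:
        body["model_group_id"] = model_group_id
    return "POST", REGISTER_PATH, body


def check_connector_ids(requests, expected_connector_id):
    """Return the names of registrations that use a different connector_id."""
    return [
        body["name"]
        for _, _, body in requests
        if body["connector_id"] != expected_connector_id
    ]


# IDs below are copied from the two registrations quoted in the review comment.
connector_id = "CTEou5oBdUNOOrVArUAU"
first = build_register_request(
    "yc-embedding", connector_id, "YC embedding model", "4THNtZoBdUNOOrVAzj_V"
)
second = build_register_request(
    "YC text embedding model", "nzh9PIsBnGXNcxYpPEcv", "test model"
)

mismatched = check_connector_ids([first, second], connector_id)
print(mismatched)  # ['YC text embedding model']
```

Passing a single `connector_id` variable through every request, as the first registration does here, is one way to keep the documented IDs consistent from connector creation through testing.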
Thanks for the PR, @mkhludnev. One high-level comment: can you add this blueprint to the README as well? https://github.com/opensearch-project/ml-commons/blob/main/docs/remote_inference_blueprints/standard_blueprints/README.md
@mkhludnev thanks for adding this new blueprint! It looks great. Can you also add it to the documentation-website repo, so that users are directed from opensearch.org to your blueprint? https://github.com/opensearch-project/documentation-website/blob/main/_ml-commons-plugin/remote-models/supported-connectors.md
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #4469 +/- ##
============================================
+ Coverage 80.21% 80.22% +0.01%
- Complexity 10242 10255 +13
============================================
Files 858 858
Lines 44553 44552 -1
Branches 5158 5158
============================================
+ Hits 35739 35743 +4
+ Misses 6640 6633 -7
- Partials 2174 2176 +2
Signed-off-by: Mikhail Khludnev <mkhl@apache.org>
@nathaliellenaa done here!
@mingshl done in opensearch-project/documentation-website#11693. Thanks for the advice!
I have a question: if I change the modelUrl in the predict payload, are the same fields still required? If not, would we need to create a different connector for each modelUrl?
Hello Mingshi, my understanding is the following:
in |
Description
This contributes a connector blueprint for Yandex Cloud.
Check List
--signoff.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.
Note for reviewers
I've contributed to OpenSearch before.
And thanks for reviewing it!