feat: Support `list-of-dicts` format for multimodal message content #1037

ppraneth · 2025-12-05T12:35:32Z

This PR resolves a TODO in `slime/utils/data.py` to support datasets where `message["content"]` is a list of dictionaries (e.g., `[{"type": "image", ...}, {"type": "text", ...}]`), a format that is standard for many multimodal instruction-tuning datasets.

Changes

Updated `_build_messages` in `slime/utils/data.py` to handle list inputs.
Added validation to ensure list items are dictionaries with a valid `type`.
Maintained backward compatibility for the legacy string format (text with `<image>` placeholders).
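
For reviewers, here is a minimal sketch of the behavior described above. It is not the code from this PR: `normalize_content`, `VALID_TYPES`, and the exact shape of the returned dicts are hypothetical names and assumptions chosen for illustration, and the real `_build_messages` may differ in signature and in which `type` values it accepts.

```python
import re

# Assumption: the accepted part types. The PR description does not list them.
VALID_TYPES = ("text", "image")
IMAGE_PLACEHOLDER = "<image>"


def normalize_content(content):
    """Hypothetical helper: return content as a list of {"type": ...} dicts."""
    if isinstance(content, list):
        # New format: validate each item, then pass the list through unchanged.
        for i, item in enumerate(content):
            if not isinstance(item, dict):
                raise TypeError(
                    f"content[{i}] must be a dict, got {type(item).__name__}"
                )
            if item.get("type") not in VALID_TYPES:
                raise ValueError(
                    f"content[{i}] has invalid type: {item.get('type')!r}"
                )
        return content

    if isinstance(content, str):
        # Legacy format: split on the placeholder. The capturing group keeps
        # each "<image>" marker in the result so ordering is preserved.
        parts = []
        pattern = f"({re.escape(IMAGE_PLACEHOLDER)})"
        for chunk in re.split(pattern, content):
            if chunk == IMAGE_PLACEHOLDER:
                parts.append({"type": "image"})
            elif chunk:
                parts.append({"type": "text", "text": chunk})
        return parts

    raise TypeError(f"unsupported content type: {type(content).__name__}")
```

Under these assumptions, a legacy string such as `"<image>Describe this."` and the equivalent list input both end up as the same list of typed parts, so downstream code only needs to handle one shape.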

ppraneth · 2025-12-05T12:37:19Z

cc @zhuzilin

ppraneth · 2025-12-31T08:29:21Z

@yitianlian Could you review this PR?

fix

e1ad8cc
