Pull requests: THUDM/slime
Fix multi‑turn loss masking, clarify qwen25/qwen3 types, and strengthen mask tests
#1418
opened Jan 14, 2026 by
Daucloud
issue #1414 fix: correct reward normalization for unequal group sizes
#1415
opened Jan 14, 2026 by
ccggddmm
Add support for per token reward/advantages with a custom_reward_post_process_path
#1389
opened Jan 12, 2026 by
vpj
[Feature] Add rollout concurrency argument for full async training
#1310
opened Jan 3, 2026 by
yitianlian
Feat(router): add oai interface support for router
#1203
opened Dec 24, 2025 by
ChangyiYang
[FEATURE] Add tool call support for multi-turn SFT with delta-based loss masking
#1159
opened Dec 20, 2025 by
Surya-Gunukula
fix: fix 8B VLM true on policy issue
run-ci-short
#1155
opened Dec 19, 2025 by
nanjiangwill
[FSDP][1/n] Support LoRA training for FSDP backend.
#1140
opened Dec 17, 2025 by
GuanxingLu
4 tasks
[On Policy Distillation] resolve log prob dimension mismatch in on-policy distillation with CP > 1
#1135
opened Dec 17, 2025 by
Yuchen-Cao
feat: Support list-of-dicts format for multimodal message content
#1037
opened Dec 5, 2025 by
ppraneth
RDMA Support for the weight transferring from Megatron to SGL
#932
opened Nov 25, 2025 by
JensenFire
Draft
1 of 2 tasks