[skyrl-train] consider updating grpo_norm_by_std default to false #869

@erictang000

Description

Right now the default for trainer.algorithm.grpo_norm_by_std is true. For Dr. GRPO it should be false, and other research is adopting grpo_norm_by_std=false even when not using Dr. GRPO's length normalization. We should consider changing the default to false.
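For context, here is a minimal sketch of what the flag is understood to control: whether group-relative advantages are divided by the per-group reward standard deviation after mean-centering. The function name `group_advantages`, the `eps` value, and the use of the population standard deviation are assumptions for illustration, not the actual skyrl-train implementation.

```python
from statistics import fmean, pstdev


def group_advantages(rewards, norm_by_std=True, eps=1e-6):
    """Hypothetical helper: group-relative advantages for one prompt's samples.

    rewards: scalar rewards for all responses sampled from the same prompt.
    norm_by_std: if True, divide the centered rewards by the group std
                 (GRPO-style); if False, only subtract the group mean.
    """
    mean = fmean(rewards)
    centered = [r - mean for r in rewards]
    if not norm_by_std:
        return centered
    # Population std is an assumption here; a real implementation may use
    # the sample std or a different epsilon.
    std = pstdev(rewards)
    return [c / (std + eps) for c in centered]


# A group where rewards barely differ: dividing by the small std rescales
# tiny reward gaps to roughly unit scale.
near_uniform = [1.0, 1.0, 1.0, 0.9]
print(group_advantages(near_uniform, norm_by_std=True))
print(group_advantages(near_uniform, norm_by_std=False))
```

With `norm_by_std=False` the advantages keep the scale of the raw reward differences, so groups with nearly identical rewards contribute proportionally small updates instead of being rescaled to unit variance.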

Relevant discussion:
