Model sizes between paper and code implementation #35

@Maxwell-Zhao

Thank you for open-sourcing the code. This project is a fantastic contribution to the video generation community! I have a question regarding the model sizes used for the teacher (real_score), critic (fake_score), and generator.

The paper states that the teacher model is the larger Wan2.1-T2V-14B, while the generator and critic are the smaller Wan2.1-T2V-1.3B. Distilling knowledge from a much larger teacher into a smaller student is a key highlight of the method. However, in the implementation in dmd.py (specifically in BaseModel), the default model names for all three components appear to point to the smaller 1.3B model.
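To make the question concrete, here is a minimal Python sketch of the mismatch as I understand it. The dictionary keys reuse the component names from this issue, but the `resolve_model_names` and `find_mismatches` helpers, and the idea that a config can override the defaults, are hypothetical and only for illustration. They are not taken from dmd.py. Only the model names Wan2.1-T2V-14B and Wan2.1-T2V-1.3B come from the paper.

```python
# Hypothetical illustration only: none of these names are the repository's actual API.

# Setup as described in the paper.
PAPER_SETUP = {
    "real_score": "Wan2.1-T2V-14B",   # teacher
    "fake_score": "Wan2.1-T2V-1.3B",  # critic
    "generator": "Wan2.1-T2V-1.3B",   # student
}

# What the code defaults appear to be: 1.3B for all three components.
CODE_DEFAULTS = {component: "Wan2.1-T2V-1.3B" for component in PAPER_SETUP}


def resolve_model_names(overrides=None):
    """Merge optional per-component overrides onto the assumed defaults."""
    resolved = dict(CODE_DEFAULTS)
    resolved.update(overrides or {})
    return resolved


def find_mismatches(resolved, expected=PAPER_SETUP):
    """Return {component: (resolved_name, expected_name)} where the two differ."""
    return {
        component: (resolved[component], expected[component])
        for component in expected
        if resolved[component] != expected[component]
    }


# With defaults only, the teacher does not match the paper.
print(find_mismatches(resolve_model_names()))

# With an assumed override for the teacher, the setup matches the paper.
print(find_mismatches(resolve_model_names({"real_score": "Wan2.1-T2V-14B"})))
```

If the released training configs override the teacher name in some way like this, that would explain the discrepancy; if not, the defaults would train with a 1.3B teacher.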

Any clarification on this would be greatly appreciated. It would be very helpful for those of us trying to fully understand and reproduce your excellent results.
