Model sizes between paper and code implementation

Thank you for open-sourcing the code. This project is a fantastic contribution to the video generation community! I have a question about the model sizes used for the teacher (real_score), critic (fake_score), and generator.

The paper states that the teacher is the larger Wan2.1-T2V-14B, while the generator and critic are the smaller Wan2.1-T2V-1.3B. Distilling knowledge from a much larger teacher into a smaller student is a key highlight of the method. However, in the implementation in dmd.py (specifically in the BaseModel), the default model names for all three components appear to point to the smaller 1.3B model.
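To make the question concrete, here is a minimal sketch of the setup I expected from the paper, next to what the defaults seem to give. The class, field, and function names are hypothetical and are not taken from dmd.py; only the three roles and the two model names come from the paper and the code.

```python
from dataclasses import dataclass

SMALL = "Wan2.1-T2V-1.3B"
LARGE = "Wan2.1-T2V-14B"


@dataclass(frozen=True)
class ModelNames:
    """Hypothetical holder for the three model names (not the repo's API)."""
    generator: str = SMALL
    fake_score: str = SMALL   # critic
    real_score: str = LARGE   # teacher, as described in the paper


def differs_from_paper(names: ModelNames) -> list[str]:
    """Return the roles whose model name differs from the paper's setup."""
    expected = ModelNames()
    return [
        role
        for role in ("generator", "fake_score", "real_score")
        if getattr(names, role) != getattr(expected, role)
    ]


# What the defaults in the code appear to give: 1.3B for all three roles.
observed = ModelNames(real_score=SMALL)
print(differs_from_paper(observed))
```

If my reading of the defaults is right, only the teacher (real_score) differs from the paper.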

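Could you clarify which model sizes were actually used for the reported results, and how the 14B teacher should be configured if it differs from the defaults? This would be very helpful for those of us trying to fully understand and reproduce your excellent results.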

Model sizes between paper and code implementation #35
