- Tatsunori Hashimoto – Instructor
- Percy Liang – Instructor
- Marcel Rød – CA
- Neil Band – CA
- Rohith Kuditipudi – CA
- Lectures: Tuesday/Thursday 3:00–4:20pm, NVIDIA Auditorium
- Office Hours:
- Tatsu Hashimoto (Gates 364): Fridays 3–4pm
- Percy Liang (Gates 350): Fridays 11am–12pm
- Marcel Rød (Gates 415): Mon/Wed 11am–12pm
- Neil Band (Gates 358): Mon 4–5pm, Tues 5–6pm
- Rohith Kuditipudi (Gates 358): Mon/Wed 10–11am
- Contact: Use public Slack channels for questions and announcements. For personal matters, email cs336-spr2425-staff@lists.stanford.edu.
This course is a comprehensive, hands-on introduction to language modeling in which students build a language model from scratch. Topics include data collection, transformer architectures, model training, evaluation, and deployment. The course is implementation-heavy and requires strong Python and deep learning skills.
- Proficiency in Python
- Experience with deep learning (PyTorch) and systems optimization
- College-level calculus and linear algebra
- Basic probability and statistics
- Prior coursework in machine learning (e.g., CS221, CS229, CS230, CS124, CS224N)
- Basics: Implement and train a standard Transformer language model (see the first sketch after this list).
- Systems: Profile, optimize, and distribute model training (see the profiler sketch below).
- Scaling: Analyze and fit scaling laws for model growth (see the curve-fitting sketch below).
- Data: Process and filter large-scale pretraining data (see the filtering sketch below).
- Alignment and Reasoning RL: Apply supervised finetuning and RL for reasoning tasks (see the loss-masking sketch below).
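The sketches below are illustrative only, not the course's assignment scaffolds. First, for the Basics unit: a tiny decoder-only Transformer language model with a next-token training loop, assuming PyTorch. All hyperparameters are invented, it reuses `nn.TransformerEncoderLayer` with a causal mask rather than hand-written attention, and it trains on random tokens purely to show the mechanics.

```python
# Minimal sketch (illustrative hyperparameters, random data): a tiny
# decoder-only Transformer LM trained with a next-token objective.
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    def __init__(self, vocab_size=1000, d_model=128, n_heads=4, n_layers=2, max_len=64):
        super().__init__()
        self.tok = nn.Embedding(vocab_size, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, 4 * d_model, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, ids):
        seq_len = ids.size(1)
        x = self.tok(ids) + self.pos(torch.arange(seq_len, device=ids.device))
        # Causal mask: each position attends only to itself and earlier positions.
        mask = nn.Transformer.generate_square_subsequent_mask(seq_len).to(ids.device)
        return self.lm_head(self.blocks(x, mask=mask))

model = TinyLM()
opt = torch.optim.AdamW(model.parameters(), lr=3e-4)

for step in range(100):
    ids = torch.randint(0, 1000, (8, 33))      # toy batch; real data comes later
    inputs, targets = ids[:, :-1], ids[:, 1:]  # shift by one for next-token prediction
    logits = model(inputs)
    loss = nn.functional.cross_entropy(logits.reshape(-1, 1000), targets.reshape(-1))
    opt.zero_grad()
    loss.backward()
    opt.step()
```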
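For the Systems unit, only the profiling half is sketched, assuming `torch.profiler`; distributing training (e.g., with `torch.nn.parallel.DistributedDataParallel`) does not fit in a few lines.

```python
# Sketch: profile forward/backward passes with torch.profiler to see
# where time is spent. CPU-only here; add ProfilerActivity.CUDA on GPU.
import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Linear(1024, 1024)
x = torch.randn(64, 1024)

with profile(activities=[ProfilerActivity.CPU], record_shapes=True) as prof:
    for _ in range(10):
        model(x).sum().backward()

print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))
```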
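For the Scaling unit, a sketch of fitting a saturating power law $L(N) = a N^{-\alpha} + L_\infty$ to (parameter count, loss) pairs with `scipy.optimize.curve_fit`; the data points and initial guess are invented for illustration.

```python
# Sketch: fit L(N) = a * N**(-alpha) + l_inf to hypothetical measurements.
import numpy as np
from scipy.optimize import curve_fit

def scaling_law(n_params, a, alpha, l_inf):
    return a * n_params ** (-alpha) + l_inf

n = np.array([1e6, 3e6, 1e7, 3e7, 1e8])     # parameter counts (invented)
loss = np.array([4.8, 4.2, 3.7, 3.3, 3.0])  # validation losses (invented)

(a, alpha, l_inf), _ = curve_fit(scaling_law, n, loss, p0=[150.0, 0.3, 2.5])
print(f"L(N) = {a:.3g} * N^(-{alpha:.3f}) + {l_inf:.3f}")
print(f"extrapolated loss at 1e9 params: {scaling_law(1e9, a, alpha, l_inf):.3f}")
```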
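For the Data unit, a toy document filter plus an exact-dedup pass; the heuristics and thresholds are arbitrary assumptions, far simpler than a real pretraining pipeline.

```python
# Sketch: drop short or highly repetitive documents, then deduplicate
# by content hash. Thresholds are arbitrary, for illustration only.
import hashlib
from collections import Counter

def keep_document(text, min_words=50, max_repeat_ratio=0.3):
    words = text.split()
    if len(words) < min_words:
        return False
    # Reject documents dominated by one repeated token (e.g., spam).
    top_count = Counter(words).most_common(1)[0][1]
    return top_count / len(words) <= max_repeat_ratio

def dedup(docs):
    seen, kept = set(), []
    for doc in docs:
        digest = hashlib.sha256(doc.strip().lower().encode()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(doc)
    return kept

corpus = ["spam " * 100, "too short", " ".join(f"word{i}" for i in range(80))]
print([d[:20] for d in dedup(corpus) if keep_document(d)])  # only the varied doc survives
```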
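For the Alignment and Reasoning RL unit, only the supervised-finetuning half is sketched: masking prompt tokens out of the cross-entropy loss so gradients come from response tokens alone. Shapes and token ids are invented, and an RL step is too involved for a few lines.

```python
# Sketch: SFT loss masking. Labels at prompt positions are set to -100,
# which F.cross_entropy ignores, so only response tokens drive the loss.
import torch
import torch.nn.functional as F

vocab = 1000
logits = torch.randn(1, 6, vocab)                # stand-in for LM output
targets = torch.tensor([[12, 7, 99, 4, 31, 2]])  # prompt + response token ids
prompt_len = 3                                   # first 3 tokens are the prompt

labels = targets.clone()
labels[:, :prompt_len] = -100                    # mask the prompt
loss = F.cross_entropy(logits.view(-1, vocab), labels.view(-1), ignore_index=-100)
print(loss.item())
```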