Assignment 1
Release: Week 5 | Due: Week 8
Details and submission format will be announced in class and on the course page.
This course introduces core theory and modern practice in speech technology, from signal processing and ASR to speaker modeling, speech synthesis, and large-model-based speech applications.
As one of the most natural forms of human communication, speech is central to the next generation of AI systems. This course provides a structured path from fundamentals to the state of the art, covering both classical and neural approaches. We discuss the practical design choices behind modern systems, and how to read research papers critically and implement their ideas.
| Week | Date | Topic | Materials | Notes |
|---|---|---|---|---|
| 1 | 2025.8.25 | Course Introduction & Overview of Speech Technology | Slides, Demos | - |
| 2 | 2025.9.1 | Overview of Speech Technology (continued) | Slides | - |
| 3 | 2025.9.8 | Fundamentals of Speech Signal Processing | Slides | - |
| 4 | 2025.9.15 | Introduction to Automatic Speech Recognition | Slides, WER Demo | - |
| 5 | 2025.9.22 | Traditional ASR Models (GMM/DNN-HMM) | Slides | Assignment 1 Released |
| 6 | 2025.9.29 | End-to-End ASR Models | Slides | - |
| 7 | 2025.10.6 | National Holiday | - | No class meeting |
| 8 | 2025.10.13 | Speaker Modeling (Part 1) | Slides | - |
| 9 | 2025.10.20 | Speaker Modeling (Part 2) | Slides | - |
| 10 | 2025.10.27 | Speech Synthesis (Part 1) | Slides (to be posted) | Guest Talk: Jingbei Li (StepAudio) |
| 11 | 2025.11.3 | Speech Synthesis (Part 2) | Slides (to be posted) | - |
| 12 | 2025.11.10 | Speech Synthesis (Part 3) | Slides (to be posted) | - |
| 13 | 2025.11.17 | Voice Conversion | Slides (to be posted) | Assignment 2 Released |
| 14 | 2025.11.24 | Speech Separation | Slides (to be posted) | - |
| 15 | 2025.12.1 | Self-Supervised Learning for Speech | Slides (to be posted) | - |
| 16 | 2025.12.8 | Speech Processing with Large Language Models | Slides (to be posted) | - |
| 17 | 2025.12.15 | Applications of Speech Processing in Industry | Slides (to be posted) | Invited Talks |
| 18 | 2025.12.22 | Final Project Presentation | - | Last class |
| 20 | 2026.1.5 | - | - | Final Project Due |
Recent papers from top conferences and journals: ICASSP, Interspeech, ACL, T-ASLP, and related venues.
Assignment 2
Release: Week 13 | Due: To be announced
This assignment focuses on advanced topics covered in the second half of the course.
Final Project
Presentation: Week 18 | Final Submission: Week 20 (2026.1.5)
Students are expected to propose, implement, and evaluate a speech-related project, and to document it in a clear technical report.
The syllabus and schedule are subject to refinement as the semester progresses. Any updates to lectures, deadlines, or materials will be announced in advance.