THUHCSI

Zhiyong Wu (吴志勇)

Computer Science and Technology

Full Professor at Shenzhen International Graduate School, Tsinghua University. Coordinator of the Tsinghua-CUHK Joint Research Center for Media Sciences, Technologies and Systems.

Research Interests

Speech and language processing
Multimedia / multimodal information processing
Affective computing

Teaching Courses

Speech Signal Digital Processing
Big Data Analysis

Biography

Dr. Zhiyong Wu is a Full Professor and Ph.D. supervisor at Shenzhen International Graduate School, Tsinghua University. He is also a coordinator of the Tsinghua-CUHK Joint Research Center for Media Sciences, Technologies and Systems. He received his B.S. and Ph.D. degrees in computer science and technology from Tsinghua University in 1999 and 2005, respectively. From 2005 to 2007, he was a postdoctoral fellow with the Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong (CUHK).

His research interests cover the area of intelligent speech interactions, including audio foundation models (speech, singing voice, audio), expressive and controllable speech generation (style, emotion, prosody, personalization), digital human generation (lip-sync, facial expressions, co-speech gestures, dance), natural language processing (understanding and generation), affective computing, and machine learning. He has published over 200 papers in leading journals and top-tier conferences such as IEEE TASLP, IEEE TMM, IEEE TPAMI, NeurIPS, AAAI, IJCAI, CVPR, ICLR, ACM Multimedia, ICASSP, and INTERSPEECH, co-authored four books, and filed more than 30 patents.

Prof. Wu is a member of CCF, CAAI, IEEE, ACM, and ISCA. He currently serves as Standing Committee Member and Deputy Secretary-General of the Speech Dialogue and Auditory Processing Technical Committee, China Computer Federation (CCF TFSDAP), and as a committee member of NCMMSC. He is also a reviewer or program committee member for leading journals and conferences including IEEE TASLP, IEEE TMM, NeurIPS, AAAI, ACL, IJCAI, ICASSP, and INTERSPEECH.

He has led and participated in numerous projects funded by the National Natural Science Foundation of China (including NSFC-RGC joint projects), the National 863 and 973 Programs, and major projects of the National Social Science Foundation of China. His research achievements have reached internationally advanced levels and have been recognized with multiple awards, including the Ministry of Education (MoE) Science and Technology Progress Awards (2009, 2016), the Beijing Science and Technology Progress Award (2021), and the Shenzhen Science and Technology Progress Award (2023).

Prof. Wu places strong emphasis on talent cultivation. His students have received National Scholarships, the Excellent Dissertation Award of Tsinghua University, and Outstanding Graduate Awards from both Tsinghua University and Beijing, and have won championships and best paper awards at international competitions and conferences such as the GeekPwn Global AI Competition, ICASSP Challenges, INTERSPEECH, CVPR, and AAAI. He has been recognized with Tsinghua University's Teaching Excellence Award (2020, 2023) and received Tsinghua University's "Mentor and Friend" Award in 2022.

Member of

Speech Dialogue and Auditory Processing Technical Committee, China Computer Federation (CCF TFSDAP)
Committee of the National Conference on Man-Machine Speech Communication (NCMMSC)
Technical Committee of Intelligent Systems Application under the IEEE Computational Intelligence Society
China Computer Federation (CCF)
Chinese Association for Artificial Intelligence (CAAI)
Institute of Electrical and Electronics Engineers (IEEE)
Association for Computing Machinery (ACM)
International Speech Communication Association (ISCA)

Reviewer for

IEEE/ACM Transactions on Audio, Speech and Language Processing (IEEE/ACM TASLP)
Speech Communication
Multimedia Tools and Applications (MTAP)
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Annual Conference of the International Speech Communication Association (INTERSPEECH)
International Joint Conference on Natural Language Processing (IJCNLP)
International Conference on Computational Linguistics (COLING)
AAAI Conference on Artificial Intelligence (AAAI)
Conference on Neural Information Processing Systems (NeurIPS)
International Symposium on Chinese Spoken Language Processing (ISCSLP)

Committee Member of

Publication Chair of ISCSLP 2012
Special Session Area Chair of INTERSPEECH 2020
Local Arrangement Chair of SLT 2020
Local Chair (China) of ICASSP 2022

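Personal Profile

Zhiyong Wu (吴志勇) is a Professor and Ph.D. supervisor at Shenzhen International Graduate School, Tsinghua University, and Deputy Director of the Tsinghua-CUHK Joint Research Center for Media Sciences, Technologies and Systems.

His main research area is intelligent speech interaction, including general-purpose audio foundation models (speech, song, sound effects), expressive speech generation (style, emotion, prosody, personalization), digital human generation (lip-sync, facial expressions, gestures, dance), natural language processing (understanding and generation), affective computing, and machine learning. He has published more than 200 papers in leading international journals and top-tier conferences, including IEEE TASLP, TMM, TPAMI, NeurIPS, AAAI, IJCAI, CVPR, ICLR, ACM MM, ICASSP, and INTERSPEECH, has contributed to the writing or translation of 4 books, and has filed more than 30 invention patent applications.

He is a member of the China Computer Federation (CCF), the Chinese Association for Artificial Intelligence (CAAI), IEEE, ACM, and the International Speech Communication Association (ISCA). He serves as Standing Committee Member and Deputy Secretary-General of the CCF Speech Dialogue and Auditory Processing Technical Committee (CCF TFSDAP), a member of the standing body of the National Conference on Man-Machine Speech Communication (NCMMSC), a member of the Intelligent Systems Application committee of the IEEE Computational Intelligence Society, and a member of the Speech Synthesis Markup Language (SSML) internationalization working group of the international internet consortium (国际互联网联盟). He is a reviewer and program committee member for international journals such as TASLP and TMM and for international conferences such as NeurIPS, AAAI, ACL, IJCAI, ICASSP, and INTERSPEECH.

He has led and participated in a number of projects funded by the National Natural Science Foundation of China (including NSFC-RGC joint projects), the National 863 and 973 Programs, and major projects of the National Social Science Foundation of China, and his research results have reached an internationally advanced level. His awards include the Ministry of Education Science and Technology Progress Award (2009, 2016), the Beijing Science and Technology Progress Award (2021), and the Shenzhen Science and Technology Progress Award (2023).

In talent cultivation, students under his supervision have repeatedly received honors such as the National Scholarship, the Tsinghua University Excellent Dissertation award, Outstanding Graduate of Tsinghua University, and Outstanding Graduate of Beijing, and have won championships, best paper awards, and other major prizes at international competitions such as the GeekPwn Global AI Competition, ICASSP Challenges, INTERSPEECH, CVPR, and AAAI. He has received the Tsinghua University Teaching Excellence Award (2020, 2023) and was selected in Tsinghua University's 18th "Mentor and Friend" (良师益友) awards.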