Title: Deep Learning Based Speech Separation
Speaker: Prof. DeLiang Wang (The Ohio State University, USA)
Time: May 3, 2018, 10:00-11:00
Venue: Room 311 (Middle), South Building No. 1
Abstract: Speech separation, or the cocktail party problem, has evaded a solution for decades in speech and audio processing. Motivated by auditory perception, I have been advocating a new formulation of this old challenge that estimates an ideal time-frequency mask (binary or ratio). This new formulation has the important implication that the speech separation problem is open to modern machine learning techniques, and deep neural networks (DNNs) are particularly well suited to this task due to their representational capacity. I will describe recent algorithms that employ DNNs for supervised speech separation, including speech enhancement and speaker separation. DNN-based mask estimation elevates speech separation performance to new levels, and produces the first demonstration of substantial speech intelligibility improvements for both hearing-impaired and normal-hearing listeners in background interference. These advances represent major progress towards solving the cocktail party problem.
Biosketch: DeLiang Wang received the B.S. and M.S. degrees from Peking (Beijing) University and the Ph.D. degree in 1991 from the University of Southern California, all in computer science. Since 1991, he has been with the Department of Computer Science & Engineering and the Center for Cognitive and Brain Sciences at The Ohio State University, where he is a Professor and University Distinguished Scholar. He also holds a visiting appointment at the Center of Intelligent Acoustics and Immersive Communications, Northwestern Polytechnical University. He received the Office of Naval Research Young Investigator Award in 1996, the 2005 Outstanding Paper Award from IEEE Transactions on Neural Networks, and the 2008 Helmholtz Award from the International Neural Network Society. He is an IEEE Fellow and currently serves as Co-Editor-in-Chief of Neural Networks.