Chenda Li (Graduate Student Member, IEEE) received the B.Eng. degree from the Department of Electronic and Information Engineering, Huazhong University of Science and Technology, Wuhan, China, in 2018, and the M.Sc. degree in 2020 from the Department of Computer Science, Shanghai Jiao Tong University, Shanghai, China, where he is currently working toward the Ph.D. degree with the X-LANCE Lab, Department of Computer Science and Engineering, under the supervision of Yanmin Qian. His current research interests include speech enhancement and speech separation.
Education
Shanghai Jiao Tong University, Shanghai, China.
PhD in computer science and technology, Sep. 2020 - present.
MS in computer science and technology, Sep. 2018 - Jun. 2020.
Huazhong University of Science and Technology, Wuhan, China.
BS in electronic and information engineering, Sep. 2014 - Jun. 2018.
Research Experience
End-to-end speech enhancement and separation toolkit (ESPnet-SE), main developer, Jun. 2020 - present
An open source project for speech enhancement and separation based on ESPnet.
Serving as one of the main developers and maintainers.
Research project from Samsung Research China, main participant, Mar. 2021 - Dec. 2021
Researched and developed a low-latency algorithm for online continuous speech separation.
Jelinek Speech & Language Technologies Workshop, team member, Jun. 2020 - Jul. 2020
Researched and developed algorithms for meeting-wise speech separation and long-sequence modeling.
Selected Awards and Honors
1st place in the 3D Speech Enhancement task of L3DAS22 Challenge, 2022.
3rd place in the audio-visual speech recognition task of the Multi-modal Information based Speech Processing (MISP) Challenge, 2022.
Best oral presentation at the National Conference on Man–Machine Speech Communication (NCMMSC), 2021.
Skills
Experience in Linux programming: Python, Bash, C/C++, MATLAB
Experience with deep learning toolkits: PyTorch, MXNet
Experience with speech processing toolkits: Kaldi, ESPnet
Experience in web programming: Django, Spring, Scrapy, HTML, CSS, JavaScript
Cluster administrator at the SJTU X-LANCE Lab
Publications
Chenda Li, Zhuo Chen and Yanmin Qian, “Dual-Path Modeling With Memory Embedding Model for Continuous Speech Separation,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 1508–1520, 2022.
Chenda Li, Lei Yang, Weiqin Wang and Yanmin Qian, “SkiM: Skipping Memory LSTM for Low-Latency Real-Time Continuous Speech Separation,” in Proc. ICASSP, May 2022, pp. 681–685.
Chenda Li, Zhuo Chen, Yi Luo, Cong Han, Tianyan Zhou, Keisuke Kinoshita, Marc Delcroix, Shinji Watanabe, and Yanmin Qian, “Dual-Path Modeling for Long Recording Speech Separation in Meetings,” in Proc. ICASSP, Jun. 2021, pp. 5739–5743.
Chenda Li, et al., “Dual-Path RNN for Long Recording Speech Separation,” in 2021 IEEE Spoken Language Technology Workshop (SLT), Jan. 2021, pp. 865–872.
Chenda Li, Jing Shi, Wangyou Zhang*, et al., “ESPnet-SE: End-To-End Speech Enhancement and Separation Toolkit Designed for ASR Integration,” in 2021 IEEE Spoken Language Technology Workshop (SLT), Jan. 2021, pp. 785–792.
Chenda Li and Yanmin Qian, “Listen, Watch and Understand at the Cocktail Party: Audio-Visual-Contextual Speech Separation,” in Proc. Interspeech, 2020, pp. 1426–1430.
Chenda Li and Yanmin Qian, “Deep Audio-Visual Speech Separation with Attention Mechanism,” in Proc. ICASSP, May 2020, pp. 7314–7318.
Chenda Li and Yanmin Qian, “Prosody Usage Optimization for Children Speech Recognition with Zero Resource Children Speech,” in Proc. Interspeech, 2019, pp. 3446–3450.
Yifei Wu, Chenda Li, Jinfeng Bai, Zhongqin Wu and Yanmin Qian, “Time-Domain Audio-Visual Speech Separation on Low Quality Videos,” in Proc. ICASSP, May 2022, pp. 256–260.
Yifei Wu, Chenda Li, Song Yang, Zhongqin Wu, and Yanmin Qian, “Audio-Visual Multi-Talker Speech Recognition in a Cocktail Party,” in Proc. Interspeech, 2021, pp. 3021–3025.
Wei Wang, Xun Gong, Yifei Wu, Zhikai Zhou, Chenda Li, Wangyou Zhang, Bing Han and Yanmin Qian, “The SJTU System for Multimodal Information Based Speech Processing Challenge 2021,” in Proc. ICASSP, May 2022, pp. 9261–9265.
Yen-Ju Lu, Samuele Cornell, Xuankai Chang, Wangyou Zhang, Chenda Li, Zhaoheng Ni, Zhong-Qiu Wang and Shinji Watanabe, “Towards Low-Distortion Multi-Channel Speech Enhancement: The ESPnet-SE Submission to the L3DAS22 Challenge,” in Proc. ICASSP, May 2022, pp. 9201–9205.
Wangyou Zhang, Jing Shi, Chenda Li, Shinji Watanabe, and Yanmin Qian, “Closing the Gap Between Time-Domain Multi-Channel Speech Enhancement on Real and Simulation Conditions,” in 2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), Oct. 2021, pp. 146–150.
Yi Luo, Zhuo Chen, Cong Han, Chenda Li, Tianyan Zhou, and Nima Mesgarani, “Rethinking the Separation Layers in Speech Separation Networks,” in Proc. ICASSP, Jun. 2021, pp. 1–5.
Cong Han, Yi Luo, Chenda Li, Tianyan Zhou, Keisuke Kinoshita, Shinji Watanabe, Marc Delcroix, Hakan Erdogan, John R. Hershey, Nima Mesgarani and Zhuo Chen, “Continuous Speech Separation Using Speaker Inventory for Long Recording,” in Proc. Interspeech, 2021, pp. 3036–3040.
Pengcheng Guo et al., “Recent Developments on ESPnet Toolkit Boosted by Conformer,” in Proc. ICASSP, Jun. 2021, pp. 5874–5878.
S. Watanabe et al., “The 2020 ESPnet Update: New Features, Broadened Applications, Performance Improvements, and Future Plans,” in 2021 IEEE Data Science and Learning Workshop (DSLW), Jun. 2021, pp. 1–6.