+86-15306820677
Room 502, Building 3, School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University
Speech Separation/Enhancement

Biography

Chenda Li (李晨达)

Chenda Li (Graduate Student Member, IEEE) received the B.Eng. degree from the Department of Electronic and Information Engineering, Huazhong University of Science and Technology, Wuhan, China, in 2018, and the M.Sc. degree in 2020 from the Department of Computer Science, Shanghai Jiao Tong University, Shanghai, China, where he is currently working toward the Ph.D. degree with the X-LANCE Lab, Department of Computer Science and Engineering, under the supervision of Yanmin Qian. His current research interests include speech enhancement and speech separation.

Education

  • Shanghai Jiao Tong University, Shanghai, China.
    • Ph.D. in Computer Science and Technology, Sep. 2020 - present.
    • M.S. in Computer Science and Technology, Sep. 2018 - Jun. 2020.
  • Huazhong University of Science and Technology, Wuhan, China.
    • B.S. in Electronic and Information Engineering, Sep. 2014 - Jun. 2018.

Research Experience

  • End-to-end speech enhancement and separation toolkit (ESPnet-SE), main developer, Jun. 2020 - present
    • An open-source project for speech enhancement and separation based on ESPnet.
    • Participated as one of the main developers and maintainers.
  • Research project from Samsung Research China, main participant, Mar. 2021 - Dec. 2021
    • Researched and developed a low-latency algorithm for online continuous speech separation.
  • Jelinek Speech & Language Technologies Workshop, team member, Jun. 2020 - Jul. 2020
    • Researched and developed algorithms for meeting-scenario speech separation and long-sequence modeling.

Selected Awards and Honors

  • 1st place in the 3D Speech Enhancement task of L3DAS22 Challenge, 2022.
  • 3rd place in the audio-visual speech recognition task of Multi-modal Information based Speech Processing Challenge (MISP) 2022.
  • Best oral presentation in the National Conference on Man–Machine Speech Communication (NCMMSC) 2021.

Skills

  • Experience in Linux programming: Python, Bash, C/C++, MATLAB
  • Experience in deep learning toolkits: PyTorch, MXNet
  • Speech processing toolkits: Kaldi, ESPnet
  • Experience in web programming: Django, Spring, Scrapy, HTML, CSS, JavaScript
  • Cluster administrator in SJTU X-LANCE Lab

Publications

  • Chenda Li, Zhuo Chen and Yanmin Qian, “Dual-Path Modeling With Memory Embedding Model for Continuous Speech Separation,” IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 30, pp. 1508–1520, 2022.
  • Chenda Li, Lei Yang, Weiqin Wang and Yanmin Qian, “SkiM: Skipping Memory LSTM for Low-Latency Real-Time Continuous Speech Separation,” in Proc. ICASSP, May 2022, pp. 681–685.
  • Chenda Li, Zhuo Chen, Yi Luo, Cong Han, Tianyan Zhou, Keisuke Kinoshita, Marc Delcroix, Shinji Watanabe, and Yanmin Qian, “Dual-Path Modeling for Long Recording Speech Separation in Meetings,” in Proc. ICASSP, Jun. 2021, pp. 5739–5743.
  • Chenda Li, et al., “Dual-Path RNN for Long Recording Speech Separation,” in 2021 IEEE Spoken Language Technology Workshop (SLT), Jan. 2021, pp. 865–872.
  • Chenda Li, Jing Shi, Wangyou Zhang, et al., “ESPnet-SE: End-To-End Speech Enhancement and Separation Toolkit Designed for ASR Integration,” in 2021 IEEE Spoken Language Technology Workshop (SLT), Jan. 2021, pp. 785–792.
  • Chenda Li and Yanmin Qian, “Listen, Watch and Understand at the Cocktail Party: Audio-Visual-Contextual Speech Separation,” in Proc. Interspeech, 2020, pp. 1426–1430.
  • Chenda Li and Yanmin Qian, “Deep Audio-Visual Speech Separation with Attention Mechanism,” in Proc. ICASSP, May 2020, pp. 7314–7318.
  • Chenda Li and Yanmin Qian, “Prosody Usage Optimization for Children Speech Recognition with Zero Resource Children Speech,” in Proc. Interspeech, 2019, pp. 3446–3450.
  • Yifei Wu, Chenda Li, Jinfeng Bai, Zhongqin Wu and Yanmin Qian, “Time-Domain Audio-Visual Speech Separation on Low Quality Videos,” in Proc. ICASSP, May 2022, pp. 256–260.
  • Yifei Wu, Chenda Li, Song Yang, Zhongqin Wu, and Yanmin Qian, “Audio-Visual Multi-Talker Speech Recognition in a Cocktail Party,” in Proc. Interspeech, 2021, pp. 3021–3025.
  • Wei Wang, Xun Gong, Yifei Wu, Zhikai Zhou, Chenda Li, Wangyou Zhang, Bing Han and Yanmin Qian, “The SJTU System for Multimodal Information Based Speech Processing Challenge 2021,” in Proc. ICASSP, May 2022, pp. 9261–9265.
  • Yen-Ju Lu, Samuele Cornell, Xuankai Chang, Wangyou Zhang, Chenda Li, Zhaoheng Ni, Zhong-Qiu Wang and Shinji Watanabe, “Towards Low-Distortion Multi-Channel Speech Enhancement: The ESPNET-SE Submission to the L3DAS22 Challenge,” in Proc. ICASSP, May 2022, pp. 9201–9205.
  • Wangyou Zhang, Jing Shi, Chenda Li, Shinji Watanabe, and Yanmin Qian, “Closing the Gap Between Time-Domain Multi-Channel Speech Enhancement on Real and Simulation Conditions,” in 2021 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), Oct. 2021, pp. 146–150.
  • Yi Luo, Zhuo Chen, Cong Han, Chenda Li, Tianyan Zhou, and Nima Mesgarani, “Rethinking The Separation Layers In Speech Separation Networks,” in Proc. ICASSP, Jun. 2021, pp. 1–5.
  • Cong Han, Yi Luo, Chenda Li, Tianyan Zhou, Keisuke Kinoshita, Shinji Watanabe, Marc Delcroix, Hakan Erdogan, John R. Hershey, Nima Mesgarani and Zhuo Chen, “Continuous Speech Separation Using Speaker Inventory for Long Recording,” in Proc. Interspeech, 2021, pp. 3036–3040.
  • Pengcheng Guo et al., “Recent Developments on ESPnet Toolkit Boosted by Conformer,” in Proc. ICASSP, Jun. 2021, pp. 5874–5878.
  • Shinji Watanabe et al., “The 2020 ESPnet Update: New Features, Broadened Applications, Performance Improvements, and Future Plans,” in 2021 IEEE Data Science and Learning Workshop (DSLW), Jun. 2021, pp. 1–6.

Demos

Interactive Speech Separation Demo

Colab Notebook

Speech Separation: Trump VS Wallace

Last updated: 2022-06-07