Yanmin Qian received the B.S. degree from the Department of Electronic and Information Engineering, Huazhong University of Science and Technology, Wuhan, China, in 2007, and the Ph.D. degree from the Department of Electronic Engineering, Tsinghua University, Beijing, China, in 2012. Since 2013, he has been with the Department of Computer Science and Engineering, Shanghai Jiao Tong University (SJTU), Shanghai, China, where he is currently an Associate Professor. From 2015 to 2016, he also worked as a Research Associate in the Speech Group, Cambridge University Engineering Department, Cambridge, U.K. He is a senior member of IEEE, a member of ISCA, and one of the founding members of the Kaldi Speech Recognition Toolkit. He has published more than 110 papers on speech and language processing, with 4000+ citations, including papers at top conferences such as ICASSP, INTERSPEECH, and ASRU. His current research interests include acoustic and language modeling in speech recognition, speaker and language recognition, keyword spotting, and multimedia signal processing.

Work Experience

  • 2017-present, Shanghai Jiao Tong University, Department of Computer Science and Engineering: Associate Professor
  • 2015-2016, Cambridge University, Department of Engineering, Machine Intelligence Laboratory: Research Associate
  • 2013-2016, Shanghai Jiao Tong University, Department of Computer Science and Engineering: Assistant Professor

Education

  • 2007-2013, Tsinghua University, Department of Electronic Engineering: Ph.D. Candidate in Electronic Engineering
  • 2003-2007, Huazhong University of Science & Technology, Department of Electronic and Information Engineering: B.E. in Electronic and Information Engineering

Research Interests

  • Speech and language understanding and human-computer interaction
  • Large vocabulary continuous speech recognition
  • Discriminative training of acoustic models
  • Robust speech recognition
  • Multilingual speech recognition and low-resource speech recognition
  • Deep-learning-based speech signal processing
  • Multimedia signal processing
  • GPU- and SOC-based fast speech recognition

Research Projects

  • Structured Deep Learning Study for the Robust Speech Recognition in the Heterogeneous Noisy Scenario, supported by the NSFC (PI, 220,000¥)
  • Shanghai Sailing Program, supported by the Shanghai Government (PI, 200,000¥)
  • Multi-talker Speech Recognition for Cocktail Party Problem, supported by the Tencent Corporation (PI, 150,000¥)
  • High Performance Speech and Speaker Recognition System, supported by AVIC (PI, 500,000¥)
  • Deep Neural Network based denoising technology, supported by Baidu (PI, 100,000¥)
  • Deep Learning for Noise Robust Speech Recognition, supported by Shanghai Jiao Tong University (PI, 100,000¥)
  • Speech Objective Recognition and Content Transcription under Complex Environment, supported by the NSFC (Co-PI, 2,510,000¥)
  • Joint SJTU-AISpeech Laboratory, supported by AISpeech Corporation (Co-PI, 5,000,000¥)
  • Big Data Driven Natural Language Understanding, QA and Translation, supported by the National Key Research and Development Program of China (involved, ~50,000,000¥)
  • Cloud Service Platform for Service Robot, supported by the National Key Research and Development Program of China (involved, ~28,000,000¥)
  • Natural Speech Technology, supported by UK-EPSRC (involved, ~9,000,000$)
  • Babel, supported by USA-IARPA (involved, ~20,000,000$)
  • Speech Recognition Technology Under the Low-Data-Resource Conditions, supported by the NSFC (involved, 830,000¥) and the PhD Research and Innovation Fund of Tsinghua University (involved, 40,000¥)
  • Kaldi Speech Recognition Toolkit Development and Research
  • Large Vocabulary Continuous Speech Recognition System and Spoken Term Detection System Development and Research, supported by the China 863 Projects, NSFC Projects, and projects from China's Ministry of National Defense
  • Multilingual Speech Recognition Research, supported by the Interdisciplinary Fund of the School of Information Science and Technology at Tsinghua University (involved, 100,000¥)
  • Speech Recognition SOC System Development Under the Low-Hardware-Resource Condition; the SOC system was used in the 2008 Olympic mascots and won the High-tech Olympics Advanced Award (involved)

Professional Activities

  • IEEE Senior Member & ISCA Member
  • Kaldi Group Member & Developer
  • Regular reviewer for IEEE/ACM Transactions on Audio, Speech, and Language Processing, IEEE Journal of Selected Topics in Signal Processing, IEEE Signal Processing Letters, Speech Communication, Computer Speech and Language, Neurocomputing, Multimedia Tools and Applications, etc.
  • Regular reviewer for international conferences: ICASSP, INTERSPEECH, ASRU, SLT, ISCSLP, ChinaSIP, EUSIPCO, COCOSDA, NCMMSC, ICPR, etc.

Open-source Toolkits

  • The Kaldi Speech Recognition Toolkit
  • CUED-RNNLM: an open-source toolkit for efficient training and evaluation of recurrent neural network language models

International Challenges

  • 2016--BTAS 2016 Speaker Anti-spoofing Competition, ranked 3rd of 7 teams
  • 2015--MGB Recognition Challenge - Recognition of Multi-Genre Broadcast Data, ranked 1st of 20 teams
  • 2015--Automatic Speaker Verification Spoofing and Countermeasures Challenge, ranked 3rd of 16 teams

Awards

  • 2019--Speech Communication Journal Best Paper Award
  • 2017--Shanghai Jiao Tong University SMC-Chenxin Level-B Young Scholar Award
  • 2016--ISCSLP Best Student Paper Award
  • 2016--Shanghai Science and Technology Young Scholar Award
  • 2015--The First Prize of the MGB Data Recognition Challenge
  • 2015--The Third Prize of the Automatic Speaker Verification Spoofing and Countermeasures Challenge
  • 2015--Shanghai Jiao Tong University SMC-Chenxin Young Scholar Award
  • 2014--The Second Prize of the Fourth Wu Wenjun Artificial Intelligence Science and Technology Award
  • 2013--The Second Excellent Doctoral Dissertation Award in Tsinghua University
  • 2012--Google Grants Award at InterSpeech 2012 (4 Ph.D. students worldwide in total)
  • 2012--Tsinghua-JiangZhen Scholarship, First Class (25 students in Tsinghua University in total)
  • 2011--Tsinghua-JiangZhen Scholarship, First Class (25 students in Tsinghua University in total)
  • 2010--Excellent PhD Academic Newcomer Award Nomination of Chinese Education Ministry
  • 2010--PhD Research and Innovation Award of Tsinghua University
  • 2009--Interdisciplinary Fund Support by School of Information Science and Technology in Tsinghua University

Contact

Yanmin Qian
SpeechLab, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
3-515 SEIEE Building, 800 Dongchuan Road, Minhang District, Shanghai 200240, China

Publications

  • 115 papers published or under review to date: 26 journal papers and 88 international conference papers.
  • 70 papers have been published in top-level journals and international conferences on speech and signal processing, including IEEE T-ASLP, Speech Communication, ICASSP, InterSpeech, ASRU, etc.
  • 8 IEEE Transactions, 4 Speech Communication; 25 ICASSP, 19 InterSpeech, 8 ASRU
  • Papers have been cited 4000+ times (Google Scholar)
  • 19 China National Invention Patents have been applied for, of which 2 have been granted; 3 USA patents have been applied for
  • 2 co-authored book chapters
  • 1 co-translated book
  • List of Publications