Zhikai Zhou, Tian Tan, Yanmin Qian | Punctuation Prediction for Streaming On-Device Speech Recognition | Conference | ICASSP | |
Zhikai Zhou, Wei Wang, Wangyou Zhang, Yanmin Qian | Exploring Effective Data Utilization for Low-Resource Speech Recognition | Conference | ICASSP | |
Wenbin Jiang, Zhijun Liu, Kai Yu, Fei Wen | Speech enhancement with neural homomorphic synthesis | Conference | ICASSP | |
Lingfeng Dai, Chen Lu, Zhikai Zhou, Kai Yu | LatticeBART: Lattice-to-Lattice Pre-Training for Speech Recognition | Conference | ICASSP | |
Bei Liu, Haoyu Wang, Zhengyang Chen, Shuai Wang, Yanmin Qian | Self-Knowledge Distillation via Feature Enhancement for Speaker Verification | Conference | ICASSP | |
Yiwei Guo, Chenpeng Du, Kai Yu | Unsupervised word-level prosody tagging for controllable speech synthesis | Conference | ICASSP | |
Bing Han, Zhengyang Chen, Bei Liu, Yanmin Qian | MLP-SVNET: A Multi-Layer Perceptrons Based Network for Speaker Verification | Conference | ICASSP | |
Bing Han, Zhengyang Chen, Yanmin Qian | Local Information Modeling with Self-Attention for Speaker Verification | Conference | ICASSP | |
Zhengyang Chen, Sanyuan Chen, Yu Wu, Yao Qian, Chengyi Wang, Shujie Liu, Yanmin Qian, Michael Zeng | Large-Scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification | Conference | ICASSP | |
Wei Wang, Xun Gong, Yifei Wu, Zhikai Zhou, Chenda Li, Wangyou Zhang, Bing Han, Yanmin Qian | The SJTU System for Multimodal Information Based Speech Processing Challenge 2021 | Conference | ICASSP | |
Yifei Wu, Chenda Li, Jinfeng Bai, Zhongqin Wu, Yanmin Qian | Time-domain Audio-visual Speech Separation On Low Quality Videos | Conference | ICASSP | |
Wei Wang, Shuo Ren, Yao Qian, Shujie Liu, Yu Shi, Yanmin Qian, Michael Zeng | Optimizing Alignment of Speech and Language Latent Spaces for End-To-End Speech Recognition and Understanding | Conference | ICASSP | |
Zelin Zhou, Zhiling Zhang, Xuenan Xu, Zeyu Xie, Mengyue Wu and Kenny Q. Zhu | Can Audio Captions Be Evaluated with Image Caption Metrics? | Conference | ICASSP | |
Wen Wu, Mengyue Wu and Kai Yu | Climate and Weather: Inspecting Depression Detection via Emotion Recognition | Conference | ICASSP | |
Xuenan Xu, Mengyue Wu and Kai Yu | Diversity-controllable and Accurate Audio Captioning Based on Neural Condition | Conference | ICASSP | |
Siyu Lou, Xuenan Xu, Mengyue Wu and Kai Yu | Audio-Text Retrieval in Context | Conference | ICASSP | |
Guangwei Li, Xuenan Xu, Mengyue Wu, and Kai Yu | Navigating Audio-Visual Event Detection Across Mismatched Modalities | Conference | ICASSP | |
Guangwei Li, Xuenan Xu, Mengyue Wu, and Kai Yu | Category-Adapted Sound Event Enhancement with Weakly Labeled Data | Conference | ICASSP | |
Guangwei Li, Xuenan Xu, Mengyue Wu, and Kai Yu | Text Adaptive Detection for Customizable Keyword Spotting | Conference | ICASSP | |
Zihan Zhao, Lu Chen, Ruisheng Cao, Hongshen Xu, Xingyu Chen, Kai Yu | TIE: Topological Information Enhanced Structural Reading Comprehension on Web Pages | Conference | NAACL | |
Yanmin Qian, Zhikai Zhou | Optimizing Data Usage for Low-Resource Speech Recognition | Journal | TASLP | |
Chenda Li, Zhuo Chen and Yanmin Qian | Dual-Path Modeling With Memory Embedding Model for Continuous Speech Separation | Journal | TASLP | |
Sanyuan Chen, Chengyi Wang, Zhengyang Chen, Yu Wu, Shujie Liu, Zhuo Chen, Jinyu Li, Naoyuki Kanda, Takuya Yoshioka, Xiong Xiao, Jian Wu, Long Zhou, Shuo Ren, Yanmin Qian, Yao Qian, Jian Wu, Michael Zeng, Xiangzhan Yu, Furu Wei | WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing | Journal | JSTSP |
Heinrich Dinkel, Mengyue Wu and Kai Yu | Towards Duration Robust Weakly Supervised Sound Event Detection | Journal | TASLP | |
Yanmin Qian, Zhengyang Chen and Shuai Wang | Audio-Visual Deep Neural Network for Robust Person Verification | Journal | TASLP | |
Xuenan Xu, Heinrich Dinkel, Mengyue Wu and Kai Yu | Audio Caption in a Car Setting with a Sentence-Level Loss | Conference | ISCSLP | |
Xun Gong, Zhengyang Chen, Yexin Yang, Shuai Wang, Lan Wang and Yanmin Qian | Speaker Embedding Augmentation with Noise Distribution Matching | Conference | ISCSLP | |
Shuai Wang, Yexin Yang, Yanmin Qian and Kai Yu | Revisiting the Statistics Pooling Layer in Deep Speaker Embedding Learning | Conference | ISCSLP | |
Chenda Li, Jing Shi, Wangyou Zhang, Aswin Shanmugam Subramanian, Xuankai Chang, Naoyuki Kamo, Moto Hira, Tomoki Hayashi, Christoph Boeddeker, Zhuo Chen and Shinji Watanabe | ESPnet-SE: end-to-end speech enhancement and separation toolkit designed for ASR integration | Conference | SLT | |
Chenda Li, Yi Luo, Cong Han, Jinyu Li, Takuya Yoshioka, Tianyan Zhou, Marc Delcroix, Keisuke Kinoshita, Christoph Boeddeker, Yanmin Qian, Shinji Watanabe and Zhuo Chen | Dual-path RNN for Long Recording Speech Separation | Conference | SLT | |
Chenpeng Du, Hao Li, Yizhou Lu, Lan Wang and Yanmin Qian | Data Augmentation for End-to-end Code-Switching Speech Recognition | Conference | SLT |
Chen Zhang, Daihui Peng, Lu Lv, Kaiming Zhuo, Kai Yu, Tian Shen, Yifeng Xu and Zhen Wang | Individual Perceived Stress Mediates Psychological Distress in Medical Workers During COVID-19 Epidemic Outbreak in Wuhan | Journal | Neuropsychiatric Disease and Treatment | |
Kai Yu, Rao Ma, Kaiyu Shi and Qi Liu | Neural Network Language Model Compression With Product Quantization and Soft Binarization | Journal | TASLP | |
Zhi Chen, Lu Chen, Xiaoyuan Liu and Kai Yu | Distributed Structured Actor-Critic Reinforcement Learning for Universal Dialogue Management | Journal | TASLP | |
Shuai Wang, Yexin Yang, Zhanghao Wu, Yanmin Qian and Kai Yu | Data Augmentation using Deep Generative Models for Embedding based Speaker Recognition | Journal | TASLP | |
Qi Liu, Zhehuai Chen, Hao Li, Mingkun Huang, Yizhou Lu and Kai Yu | Modular End-to-end Automatic Speech Recognition Framework for Acoustic-to-word Model | Journal | TASLP | |
Su Zhu, Ruisheng Cao and Kai Yu | Dual Learning for Semi-Supervised Natural Language Understanding | Journal | TASLP | |
Wangyou Zhang, Xuankai Chang, Yanmin Qian and Shinji Watanabe | Improving End-to-End Single-Channel Multi-Talker Speech Recognition | Journal | TASLP | |
Su Zhu, Zijian Zhao, Rao Ma and Kai Yu | Prior Knowledge Driven Label Embedding for Slot Filling in Natural Language Understanding | Journal | TASLP | |
Wangyou Zhang and Yanmin Qian | Learning Contextual Language Embeddings for Monaural Multi-Talker Speech Recognition | Conference | INTERSPEECH | |
Wangyou Zhang, Aswin Shanmugam Subramanian, Xuankai Chang, Shinji Watanabe and Yanmin Qian | End-to-End Far-Field Speech Recognition with Unified Dereverberation and Beamforming | Conference | INTERSPEECH | |
Zhijun Liu, Kuan Chen and Kai Yu | Neural Homomorphic Vocoder | Conference | INTERSPEECH | |
Yefei Chen, Heinrich Dinkel, Mengyue Wu and Kai Yu | Voice activity detection in the wild via weakly supervised sound event detection | Conference | INTERSPEECH | |
Chen Liu, Su Zhu, Zijian Zhao, Ruisheng Cao, Lu Chen and Kai Yu | Jointly Encoding Word Confusion Network and Dialogue Context with BERT for Spoken Language Understanding | Conference | INTERSPEECH | |
Yizhou Lu, Mingkun Huang, Hao Li, Jiaqi Guo and Yanmin Qian | Bi-encoder Transformer Network for Mandarin-English Code-switching Speech Recognition using Mixture of Experts | Conference | INTERSPEECH | |
Zhengyang Chen, Shuai Wang and Yanmin Qian | Adversarial Domain Adaptation for Speaker Verification Using Partially Shared Network | Conference | INTERSPEECH | |
Zhengyang Chen, Shuai Wang and Yanmin Qian | Multi-Modality Matters: A Performance Leap on VoxCeleb | Conference | INTERSPEECH | |
Hongji Wang, Heinrich Dinkel, Shuai Wang, Yanmin Qian and Kai Yu | Dual-Adversarial Domain Adaptation for Generalized Replay Attack Detection | Conference | INTERSPEECH | |
Chenda Li and Yanmin Qian | Listen, Watch and Understand at the Cocktail Party: Audio-Visual-Contextual Speech Separation | Conference | INTERSPEECH | |
Zihan Zhao, Yuncong Liu, Lu Chen, Qi Liu, Rao Ma, and Kai Yu | An Investigation on Different Underlying Quantization Schemes for Pre-trained Language Models | Conference | NLPCC | |
Zihan Xu, Zhi Chen, Lu Chen, Su Zhu and Kai Yu | Memory Attention Neural Network For Multi-Domain Dialogue State Tracking | Conference | NLPCC | |
Chen Liu, Su Zhu, Lu Chen and Kai Yu | Robust Spoken Language Understanding with RL-Based Value Error Recovery | Conference | NLPCC | |
Xuenan Xu, Heinrich Dinkel, Mengyue Wu and Kai Yu | A CRNN-GRU Based Reinforcement Learning Approach to Audio Captioning | Conference | DCASE | |
Rui Qian, Di Hu, Heinrich Dinkel, Mengyue Wu, Ning Xu and Weiyao Lin | Multiple Sound Sources Localization from Coarse to Fine | Conference | ECCV | |
Rui Qian, Di Hu, Heinrich Dinkel, Mengyue Wu, Ning Xu and Weiyao Lin | A Two-Stage Framework for Multiple Sound-Source Localization | Conference | CVPR | |
Lu Chen, Yanbin Zhao, Boer Lv, Lesheng Jin, Zhi Chen, Su Zhu and Kai Yu | Neural Graph Matching Networks for Chinese Short Text Matching | Conference | ACL | |
Yanbin Zhao, Lu Chen, Zhi Chen, Ruisheng Cao, Su Zhu and Kai Yu | Line Graph Enhanced AMR-to-Text Generation with Mix-Order Graph Attention Networks | Conference | ACL | |
Ruisheng Cao, Su Zhu, Chenyu Yang, Chen Liu, Rao Ma, Yanbin Zhao, Lu Chen and Kai Yu | Unsupervised Dual Paraphrasing for Two-stage Semantic Parsing | Conference | ACL | |
Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux and Shinji Watanabe | End-To-End Multi-Speaker Speech Recognition With Transformer | Conference | ICASSP | |
Heinrich Dinkel and Kai Yu | Duration Robust Weakly Supervised Sound Event Detection | Conference | ICASSP | |
Yexin Yang, Shuai Wang, Xun Gong, Yanmin Qian and Kai Yu | Text Adaptation for Speaker Verification with Speaker-Text Factorized Embeddings | Conference | ICASSP | |
Chenda Li and Yanmin Qian | Deep Audio-Visual Speech Separation with Attention Mechanism | Conference | ICASSP | |
Zhengyang Chen, Shuai Wang, Yanmin Qian and Kai Yu | Channel Invariant Speaker Embedding Learning with Joint Multi-Task and Adversarial Training | Conference | ICASSP | |
Chenpeng Du and Kai Yu | Speaker Augmentation for Low Resource Speech Recognition | Conference | ICASSP | |
Shuai Wang, Johan Rohdin, Lukáš Burget, Oldřich Plchot, Kai Yu and Jan Černocký | Investigation of SpecAugment for Deep Speaker Embedding Learning | Conference | ICASSP | |
Federico Landini, Shuai Wang, Mireia Diez, Lukáš Burget, Pavel Matějka, Kateřina Žmolíková, Ladislav Mošner, Anna Silnova, Oldřich Plchot, Ondřej Novotný, Hossein Zeinali and Johan Rohdin | BUT System for the Second DIHARD Speech Diarization Challenge | Conference | ICASSP | |
Mireia Diez, Lukáš Burget, Federico Landini, Shuai Wang and Honza Černocký | Optimizing Bayesian HMM Based X-Vector Clustering for the Second DIHARD Speech Diarization Challenge | Conference | ICASSP | |
Rao Ma, Hao Li, Qi Liu, Lu Chen and Kai Yu | Neural Lattice Search for Speech Recognition | Conference | ICASSP | |
Rao Ma, Lesheng Jin, Qi Liu, Lu Chen and Kai Yu | Addressing the Polysemy Problem in Language Modeling with Attentional Multi-Sense Embeddings | Conference | ICASSP | |
Lu Chen, Boer Lv, Chi Wang, Su Zhu, Bowen Tan and Kai Yu | Schema-Guided Multi-Domain Dialogue State Tracking with Graph Attention Neural Networks | Conference | AAAI | |
Yanbin Zhao, Lu Chen, Zhi Chen and Kai Yu | Semi-Supervised Text Simplification with Back-Translation and Asymmetric Denoising Autoencoders | Conference | AAAI |
Yanmin Qian and Xu Xiang | Binary Neural Networks for Speech Recognition | Journal | FITEE | |
Yanmin Qian, Hu Hu and Tian Tan | Data Augmentation Using Generative Adversarial Networks for Robust Speech Recognition | Journal | SC | |
Lu Chen, Zhi Chen, Bowen Tan, Sishan Long, Milica Gasic and Kai Yu | AgentGraph: Toward Universal Dialogue Management With Structured Deep Reinforcement Learning | Journal | TASLP | |
Shuai Wang, Zili Huang, Yanmin Qian and Kai Yu | Discriminative Neural Embedding Learning for Short-Duration Text-Independent Speaker Verification | Journal | TASLP | |
Xu Xiang, Shuai Wang, Houjun Huang, Yanmin Qian and Kai Yu | Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition | Conference | APSIPA | |
Xuankai Chang, Wangyou Zhang, Yanmin Qian, Jonathan Le Roux and Shinji Watanabe | MIMO-Speech: End-to-End Multi-Channel Multi-Speaker Speech Recognition | Conference | ASRU | |
Wangyou Zhang, Man Sun, Lan Wang and Yanmin Qian | End-to-End Overlapped Speech Detection and Speaker Counting with Raw Waveform | Conference | ASRU | |
Mingkun Huang, Yizhou Lu, Lan Wang, Yanmin Qian and Kai Yu | Exploring Model Units and Training Strategies for End-to-End Speech Recognition | Conference | ASRU | |
Rao Ma, Qi Liu and Kai Yu | Highly Efficient Neural Network Language Model Compression Using Soft Binarization Training | Conference | ASRU | |
Peiyao Sheng, Zhuolin Yang and Yanmin Qian | GANs for Children: A Generative Data Augmentation Strategy for Children Speech Recognition | Conference | ASRU | |
Zijian Zhao, Su Zhu and Kai Yu | Data Augmentation with Atomic Templates for Spoken Language Understanding | Conference | EMNLP | |
Yefei Chen, Shuai Wang, Yanmin Qian and Kai Yu | End-to-End Speaker-Dependent Voice Activity Detection | Conference | NCMMSC | |
Hao Li, Zhehuai Chen, Qi Liu, Yanmin Qian and Kai Yu | OOV Words Extension for Modular Neural Acoustics-to-Word Model | Conference | NCMMSC | |
Hao Li, Chen Liu, Su Zhu and Kai Yu | Robust Spoken Language Understanding with Acoustic and Domain Knowledge | Conference | ICMI | |
Su Zhu, Zijian Zhao, Tiejun Zhao, Chengqing Zong and Kai Yu | CATSLU: The 1st Chinese Audio-Textual Spoken Language Understanding Challenge | Conference | ICMI | |
Zhehuai Chen, Mahaveer Jain, Yongqiang Wang, Michael Seltzer and Christian Fuegen | Joint Grapheme and Phoneme Embeddings for Contextual End-to-End ASR | Conference | InterSpeech | |
Jiaqi Guo, Yongbin You, Yanmin Qian and Kai Yu | Joint Decoding of CTC Based Systems for Speech Recognition | Conference | InterSpeech | |
Chenda Li and Yanmin Qian | Prosody Usage Optimization for Children Speech Recognition with Zero Resource Children Speech | Conference | InterSpeech | |
Wangyou Zhang, Xuankai Chang and Yanmin Qian | Knowledge Distillation for End-to-End Monaural Multi-talker ASR System | Conference | InterSpeech | |
Wangyou Zhang, Ying Zhou and Yanmin Qian | Robust DOA Estimation Based on Convolutional Neural Network and Time-Frequency Masking | Conference | InterSpeech | |
Zhanghao Wu, Shuai Wang, Yanmin Qian and Kai Yu | Data Augmentation using Variational Autoencoder for Embedding based Speaker Verification | Conference | InterSpeech | |
Yexin Yang, Hongji Wang, Heinrich Dinkel, Zhengyang Chen, Shuai Wang, Yanmin Qian and Kai Yu | The SJTU Robust Anti-spoofing System for the ASVspoof 2019 Challenge | Conference | InterSpeech | |
Hongji Wang, Heinrich Dinkel, Shuai Wang, Yanmin Qian and Kai Yu | Cross-domain Replay Spoofing Attack Detection using Domain Adversarial Training | Conference | InterSpeech | |
Shuai Wang, Johan Rohdin, Lukáš Burget, Oldřich Plchot, Yanmin Qian, Kai Yu and Jan Černocký | On the Usage of Phonetic Information for Text-independent Speaker Embedding Extraction | Conference | InterSpeech | |
Mireia Diez, Lukáš Burget, Shuai Wang, Johan Rohdin and Jan Černocký | Bayesian HMM based x-vector clustering for Speaker Diarization | Conference | InterSpeech | |
Ruisheng Cao, Su Zhu, Chen Liu, Jieyu Li and Kai Yu | Semantic Parsing with Dual Learning | Conference | ACL | |
Mengyue Wu, Heinrich Dinkel and Kai Yu | Audio Caption: Listen and Tell | Conference | ICASSP | |
Zhehuai Chen, Mahaveer Jain, Yongqiang Wang, Michael Seltzer and Christian Fuegen | End-to-end Contextual Speech Recognition using Class Language Models and a Token Passing Decoder | Conference | ICASSP | |
Zijian Zhao, Su Zhu and Kai Yu | A Hierarchical Decoding Model for Spoken Language Understanding from Unaligned Data | Conference | ICASSP | |
Shuai Wang, Yexin Yang, Tianzhe Wang, Yanmin Qian and Kai Yu | Knowledge Distillation for Small Foot-print Deep Speaker Embedding | Conference | ICASSP | |
Xuankai Chang, Yanmin Qian, Kai Yu and Shinji Watanabe | End-to-end Monaural Multi-speaker ASR System without Pretraining | Conference | ICASSP |
Zhehuai Chen, Jasha Droppo, Jinyu Li, Wayne Xiong | Progressive Joint Modeling in Unsupervised Single-channel Overlapped Speech Recognition | Journal | TASLP | |
Tian Tan, Yanmin Qian, Hu Hu, Wen Ding, Ying Zhou, Kai Yu | Adaptive very deep convolutional residual network for noise robust speech recognition | Journal | TASLP | |
Kai Yu, Zijian Zhao, Xueyang Wu, Hongtao Lin and Xuan Liu | Rich Short Text Conversation Using Semantic Key Controlled Sequence Generation | Journal | TASLP | |
Xuan Liu, Di Cao and Kai Yu | Binarized LSTM Language Model | Conference | NAACL | |
Ying Zhou and Yanmin Qian | Robust Mask Estimation by Integrating Neural Network-Based and Clustering-Based Approaches for Adaptive Acoustic Beamforming | Conference | ICASSP | |
Lu Chen, Cheng Chang, Zhi Chen, Bowen Tan, Milica Gasic and Kai Yu | Policy Adaptation for Deep Reinforcement Learning-based Dialogue Management | Conference | ICASSP | |
Shuai Wang, Yanmin Qian and Kai Yu | Focal KL-Divergence based Dilated Convolutional Neural Networks for Co-channel Speaker Identification | Conference | ICASSP | |
Ruinian Chen and Kai Yu | Fast OOV Words Incorporation using Structured Word Embeddings for Neural Network Language Model | Conference | ICASSP | |
Hu Hu, Tian Tan and Yanmin Qian | Generative Adversarial Networks based Data Augmentation for Noise Robust Speech Recognition | Conference | ICASSP | |
Wen Ding, Tian Tan and Yanmin Qian | Fast Adaptation on Deep Mixture Generative Network Based Acoustic Modeling | Conference | ICASSP | |
Xuankai Chang, Yanmin Qian and Dong Yu | Adaptive Permutation Invariant Training with Auxiliary Information for Monaural Multi-Talker Speech Recognition | Conference | ICASSP | |
Tian Tan, Yanmin Qian and Dong Yu | Knowledge Transfer in Permutation Invariant Training for Single-channel Multi-talker Speech Recognition | Conference | ICASSP | |
Zhehuai Chen, Qi Liu, Hao Li, Kai Yu | On Modular Training of Neural Acoustics-to-word Model for LVCSR | Conference | ICASSP | |
Zhehuai Chen, Jasha Droppo | Sequence Modeling in Unsupervised Single-channel Overlapped Speech Recognition | Conference | ICASSP | |
Su Zhu, Ouyu Lan, Kai Yu | Robust Spoken Language Understanding with Unsupervised ASR-error Adaptation | Conference | ICASSP | |
Ouyu Lan, Su Zhu, Kai Yu | Semi-Supervised Training Using Adversarial Multi-Task Learning for Spoken Language Understanding | Conference | ICASSP | |
Zili Huang, Shuai Wang and Yanmin Qian | Joint I-Vector with End-to-End System for Short Duration Text-Independent Speaker Verification | Conference | ICASSP |
Yanmin Qian, Nanxin Chen, Heinrich Dinkel and Zhizheng Wu | Deep Feature Engineering for Noise Robust Spoofing Detection | Journal | TASLP | |
Zhehuai Chen, Yimeng Zhuang, Yanmin Qian and Kai Yu | Phone Synchronous Speech Recognition with CTC Lattices | Journal | TASLP | |
Ruinian Chen, Ying Zhou and Yanmin Qian | Emotion Recognition Using Support Vector Machine and Deep Neural Network | Conference | NCMMSC | |
Cheng Chang, Huifeng Zhang, Zhangxuan Gu and Yanmin Qian | Fusion Model for Speech Emotion Recognition with Low Level Descriptor Features | Conference | NCMMSC | |
Yue Wu, Qi Liu, Kai Yu | The adaptive adjustment of learning rate is applied in the language model | Conference | NCMMSC | |
Xuan Liu, Kai Yu | GLEU-Guided Multi-resolution Network for Short Text Conversation | Conference | NCMMSC | |
Kaiyu Shi, Xuan Liu and Yanmin Qian | Speech Emotion Recognition Based on SVM and GMM-HMM Hybrid System | Conference | NCMMSC | |
Yue Wu, Tianxing He, Zhehuai Chen, Yanmin Qian, Kai Yu | Multi-view LSTM Language Model with Word-synchronized Auxiliary Feature for LVCSR | Conference | CCL | |
Zhehuai Chen, Yanmin Qian, and Kai Yu | A unified confidence measure framework using auxiliary normalization graph | Conference | IScIDE | |
Di Cao and Kai Yu | Deep Attentive Structured Language Model Based on LSTM | Conference | IScIDE | |
Xiaowei Jiang, Shuai Wang, Xu Xiang and Yanmin Qian | Integrating Online i-vector into GMM-UBM for Text-dependent Speaker Verification | Conference | APSIPA | |
Qi Liu, Yanmin Qian and Kai Yu | Future Vector Enhanced LSTM Language Model for LVCSR | Conference | ASRU | |
Lu Chen, Xiang Zhou, Cheng Chang, Runzhe Yang and Kai Yu | Agent-Aware Dropout DQN for Safe and Efficient On-line Dialogue Policy Learning | Conference | EMNLP | |
Cheng Chang, Runzhe Yang, Lu Chen, Xiang Zhou and Kai Yu | Affordable On-line Dialogue Policy Learning | Conference | EMNLP | |
Xu Xiang, Yanmin Qian and Kai Yu | Binary Deep Neural Networks for Speech Recognition | Conference | InterSpeech | |
Dong Yu, Xuankai Chang and Yanmin Qian | Recognizing Multi-Talker Speech with Permutation Invariant Training | Conference | InterSpeech | |
Bo Chen, Jiahao Lai and Kai Yu | Comparison of Modeling Target in LSTM-RNN Duration Model | Conference | InterSpeech | |
Bo Chen, Tianling Bian and Kai Yu | Discrete Duration Model For Speech Synthesis | Conference | InterSpeech | |
Shuai Wang, Yanmin Qian and Kai Yu | What Does the Speaker Embedding Encode? | Conference | InterSpeech | |
Heinrich Dinkel, Yanmin Qian, Kai Yu | Small-footprint convolutional neural network for spoofing detection | Conference | IJCNN | |
Lu Chen, Runzhe Yang, Cheng Chang, Zihao Ye, Xiang Zhou and Kai Yu | On-line Dialogue Policy Learning with Companion Teaching | Conference | EACL | |
Heinrich Dinkel, Nanxin Chen, Yanmin Qian and Kai Yu | End-To-End Spoofing Detection With Raw Waveform CLDNNs | Conference | ICASSP | |
Su Zhu and Kai Yu | Encoder-decoder with Focus-mechanism for Sequence Labelling Based Spoken Language Understanding | Conference | ICASSP | |
Zhehuai Chen, Yimeng Zhuang and Kai Yu | Confidence Measures for CTC-based Phone Synchronous Decoding | Conference | ICASSP |