On June 20, 2017, Prof. Hui Jiang from York University visited and gave a talk titled “A New General Deep Learning Approach for Natural Language Processing”.

Hui Jiang received B.Eng. and M.Eng. degrees from the University of Science and Technology of China (USTC), China, and a Ph.D. degree from the University of Tokyo, Japan, all in electrical engineering. Since 2002, he has been with the Department of Electrical Engineering and Computer Science, York University, Toronto, Canada, initially as an assistant professor, then as an associate professor, and currently as a full professor. His current research interests include machine learning, especially deep learning and neural networks, with applications to speech and audio processing, natural language processing and computer vision. He served as an associate editor for IEEE Trans. on Audio, Speech and Language Processing (T-ASLP) from 2009 to 2013, and has served on the technical committees of several international conferences. He recently received the 2016 IEEE Signal Processing Society (SPS) Best Paper Award.

The abstract of the talk, “A New General Deep Learning Approach for Natural Language Processing”, follows. Word embedding techniques, which represent each discrete word as a dense vector in a continuous high-dimensional space, have succeeded in many natural language processing (NLP) tasks. However, many more NLP tasks rely on modelling variable-length sequences of words, not just isolated words. The conventional approach is to formulate these NLP tasks as sequence labelling problems and apply conditional random fields (CRF), convolutional neural networks (CNN) and recurrent neural networks (RNN). In this talk, I will introduce a new, general deep learning approach applicable to almost all NLP tasks, not limited to sequence labelling problems. The proposed method is built upon a simple but theoretically guaranteed lossless encoding method, named fixed-size ordinally-forgetting encoding (FOFE), which can almost uniquely encode any variable-length word sequence into a fixed-size representation [1]. Next, simple feedforward neural networks are used as universal function approximators to map fixed-size FOFE codes to various NLP targets. This framework is appealing because it is elegant and well-founded in theory, yet fairly easy and fast to train in practice. It is entirely data-driven, requires no feature engineering, and is equally applicable to a wide range of NLP tasks. I will also present our recent work applying this approach to several important NLP tasks, such as word embedding, language modelling [1], named entity recognition (NER) and mention detection [2], coreference resolution and text categorization. Experiments have shown that the proposed approach yields strong performance in all examined tasks, including Google 1-billion-word language modelling, the KBP EDL contests, and the Pronoun Disambiguation Problem (PDP) in the Winograd Schema Challenge [3].
Finally, as ongoing work, we are investigating this approach for more NLP problems, such as factoid Q/A, word sense disambiguation (WSD), parsing and machine translation.
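
To make the encoding step in the abstract concrete, here is a minimal sketch of FOFE based on the recursion given in [1], z_t = alpha * z_(t-1) + e_t, where e_t is the one-hot vector of the t-th word and alpha is a forgetting factor between 0 and 1. The function name `fofe_encode`, the three-word vocabulary and the choice alpha = 0.5 are assumptions made for illustration only; this is not the authors' implementation.

```python
from typing import Dict, List, Sequence


def fofe_encode(
    tokens: Sequence[str], vocab: Dict[str, int], alpha: float = 0.5
) -> List[float]:
    """Encode a word sequence with the recursion z_t = alpha * z_{t-1} + e_t.

    `vocab` maps each word to its index; `alpha` is the forgetting factor.
    Both the toy vocabulary and alpha = 0.5 below are illustrative choices.
    """
    code = [0.0] * len(vocab)  # z_0 is the all-zero vector
    for token in tokens:
        # Decay everything seen so far, then add the one-hot vector of the new word.
        code = [alpha * value for value in code]
        code[vocab[token]] += 1.0
    return code


vocab = {"A": 0, "B": 1, "C": 2}
short_code = fofe_encode(["A", "B", "C"], vocab)
long_code = fofe_encode(["A", "B", "C", "B", "C"], vocab)

print(short_code)  # [0.25, 0.5, 1.0]
print(long_code)   # [0.0625, 0.625, 1.25]
```

Both sequences map to vectors of the same length (the vocabulary size), even though one has three words and the other five. Such fixed-size codes are what the feedforward network described in the abstract would take as input.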


[1] S. Zhang, H. Jiang, M. Xu, J. Hou and L. Dai, “The Fixed-Size Ordinally-Forgetting Encoding Method for Neural Network Language Models,” Proc. of ACL 2015.

[2] M. Xu, H. Jiang and S. Watcharawittayakul, “A FOFE-based Local Detection Approach for Named Entity Recognition and Mention Detection,” Proc. of ACL 2017.

[3] Q. Liu, H. Jiang, A. Evdokimov, Z. Ling, X. Zhu, S. Wei and Y. Hu, “Cause-Effect Knowledge Acquisition and Neural Association Model for Solving A Set of Winograd Schema Problems,” Proc. of IJCAI 2017.
