This course surveys the principal difficulties of working with written language data, the fundamental techniques used in processing natural language, and the core applications of NLP technology. Topics covered include language modeling, text classification, labeling sequential data (tagging), parsing, information extraction, question answering, machine translation, and semantics. The dominant paradigm in contemporary NLP uses supervised machine learning to train models based on either probability theory or deep neural networks. Both formalisms will be covered. The course emphasizes a practical approach, and students will write programs and use open source toolkits to solve a variety of problems.

Course prerequisite(s): 

There are no formal prerequisites, though having taken any of 605.649 Introduction to Machine Learning, 605.744 Information Retrieval, or 605.645 Artificial Intelligence is helpful.

Course instructor(s):