JFL484H1: Computational Tools for Language Corpora

24L/12T

This is a practical course dealing with digital collections of written language or transcribed oral language (corpora). It will introduce the practical steps involved in building digital text corpora (text normalization, digitizing, tag set construction, and so on) and provide an understanding of, and hands-on experience with, fundamental techniques from computational linguistics and natural language processing, including techniques using machine learning, such as part-of-speech tagging, language models, and vector semantics. By working with real corpora, students will use these techniques to construct and defend hypotheses about texts, about languages, and about human language in general. Emphasis will be placed on using French-language corpora from several periods (including medieval and modern) to situate French historically and in the Canadian context. Students will also be given opportunities to work with other languages. Lectures will be in English, and students will take tutorials in either English or French.

Priority enrolment is given to students in major or specialist programs in Linguistics or French, or in the Applied Data Science minor.


The Physical and Mathematical Universes (5)