Methods from machine learning are indispensable tools for computational studies of language. This seminar covers some of the important concepts and a number of prominent machine learning methods, ranging from early foundational methods to current state-of-the-art techniques. The objectives of the course are twofold. First, the knowledge gained during the course will aid students in understanding the literature in computational linguistics and related fields, where the majority of work involves applications of machine learning methods. Second, after completing this course, students should be able to choose the right machine learning techniques and apply them correctly in their own work.

The course assumes basic programming skills and the ability to process linguistic data (the 'Text Technology' course, or equivalent coursework or experience, is required). Although our focus will be on intuitive explanations and practical exercises, students should be prepared to digest some mathematical notation. Some of the foundational topics, such as probability theory and statistics, will be introduced during the first lectures.

This course provides practical training in the use of modern regression
techniques for understanding linguistic and psycholinguistic data.  In the
first part of the course, the standard linear model is introduced, with special
attention to model diagnostics, methods for dealing with collinearity, the
dummy coding of factors, and the use of link functions.  The second part of the
course introduces the linear mixed-effects model, which is essential for
modeling data sets with repeated observations for predictors such as
participants in experiments, and linguistic units such as words, sentences, or
texts.  The focus in this part of the course will be on the interpretation of
the parameters for these so-called random-effect factors.  The third part of
the course moves on to generalized additive models, a relatively recent
development in regression modeling that makes it possible to capture nonlinear
relations between predictors and the response variable, including wiggly curves
and wiggly (hyper)surfaces.
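The dummy coding of factors mentioned above can be sketched in a few lines; the factor levels, reference level, and data below are illustrative examples, not course materials.

```python
# Minimal sketch: treatment (dummy) coding of a categorical factor.
# Each non-reference level gets its own 0/1 indicator column; the
# reference level is coded as all zeros, so the model intercept
# represents the mean of the reference level.

def dummy_code(values, reference):
    """Return treatment-coded rows and the non-reference levels."""
    levels = sorted(set(values) - {reference})
    rows = [[1 if v == lvl else 0 for lvl in levels] for v in values]
    return rows, levels

conditions = ["noun", "verb", "adjective", "noun", "verb"]
rows, levels = dummy_code(conditions, reference="noun")
for cond, row in zip(conditions, rows):
    print(cond, row)
```

With "noun" as the reference level, the two indicator columns correspond to "adjective" and "verb", and their coefficients in a fitted linear model estimate the difference of each level from the reference.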

This course introduces a series of cognitive models that address the question
of how language is processed during reading, listening, and speaking.  Classic
computational models, both connectionist and symbolic, are discussed, as well
as more recent Bayesian approaches, models using ACT-R, and
approaches that make use of discrimination learning.  The course covers models
for single-word processing, including the comprehension of morphologically
complex words in the auditory and visual modalities, models addressing
syntactic processing, and models developed to account for semantic
effects in speech production.  Students will be familiarized with the computational
implementations of those models for which code is publicly available.
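Discrimination-learning models of the kind mentioned above are commonly built on the Rescorla-Wagner update rule. A minimal sketch of that rule follows; the cues, outcomes, and learning rate are illustrative assumptions, not taken from any particular course model.

```python
# Minimal sketch of Rescorla-Wagner error-driven learning, the update
# rule underlying discriminative-learning models of word processing.
from collections import defaultdict

def rw_update(weights, cues, outcomes, present, lam=1.0, rate=0.1):
    """One learning event: nudge cue->outcome weights toward the observed outcomes."""
    for o in outcomes:
        activation = sum(weights[(c, o)] for c in cues)
        target = lam if o in present else 0.0  # lambda for present outcomes, 0 otherwise
        error = target - activation
        for c in cues:
            weights[(c, o)] += rate * error

weights = defaultdict(float)
outcomes = ["PLURAL", "SINGULAR"]
# Illustrative learning events: (cues present, outcomes present).
# "s#" stands for a word-final -s cue; the pairing with PLURAL is the example's assumption.
events = [({"hand", "s#"}, {"PLURAL"}), ({"hand"}, {"SINGULAR"})] * 50
for cues, present in events:
    rw_update(weights, cues, outcomes, present)

# After training, the suffix cue "s#" should be the stronger predictor of PLURAL.
print(round(weights[("s#", "PLURAL")], 2))
```

Because the ambiguous cue "hand" occurs with both outcomes while "s#" occurs only with PLURAL, the weights discriminate: the suffix cue ends up carrying most of the association with PLURAL.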