This course provides practical training in the use of modern regression techniques for understanding linguistic and psycholinguistic data. In the first part of the course, the standard linear model is introduced, with special attention to model diagnostics, methods for dealing with collinearity, the dummy coding of factors, and the use of link functions. The second part of the course introduces the linear mixed-effects model, which is essential for modeling data sets with repeated observations for predictors such as participants in experiments, and linguistic units such as words, sentences, or texts. The focus in this part of the course will be on the interpretation of the parameters for these so-called random-effect factors. The third part of the course moves on to generalized additive models, a relatively recent development in regression modeling that makes it possible to capture nonlinear relations between predictors and the response variable, including wiggly curves and wiggly (hyper)surfaces.

Each class consists of a lecture introducing basic concepts and methods, followed by a hands-on lab session in which participants receive training in using the R statistical programming environment. Data sets discussed in the lab sessions range from dialectometry to eye movements and from reaction time data to evoked response potentials. By the end of this course, participants will be able to apply state-of-the-art methods in regression to their own data sets, as well as critically evaluate analyses reported in the literature.

- Instructor: Rolf Harald Baayen

Mathematical methods are essential for understanding and working in theoretical and computational linguistics. This course introduces the key concepts from the areas of set theory, algebra and logic, which belong to the basic repertoire of linguistic methods. The main goal of the course is to provide the students with sufficient competence in basic notations, terminology and concepts of discrete mathematics for their studies in theoretical and computational linguistics. Familiarity with concepts such as sets, functions and propositions, and the ability to work with simple proof techniques are a crucial prerequisite for subsequent courses.

Students should acquire sufficient competence in basic notations,
terminology and concepts of mathematics for their studies in
linguistics. The topics of the course comprise the most essential
mathematical notions needed in general linguistics, computational
linguistics, document processing and information management. Familiarity
with concepts such as sets, functions and propositions, and the ability
to work with simple proof techniques will be expected in subsequent
courses. The main purpose of the course is to equip the participants
with the most basic mathematical tools which they will need in their
linguistics courses.

- Instructor: Petrus Hendrix
- Instructor: Julia Krings
- Instructor: Konstantin Sering

Given that natural languages cannot be characterized by simply listing all possible sentences and their meaning, a range of grammar formalisms have been developed to characterize form and meaning in a general and compact way. The approaches differ in terms of their focus, empirical coverage, formal foundations, expressive power, conceptualization of generalizations, and the processing regimes that have been developed for those formalisms.

After a general overview of grammar types in the Chomsky Hierarchy, we will discuss plain context-free grammars as a baseline against which we will introduce and compare several current grammar formalisms. The plan is to discuss unification-based phrase structure grammars and dependency grammars such as Head-Driven Phrase Structure Grammar (HPSG), Lexical Functional Grammar (LFG), and Slot Grammar and, if time allows, others such as Categorial Grammar. The focus will be on obtaining a sound working knowledge of how different formalisms capture some of the fundamental phenomena of natural language syntax: argument and adjunct realization, agreement and government, middle-distance phenomena (e.g., equi, raising), and long-distance phenomena (e.g., fronting).

- Instructor: Natalie Sonja Désirée Clarius
- Instructor: Kurt Eberle

Data structures and algorithms are core topics in linguistic programming. Data structures are used to store and retrieve data, and algorithms are the recipes used to process it. This course emphasizes the understanding and Java implementation of basic data structures such as linked lists and trees, together with the algorithms used to store and retrieve the information they hold. We will see how these data structures are used in natural language processing programs.

- Instructor: Jochen Saile

Texts in digital form are an essential prerequisite for any subsequent analysis. The course offers a multi-faceted perspective on how texts are represented in computers, with topics including (among others) character encodings (e.g. UTF-8), text structuring and data modeling (e.g. XML, HTML), text licensing (e.g. Creative Commons licenses), text visualization (e.g. CSS), and text querying tools (e.g. XQuery). The course combines theoretical discussion with a practical approach that illustrates the concepts.

- Instructor: Xiaobin Chen
- Instructor: Björn Rudzewitz

- Instructor: Ching-Chu Sun

This course introduces a number of core methods and applications in natural language processing (NLP). On the one hand, the course will focus on core tasks in computational linguistics (e.g., part-of-speech tagging and statistical parsing) and major NLP applications (e.g., named entity recognition or machine translation); on the other hand, a selection of relevant concepts and methods from probability theory, statistics, and machine learning will be introduced.

The course is compulsory for the BA degree International Studies in Computational Linguistics. For other degree programs, please contact the instructor before signing up.

- Instructor: Cagri Cöltekin
- Instructor: Kuan Yu