Universität Wien

136102 UE Statistics and Machine Learning for (Computational) Linguists (2025S)

Continuous assessment of course work

Registration/Deregistration

Note: The time of your registration within the registration period has no effect on the allocation of places (no first come, first served).

Details

max. 25 participants
Language: English

Lecturers

Classes

  • Friday 14.03. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 21.03. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 28.03. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 04.04. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 11.04. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 02.05. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 09.05. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 16.05. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 23.05. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 30.05. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 06.06. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 13.06. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 20.06. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5
  • Friday 27.06. 09:45 - 11:15 Seminarraum 6 Hauptgebäude, Tiefparterre Stiege 9 Hof 5

Information

Aims, contents and method of the course

This course gives a basic introduction to statistical methods for (computational) linguists.

Requirements:
• Computer literacy (e.g. Computational Background Skills for Digital Humanities (EC))
• Introduction to DH Tools and Methods (Skills I)
• Data Structures and Data Management in the Humanities (Skills I)

The course covers the following topics:

Descriptive Statistics
• Levels of measurement
• Measures of central tendency
• Normal distribution
• Correlation (both Pearson and Spearman)
• Inter-rater reliability (Cohen’s kappa, Fleiss’ kappa)
Inferential Statistics
• Concept of statistical significance testing
Basic probability theory
• Naïve Bayes
• Pointwise Mutual Information
Machine Learning
• Introduction to the concept of supervised machine learning
• Overfitting
• Logistic Regression
• Feature Engineering
• Vector space models/word embeddings

Assessment and permitted materials

Assessment combines in-class participation (20%) and homework assignments (80%).

There are 3 types of exercises in this course:
• theoretical questions
• pen-and-paper calculation exercises
• programming tasks (Python!)

Minimum requirements and assessment criteria

Attendance is required, and regular participation is key to completing the course. All students must provide their own computing environment. Homework assignments must be submitted on time.

Examination topics

There is no examination for the course.

Reading list

Christopher Butler: Statistics in Linguistics, 1985.

Association in the course directory

DH-S II
S-DH Cluster I: Language and Literature

Last modified: We 29.01.2025 14:06