Prospectus

Python for Linguists

Course
2022-2023

Admission requirements

None.

Description

Scientists have always built their own research tools. Linguists are no different, and many of our tools take the form of software: programs that let us collect data, analyze it, run experiments, or simulate aspects of language processing. Python is arguably the most accessible and most popular programming language in which such tools can be created.

This course introduces Linguistics students to Python with the help of a large collection of tailor-made exercises. The exercises focus on incremental discovery of the Python language, development of practical skills (how to get stuff done), conceptual understanding (why to do it that way), and acquisition of powerful coding habits and ways of thinking. In addition, a series of classroom coding adventures will acquaint students with a selection of more advanced topics from the primary computational linguistics literature.

Course objectives

  • Students will be able to code Python programs that read, analyze and/or write textual and tabular data, involving such steps as tokenization, parsing, counting, searching, sorting, aggregating, sampling and plotting.

  • Students will be able to explain, in outline, how a computer interprets Python code, including control flow (loops, break, continue, if-else), function calls, stack traces and variable scope.

  • Students will be able to decompose a larger task into sub-tasks, and solve those by choosing appropriate built-in functions (e.g., range, enumerate, zip) and data structures (e.g., list, dictionary, tuple, set) as well as defining their own reusable functions.

  • Students will be able to use some existing libraries, such as TextBlob and spaCy for language processing, and pandas, scikit-learn and seaborn for data analysis, and will also be able to find and learn to use additional libraries on their own, with the help of documentation.

  • Students will gain basic familiarity with selected topics and approaches from the primary research literature in computational linguistics, such as distributional semantics, dependency parsing, sentiment analysis, and probabilistic language generation.
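As a rough illustration of the kind of program the first and third objectives describe, the sketch below tokenizes a short text, counts and sorts the tokens, and uses enumerate and zip. It is not taken from the course materials: the function names (tokenize, top_words), the sample sentence and the very simple tokenization rule are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Assumption: a token is any run of letters or apostrophes.
    Real tokenizers are considerably more careful than this.
    """
    return re.findall(r"[a-z']+", text.lower())


def top_words(text, n=3):
    """Return the n most frequent tokens; ties are broken alphabetically."""
    counts = Counter(tokenize(text))
    return sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))[:n]


# Hypothetical sample text, chosen only for this illustration.
sample = "The cat saw the dog. The dog saw the bird."
tokens = tokenize(sample)
ranking = top_words(sample)

# enumerate numbers the ranked words, starting from 1.
for rank, (word, freq) in enumerate(ranking, start=1):
    print(rank, word, freq)

# zip pairs each token with its successor, which yields the bigrams.
bigrams = list(zip(tokens, tokens[1:]))
```

Splitting the task into two small functions mirrors the decomposition into sub-tasks mentioned in the objectives: each function can be tested and reused on its own.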

Timetable

The timetables are available through My Timetable.

Mode of instruction

Seminar

Assessment method

Assessment

There will be two written exams, one halfway through the course and one at the end, each with a mix of closed questions, short open questions and short programming exercises. Throughout the course, portions of the homework will be marked as mandatory, to be submitted for a simple pass/fail grade. Students must pass at least 80% of these assignments to pass the course.

Weighting

Your final grade will be computed as the average of the two exams. If you pass fewer than 80% of the mandatory homework assignments, your final grade is capped at 5.0 (fail).
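The grading rule can be written out as a short calculation. This is only a sketch of the rule as stated on this page: the function name final_grade and its parameters are hypothetical, the 80% threshold is taken from the Assessment section, and no rounding is applied because the page does not specify any.

```python
def final_grade(exam1, exam2, homework_passed, homework_total):
    """Average the two exam grades, capped at 5.0 if too little homework was passed.

    Assumption: 'insufficient' means fewer than 80% of the mandatory
    assignments were passed, as stated in the Assessment section.
    """
    average = (exam1 + exam2) / 2
    # Integer comparison avoids floating-point surprises around the 80% mark.
    enough_homework = homework_passed * 100 >= 80 * homework_total
    return average if enough_homework else min(average, 5.0)
```

For example, exam grades of 7.0 and 8.0 with 8 of 10 mandatory assignments passed give 7.5, while the same exam grades with only 7 of 10 passed give 5.0.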

Resit

A single resit will be offered for the two written exams jointly, at the end of the course. Resitting only one of the written exams is not possible. A resit for the mandatory portion of the homework will be offered in the form of a substantial programming assignment at the end of the course.

Inspection and feedback

How and when the exam review will take place will be announced no later than the publication of the exam results. If a student requests a review within 30 days after publication of the exam results, an exam review will be organized.

Reading list

An extensive collection of exercises and notes will be provided by the instructor.

Registration

Enrolment through My Studymap is mandatory.

Contact

  • For substantive questions, contact the lecturer listed in the right information bar.

  • For questions about enrolment, admission, etc., contact the Education Administration Office: Reuvensplaats

Remarks

Not applicable.