Prospectus

Natural Language Processing

Course
2024-2025

Admission requirements

  • Recommended prior knowledge

The list below indicates useful prior knowledge for this course and the undergraduate courses where you may have obtained it:

  • Working knowledge of Python (Introduction to Programming)

  • Linear algebra (Linear Algebra for Computer Scientists 1 & 2)

  • Some knowledge of grammars and tree structures (Automata theory, Algorithms and Data structures)

  • Basic knowledge of machine learning (Machine Learning)

Description

Natural Language Processing (NLP) is a rapidly evolving field at the intersection of computer science, linguistics, and artificial intelligence. This undergraduate course provides an in-depth exploration of NLP techniques and methodologies.

The course focusses on the fundamental concepts and tools of NLP, distinguishing between theoretical concepts and their implementation. On the theoretical side, students learn about parts of speech, constituency and dependency parsing, and distributional and compositional semantics. On the implementation side, the course covers statistical and neural approaches to language modelling, sequence labelling and parsing, as well as selected applications. The course additionally discusses recurrent neural network architectures for NLP and the transformer models that form the basis for large language models, covering encoder models like BERT and decoder models like GPT-n in detail.

In the course, students get acquainted with the material by means of theoretical exercises combined with practical programming challenges.

Course objectives

Upon completion of the course, students can:

  • Demonstrate a theoretical understanding of the broad field of NLP

  • Understand the distinction and use of formal, statistical and neural approaches in NLP

  • Describe technical details of NLP techniques like n-gram language modelling, part-of-speech tagging and constituency parsing

  • Implement a statistical language model, part-of-speech tagger and parser in Python

  • Describe technical details of NLP techniques like distributional semantics and compositional semantics

  • Partially implement and apply an RNN on an NLP task, such as sentiment analysis or machine translation

  • Fine-tune and evaluate a neural language model (e.g. BERT) on an NLP sequence tagging or sequence classification task, such as part-of-speech tagging, natural language inference or review classification

Timetable

In MyTimetable, you can find all course and programme schedules, allowing you to create your personal timetable. Activities for which you have enrolled via MyStudyMap will automatically appear in your timetable.

Additionally, you can easily link MyTimetable to a calendar app on your phone, and schedule changes will be automatically updated in your calendar. You can also choose to receive email notifications about schedule changes. You can enable notifications in Settings after logging in.

Questions? Watch the video, read the instructions, or contact the ISSC helpdesk.

Note: Joint Degree students from Leiden/Delft need to combine information from both the Leiden and Delft MyTimetables to see a complete schedule. This video explains how to do it.

Mode of instruction

Lectures, background reading, assignments and preparatory exercises.

Assessment method

  • a written individual exam, closed book (50% of course grade)

  • practical assignments (50% of course grade)

    • two assignments of 15%
    • one assignment of 20%

To complete the course, the grade for the written exam must be 5.5 or higher, and the average grade for the practical assignments must also be 5.5 or higher. If one of the tasks is not submitted, the grade for that task is 0. Each assignment has a re-sit opportunity (a later submission). The maximum grade for a re-sit assignment is 6.
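
As an illustration only, the weighting and pass conditions above can be sketched in Python. This is not an official grade calculator. The function name `course_result` and the order of the assignment weights are hypothetical, and the sketch assumes that the assignment average is weighted by the stated percentages and that no rounding is applied; the prospectus specifies neither.

```python
# Weights taken from the assessment list above.
EXAM_WEIGHT = 0.50
ASSIGNMENT_WEIGHTS = (0.15, 0.15, 0.20)  # order is an assumption
PASS_MARK = 5.5


def course_result(exam, assignments):
    """Return (final_grade, passed) for one exam grade and three assignment grades.

    Assumptions (not stated in the prospectus): the assignment average is
    weighted by the assignment percentages, and no rounding is applied.
    An assignment that was not submitted is given as None and counts as 0.
    """
    if len(assignments) != len(ASSIGNMENT_WEIGHTS):
        raise ValueError("expected three assignment grades")
    grades = [0.0 if g is None else g for g in assignments]
    # Contribution of the assignments to the course grade (at most 5 points).
    weighted = sum(w * g for w, g in zip(ASSIGNMENT_WEIGHTS, grades))
    assignment_average = weighted / sum(ASSIGNMENT_WEIGHTS)
    final_grade = EXAM_WEIGHT * exam + weighted
    # Both the exam and the assignment average must reach the pass mark.
    passed = exam >= PASS_MARK and assignment_average >= PASS_MARK
    return final_grade, passed
```

For example, `course_result(7.0, [6.0, 8.0, 7.0])` combines an exam grade of 7.0 with three assignment grades. A re-sit assignment would be entered with its grade capped at 6 before calling the function.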

The teacher will inform the students how the inspection and follow-up discussion of the exams will take place.

Reading list

The majority of the course literature comes from Dan Jurafsky and James H. Martin, Speech and Language Processing (3rd ed.), freely available online at https://web.stanford.edu/~jurafsky/slp3/.
For a few lectures, additional material will be made available through the course webpage.

Registration

As a student, you are responsible for enrolling on time through MyStudyMap.

In this short video, you can see step-by-step how to enrol for courses in MyStudyMap.
Extensive information about the operation of MyStudyMap can be found here.

There are two enrolment periods per year:

  • Enrolment for the fall opens in July

  • Enrolment for the spring opens in December

See this page for more information about deadlines and enrolling for courses and exams.

Note:

  • It is mandatory to enrol for all activities of a course that you are going to follow.

  • Your enrolment is only complete when you submit your course planning in the ‘Ready for enrolment’ tab by clicking ‘Send’.

  • Not being enrolled for an exam/resit means that you are not allowed to participate in the exam/resit.

Contact

Lecturer: dr. G. J. Wijnholds (g.j.wijnholds@liacs.leidenuniv.nl)
Course website: TBD

Remarks

Software
Starting from the 2024/2025 academic year, the Faculty of Science will use the software distribution platform Academic Software. Through this platform, you can access the software needed for specific courses in your studies. For some software, your laptop must meet certain system requirements, which will be specified with the software. It is important to install the software before the start of the course. More information about the laptop requirements can be found on the student website.