Prospectus

Distributed Systems

Course
2024-2025

Admission requirements

This is a practice-intensive course that involves low-level systems programming, and mastering this material is highly rewarding. Students are assumed to have taken courses in advanced programming, computer architecture, computer networks, and operating systems at BSc level. The HPC course is not a requirement, but will make things easier.

Description

Distributed systems are pervasive: most members of our society interact with them daily. Social networks, government services, and media streaming are all powered by distributed systems. Such systems are composed of many physically distributed computers, all connected through a network. Distributed systems are applied in areas such as data storage (e.g., HDFS, Ceph), data transmission and queueing (e.g., gRPC, ZeroMQ), key-value stores (e.g., Redis, Cassandra), analytics (e.g., Spark, Hive), batch processing (e.g., Hadoop) and stream processing (e.g., Flink), distributed supercomputing (e.g., MPI), as well as machine learning (e.g., TensorFlow, PyTorch).

As computer scientists, systems engineers, or DevOps engineers, most of the large-scale systems we work with, either directly or indirectly, are actually distributed systems. Both academia and industry invest significant effort into: (i) defining theory and design processes for building such systems; (ii) understanding the performance of these systems; (iii) understanding the interaction between these systems and their underlying computing infrastructure; and (iv) building more efficient systems that seamlessly scale with the number of users, machines, and workloads.

This course takes a practical, systems-first approach to understanding distributed systems. We will discuss general distributed systems topics, such as communication, consistency, fault-tolerance, and consensus. We will discuss the design of distributed systems, including the parts they are composed of (e.g., storage, resource management, scheduling, communication). We will also treat general topics in performance evaluation, such as benchmarking, workloads, metrics, statistical analysis, and how to design repeatable experiments.

Course objectives

After following this course, students will be able to:
1. Understand the main concepts of distributed systems, e.g., communication, resource management and scheduling, consistency, fault-tolerance, performance.
2. Explain and identify trade-offs for designing certain components of distributed systems, e.g., erasure-coding vs. replication for fault-tolerance.
3. Analyse research papers and critically discuss their merit.
4. Design, build, and evaluate distributed systems.
5. Evaluate performance using state-of-the-art experiment design techniques for reproducible performance evaluation.
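
To make the trade-off named in objective 2 concrete, the following is a minimal sketch comparing n-way replication with a Reed-Solomon-style erasure code. It is not course material: the names `Scheme`, `replication`, and `erasure_coding` are hypothetical, and the model assumes an ideal (MDS) code with k data fragments and m parity fragments, where any k fragments suffice to recover the data and a repair reads k fragments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    """Simplified cost model of a redundancy scheme (illustrative only)."""
    name: str
    storage_overhead: float  # bytes stored per byte of user data
    tolerated_failures: int  # simultaneous fragment/copy losses survived
    repair_reads: int        # fragments/copies read to rebuild one lost piece


def replication(copies: int) -> Scheme:
    """n-way replication: every copy is complete, so one survivor is enough."""
    if copies < 1:
        raise ValueError("need at least one copy")
    return Scheme(f"{copies}-way replication", float(copies), copies - 1, 1)


def erasure_coding(k: int, m: int) -> Scheme:
    """Ideal (MDS) erasure code: k data + m parity fragments (assumed model)."""
    if k < 1 or m < 0:
        raise ValueError("need k >= 1 and m >= 0")
    return Scheme(f"RS({k},{m})", (k + m) / k, m, k)


if __name__ == "__main__":
    for scheme in (replication(3), erasure_coding(6, 3)):
        print(
            f"{scheme.name}: {scheme.storage_overhead:.2f}x storage, "
            f"tolerates {scheme.tolerated_failures} failures, "
            f"repair reads {scheme.repair_reads} piece(s)"
        )
```

Under these assumptions, the erasure code stores less data for a comparable level of fault tolerance, but pays for it with more reads (and decoding work) whenever a lost fragment must be rebuilt.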

Timetable

The most recent timetable can be found at the Computer Science (MSc) student website.
You will find the timetables for all courses and degree programmes of Leiden University in the MyTimetable tool. Any teaching activities that you have successfully registered for in MyStudyMap will automatically be displayed in MyTimetable. Any timetables that you add manually will be saved and automatically displayed the next time you sign in.

MyTimetable allows you to integrate your timetable with calendar apps such as Outlook, Google Calendar, and Apple Calendar, as well as other calendar apps on your smartphone. Any timetable changes will be automatically synced with your calendar. If you wish, you can also receive an email notification of the change. You can turn notifications on in ‘Settings’ (after login).

For more information, watch the video or go to the 'help-page' in MyTimetable. Please note: Joint Degree students Leiden/Delft have to merge their two different timetables into one. This video explains how to do this.

Mode of instruction

The course is composed of three components:
1. Lectures given by the instructor, to set up the course format and assignments, and to teach key topics in distributed systems.
2. Self-study lab assignments. The assignments can be done alone or in pairs. Students are expected to work on the lab assignments autonomously, outside of the lectures. This is a hands-on course in which the lab constitutes a large part of the final grade. There are two large-scale lab components: (1) a reproducibility study, in which students implement several experiments from existing scientific papers to get acquainted with modern technology and the state of the art in experiment design; (2) a build-your-own-distributed-system project, in which students learn how to build a prototype distributed system.
3. Student presentations and in-class discussion. Each group must prepare a 15-min presentation on the reproducibility study of assignment 1. This is followed by a 5-min Q&A discussion session, in which all other students participate.

Assessment method

The assessment method is based on the following student assignments:

  • Reproducibility study: 20% (group-based, mandatory)

  • Build a system: 40% (group-based, mandatory)

  • Presentation: 10% (group-based, mandatory)

  • Exam: 30% (individual, mandatory)

Each student must obtain a sufficient score (>= 5.5) in each of the four parts of the grade. Partial scores are not carried over between academic years.
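
As a worked illustration of the weights above, here is a minimal sketch that combines four part scores and checks the 5.5 minimum per part. It assumes the final grade is the plain weighted sum of the parts, which this page does not state explicitly. The function names and the example scores are hypothetical.

```python
# Weights as listed in the assessment section of this page.
WEIGHTS = {
    "reproducibility_study": 0.20,
    "build_a_system": 0.40,
    "presentation": 0.10,
    "exam": 0.30,
}
PASS_MARK = 5.5  # minimum sufficient score required for each part


def weighted_grade(scores: dict[str, float]) -> float:
    """Weighted sum of the four part scores (assumed combination rule)."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(weight * scores[part] for part, weight in WEIGHTS.items())


def insufficient_parts(scores: dict[str, float]) -> list[str]:
    """Parts scoring below the pass mark; an empty list means every part passes."""
    return [part for part in WEIGHTS if scores[part] < PASS_MARK]


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    example = {
        "reproducibility_study": 7.0,
        "build_a_system": 8.0,
        "presentation": 6.0,
        "exam": 6.5,
    }
    print(f"weighted grade: {weighted_grade(example):.2f}")
    print(f"insufficient parts: {insufficient_parts(example)}")
```

Note that a high weighted sum does not compensate for a single insufficient part: the per-part check is applied independently of the total.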

Reading list

M. van Steen and A.S. Tanenbaum, Distributed Systems, 4th ed., distributed-systems.net, 2023. https://www.distributed-systems.net/index.php/books/ds4/

Registration

Contact

Rob van Nieuwpoort

Remarks

N/A