
Introduction to Data Science

Course 2018-2019

Admission requirements



Data Science emerged at the crossroads of many different fields, including statistics, machine learning, natural language processing, databases, and others. This course serves a dual purpose. On the one hand, several speakers will introduce students to a number of the topics that they will encounter during the data science master specialization or perhaps later as a professional data scientist. Company visits are included in the schedule, familiarising students with what they can expect in practice. On the other hand, the course teaches basic technical skills, namely basic programming in Python and the use of popular data analysis libraries.

Course objectives

The goal is to gain a better understanding of the very diverse field of data science, and to be able to write small but readable and robust Python programs to solve statistical problems.

Mode of Instruction

There are two kinds of lectures: lectures by invited speakers and excursions are interwoven with technical lectures. After each technical lecture there is a practical session. During the practicals, homework in the form of a Jupyter notebook is distributed via Blackboard. The homework may include some additional theory, and can contain both theoretical questions and practical exercises that have to be completed in the form of Python programs.

Time Table

See the Leiden University students' website for the Statistical Science programme -> Schedules 2018-2019

Assessment method

Completion of the course depends on three factors: (1) attending the company visits and guest lectures, (2) scoring at least 5/10 on the homework problems and (3) scoring at least 5/10 on the written exam. If these requirements are met, the final grade is the average of the homework and exam grades. The written exam requires the use of an offline laptop.

The dates of the exam and the resit can be found in the time table. The exams take place in the Snellius building. The room will be announced on the electronic billboard opposite the entrance; its content can also be viewed online at:
If the exam does not take place in the Snellius building, an announcement will be sent via Blackboard.
For successful completion of the course, neither the average homework grade nor the exam grade may be below 5.
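
As an illustration only, the grading rule above could be sketched in Python roughly as follows. The function name, the parameter names and the use of None for "course not completed" are hypothetical choices made for this sketch. It also assumes that "average" means an unweighted mean of the two grades and that no rounding is applied, neither of which is stated explicitly above.

```python
from typing import Optional

# Minimum required for both the homework average and the exam (10-point scale).
PASS_MARK = 5.0


def final_grade(homework_avg: float, exam: float, attended_all: bool) -> Optional[float]:
    """Return the final grade, or None if a completion requirement is not met.

    Assumption: the final grade is the unweighted mean of both grades.
    """
    # Requirement (1): company visits and guest lectures were attended.
    if not attended_all:
        return None
    # Requirements (2) and (3): both grades must be at least the pass mark.
    if homework_avg < PASS_MARK or exam < PASS_MARK:
        return None
    return (homework_avg + exam) / 2


print(final_grade(7.0, 6.0, True))  # 6.5
print(final_grade(8.0, 4.5, True))  # None (exam below 5)
```

Under these assumptions, a high homework grade cannot compensate for an exam grade below 5, because each requirement is checked separately before the average is taken.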

Reading list

There is no compulsory literature. The course involves programming in Python, and some statistical topics. For students who would like background material, the following textbooks are recommended. (Note that some of these textbooks may be compulsory reading for other courses.)
• John A. Rice. Mathematical Statistics and Data Analysis. Brooks/Cole
• Norman Matloff. The Art of R Programming. No Starch Press
• Allen B. Downey. Think Stats. O'Reilly. (Freely available online.)

Course Registration

Enroll in Blackboard for the course materials and course updates.
To obtain a grade and the ECTS for the course, sign up for the (re-)exam in uSis ten calendar days before the (re-)exam takes place. Note that students are expected to participate actively in all activities of the programme, and therefore to register for and use the first exam opportunity.
Exchange and Study Abroad students, please see the Prospective students website for information on how to apply.

Contact information

Steven de Rooij: steven [dot] de [dot] rooij [at] gmail [dot] com


This is a compulsory course of the Master Statistical Science with the specialisation Data Science.