Prospectus

Introduction to Data Science

Course
2016-2017

Admission requirements

None.

Description

Data Science emerged at the crossroads of many different fields, including statistics, machine learning, natural language processing, databases, and others. This course serves a dual purpose. On the one hand, several speakers will introduce students to a number of the topics that they will encounter during the data science master specialization or perhaps later as a professional data scientist. Company visits are included in the schedule, familiarising students with what they can expect in practice. On the other hand, the course teaches basic technical skills. These are: programming in Python, and basic statistics and probability theory.

Course objectives

The goal is to gain a better understanding of the very diverse field of data science, to get acquainted with the basics of statistics and the R software, and to be able to write small but readable and robust Python programs to solve statistical problems.

Mode of Instruction

There are two kinds of lectures: invited speakers and excursions are interwoven with technical lectures. Some of the technical lectures are extended to make room for lab sessions. After each technical lecture, homework in the form of a Python notebook is distributed via Blackboard. The homework may include some additional theory, and can contain both theoretical questions and exercises that have to be completed in the form of Python programs.

Time Table

For the course days, course location and class hours, check the Time Table 2016/17 under the tab “Masters Programme” at http://www.math.leidenuniv.nl/statscience or http://liacs.leidenuniv.nl/education/master/schedules/.

Assessment method

The homework counts for half of the final grade; an exam at the end of the semester determines the other half. The homework will be distributed at the end of each technical lecture, and should be uploaded to Blackboard.
The written exam is similar to the homework exercises, but may also include questions about the field trips and guest lectures. It requires the use of an offline laptop.

The dates of the exam and the resit can be found in the time table. The exams take place in the Snellius building. The room will be announced on the electronic billboard opposite the entrance; its content can also be viewed online at: http://info.liacs.nl/math/
If the exam does not take place in the Snellius building, an announcement will be sent via Blackboard.
For successful completion of the course, the average homework grade and the exam grade must each be at least 5.5.
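
As an illustration of the grading rule above, the following minimal Python sketch combines the two components. The function name `final_grade` is hypothetical, and the sketch assumes that all homework assignments are weighted equally and that no rounding is applied; the prospectus does not specify either point.

```python
from statistics import mean

PASS_THRESHOLD = 5.5  # minimum required for both components


def final_grade(homework_grades, exam_grade):
    """Return (weighted_grade, passed) for one student.

    Assumptions not stated in the prospectus: homework assignments
    are weighted equally, and no rounding is applied.
    """
    homework_average = mean(homework_grades)
    # Homework and exam each count for half of the final grade.
    weighted = 0.5 * homework_average + 0.5 * exam_grade
    # Both components must reach the threshold separately.
    passed = homework_average >= PASS_THRESHOLD and exam_grade >= PASS_THRESHOLD
    return weighted, passed
```

Under these assumptions, a high homework average cannot compensate for an exam grade below 5.5: the weighted grade may exceed 5.5 while the course is still not completed.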

Reading list

There is no compulsory literature. The course involves programming in R and Python, and some statistical topics. For students who want background material, the following textbooks are recommended for these three topics. (Note that some of these textbooks may be compulsory reading for other courses.)

  • John A. Rice. Mathematical Statistics and Data Analysis. Brooks/Cole

  • Norman Matloff. The Art of R Programming. No Starch Press

  • Allen B. Downey. Think Stats. O'Reilly. (Freely available online.)

Course Registration

Enroll in Blackboard for the course materials and course updates.
To be able to obtain a grade and the ECTS for the course, sign up for the (re-)exam in uSis at least ten calendar days before the (re-)exam takes place. Note that students are expected to participate actively in all activities of the program and therefore to register for and take the first exam opportunity.
Exchange and Study Abroad students, please see the Prospective students website for information on how to apply.

Contact information

steven [dot] de [dot] rooij [at] gmail [dot] com

Remarks

This is a compulsory course in the master's program of the specialisation Data Science.