Prospectus

Statistical Learning - new curriculum

Course
2022-2023

Admission requirements

  • Familiarity with least squares linear regression

  • Ability to program in R (preferred) or in Python

  • Basic knowledge of university-level probability theory, calculus, and linear algebra

Description

Supervised statistical learning involves building a model for predicting an output (response, dependent) variable based on one or more input (predictor) variables. Such predictive questions arise in many areas, for example Netflix recommendations, self-driving cars, predicting disease status or vulnerability, and finding early markers of disease.

In unsupervised statistical learning, there are only input variables but no supervising output (dependent) variable; nevertheless, we can learn relationships and structures from such data using cluster analysis and methods for dimension reduction.

This course provides a basis for understanding statistical learning techniques and teaches the skills to apply and evaluate them.

The supervised learning methods discussed will include classical and state-of-the-art regression and classification methods: regularized regression (ridge, as well as lasso and other L1-penalized methods), naive Bayes, decision trees, and random forests. We explain the interrelations between these methods and analyze their behavior.

We will also discuss model selection, where we consider both classical and state-of-the-art methods, including various forms of cross-validation.
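
As a rough illustration of how cross-validation can guide model selection, the sketch below chooses the penalty strength for ridge regression by 5-fold cross-validation. It is written in Python with NumPy. The simulated data, the function names ridge_fit and kfold_cv_mse, and the grid of penalty values are assumptions made for this example and are not taken from the course material.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge estimate (X'X + lam*I)^{-1} X'y (no intercept, for brevity)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)


def kfold_cv_mse(X, y, lam, k=5, seed=0):
    """Mean squared prediction error on held-out folds, averaged over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        beta = ridge_fit(X[train], y[train], lam)
        errors.append(np.mean((y[test] - X[test] @ beta) ** 2))
    return float(np.mean(errors))


# Simulated data (hypothetical): 100 observations, 10 predictors,
# of which only the first three influence the response.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 10))
true_beta = np.zeros(10)
true_beta[:3] = [2.0, -1.0, 0.5]
y = X @ true_beta + rng.normal(scale=1.0, size=100)

# Candidate penalty values (an arbitrary grid chosen for illustration).
lambdas = [0.01, 0.1, 1.0, 10.0, 100.0]
cv_errors = {lam: kfold_cv_mse(X, y, lam) for lam in lambdas}
best_lam = min(cv_errors, key=cv_errors.get)
print("selected penalty:", best_lam)
```

The same folds are reused for every candidate penalty (fixed seed), so the comparison between penalty values is not affected by differences in how the data were split.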

For unsupervised learning, we consider clustering methods (e.g., the classic k-means algorithm) and dimension reduction methods (such as principal component analysis, PCA).

Course Objectives

An introduction to statistical learning.

Timetable

You will find the timetables for all courses and degree programmes of Leiden University in the tool MyTimetable (login). Any teaching activities that you have successfully registered for in MyStudyMap will automatically be displayed in MyTimetable. Any timetables that you add manually will be saved and automatically displayed the next time you sign in.

MyTimetable allows you to integrate your timetable with your calendar apps such as Outlook, Google Calendar, Apple Calendar and other calendar apps on your smartphone. Any timetable changes will be automatically synced with your calendar. If you wish, you can also receive an email notification of the change. You can turn notifications on in ‘Settings’ (after login).

For more information, watch the video or go to the 'help-page' in MyTimetable. Please note: Joint Degree students Leiden/Delft have to merge their two different timetables into one. This video explains how to do this.

Mode of instruction

Lectures and computer practicals. We will use Brightspace to share all course material.

Assessment method

The final grade is based on three components, each with a weight of 1/3:

1) a written structured assignment (individual, halfway through the course)
2) a written structured assignment (individual, at the end of the course)
3) an oral presentation on the analysis of a data set of the students' own choice (in groups, at the end of the course)

Students receive feedback on the assignments and the oral presentation during the lectures.

Reading list

  • James, G., Witten, D., Hastie, T., & Tibshirani, R. (2021). An introduction to statistical learning: with applications in R. New York: Springer. A free copy and tutorials are available online.

  • Beaujean, A. A. (2014). Latent variable modeling using R: A step-by-step guide. New York: Routledge.

Additional resources:

  • Berk, R. A. (2008). Statistical learning from a regression perspective. Springer. (a PDF is available via Leiden University Library)

  • Kuhn, M. & Johnson, K. (2013). Applied predictive modeling. Springer. (a PDF is available via Leiden University Library)

  • Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning (2nd edition). Springer. (available for free at https://web.stanford.edu/~hastie/Papers/ESLII.pdf)

  • Bishop, C. M. (2006). Pattern recognition and machine learning (1st edition). Springer.

  • Murphy, K. P. (2012). Machine learning: A probabilistic perspective. MIT Press.

Registration

From the academic year 2022-2023 onward, every student has to register for courses using the new enrollment tool MyStudyMap. There are two registration periods per year: registration for the fall semester opens in July and registration for the spring semester opens in December. Please see this page for more information.

Please note that it is compulsory to both preregister and confirm your participation for every exam and retake. Not being registered for a course means that you are not allowed to participate in the final exam of the course. Confirming your exam participation is possible until ten days before the exam.

Extensive FAQs on MyStudyMap can be found here.

Contact

Julian Karch: j.d.karch@fsw.leidenuniv.nl

Remarks