High-dimensional data analysis


Admission requirements

Basic knowledge of statistics and probability, linear algebra (e.g., matrices, eigenvalues and eigenvectors, singular value decomposition), generalized linear models (linear regression, logistic regression) and Bayesian methods is required.


Modern high-throughput techniques characterize many traits (easily thousands) of an individual simultaneously. Often the resulting data are available for only a comparatively small number of individuals. This imbalance between the number of covariates and the sample size is typical of high-dimensional data. Such data arise in genomics, where genetic information is measured for many thousands of genes simultaneously, but also in economics and psychometrics. Analysis of high-dimensional data requires adjustments to well-known statistical methods, and the introduction of several novel concepts.

The course teaches students the adjustments to classical statistical methodology that are necessary to analyse high-dimensional data. This encompasses estimation methods, testing procedures, and shrinkage. More specifically, the course covers: a) model-based inference for Gaussian and count data (classical and Bayesian methods); b) multiple testing (family-wise error rate and false discovery rate control); c) penalized regression (lasso and ridge); and d) shrinkage. Several types of high-dimensional data will be discussed and used during the course.
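
To give a flavour of topic b), the sketch below contrasts two standard multiple testing corrections: the Bonferroni correction, which controls the family-wise error rate, and the Benjamini-Hochberg step-up procedure, which controls the false discovery rate. It is an illustration only and not course material: the function name benjamini_hochberg, the level alpha = 0.05 and the p-values are assumptions made up for this example.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k (1-based) with p_(k) <= (k / m) * alpha; 0 if there is none.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject every hypothesis whose p-value has rank at most k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, invented for illustration only.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]

rejected = benjamini_hochberg(p_values, alpha=0.05)
# Bonferroni: compare every p-value with alpha divided by the number of tests.
bonferroni = [p <= 0.05 / len(p_values) for p in p_values]

print("Benjamini-Hochberg rejections:", sum(rejected))
print("Bonferroni rejections:", sum(bonferroni))
```

On these made-up p-values the Benjamini-Hochberg procedure rejects two hypotheses and the Bonferroni correction rejects one, which illustrates why false discovery rate control is often preferred when thousands of features are tested at once.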

Course objectives

At the end of the course, the student:
1) is familiar with the pros and cons of the novel/adjusted statistical techniques.
2) can apply the novel/adjusted techniques to data and calculate, e.g., a prediction.
3) can reflect on the suitability of, e.g., the multiple testing procedure, to the situation at hand.
4) can motivate why certain methods, e.g., shrinkage, are beneficial in high-dimensional settings.
5) can discuss the limitations of the conclusions drawn from the results generated by, e.g., penalized regression.


Mode of instruction

The course consists of a series of lectures and practicals (partly computer practicals, partly exercises).

Assessment method

Hand-in bonus assignments and a written exam.

Reading list

Literature will be specified during the course; no books are required.


