# High-dimensional data analysis

Course
2024-2025

Basic knowledge of statistics and probability, linear algebra (e.g., matrices, eigenvalues and eigenvectors, singular value decomposition), generalized linear models (linear regression, logistic regression), and Bayesian methods is required.

## Description

Modern-day high-throughput techniques characterize many traits (easily thousands) of an individual simultaneously. Often the resulting data are available for only a comparatively small number of individuals. This imbalance between the number of covariates and the sample size is typical of high-dimensional data. Such data arise in genomics, where genetic information is measured for many thousands of genes simultaneously, but also in economics and psychometrics. Analysis of high-dimensional data requires adjustments to well-known statistical methods and the introduction of several novel concepts.

The course teaches students the adjustments to classical statistical methodology that are necessary to analyse high-dimensional data. This encompasses estimation methods, testing procedures, and shrinkage. More specifically, the course covers a) model-based inference for Gaussian and count data (classical and Bayesian methods); b) multiple testing (family-wise error rate and false discovery rate control); c) penalized regression (lasso and ridge); and d) shrinkage. Several types of high-dimensional data will be discussed and used during the course.
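To give a flavour of the multiple testing topic, the sketch below contrasts two standard procedures: the Bonferroni correction, which controls the family-wise error rate, and the Benjamini-Hochberg step-up rule, which controls the false discovery rate. It is a minimal illustration and not course material. The function names and the p-values are made up for this example, and the choice of these two procedures is an assumption, since the description above names only the error rates.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Family-wise error rate control: reject H_i when p_i <= alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """False discovery rate control via the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Reject the k_max hypotheses with the smallest p-values.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# Illustrative p-values (hypothetical, not taken from any course data set).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
print("Bonferroni rejections:", sum(bonferroni_reject(p_values)))
print("Benjamini-Hochberg rejections:", sum(benjamini_hochberg_reject(p_values)))
```

With these made-up p-values, Bonferroni rejects one hypothesis and Benjamini-Hochberg rejects two, which illustrates why false discovery rate control is often preferred when thousands of hypotheses are tested at once.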

## Course objectives

At the end of the course, the student:
1) is familiar with the pros and cons of the novel/adjusted statistical techniques.
2) can apply the novel/adjusted techniques to data and calculate, e.g., a prediction.
3) can reflect on the suitability of, e.g., the multiple testing procedure, to the situation at hand.
4) can motivate why certain methods, e.g., shrinkage, are beneficial in high-dimensional settings.
5) can discuss the limitations of the conclusions drawn from the results generated by, e.g., penalized regression.

## Timetable

See the Leiden University students' website for the Statistical Science programme -> Schedules

In MyTimetable, you can find all course and programme schedules, allowing you to create your personal timetable. Activities for which you have enrolled via MyStudyMap will automatically appear in your timetable.

Questions? Watch the video, read the instructions, or contact the ISSC helpdesk.

Note: Joint Degree students from Leiden/Delft need to combine information from both the Leiden and Delft MyTimetables to see a complete schedule. This video explains how to do it.

## Mode of Instruction

The course consists of a series of lectures and practicals (partly computer practicals, partly exercises).

## Assessment method

Hand-in bonus assignments + written exam

Literature will be specified during the course; no books are required.

## Registration

As a student, you are responsible for enrolling on time through MyStudyMap.

In this short video, you can see step-by-step how to enrol for courses in MyStudyMap.
Extensive information about the operation of MyStudyMap can be found here.

There are two enrolment periods per year:

• Enrolment for the fall opens in July

• Enrolment for the spring opens in December

Note:

• It is mandatory to enrol for all activities of a course that you are going to follow.

• Your enrolment is only complete when you submit your course planning in the ‘Ready for enrolment’ tab by clicking ‘Send’.

• Not being enrolled for an exam/resit means that you are not allowed to participate in the exam/resit.

## Contact

mark.vdwiel@vumc.nl and w.n.van.wieringen@vu.nl

## Remarks

Software
Starting from the 2024/2025 academic year, the Faculty of Science will use the software distribution platform Academic Software. Through this platform, you can access the software needed for specific courses in your studies. For some software, your laptop must meet certain system requirements, which will be specified with the software. It is important to install the software before the start of the course. More information about the laptop requirements can be found on the student website.