Prerequisites
A basic understanding of introductory statistical concepts and some familiarity with R, as taught in Inleiding Mathematische Statistiek.
Description
An overview of each of the four topics presented in this course is given below.
Safe Testing (Prof. Dr. P. D. Grünwald)
In traditional hypothesis testing, the sample size or at least the sampling protocol must be
determined in advance. In practice, it is desirable to use more flexible stopping rules.
Researchers often use such rules anyway, even though the traditional methods do not allow for them,
which inflates error rates and leads to false results appearing in the literature. We will outline
some exciting recent techniques that can guarantee small error probabilities under 'optional
stopping' after all. The underlying mathematics builds upon the insight that, in a casino, you do
not expect to get rich, no matter what rule you use to decide whether to continue gambling or go home.
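The casino analogy can be illustrated with a small simulation. The sketch below is not course material; it is a minimal Python example (standard library only) under assumed settings: a fair coin (the null hypothesis is true), a gambler who bets according to a hypothetical alternative with heads probability 0.6, and a naive analyst who runs a two-sided z-test after every toss. The function name and all parameter values are illustrative assumptions. By Ville's inequality, the gambler's wealth exceeds 1/alpha at some point with probability at most alpha, whatever the stopping rule, whereas the repeatedly applied z-test rejects far more often than its nominal 5%.

```python
import random


def optional_stopping_error_rates(n_max=500, alpha=0.05, p_alt=0.6,
                                  n_sims=2000, seed=1):
    """Estimate false rejection rates when peeking at the data after every toss.

    All data come from a fair coin, so every rejection is a false positive.
    Returns (naive_rate, safe_rate).
    """
    rng = random.Random(seed)
    naive_rejections = 0
    safe_rejections = 0
    for _ in range(n_sims):
        heads = 0
        wealth = 1.0  # the gambler starts with one unit of money
        naive_reject = False
        safe_reject = False
        for n in range(1, n_max + 1):
            is_head = rng.random() < 0.5  # the null hypothesis is true
            heads += is_head
            # Likelihood-ratio bet: the wealth has expectation 1 under the null.
            wealth *= (p_alt if is_head else 1.0 - p_alt) / 0.5
            if wealth >= 1.0 / alpha:
                safe_reject = True
            # Naive approach: a fixed-sample z-test repeated at every n >= 10.
            if n >= 10:
                z = (heads - 0.5 * n) / (0.25 * n) ** 0.5
                if abs(z) > 1.96:
                    naive_reject = True
        naive_rejections += naive_reject
        safe_rejections += safe_reject
    return naive_rejections / n_sims, safe_rejections / n_sims


naive_rate, safe_rate = optional_stopping_error_rates()
print("naive z-test with peeking:", naive_rate)
print("wealth-based test with peeking:", safe_rate)
```

The simulated rates are Monte Carlo estimates, so they fluctuate with the seed and the number of simulated experiments.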
Bayesian methods (Dr. M. A. Hadji)
Bayesian inference is based on the Bayesian interpretation of probabilities. In Bayesian statistics,
we treat the parameter as a random variable and endow it with a prior distribution that expresses
our belief before seeing the data. The data update our belief about the parameter through the
computation of a posterior distribution. It can be difficult to access the posterior distribution
directly. In these cases, it is common to use Markov chain Monte Carlo (MCMC) methods. The most
common choices of priors in well-known models will be presented. Some MCMC methods to sample from
the posterior will be introduced.
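As an illustration of sampling from a posterior, the following is a minimal random-walk Metropolis sketch in Python (standard library only). It is not taken from the course; the model (a binomial likelihood with a Beta(1, 1) prior), the data (7 successes in 10 trials), the step size and the function names are assumptions chosen for illustration. In this conjugate case the posterior is known to be Beta(8, 4) with mean 2/3, which gives a simple check on the sampler.

```python
import math
import random


def log_posterior(theta, successes, trials, a=1.0, b=1.0):
    """Unnormalised log posterior: binomial likelihood times Beta(a, b) prior."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero posterior density outside (0, 1)
    return ((a - 1.0 + successes) * math.log(theta)
            + (b - 1.0 + trials - successes) * math.log(1.0 - theta))


def metropolis(successes, trials, n_draws=20000, step=0.2, seed=1):
    """Random-walk Metropolis sampler with Gaussian proposals."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_posterior(theta, successes, trials)
    draws = []
    for _ in range(n_draws):
        proposal = theta + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, successes, trials)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            theta, current = proposal, candidate
        draws.append(theta)
    return draws


draws = metropolis(successes=7, trials=10)
kept = draws[2000:]  # discard the first draws as burn-in
print("posterior mean estimate:", sum(kept) / len(kept))
print("exact posterior mean:", 8 / 12)
```

The sampler only needs the posterior up to a normalising constant, which is why it remains usable when the posterior cannot be written down in closed form.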
Survival analysis (Prof. Dr. M. Fiocco)
This area of statistics deals with time-to-event data, whose analysis is complicated not only by
the dynamic nature of events occurring in time but also by censoring, where some events are not
observed directly and it is only known that they fall in some interval or range. Different types of
censored and truncated data, non-parametric methods to estimate the survival function and
regression models to study the effect of risk factors on survival outcomes will be discussed.
Special aspects such as time-dependent covariates and stratification will be introduced.
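One standard non-parametric estimator of the survival function is the Kaplan-Meier estimator; the description above does not name a specific estimator, so its use here is an assumption. The sketch below is a minimal Python version (standard library only) for right-censored data, with a made-up data set of five subjects. An event indicator of 1 means the event was observed and 0 means the observation was censored; subjects censored at a given time are still counted as at risk at that time.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate for right-censored data.

    times:  observed follow-up times
    events: 1 if the event was observed at that time, 0 if censored
    Returns a list of (event_time, estimated_survival_probability).
    """
    pairs = sorted(zip(times, events))
    n_at_risk = len(pairs)
    survival = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        removed = 0
        # Collect all subjects whose follow-up ends at time t.
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # events and censored subjects leave the risk set
    return curve


# Hypothetical data: five subjects, two of them censored (at times 3 and 8).
curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 1, 0, 1, 0])
print([(t, round(s, 3)) for t, s in curve])  # [(2, 0.8), (3, 0.6), (5, 0.3)]
```

Note that the censored subjects never lower the curve themselves; they only shrink the risk set used at later event times.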
Longitudinal data analysis (Dr. M. Signorelli)
Longitudinal data (sometimes called panel data) are data collected through a series of repeated
observations of the same subjects over time. Since repeated measurements from the same
subject are typically correlated, the analysis of longitudinal data requires statistical methods that
do not rely on the usual independence assumptions. In this part of the course, the two most
widely used statistical models for longitudinal data, linear mixed models and generalized linear
mixed models, will be discussed. Estimation of the models will be performed using the R
software environment.
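To see why repeated measurements on the same subject are correlated, consider the simplest linear mixed model, with a subject-specific random intercept. The sketch below is a minimal Python illustration (standard library only), not the R-based estimation used in the course. It simulates balanced data from an assumed model y_ij = 10 + b_i + e_ij with unit variances, and estimates the intraclass correlation with a simple one-way ANOVA (method-of-moments) estimator rather than the likelihood-based methods usually applied to mixed models. All names and parameter values are illustrative assumptions.

```python
import random


def simulate_random_intercepts(n_subjects=500, n_visits=5,
                               sd_subject=1.0, sd_noise=1.0, seed=1):
    """Simulate y_ij = 10 + b_i + e_ij with a random intercept b_i per subject."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_subjects):
        b = rng.gauss(0.0, sd_subject)  # shared by all visits of this subject
        data.append([10.0 + b + rng.gauss(0.0, sd_noise)
                     for _ in range(n_visits)])
    return data


def intraclass_correlation(data):
    """ANOVA estimate of var(b) / (var(b) + var(e)) for balanced data."""
    m = len(data)      # number of subjects
    n = len(data[0])   # number of visits per subject
    subject_means = [sum(row) / n for row in data]
    grand_mean = sum(subject_means) / m
    # Between-subject and within-subject mean squares.
    msb = n * sum((s - grand_mean) ** 2 for s in subject_means) / (m - 1)
    msw = sum((y - s) ** 2
              for row, s in zip(data, subject_means)
              for y in row) / (m * (n - 1))
    var_subject = max(0.0, (msb - msw) / n)
    return var_subject / (var_subject + msw)


data = simulate_random_intercepts()
print("estimated within-subject correlation:", intraclass_correlation(data))
print("value implied by the simulation settings:", 1.0 / (1.0 + 1.0))
```

With equal subject and noise variances, any two measurements on the same subject have correlation 0.5, which is exactly the dependence that methods assuming independent observations ignore.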
Course objectives
The overall aim of the course is to introduce students to four different areas of statistics. By the
end of the course, students are expected to have a basic understanding of the topics discussed
and to be able to use existing software to apply the methods covered during the course.
Mode of instruction
Weekly 2 × 45 min of lectures in class, and 2 × 45 min of practical sessions with exercises. A laptop
with the statistical package R (http://www.r-project.org) already installed is required for each
practical session.
Assessment method
Four individually written reports (20% each) and a presentation (20%) on a selected topic. The presentations will be given individually or in pairs, depending on the group size. The reports are regarded as practical assignments and cannot be retaken. The presentation can be retaken.
Literature
Lecture material provided in class.
Registration
Enroll in uSis to obtain the course material and course updates via Brightspace.
Contact
Tijn Jacobs - t.jacobs.3@umail.leidenuniv.nl
