This course gives an overview of techniques for automated learning from ill-understood data, for which it is hard or impossible to formulate a model that is even approximately correct. Here “learning” means finding structure, patterns and regularities, and using these patterns to predict future data.

NOTE: This course is also part of the Master of Statistical Science and, like the other courses in that master, is scheduled as a ‘block course’ in the month of October. It will therefore inevitably overlap with other courses in the mathematics master. During the first lecture we will discuss any scheduling issues that arise and see whether we can resolve them.

The field of “statistical learning” is also known as “machine learning”, since many contributions to it originate in areas of computer science (pattern recognition, artificial intelligence). The main topics of the course are (1) supervised learning (regression and classification, with a strong focus on the latter); (2) model selection and model averaging; and (3) predictive analysis, including sequential prediction. The methods discussed include various classical and state-of-the-art classification methods: naive Bayes, perceptrons (1960s), neural networks, decision trees (1980s), logistic regression, boosting, support vector machines, Gaussian processes and other kernel approaches (2000s). We explain the interrelations between these methods and analyze their large-sample behaviour. For model selection and averaging, we again consider both classical and state-of-the-art methods, including AIC, BIC, Bayes factor model averaging, Minimum Description Length (MDL), Structural Risk Minimization (SRM), shrinkage, the Lasso and other L1-methods. We explain how all these methods relate to Bayesian and non-Bayesian methods for combining predictors, and again analyze their large-sample behaviour.

**Lecture hours**

Lectures and practicals (partly computer practicals, partly exercises), each 2 hrs/day. Assessment: written exam and practical assignment.

**Literature**

T. Hastie, R. Tibshirani, J. Friedman, *The Elements of Statistical Learning*, 2nd edition, 2009. Handouts of some (very few) papers.

**Links**

Teacher’s home page (will contain a link to the course page once it’s ready)