In the study of the effect of one or more explanatory variables on a response variable, linear regression and analysis of variance are important techniques. In linear regression we study how a quantitative variable, like the dose of a medicine, influences a quantitative response variable, like blood pressure. In analysis of variance we compare different groups with respect to a quantitative response, e.g. comparing the yields of different corn varieties. The statistical models that underlie these techniques are special cases of linear models. In this course we discuss linear models with a thorough treatment of the matrix algebra.
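As a small illustration of the matrix algebra behind a linear model, the sketch below fits a straight line by least squares. It assumes the simplest case, one explanatory variable plus an intercept, so that the normal equations X'X b = X'y form a 2x2 system that can be solved in closed form. The function name and the dose and blood pressure values are hypothetical and chosen only for illustration; they are not course material.

```python
def fit_simple_linear_model(x, y):
    """Least-squares estimates (intercept, slope) for y = b0 + b1*x + error.

    With design matrix X = [1, x], the normal equations X'X b = X'y
    reduce to a 2x2 system, which is solved here in closed form.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")

    # Entries of X'X = [[n, sx], [sx, sxx]] and X'y = [sy, sxy]
    sx = sum(x)
    sxx = sum(xi * xi for xi in x)
    sy = sum(y)
    sxy = sum(xi * yi for xi, yi in zip(x, y))

    det = n * sxx - sx * sx  # determinant of X'X
    if det == 0:
        raise ValueError("x must take at least two distinct values")

    # b = (X'X)^{-1} X'y, written out for the 2x2 case
    b0 = (sxx * sy - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    return b0, b1


# Hypothetical data: dose of a medicine and a blood pressure reading
dose = [0.0, 1.0, 2.0, 3.0, 4.0]
pressure = [150.0, 146.0, 142.0, 138.0, 134.0]
intercept, slope = fit_simple_linear_model(dose, pressure)
```

For larger models, a statistical package would normally be used instead of hand-written formulas; the closed form here is only meant to make the role of X'X and X'y visible.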
Although linear models are widely used, sometimes alternatives are preferred. Therefore, we discuss how to check the assumptions underlying the linear model: independent errors with a normal distribution and constant variance. When the assumptions of normality and constant variance are violated, the wider class of generalized linear models may be employed. Examples are logistic regression for a binary response (assuming a binomial distribution) and log-linear models for counts (using a Poisson distribution). Data are still assumed to be independent. Analysis of dependent data will be discussed in the course on mixed and longitudinal modeling. Emphasis will be on gaining understanding of the models, the kind of data that can be analyzed with these models, and the statistical analysis of empirical data itself.
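To make the idea of a generalized linear model concrete, here is a minimal sketch of logistic regression for a binary response with one explanatory variable. It assumes the canonical logit link and maximizes the binomial likelihood by Newton-Raphson, which for this link coincides with Fisher scoring (iteratively reweighted least squares). The function name, the stopping tolerance and the data are hypothetical and serve only as an illustration.

```python
import math


def fit_logistic(x, y, max_iter=25, tol=1e-10):
    """Fit logit(p) = b0 + b1*x to binary responses y by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = 0.0            # score vector X'(y - p)
        h00 = h01 = h11 = 0.0    # Fisher information X'WX
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)    # binomial variance function
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0:
            raise ValueError("information matrix is singular")
        # Newton step: solve the 2x2 system (X'WX) d = X'(y - p)
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Hypothetical data: dose level and a binary outcome (1 = response observed)
dose_level = [1, 2, 3, 4, 5, 6, 7, 8]
responded = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1 = fit_logistic(dose_level, responded)
```

The sketch assumes the responses are not perfectly separated by the explanatory variable; with perfect separation the maximum likelihood estimates do not exist and the iteration would not settle.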
Students should understand the basic concepts of linear models (regression, ANOVA, ANCOVA) and generalized linear models, and the proper statistical inference methods. When confronted with practical data suited to a linear or generalized linear model assuming independence, students should be able to (1) understand the statistical analysis of the empirical data itself, (2) check for violations of the assumptions, and (3) perform a proper data analysis. Students should acquaint themselves with the basics of linear algebra, especially the matrix algebra that is needed to understand linear models.
Mode of Instruction
Lectures and practicals (partly computer practicals, partly exercises).
Classes are on Tuesdays and Wednesdays from 10.00h to 16.15h, for 7 weeks, starting on Tuesday Nov 5, 2013.
For the course days, course location and class hours check the Time Table 2013-14 under the tab “Masters Programme” at http://www.math.leidenuniv.nl/statscience
Assessment will be based on a written exam (2/3), a case study report (1/3), and an oral presentation of the case study report (pass/fail).
Case study report: In week four, students will be asked to analyze a practical data set or study a theoretical topic. A report should be handed in, and the student will give a short 15-minute oral presentation on the topic of his or her report.
The written exam is scheduled for January 17, 2014, from 14.00h to 17.00h. The resit is scheduled for June 18, 2014, from 14.00h to 17.00h.
- Fox (2008). Applied Regression Analysis and Generalized Linear Models. Sage
- Faraway: Practical Regression and ANOVA using R. Text available as PDF at http://cran.r-project.org/doc/contrib/Faraway-PRA.pdf
- Faraway (2006). Extending the linear model with R. Generalized linear, mixed effects and nonparametric regression models. Chapman & Hall/CRC
In addition to registration for the (re-)exam in uSis, course registration via Blackboard is compulsory.
Exchange and Study Abroad students, please see the Prospective students website for information on how to apply.
gerrit [dot] gort [at] wur.nl
- This is a compulsory course in the Master’s programme of the specialisation Statistical Science for the Life & Behavioural sciences.