Prospectus

Data Management, Measurement and Analysis in R for Social Sciences

Course
2022-2023

Description

In this course you will learn how to do data analysis in R and RStudio. Generally speaking, data analysis comes naturally in two phases of research: first, when you have collected data and look for general trends or patterns in them; and second, when you select one or more (new) cases that either fit or deviate from the general trend found in the first step. This, in a nutshell, is one formulation of some of the core ideas of Seawright and Gerring on how to select cases for case study analysis. Their ideas will be a guiding thread throughout this course and a reference for writing your research paper.
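
The two phases can be sketched in a few lines of R. This is only a minimal illustration under stated assumptions, not the procedure prescribed by Seawright and Gerring: the data frame `cases`, its columns and all values are invented, and "fitting" or "deviating" is approximated here by the size of the residual from a simple linear regression.

```r
# Hypothetical toy data: names and values are invented for illustration only.
cases <- data.frame(
  country = c("A", "B", "C", "D", "E", "F"),
  gdp     = c(1, 2, 3, 4, 5, 6),
  score   = c(2.1, 3.9, 6.2, 7.8, 15.0, 12.1)
)

# Phase 1: estimate the general trend with a simple linear regression.
trend <- lm(score ~ gdp, data = cases)

# Phase 2: the residual measures how far each case lies from that trend.
cases$residual <- resid(trend)

# Assumption: a case that "fits" has the smallest absolute residual,
# a case that "deviates" has the largest one.
typical <- cases$country[which.min(abs(cases$residual))]
deviant <- cases$country[which.max(abs(cases$residual))]

# Show the cases ordered from best to worst fit.
cases[order(abs(cases$residual)), ]
```

Which cases you finally select depends on your research question; the residuals only suggest candidates.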

Traditionally, trends in data are approached using statistical models, and some of the simpler ones will be introduced in this course, but we will also learn how to combine them with some simple graphical approaches using the R package ggplot2. Corruption, for example, can be considered a phenomenon that depends on country, region and political system. A worldwide country map of corruption is interesting in itself (see, for instance, the publications of Transparency International), but (statistically) modelling data from other sources in your ‘corruption map’ might reveal aspects that would otherwise remain hidden (for instance, relations with economic indicators).
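
As a rough sketch of what combining a model with a graph can look like, the example below fits a linear model and draws the same relation as a scatter plot with a fitted line. The data frame `indicators` and its values are invented and do not come from Transparency International or any other real source; the plot is only built if the ggplot2 package is installed.

```r
# Hypothetical data: an invented corruption score and an invented
# economic indicator for eight unnamed countries.
indicators <- data.frame(
  economy    = c(1.0, 2.5, 3.0, 4.5, 5.0, 6.5, 7.0, 8.5),
  corruption = c(7.9, 7.1, 6.0, 5.8, 4.1, 3.9, 2.8, 2.0)
)

# Statistical approach: a simple linear model of the relation.
model <- lm(corruption ~ economy, data = indicators)
coef(model)  # intercept and slope of the fitted trend

# Graphical approach: the same relation as a scatter plot with a fitted line.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  trend_plot <- ggplot(indicators, aes(x = economy, y = corruption)) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
}
```

In an interactive session, typing `trend_plot` displays the graph; in a script, call `print(trend_plot)`.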

We start, however, from the bottom, with a short introduction to R and RStudio. At the same time we consider what exactly is meant by a data frame, types of variables, data structures and ‘dependencies’ in data, and how to organize data and link data frames on common ‘key’ variables. The most common and well-known statistical models will be introduced briefly in a practical, hands-on fashion: we look at the models and the assumptions on which they rest. As mentioned, we will use the ggplot2 package to make various kinds of graphs.
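
To give an idea of what linking data frames on a key variable means, here is a small sketch in base R. Both data frames, their column names and their values are hypothetical; `country` plays the role of the common key.

```r
# Hypothetical example data; all names and values are invented.
corruption <- data.frame(
  country = c("A", "B", "C"),   # key variable (character)
  score   = c(80, 38, 32)       # numeric variable
)
economy <- data.frame(
  country = c("B", "C", "A", "D"),                           # same key, other order
  region  = factor(c("North", "South", "North", "South")),  # categorical variable
  gdp     = c(8.9, 2.1, 57.0, 39.3)
)

# Inspect the structure: one row per case, one column per variable.
str(corruption)
str(economy)

# Link the two data frames on the common key variable.
combined <- merge(corruption, economy, by = "country")              # matching keys only
all_rows <- merge(corruption, economy, by = "country", all = TRUE)  # keep every key
```

In `combined` only the countries present in both data frames remain, while `all_rows` also keeps country "D" and fills its missing `score` with `NA`.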

In this course we will use online tutorials that can be found in the Brightspace course companion. For each meeting you will do a number of exercises that illustrate the homework for that week, and in class we will discuss your work. For this you need to have installed on your laptop the free, open-source software R, which can be downloaded from the CRAN website, and RStudio, which is distributed separately by its developer.

Your final grade for this course is based on a research paper that uses ‘existing data’ and formulates a research question and hypotheses drawing on theoretical knowledge you may have gained in other courses. The paper must comply with APA or APSA rules. Deadlines for handing in the proposal (20%), the first version (30%) and the final version (50%) can be found in the short course syllabus. Participation during the meetings may influence the rounding of the final course grade.

Mode of instruction

Assignments, exercises and discussions

We use R (version 4.0.3 or later), which is available for free as open-source software at www.r-project.org. Please make sure you have R installed on your computer before the course starts, and bring your computer to class.
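
If you are unsure which version you have, a quick check along the following lines can be run in the R console. It is only a convenience sketch; the variable names `required` and `installed` are our own.

```r
# Minimum version taken from the course description above.
required  <- "4.0.3"
installed <- getRversion()  # version of the R session you are running

if (installed >= required) {
  message("R ", installed, " is recent enough for this course.")
} else {
  message("R ", installed, " is older than ", required, "; please update.")
}
```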

Literature

See Brightspace

Assessment method

Research paper. Participation.

Timetable

See 'MyTimetable' for the actual schedule and locations.

Registration

See 'Practical Information'

Registration Exchange and Study Abroad students

Exchange and Study Abroad students, please see the Exchange students website for information on how to apply.