Admission requirements
None
Description
Due to a relatively recent surge of large-scale text digitization and spoken language transcription projects (by corpus linguists and digital humanities scholars, but also by commercial companies), an unprecedented amount of naturally occurring (rather than experimentally elicited) linguistic data from a wide range of languages and language varieties is ready to be queried and analyzed. Yet, as these electronic text databases – or ‘corpora’ – are growing not only in number but also in size, it is no longer feasible to subject them to more traditional (manual) types of data retrieval, annotation and analysis.
In this course, students will be familiarized with a range of computational methods (in Python) to collect, process and analyze corpus data. At the same time, students will also be introduced to different strands of computational corpus research and the questions they can answer. To this end, we survey case studies from different fields. Finally, ethical issues (e.g. copyright, privacy and bias) associated with large-scale corpus analysis will also be discussed.
Course objectives
After successful completion of this course, students will:
have gained in-depth knowledge of recent developments in computational corpus analysis;
be able to explain the challenges, opportunities and pitfalls of using corpora in linguistic analysis;
be able to collect, process and analyze corpus data using computational methods;
be able to structure and write a detailed corpus research report.
Timetable
The timetables are available through My Timetable.
Mode of instruction
Seminar.
Assessment method
Assessment
The course is assessed by means of a final research paper plus a number of practical assignments throughout the term.
To pass the course, students may miss no more than two sessions during the semester.
To pass the course, a final mark of at least 5.5 must be obtained. Students who score below 5.5 may submit a resit essay.
Weighting
The final mark is based on the grade for the final paper plus the additional requirement that the practical assignments throughout the term are completed with a sufficient result.
Weighting is as follows: paper 90%; practical assignments throughout the term 10%.
Resit
The end-of-term essay can be revised and submitted as a resit essay if the score is between 4.5 and 5.5.
If the end-of-term essay has a score below 4.5, a resit essay should be submitted on a new topic.
The essay resit will constitute 100% of the final grade.
Please note that there is no resit for the practical assignments completed throughout the term.
Inspection and feedback
How and when an exam review will take place will be disclosed together with the publication of the exam results at the latest. If a student requests a review within 30 days after publication of the exam results, an exam review will have to be organized.
Reading list
Course readings will be made available on Brightspace via open access and Leiden University Library resources.
Registration
Enrolment through MyStudyMap is mandatory.
General information about course and exam enrolment is available on the website.
Registration Studeren à la carte and Contractonderwijs
Not applicable.
Contact
For questions related to the content of the course, please contact the lecturer. You can find their contact information by clicking on their name in the sidebar.
For questions regarding enrolment, please contact the Education Administration Office Reuvensplaats.
For questions regarding your study progress, please contact the Coordinator of Studies.
Remarks
All other information.