Study Guide

Corpus Lexicography

Course
2010-2011

Admission requirements

Description

This course introduces students to lexicography, the theory and practice of compiling dictionaries, with a focus on its computational and corpus-linguistic aspects.

The course discusses various computational tools lexicographers use whilst making a dictionary. In technologically advanced dictionary-making, lexicographers work with two main systems on their computer: the Corpus Query System for analysis and the Dictionary Writing System for synthesis. Both systems will be covered in this course.

This course also teaches students about the theoretical and practical issues involved in compiling complex data sets for lexicographic purposes. Students will learn about corpus design and annotation, computational lexicons and semantic networks. They will also learn how to manipulate text using regular expressions and gain a basic grounding in databases for lexicography.
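As an illustration of the kind of text manipulation mentioned above, the following is a minimal sketch in Python (the course itself lists Perl; Python is used here only for illustration). It splits a text into word tokens with a regular expression and builds a frequency list. The sample sentence, the function names `tokenise` and `frequency_list`, and the token pattern are all assumptions made for this example, not course material.

```python
import re
from collections import Counter

# Hypothetical sample text; a real corpus would be read from files.
SAMPLE = "The cat sat on the mat. The dog sat, too."


def tokenise(text):
    """Lowercase the text and return its word tokens.

    A token is a run of letters a-z, optionally containing internal
    apostrophes (so "don't" stays one token). This is a deliberate
    simplification that ignores digits and non-ASCII letters.
    """
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def frequency_list(tokens):
    """Return (token, count) pairs, most frequent first, ties alphabetical."""
    counts = Counter(tokens)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


tokens = tokenise(SAMPLE)
freq = frequency_list(tokens)

for word, count in freq:
    print(f"{count:3d}  {word}")
```

A real tokeniser for lexicographic work would also need to handle hyphenation, abbreviations and language-specific characters, which this pattern does not attempt.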

We will illustrate the theory with practical activities, such as compiling a corpus and preparing it for lexicographic use, and working with a dictionary writing system and a corpus query system.
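To give an idea of what a corpus query system produces, the sketch below builds a very small keyword-in-context (KWIC) concordance in Python. It is an illustrative toy under stated assumptions: the function name `kwic`, the sample text and the context width are hypothetical, and real corpus query systems work on indexed, annotated corpora rather than a plain string.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, node, right context) for every hit of keyword.

    Contexts contain up to `width` tokens on each side. Matching is
    case-insensitive and on whole tokens only.
    """
    tokens = re.findall(r"\w+", text.lower())
    node = keyword.lower()
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


# Hypothetical two-sentence "corpus" showing two senses of "bank".
SAMPLE = "The bank raised interest rates. We walked along the river bank at dawn."

hits = kwic(SAMPLE, "bank", width=2)
for left, node, right in hits:
    # Right-align the left context so the node words line up in a column.
    print(f"{left:>20} [{node}] {right}")
```

Aligning the node word in a column, as the print loop does, is what lets a lexicographer scan the neighbouring words quickly and spot distinct senses or recurring collocates.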

Course objectives

By the end of the course, students will have acquired knowledge of the following concepts and subjects:

  • Computational lexicography

  • Dictionary Writing Systems

  • Corpus linguistics

    • History of corpus linguistics
    • Corpus design: size, sampling, text type, genre, XML/TEI encoding, metadata
    • Corpus analysis and annotation: POS-tagging, tokenisation, lemmatisation, parsing, frequency lists
    • Corpus Query Systems: concordance, collocations, word sketches
  • Computational lexicons

    • Knowledge representation
    • Inheritance
  • Perl/Databases

    • Regular expressions
    • Simple Perl scripts
    • Database structure for lexicography
  • Semantic Networks

    • WordNet, FrameNet, semagram, ontology, taxonomy, hierarchy
    • Semantic relations
  • Dictionary use

    • Log files
    • Forum

Timetable

One two-hour lecture per week.

Mode of instruction

Lectures and tutorials/practical sessions

Assessment method

Essay

Blackboard

A Blackboard site will be made available.

Reading list

To be announced.

Registration

Exchange and Study Abroad students: please see the Study in Leiden website for information on how to apply.

Contact information

http://www.hum.leiden.edu/lucl/research-master-linguistics/

Remarks