Computational Lexicography

9. Computational Lexicography

Quiz 1

1. What is a KWIC concordance?

A KWIC (keyword in context) concordance is a corpus-based word list in which each word of a text corpus is paired with the words it stands in relation to, that is, with its left and right context in the text.

2. Which are the two main components of lexicon construction based on empirical data?

The two main components of lexicon construction based on empirical data are the two levels that are distinguished: the corpus (with primary and secondary data) and the lexicon.

3. Which layers of abstraction are involved in corpus acquisition?

There are two layers involved in corpus acquisition: primary data (audio/video recordings) and secondary data (transcription, annotation, metadata).
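
As a rough illustration of the two layers, the sketch below models one corpus item as primary data (a recording) plus secondary data (transcription, annotation, metadata). The class names, field names, file name and sample values are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class SecondaryData:
    """Secondary layer: everything derived from or describing the recording."""
    transcription: str
    annotation: dict = field(default_factory=dict)
    metadata: dict = field(default_factory=dict)


@dataclass
class CorpusItem:
    """One corpus item: a primary recording plus its secondary data."""
    recording_path: str  # primary data (audio/video file); hypothetical path
    secondary: SecondaryData


item = CorpusItem(
    recording_path="interview01.wav",
    secondary=SecondaryData(
        transcription="the cat sat on the mat",
        annotation={"cat": "noun", "sat": "verb"},
        metadata={"speaker": "A"},
    ),
)
print(item.secondary.transcription)
```

Keeping the recording and its secondary data in separate objects mirrors the distinction made in the answer: the recording stays untouched while transcription, annotation and metadata can be revised.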

4. Which layers of abstraction are involved in lexicon construction? Describe them.

The layers of abstraction start with a corpus lexicon consisting of word lists. Next comes a lexicon matrix with data categories but no generalisations, then a lexicon with selected generalisations. Finally, a lexicon with hierarchies is created.

5. Which layer do standard dictionary types typically belong to?

To the third layer: a lexicon with selected generalisations, for example semasiological or onomasiological dictionaries.
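
A semasiological dictionary goes from a word form to its meanings, while an onomasiological one goes from a meaning to the forms that express it. The sketch below shows how both views could be derived from a small lexicon matrix. The entries, the data category names and the helper `build_index` are invented for illustration only.

```python
from collections import defaultdict

# Lexicon matrix: one row per entry, one column per data category.
# Entries and categories are invented for this example.
lexicon_matrix = [
    {"form": "couch", "pos": "noun", "meaning": "seat for several people"},
    {"form": "sofa", "pos": "noun", "meaning": "seat for several people"},
    {"form": "bank", "pos": "noun", "meaning": "financial institution"},
    {"form": "bank", "pos": "noun", "meaning": "side of a river"},
]


def build_index(matrix, key, value):
    """Group the values of one data category under the values of another."""
    index = defaultdict(list)
    for row in matrix:
        index[row[key]].append(row[value])
    return dict(index)


# Semasiological view: look up a form, get its meanings.
semasiological = build_index(lexicon_matrix, "form", "meaning")
# Onomasiological view: look up a meaning, get the forms that express it.
onomasiological = build_index(lexicon_matrix, "meaning", "form")

print(semasiological["bank"])
print(onomasiological["seat for several people"])
```

Both dictionaries are selections from the same matrix; only the data category used as the lookup key changes.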

Quiz 2

1. What are the 6 main steps in KWIC concordance construction? Explain each of these steps.

a) Corpus creation

Create a written text (or transform a spoken message into a written text).

b) Tokenisation

First the punctuation marks are deleted, then the text is divided into units (words).

c) Keyword list extraction

Create a list of the words that occur in the text in alphabetical order and delete the duplicates.

d) Context collation

The keywords are put in context (left and right context).

e) Keyword search

Search for a keyword together with its left and right context.

f) Output formatting

The output is a list of keywords, arranged alphabetically, with their left and right context.
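
The six steps above can be sketched in a few lines of Python. This is only a minimal illustration, not the demonstration software from the course. The function names, the sample text, the lowercasing of words and the context width of two words are all assumptions made for this example.

```python
import string


def tokenise(text):
    """Step b: delete punctuation marks, then split the text into words."""
    cleaned = text.translate(str.maketrans("", "", string.punctuation))
    # Lowercasing is an assumption, so that "The" and "the" count as one word.
    return cleaned.lower().split()


def extract_keywords(tokens):
    """Step c: alphabetical word list without duplicates."""
    return sorted(set(tokens))


def collate_contexts(tokens, width=2):
    """Step d: pair every token with its left and right context."""
    contexts = []
    for i, word in enumerate(tokens):
        left = " ".join(tokens[max(0, i - width):i])
        right = " ".join(tokens[i + 1:i + 1 + width])
        contexts.append((left, word, right))
    return contexts


def search(contexts, keyword):
    """Step e: find all occurrences of a keyword with their contexts."""
    return [c for c in contexts if c[1] == keyword]


def format_output(keywords, contexts):
    """Step f: keywords in alphabetical order, each with its contexts."""
    lines = []
    for keyword in keywords:
        for left, word, right in search(contexts, keyword):
            lines.append(f"{left:>15} | {word} | {right}")
    return lines


# Step a: corpus creation (here simply a short written text).
corpus = "The cat sat on the mat. The dog sat too."
tokens = tokenise(corpus)
keywords = extract_keywords(tokens)
contexts = collate_contexts(tokens)
print("\n".join(format_output(keywords, contexts)))
```

Each keyword appears once per occurrence in the text, so a word such as "sat" yields two lines, one for each of its contexts.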

Quiz 3

1. In which programming languages could the concordance software be implemented?

2. What are the problems with the demonstration software which need to be removed in a later realistic project?


HOMEWORK: see quizzes

29.1.07 21:21