AUTHOR:      Roland Meyer
AFFILIATION: Universität Regensburg (Germany)
TITLE:       Tagging a historical corpus

ABSTRACT

The task of annotating a historical corpus with morphosyntactic
information poses problems which partly overlap with those of
synchronic tagging, and partly call for new methods. In this talk I
will outline the specific problems involved and explore two
approaches, using data from a diachronic corpus of Russian: (i)
finite-state morphology and statistical disambiguation (the common
approach for synchronic corpora), and (ii) annotation projection from
tagged modern translations. A key issue that both approaches must
cope with is the considerable variation in orthography, phonology and
morphology found in historical texts.
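
As a rough illustration of approach (ii), the following minimal sketch
normalizes a few pre-reform Russian spellings and then copies tags from
an aligned modern token sequence onto the historical tokens. Everything
in it is an assumption made for illustration: the character map, the
tag labels, the function names, and the crude alignment by matching
normalized forms are hypothetical and are not taken from the talk.

```python
# Hypothetical map from pre-reform letters to modern ones (illustrative only).
CHAR_MAP = {"ѣ": "е", "і": "и", "ѳ": "ф", "ѵ": "и"}


def normalize(token):
    """Lowercase, replace obsolete letters, drop a word-final hard sign."""
    mapped = "".join(CHAR_MAP.get(ch, ch) for ch in token.lower())
    if mapped.endswith("ъ"):
        mapped = mapped[:-1]
    return mapped


def align_by_form(historical, modern_tagged):
    """Crude stand-in for word alignment: pair each historical token with
    the first modern token whose form equals its normalized form."""
    modern_forms = [form.lower() for form, _tag in modern_tagged]
    pairs = []
    for h_idx, token in enumerate(historical):
        norm = normalize(token)
        if norm in modern_forms:
            pairs.append((h_idx, modern_forms.index(norm)))
    return pairs


def project_tags(historical, modern_tagged, alignment):
    """Copy each aligned modern tag onto its historical token.
    Unaligned historical tokens keep the tag None."""
    tags = [None] * len(historical)
    for h_idx, m_idx in alignment:
        tags[h_idx] = modern_tagged[m_idx][1]
    return tags


historical = ["въ", "лѣсъ"]
modern_tagged = [("в", "PREP"), ("лес", "NOUN")]  # hypothetical tag labels
alignment = align_by_form(historical, modern_tagged)
projected = project_tags(historical, modern_tagged, alignment)
```

A realistic setting would need a proper word aligner and a much richer
treatment of variation; exact matching after normalization only covers
cases where the historical and modern forms differ in spelling alone.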