In Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004), pp. 1235–1238.
This article describes Poliqarp, a corpus indexing and query tool, which understands positional tagsets and which does not assume that word forms are annotated with unique morphosyntactic tags. Poliqarp is designed to be applicable to a variety of languages and tagsets: it works with XML-encoded texts, uses the UTF-8 character set, and allows for an external specification of the tagset. Currently, Poliqarp is used for indexing and searching a morphosyntactically annotated corpus of Polish.
BibTeX entry:
@InProceedings{prz:etal:04a,
  author    = "Adam Przepiórkowski and Zygmunt Krynicki and Dębowski, Łukasz and Marcin Woliński and Daniel Janus and Piotr Bański",
  title     = "A Search Tool for Corpora with Positional Tagsets and Ambiguities",
  booktitle = "Proceedings of the Fourth International Conference on Language Resources and Evaluation, LREC\,2004",
  pages     = "1235--1238",
  year      = 2004
}