In the Proceedings of the Fourth International Conference on Language Resources and Evaluation, LREC 2004, pp. 1235--1238.
This article describes Poliqarp, a corpus indexing and query tool that supports positional tagsets and does not assume that each word form is annotated with a single, unique morphosyntactic tag. Poliqarp is designed to be applicable to a variety of languages and tagsets: it works with XML-encoded texts, uses the UTF-8 encoding, and allows the tagset to be specified externally. Currently, Poliqarp is used for indexing and searching a morphosyntactically annotated corpus of Polish.
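To illustrate the two properties named in the abstract, here is a minimal Python sketch of a positional tagset with ambiguous annotation. It is not Poliqarp's actual data model or query language. The colon-separated tag format, the `TAGSET` table, the sample tokens, and the names `parse_tag`, `Token`, and `matches` are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical external tagset specification: for each grammatical class,
# the attribute stored at each position of its tags.
TAGSET = {
    "subst": ("number", "case", "gender"),
    "adj": ("number", "case", "gender", "degree"),
}


def parse_tag(tag: str) -> dict:
    """Split a colon-separated positional tag into named attributes."""
    pos, *values = tag.split(":")
    attrs = dict(zip(TAGSET[pos], values))
    attrs["pos"] = pos
    return attrs


@dataclass
class Token:
    orth: str
    tags: list  # all interpretations kept by the annotation, not just one


def matches(token: Token, **constraints) -> bool:
    """True if at least one single interpretation meets every constraint."""
    return any(
        all(parse_tag(tag).get(key) == value for key, value in constraints.items())
        for tag in token.tags
    )


# Illustrative tokens (assumed annotation): each form is ambiguous
# between nominative and accusative.
corpus = [
    Token("domy", ["subst:pl:nom:m3", "subst:pl:acc:m3"]),
    Token("dom", ["subst:sg:nom:m3", "subst:sg:acc:m3"]),
]

hits = [t.orth for t in corpus if matches(t, case="acc", number="pl")]
```

In this sketch a query constrains individual tag positions by name, and a token is returned when any one of its interpretations satisfies all of the constraints, so ambiguous annotation does not have to be resolved before searching.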
BibTeX entry:
@InProceedings{prz:etal:04a,
author = "Adam Przepiórkowski and Zygmunt Krynicki and
Łukasz Dębowski and Marcin Woliński and Daniel
Janus and Piotr Bański",
title = "A Search Tool for Corpora with Positional Tagsets
and Ambiguities",
booktitle = "Proceedings of the Fourth International Conference on
Language Resources and Evaluation, LREC\,2004",
pages = "1235--1238",
year = 2004}