A Search Tool for Corpora with Positional Tagsets and Ambiguities

Adam Przepiórkowski, Zygmunt Krynicki, Łukasz Dębowski, Marcin Woliński, Daniel Janus and Piotr Bański

In the Proceedings of the Fourth International Conference on Language Resources and Evaluation, LREC 2004, pp. 1235--1238.


Abstract:

This article describes Poliqarp, a corpus indexing and query tool, which understands positional tagsets and which does not assume that word forms are annotated with unique morphosyntactic tags. Poliqarp is designed to be applicable to a variety of languages and tagsets: it works with XML-encoded texts, uses the UTF-8 character set, and allows for an external specification of the tagset. Currently, Poliqarp is used for indexing and searching a morphosyntactically annotated corpus of Polish.


BibTeX entry:

@InProceedings{prz:etal:04a,
  author =       "Adam Przepiórkowski and Zygmunt Krynicki and
                  Łukasz Dębowski and Marcin Woliński and Daniel
                  Janus and Piotr Bański",
  title =        "A Search Tool for Corpora with Positional Tagsets
                  and Ambiguities",
  booktitle =    "Proceedings of the Fourth International Conference on
                  Language Resources and Evaluation, LREC\,2004",
  pages =        "1235--1238",
  year =         2004}

Creation Date: Tuesday, December 23, 2003
Last Modified: Tue Jun 7 22:24:05 CEST 2005
AP