In the proceedings of Intelligent Information Systems 2004 (New Trends in Intelligent Information Processing and Web Mining).
The aim of this article is to present the initial results of adapting SProUT, a multi-lingual Natural Language Processing platform developed at DFKI, Germany, to the processing of Polish. The article describes some of the problems posed by the integration of Morfeusz, an external morphological analyzer for Polish, and several ways of compensating for the lack of extensive gazetteers for Polish. The main sections of the article report on some initial experiments in applying the adapted system to the Information Extraction task of identifying various classes of Named Entities in financial and medical texts, perhaps the first such Information Extraction effort for Polish.
BibTeX entry:
@string{sv = "Springer-Verlag"}

@InCollection{pis:etal:03,
  author = "Jakub Piskorski and Peter Homola and Małgorzata Marciniak and Agnieszka Mykowiecka and Adam Przepiórkowski and Marcin Woliński",
  title = "Information Extraction for {P}olish Using the {SProUT} Platform",
  crossref = "klo:etal:04:ed",
  pages = "227--236"
}

@Book{klo:etal:04:ed,
  editor = "Mieczysław A. Kłopotek and Sławomir T. Wierzchoń and Krzysztof Trojanowski",
  title = "Intelligent Information Processing and Web Mining",
  booktitle = "Intelligent {I}nformation {P}rocessing and {W}eb {M}ining",
  publisher = sv,
  year = 2004,
  series = "Advances in Soft Computing",
  address = "Berlin"
}