Bibliography

Author | Title | Year | Journal/Proceedings | Reftype | DOI/URL
Geyken, A. & Hanneforth, T. | TAGH: A Complete Morphology for German based on Weighted Finite State Automata | 2006 | Finite-State Methods and Natural Language Processing, 5th International Workshop, FSMNLP 2005, Helsinki, Finland, September 1-2, 2005. Revised Papers | incollection | DOI: 10.1007/11780885_7; URL: http://www.dwds.de/dokumente/Geyken_Hanneforth_fsmnlp_2005.pdf
Abstract: TAGH is a system for automatic recognition of German word forms. It is based on a stem lexicon with allomorphs and a concatenative mechanism for inflection and word formation. Weighted FSA and a cost function are used to determine the correct segmentation of complex forms: the correct segmentation for a given compound is taken to be the one with the least cost. TAGH is based on a large stem lexicon of almost 80,000 stems that was compiled within 5 years on the basis of large newspaper corpora and literary texts. The number of analyzable word forms is increased considerably by more than 1000 different rules for derivational and compositional word formation. The recognition rate of TAGH is more than 99% for modern newspaper text and approximately 98.5% for literary texts.
BibTeX:
@incollection{Geyken2006a,
  author = {Geyken, Alexander and Hanneforth, Thomas},
  title = {T{AGH}: {A} {C}omplete {M}orphology for {G}erman based on {W}eighted {F}inite {S}tate {A}utomata},
  booktitle = {{F}inite-{S}tate {M}ethods and {N}atural {L}anguage {P}rocessing, 5th International Workshop, FSMNLP 2005, Helsinki, Finland, September 1-2, 2005. Revised Papers},
  publisher = {Springer},
  year = {2006},
  volume = {4002},
  pages = {55--66},
  url = {http://www.dwds.de/dokumente/Geyken_Hanneforth_fsmnlp_2005.pdf},
  doi = {10.1007/11780885_7}
}
Geyken, A. & Schrader, N. | LexikoNet, a lexical database based on role and type hierarchies | 2006 | Proceedings of LREC | inproceedings | URL: http://www.dwds.de/dokumente/LREC2006_Geyken_Schrader.pdf
Abstract: In this paper, LexikoNet, a large lexical ontology of German nouns, is presented. Unlike GermaNet and the Princeton WordNet, LexikoNet has distinguished type and role hypernyms right from the outset and organizes those lexemes in a parallel, independent hierarchy. In addition to roles and types, LexikoNet uses meronymic and holonymic relations as well as the instance relation. LexikoNet is based on a conceptual hierarchy of currently 1,470 classes to which approximately 90,000 word senses taken from a large German monolingual dictionary, the Wörterbuch der deutschen Gegenwartssprache (WDG), are attached. The conceptual classes provide a useful degree of abstraction for the lexicographic description of selectional restrictions, thus making LexikoNet a useful filtering tool for corpus-based lexicographic analysis. LexikoNet is currently used in-house as a filter for lexicographic extraction tasks in the DWDS project. Furthermore, it is used as a classification tool for the 'words of the week' provided for the newspaper Die ZEIT on www.zeit.de.
BibTeX:
@inproceedings{Geyken2006,
  author = {Geyken, Alexander and Schrader, Norbert},
  title = {{L}exiko{N}et, a lexical database based on role and type hierarchies},
  booktitle = {Proceedings of LREC},
  year = {2006},
  url = {http://www.dwds.de/dokumente/LREC2006_Geyken_Schrader.pdf}
}
Hanneforth, T. | Longest-Match Pattern Matching with Weighted Finite-State Automata | 2006 | Finite-State Methods and Natural Language Processing, 5th International Workshop, FSMNLP 2005, Helsinki, Finland, September 1-2, 2005. Revised Papers | incollection | DOI: 10.1007/11780885_9
BibTeX:
@incollection{Hanneforth2006,
  author = {Hanneforth, Thomas},
  title = {{L}ongest-Match {P}attern {M}atching with {W}eighted {F}inite-State {A}utomata},
  booktitle = {Finite-State Methods and Natural Language Processing, 5th International Workshop, FSMNLP 2005, Helsinki, Finland, September 1-2, 2005. Revised Papers},
  publisher = {Springer},
  year = {2006},
  volume = {4002},
  pages = {78--85},
  doi = {10.1007/11780885_9}
}