Allomorfessor: Towards Unsupervised Morpheme Analysis

Reference:

Oskar Kohonen, Sami Virpioja, and Mikaela Klami. Allomorfessor: Towards unsupervised morpheme analysis. In Evaluating Systems for Multilingual and Multimodal Information Access: 9th Workshop of the Cross-Language Evaluation Forum, CLEF 2008 Aarhus, Denmark, September 17-19, 2008, Revised Selected Papers, volume 5706 of Lecture Notes in Computer Science, pages 975–982. Springer, 2009.

Abstract:

We extend the unsupervised morpheme segmentation method Morfessor Baseline to account for the linguistic phenomenon of allomorphy, where one morpheme has several different surface forms. Our method discovers common base forms for allomorphs from an unannotated corpus. We evaluate the method by participating in the Morpho Challenge 2008 competition, where inferred analyses are compared against a linguistic gold standard. While our competition entry achieves high precision but low recall, and therefore low F-measure scores, we show that a small model change gives state-of-the-art results.

Suggested BibTeX entry:

@incollection{okohonenvirpiojaklami_2009,
    author = {Oskar Kohonen and Sami Virpioja and Mikaela Klami},
    booktitle = {Evaluating Systems for Multilingual and Multimodal Information Access: 9th Workshop of the Cross-Language Evaluation Forum, CLEF 2008 Aarhus, Denmark, September 17-19, 2008, Revised Selected Papers},
    pages = {975--982},
    publisher = {Springer},
    series = {Lecture Notes in Computer Science},
    title = {Allomorfessor: Towards Unsupervised Morpheme Analysis},
    volume = {5706},
    year = {2009},
}
