Using Correlation Dimension for Analysing Text Data


Ilkka Kivimäki, Krista Lagus, Ilari T. Nieminen, Jaakko Väyrynen, and Timo Honkela. Using Correlation Dimension for Analysing Text Data. Proceedings of ICANN 2010, Artificial Neural Networks, pages 368–373, 2010.


In this article, we study the scale-dependent dimensionality and overall structure of text data with a method that measures correlation dimension at different scales. As experimental results, we present an analysis of text data sets drawn from the Reuters and Europarl corpora and compare them with artificially generated point sets. A comparison is also made with speech data. The results reflect some typical properties of the data, and we discuss how the method could be used to improve various data analysis applications.
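To illustrate what measuring correlation dimension at a given scale can mean, here is a minimal sketch. It is not the authors' implementation: it assumes the standard correlation sum C(r), the fraction of point pairs closer than r, and reads the dimension at a scale as the slope of log C(r) against log r. The function names, the radii, and the artificial point set are all hypothetical choices made for the example.

```python
import math
import random

def correlation_integral(points, r):
    """Fraction of distinct point pairs whose Euclidean distance is below r."""
    n = len(points)
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < r:
                close += 1
    return close / (n * (n - 1) / 2)

def local_dimension(points, r1, r2):
    """Slope of log C(r) against log r between the scales r1 < r2.

    Assumes both radii are large enough that C(r) is non-zero.
    """
    c1 = correlation_integral(points, r1)
    c2 = correlation_integral(points, r2)
    return (math.log(c2) - math.log(c1)) / (math.log(r2) - math.log(r1))

# Artificial point set (illustrative only): uniform samples on a line
# segment embedded in 3-D, so the estimate should be close to 1.
rng = random.Random(0)
line = [(t, 2 * t, -t) for t in (rng.random() for _ in range(400))]
dim_line = local_dimension(line, 0.05, 0.2)
```

Repeating the slope estimate over a range of radius pairs gives a dimension curve as a function of scale. The pairwise loop is quadratic in the number of points, so a real corpus would likely need subsampling or a faster neighbour search.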

Suggested BibTeX entry:

    author = {Ilkka Kivim{\"a}ki and Krista Lagus and Ilari T. Nieminen and Jaakko V{\"a}yrynen and Timo Honkela},
    journal = {Proceedings of ICANN 2010, Artificial Neural Networks},
    pages = {368--373},
    publisher = {Springer},
    title = {{Using Correlation Dimension for Analysing Text Data}},
    year = {2010},