Sequential Information Bottleneck for Finite Data

Reference:

Jaakko Peltonen, Janne Sinkkonen, and Samuel Kaski. Sequential information bottleneck for finite data. In Russ Greiner and Dale Schuurmans, editors, Proceedings of ICML 2004, the Twenty-First International Conference on Machine Learning, pages 647–654, Madison, WI, 2004. Omnipress.

Abstract:

The sequential information bottleneck (sIB) algorithm clusters co-occurrence data such as text documents vs. words. We introduce a variant that models sparse co-occurrence data by a generative process. This turns the objective function of sIB, mutual information, into a Bayes factor, while keeping it intact asymptotically, for non-sparse data. Experimental performance of the new algorithm is comparable to the original sIB for large data sets, and better for smaller, sparse sets.
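
As a rough illustration of the quantity the abstract names as the sIB objective, the sketch below computes the empirical mutual information between cluster labels and words for a hard clustering of a document-by-word count matrix. It is not the Bayes-factor variant introduced in the paper, and it does not perform the sequential optimization. The function name, argument layout, and toy data are assumptions made for this example.

```python
import math


def clustering_mutual_information(counts, labels, n_clusters):
    """Empirical mutual information I(C; W), in nats, of a hard clustering.

    counts: list of per-document word-count rows (hypothetical layout).
    labels: cluster index for each document, in range(n_clusters).
    """
    n_words = len(counts[0])

    # Pool the word counts of all documents assigned to the same cluster.
    cluster_counts = [[0.0] * n_words for _ in range(n_clusters)]
    for row, c in zip(counts, labels):
        for w, n in enumerate(row):
            cluster_counts[c][w] += n

    total = sum(sum(row) for row in cluster_counts)

    # Marginal distributions over clusters and over words.
    p_c = [sum(row) / total for row in cluster_counts]
    p_w = [sum(row[w] for row in cluster_counts) / total for w in range(n_words)]

    # Sum p(c, w) * log(p(c, w) / (p(c) p(w))) over non-zero cells only.
    mi = 0.0
    for c in range(n_clusters):
        for w in range(n_words):
            p = cluster_counts[c][w] / total
            if p > 0:
                mi += p * math.log(p / (p_c[c] * p_w[w]))
    return mi


# Toy data: two documents with disjoint vocabularies.
toy_counts = [[2, 0], [0, 2]]
separated = clustering_mutual_information(toy_counts, [0, 1], 2)
merged = clustering_mutual_information(toy_counts, [0, 0], 2)
```

On this toy matrix, placing the two documents in separate clusters gives log 2 nats, while merging them into one cluster gives zero. With so few counts the estimate rests on very little evidence, which is the sparse, finite-data setting the abstract is concerned with.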

Suggested BibTeX entry:

@inproceedings{Peltonen04,
    address = {Madison, WI},
    author = {Jaakko Peltonen and Janne Sinkkonen and Samuel Kaski},
    booktitle = {Proceedings of ICML 2004, the Twenty-First International Conference on Machine Learning},
    editor = {Russ Greiner and Dale Schuurmans},
    pages = {647--654},
    publisher = {Omnipress},
    title = {Sequential Information Bottleneck for Finite Data},
    year = {2004},
}

See www.cis.hut.fi ...