(aside image)

Finding Dependent and Independent Components from Related Data Sets

In this work, we consider an extension of independent component analysis (ICA) and blind source separation (BSS) for separating mutually dependent and independent components from two different but related data sets. This problem is important in practice, because such data sets are common in real-world applications. We propose a new method which first uses canonical correlation analysis (CCA) to detect the subspaces of independent and dependent components. The data sets are then mapped onto these subspaces. Even plain CCA can provide a coarse separation in simple cases, and we justify this property somewhat heuristically. Better separation results are obtained by applying a suitable ICA or BSS method to the mapped data sets. Any ICA or BSS method can be used for post-processing the results given by CCA. These methods are based on somewhat different assumptions on the data, such as non-Gaussianity, temporal correlatedness, or nonstationary variances, and are therefore applicable to somewhat different situations.
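The CCA stage described above can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions, not the implementation used in the publications: it computes CCA in one standard way (whitening each data set and taking the singular value decomposition of the cross-covariance matrix of the whitened data), and the function names `whiten` and `cca_split` as well as the `threshold` parameter are hypothetical choices made for this example.

```python
import numpy as np

def whiten(X):
    """Center X (channels x samples) and whiten it with symmetric (PCA-based) whitening."""
    Xc = X - X.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(Xc @ Xc.T / Xc.shape[1])
    V = (E / np.sqrt(d)) @ E.T          # E diag(d^-1/2) E^T
    return V @ Xc

def cca_split(X, Y, threshold=0.5):
    """Map two data sets onto their canonical variates and split them into
    a 'dependent' part (canonical correlation above threshold) and an
    'independent' part (the rest). The threshold is an assumed tuning knob."""
    Zx, Zy = whiten(X), whiten(Y)
    T = Zx.shape[1]
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, rho, Wt = np.linalg.svd(Zx @ Zy.T / T)
    Px = U.T @ Zx                       # canonical variates of X, sorted by decreasing rho
    Py = Wt @ Zy                        # canonical variates of Y, same ordering
    k = int(np.sum(rho > threshold))    # estimated number of dependent components
    return Px[:k], Px[k:], Py[:k], Py[k:], rho
```

Each of the four returned blocks could then be passed to an ICA or BSS method of choice for the final separation; that step is left out here because it depends on which method suits the data.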

The proposed method is straightforward to implement and computationally not too demanding. CCA preprocessing often improves the separation results of the chosen ICA or BSS method quite markedly, especially in difficult separation problems. Not only are the signal-to-noise ratios of the separated sources clearly higher, but CCA also helps a method to separate sources that it cannot separate on its own. In the publications below, we present results for several well-known ICA and BSS methods, such as FastICA and TDSEP, for source signals that are difficult to separate. In the first paper we applied our method successfully to real-world robot grasping data, and in the second paper to real-world fMRI (functional magnetic resonance imaging) data.

Recently, we have generalized our method to three or more data sets by using a variance maximization generalization of CCA, implemented with a least-squares formulation. This generalization is presented in the last publication below.
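To illustrate what a variance maximization generalization of CCA can look like for several data sets, the sketch below implements one common formulation, often called MAXVAR: each data set is whitened, the whitened sets are stacked, and the leading right singular vectors of the stacked matrix are taken as the shared variates. These vectors solve the least-squares problem of finding orthonormal signals that every data set can approximate as well as possible by a linear map. Whether this matches the exact variant used in the publications is an assumption, and the function names `whiten_rows` and `maxvar_gcca` are hypothetical.

```python
import numpy as np

def whiten_rows(X):
    """Center and whiten one data set given as a (channels x samples) array."""
    Xc = X - X.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(Xc @ Xc.T / Xc.shape[1])
    return (E / np.sqrt(d)) @ E.T @ Xc

def maxvar_gcca(datasets, n_components):
    """MAXVAR-style generalized CCA for a list of data sets with equal sample count.

    Returns the shared variates G (n_components x samples, unit variance) and a
    score per variate. A score lies between 0 and the number of data sets;
    a value close to the number of data sets indicates a component that is
    present in all of them."""
    Z = np.vstack([whiten_rows(X) for X in datasets])
    T = Z.shape[1]
    # Right singular vectors of Z are eigenvectors of the summed projection
    # matrices onto the row spaces of the individual data sets.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = s[:n_components] ** 2 / T
    G = np.sqrt(T) * Vt[:n_components]
    return G, scores
```

As in the two-set case, the variates with high scores span an estimate of the dependent subspace, and an ICA or BSS method would be applied afterwards to the mapped data.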

Our results are summarized in the Neurocomputing journal paper listed below.

Publications

J. Karhunen and T. Hao, Finding dependent and independent components from two related data sets. In Proc. of Int. Joint Conf. on Neural Networks (IJCNN 2011), San Jose, California, USA, July 31 - August 5, 2011, pp. 457-466.

J. Karhunen, T. Hao, and J. Ylipaavalniemi, A canonical correlation analysis based method for improving BSS of two related data sets. In Proc. of the 10th Int. Conf. on Latent Variable Analysis and Signal Separation, Tel-Aviv, Israel, March 12-15, 2012. Lecture Notes in Computer Science, Springer, Vol. 7191, pp. 91-98, 2012.

J. Karhunen, T. Hao, and J. Ylipaavalniemi, A generalized canonical correlation analysis based method for blind source separation from related data sets. In Proc. of Int. Joint Conf. on Neural Networks (IJCNN 2012), Brisbane, Australia, June 2012. Publisher: IEEE.

J. Karhunen, T. Hao, and J. Ylipaavalniemi, Finding dependent and independent components from related data sets: A generalized canonical correlation analysis based method. Neurocomputing, Vol. 113, pp. 153-167, August 2013.

Figure caption

Winter in Nuuksio National Park, about 25 km northwest of our department building. Figure: Espoon matkailu / Tourism in Espoo.