Functional Connectivity during Resting-State Functional MR Imaging: Study of the Correspondence between Independent Component Analysis and Region-of-Interest−Based Methods

BACKGROUND AND PURPOSE: The connectivity across brain regions can be evaluated through fMRI either by using ICA or by means of correlation analysis of time courses measured in predefined ROIs. The purpose of this study was to investigate quantitatively the correspondence between the connectivity information provided by the 2 techniques. MATERIALS AND METHODS: In this study, resting-state fMRI data from 40 healthy participants were independently analyzed by using spatial ICA and ROI−based analysis. To assess the correspondence between the results provided by the 2 methods, for all combinations of ROIs, we compared the time course correlation coefficient with the corresponding “ICA coactivation index.” RESULTS: A strongly significant correspondence of moderate intensity was found for 20 ICA components (r = 0.44, P < .001). Repeating the analysis with 10, 15, 25, 30, 35, and 40 components, we found that the correlation remained but was weaker (r = 0.35–0.41). CONCLUSIONS: There is a significant but not complete correspondence between the results provided by ICA and ROI−based analysis of resting-state data.

F unctional connectivity can be defined as the coordination of activity across brain regions supporting the emergence of complex behavior. This coherence translates into temporal correlations of neural activity among anatomically distinct brain areas. 1,2 Functional connectivity can be investigated through fMRI by means of a range of methods that detect time course coherencies in the BOLD signal intensity across brain regions during the performance of active tasks and in the resting condition. 3,4 The study of resting-state functional connectivity is important from 2 perspectives: First, it provides information on the spontaneous activity that is intrinsically generated within the brain, which subserves communication across regions; and it provides integration of information, memory consolidation and introspection, and overall consumes more energy than stimulus-evoked activity. 5 Second, resting-state fMRI may be the only form of functional imaging viable in cognitively impaired patients who are unable to perform active tasks adequately.
While it is well-established that the observed BOLD signalintensity correlations are a consequence of neural activity as indexed by the local field potentials, [6][7][8] there is also a significant component related to systematic physiologic phenomena, the exact entity of which remains difficult to establish. 9 In clinical settings, this approach has enabled researchers to consistently identify differences between various patient groups and controls. For instance, it has revealed that patients with AD have reduced connectivity in both hippocampi, which is associated with decreased cognitive ability 10 and also across the DMN, the most commonly observed resting-state network, which encompasses the posterior cingulate cortex, the precuneus, and the inferior parietal and medial prefrontal regions. 11 There are 2 distinct methodologic approaches for studying functional connectivity through fMRI: One is to perform a completely data-driven analysis, for example, through ICA; the other is to rely on prior anatomic hypotheses to restrict the analysis to a predefined set of ROI or to a specific seed region.
ICA is a statistical technique that separates a set of signals into independent components (ie, it minimizes mutual information or maximizes non-Gaussianity). It assumes that the observed data are a linear combination of statistically independent source signals. More specifically, given a set of n temporally discrete signals [x 1 (t),x 2 (t) . . . x n (t)] represented by the data matrix X i,t , it is assumed that they are generated as a linear mixture of m independent source signals [s 1 (t), such that x i (t) ϭ a i,1 s 1 (t) ϩ a i,2 s 2 (t) ϩ . . . ϩ a i,m s m (t) for i ϭ 1 . . . n. Here, a represents an unknown mixing matrix, in which each element specifies the relative contribution of each independent-source signal intensity s j (t) to each mixture x i (t). The goal of ICA is to determine an unmixing matrix w ϭ a Ϫ1 such that s j (t) ϭ w j,1 x 1 (t) ϩw j,2 x 2 (t)ϩ . . . ϩw j,n x n (t), and, thereafter, the source signals themselves, on the assumption that they are statistically independent. The approach adopted in ICA is substantially different from that in PCA, which more simply minimizes correlation. ICA is widely preferred over PCA for fMRI because the latter emphasizes source variance and orthogonality, which do not lead to easily interpretable activation maps, whereas the higher order statistics used by ICA enhance solution sparsity, giving a better separation of artifacts and activity components. 12 When applied to fMRI, ICA allows one to discover the spatiotemporal structure contained in the data by extracting statistically independent spatial maps and their associated time courses. 12 ICA-based studies have identified components that appear to correspond to functionally relevant cortical networks such as visual and sensory-motor circuits, 13,14 as well as components that reflect physiologic processes, such as cardiac and respiratory activity, and nonphysiologic noise, such as imaging artifacts. 15 The alternative approach, ROIϪbased analysis, is based on a priori selection of regions, followed by extraction of regionally averaged BOLD signal-intensity time courses, which are fed into a linear correlation analysis. ROIϪbased and seedbased analyses are conceptually equivalent in the sense that they both infer connectivity from the temporal correlation of regional BOLD time courses, even though they are differentiated by the fact that ROIϪbased analysis compares regionally averaged signals over pairs of ROI (eg, Wang et al 16 ), whereas seed-based analysis compares the regionally averaged signal intensity from 1 seed ROI with that of all other individual voxels of the brain (eg, Cordes et al 17 ).
Each method has strengths and weaknesses, as reported by Fox and Raichle. 4 Because the ICA technique is data-driven, 1 advantage is that a temporal model of activation is not needed. Most important, ICA can automatically isolate sources of noise; however, it can be difficult to determine whether a component represents physiologic noise or a cortical network. Furthermore, the decomposition results can vary depending on the choice of the number of components, and the exact separation pattern may not be repeatable from 1 participant to another. 18 On the other hand, ROIϪbased analysis does not introduce interpretative issues, and results are relatively straightforward. However, it is based on a priori anatomic hypotheses, and the presence of non-neuronal fluctuations in the BOLD signal intensity can bias the observed correlations. 4 In principle, ICA and ROIϪbased analysis should lead to similar inferences because both index the same underlying connectivity. However, in practice, the 2 methods process the time-series in very different ways, and there are several reasons that they may not provide overlapping information. For example, let us consider 2 hypothetic ROI having strongly correlated BOLD signal-intensity time courses. If their common time course corresponds to 1 independent component or to the sum of a small number of independent components, then the 2 regions will appear coactivated on 1 or more ICA spatial map. If, however, the common time course is the combination of a large number of independent components, then it is likely that the activation of these 2 regions will not reach statistical significance on any individual ICA spatial map. Evaluating the degree to which ICA and ROIϪbased analysis leads to analogous inferences on connectivity appears necessary as a validity check.
A number of neuroimaging studies have shown that these 2 methods yield converging results [19][20][21][22][23][24] : For instance, the studies by Bluhm et al 20 and Long et al 21 have indicated that these 2 approaches identify the areas included in the DMN consistently, resulting in connectivity maps that are visually similar. Van Dijk et al 24 have examined the similarities between the 2 methods with a more quantitative analysis, albeit for 1 functional network only, showing that the correlation between ICA and the seed-based approach is moderate for the DMN (r ϭ 0.45). However, existing literature is lacking a comprehensive quantitative evaluation of the extent to which the results obtained with the ICA and ROIϪbased analyses are consistent at the level of a whole dataset rather than a specific circuit.
In this study, we addressed this issue, assessing quantitatively the degree of correspondence between the functional connectivity information provided by ICA and ROIϪbased analysis, in a group of healthy participants in a resting-state condition. As ancillary hypotheses, we aimed to determine whether the correspondence 1) was influenced by the number of components used for ICA, 2) was sensitive to the choice of the ICA coactivation index formula (see below), or was 3) specifically driven by a few intensely coactivated regions and specifically driven by particular combinations of regions. To this end, we included a large number of regions of interest (38 for hemisphere) because we did not want to limit our analyses to a predefined set of regions.

Participants and Procedure
Forty right-handed healthy volunteers (21 women and 19 men; mean, 40.8 Ϯ 9.3 years of age) with no history of neurologic or psychiatric disease participated in the study. The purpose of the experiment was explained at enrollment, and all participants, unpaid, provided written informed consent on standard institutional forms for research MR imaging. Participants were instructed to keep their eyes open, fixate on a cross centered on the screen, and relax, concentrating on their own breathing.

Data Acquisition
MR imaging was performed on a Magnetom Avanto 1.5T scanner (Siemens, Erlangen, Germany), by using an 8-channel phased-array receive-only head coil. Anatomic images were acquired with a magnetization-prepared gradient echo volumetric T1-weighted sequence (magnetization-prepared rapid acquisition of gradient echo, 1-mm 3 isotropic voxels, TR ϭ 1640 ms, TE ϭ 2 ms). Two hundred functional volumes were acquired by means of a gradient-echo echo-planar sequence (TR ϭ 1700 ms, TE ϭ 50 ms); twenty-one 5-mm sections were obtained in interleaved order, aligned parallel to the bicommissural plane. In-plane voxel size was 2 ϫ 2 mm, with a matrix size of 160 ϫ 256. The duration of the functional sequence was approximately 5 minutes.

Data Preprocessing
Image preprocessing was performed by using SPM5 software (Wellcome Department of Imaging Neuroscience, London, UK) running under Matlab 7 (MathWorks, Natick, Massachusetts). After realignment with 6 degrees-of-freedom and section-timing correction, functional images were coregistered with the corresponding anatomic volumes and subsequently transformed into Montreal Neurologic Institute space. Maximum intrasession head movement was 0.5 Ϯ 0.4 mm (range, 0.2-2 mm) across participants. Smoothing was, thereafter, performed with an 8-mm full width at half maximum isotropic Gaussian kernel.

ICA
Group ICA was performed by using GIFT software (http://icatb. sourceforge.net 25 ). As described previously, 25,26 the procedure consisted of the following steps: 1) data reduction at the individual level through PCA, 2) concatenation into a group dataset, 3) further data reduction with PCA, 4) decomposition into group-independent components by using the Infomax Algorithm, and 5) regular backreconstruction of individual maps and calculation of t-scores. For the group ICA, the MDL criterion indicated that the optimal number was 20 components. The MDL is an information-theoretic criterion, which corresponds to choosing the model permitting the most compact encoding of the data and model itself; this criterion is the one most frequently adopted to determine the optimal number of ICA components for a given dataset. 27 Furthermore, we repeated the ICA, setting the number of components to 10,15,25,30,35, and 40, with the sole purpose of evaluating the effect of the number of components on the correspondence with the ROIϪbased analysis. To verify the general validity of the dataset, we visually assessed the ICA spatial maps, to confirm the identification of the functional networks consistently described in previous work. 13,14,28

Region-of-Interest Analysis (ROI)
The cortical surface included in the section packet was subdivided into 76 ROIs (38 for each hemisphere) according to the AAL atlas (see the On-line Table and Fig 1). 29 The BOLD signal-intensity percentage change was calculated and averaged over all voxels in each region of interest. An individual bi-nary mask produced by SPM was used to remove all nonbrain parenchyma voxels. 24 The average time course calculated over all brain voxels was subtracted from the data, because this is the major criterion to remove artifactual inter-regional correlations caused by the biasing effect of non-neural brain-wide signal-intensity fluctuations. 30 Furthermore, low-pass filtering with a second-order Butterworth filter having f Ϫ3dB ϭ 0.15 Hz was applied to attenuate nonneuronal noise.
Then, we performed linear regressions to obtain the correlation coefficient for all possible pairs of regional time courses, resulting in a 76 ϫ 76 symmetric matrix of Pearson r values for each subject, subsequently averaged over all subjects. Since ICA analysis does not embed temporal filtering, we repeated these analyses on unfiltered data to verify the effect of the chosen filter settings on the observed correspondence.

Statistical Analysis
To address our primary hypothesis-that is, to evaluate the correspondence between the 2 methods-for each pair of regions of interest (A and B) an "ICA co-activation index" was defined as where ͗t i,A ͘ and ͗t i,B ͘ represent the average group-level t-scores in regions of interest A and B, and i ϭ 1. . . m corresponds to the summation over all extracted components. As implemented in GIFT, the group-level t-scores are derived from 1-sample t tests performed over the individual ␤ coefficients of the general linear models used to generate the component maps from the time courses extracted by ICA. In this process, all estimated components were included and no thresholding was applied to the ICA maps. The purpose of this index was to provide a metric resembling a correlation coefficient (albeit unbounded) but calculated on the basis of the ICA spatial maps rather than the regional BOLD signal-intensity time courses. The first calculation step (ie, multiplication of the average t-scores of the 2 regions of interest) ensured that for a given spatial map, the coactivation index would be positive for regions commonly correlating (or anticorrelating) within the ICA component map, negative for regions displaying opposite correlations, and near zero if either or both regions of interest were uncorrelated with the component time-series. The second calculation step (ie, summation of the products over all ICA components) ensured that all component maps contributed to determining Examples from a representative participant of the subdivision of the cortex into the AAL atlas ROIs used for the connectivity analysis. 29 The full list of ROIs is given in the On-line Table. the overall value of the coactivation index for a given pair of regions of interest.
To further explore the structure of our data, we added a power parameter k to the above formula, yielding For k ϭ 1, we have the original equation. For k Ͼ 1, the relative weight of intense coactivation of a small number of regions is increased with respect to that of less intense coactivation of a larger number of regions. For k Ͻ1, we have the converse effect. Evaluating the correspondence between the 2 techniques, we performed a correlation analysis at group level between the r values from the ROI analysis and the ICA coactivation indices. This was done by using both a linear correlation analysis and a nonparametric Spearman rank-order test and was repeated for 10, 15, 25, 30, 35, and 40 ICA components. For this test, diagonal entries (corresponding to each region compared with itself) were removed. Furthermore, to explore whether the correspondence between the 2 techniques was driven by specific regions, we performed an ANOVA on the average rank distance (as calculated during the Spearman rank-order test, where zero indicates equal and 1, the opposite position on the 2 sorted lists) by using ROI location as a factor.
We also calculated the mean and SD of the time course r value for each pair of regions of interest across subjects and following Cohen's criteria, 31 correlations were considered small (r Ͻ 0.3), moderate (0.3 Ͻ r Ͻ 0.5), or large (r Ͼ 0.5).

ICA Spatial Maps
Representative sections of the ICA spatial maps obtained by performing the decomposition with 20 components are shown in Fig 2. All components extracted by the decomposition are displayed. Overall, these maps display the connectivity patterns that have been previously described. 13,14,28 For example, components 2 and 3 exhibit lateralized activations in the frontal and parietal regions. Component 5 encompasses the medial prefrontal cortex, anterior and posterior cingulate, precuneus, and angular gyrus, which are collectively known as the DMN. Component 9 includes activations in the pre-and postcentral gyri and the supplementary motor area, corresponding to the sensory-motor network. Component 14 involves the superior temporal, insular, and postcentral cortices, which are acknowledged to form an auditory network. Component 19 involves the frontal polar, middle frontal, and anterior cingulate regions and has been linked to the executivecontrol component. By contrast, activation in component 16 closely follows the outline of the ventricles, demonstrating that it is physiologically determined, namely by cardiac-induced pulsation of the CSF. In addition, component 6 includes speckles following the brain outline, likely representing movement-related noise.

Region-of-Interest؊Based Analysis
The mean r value was 0.08 Ϯ 0.22 (range Ϫ0.44 -0.85), with 59% positive r values; for unfiltered data, Ͼ99% of r values were positive, yielding an average r value of 0.33 Ϯ 0.18 (range, Ϫ0.02-0.89). According to Cohen criteria, for filtered data the time course correlation was small (r Ͻ 0.3) for 83% of regionof-interest pairs, moderate (0.3 Ͻ r Ͻ 0.5) for 10%, and large (r Ͼ 0.5) for 7%. For each ROI, the relative number of other ROIs displaying a time course correlation with r Ͼ 0.3 and r Ͼ 0.5 is given in the On-line Table. Overall, regions belonging to the DMN had significantly higher scores than the rest (P ϭ .004).

Correspondence between ICA and ROI Analysis
For 20 components, the correspondence was moderate, according to the linear correlation analysis (r ϭ 0.44, P Ͻ .001) and to the corresponding Spearman rank-order tests (r ϭ 0.39, P Ͻ .001). As depicted in Fig 3A, repeating the analysis with 10,15,25,30,35, and 40 components, we found that the correlation remained but was weaker, according to both parametric and nonparametric tests. Performing the correspondence analysis on unfiltered data led to lower r values but otherwise overlapping results: The correlation between the 2 methods was strongest for 20 components with both parametric and nonparametric tests (r ϭ 0.36, F ϭ 449.7, P Ͻ .001 and r ϭ 0.32, P Ͻ .001, respectively) and weaker for all the other components. Figure 3B reports the results obtained sweeping the power parameter k between 0.25 and 4. The correspondence is strongest for k ϭ 1 and decreases for both k Ͼ 1 and k Ͻ 1. The effect is more marked for the parametric test, due to deviation from normality with increasing power. Figure 4 shows the time course r values and ICA coactivation indices, visualized as color-map matrices. The 2 matrices corresponding to filtered and unfiltered time series display analogous features; however, the average r value is markedly lower for filtered signals, due to removal of brain-wide biasing fluctuations. As expected, along the diagonal, the r values are unitary, and the ICA indices are largest, corresponding to each  region correlating with itself. Furthermore, values in the vicinity of the diagonal are relatively large, for both time course correlations and ICA, in intra-and interhemispheric quadrants, representing intense connectivity between each region, its neighbors, and their contralateral homologues. The ICAand ROIϪderived matrices, however, also demonstrate a marked difference. The time course r values tend to be relatively large for regions 20 -35 and 40 -50, corresponding to the medial frontal, cingulate, and occipital regions, whereas no such effect is observed for the ICA coactivation indices, for which the maps show a more diffuse and scattered pattern.
Irrespective of the number of components, the ANOVA on the average rank distance, though statistically significant (P Ͻ .001) due to the degrees of freedom (75,76 2 -76), yielded a negligibly small effect size ( p 2 Ͻ 0.04), indicating that the correspondence between the ICA coactivation indices and the ROIderived time course r values was only very weakly driven by specific combinations of regions.

Discussion
Our results reveal a correspondence between the time course correlation r values and the ICA coactivation indices, which is significant (P Ͻ .001) and moderate in intensity (r ϭ 0.44) according to Cohen criteria. This findings is novel in that while a number of previous investigations have highlighted a convergence in the information provided by the 2 methods, [19][20][21][22][23][24] they have done so mainly on the basis of a qualitative judgment. To our knowledge, this study is the first to explore the issue in a quantitative manner on a whole dataset.
The presence of a significant correspondence reassuringly confirms that, despite differences, the 2 methods successfully represent the underlying connectivity. This is expected, considering that they are based on the same information content (ie, on the presence of coherent time course fluctuations over spatially distinct regions). The correspondence was, however, not strong, as also evident on the comparison of the matrices shown in Fig 4. The main similarity was the presence of relatively large values in the proximity of the diagonal, where cells generally correspond to a region correlating with its neighbors or their contralateral homologues (see the On-line Table). The main difference was the observation of high r values in areas distant from the diagonal, for which the ICA coactivation index was, by contrast, relatively low. Even though a graph-theory-based analysis was not conducted, our results appear to imply that the 2 techniques have a higher correspondence in the extraction of small-world rather than long-distance connections; this qualitative observation needs to be substantiated by further work. 32 Incomplete correspondence is not unexpected, given the conceptual differences between the 2 methods (ie, the use of data-reduction algorithms direct extraction of temporal series from each region). Furthermore, discrepancies in the results provided by the 2 methods may also reflect different sensitivities to physiologically determined systematic fluctuations (eg, Beckmann et al 13 ). Finally, the relative merits and differences of the 2 techniques are, in principle, independent of whether the observed BOLD signal-intensity fluctuations are due to resting-state activity or related to the performance of an active task.
The intensity of the correspondence was modulated by the number of components used for ICA decomposition: The correlation coefficient was 0.44 for 20 components and smaller (range, 0.35-0.41) for all other component numbers (10, 15, 25, 30, 35, and 40). In fact, the number indicated by MDL estimation is 20 components, the value most commonly assumed as optimal throughout the literature. 13,20,21,25 This effect suggests that the representation of the statistical properties of the dataset by ICA may be dependent on the number of components chosen, a finding in line with previous work that reported that the spatial and temporal discriminative ability of ICA is critically dependent on this parameter. 22,33 Indeed, 1 reason for partial correspondence between ICA and time course correlation analysis is that ICA can produce "fragmented" networks, whereby given networks of coherent activity, which would appear together in a single seed-based map, are scattered across multiple components. It has been shown that this effect is critically dependent on the choice of the number of components: As this is increased, the decomposition becomes less stable and some networks (such as the visual components) branch into clearly distinct subcomponents, whereas others apparently do not (such as the sensorimotor network). 34 This effect may partly account for the decreased correspondence observed when decomposition was performed with a large number of components.
We chose to define the main formula used for the ICA coactivation index through the product of the average t-scores from the 2 ROIs. We extended our main findings by exploring the consequences of this choice and have inserted an explicit power parameter, k, and swept its value to determine the effect on the observed correspondence. The intensity of the correspondence was highest for k ϭ 1 and decreased on either side of this value. This finding confirmed that the choice of the ICA coactivation index formula (ie, considering the product of the 2 average t values rather than, for example, the square root of the product) was appropriate to represent the structure of the data. Additionally, it indicated that the observed correspondence was not especially driven by a few intense coactivations (as would be the case if it had increased with k Ͼ 1) or by many weaker coactivations (k Ͻ 1), but it was, instead, representative of a characteristic of the dataset as a whole.
Furthermore, the observed correlations were not strongly driven by specific combinations of ROIs; rather, they were a primarily distributed feature of the data as confirmed by the ANOVA on rank orders.
ROIϪbased analyses indicate that in the absence of temporal filtering, most observed regional correlations are positive. Removing the average time course and performing low-pass filtering resulted in a considerably increased number of negative correlations, a finding already reported in the literature (eg, Van Dijk et al 24 ); however, the correspondence with ICA did not change, as seen in Figs 3 and 4.
The present study has a number of limitations. First, the correspondence was assessed only during resting state. However, a recent study 25 has shown that the components identified during resting state and an auditory task substantially overlapped. Second, we used a relatively long TR, leading to aliasing problems. While this limitation, in common with most studies in this area (eg, Damoiseaux et al 14 36 ). Third, due to the previous limitation and to the fact that no physiologic monitoring was performed, we were unable to identify a robust objective criterion applicable to both ICA and ROIϪ based analysis to reject the signal-intensity fluctuations due to cardiac and respiratory activity. Fourth, the intersubject variability in the ROIϪbased analysis was not characterized. However, numerous studies have shown that functional connectivity data are strongly reliable across sessions and individuals. 14,21,24 Future work should evaluate how variable the correlation between the 2 techniques is at the level of individual subjects, comparing ICA decompositions and time course correlations performed in a completely separate way for each subject.

Conclusions
We have quantified the correspondence between the connectivity information provided by ICA and ROI-based analysis. We have found a significant correspondence of moderate intensity, which was modulated by the number of components used for ICA decomposition and was most intense for 20 ICs (r ϭ 0.44). It was strongest when the product of the ICA map t-scores was considered (ie, k ϭ 1), and the correspondence was not driven by specific combination of regions. The 2 techniques, however, do not provide completely overlapping information, and our data alone are not sufficient to elaborate guidelines regarding which one to adopt in a given study. A plausible theoretic criterion would be to adopt a regional time course correlation analysis whenever clear anatomic a priori hypotheses are available, and ICA otherwise; however, in practice the 2 techniques are frequently used jointly. A paradigmatic example is the application of resting-state studies to presurgical mapping of the motor areas. A region of interest is typically used as an initial seed to highlight the sensorimotor component of spontaneous activity, but this is always supplemented by ICA analysis, to remove anatomic assumptions that may be misleading in the presence of gross lesions and functional reorganization. 37 Another example is in the study of AD: Given that a central involvement of the hippocampus is expected, a ROI approach is well-motivated. 10 However, even for this application, ICA is frequently used because it reduces the risk of missing relevant activity also in other areas due to a priori anatomic assumptions. 11 If possible, the 2 approaches should be applied jointly, to obtain an independent confirmation of the findings and to support further work aimed at determining the suitability of the 2 approaches for given applications.