Image intra-class correlation coefficient (I2C2)
Replication is the cornerstone of science. Its absence reduces any scientific endeavor to a set of unverified beliefs. Brain imaging studies are no exception, though they have several specific characteristics that conspire to make quantification of reliability especially difficult. First, measurements are complex and idiosyncratic for each modality. Second, the definition of the actual target to be measured is often imperfect. Third, the data sets are very large and not amenable to standard investigations of replication. Fourth, there is relatively little cross-pollination of research between different imaging modalities. Finally, setting up replication experiments can be difficult under many scenarios.
Here we describe a simple measure of reliability for imaging replication studies that builds on standard measurement error concepts developed for scalar measurements. Let $W_{ij}(v)$ denote the observed image for subject $i=1,\ldots,I$ at visit $j=1,\ldots,J_i$ and voxel $v=1,\ldots,V$. We assume that the images are "registered", that is, the meaning of $v$ is the same across all images. A generalization of the classical measurement error model [2] can then be written as
$$W_{ij}(v) = X_i(v) + U_{ij}(v),$$
where $X_i(v)$ is the unknown true image of subject $i$, $W_{ij}(v)$ are the proxy measurements of $X_i(v)$ at voxel $v$, and $U_{ij}(v)$ is the subject- and visit-specific image measurement error. In many applications it makes sense to assume that the true images, $X_i(v)$, and the measurement error processes, $U_{ij}(v)$, are mutually independent. From now on we will refer to this as the classical image replication model. This model was first introduced by Di et al., 2009 [4] in the context of replicated functional data and was extended to ultra-high dimensional problems by Zipunnikov et al., 2011 [6]. Due to the independence assumption, the covariance operator of the observed data can be written as $K_W = K_X + K_U$, where $K_W$, $K_X$, and $K_U$ are the covariance operators of the $W$, $X$, and $U$ processes, respectively. Thus the Image Intraclass Correlation Coefficient (I2C2) [5] is defined as
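$$\rho = \frac{\mathrm{tr}(K_X)}{\mathrm{tr}(K_W)} = \frac{\mathrm{tr}(K_W) - \mathrm{tr}(K_U)}{\mathrm{tr}(K_W)} = 1 - \frac{\mathrm{tr}(K_U)}{\mathrm{tr}(K_W)},$$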
where $\mathrm{tr}(\cdot)$ denotes the trace of an operator. When measurement error is zero, $\mathrm{tr}(K_U)=0$, the I2C2 indicates perfect reliability (replication), that is, $\rho=1$. When the signal is nonexistent, $\mathrm{tr}(K_X)=0$ or, equivalently, $\mathrm{tr}(K_U)=\mathrm{tr}(K_W)$, the I2C2 indicates perfectly unrelated replicated images, that is, $\rho=0$. It can be shown [2] that the method of moments (MoM) estimators of the I2C2 components are (estimator hats not included)
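$$\mathrm{tr}(K_W) = \frac{\sum_{i=1}^{I}\sum_{j=1}^{J_i}\sum_{v=1}^{V}\{W_{ij}(v) - W_{\cdot\cdot}(v)\}^2}{\sum_{i=1}^{I} J_i - 1}$$
and
$$\mathrm{tr}(K_U) = \frac{\sum_{i=1}^{I}\sum_{j=1}^{J_i}\sum_{v=1}^{V}\{W_{ij}(v) - W_{i\cdot}(v)\}^2}{\sum_{i=1}^{I}(J_i - 1)},$$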
where $W_{\cdot\cdot}(v) = \sum_{i,j} W_{ij}(v) / \sum_{i} J_i$ and $W_{i\cdot}(v) = \sum_{j} W_{ij}(v) / J_i$. The I2C2 estimator is obtained by plugging these MoM estimators into the I2C2 parameter definition.
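The MoM estimators above translate almost directly into array operations. The following is a minimal sketch in Python with NumPy, not the R code referred to elsewhere on this page. It assumes that each registered image has been vectorized into one row of a matrix; the function name `i2c2`, the argument names, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def i2c2(W, subject):
    """Method-of-moments I2C2 (illustrative sketch).

    W       : (n_images, V) array, one vectorized registered image per row
    subject : length n_images array of subject labels (assumed data layout)
    """
    W = np.asarray(W, dtype=float)
    _, idx = np.unique(np.asarray(subject), return_inverse=True)
    n, n_subjects = W.shape[0], idx.max() + 1   # n = sum_i J_i, n_subjects = I

    # Subject means W_i.(v): accumulate each subject's images, divide by J_i
    sums = np.zeros((n_subjects, W.shape[1]))
    np.add.at(sums, idx, W)
    subject_mean = sums / np.bincount(idx)[:, None]

    # Grand mean W..(v) = sum_ij W_ij(v) / sum_i J_i
    grand_mean = W.mean(axis=0)

    # tr(K_W) uses the denominator sum_i J_i - 1
    tr_kw = np.sum((W - grand_mean) ** 2) / (n - 1)
    # tr(K_U) uses the denominator sum_i (J_i - 1) = n - I
    tr_ku = np.sum((W - subject_mean[idx]) ** 2) / (n - n_subjects)
    return 1.0 - tr_ku / tr_kw


# Simulated check (assumed setup): independent voxels with var(X) = 1 and
# var(U) = 0.25, so the true value is rho = 1 / 1.25 = 0.8.
rng = np.random.default_rng(0)
I, J, V = 200, 2, 100
X = rng.normal(size=(I, V))
subject = np.repeat(np.arange(I), J)
W = X[subject] + 0.5 * rng.normal(size=(I * J, V))
rho_hat = i2c2(W, subject)
print(round(rho_hat, 3))
```

The sketch requires at least one subject with more than one visit, otherwise the denominator of tr(KU) is zero. Because the two traces are estimated separately, the plug-in estimate can fall below zero in small samples.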
Variability estimation
The variability of the I2C2 estimator can be obtained using a nonparametric bootstrap of subjects [5]. This respects the within-subject correlation structure: whenever a subject is included in the bootstrap sample, all replicated images of that subject are included with it.
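A bootstrap of subjects could be sketched as follows in Python with NumPy. This is an illustration under stated assumptions, not the authors' implementation: the helper names `i2c2` and `bootstrap_i2c2`, the row-per-image data layout, and the percentile interval at the end are hypothetical choices, and the data are simulated.

```python
import numpy as np


def i2c2(W, subject):
    """Method-of-moments I2C2 for images W (n_images x V) and subject labels."""
    W = np.asarray(W, dtype=float)
    _, idx = np.unique(np.asarray(subject), return_inverse=True)
    n, n_subjects = W.shape[0], idx.max() + 1
    sums = np.zeros((n_subjects, W.shape[1]))
    np.add.at(sums, idx, W)                      # per-subject sums over visits
    subject_mean = sums / np.bincount(idx)[:, None]
    tr_kw = np.sum((W - W.mean(axis=0)) ** 2) / (n - 1)
    tr_ku = np.sum((W - subject_mean[idx]) ** 2) / (n - n_subjects)
    return 1.0 - tr_ku / tr_kw


def bootstrap_i2c2(W, subject, n_boot=200, seed=0):
    """Resample subjects with replacement, keeping all visits of each draw."""
    rng = np.random.default_rng(seed)
    W = np.asarray(W, dtype=float)
    subject = np.asarray(subject)
    rows_by_subject = [np.flatnonzero(subject == s) for s in np.unique(subject)]
    n_subjects = len(rows_by_subject)
    out = np.empty(n_boot)
    for b in range(n_boot):
        draw = rng.integers(0, n_subjects, size=n_subjects)
        rows = np.concatenate([rows_by_subject[k] for k in draw])
        # Relabel by draw position so a subject drawn twice is treated
        # as two distinct subjects in the bootstrap sample.
        labels = np.concatenate(
            [np.full(len(rows_by_subject[k]), pos) for pos, k in enumerate(draw)]
        )
        out[b] = i2c2(W[rows], labels)
    return out


# Simulated data (assumed setup): 30 subjects, 2 visits, 50 voxels
rng = np.random.default_rng(1)
subject = np.repeat(np.arange(30), 2)
W = rng.normal(size=(30, 50))[subject] + 0.5 * rng.normal(size=(60, 50))
boot = bootstrap_i2c2(W, subject, n_boot=200, seed=2)
lower, upper = np.percentile(boot, [2.5, 97.5])  # percentile interval
print(lower, upper)
```

The percentile interval is only one common way to summarize the bootstrap distribution; the bootstrap standard deviation `boot.std(ddof=1)` is another.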
Testing for zero reliability
In some cases one might be interested in testing whether the reliability of the replication experiment is actually zero. The null distribution (under the assumption of zero reliability) can be obtained by permuting images. All images are randomly reassigned to the subject and visit labels, which ignores the within-subject replication mechanism, and the I2C2 is calculated for each such random permutation. The observed I2C2 is then compared with the resulting permutation distribution.
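The permutation test can be sketched as follows in Python with NumPy. The helper names `i2c2` and `permutation_test`, the data layout, and the simulated data are hypothetical. The p-value formula with one added to the numerator and denominator is a common convention for Monte Carlo permutation tests and is an assumption here, not something specified above.

```python
import numpy as np


def i2c2(W, subject):
    """Method-of-moments I2C2 for images W (n_images x V) and subject labels."""
    W = np.asarray(W, dtype=float)
    _, idx = np.unique(np.asarray(subject), return_inverse=True)
    n, n_subjects = W.shape[0], idx.max() + 1
    sums = np.zeros((n_subjects, W.shape[1]))
    np.add.at(sums, idx, W)                      # per-subject sums over visits
    subject_mean = sums / np.bincount(idx)[:, None]
    tr_kw = np.sum((W - W.mean(axis=0)) ** 2) / (n - 1)
    tr_ku = np.sum((W - subject_mean[idx]) ** 2) / (n - n_subjects)
    return 1.0 - tr_ku / tr_kw


def permutation_test(W, subject, n_perm=200, seed=0):
    """Permute images across the subject/visit labels to get a null for I2C2."""
    rng = np.random.default_rng(seed)
    W = np.asarray(W, dtype=float)
    observed = i2c2(W, subject)
    null = np.empty(n_perm)
    for b in range(n_perm):
        # Shuffling the rows reassigns every image to a random subject and visit
        null[b] = i2c2(W[rng.permutation(W.shape[0])], subject)
    # One-sided Monte Carlo p-value (assumed convention with +1 correction)
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return observed, null, p_value


# Simulated data (assumed setup) with a strong subject-specific signal
rng = np.random.default_rng(3)
subject = np.repeat(np.arange(10), 2)
W = rng.normal(size=(10, 50))[subject] + 0.1 * rng.normal(size=(20, 50))
observed, null, p_value = permutation_test(W, subject, n_perm=200, seed=4)
print(observed, p_value)
```

Permuting the rows of `W` while holding the labels fixed is equivalent to permuting the labels while holding the images fixed.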
Region of interest (ROI)
The approach can be applied to any particular region of interest (ROI) from the voxel to the image level. A multiresolution approach may reveal patterns of measurement error that may not be obvious with a fixed resolution.
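One way to picture the ROI-level calculation is to restrict the voxel sum to each region in turn. The sketch below, in Python with NumPy, assumes a hypothetical integer vector `roi` that assigns every voxel to a region; the names `i2c2` and `i2c2_by_roi` and the toy data are illustrative only.

```python
import numpy as np


def i2c2(W, subject):
    """Method-of-moments I2C2 for images W (n_images x V) and subject labels."""
    W = np.asarray(W, dtype=float)
    _, idx = np.unique(np.asarray(subject), return_inverse=True)
    n, n_subjects = W.shape[0], idx.max() + 1
    sums = np.zeros((n_subjects, W.shape[1]))
    np.add.at(sums, idx, W)                      # per-subject sums over visits
    subject_mean = sums / np.bincount(idx)[:, None]
    tr_kw = np.sum((W - W.mean(axis=0)) ** 2) / (n - 1)
    tr_ku = np.sum((W - subject_mean[idx]) ** 2) / (n - n_subjects)
    return 1.0 - tr_ku / tr_kw


def i2c2_by_roi(W, subject, roi):
    """I2C2 computed separately on the voxels of each ROI label."""
    W = np.asarray(W, dtype=float)
    roi = np.asarray(roi)
    return {int(r): i2c2(W[:, roi == r], subject) for r in np.unique(roi)}


# Toy data: 2 subjects x 2 visits, 3 voxels. Voxels 0 and 1 (ROI 0) replicate
# exactly within subject; voxel 2 (ROI 1) varies only between visits.
W = np.array([[0, 0, 0],
              [0, 0, 2],
              [2, 2, 0],
              [2, 2, 2]])
subject = np.array([0, 0, 1, 1])
roi = np.array([0, 0, 1])

whole_image = i2c2(W, subject)
per_roi = i2c2_by_roi(W, subject, roi)
print(whole_image, per_roi)
```

In this toy example the whole-image value is about 0.5, while ROI 0 gives 1 and ROI 1 gives about -0.5, so the image-level summary hides where the measurement error is concentrated. The negative value is a small-sample artifact of the MoM plug-in estimator.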
Software and examples
The I2C2 is very easy to program and scales to very high-dimensional matrices. The bootstrap requires some computing time, though it remains feasible on modest computational resources (standard laptops). For RAVENS [3] images of the Kirby 21 [1] replication study, R code can be obtained here.
Bibliography
1. Bennett CM, Miller MB. How reliable are the results from functional magnetic resonance imaging? The Year in Cognitive Neuroscience 2010, 1191:133–155, 2010.
2. Carroll RJ, Ruppert D, Stefanski LA, Crainiceanu CM. Measurement Error in Nonlinear Models: A Modern Perspective. Chapman & Hall/CRC, 2006.
3. Davatzikos C, Genc A, Xu D, Resnick SM. Voxel-based morphometry using the RAVENS maps: methods and validation using simulated longitudinal atrophy. NeuroImage, 14(6):1361–1369, 2001.
4. Di C, Crainiceanu CM, Caffo BS, Punjabi NM. Multilevel functional principal component analysis. The Annals of Applied Statistics, 3(1):458–488, 2009.
5. Shou H, Eloyan A, Lee S, Caffo B, Lindquist M, Crainiceanu CM. The image intra-class correlation coefficient (I2C2) for replication studies. Under review.
6. Zipunnikov V, Caffo BS, Davatzikos C, Schwartz B, Crainiceanu CM. Multilevel functional principal component analysis for high dimensional data. Journal of Computational and Graphical Statistics, 20(4):852–873, 2011.