A comparison of different automated methods for the detection of white matter lesions in MRI data
Research highlights
► Comparison of automated methods for the identification of white matter hyperintensities.
► Support vector machines show best performance with only FLAIR and T1 available.
► Simple thresholding produces high numbers of false positives.
► Replacing non-support vectors during training further improves results.
Introduction
White matter hyperintensities (WMH) detected on magnetic resonance imaging (MRI) have been linked to small vessel disease, degeneration of myelin and enlarged Virchow–Robin spaces, as well as to reduced blood supply at the endzone of choroidal arteries and loss of ependymal cells (Schmahmann et al., 2008, Wahlund et al., 2001). They are found primarily in the periventricular white matter (WM) and increase with age, cognitive impairment and depression (Schmahmann et al., 2008). A number of studies have recently focused on the connection between the amount and location of WMH and these clinical symptoms, and several employed visual scales to rate WMH (Kapeller et al., 2002, Prins et al., 2004, Scheltens et al., 1993, Wahlund et al., 2001). These scales estimate lesion severity rather than explicitly detecting every single lesion and therefore entail a loss of information.
A number of approaches towards the automatic detection of WMH have been proposed. Morphologically, most lesions caused by large vessel disease (e.g., stroke) or multiple sclerosis show much sharper boundaries than those caused by small vessel disease. In addition, older stroke lesions are typically hypointense in T1 weighted images (T1w) while those caused by small vessel alterations are not (Schmahmann et al., 2008). Detection methods are optimized for the simultaneous analysis of multiple images including Fluid Attenuated Inversion Recovery (FLAIR), T2 weighted (T2w) and proton density (PD) images that are typically acquired alongside the standard T1w images. FLAIR images have characteristics similar to those of T2w images but with the cerebrospinal fluid (CSF) signal suppressed. WMH are brighter than intact tissue and CSF, which makes the FLAIR image suitable for simple thresholding or histogram based segmentation methods (Hirono et al., 2000, Jack et al., 2001). While these methods require the identification of a threshold or cut-point, Admiraal-Behloul and colleagues reported an excellent detection rate from the combination of PD, FLAIR and T2w imaging with a clustering algorithm (Admiraal-Behloul et al., 2005). Each of the images was segmented separately using fuzzy C-means clustering, which in the case of the FLAIR image categorized voxels as bright, medium-bright or dark. The final identification of lesioned voxels was made using decision rules derived from human experts. Although other approaches exist (Maillard et al., 2008, Wu et al., 2006), supervised learning methods are increasingly being used (Lao et al., 2008, Quddus et al., 2005). These methods are supervised in the sense that they learn from training data to separate intact WM voxels from those in a WMH. The training data usually consist of scans with manually outlined lesions. Each individual voxel is classified as either intact or affected based on various characteristics, often referred to as features.
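The simple thresholding idea mentioned above can be illustrated with a minimal sketch. It is not the procedure of any of the cited studies: the cut-off rule (mean plus a multiple of the standard deviation of FLAIR intensities inside a WM mask), the function name `threshold_flair`, the parameter `n_sd` and the toy intensities are all assumptions made for illustration.

```python
from statistics import mean, stdev

def threshold_flair(flair, wm_mask, n_sd=1.5):
    """Flag WM voxels whose FLAIR intensity exceeds mean + n_sd * SD.

    The mean and SD are computed over voxels inside the WM mask only.
    This cut-off rule is an assumed example, not taken from the source.
    """
    wm_values = [v for v, inside in zip(flair, wm_mask) if inside]
    cutoff = mean(wm_values) + n_sd * stdev(wm_values)
    labels = [inside and v > cutoff for v, inside in zip(flair, wm_mask)]
    return labels, cutoff

# Toy data: a flattened FLAIR image with made-up intensities.
# Two bright voxels (160, 155) mimic a lesion; the dark voxel (30)
# lies outside the WM mask and is therefore ignored.
flair = [100, 102, 98, 101, 99, 103, 97, 100, 160, 155, 30, 100]
wm_mask = [True] * 10 + [False, True]

labels, cutoff = threshold_flair(flair, wm_mask, n_sd=1.5)
flagged = [i for i, is_lesion in enumerate(labels) if is_lesion]
print(f"cut-off = {cutoff:.1f}, flagged voxel indices = {flagged}")
```

In this sketch the result depends directly on `n_sd`: a lower value flags more voxels, a higher value misses the less bright of the two lesion voxels. This is the cut-point problem referred to in the text.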
The feature vector (FV) of each sample usually includes the intensity of the index voxel and its neighborhood in all available modalities but may include other information such as the stereotactic coordinates (Anbeek et al., 2004a). Those coordinates may carry relevant information as WMH are not uniformly distributed throughout the brain (Schmahmann et al., 2008). K-nearest neighbor (KNN) classification is one such supervised learning technique and employs a voting strategy. It effectively assigns the index voxel to the group that is most common among the k most similar examples of the training set. Anbeek et al. (2004a, 2004b) demonstrated encouraging results with this approach on subjects with different vascular disorders based on T1w, T2w, PD, inversion recovery and FLAIR data. More recent studies have used support-vector machines (SVM) (Lao et al., 2008, Quddus et al., 2005), an approach that generates a decision boundary from the training data to separate voxel types (Müller et al., 2001, Vapnik, 1998). In addition to other applications at the voxel level (Schnell et al., 2009), this approach has been used to separate diagnostic groups (Fu et al., 2008, Klöppel et al., 2008) or to predict the future course of a disease (Davatzikos et al., 2009, Koutsouleris et al., 2009, Vemuri et al., 2009). Compared to KNN, the SVM should generalize better to new data not used in the training of the algorithm. This is because the SVM learns the relevance of each feature in the FV, while the KNN gives every feature an identical weight. In other words, the SVM is less influenced by features that contain noise rather than signal relevant for the classification. Quddus et al. (2005) combined SVMs with a boosting technique to optimize the detection of white matter lesions (WML) in PD images. Lao et al. (2008) combined T1w, T2w, PD and FLAIR imaging from several centers and classified according to the neighborhood of the index voxel in all modalities.
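The construction of an FV and the KNN voting strategy described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation used in any of the cited studies: it uses a single modality, a 2D slice and a 4-voxel in-plane neighborhood, and the names `feature_vector`, `knn_classify` and `OFFSETS` as well as all intensities are hypothetical.

```python
import math
from collections import Counter

# In-plane offsets: the index voxel first, then its 4 direct neighbours.
# A real FV would cover all modalities and possibly coordinates (assumption).
OFFSETS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]

def feature_vector(image, row, col):
    """Intensities of the index voxel and its neighbourhood."""
    return [image[row + dr][col + dc] for dr, dc in OFFSETS]

def knn_classify(query, train_vectors, train_labels, k=3):
    """Majority vote among the k training vectors nearest to the query.

    Euclidean distance gives every feature the same weight.
    """
    nearest = sorted(range(len(train_vectors)),
                     key=lambda i: math.dist(query, train_vectors[i]))[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy training slice with a manually outlined lesion (1 = lesion voxel).
train_image = [[100, 100, 100, 100, 100],
               [100, 180, 190, 100, 100],
               [100, 185, 200, 100, 100],
               [100, 100, 100, 100, 100],
               [100, 100, 100, 100, 100]]
train_mask = [[0, 0, 0, 0, 0],
              [0, 1, 1, 0, 0],
              [0, 1, 1, 0, 0],
              [0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0]]

# Only interior voxels have a complete neighbourhood.
interior = [(r, c) for r in range(1, 4) for c in range(1, 4)]
train_vectors = [feature_vector(train_image, r, c) for r, c in interior]
train_labels = [train_mask[r][c] for r, c in interior]

# Unseen slice with a lesion at a different location.
test_image = [[100, 100, 100, 100, 100],
              [100, 100, 100, 100, 100],
              [100, 100, 175, 195, 100],
              [100, 100, 180, 190, 100],
              [100, 100, 100, 100, 100]]

core = knn_classify(feature_vector(test_image, 2, 2), train_vectors, train_labels)
background = knn_classify(feature_vector(test_image, 1, 1), train_vectors, train_labels)
print("lesion voxel ->", core, "| background voxel ->", background)
```

Because the Euclidean distance treats all five features alike, a noisy feature would shift the distances as much as an informative one. This is the equal-weighting property of KNN that the text contrasts with the SVM.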
Although all groups reported good performance of the respective approach, they did not compare it against alternative strategies and it is likely that the best method will depend on lesion type and imaging modality. We aimed to compare supervised and unsupervised approaches frequently used in previous studies on a dataset that is limited to FLAIR and T1w data. Such a limitation can be useful for studies where an extensive testing battery or the severity of the disease puts substantial constraints on the duration of an imaging session.
Materials and methods
Twenty patients with either mild cognitive impairment (MCI) or mild dementia were recruited into this study (see Table 1). The diagnoses were made on the basis of clinical and neuropsychological data. Subjects with MCI complained about cognitive deficits and showed a decline of cognitive abilities (> 1 SD in age-corrected standardized tests). Patients with mild dementia showed cognitive decline (> 1 SD) from a previous level in at least 2 domains affecting activities of daily living (Kornhuber et
Results
As shown in Fig. 3, the SVM approach gave the best performance of all automated approaches. Differences between the different FVs were subtle. A completely different picture emerged for KNN classifiers, where the full length FV outperformed all others. An FV with the neighborhood of the FLAIR image rated second best with the KNN approach (see supplementary Figs. 4 and 5). Default cut-off values for the SVM and KNN grouping (symbols on the graphs in Fig. 3) resulted in a good recall
Discussion
The aim of this study was to compare different automated methods for detecting WMH from T1w and FLAIR images. All methods presented here depend on a large number of internal parameters. This starts at the preprocessing stage, where several algorithms for segmentation and bias correction are available, continues with the scaling and feature extraction step and ends with the value of k in the KNN approach and the kernel type of the SVM. It is not feasible to search the
Acknowledgment
We would like to thank Christian Gaser for helpful comments on the design of this study and the image processing pipeline. This study was supported by the following grants: Federal Ministry of Education and Research, 01GW0661 to MH, German Research Council MA 2343/4-1 to IM and German Federal and State Governments (EXC 294) to OR.
References (30)
Fully automatic segmentation of white matter hyperintensities in MR images of the elderly. Neuroimage (2005)
Probabilistic segmentation of white matter lesions in MR imaging. Neuroimage (2004)
Automatic segmentation of different-sized white matter lesions by voxel probability estimation. Med. Image Anal. (2004)
Unified segmentation. Neuroimage (2005)
Pattern classification of sad facial processing: toward the development of neurobiological markers in depression. Biol. Psychiatry (2008)
Computer-assisted segmentation of white matter lesions in 3D MR images using support vector machine. Acad. Radiol. (2008)
A semiquantative rating scale for the assessment of signal hyperintensities on magnetic resonance imaging. J. Neurol. Sci. (1993)
Fully automated classification of HARDI in vivo data using a support vector machine. Neuroimage (2009)
Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. Neuroimage (2002)
A fully automated method for quantifying and localizing white matter hyperintensities on MR images. Psychiatry Res. (2006)
Pattern Recognition and Machine Learning
LIBSVM: a library for support vector machines
Longitudinal progression of Alzheimer's-like patterns of atrophy in normal older adults: the SPARE-AD index. Brain
Theory of communication. J. Inst. Electr. Eng.
Impact of white matter changes on clinical manifestation of Alzheimer's disease: a quantitative study. Stroke