American Journal of Neuroradiology
Letter

The “Peeking” Effect in Supervised Feature Selection on Diffusion Tensor Imaging Data

S. Diciotti, S. Ciulli, M. Mascalchi, M. Giannelli and N. Toschi
American Journal of Neuroradiology September 2013, 34 (9) E107; DOI: https://doi.org/10.3174/ajnr.A3685
S. Diciotti, S. Ciulli, M. Mascalchi
Department of Clinical and Experimental Biomedical Sciences, University of Florence, Florence, Italy

M. Giannelli
Unit of Medical Physics, Pisa University Hospital, Azienda Ospedaliero-Universitaria Pisana, Pisa, Italy

N. Toschi
Medical Physics Section, Department of Biomedicine and Prevention, Faculty of Medicine, University of Rome Tor Vergata, Rome, Italy

We read with great interest the article by Haller et al¹ in the February 2013 issue of the American Journal of Neuroradiology. The authors used whole-brain diffusion tensor imaging–derived fractional anisotropy (FA) data, skeletonized through use of the standard tract-based spatial statistics (TBSS) pipeline, to achieve the following: 1) report significant group differences in FA among mild cognitive impairment (MCI) subtypes, and 2) perform individual classification of MCI subtypes by using a supervised feature selection procedure combined with a support vector machine (SVM) classifier. The study reports extremely high classification performances (100% sensitivity and 94%–100% specificity), which the authors describe as perhaps “too optimistic” and partially ascribe to “some degree of overfitting,” possibly also due to the use of feature selection.

The above-mentioned study presents a questionable use of supervised feature selection, which was performed on the entire dataset (ie, on both training and test data) instead of only on the training set of each partition generated during the cross-validation procedure. It is well-known that using test set labels to perform inference on a feature subset during the learning process can cause an overestimation of the generalization capabilities of the classifier (sometimes called the “peeking” effect) and that this effect is particularly severe when a large number of features are removed (as in this whole-brain DTI study, in which approximately 150,000 features were reduced to 1000).²,³ In other words, testing the classifier on the same instances (ie, data “points”) that were used for feature selection corresponds to providing it with “hints” about the solution of the classification problem, and Haller et al¹ recognized this circumstance as a “limitation” of their study. However, this methodologic mistake³ (which unfortunately appears in several recent studies in the MR imaging literature) does not constitute a mere theoretic concern but rather can have important consequences on the final results.³

To better clarify and exemplify our point, we have analyzed DTI data in a patient cohort presented in a previous MCI-Alzheimer disease (AD) classification study.⁴ Specifically, we attempted to discriminate between 30 patients with amnesic MCI and 21 with mild AD by using the same processing pipeline (a Relief-F feature selection of the top 1000 features followed by an SVM classifier and 10 repetitions of a 10-fold cross-validation) and the same type of data (skeletonized whole-brain FA data) used by Haller et al.¹ We repeated the analysis by using either incorrect cross-validation (ie, feature selection on the entire dataset followed by classification in cross-validation, as carried out by Haller et al¹) or correct cross-validation (feature selection within each training set of the cross-validation).
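The difference between the two cross-validation schemes can be illustrated with a minimal sketch on synthetic data. It is not the pipeline used in this letter or by Haller et al: as stated assumptions, it replaces Relief-F with a simple difference-of-class-means filter, replaces the SVM with a nearest-centroid classifier, and uses pure Gaussian noise (2000 features, 20 subjects per class) so that the true achievable accuracy is chance level. All function names and parameter values are hypothetical and chosen only for illustration.

```python
import random


def make_noise_data(n_per_class=20, n_features=2000, seed=0):
    """Pure-noise data: the labels carry no information about the features."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
         for _ in range(2 * n_per_class)]
    y = [0] * n_per_class + [1] * n_per_class
    return X, y


def select_features(X, y, rows, k):
    """Supervised filter (stand-in for Relief-F): keep the k features with the
    largest absolute difference of class means, computed on `rows` only."""
    rows0 = [X[i] for i in rows if y[i] == 0]
    rows1 = [X[i] for i in rows if y[i] == 1]
    scores = []
    for j in range(len(X[0])):
        m0 = sum(r[j] for r in rows0) / len(rows0)
        m1 = sum(r[j] for r in rows1) / len(rows1)
        scores.append(abs(m0 - m1))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def centroid(rows, feats):
    """Mean vector of `rows`, restricted to the selected features."""
    return [sum(r[j] for r in rows) / len(rows) for j in feats]


def predict(x, feats, c0, c1):
    """Nearest-centroid rule (stand-in for the SVM)."""
    d0 = sum((x[j] - a) ** 2 for j, a in zip(feats, c0))
    d1 = sum((x[j] - b) ** 2 for j, b in zip(feats, c1))
    return 0 if d0 <= d1 else 1


def cross_validate(X, y, k_features, n_folds=10, peek=False, seed=1):
    """k-fold accuracy. peek=True selects features once on ALL subjects
    (incorrect); peek=False selects them inside each training set (correct)."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    all_feats = select_features(X, y, idx, k_features) if peek else None
    correct = 0
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        feats = all_feats if peek else select_features(X, y, train, k_features)
        c0 = centroid([X[i] for i in train if y[i] == 0], feats)
        c1 = centroid([X[i] for i in train if y[i] == 1], feats)
        correct += sum(predict(X[i], feats, c0, c1) == y[i] for i in test)
    return correct / len(y)


X, y = make_noise_data()
peek_acc = cross_validate(X, y, k_features=20, peek=True)
nested_acc = cross_validate(X, y, k_features=20, peek=False)
print(f"feature selection on all data (peeking): {peek_acc:.2f}")
print(f"feature selection inside each fold:      {nested_acc:.2f}")
```

Because the data contain no class signal, any accuracy well above 0.5 in the first line of output is attributable to the held-out subjects having influenced the feature ranking, which is the “peeking” effect in miniature.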

In the former analysis, patients with mild AD were classified with 80.0% sensitivity and 96.7% specificity, while in the latter analysis, results dropped to 45.3% sensitivity and 67.3% specificity. These data demonstrate the remarkable amount of possible overestimation of the generalization capabilities due to the “peeking” effect in a cross-validation study that uses whole-brain TBSS data, and we speculate that the sensitivity/specificity values reported by Haller et al¹ would be substantially lowered if an orthodox feature-selection procedure were applied to their data.
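To put these figures on a single scale, the overall accuracy implied by each sensitivity/specificity pair can be worked out from the class sizes. The short calculation below is a derived illustration, not a result reported in this letter. It assumes that mild AD (21 patients) is the positive class and amnesic MCI (30 patients) the negative class, and it uses the rounded percentages quoted above, so the outputs are approximate.

```python
def overall_accuracy(sensitivity, specificity, n_pos, n_neg):
    """Accuracy implied by sensitivity, specificity, and the two class sizes."""
    return (sensitivity * n_pos + specificity * n_neg) / (n_pos + n_neg)


# Cohort sizes from the letter; AD is taken as the positive class (assumption).
N_AD, N_MCI = 21, 30

acc_peeking = overall_accuracy(0.800, 0.967, N_AD, N_MCI)  # incorrect cross-validation
acc_correct = overall_accuracy(0.453, 0.673, N_AD, N_MCI)  # correct cross-validation

# Accuracy of always predicting the larger class, for comparison.
majority_baseline = max(N_AD, N_MCI) / (N_AD + N_MCI)

print(f"implied accuracy, incorrect CV: {acc_peeking:.3f}")
print(f"implied accuracy, correct CV:   {acc_correct:.3f}")
print(f"majority-class baseline:        {majority_baseline:.3f}")
```

Under these assumptions, the implied accuracy falls from roughly 0.90 with incorrect cross-validation to roughly 0.58 with correct cross-validation, which is about the level obtained by always predicting the larger class (30/51, or roughly 0.59).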

In conclusion, given the relevance and potential of MCI subtype discrimination through MR imaging feature extraction and selection, full consideration of the methodologic pitfalls of combining supervised feature selection procedures with SVM in whole-brain imaging data analysis is highly recommended.

REFERENCES

  1. Haller S, Missonnier P, Herrmann FR, et al. Individual classification of mild cognitive impairment subtypes by support vector machine analysis of white matter DTI. AJNR Am J Neuroradiol 2013;34:283–91
  2. Pereira F, Mitchell T, Botvinick M. Machine learning classifiers and fMRI: a tutorial overview. Neuroimage 2009;45(1 suppl):S199–209
  3. Smialowski P, Frishman D, Kramer S. Pitfalls of supervised feature selection. Bioinformatics 2010;26:440–43
  4. Diciotti S, Ginestroni A, Bessi V, et al. Identification of mild Alzheimer's disease through automated classification of structural MRI features. Conf Proc IEEE Eng Med Biol Soc 2012;2012:428–31

© 2013 by American Journal of Neuroradiology