Synthetic MRI for Clinical Neuroimaging: Results of the Magnetic Resonance Image Compilation (MAGiC) Prospective, Multicenter, Multireader Trial

The authors performed a prospective multireader, multicase noninferiority trial of 1526 images read by 7 blinded neuroradiologists with prospectively acquired synthetic and conventional brain MR imaging case-control pairs from 109 subjects with neuroimaging indications. Each case included conventional T1- and T2-weighted, T1 and T2 FLAIR, and STIR and/or proton density and synthetic reconstructions from multiple-dynamic multiple-echo imaging. Images were randomized and independently assessed. Overall synthetic MR imaging quality was similar to that of conventional proton-density, STIR, and T1- and T2-weighted contrast views across neurologic conditions. Artifacts were more common in synthetic T2 FLAIR, but were readily recognizable and did not mimic pathology.
BACKGROUND AND PURPOSE: Synthetic MR imaging enables reconstruction of various image contrasts from 1 scan, reducing scan times and potentially providing novel information. This study is the first large, prospective comparison of synthetic-versus-conventional MR imaging for routine neuroimaging.
MATERIALS AND METHODS: A prospective multireader, multicase noninferiority trial of 1526 images read by 7 blinded neuroradiologists was performed with prospectively acquired synthetic and conventional brain MR imaging case-control pairs from 109 subjects (mean, 53.0 ± 18.5 years of age; range, 19–89 years of age) with neuroimaging indications. Each case included conventional T1- and T2-weighted, T1 and T2 FLAIR, and STIR and/or proton density and synthetic reconstructions from multiple-dynamic multiple-echo imaging. Images were randomized and independently assessed for diagnostic quality, morphologic legibility, radiologic findings indicative of diagnosis, and artifacts.
RESULTS: Clinical MR imaging studies revealed 46 healthy and 63 pathologic cases. Overall diagnostic quality of synthetic MR images was noninferior to conventional imaging on a 5-level Likert scale (P < .001; mean synthetic-conventional, −0.335 ± 0.352; Δ = 0.5; lower limit of the 95% CI, −0.402). Legibility of synthetic and conventional morphology agreed in >95%, except in the posterior limb of the internal capsule for T1, T1 FLAIR, and proton-density views (all, >80%). Synthetic T2 FLAIR had more pronounced artifacts, including +24.1% of cases with flow artifacts and +17.6% of cases with white noise artifacts.
CONCLUSIONS: Overall synthetic MR imaging quality was similar to that of conventional proton-density, STIR, and T1- and T2-weighted contrast views across neurologic conditions. While artifacts were more common in synthetic T2 FLAIR, these were readily recognizable and did not mimic pathology but could necessitate additional conventional T2 FLAIR to confirm the diagnosis.

Synthetic MR imaging uses quantitative probing of multiple physical properties to reconstruct multiple contrasts from 1 scan. Parameters like TR, TE, and TI can be modified with mathematic inferences rather than being predetermined. 1-3 The duration of diagnostic brain studies can thus be reduced to about 5 minutes with synthetic MR imaging. 4 This advancement may help improve throughput and reduce rescanning, while also providing quantitative information of research interest. 4-6 Clinical studies of synthetic MR imaging are highly heterogeneous in that they examine a variety of conditions with widely varying scan parameters, with a paucity of large, randomized trials to inform clinical usage. 3,6 Blystad et al 5 (2012) reported that synthetic images had diagnostic utility similar to that of conventional imaging series, though with some quality issues like granulation and contrast particularly apparent in FLAIR views. Other studies reported good quality and contrast for synthetic images among certain indications, such as multiple sclerosis, brain metastasis, 6,7 and myelination patterns. 8 Because image quality endpoints are reliant on reader judgment (reported to have up to 41% variability and only fair-to-moderate interrater agreement 9,10 ) and scanning conditions, drawing clinically relevant inferences from diverse small trials is challenging. Furthermore, the broad diversity of both healthy and pathologic morphologic variants encountered in routine neuroimaging necessitates more robust clinical studies of synthetic MR imaging for clinical neuroimaging.
This study was designed to compare the overall image quality of synthetic MR imaging with conventional MR imaging in a general neuroimaging population. Secondary aims included legibility of anatomic and morphologic features, artifact prevalence, and diagnostic performance across a range of cases helpful in informing clinical usage and adoption of synthetic MR imaging.

MATERIALS AND METHODS
Participants and Clinical Assessments
Subjects (n = 117) were enrolled prospectively into a multireader multicenter case-control study across 6 hospitals from November 2015 to January 2016 (ClinicalTrials.gov Identifier NCT02596854). Of these, all complete cases (n = 109; 45 men, 64 women; mean, 53.0 ± 18.5 years of age; range, 19–89 years) with synthetic and conventional (control) acquisitions were read. Subjects were 18 years of age or older with clinical indications for neuroimaging and without contraindications to MR imaging or previously diagnosed congenital conditions or extensive trauma prohibiting scanning. Governing ethics committees at each site approved this study, and subjects provided written informed consent.

Image Acquisition
Images were prospectively acquired by using a fixed set of scanning parameters closely approximating current standard of care brain MR imaging (as detailed for 1.5T and 3T scanners in On-line Table 1). First, conventional images were acquired by using conventional 2D axial plane T1- and T2-weighted, T1 and T2 fluid-attenuated inversion recovery, short tau inversion recovery, and proton density (PD) sequences. Then, a multiple-dynamic multiple-echo (MDME) sequence was performed for synthetic reconstruction, for a complete conventional and synthetic case-control series. MDME uses a repeat version of the same gradient-reversal process used to create a single gradient-echo to produce additional gradient-echoes after a single radiofrequency pulse. This is known as multiple (or dual) echo gradient-echo, which is possible when complete loss of the transverse magnetization by T2* relaxation has not yet occurred. Because MDME is a quantitative sequence, it enables absolute quantification of tissue physical properties, like longitudinal R1 relaxation rate, transverse R2 relaxation rate, and PD independent of the scanner settings. MDME parameters acquired in 1 scan are used in synthetic imaging to calculate pixel intensity, producing an appearance similar to that of conventional MR images with modifiable TE, TR, and TI. 5,11 Thus, synthetic (based on MDME) and conventional T1, T2, T1 FLAIR, T2 FLAIR, PD, and STIR contrast views were collected. MDME data were reconstructed outside the clinical care environment by using MAGnetic resonance image Compilation (MAGiC) software on a 64-bit Advantage Workstation (GE Healthcare, Milwaukee, Wisconsin). No errors were logged during processing, and the average processing time was approximately 2 minutes per case. Scan duration, subject disposition, and imaging results were recorded for each case.
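As an illustration of how pixel intensity can be calculated from quantitative R1, R2, and PD values with modifiable TE, TR, and TI, the following minimal sketch uses the textbook spin-echo and inversion-recovery signal equations. The reconstruction model actually implemented in the MAGiC software is not described here, so the equations, the function names (synthetic_signal, nulling_ti_ms), the tissue values, and the virtual TR/TE settings are all assumptions chosen for illustration only.

```python
import math

def synthetic_signal(pd, r1, r2, tr_ms, te_ms, ti_ms=None):
    """Illustrative pixel intensity from quantitative maps (assumed textbook model).

    pd           : proton density (arbitrary units)
    r1, r2       : longitudinal and transverse relaxation rates (1/s)
    tr_ms, te_ms : virtual repetition and echo times (ms)
    ti_ms        : virtual inversion time (ms); None means no inversion pulse
    """
    tr, te = tr_ms / 1000.0, te_ms / 1000.0
    t2_decay = math.exp(-te * r2)
    if ti_ms is None:
        # Saturation-recovery spin-echo term
        recovery = 1.0 - math.exp(-tr * r1)
    else:
        ti = ti_ms / 1000.0
        # Inversion-recovery term, magnitude reconstruction
        recovery = abs(1.0 - 2.0 * math.exp(-ti * r1) + math.exp(-tr * r1))
    return pd * recovery * t2_decay

def nulling_ti_ms(r1, tr_ms):
    """TI (ms) at which the inversion-recovery term is zero for a tissue with rate r1."""
    tr = tr_ms / 1000.0
    return 1000.0 * math.log(2.0 / (1.0 + math.exp(-tr * r1))) / r1

# Hypothetical tissue values of plausible order of magnitude (not study data)
TISSUES = {
    "white_matter": {"pd": 0.70, "r1": 1 / 0.8, "r2": 1 / 0.08},
    "csf": {"pd": 1.00, "r1": 1 / 4.0, "r2": 1 / 2.0},
}

# Virtual T2-weighted view (assumed TR 4500 ms, TE 100 ms)
t2w = {name: synthetic_signal(tr_ms=4500, te_ms=100, **q) for name, q in TISSUES.items()}

# Virtual T2 FLAIR view: TI chosen to null CSF at an assumed TR of 10000 ms
flair_ti = nulling_ti_ms(TISSUES["csf"]["r1"], tr_ms=10000)
t2_flair = {
    name: synthetic_signal(tr_ms=10000, te_ms=100, ti_ms=flair_ti, **q)
    for name, q in TISSUES.items()
}

for name in TISSUES:
    print(f"{name}: T2w={t2w[name]:.3f}  T2 FLAIR={t2_flair[name]:.3f}")
```

Under these assumptions, the same quantitative values yield bright CSF on the virtual T2-weighted view and suppressed CSF on the virtual T2 FLAIR view, which is the sense in which contrast views are generated after, rather than during, acquisition.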
The site-determined diagnosis was recorded on the basis of the results of MR imaging studies and work-up performed according to the standard of care by clinical neuroradiologists. The sites reported the reference (site-determined diagnosis) by using the same scale as the study readers, which reports normal or ≥1 pathologic subtype adapted from Osborn et al (2010) 12 : 1) traumatic, complex, indeterminate, or other condition or injury; 2) congenital malformation; 3) ischemic or hemorrhagic stroke, subarachnoid hemorrhage/aneurysm; 4) vascular malformation; 5) neoplasm/primary neoplastic cysts; 6) infectious/demyelinating disease; or 7) metabolic/degenerative disorders.

Radiologic Assessments
Synthetic and conventional image sets were randomized and assessed by 7 blinded independent neuroradiologists (>10 years' experience) on standard imaging workstations. Case-control pairs from the same subject were separated and read across 2 sessions, separated by a 4-week memory-washout period. Each read included either all synthetic or all conventional contrast views from a case. Overall diagnostic image quality was rated (considering all available contrast views) on a 5-point Likert-type scale: 5 = excellent (acceptable for diagnostic use), 4 = good (acceptable for diagnostic use), 3 = acceptable (acceptable for diagnostic use but with minor issues), 2 = poor (not acceptable for diagnostic use), or 1 = unacceptable (not acceptable for diagnostic use). Ratings of ≥3 were considered acceptable overall. For image sets rated as unacceptable (1 or 2), the rationale was recorded as "open text." Readers also recorded radiologic findings indicative of a diagnosis with corresponding Osborn classifications.
For each contrast view, readers rated the legibility (or visibility of margins and structures associated with key anatomic/morphologic features) of anatomies defined a priori. Legibility ratings supplemented overall image-quality data, which consider all regions of the brain, as a means of providing specific information about anatomic regions in brain imaging. Each anatomy was rated on a binary scale (legible/illegible), including the following: central sulcus, head of the caudate nucleus, posterior limb of the internal capsule, cerebral peduncle, middle cerebellar peduncle, and cervicomedullary junction. Readers recorded whether any of the following artifacts were present: 13 low signal-to-noise, motion and section issues, infolding or wrap-around, white pixel or spike noise, phase encoding, flow, contrast-to-noise, low image resolution, or blurring. Readers could provide free text comments on any other observations.

Statistical Analysis
Statistical analysis was performed in SAS 9.2 (SAS Institute, Cary, North Carolina), and sample size was calculated in PASS 12 (NCSS Statistical Software, Kaysville, Utah). Per the prospective statistical plan to determine noninferiority, a Wilcoxon signed rank test was used to determine noninferiority of synthetic-to-conventional MR imaging in terms of the overall diagnostic image quality score, by using a 1-sided α = .025 test with a noninferiority margin of Δ = .5 with a 5-level Likert scale. The primary hypothesis is 1-sided and can be stated as H0: S ≤ −Δ and HA: S > −Δ, where S is the median difference of overall diagnostic image quality across readers for synthetic-versus-conventional MR imaging, in which noninferiority is established by rejecting the null hypothesis. The margin (Δ) of .5 was determined statistically on the basis of the population and was confirmed by clinical estimates from prior research 5 and institutional pilot data, in accordance with recommendations for determination of noninferiority margins described in the US Food and Drug Administration Guidance for Industry: Non-Inferiority Clinical Trials to Establish Effectiveness (2016) (https://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM202140.pdf) and trial designs for noninferiority testing in radiology reviewed by Ahn et al (2012). 14 Descriptive statistics were used to summarize secondary endpoints of anatomic/morphology legibility by anatomic region, artifact prevalence, and diagnostic performance (sensitivity/specificity) by the Osborn classification. Interrater reliability between readers was assessed by the kappa (κ) statistic.
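The 1-sided hypothesis above can be illustrated with a minimal sketch that shifts each synthetic-minus-conventional difference by the margin Δ and applies a 1-sided Wilcoxon signed rank test to the shifted values. This is not the SAS procedure used in the study; the function name wilcoxon_noninferiority, the normal approximation with tie correction, the handling of zero differences, and the example data are all assumptions made for illustration.

```python
import math
from statistics import NormalDist

def wilcoxon_noninferiority(diffs, margin):
    """1-sided Wilcoxon signed rank test of H0: S <= -margin vs HA: S > -margin.

    diffs  : per-case score differences (synthetic minus conventional)
    margin : noninferiority margin (positive number)
    Returns (w_plus, z, p) from a normal approximation with tie correction.
    """
    # Shifting by the margin turns the problem into a test against zero
    shifted = [d + margin for d in diffs]
    shifted = [s for s in shifted if s != 0.0]  # zeros carry no sign information
    n = len(shifted)
    if n == 0:
        raise ValueError("all shifted differences are zero")

    # Rank absolute values, assigning average ranks to ties
    order = sorted(range(n), key=lambda i: abs(shifted[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(shifted[order[j + 1]]) == abs(shifted[order[i]]):
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t**3 - t
        i = j + 1

    w_plus = sum(r for r, s in zip(ranks, shifted) if s > 0)
    mean = n * (n + 1) / 4.0
    var = n * (n + 1) * (2 * n + 1) / 24.0 - tie_term / 48.0
    z = (w_plus - mean) / math.sqrt(var)
    p = 1.0 - NormalDist().cdf(z)  # upper tail: large W+ favors noninferiority
    return w_plus, z, p

# Hypothetical differences (not study data)
example_diffs = [-0.25, 0.0, -0.5, 0.25, -0.75, -0.25, 0.0, 0.5, -0.25, -1.0,
                 0.25, 0.0, -0.25, -0.5, 0.0, 0.25, -0.25, 0.0, -0.75, 0.25]
w_plus, z, p = wilcoxon_noninferiority(example_diffs, margin=0.5)
print(f"W+ = {w_plus}, z = {z:.2f}, noninferior at alpha .025: {p < 0.025}")
```

With this formulation, rejecting the null hypothesis at the 1-sided .025 level corresponds to concluding that the median difference lies above −Δ.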

RESULTS
Overall Diagnostic Quality
Each of 7 blinded neuroradiologists read all 109 clinically acquired case-control image sets (109 synthetic and 109 conventional) for a total of 1526 reads (763 synthetic and 763 conventional reads). Of these, 56/109 were acquired on 1.5T static field strength scanners and 53/109 were acquired on 3T scanners. Because no significant differences for 5-level image quality scores (acceptable = 3, 4, or 5 versus unacceptable = 1 or 2) were observed on the basis of scanner static field strength (1.5T or 3T) or acquisition site (P > .05 with a 2-tailed t test), results were pooled for analysis. The duration of scanning was recorded, with a single-acquisition sequence for synthetic reconstruction requiring 5 minutes 36 seconds on 1.5T scanners and 5 minutes 4 seconds on 3T scanners (On-line Table 1).
Considering all contrast views, 734 (96%) synthetic cases and 745 (98%) conventional cases were rated as acceptable (≥3 on a 5-point scale) (Table). Figure 1 shows comparable synthetic and conventional case-control images from a normal (no pathology present) brain by contrast view. Figures 2-5 show case-control images across a range of brain pathologies (continued in On-line Figs 1-3). Overall diagnostic image quality of synthetic images was statistically noninferior to conventional images, with a mean difference (synthetic-conventional) across readers of −0.335 ± 0.352 with a lower limit of the (1-sided) 95% CI of −0.402 (median, −0.428; minimum, −1.286; and maximum, 0.714; P < .001). Among synthetic images rated as poor or unacceptable (1 or 2 on a 5-point scale), the most common quality issue was patient motion, because all synthetic contrast views are generated from a single acquisition (so a single motion event propagates across all reconstructed contrast views).
Table: Diagnostic image-quality ratings by static field strength of scanner and overall.
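The read counts, acceptability rates, and noninferiority check reported above can be retraced with simple arithmetic. The sketch below assumes that the lower confidence limit is computed from the reported mean and standard deviation with n = 109 cases and a normal quantile; the method actually used for the interval is not stated, so this is only an approximate consistency check.

```python
import math
from statistics import NormalDist

readers, cases, arms = 7, 109, 2
total_reads = readers * cases * arms   # synthetic plus conventional reads
reads_per_arm = readers * cases        # reads of each image type

acceptable = {"synthetic": 734, "conventional": 745}
rates = {kind: count / reads_per_arm for kind, count in acceptable.items()}

mean_diff, sd_diff, margin = -0.335, 0.352, 0.5

# Assumption: n = 109 and a normal quantile at the 1-sided .025 level
z = NormalDist().inv_cdf(1 - 0.025)
lower_limit = mean_diff - z * sd_diff / math.sqrt(cases)
noninferior = lower_limit > -margin

print(f"total reads: {total_reads} ({reads_per_arm} per image type)")
for kind, rate in rates.items():
    print(f"{kind}: {100 * rate:.1f}% rated acceptable")
print(f"approximate lower limit: {lower_limit:.3f}; above -margin: {noninferior}")
```

Under these assumptions the approximate lower limit falls close to the reported value and well above −Δ = −0.5, which is the condition for noninferiority.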

Legibility of Anatomic/Morphologic Features
Anatomic/morphologic features were visualized and rated as legible in synthetic and conventional imaging for ≥98% of regions across contrast views, except in the cervicomedullary junction rated at 96% on both synthetic and conventional imaging (On-line Table 2). For synthetic and conventional pairs from the same subject, readers agreed for ≥95% of anatomic/morphologic regions across contrast views, except in the posterior limb of the internal capsule for T1, T1 FLAIR, and PD views (>80% agreement). Notably, 6 of 7 readers had agreement of 99%-100% for T1 FLAIR, with 1 reader as an outlier at 89%, possibly related to experience. Further study will be needed to investigate the influence of experience on reading synthetic images and possible training solutions.
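Agreement between paired synthetic and conventional legibility ratings can be expressed as the percentage of pairs with matching binary ratings. The sketch below is one simple way to compute it; the function name percent_agreement and the example ratings are hypothetical and are not study data.

```python
def percent_agreement(synthetic, conventional):
    """Percentage of paired binary ratings (True = legible) that match."""
    if len(synthetic) != len(conventional) or not synthetic:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(s == c for s, c in zip(synthetic, conventional))
    return 100.0 * matches / len(synthetic)

# Hypothetical ratings of one anatomic region by one reader across 10 subjects
synthetic_ratings = [True, True, True, False, True, True, True, True, True, True]
conventional_ratings = [True, True, True, True, True, True, True, True, True, True]

agreement = percent_agreement(synthetic_ratings, conventional_ratings)
print(f"agreement: {agreement:.1f}%")
```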

Artifacts Occurrence and Characterization
Fewer artifacts (all characterizations) were identified in synthetic than in conventional imaging for T1-weighted (9.2%), STIR (24.8%), and PD (1.1%) contrast views (On-line Table 3). Synthetic images had more artifacts overall on the T2-weighted (5.0%), T1 FLAIR (17.9%), and T2 FLAIR (49.3%) contrast views (On-line Table 3). Phase-encoding artifacts were less frequent in synthetic STIR images (27.2%) and synthetic T1 contrast views (13.0%). Synthetic contrast views were more likely to contain white pixels/spike noise artifacts across contrast views (except PD), and flow artifacts were more common in synthetic views, most notably in the synthetic T1-weighted (13.0%), T1 FLAIR (22.1%), and T2 FLAIR (24.1%) contrast views. Readers identified relatively more artifacts among synthetic T2 FLAIR contrast views compared with other synthetic and conventional contrast views. Synthetic T2 FLAIR showed 24.1% more flow artifacts, 17.6% more white noise artifacts, and 59.2% more artifacts marked as "other" compared with conventional views. Examination of reader free text comments revealed that artifacts marked as "other" primarily described localized, granulated hyperintensities apparent in the margins only in synthetic T2 FLAIR contrast views (Fig 6). These artifacts were recognizable by a distinct pixelated appearance and a tendency to occur along tissue-CSF boundaries only in T2 FLAIR views in otherwise unremarkable image sets. Readers reported that synthetic T2 FLAIR may have some diagnostic limitations in practice, which could necessitate a conventional T2 FLAIR scan. However, owing to the nature of synthetic imaging (which results in a full range of possible contrast views for cross-comparison), neuroradiologists were readily able to distinguish T2 FLAIR artifacts from pathology, without impacting diagnostic utility.

Diagnostic Performance
Overall interrater agreement (κ correlation coefficient) for pathology detection was 0.502 for synthetic images and slightly higher at 0.605 for conventional images. Across the 7 readers (1526 total reads, including 763 synthetic and 763 conventional pairs), overall sensitivity for correct identification of pathology ranged from 60.32 (95% CI, 47. 20  reported by the site (based on clinical MR imaging studies and, when necessary, additional follow-up or laboratory testing), the study included 46 healthy and 63 pathologic cases (of which 2 cases contained 2 pathology types and 1 case contained 3 pathology types), including 7 traumatic or complex injuries, 2 congenital malformations, 12 strokes/hemorrhages, 2 vascular malformations, 32 neoplasms/primary neoplastic cysts, 10 infectious/demyelinating conditions, and 2 metabolic/degenerative disorders. Readers of synthetic MR imaging showed equal or higher ability to diagnose all pathologies, except for the neoplasms/primary neoplastic cysts subgroup (n = 32; difference in detection of ±6.3% sensitivity and ±1.3% specificity among readers) and the infectious/demyelinating subgroup (n = 10; difference in detection of ±10.0% sensitivity and ±3.0% specificity).
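Sensitivity and specificity with 95% confidence intervals can be computed from a reader's counts against the site-determined reference. The sketch below uses an exact (Clopper-Pearson) binomial interval solved by bisection; the interval method used in the study is not stated, so this choice is an assumption, as are the function names. The counts are hypothetical: 38 of 63 pathologic cases is used only because it corresponds to a sensitivity of 60.32%, and the specificity counts are invented for illustration.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for a binomial proportion k/n."""
    def solve(g):
        # g is increasing in p on [0, 1]; find the root by bisection
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2.0
            if g(mid) < 0:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2.0

    # Lower limit: P(X >= k | p) = alpha / 2
    lower = 0.0 if k == 0 else solve(lambda p: (1 - binom_cdf(k - 1, n, p)) - alpha / 2)
    # Upper limit: P(X <= k | p) = alpha / 2
    upper = 1.0 if k == n else solve(lambda p: alpha / 2 - binom_cdf(k, n, p))
    return lower, upper

# Hypothetical counts for one reader (not study data)
true_pos, false_neg = 38, 25   # 63 pathologic cases
true_neg, false_pos = 40, 6    # 46 healthy cases

sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)
sens_lo, sens_hi = clopper_pearson(true_pos, true_pos + false_neg)
spec_lo, spec_hi = clopper_pearson(true_neg, true_neg + false_pos)

print(f"sensitivity {100 * sensitivity:.2f}% (95% CI, {100 * sens_lo:.2f}-{100 * sens_hi:.2f})")
print(f"specificity {100 * specificity:.2f}% (95% CI, {100 * spec_lo:.2f}-{100 * spec_hi:.2f})")
```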

DISCUSSION
To our knowledge, this is the first large, prospective, randomized study of synthetic MR imaging technology to enroll a cross-section of the neuroimaging population, including a variety of brain pathologies encountered in clinical practice. On the basis of blinded assessments from 7 neuroradiologists, the overall diagnostic quality of synthetic MR imaging was statistically noninferior to conventional MR imaging series for T1- and T2-weighted, T1 and T2 FLAIR, STIR, and PD contrast views. Furthermore, neuroradiologists reported similar anatomic/morphologic feature legibility in both synthetic and conventional images. Both synthetic and conventional sequences exhibited similar quality issues and artifact trends for T1- and T2-weighted, STIR, and PD contrast views, while synthetic imaging had more FLAIR artifacts. Synthetic FLAIR artifacts were readily recognizable by cross-comparison within contrast views and thus did not significantly impact the diagnostic use of synthetic MR imaging. Overall, study results demonstrated that both synthetic and conventional imaging have similar diagnostic utility.
Anatomic and morphologic characteristics were visible in both synthetic and conventional views, though both exhibited issues pertaining to visualizing the craniocervical junction, CSF suppression, and pulsation artifacts that are well-documented in MR imaging. 5,15-18 Synthetic imaging exhibited characteristic hyperintense artifacts in FLAIR views, corroborating previous reports that further work will be necessary before synthetically generated FLAIR views can fully replace conventional FLAIR in practice. 5,6 While FLAIR artifacts contributed to lower overall image quality scores (because all views were considered in this composite primary end point), the overall impact of FLAIR artifacts on diagnosis was inherently limited by the nature of the synthetic views, in which immediate cross-comparison with other contrast views is possible. On rare occasions, encoding artifacts in FLAIR views could necessitate clinical workflow changes such as the addition of a single conventional scan; however, the impact on the patient's overall scan experience is offset by the time savings of the synthetic acquisition. Furthermore, motion and signal-encoding artifacts were observed to affect all reconstructed synthetic views if present in the original acquisition. As few as 7.5% of single MR images exhibited motion artifacts, while up to 19.8% of long scans of multiple contrasts may be affected. 19 Because synthetic imaging reduces the overall scan time, the impact of acquisition issues is expected to be limited in practice.
Diagnostic performance of synthetic imaging was similar to that of conventional MR imaging, as indicated by statistical noninferiority of synthetic images. While the noninferiority model is decisive for effectiveness in therapeutic studies, which directly assess ultimate patient outcomes, elucidating the clinical implications of noninferiority findings in radiology is less straightforward because the negative effects of image quality may have variable effects on ultimate patient outcomes. 14 Thus, from a clinical perspective, we observed that in both synthetic and conventional MR imaging, some neoplasms/primary neoplastic cysts and infectious or demyelinating conditions were challenging for readers to identify without additional clinical or laboratory work-up, possibly due to overlapping appearances of neoplastic and inflammatory conditions on MR images. 20,21 The sensitivity and specificity of MR imaging in neuroradiology have been reported to range from 39% to 98% and 33% to 100%, respectively, with wide variations based on reader experience and the pathologic condition studied. 22-25 Across study readers, synthetic MR imaging sensitivity and specificity had values within typical clinically observed ranges for blinded MR imaging reads (without clinical context). 22-25 Statistical variations in diagnostic classifications may be centrally attributable to small samples of certain pathologies in the present study, meriting further study of these pathologic subgroups. Synthetic scanning is performed in the axial view only, and some clinical cases may be limited by spatial resolution in this section direction. Owing to the relatively shorter synthetic acquisition time, however, additional sequences can also be combined with the synthetic acquisition in a single examination session with minimal burden on the patient.
Figure caption: A slight misregistration is apparent due to patient motion between the MDME scan (used for synthetic reconstruction, lower row) and the comparable conventional scan acquired in the study (upper row). While misregistration due to motion can pose challenges in conventional serial acquisitions due to partial section differences in images across contrast views, synthetic reconstruction inherently prevents misregistration across synthetic contrast views.
The strengths of this study include the use of a prospective acquisition protocol with matched scanning parameters (On-line Table 1). Because scans were acquired in a fixed order with MDME (synthetic reconstruction) acquired last, a relative propensity toward motion artifacts in synthetic images may not be representative of actual occurrence. Reports have, however, shown that single scans of short duration have lower incidences of motion than longer scans. 13,26 The trial results support the use of synthetic MR imaging in brain imaging to reduce scan time and the associated discomfort for patients undergoing brain MR imaging, with diagnostic performance similar to that of conventional imaging.

CONCLUSIONS
The current study demonstrated that synthetic images were statistically noninferior in terms of overall diagnostic image quality compared with conventional MR images, with similar diagnostic utility for detecting a range of brain pathologies. Both synthetic and conventional MR imaging could visualize anatomic and morphologic features of the brain, with similar trends in artifacts and diagnostic utility. Because synthetic reconstructions rely on the quality of a single scan, care should be taken to minimize motion and acquisition artifacts. While more artifacts were observed in synthetic T2 FLAIR reconstructions, cross-comparison with other contrast views enabled neuroradiologists to readily detect these artifacts without interfering with the diagnostic ability of synthetic images. The trial results support the use of synthetic MR imaging in brain imaging to reduce scan time and discomfort for patients undergoing brain MR imaging, while acquiring high-quality diagnostic MR images. We expect that further research may reveal additional applications for synthetic MR imaging.
Figure caption: Subdural hematoma on T2 FLAIR in synthetic and conventional 3T MR imaging demonstrating pronounced artifacts. Conventional (left) and synthetic (right) T2 FLAIR images are shown for a patient with subdural hematoma, in which synthetic T2 FLAIR has notable granulated hyperintensities and lacks contrast between the lesion and surrounding tissues. Artifacts of this severity level were rare among synthetically reconstructed images, possibly due to issues in the MDME acquisition that are typically resolved on rescanning. For cases demonstrating these granulated hyperintensities on the synthetic T2 FLAIR, artifacts were readily recognizable by characteristic distortion and correlation with other contrast views without apparent artifacts. While these could necessitate rescanning with conventional T2 FLAIR in some cases, when coupled with other contrast views, these artifacts did not interfere with the diagnostic accuracy of synthetic MR imaging.