PACS Integration of Semiautomated Imaging Software Improves Day-to-Day MS Disease Activity Detection

BACKGROUND AND PURPOSE: The standard for evaluating interval radiologic activity in MS, side-by-side MR imaging comparison, is restricted by its time-consuming nature and limited sensitivity. VisTarsier, a semiautomated software for comparing volumetric FLAIR sequences, has shown better disease-activity detection than conventional comparison in retrospective studies. Our objective was to determine whether implementing this software in day-to-day practice would show similar efficacy. MATERIALS AND METHODS: VisTarsier created an additional coregistered image series for reporting a color-coded disease-activity change map for every new MS MR imaging brain study that contained volumetric FLAIR sequences. All other MS studies, including those generated during software-maintenance periods, were interpreted with side-by-side comparison only. The number of new lesions reported with software assistance was compared with those observed with traditional assessment in a generalized linear mixed model. Questionnaires were sent to participating radiologists to evaluate the perceived day-to-day impact of the software. RESULTS: Nine hundred six study pairs from 538 patients during 2 years were included. The semiautomated software was used in 841 study pairs, while the remaining 65 used conventional comparison only. Twenty percent of software-aided studies reported having new lesions versus 9% with standard comparison only. The use of this software was associated with an odds ratio of 4.15 for detection of new or enlarging lesions (P = .040), and 86.9% of respondents from the survey found that the software saved at least 2–5 minutes per scan report. CONCLUSIONS: VisTarsier can be implemented in real-world clinical settings with good acceptance and preservation of accuracy demonstrated in a retrospective environment.

With these therapies, no evidence of disease activity has become a new treatment target, making disease monitoring more important than ever. 3,4 MR imaging is the most commonly used surrogate marker of MS activity. 5,6 Radiologists typically evaluate MR imaging studies for the development of new MS lesions by comparing the current study with a prior study in adjacent view ports on a monitor, usually in multiple planes, which we will refer to as conventional side-by-side comparison (CSSC). The sensitivity of such a comparison is degraded by multiple human and technologic factors, including the quality of MR imaging protocols and the expertise of radiologists evaluating the examinations. [7][8][9] Although it is routinely accepted in phase II and III trials, the demanding nature and relative inaccuracy of visual inspection of MRIs compared with novel methods including computer-assisted lesion detection pose an important limitation to utility in clinical practice. 10,11 Indeed, computer-assisted lesion-detection software has shown promise by increasing the specificity and sensitivity of MS disease-activity monitoring. 8,12,13 One such software, VisTarsier (VT; open-source available at github.com/mhcad/vistarsier) has been validated in a series of retrospective studies, allowing radiologists, regardless of training level, to detect up to 3 times as many new MS lesions on monitoring scans compared with CSSC. 8,9,14 These validation studies, however, were performed on a dedicated research workstation with axial, coronal, sagittal and semitransparent 3D "overview" images, rather than on a conventional PACS workstation during normal clinical practice.
In this prospective, observational cohort study, we sought to share our experiences implementing this assistive software in the Royal Melbourne Hospital PACS and to demonstrate that, once implemented, it would augment radiologists' capacity to detect MS disease activity compared with CSSC.

Software Integration into PACS
Every new MR imaging brain demyelination protocol study generated using 3T magnets (Tim Trio, 12-channel head coil; Siemens, Erlangen, Germany) for a patient with a previous study obtained with the same MR imaging protocol was automatically processed by the software. The automated process (Fig 1) is triggered as soon as a study is verified in our radiology information system (Karisma; Kestral, Perth, Australia) by the radiographer, with the radiology information system automatically sending a completion HL7 message (NextGen Connect; NextGen Healthcare, Irvine, California) to the software virtual machine (Xeon Processor E5645, 8 VCPU cores @ 2.40 GHz, 8 GB DDR3 RAM, 500 GB SATA3 7200 RPM hard disk drive, no 3D/GPU acceleration [Intel, Santa Clara, California]; Windows 7 Professional 64-bit operating system [Microsoft, Redmond, Washington]). The software then queries the PACS and searches the study for a series that is deemed compatible on the basis of a list of possible series descriptors (eg, FLAIR sagittal 3D). If a compatible series exists in the new study, the software then queries the PACS for previous MR imaging studies of the same patient. Once a compatible series is found in the previous most recent MR imaging, the 2 series are retrieved and processed. Software processing includes brain-surface extraction and masking of volumetric FLAIR sequences, followed by intensity normalization, 6-df registration, automated change detection, and reslicing to generate 3 new coregistered series: 1) a resliced prior study sagittal FLAIR (~160 images, preserving original resolution, one 16-bit grayscale channel); 2) an increased signal intensity color map (~160 images, 256 × 256, three 8-bit RGB channels); and 3) a decreased signal intensity color map (~160 images, 256 × 256, three 8-bit RGB channels). Once processing is complete, the virtual machine sends the 3 series (typical total size ~150 megabytes) back to the new study as additional series.
These series are then available as part of the normal clinical study for staff radiologists to report in real-time in the usual PACS environment (see the On-line Figure for an example of the output series generated by VisTarsier).
Most important, these change maps do not replace routine sequences and reformats but are in addition to routine imaging. They merely draw the attention of reporting radiologists to areas that may represent new or enlarging lesions (orange). These areas are then assessed normally on routine imaging, and a determination is made as to whether they represent disease activity.
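The series-selection step described in this section (find a compatible series in the new study, then look for a compatible series in the most recent earlier study) can be sketched as follows. This is a minimal illustration and not VisTarsier's actual code: the `Study` record, the `COMPATIBLE_DESCRIPTIONS` list, and the function names are hypothetical, and a real deployment would obtain these records through PACS queries rather than from an in-memory list. The sketch also assumes that earlier studies are searched newest first until one with a compatible series is found, which is one possible reading of the text.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple

# Hypothetical accepted descriptors; the text gives "FLAIR sagittal 3D" only as an example.
COMPATIBLE_DESCRIPTIONS = {"FLAIR sagittal 3D"}


@dataclass
class Study:
    """Hypothetical stand-in for a study record returned by a PACS query."""
    patient_id: str
    study_date: date
    series_descriptions: List[str]


def compatible_series(study: Study) -> Optional[str]:
    """Return the first series description found on the accepted list, if any."""
    for description in study.series_descriptions:
        if description in COMPATIBLE_DESCRIPTIONS:
            return description
    return None


def select_pair(new_study: Study, archive: List[Study]) -> Optional[Tuple[str, Study, str]]:
    """Pick the series to compare, or return None when only CSSC is possible.

    Assumption: earlier studies of the same patient are searched newest first,
    and the first one holding a compatible series is used.
    """
    new_series = compatible_series(new_study)
    if new_series is None:
        return None
    priors = [
        s for s in archive
        if s.patient_id == new_study.patient_id and s.study_date < new_study.study_date
    ]
    for prior in sorted(priors, key=lambda s: s.study_date, reverse=True):
        prior_series = compatible_series(prior)
        if prior_series is not None:
            return new_series, prior, prior_series
    return None
```

A `None` result corresponds to the situation in which no software-generated series is produced and the study is reported with CSSC only.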

Participants and Data Collection
In July 2015, the software underwent a soft launch within our tertiary hospital's PACS (ethics approval number QA2015161). Eligibility criteria included the following: consecutive studies in patients with a confirmed diagnosis of multiple sclerosis (as per 2017 revised McDonald criteria) and an MR imaging including a volumetric FLAIR sequence (FOV = 250, 160 sections, section thickness = 0.98 mm, matrix = 258 × 258, in-plane resolution = 0.97 mm, TR = 5000 ms, TE = 350 ms, TI = 1800 ms, 72 degree selective inversion recovery magnetic preparation). 15 For all studies not meeting the automated criteria for software assistance, only CSSC was used by staff radiologists to report MS disease progression. At our hospital, the software runs as a virtual machine on a server that hosts several other research and nonessential clinical services. Thus, upgrades, power outages, and hospital network reconfigurations lead to a small amount of downtime. In cases in which studies were performed during these times or due to other software-based failures illustrated in Fig 1, VT-assisted series were not automatically generated, and only CSSC was used by reporting radiologists. Unfortunately, a detailed breakdown of the various causes of nonprocessing could not be collated prospectively and cannot be established retrospectively.
Fig 1. Imaging studies for patients with MS are processed by the VisTarsier software in a virtual machine once they are signed off in the radiology information system (RIS) by the radiographer. Successful processing requires all systems to be operational and compatible sequences to be available.
We collected imaging reports for all studies performed with the above protocol prospectively from July 1, 2015, to June 30, 2017. All imaging reports for studies meeting the inclusion and exclusion criteria were assessed for written evidence of interval radiologic disease activity. Disease activity was defined as the presence of new or enlarging lesions as stated in the report body and/or conclusion available to the referring clinician. Demographic and clinical details for each patient were included in the study.
After study completion, a brief survey was sent to reporting radiologists and trainees to assess the real-world impact of the software on their day-to-day work. The results of this survey are summarized without statistical analysis.

Statistical Analysis
Assessed demographic and clerical variables included the following: the presence of VT-generated series, age at scanning, sex, and reporting radiologist's training level. Assessed clinical variables included disease-modifying drug use, Expanded Disability Status Scale (EDSS), time from diagnosis to the date of the scan, and annualized rate of MR imaging scans (ie, the number of MR imaging scans per year). Because available MS subtype data were incomplete, EDSS, time since diagnosis, and annualized scan rates were used as surrogate markers of disease activity and trajectory. The distributions of the variables were compared between the groups, using t tests and χ² tests. Generalized linear mixed models were computed to assess the difference in rates of disease progression with the software compared with CSSC. For the primary analysis, interval radiologic activity was entered as the dependent variable. All other assessed variables were entered as independent variables. Continuous variables were centered and scaled. A random intercept term for each participant was specified to allow multiple observations per person. Parameter estimation was performed using maximum likelihood. Because the dependent variable was binary, a binomial response family was used with a logit-link function. We also performed an additional sensitivity analysis with a stepwise forward variable selection for the multivariable generalized linear mixed model. An estimated odds ratio was computed for each variable. A 2-sided critical P value of .05 was used to assess statistical significance. Confidence intervals at the 95% level are presented when relevant. Data were analyzed with R statistical and computing software (http://www.r-project.org). 16
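For readers who prefer notation, the primary model described above can be written as a random-intercept logistic regression. The symbols below are introduced here for illustration only and are not taken from the original analysis; in particular, the normal distribution for the random intercept is the conventional assumption for this model class rather than something stated in the text.

```latex
\begin{align}
Y_{ij} \mid u_i &\sim \mathrm{Bernoulli}(p_{ij}), \\
\log\frac{p_{ij}}{1 - p_{ij}} &= \beta_0 + \beta_{\mathrm{VT}}\,\mathrm{VT}_{ij}
    + \boldsymbol{\beta}^{\top}\mathbf{x}_{ij} + u_i, \\
u_i &\sim \mathcal{N}\!\left(0, \sigma_u^{2}\right), \\
\tilde{x} &= \frac{x - \bar{x}}{s_x} \quad \text{(continuous covariates, centered and scaled)}, \\
\mathrm{OR}_{\mathrm{VT}} &= \exp\!\left(\beta_{\mathrm{VT}}\right).
\end{align}
```

Here $Y_{ij}$ indicates reported interval radiologic activity for study pair $j$ of patient $i$, $\mathrm{VT}_{ij}$ indicates whether software-generated series were available, $\mathbf{x}_{ij}$ collects the remaining covariates, and $u_i$ is the per-patient random intercept that accommodates multiple observations per person.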

RESULTS
During the 2-year study period, 906 study pairs for 538 patients met the inclusion criteria. VT was automatically activated in 841 study pairs. This activation occurred only on the occasions when both studies included a volumetric 3D-FLAIR sequence, the software was active at the time of image migration to PACS, and both studies had the same series labeling. Thus, all studies protocoled for MS follow-up should have been automatically processed by VT, and the instances in which this was not the case were random, resulting from technical reasons unrelated to patient factors (eg, server being restarted, Fig 1). These random cases occurred in the remaining 65 study pairs, which allowed CSSC only.
Processing times for the software-generated series varied depending on a few factors, including ease of brain-surface extraction and workload of the server due to additional services (average processing time = 5 minutes 11 seconds ± 22 seconds).
Clinical and demographic data are summarized in Table 1, with both groups showing a similar distribution of key variables. Age at scan, sex, and EDSS were comparable across the CSSC and software-assisted groups. As shown in Table 2, pharmacologic treatment was also comparable across groups.
In the first year following the introduction of the software, 20.49% (95% CI, 16.36%-24.63%) of studies using the software reported having new lesions versus 9.76% (95% CI, 0.67%-18.84%) with CSSC. Similarly, in the second year, 20.21% (95% CI, 16.6%-23.82%) of studies using the software reported new lesions.
The fully adjusted multivariable generalized linear mixed model found a greater probability of identifying new/enlarging lesions with the software than with CSSC, with an estimated odds ratio of 4.15 (95% CI, 1.07-16.14; P = .04). It was adjusted for age at scanning, sex, whether a scan was reported by a staff radiologist or a radiology resident, EDSS, time since diagnosis, and annualized rate of MR imaging scans. The On-line Table outlines the results of each partially adjusted model computed as part of our sensitivity analysis. These highlight the sustained effect of the software when adjusting for each additional variable independently. The Akaike information criterion (AIC) for the fully adjusted model was 586.8.
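The reported odds ratio, confidence interval, and P value can be checked for internal consistency with a few lines of arithmetic. The sketch below assumes that the interval is a Wald interval, symmetric on the log-odds scale; the text does not state how the interval was computed, so this is an assumption, and the function name `wald_summary` is ours.

```python
import math


def wald_summary(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover SE(log OR), the z statistic, and a 2-sided P value.

    Assumes a Wald 95% CI, i.e. log(OR) +/- z_crit * SE.
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # 2-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p_two_sided


se, z, p = wald_summary(4.15, 1.07, 16.14)
print(f"SE(log OR) = {se:.3f}, z = {z:.2f}, P = {p:.3f}")
```

Under this assumption the recovered P value is approximately .04, in agreement with the reported value. The wide interval reflects the small number of study pairs in the CSSC group.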
Of the 39 individuals reporting MR imaging to whom the impact assessment survey was sent, 23 responded, of whom 8 (34.8%) were radiology residents, 2 (8.7%) were radiology fellows, and 13 (56.5%) were staff radiologists, including 8 (34.8%) fellowship-trained neuroradiologists. Twenty-one (91.3%) reported always using the software when available, and 22 (95.7%) felt comfortable using it as an additional series for reporting. Twenty-one (91.3%) believed it saved them at least 2-5 minutes of reporting time per scan. None of the respondents believed the software added to their reporting time, and 21 (91.3%) stated that they would like to see it implemented in other areas soon.

DISCUSSION
Semiautomated imaging software has shown great promise in the field of MS disease monitoring. [17][18][19] Earlier studies of VT concluded that it allowed higher lesion detection with improved interreader reliability and decreased reporting times when used by readers of all radiology training levels (ie, ranging from medical student to fellowship-trained neuroradiologist) compared with their performance using CSSC. 8,9,14 The main caveats of prior research in this area, however, included the retrospective design, artificial research conditions, and/or relatively small sample sizes.
In this translational study, we used a previously retrospectively validated open-source software for MS follow-up. We used prospectively acquired data, accounting for several potential demographic and clinical confounders. We sought to demonstrate the efficacy of semiautomated imaging when implemented in a real-world clinical setting and to share our experience integrating one such software in our daily practice. We used a permissive research design to mitigate any distortion created by a research setting. Department staff were given an in-service brief and informal overview of how the software worked and of prior validation; then radiologists were left to work as they would outside a trial environment. There was no pressure to use the software, to pay attention to or record their usage pattern, or to focus on time. We thought that any such intervention would potentially mislead what another department could expect if they were to implement this sort of assistive software.
More than 800 of 906 new hospital scans had VT-assisted series automatically generated and available to the reporting radiologist in real-time, with only a few minutes elapsing before the color-mapped image series became available on the PACS for reporting. Use of these series was associated with >4-fold higher odds of new lesion detection compared with scans reported using CSSC. While <10% of studies using CSSC showed disease progression, it was reported in >20% of those using software assistance. In a poststudy survey, almost all responding radiologists and radiology trainees used VT and thought that it cut down on their reporting times for MS comparison studies. The results observed in this prospective study of >800 scans demonstrate an effect equivalent to those seen in our earlier retrospective studies. Similar demographic data were seen across both study groups and were specifically included in our analysis model to limit the amount of confounding. Of the variables assessed (use of the software; age; sex; disease state and time course; reporting radiologist; and annualized rate of scanning), the software was the sole variable associated with a difference in lesion detection.
MR imaging remains the most widely used and reliable surrogate marker to monitor disease activity in patients in the real-world clinical setting. 5,6,8 Physical and psychological disabilities seen in MS are associated with the number of demyelinating lesions, some of which can be visualized on neuroimaging with FLAIR and T2-weighted sequences. [20][21][22] Recently, the importance of accurate interval MR imaging activity has become even greater because postcontrast imaging is no longer recommended for routine follow-up, largely due to concerns about the presence of residual contrast in the brain after repeat exposure to gadolinium-based agents. 23,24 Semiautomated imaging represents a growing field of MS and radiology research, with methods ranging from assisted lesion assessment to brain volumetric analysis. 6,19,25 Similar growth is seen with an extension of computer-assisted detection called "radiomics," which converts images to minable data for deep learning. 26 Image coregistration is a crucial component of traditional MR imaging comparison. Although image coregistration is routinely performed on a PACS, minor changes in alignment are inevitable without reslicing. [27][28][29][30] Thus, even apart from the color-change maps, the automated reslicing and coregistration provided by the software rapidly and effectively offer an established means of optimizing image comparison and assessment. After incorporating VT-assisted imaging in our hospital's daily MR imaging reporting activities, our findings are in line with other smaller prospective studies that have shown an absolute increase of 13% (22% relative increase) in new MS lesion detection using similar semiautomated software. 19 Perhaps more important, implementation of this software in our department was largely seamless and did not appreciably increase transfer times to PACS or the data storage burden.
Similarly, a post hoc survey of staff in our department showed an overwhelmingly positive response to the integration of the software in our daily practice.

Limitations
The main limitation in this study is the relatively smaller number of scans in the CSSC group. Because our PACS is programmed to automatically process new images with the software whenever possible, the number of unaided scans was limited to the days when VT was unavailable, such as when servers were undergoing maintenance. These factors contributing to the group size discrepancy were random and were not associated with the probability of MR imaging activity. This discrepancy was also further addressed by the statistical design of our analysis.
For those wishing to implement a similar system in their practice, the mentioned downtime could be addressed by having a dedicated server for the software. Series description and naming in PACS was another potential source of exclusion from automated VisTarsier integration. Our protocols included 3D-FLAIR sequence series that were all named "FLAIR 3D Sag"; however, at times this name could be changed manually, resulting in a matching study not being found. This could be addressed by raising awareness of the importance of standardized series naming. Unfortunately, the reason that a given scan from the CSSC cohort did not meet the automated criteria was not recorded prospectively, and it could not be reconstructed retrospectively.
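Beyond standardized naming, a site could also make the descriptor comparison itself more tolerant of manual edits. The sketch below is one hypothetical approach and is not part of VisTarsier: it compares series descriptions as case-insensitive sets of tokens after expanding a small, site-specific abbreviation table (`SYNONYMS`, invented here for illustration). Looser matching raises the risk of pairing the wrong series, so any such rule would need local validation.

```python
import re

# Hypothetical abbreviation table; a site would adapt it to its own naming habits.
SYNONYMS = {"sag": "sagittal", "ax": "axial", "cor": "coronal"}


def normalize_description(description: str) -> frozenset:
    """Lowercase, split on non-alphanumeric characters, and expand abbreviations."""
    tokens = re.split(r"[^a-z0-9]+", description.lower())
    return frozenset(SYNONYMS.get(token, token) for token in tokens if token)


def descriptions_match(a: str, b: str) -> bool:
    """Treat two series descriptions as equal if their normalized token sets agree."""
    return normalize_description(a) == normalize_description(b)


print(descriptions_match("FLAIR 3D Sag", "flair_sagittal 3d"))
print(descriptions_match("FLAIR 3D Sag", "T2 axial"))
```

With this rule, a manually edited label that only reorders words, changes case, or expands "Sag" would still be matched, whereas a different sequence name would not.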
Although a survey sent to all reporting doctors within the radiology department yielded highly positive results in terms of ease of use and time-saving capabilities of the software, we did not track reporting times as in previous retrospective studies. Unfortunately, these data were not retrospectively mineable on our department's PACS. The qualitative nature of these data thus makes them an adjunct, rather than a statistically rigorous end point.
Last, the inherent limitations of a pragmatic real-world prospective observational cohort study mean that we cannot explicitly control how the studies are read by radiologists, and we do not have the ability to generate inter- or intrareader descriptive statistics. These measures have, however, previously been established in retrospective validation studies. 8 This limitation is, in our opinion, offset by being able to describe the effect of implementing VisTarsier in a routine clinical environment, which is more likely to be of relevance to other institutions.

CONCLUSIONS
Semiautomated lesion-detection software improves the standard of reporting of new or enlarging T2/FLAIR hyperintense lesions in patients with multiple sclerosis. VisTarsier has improved reporting standards in cerebral MR imaging from patients with MS using standardized volumetric sequences and uniform scanning protocols. Most important, implementing this software in our practice's PACS was relatively seamless and very well received by staff. Future research should validate its capacity to improve reporting in a more heterogeneous sample of images. It should also seek to measure reporting times behind the scenes as a surrogate for workflow efficiency and to demonstrate a change in disease management as a marker of clinical relevance. Computer-aided detection systems promise to improve radiologists' ability to detect disease activity in patients with MS.