Effects of Acquisition Parameter Modifications and Field Strength on the Reproducibility of Brain Perfusion Measurements Using Arterial Spin-Labeling

BACKGROUND AND PURPOSE: Although the added diagnostic value of arterial spin-labeling is shown in various cerebral pathologies, its use in clinical practice is limited. To encourage clinical adoption of ASL, we investigated the reproducibility of CBF measurements and the effects of variations in acquisition parameters compared to the recommended ASL implementation.
MATERIALS AND METHODS: Thirty-four volunteers (mean age, 57.8 ± 17.0 years; range, 22-80 years) underwent two separate sessions (1.5T and 3T scanners from a single vendor) using a 15-channel head coil. Both sessions contained repeated 3D and 2D pseudocontinuous arterial spin-labeling scans using vendor-recommended acquisition parameters (recommendation paper-based), followed by three 3D pseudocontinuous arterial spin-labeling scans, two with postlabeling delays of 1600 and 2000 ms and one with increased spatial resolution. All scans were single postlabeling delay. Intrasession (identical acquisitions, scanned five minutes apart) and intersession (first 2D and 3D acquisitions of two sessions) reproducibility was examined as well as the effect of parameter variations on CBF.
RESULTS: Intrasession CBF reproducibility was similar across image readouts and field strengths (within-subject coefficient of variation between 4.0% and 6.7%). Intersession within-subject coefficient of variation ranged from 6.6% to 14.8%. At 3T, the 3D acquisition with a higher spatial resolution resulted in less mixing of GM and WM signal, thus decreasing the bias in GM CBF between the 2D and 3D acquisitions (ΔCBF = 2.49 mL/100 g/min [P < .001]). Postlabeling delay variations caused a modest bias.
CONCLUSIONS: Arterial spin-labeling imaging is reproducible at both field strengths, and the reproducibility is not significantly correlated with age. Furthermore, 3T tolerates more acquisition parameter variations and allows more extensive optimizations so that 3D and 2D acquisitions can be compared.

Arterial spin-labeling (ASL) MR imaging has the potential to be a cost-effective and safe alternative to contrast agent-based perfusion imaging. 1 However, despite its proved clinical value, 2-4 technologic improvements, [5][6][7] and consensus recommendation on the implementation, 8 clinical use of ASL remains limited to date. 9 ASL is also regularly used in clinical and pharmaceutical trials because in these cases, the preference is to avoid the use of gadolinium-based contrast agents. In such trials, it remains a challenge to harmonize imaging protocols over different MR imaging systems, which use different readout types and ASL labeling and imaging parameters. [10][11][12] Overcoming these challenges is especially important in multicenter trials as well as in longitudinal studies, in which scanner hardware or software updates and subsequent sequence changes are common.
Several practical limitations hamper the adoption of ASL to image CBF in clinical practice. First, the recommended use of ASL is at a field strength of 3T. 8 However, if ASL is used as an alternative to contrast agent perfusion MR imaging to reduce the duration and cost of MR imaging examinations, it would also be preferably conducted at 1.5T, a field strength that is more widely available. Another limiting factor in clinical practice is that the sensitivity of CBF values to changes in the acquisition parameters is not well-understood.
Despite the relatively large body of literature on the precision of ASL, in which studies have shown that ASL has a similar reproducibility to PET 13 and that whole-brain (WB) reproducibility is comparable among various labeling and readout strategies, 10,14,15 investigators still question the effects of acquisition parameter changes on the precision of ASL with respect to the recommended implementation as described by the ASL consensus paper. 8 Other challenges for clinical adoption may be that most ASL reproducibility studies were conducted in young participants and may not be applicable to the elderly population. 16 Moreover, many studies apply partial volume correction (PVC) to mathematically correct for mixing of GM and WM perfusion, which is inherently present in ASL data due to the relatively large voxels and long readout durations in 3D acquisitions specifically. 17 Studies are often inconsistent on the corrections applied when reporting CBF values, complicating comparison among studies.
A final challenge for ASL-based perfusion imaging is that standard ASL acquisitions aim to quantify CBF from a single postlabeling delay (PLD) measurement without measuring whether the labeled blood arrived in the tissue. Therefore, it might be unclear whether a low ASL signal is due to decreased perfusion or a delay in arrival. Recently, a novel ASL parameter, which can be derived from single-PLD CBF maps, the spatial coefficient of variation (CoV), was introduced as a proxy of arterial transit time. 18 Correlation of spatial CoV with clinical parameters was shown in recent studies, [19][20][21] but the reproducibility of this parameter has not yet been reported.
To address the practical issues mentioned above and to encourage further adoption of ASL in clinical practice, this study aims to extend the knowledge on the precision of CBF and spatial CoV measurements. Specifically, this work focuses on studying ASL reproducibility with respect to three common sources of ASL signal variation: 1) age: studying healthy subjects over a large range of adult ages, focusing mostly on older adults because reproducibility studies in this age group are lacking in the literature; 2) field strength and scan parameter variations: assessing the influence of small acquisition parameter variations 8 at both 3T and 1.5T; and 3) partial volume correction: showing the effect of partial volume correction on deriving pure GM CBF with different scan parameter variations and imaging field strengths.

Participants
In this study, 34 healthy participants (20 men, 14 women; mean age, 57.8 ± 17.0 years) were included. Detailed information about the distribution of participants over different field strengths is given in Online Table 1. This technical study with human participants was performed under a waiver of institutional review board approval by the Medical Research Ethics Committees United (Nieuwegein, the Netherlands). All participants provided written informed consent and received remuneration for their participation. All experiments were performed in accordance with the Declaration of Helsinki guidelines. Volunteers included in this study participate regularly in MR imaging experiments and are, therefore, trained to lie still for longer periods of time.
On arrival, participants were instructed to refrain from intake of caffeine and smoking during the whole experiment. All participants underwent two scan sessions of 50 minutes with approximately 15 minutes of rest between, during which they were taken out of the scanner. Eight participants were scanned twice at 1.5T, 12 participants were scanned at 1.5T and 3T in a randomized order, and 14 participants were scanned twice at 3T. To describe the precision of the CBF and spatial CoV measurements, we define here intrasession repeatability, considering two within-session repeated measurements, and intersession reproducibility, considering two between-session repeated measurements. Experiments were performed on the following scanner types: 1.5T Ingenia and IngeniaCx and 3T Achieva, Ingenia, and IngeniaCx (Philips Healthcare, Best, the Netherlands). All acquisitions were performed using the standard 15-channel head coil on MR imaging scanner software, Version R5.3 and R5.4, with identical implementation of the ASL sequence. All scanners were in close proximity to one another.

Image Acquisition
All sessions started with a 3D-T1w scan followed by two identical pseudocontinuous ASL (pCASL) scans with a 3D gradient spin-echo (GraSE) readout and two identical pCASL scans with a 2D-EPI readout (in an interleaved fashion with an approximately 5-minute gap between identical scans) for intrasession repeatability assessment. Next, we acquired two 3D-GraSE pCASL scans with different PLDs: 1600 and 2000 ms. Last, a 3D-GraSE pCASL scan with a higher spatial resolution was acquired (in-plane resolution of 2.75 × 2.75 mm² instead of 3.75 × 3.75 mm²). In all pCASL examinations, a labeling duration of 1800 ms was used, as well as a 4-pulse background-suppression scheme and an integrated M0 scan. Further acquisition parameters are listed in Online Table 2. The initial 3D and 2D ASL scans (sequence numbers 2-5) were obtained from the vendor's imaging database and in agreement with the consensus recommendation on ASL implementation. 8 The labeling plane was positioned 9 cm below the anterior/posterior commissure plane. A phase-contrast angiography survey scan was performed to check the position of the labeling plane, and if required, the distance was adapted to avoid the labeling plane overlapping the siphons. The full sequence protocol was repeated after a 15-minute break for intersession reproducibility assessment.

Postprocessing
During reconstruction on the scanner, all ASL scans were quantified according to the single-compartment model recommended in the ASL consensus review. 8 This includes generation of M0, label, and control images using standard image corrections, including coil sensitivity corrections. Subsequently, a voxel-based calculation was performed to derive the CBF values, as explained in the ASL recommendation paper, using the same assumptions for the T1 of blood (1650 ms at 3T, 1350 ms at 1.5T), labeling efficiency (α = 0.85), and the blood-brain partition coefficient (λ = 0.9 mL/g). 8 Further image processing of the CBF maps was performed off-line in ExploreASL (https://sites.google.com/view/exploreasl), which is detailed in depth elsewhere. 22 Briefly, first, the 3D-T1w images were segmented into GM, WM, and CSF and registered to standard space (Montreal Neurological Institute template space) with a voxel size of 1.5 × 1.5 × 1.5 mm³ using CAT12 (C. Gaser; Structural Brain Mapping Group, Jena University Hospital). 23 Next, the quantified ASL scans were registered to the 3D-T1w scans by registering the perfusion-weighted maps, the mean label control difference, to the partial GM maps obtained from T1w segmentation using a rigid body registration, with a normalized mutual information criterion. After transforming the ASL images to standard space, the partial GM maps were smoothed to the effective spatial resolution of ASL. The average WB GM CBF was calculated as the average CBF of voxels with >70% partial GM content from the partial GM maps in the ASL resolution. WB CBF was calculated by combining the white matter and GM segmentations and thresholding partial GM + partial WM > 70%. PVC was performed using linear regression on a 5 × 5 kernel as described by Asllani et al 17 on GM maps adjusted to the effective spatial resolution to correct for both GM and WM CBF mixing and for differences in effective spatial resolution. 24 In this study, we report GM CBF, without PVC, unless otherwise mentioned.
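The voxel-wise quantification step described above can be illustrated with a minimal sketch. It assumes the standard single-PLD pCASL equation of the consensus recommendation, CBF = 6000 · λ · (control − label) · exp(PLD/T1_blood) / (2 · α · T1_blood · M0 · (1 − exp(−τ/T1_blood))), with the parameter values stated in this section. The function name `quantify_cbf` and its argument names are hypothetical, and the scanner's own reconstruction may differ in detail (for example, in how background suppression is handled).

```python
import math


def quantify_cbf(control, label, m0, pld_ms, label_dur_ms=1800.0,
                 t1_blood_ms=1650.0, alpha=0.85, lam=0.9):
    """Single-compartment pCASL CBF for one voxel, in mL/100 g/min.

    Hypothetical helper for illustration only. Defaults follow the values
    given in the text: labeling duration 1800 ms, T1 of blood 1650 ms (3T;
    use 1350 ms for 1.5T), labeling efficiency 0.85, partition coefficient
    0.9 mL/g.
    """
    delta_m = control - label                      # perfusion-weighted signal
    t1_blood_s = t1_blood_ms / 1000.0              # seconds, for the unit conversion
    # Compensate for T1 decay of the label during the postlabeling delay.
    numerator = 6000.0 * lam * delta_m * math.exp(pld_ms / t1_blood_ms)
    # Account for label accumulated (and decaying) during the labeling period.
    denominator = (2.0 * alpha * t1_blood_s * m0
                   * (1.0 - math.exp(-label_dur_ms / t1_blood_ms)))
    return numerator / denominator


# Example: a 1% control-label difference relative to M0, PLD 1800 ms at 3T.
cbf_3t = quantify_cbf(control=1.01, label=1.00, m0=1.0, pld_ms=1800.0)
# The same signal quantified with the 1.5T blood T1.
cbf_15t = quantify_cbf(control=1.01, label=1.00, m0=1.0, pld_ms=1800.0,
                       t1_blood_ms=1350.0)
```

Because the factor 6000 converts mL/g/s to mL/100 g/min, T1 of blood enters the denominator in seconds, while the exponents only need consistent units.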
The spatial CoV was calculated in GM using the CBF maps before PVC: 18
$$\text{Spatial CoV} = \frac{\text{SD}_{\text{ROI}}}{\text{Mean}_{\text{ROI}}} \times 100\% \quad (1)$$
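As a minimal sketch of this calculation, the following computes the spatial CoV over the voxels of a GM region. The function names, the flat-list data layout, and the reuse of the 70% partial GM threshold as the ROI definition are assumptions for illustration; the use of the sample SD (n − 1 denominator) is also an assumption, because the text does not specify it.

```python
import statistics


def spatial_cov(cbf_values):
    """Spatial CoV (%) = SD over the ROI / mean over the ROI x 100."""
    mean_roi = statistics.fmean(cbf_values)
    sd_roi = statistics.stdev(cbf_values)   # sample SD (assumption)
    return sd_roi / mean_roi * 100.0


def gm_spatial_cov(cbf_values, partial_gm, threshold=0.7):
    """Spatial CoV restricted to voxels whose partial GM exceeds the threshold.

    `cbf_values` and `partial_gm` are flat, voxel-aligned sequences
    (hypothetical layout); `threshold` mirrors the >70% GM criterion.
    """
    roi = [c for c, g in zip(cbf_values, partial_gm) if g > threshold]
    return spatial_cov(roi)


# Toy example: three GM voxels and one WM-dominated voxel that is masked out.
example = gm_spatial_cov([40.0, 50.0, 60.0, 10.0], [0.9, 0.8, 0.75, 0.2])
```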

Statistical Analysis
To assess the effects of field strength on reproducibility, we divided participants into groups of 1.5T (n = 14) and 3T (n = 20) for intrasession and 1.5T-1.5T (n = 6), 1.5T-3T (n = 14), and 3T-3T (n = 14) for intersession comparison. For intrasession repeatability, repeated scans of the first scan sessions were used. For intersession reproducibility, the first 3D 1800 ms and 2D 1800 ms scans of both sessions were used.
CBF and spatial CoV data were tested for normality using a Shapiro-Wilk test. Scan types were compared using general linear models: a repeated measures ANOVA with a post hoc Tukey multiple comparison test for normally distributed data and a Friedman test with a post hoc Dunn multiple comparison test for non-normally distributed data.
Repeatability and reproducibility assessment were performed on GM CBF values without PVC. First, the differences in CBF (ΔCBF) and spatial CoV (Δspatial CoV) between intrasession and intersession repeated measurements were calculated for each participant individually. Intrasession differences were calculated as 3D 1800 ms run 1 − 3D 1800 ms run 2; intersession differences were calculated as 3D 1800 ms run 1 session 1 − 3D 1800 ms run 1 session 2. To investigate whether there was a statistically significant correlation between the participants' age and ΔCBF and Δspatial CoV, we performed a Spearman rank correlation test. Next, as a measure for variability at a group level, the within-subject coefficient of variation (wsCV) was calculated as the ratio of the SD of the differences between the repeated measurements over the mean of the repeated measurements:
$$\text{wsCV} = \frac{\text{SD}_{\Delta}}{\text{Mean}} \times 100\% \quad (2)$$
where SD_Δ is the SD of the differences between the repeated measurements and Mean is the mean of the repeated measurements.
The effect of variations in acquisition parameters was evaluated using the pCASL scans of the first session of all volunteers. First, the mean pair-wise difference, or bias, in observed GM CBF between the recommended 3D acquisitions and acquisitions with parameter variations was calculated by subtracting the CBF value from the images with deviating settings from the recommended 3D acquisition. To test whether the bias was statistically significantly different from zero, we performed a 1-sample t test. To examine whether the variance across the pair-wise differences between the recommended and deviating ASL acquisitions was significantly different compared with the variance across the pair-wise differences between repeated consensus paper acquisitions, we used the Pitman-Morgan test, a test for equal variance taking repeated measures into account. Statistical significance was defined as P < .05.
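The group-level measures described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis code of the study: the function names are hypothetical, the wsCV follows the verbal definition in the text (SD of the paired differences over the mean of all repeated measurements), and the Pitman-Morgan test is implemented in its usual form, in which the equality of two paired variances is tested through the correlation between the pair-wise sums and differences. Only the t statistic (n − 2 degrees of freedom) is computed; obtaining a P value would additionally require the t distribution.

```python
import math
import statistics


def ws_cv(first, second):
    """wsCV (%): SD of paired differences / mean of all measurements x 100."""
    diffs = [a - b for a, b in zip(first, second)]
    grand_mean = statistics.fmean(list(first) + list(second))
    return statistics.stdev(diffs) / grand_mean * 100.0


def bias(reference, deviating):
    """Mean pair-wise difference: reference minus deviating acquisition."""
    return statistics.fmean([r - d for r, d in zip(reference, deviating)])


def pearson_r(u, v):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_u, mean_v = statistics.fmean(u), statistics.fmean(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    ss_u = sum((a - mean_u) ** 2 for a in u)
    ss_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(ss_u * ss_v)


def pitman_morgan_t(x, y):
    """t statistic (n - 2 df) for H0: var(x) == var(y), with x and y paired.

    Here x could hold each participant's difference between the recommended
    and a deviating acquisition, and y the same participant's difference
    between two repeated recommended acquisitions.
    """
    sums = [a + b for a, b in zip(x, y)]
    diffs = [a - b for a, b in zip(x, y)]
    r = pearson_r(sums, diffs)
    n = len(x)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Toy GM CBF values (mL/100 g/min) for four participants, two runs each.
run1 = [50.0, 60.0, 70.0, 80.0]
run2 = [52.0, 57.0, 71.0, 78.0]
toy_wscv = ws_cv(run1, run2)
toy_bias = bias(run1, run2)
```

A positive t statistic indicates that the first set of differences has the larger variance, because the covariance of the sums and differences equals var(x) − var(y).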

RESULTS
Visual inspection showed a large variation in global mean perfusion values among the participants, shown in the Online Figure. Local values of cortical perfusion ranged from 100 mL/100 g/min in some participants to relatively low values of about 25 mL/100 g/min in other participants.
The bias in GM spatial CoV, together with the 95% limits of agreement, between repeated 3D and 2D scans, was also calculated (Table 1). None of the observed biases in the GM spatial CoV were significantly different from zero.
No statistically significant correlation between ΔCBF and age or between Δspatial CoV and age was observed using the Spearman rank correlation test (Table 2).
The wsCV of GM CBF and spatial CoV was calculated for both field strengths and all field strength combinations (Table 3). After PVC, the wsCV of GM CBF was similar (data not shown). Intrasession variations were similar at both field strengths. Intersession CBF variability was lowest when scanning twice at 3T. In contrast, intersession spatial CoV variability was lowest when scanning twice at 1.5T.

Scan Parameter Variations
Averaged ASL scans showed that both 2D 1800 ms and 3D 1800 ms high-resolution scans had similar anatomic detail at 3T but reduced SNR at 1.5T (Fig 2). The high-resolution acquisition at 1.5T was excluded from further analysis because the image quality of individual scans was insufficient. The other 3D-GraSE images were of good quality at both field strengths.
CBF data from several acquisitions showed a non-normal distribution and required nonparametric testing (Online Table 3). A trend of decreasing CBF with increasing PLD was observed (Fig 3). This resulted in statistically significant differences between scans with PLDs of 1600 and 2000 ms (P < .05). Statistically significant differences in measured CBF were also observed between acquisitions with different readout types (Fig 4). Similar to Fig 3, the differences observed among scans are decreased when comparing WB instead of GM CBF. At 1.5T, the 2D acquisition resulted in a significantly higher spatial CoV compared with the 3D acquisitions (Fig 5). At 3T, the high-resolution 3D acquisition resulted in significantly higher spatial CoV values compared with the other 3D acquisitions as well.
Acquisition parameter variations resulted in statistically significant pair-wise differences in GM CBF (P < .05) (Table 4). Nevertheless, for most combinations, the variance across the observed pair-wise differences was not significantly different compared with the variance across the pair-wise differences between repeated recommendation ASL acquisitions. Moreover, we observed an increase in pair-wise differences among different pCASL acquisitions when scanning at 1.5T compared with 3T. At 3T, the 2D-EPI acquisition showed the best agreement with the 3D 1800 ms high-resolution acquisition.

DISCUSSION
In this study, we have shown that intrasession repeatability and intersession reproducibility of CBF measurements are similar at 3T and 1.5T and do not show a statistically significant correlation with age. Additionally, we observed that variations in image readout (2D versus 3D) do not have a significant effect on the reproducibility of the CBF measurements. These findings are in agreement with previous reproducibility studies reporting intrasession reproducibility of 3.5%-5.5% 14 and intersession reproducibility of 10.8%-11.3% 10 and are comparable in precision with respect to ¹⁵O-H₂O PET. 13 Results are also in line with studies on reproducibility using different labeling and readout techniques 10,14,15,25 and the effect of PLD on CBF reproducibility. 26
Although reproducibility was similar using different pCASL acquisition parameters and different image readouts, the average CBF values did differ. We observed that differences in measured CBF between 2D and 3D readouts were more pronounced in GM compared with WB CBF. This finding could be explained by the difference in effective resolution between 2D and 3D readouts. Reduced effective resolution results in more severe partial volume effects and hence affects GM CBF, due to GM and WM CBF mixing, more than WB CBF, in which mixing has a limited effect on the mean. CBF values were also affected by differences in PLD. We suspect that these differences are due to the single-compartment model that was used for quantification, which does not take into account that the duration that the label decays with the tissue T1 differs among the 1600-, 1800-, and 2000-ms PLD sequences. More advanced multicompartment or model-free approaches could account for the T1-decay in blood as well as in tissue. [27][28][29] Using an arterial blood T1 recently determined by Li et al, 30 in 2017, we have simulated that a dual compartment model would account for 68% of the observed difference between the shortest and longest PLD scans (data not shown). The remaining difference could be explained by insufficient delivery of labeled blood in our short PLD data. However, this effect would result in higher spatial CoV, which was not observed in our data.
We have shown that the intrasession repeatability and intersession reproducibility of the spatial CoV, just like CBF, do not show a statistically significant relationship with age. However, spatial CoV values were higher at 1.5T compared with 3T. This finding could be due to the shorter blood T1, which leads to less signal in distal compared with proximal areas, increasing the spatial CoV. We investigated this effect by calculating the spatial CoV for each imaging slice individually at both field strengths for a subset of our data. Only at 1.5T was a small upward trend in spatial CoV observed at higher slices (data not shown). This might indicate that at 1.5T, distal parts of the brain show a higher spatial CoV due to faster T1 relaxation at 1.5T compared with 3T.
The higher mean spatial CoV values subsequently lead to lower intersession wsCV when scanning twice at 1.5T. Moreover, we did observe an increased spatial CoV in scans with a higher effective resolution, which can be explained by noisier acquisitions and/or higher contrast in these scans. A first investigation of this effect showed that scans with more temporal fluctuations of the ASL signal resulted in higher spatial CoV values (data not shown). Therefore, we conclude that both CBF and spatial CoV are affected by the effective resolution of the ASL acquisition. While the existing partial volume correction methods for GM CBF can deal with this issue, 17,24 a similar method that takes into account the differences in GM distribution changes with differences in effective resolution needs to be proposed and validated to be able to account for the resolution-related issues in spatial CoV calculation.
We observed that scan parameter variations compared with the recommended ASL parameters resulted in significant changes in the observed CBF values. Nevertheless, we consider the observed differences in CBF due to changes in PLD acceptable because the maximum bias between our 3D 1800 ms and 3D 1600 ms acquisitions at 1.5T was <10% of the mean CBF. Moreover, the variance across the pair-wise difference was only once significantly affected, for the 3D 1600 ms acquisition at 3T, indicating that these scan parameter variations introduce only an offset in the measured CBF and not a different distribution around the mean CBF value. Comparing standard 3D and 2D ASL acquisitions resulted in a greater bias. At 3T, however, the 3D-GraSE acquisition can be optimized to match the effective resolution of the 2D-EPI acquisition. This process results in less mixing of GM and WM signal, reducing the bias in measured CBF. This effect was already hypothesized by Mutsaerts et al, in 2014, based on previous work comparing 3D and 2D ASL scans. 10,31,32 Overall, the bias between recommended and deviating pCASL acquisitions gets more pronounced at 1.5T compared with 3T. This difference could be explained by the decreased SNR and increased loss of label during the PLD, due to the shorter relaxation time of blood at a lower field strength. Therefore, acquisitions at 1.5T should be compared with great care and only when the readout type is not changed.
This study has some limitations. Our subjects frequently participate in MR imaging examinations and therefore are trained to lie still for a long time. While this allows us to study the reproducibility of the sequence in the strictly technical sense, the reproducibility in clinical practice might be affected by patient movement. Although motion correction and outlier rejection are typically used by ASL processing software, we expect that motion during the acquisition can still reduce SNR and lead to slight blurring due to imperfect interpolation in the motion correction. This might lead to a decrease of reproducibility, possibly affecting the 3D sequences more because each average of a 3D sequence is typically acquired over multiple shots and does not allow simple motion correction as in 2D readouts. We included relatively few young participants, but because the between-subject variability is usually lower in younger participants and our reproducibility in young volunteers was in agreement with that in previous studies, the effect is probably limited. Furthermore, for practical reasons, not every combination of scanning at 3T and 1.5T systems was performed in all volunteers, decreasing the available sample size for some combinations. In the scan parameter settings, some small differences such as shot length and through-plane resolution between 1.5T and 3T sequences remained. Although this might have slightly influenced comparisons between field strengths, this was necessary to make the sequences comparable while maintaining acceptable image quality.

CONCLUSIONS
With this work, we provide insights that can help ASL acquisition parameter optimization in the setting of clinical practice as well as in clinical trials in which MR imaging systems with different ASL applications are used. Our data show that ASL imaging is well reproducible at 3T and 1.5T and that the differences between repeated measurements show no statistically significant correlation with age. It should be noted that scanning at 3T offers more tolerance for scan parameter variations compared with 1.5T and allows more extensive acquisition parameter optimization, resulting in good agreement among ASL acquisitions. We advise that clinical comparisons at 1.5T should be made only on the basis of scans with identical acquisition settings. With these precautions taken into account, our findings advocate the use of ASL as a cost-effective and safe alternative to contrast agent-based perfusion at both field strengths.
Note: HR indicates high-resolution. a Differences between repeated identical pCASL acquisitions with consensus paper parameter settings. b A bias that is significantly different from zero (P < .05). c A significantly different variance across the observed pair-wise differences between acquisitions compared with the variance across the pair-wise differences between repeated consensus paper parameter settings (P < .05).