The first step for neuroimaging data analysis: DICOM to NIfTI conversion

https://doi.org/10.1016/j.jneumeth.2016.03.001

Highlights

  • Introduce conversion tools for different vendors.

  • Explain conversion basics.

  • Present methods to detect and correct problems.

Abstract

Background

Clinical imaging data are typically stored and transferred in the DICOM format, whereas the NIfTI format has been widely adopted by scientists in the neuroimaging community. Therefore, a vital initial step in processing the data is to convert images from the complicated DICOM format to the much simpler NIfTI format. While there are a number of tools that usually handle DICOM to NIfTI conversion seamlessly, some variations can disrupt this process.

New method

We provide some insight into the challenges faced during image conversion. First, different manufacturers implement the DICOM format differently, which complicates conversion. Second, different modalities and sub-modalities may need special treatment during conversion. Lastly, image transfer and archiving can also affect DICOM conversion.

Results

We present results in several error-prone domains, including the slice order for functional imaging, the phase encoding direction for distortion correction, the effect of diffusion gradient direction, and the effect of gantry tilt correction for some imaging modalities.

Comparison with existing methods

Conversion tools are often designed for a specific manufacturer or modality. The tools and insight we present here are aimed at a range of manufacturers and modalities.

Conclusions

Image conversion is complicated by the variation among images. An understanding of the conversion basics can be helpful for identifying the source of an error. Here we provide users with simple methods for detecting and correcting problems. This also serves as an overview for developers who wish to either develop their own tools or adapt the open source tools created by the authors.

Introduction

Many of the popular tools used for scientific image processing, analysis and visualization require images to be stored in the NIfTI file format, whereas scanners used to acquire these images usually export data in the DICOM format. These two formats are each suited for their specific niche: DICOM is comprehensive and verbose, while NIfTI is simple and easy to support. Therefore, a common initial step in any neuroimaging analysis is to convert the images from DICOM to NIfTI format. Usually, a user can choose from one of many tools that seamlessly transform their data. However, this apparent simplicity belies the numerous challenges involved in the process. Specifically, the DICOM standard is particularly complicated, and different scanner manufacturers extend the DICOM standard in a variety of ways, often resulting in duplication of information and incompatibilities with software designed to work with only one particular subset of DICOM. Therefore, while one conversion tool may work for many images, it may fail for others. Our objective is to describe some of the general assumptions of these conversion tools and the situations where they may fail. Our primary aim is to inform users about how to identify these errors, and to provide suggestions for coping with these situations. We also provide an overview for developers who wish to either develop their own tools or adapt the open source tools created by the authors.

Modern medical imaging devices typically store data in the DICOM image format. The Digital Imaging and Communications in Medicine (DICOM) standard evolved from the American College of Radiology (ACR) and National Electrical Manufacturers Association (NEMA) standards, which originated in the 1980s (http://en.wikipedia.org/wiki/DICOM). The DICOM standard is complex, comprehensive (describing data transfer as well as compression), and evolving. For example, the 2011 edition of this standard now spans 4902 printed pages (http://dicom.nema.org/), which does not include specifics on details such as image compression. Annex A of Part 6 describes 37 different forms of transfer syntaxes (schemes for encoding image data). Further, section C.7.3.1.1.1 refers to 123 distinct modalities (some retired), and transfers of DICOM data often include DICOM files for modalities such as scanned patient notes, audio files and other forms of embedded data. While our focus is on radiological modalities, when converting DICOM datasets to NIfTI the software needs to separate the neuroimaging DICOM files from any commingled DICOM files that store information from other modalities.

DICOM describes a tag-based format, where each object in the file is encapsulated in a tag that describes the purpose and size of that chunk of data. Typically, the earlier objects store information about the participant, the device, the imaging sequence, and the image specifics (e.g. dimensions of the image), while the final object encodes the image data itself. Each object contains, in order, a Tag, an optional Value Representation (VR), a Length and the Value of the object itself (Table 1).

A Tag is a two-number code (e.g. “0028, 0010”) defining the meaning of the element. What the element means exactly is defined by a so-called DICOM dictionary. The DICOM standard defines a dictionary for a large number of variables, referred to as the public tags. In addition, the standard also allows manufacturers to define their own tags, so-called private tags. The first number of the two-number code of public tags is always even, whereas it is odd for the vendor-specific private tags. This ensures that private tags will not conflict with public tags, but different vendors may use the same block of private tags for different purposes.

The optional second component in an object, the VR, is a two-character code defining the data type of the value in the object. For example, “US” refers to unsigned short integer(s), “DS” refers to decimal string, etc. Whether or not the VR field is present in the file depends on the transfer syntax, which may be either “implicit” or “explicit”. The VR is encoded in the file for explicit transfer syntaxes, whereas for the implicit representation the VR needs to be determined from the tag using a dictionary. For implicitly represented private tags, the VR may not be available (or at least be very difficult to track down), which makes the data very difficult to interpret. Although most modern DICOM files use explicit VR, the DICOM default transfer syntax uses implicit VR. A further complication is whether an explicit transfer syntax is big- or little-endian.

The next component (Length) defines the number of bytes of the stored value, while the final component encodes the Value itself. The Length provides the information needed to determine where the next object in the file begins.
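To make this object structure concrete, the following sketch walks the data elements of a DICOM file using the pydicom library and prints each element's Tag, VR, and Value, flagging private (odd-group) tags. The file name 'example.dcm' is a placeholder; any single-slice DICOM file should work.

  import pydicom

  # 'example.dcm' is a placeholder file name.
  ds = pydicom.dcmread('example.dcm', stop_before_pixels=True)

  for elem in ds:
      kind = 'private' if elem.tag.is_private else 'public'
      # The VR shown here was either read from the file (explicit transfer syntax)
      # or looked up in the dictionary (implicit transfer syntax); the Length is
      # consumed during parsing to locate the next element.
      print(f'({elem.tag.group:04X},{elem.tag.element:04X}) VR={elem.VR} '
            f'{kind:7} {elem.name}: {str(elem.value)[:40]}')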

There can be many objects in a DICOM file (typically about a hundred, but there can be three orders of magnitude more for multi-frame DICOM). Table 1 shows several important objects that are usually used for DICOM to NIfTI conversion.

The Transfer Syntax UID encodes important information about how to decipher the rest of the DICOM file. In particular, it defines the transfer syntax, as well as information about how the image data may be compressed. The Rows, Columns, and possibly other elements define the dimensions of the Pixel Data. Usually, each DICOM file encodes a single 2D slice, although there are exceptions to this general rule. The Image Position and Image Orientation, given in patient coordinates, define the slice location and orientation in scanner space. The Pixel Spacing and Spacing Between Slices (or Slice Thickness) define the voxel size in three dimensions. The Series and Instance Numbers, and possibly other elements, help sort the DICOM files. Sometimes some of these objects are missing from the DICOM files, so heuristic approaches are needed to determine which slices belong together in the same 3D volume.
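As a rough illustration (not a full converter), the sketch below uses pydicom to pull out the objects listed above. The file name is a placeholder, and fields are accessed defensively because, as noted, some may be missing.

  import pydicom

  ds = pydicom.dcmread('example.dcm')            # placeholder file name

  ts = ds.file_meta.TransferSyntaxUID
  print('Transfer Syntax UID :', ts, '(implicit VR)' if ts.is_implicit_VR else '(explicit VR)')
  print('Rows x Columns      :', ds.get('Rows'), 'x', ds.get('Columns'))
  print('Image Position      :', ds.get('ImagePositionPatient'))
  print('Image Orientation   :', ds.get('ImageOrientationPatient'))
  print('Pixel Spacing       :', ds.get('PixelSpacing'))
  # Spacing Between Slices is optional; fall back to Slice Thickness if absent.
  print('Slice spacing       :', ds.get('SpacingBetweenSlices', ds.get('SliceThickness')))
  print('Series / Instance   :', ds.get('SeriesNumber'), '/', ds.get('InstanceNumber'))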

As noted, the DICOM standard has proved hugely successful, providing a unified framework for transferring, storing and printing medical data. While DICOM is very flexible and comprehensive, it does require considerable effort and expense to implement transparently. Research environments usually lack the resources for such undertakings, and the effort aligns poorly with the continual advances in analysis methods and data. Instead, researchers usually use simpler image formats that allow more rapid progress to be made. These simpler formats retain only a limited, relevant set of the images’ metadata.

In contrast to DICOM, NIfTI is a very simple, minimalistic format. This format has been widely adopted in neuroimaging research, allowing scientists to mix and match image processing and analysis tools developed by different teams. The NIfTI format was originally designed to be a backwards-compatible extension of the proprietary ANALYZE-7.5 file format (http://nifti.nimh.nih.gov/nifti-1). C libraries have been developed by the community to read and write NIfTI files, which means that developers do not need to re-implement support for this format (which would not only be time consuming, but would also create opportunities to introduce bugs).

The format specifies 348 bytes of header data and uncompressed image data. The header and image data can be saved as separate files (using the file extensions ‘.hdr’ and ‘.img’), or as a single file (using the ‘.nii’ extension, with the first 348 bytes devoted to the header and the image data typically beginning at byte 352). Images can have up to 7 dimensions (three spatial dimensions, time, and then other dimensions such as diffusion gradient direction). Note that the spatial dimensions do not have to be in the order of left-right, posterior-anterior and inferior-superior, although this is normally true for axially oriented acquisitions. One improvement of NIfTI over its predecessor, the Analyze format, is that it allows spatial orientation information to be stored more fully. Therefore, NIfTI-compliant software should reduce the chance of making left-right errors. Specifically, NIfTI images can include two independent spatial transforms for mapping the image data into different frames of reference. One of these, the “sform”, allows a full 12-parameter affine transform to be encoded, whereas the other, the “qform”, allows only a 9-parameter mapping. The latter transform is limited to encoding translations, rotations (via a quaternion representation) and zooms along each axis. While it is suitable for mapping voxels in most MRI images to some Euclidean coordinate system, the 9-parameter transform is unable to encode the shears that are often needed to account for the gantry tilt of CT scanners.
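As a rough sketch of this fixed layout, the snippet below reads a few fields directly from the first 348 bytes of a single-file (‘.nii’) image. It assumes little-endian byte order (a big-endian file would show sizeof_hdr ≠ 348, in which case the '<' format characters would need to be '>'); the file name is a placeholder.

  import struct

  with open('example.nii', 'rb') as f:               # placeholder file name
      hdr = f.read(348)                              # the fixed-size NIfTI-1 header

  sizeof_hdr = struct.unpack('<i', hdr[0:4])[0]      # should be 348
  dim        = struct.unpack('<8h', hdr[40:56])      # dim[0] = number of dimensions, then sizes
  pixdim     = struct.unpack('<8f', hdr[76:108])     # qfac, then voxel sizes and time step
  vox_offset = struct.unpack('<f', hdr[108:112])[0]  # where image data starts (352 for '.nii')
  magic      = hdr[344:348]                          # b'n+1\x00' for '.nii', b'ni1\x00' for '.hdr'/'.img'
  print(sizeof_hdr, dim, pixdim, vox_offset, magic)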

The inclusion of two different spatial transforms in the NIfTI header can also cause some confusion. When both the sform and qform representations are included, some tools, such as MRIcron and SPM, give precedence to the sform, while others (e.g. those relying on the Insight Segmentation and Registration Toolkit, http://www.itk.org) default to the qform. Therefore, the same image can appear differently in different viewers, and the starting estimates for image registration may differ among tools.
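A quick way to see which transforms a particular file carries, and whether they agree, is to read both with the nibabel library (a sketch; the file name is a placeholder):

  import numpy as np
  import nibabel as nib

  img = nib.load('example.nii')                      # placeholder file name
  sform, sform_code = img.header.get_sform(coded=True)
  qform, qform_code = img.header.get_qform(coded=True)

  print('sform (code', sform_code, '):\n', sform)
  print('qform (code', qform_code, '):\n', qform)
  # If the two disagree, viewers that prefer the sform (e.g. SPM, MRIcron) and
  # viewers that prefer the qform (e.g. ITK-based tools) will display the image differently.
  if sform is not None and qform is not None and not np.allclose(sform, qform, atol=1e-3):
      print('Warning: sform and qform encode different mappings')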

While the simplicity of NIfTI is a major advantage, it necessarily constrains what can be stored in the header, which, as we will see, can lead to confusion with some sequences (specifically, diffusion imaging and multi-band sequences). The NIfTI format does allow extensions to encode such information, but consensus among the community would be needed for these extensions to be used in a consistent way. For this reason, many packages (e.g. SPM) simply ignore any extensions.

Table 2 shows some example parameters in the NIfTI header that users may need to be aware of.

The “dim” field encodes the dimensions of the image data. The first number indicates how many dimensions the image has. For the example in Table 2, the image has four dimensions. The next four numbers indicate the size of each dimension. Typically, the first three dimensions are spatial and the fourth is time (although this convention is often abused).

The “pixdim” field encodes the voxel size and time interval corresponding to the spatial and temporal dimensions. The first value has a special purpose, which will be mentioned later. The units of voxel size and time are coded in the “xyzt_units” parameter, but most converters default to millimeters and seconds.
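A minimal sketch for inspecting these fields with nibabel (the file name is a placeholder):

  import nibabel as nib

  img = nib.load('example.nii')             # placeholder file name
  hdr = img.header

  print('dim    :', hdr['dim'])             # dim[0] = number of dimensions, then the size of each
  print('pixdim :', hdr['pixdim'])          # qfac, then voxel sizes and the time step
  print('zooms  :', hdr.get_zooms())        # voxel sizes (and time step) without dim[0]/qfac
  print('units  :', hdr.get_xyzt_units())   # e.g. ('mm', 'sec')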

The “slice_code” is useful for slice timing correction. It specifies the slice order. Codes 1 through 4 refer to ascending, descending, interleaved ascending, and interleaved descending, respectively. Codes 5 and 6 are for interleaved ascending and descending, but starting with the second slice and the second-to-last slice, respectively. DICOM files do not always encode the information needed for the slice_code, so converters may not fill in this field.
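As an illustration, and assuming the number of slices is known, the acquisition order implied by each code can be generated as follows (a generic sketch with a hypothetical helper function, not tied to any particular converter):

  def slice_order(slice_code, n_slices):
      """Return 0-based slice indices in the order they were acquired."""
      if slice_code == 1:                        # sequential ascending
          return list(range(n_slices))
      if slice_code == 2:                        # sequential descending
          return list(range(n_slices - 1, -1, -1))
      if slice_code == 3:                        # interleaved ascending, starting at the 1st slice
          return list(range(0, n_slices, 2)) + list(range(1, n_slices, 2))
      if slice_code == 4:                        # interleaved descending, starting at the last slice
          return list(range(n_slices - 1, -1, -2)) + list(range(n_slices - 2, -1, -2))
      if slice_code == 5:                        # interleaved ascending, starting at the 2nd slice
          return list(range(1, n_slices, 2)) + list(range(0, n_slices, 2))
      if slice_code == 6:                        # interleaved descending, starting at the 2nd-to-last slice
          return list(range(n_slices - 2, -1, -2)) + list(range(n_slices - 1, -1, -2))
      raise ValueError('unknown or unset slice_code')

  print(slice_order(3, 7))                       # [0, 2, 4, 6, 1, 3, 5]
  print(slice_order(5, 7))                       # [1, 3, 5, 0, 2, 4, 6]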

The “descrip” field can hold up to 80 characters, and is often used to store some textual description of the image that does not fit elsewhere in the NIfTI header. In the example in Table 2, it likely stores the acquisition start time and the phase encoding direction. The former may be used to align the scan with physiological recordings, and the latter may be useful for image distortion correction.

The qform and sform information are the two methods for encoding affine mappings mentioned earlier. The example in Table 2 shows an axial acquisition without any tilt, which produces zeros for all three quaternion parameters, and pixdim values (three 3s) on the diagonal of the affine transform. If there is a small tilt from the major axes, those 3s may be slightly smaller than 3, and those zeros may become numbers with small absolute values. The three qoffset parameters are the offsets along the three spatial dimensions, and they are the same as the last column of the sform transformation matrix if the sform and qform encode the same coordinate system.

One major variant of these parameters is worth mentioning. For the axial acquisition example in Table 2, the transform parameters may be very different even for the same dataset. Often, the three quaternions are [0 −1 0] or close to that, and srow_x has a −3 rather than 3 as its first number. This indicates that the image data are stored in so-called left-handed storage. In other words, the same dataset can be organized in different ways with different transform parameters, and if the two organizations are overlaid in an image viewer they will match perfectly. However, when applying analyses directly to the NIfTI image data, be aware of this variant and deal with the image flip accordingly if needed.
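A hedged sketch of how one might detect this variant with nibabel and, if desired, flip the stored array while keeping the voxel-to-world mapping consistent (the file names are placeholders):

  import numpy as np
  import nibabel as nib
  from nibabel.orientations import aff2axcodes

  img = nib.load('example.nii')                  # placeholder file name
  aff = img.affine
  print(aff2axcodes(aff))                        # e.g. ('L', 'A', 'S'): the first voxel axis runs right-to-left

  if aff2axcodes(aff)[0] == 'L':
      data = np.asanyarray(img.dataobj)
      data = np.flip(data, axis=0)               # reverse the stored order along the first axis
      flip = np.eye(4)
      flip[0, 0] = -1
      flip[0, 3] = data.shape[0] - 1             # new index 0 maps to the old last index
      new_aff = aff @ flip                       # compose: new voxel -> old voxel -> world
      nib.save(nib.Nifti1Image(data, new_aff, img.header), 'example_flipped.nii')

Either organization maps to the same world-space coordinates, which is why the two appear identical when overlaid in a viewer.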

For slice orientations other than axial, the image data could be re-organized as if the slices were axial. This will not cause any problem in terms of the NIfTI format, and may avoid confusion in some analysis and visualization tools. However, there are other considerations against doing so. A major concern is that most analysis tools treat the third dimension as the slice dimension, and may not be able to perform slice timing correction correctly otherwise. If, in the example in Table 2, the first number in the third column of the affine transform has a large absolute value, the third dimension of the image data runs along the left-right axis, and accordingly the fourth numbers in dim and pixdim describe this axis. If the image data were not re-organized, this indicates a sagittal slice acquisition.
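Under the same assumptions, the dominant world axis of the third voxel dimension can be read directly from the affine, giving a quick indication of whether the stored slices are axial, coronal, or sagittal (a sketch; the file name is a placeholder):

  import numpy as np
  import nibabel as nib

  img = nib.load('example.nii')                      # placeholder file name
  aff = img.affine

  # The third column of the rotation part shows where the third voxel axis points in world space.
  names = ['left-right (sagittal slices)',
           'posterior-anterior (coronal slices)',
           'inferior-superior (axial slices)']
  dominant = int(np.argmax(np.abs(aff[:3, 2])))
  print('Third voxel dimension runs along:', names[dominant])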


DICOM to NIfTI conversion

To convert DICOM into NIfTI, the first step is to sort files into different series. A DICOM series includes a set of DICOM images (Instances) that were generated together by the same equipment in the same scanning operation. The most reliable way to sort series is by the DICOM Series Instance UID object, although it can also be done by combining the Patient Name, Study ID, and Series Number objects. Within each series, we need to sort the images into different volumes, if applicable. This is typically
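A minimal sketch of this first sorting step with pydicom, grouping files by Series Instance UID and ordering them by Instance Number; the directory name is a placeholder, and real converters add fall-backs for missing fields:

  import glob
  from collections import defaultdict
  import pydicom
  from pydicom.errors import InvalidDicomError

  series = defaultdict(list)
  for path in glob.glob('dicom_dir/*'):              # placeholder directory
      try:
          ds = pydicom.dcmread(path, stop_before_pixels=True)
      except (InvalidDicomError, IsADirectoryError):
          continue                                   # skip anything that is not a DICOM file
      key = ds.get('SeriesInstanceUID', 'unknown')
      series[key].append((int(ds.get('InstanceNumber', 0)), path))

  for uid, files in sorted(series.items()):
      files.sort()                                   # order slices within each series
      print(uid, ':', len(files), 'files')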

Discussion

Ideally, users can select one of a number of tools for seamlessly converting DICOM data to the NIfTI format. In these cases, the selection will be driven by user preference. For example, do they prefer a user interface or a command line tool? Do they prefer a stand-alone executable, or do they want a MATLAB or Python-based converter that can be easily tuned for specific situations? Indeed, Table 3 focuses on tools that the authors of this work have developed and maintain, but this is not an
