Medical Image Format FAQ - Part 1

General Information & Standard Formats


1. Introduction

1.1 Objective

The goal of this FAQ is to facilitate access to medical images stored on digital imaging modalities such as CT and MR scanners, and to their accompanying descriptive information. The document is designed for those who do not have access to the necessary proprietary tools or descriptions, particularly in those moments when inspiration strikes and one just can't wait for the local sales person to track down the necessary authority and go through the cycle of correspondence needed to get a non-disclosure agreement in place, by which time interest in the project has usually faded, and another great research opportunity has passed! It may also be helpful for those keen to experiment with home-grown PACS-like systems using their existing equipment, and for those whose equipment is still useful but so old that even the host computer vendor doesn't support it any more!

There is of course no substitute for the genuine tools or descriptions from the equipment vendors themselves, and pointers to helpful individuals in various organizations, as well as names and catalog numbers of various useful documents, are included here where known.

In addition, several small companies that specialize in such connectivity problems are well known and have a good reputation. Contact information is provided for them, though I personally have no experience with their products and am not endorsing them.

Finally, great care has been taken not to include any information that has been released under non-disclosure agreements. What is included here is the result of either information freely released by vendors, handy hints from others working in the field, or in many cases close scrutiny of hex dumps and experimentation with scanner parameters and study of the effects on the image files. The intent is to spread hard-earned knowledge gained over many years amongst those new to the field or a particular piece of equipment, not to threaten anyone's proprietary interests, or to substitute for the technical support available from vendors that ranges from free to extortionate, and excellent to abysmal, depending on who you are dealing with and where in the world you are located!

Please use this information in the spirit in which it is intended, and where possible contribute whatever you know in order to expand the information to cover more vendors and equipment.

1.2 Types of Formats

Later sections will deal with the problems of getting the image files from the modality to the workstation, but for the moment assume the files are there and need to be deciphered.

Four types of information are generally present in these files:

Extracting the image information alone is usually straightforward and is described in 1.3. Dealing with the descriptive information, for example to make use of the data for dissemination in a PACS environment, or to extract geometry details in order to combine images into 3D datasets, is more difficult and requires deeper understanding of how the files are constructed.

There are three basic families of formats in popular use: fixed, block and tag based.

The block format is one of the most popular. In most cases, though, the early part of the header contains only a limited number of pointers to large blocks, and the blocks are almost always in the same place and of constant length (for standard rather than reformatted images at least), so if one doesn't know the specifics of the layout one can get by assuming a fixed format. I presume the pointers reflect the intent of the designers to handle future expansion and revision of the format.

The example par excellence of the tag based format is the ACR/NEMA style of data stream, which, though never intended as a file format per se, has proven useful as a model. See for example the sections dealing with the ACR/NEMA standards as well as DICOM (whose creators are about to vote on a media interchange format after all this time) and Papyrus. ACR/NEMA style tags are described in more detail elsewhere, but each is self-contained and self-describing (at least if you have the appropriate data dictionary) and contains its own length, so if you can't interpret it you can skip it! Very convenient. Most file formats based on this scheme are just concatenated series of tags, and apart from having to guess the byte order, which is not specified (unlike TIFF which is a similar deal for those in the "real" imaging world), and sometimes skip a fixed length but short header, are dead easy to handle.

To identify such a file just do a "strings <file | grep 'ACR-NEMA'" - if it is such a file, just look through the start of the hex dump until you start to see the characteristic sequentially ordered pairs of 16 bit words that identify ACR/NEMA attributes, decide the byte order, et voila, you can pipe it into any general ACR/NEMA dumping program to see what it contains. If you see even group tags, they will be described in the standard. If you see odd group tags then they are vendor specific and you will have to ask the vendor or correlate them with identification information printed on the film until you figure out the ones that are important to you.
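
The same check can be sketched in Python for those without strings and grep to hand. This is only a rough sketch under stated assumptions: the function names are invented for illustration, the recognition string is assumed to be stored byte by byte in reading order, and its 8 byte element header is assumed to sit immediately in front of it.

```python
def find_recognition_element(data):
    """Return the offset of the 8 byte header preceding 'ACR-NEMA', or -1."""
    pos = data.find(b"ACR-NEMA")
    if pos < 8:
        # Not found, or found too early to have a header in front of it.
        return -1
    return pos - 8


def hex_words(data, offset, count=8):
    """Format count byte pairs starting at offset, as they appear in a hex dump."""
    chunk = data[offset:offset + 2 * count]
    return " ".join(chunk[i:i + 2].hex() for i in range(0, len(chunk), 2))
```

Calling hex_words(data, find_recognition_element(data)) on the contents of a candidate file shows the raw bytes of the group, element and length fields, from which the byte order can usually be judged by eye.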

1.3 In Desperation - Quick & Dirty Tricks

Because radiologists, radiographers, technologists, physicists and imaging programmers are dedicated long suffering creatures who work long hours under adverse conditions for little reward, the vendors in their generosity have seen fit to make life a little easier, by almost universally putting the image data at the end of the file. Rarely you will see files that are padded out to fixed record size boundaries (eg. Vax VMS 512 byte records), and sometimes overlay plane data may be stored after the image data. Furthermore there is almost always an option at archive time to allow for storage in an uncompressed and totally unadulterated form. Even in ACR/NEMA the tag for image pixel data is numerically the highest and hence the last to appear in the sequence which is guaranteed to be sorted.

They could have screwed us up totally by gratuitously adding variable length blocks of other stuff at the end, but the only time I have encountered this was on a Siemens Impact with the ACR/NEMA based SPI format padded out to 512 bytes.

In other words, if an image is 256 by 256, uncompressed, and 12-16 bits deep (and hence usually, though not always, stored as two bytes per pixel), then we all know that the file is going to contain 256*256*2=131072 bytes of pixel data at the end of the file. If the file is say 145408 bytes long, as all GE Signa 3X/4X files are for example, then you need to skip 14336 bytes of header before you get to the data. Presume row by row starting with top left hand corner raster order, try both alternatives for byte order, deal with the 16 to 8 bit windowing problem, and very soon you have your image on the screen of your workstation.
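
The arithmetic above can be sketched in Python as follows. This is a minimal sketch, not a definitive reader: the function names are made up for illustration, and it assumes uncompressed 16 bit pixels in raster order occupying the very end of the file. The windowing function is just one simple linear mapping among many possible choices.

```python
import struct


def header_length(file_size, rows, cols, bytes_per_pixel=2):
    """Bytes to skip, assuming uncompressed pixel data fills the end of the file."""
    pixel_bytes = rows * cols * bytes_per_pixel
    if file_size < pixel_bytes:
        raise ValueError("file is too small to hold the expected pixel data")
    return file_size - pixel_bytes


def trailing_pixels(data, rows, cols, big_endian=False):
    """Decode the last rows*cols 16 bit words of data in the chosen byte order."""
    offset = header_length(len(data), rows, cols)
    fmt = (">" if big_endian else "<") + "%dH" % (rows * cols)
    return struct.unpack_from(fmt, data, offset)


def window_to_8bit(value, low, high):
    """Map a pixel linearly from [low, high] onto 0..255, clamping outside."""
    if value <= low:
        return 0
    if value >= high:
        return 255
    return (value - low) * 255 // (high - low)
```

With the Signa numbers quoted above, header_length(145408, 256, 256) gives 14336. To view a file, read it whole with open(path, "rb").read(), call trailing_pixels with each byte order in turn, and keep whichever one looks like anatomy rather than noise.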

This technique is so useful, even NIH Image for the Macintosh (an excellent must-have free program BTW.) provides a raw import tool to do this, and describes it in the manual using the 14336 byte offset! This tool is something that is sadly lacking in most commercial image handling programs for non-medical applications, which can't import images with more than 8 bits per channel.

Of course you have to live without the identification, demographic and technique information (other than what can be derived from the file name in some cases), but for many research and presentation purposes this is quite adequate.

Occasionally one runs into clever files where four 12 bit words are packed into three 16 bit words and one goes crazy trying to figure out the logic of how they are packed. The back of the old ACR/NEMA standard describes somewhere one way in which this is done. One should still be able to calculate the length easily enough.
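
The length calculation for such packed data might look like this. It is a sketch only, assuming the pixels are packed end to end with no padding other than rounding up to a whole 16 bit word at the end; the function name is hypothetical.

```python
def packed_pixel_bytes(rows, cols, bits_per_pixel=12, word_bits=16):
    """Bytes occupied by rows*cols pixels packed end to end into whole words."""
    total_bits = rows * cols * bits_per_pixel
    words = -(-total_bits // word_bits)  # ceiling division
    return words * word_bits // 8
```

For a 256 by 256 image this gives 98304 bytes rather than the 131072 of the unpacked case, so the header length falls out of the file size in the same way as before.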

I haven't yet encountered a format that did nasty things like have strips of rows separated by padding ... I guess we are lucky that most images are nice powers of two or even multiples thereof (256,320,512).

Of course the GE CT 9800 uses perimeter encoding even when DPCM compression is not selected, so this technique won't work.

2. Standard Formats

2.1 ACR/NEMA 1.0 and 2.0

        ACR/NEMA Standards Publication No. 300-1985      <- ACR/NEMA 1.0
        ACR/NEMA Standards Publication No. 300-1988      <- ACR/NEMA 2.0
        ACR/NEMA Standards Publication PS2-1989          <- data compression

The American College of Radiology (ACR) and the National Electrical Manufacturers Association (NEMA) recognized some time ago the need for standards to facilitate multi-vendor connectivity to promote the development of PACS and what is now referred to as Wide Area Networking. The first such standard was version 1.0 which was released in 1985 as ACR/NEMA Standards Publication No. 300-1985, subsequently revised several times, then revised again and released as version 2.0 in 1988, described in ACR/NEMA Standards Publication No. 300-1988. There it remained until a radically revised and reorganized approach, preserving backward compatibility, was released during 1992-1993 as ACR/NEMA Standards Publication PS3, also referred to as DICOM 3.

In the interim, to facilitate the transfer of compressed images, another standard, ACR/NEMA Standards Publication PS2-1989, was released, which described various means of extending standard 300-1985 to handle compression utilizing a broad range of reversible and irreversible schemes. Though this part of the standard was apparently never implemented by anyone, and has been quietly bypassed by those working on DICOM 3 compression, it makes very interesting reading and is a nice summary of applicable techniques.

What does one need to know about ACR/NEMA 1.0 and 2.0 ? The standards define a mechanism along the lines of the layered ISO-OSI (Open Systems Interconnect) model, with physical, transport/network, session, and presentation and application layers. Unless one actually wants to physically connect to a device that supports the unique 50 pin point-to-point electrical interface, then one really only needs to be aware of how ACR/NEMA implements the presentation and application layers, which are described in terms of a "message format". This message format is important to many people, not because anyone seriously wants to connect devices in the limited fashion envisaged by these early standards, but because many proprietary formats and other de facto standards have adopted the ACR/NEMA message format and its corresponding data dictionary and extension mechanisms.

The message format is described in sections 4, 5 and 10 of ACR/NEMA SP 300-1988 which are summarized briefly here. Section 6 describes command structure which is not really relevant other than that commands are also structured in the same way as data and consume part of the data dictionary. You will not encounter command tags in data streams ("messages") encapsulated in file formats though.

A message consists of a series of "data elements" each of which contains a piece of information. Each element is described by an "element name" consisting of a pair of 16 bit unsigned integers ("group number", "data element number"). The data stream is ordered by ascending group number, and within each group by ascending data element number. Each element may occur only once in a message. Even numbered groups describe elements defined by the standard. Odd numbered groups are available for use by vendors or users, but must conform to the same structure as standard elements. Following the (group number, data element number) pair is a length field that is a 32 bit unsigned even integer that describes the number of bytes from the end of the length field to the beginning of the next data element.

The last part of a data element is its value, which is defined by the data dictionary to be an ascii (numeric AN or text AT) or binary value (BI 16 bit or BD 32 bit). The values may be single or multiple. Multiple ascii values are delimited by the backslash (05CH) character. Odd length ascii values are padded with a space (020H).
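
Handling of multiple ascii values can be sketched as below. The function name is made up, and note the heuristic: since the padded value is always of even length, a single trailing space is assumed here to be padding and is dropped, which could in principle discard a genuine space.

```python
def split_ascii_values(raw):
    """Split an ascii element value on backslash (5C hex), dropping one pad space."""
    text = raw.decode("ascii")
    if text.endswith(" "):
        # Assumption: a trailing space is the pad added to reach an even length.
        text = text[:-1]
    return text.split("\\")
```

A pixel size stored as the 8 byte value "0.5\0.5 " would come back as the two strings "0.5" and "0.5".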

For example:

            0008 0010  000C 0000  4341 2D52 454E 414D
                                  3120 302E

is data element "Recognition Code" because that is what the dictionary defines group 0008 element 0010 to be. The dictionary says it is of type AT (ascii text), has a value multiplicity of single and only enumerated values are allowed, in this case the ascii string "ACR-NEMA 1.0" (the last two words, 3120 302E, hold the characters " 1.0"). It is of length 0000000C hex or 12 bytes long.
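
To check the example by machine, the ten words can be decoded with a few lines of Python. This is a sketch under one stated assumption: that each 16 bit word in the dump is written to the byte stream least significant byte first, with the 32 bit length sent low word first. The names are invented for illustration.

```python
import struct

# The ten 16 bit words of the example dump, in the order shown.
EXAMPLE_WORDS = [0x0008, 0x0010, 0x000C, 0x0000,
                 0x4341, 0x2D52, 0x454E, 0x414D, 0x3120, 0x302E]


def words_to_bytes(words):
    """Serialize 16 bit words least significant byte first (an assumption)."""
    return struct.pack("<%dH" % len(words), *words)


def decode_element(data, offset=0):
    """Decode one little endian element: (group, element, value, next_offset)."""
    group, element, length = struct.unpack_from("<HHI", data, offset)
    start = offset + 8
    value = data[start:start + length]
    return group, element, value, start + length
```

Decoding words_to_bytes(EXAMPLE_WORDS) yields group 0008 hex, element 0010 hex, a 12 byte value, and a next offset of 20, which is where the following element would begin.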

The electrical interface is a 16 bit one, and hence even though 32 bit binary values are defined to be transmitted least significant word first (though the order for the 32 bit length is not actually specified), there is no mention in the standard as to how to encapsulate the message in an 8 bit world, hence different users and vendors have chosen little or big endian schemes. The new DICOM standard assumes a default little endian representation which seems to be the most appropriate considering the old definition for 32 bit words, which specified that the least significant 16 bit word be transmitted first.

Hence there are three likely possible byte orders that a vendor interpreting the ACR/NEMA standard in a byte oriented world may have used:

            - little endian 16 and 32 bit words, as in DICOM 3,
            - big endian 16 and 32 bit words,
            - big endian 16 bit words, but the least significant half of
              a 32 bit word is sent first (as per ACR/NEMA 2.0).

The choice seems to be made usually on the basis of the native byte order of integers on the host processor. Most of the formats I have encountered are one of the first two, but I did encounter one from Philips that used the last scheme and it drove me crazy for a while, until I appreciated the subtlety of it ! I call it "Big Bad Endian" format in my implementation that recognizes it, but that may be a value judgement on my part :)
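
Guessing between the three orders can be sketched as below. This is not the implementation mentioned above, just an illustration: the labels "little", "big" and "big-bad" and the function names are invented here, and the plausibility test (group no higher than 7FE0, even length, element fits within the data) is a heuristic assumption that may well accept more than one order on unlucky input.

```python
import struct


def read_header(data, offset, order):
    """Decode (group, element, length) at offset for one of three byte orders."""
    if order == "little":
        return struct.unpack_from("<HHI", data, offset)
    if order == "big":
        return struct.unpack_from(">HHI", data, offset)
    if order == "big-bad":
        # Big endian 16 bit words, least significant word of the length first.
        group, element, low, high = struct.unpack_from(">HHHH", data, offset)
        return group, element, (high << 16) | low
    raise ValueError("unknown byte order: %r" % order)


def guess_byte_order(data, offset=0):
    """Return the byte orders under which the header at offset looks plausible."""
    plausible = []
    for order in ("little", "big", "big-bad"):
        group, element, length = read_header(data, offset, order)
        fits = offset + 8 + length <= len(data)
        if group <= 0x7FE0 and length % 2 == 0 and fits:
            plausible.append(order)
    return plausible
```

In practice one would point guess_byte_order at the first element after any fixed header, and check a second element with the chosen order before trusting the answer.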

Notice particularly how this design allows one to parse the message even if the data dictionary is not complete. Consider an element that has an unrecognized element name. One cannot interpret the content of the element and so has to ignore it. One doesn't even know whether it contains binary or ascii information (this is what DICOM later refers to as "implicit representation"). Despite this, the length value allows one to skip to the next element and proceed.
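
The skipping behaviour can be sketched in a few lines of Python. This is a minimal sketch assuming a little endian stream with no leading header; the tiny KNOWN dictionary holds just three entries taken from the element list in this FAQ, and the function names are hypothetical.

```python
import struct

# A deliberately incomplete data dictionary.
KNOWN = {
    (0x0008, 0x0010): "Recognition Code",
    (0x0028, 0x0010): "Rows",
    (0x0028, 0x0011): "Columns",
}


def walk_elements(data, offset=0):
    """Yield (group, element, value) for each little endian element in data."""
    while offset + 8 <= len(data):
        group, element, length = struct.unpack_from("<HHI", data, offset)
        start = offset + 8
        if start + length > len(data):
            raise ValueError("element runs past the end of the data")
        yield group, element, data[start:start + length]
        # The length alone is enough to find the next element.
        offset = start + length


def describe(data):
    """Name each element, marking those absent from the dictionary."""
    return [KNOWN.get((group, element), "unknown, skipped")
            for group, element, _ in walk_elements(data)]
```

An odd group vendor element sitting between two standard ones is simply reported as unknown and stepped over, without the parser needing to know what it holds.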

Over the years there has been much discussion amongst those who favour such implicit dictionary driven schemes, and those who prefer explicit representations, including explicit description of the element type (binary or ascii, etc.) and even the element description itself! Some would prefer the message to contain something like "RecognitionCode='ACR-NEMA 2.0';" for example. The nuclear medicine groups have adopted a de facto standard called Interfile that makes use of ACR/NEMA data elements, but uses such a descriptive representation. Their argument is that the data stream is much more readable which is true enough, and more readily extensible.

The groups are organized as follows:

            0000                    Command
            0008                    Identifying
            0010                    Patient
            0018                    Acquisition
            0020                    Relationship
            0028                    Image Presentation
            4000                    Text
            6000-601E (even)        Overlay
            7FE0                    Pixel Data

Some of the more interesting elements are:

            (nnnn,0000) BD S Group Length           # of bytes in group nnnn
            (nnnn,4000) AT M Comments

            (0008,0010) AT S Recognition Code       # ACR-NEMA 1.0 or 2.0
            (0008,0020) AT S Study Date             #
            (0008,0021) AT S Series Date            #
            (0008,0022) AT S Acquisition Date       #
            (0008,0023) AT S Image Date             #
            (0008,0030) AT S Study Time             #
            (0008,0031) AT S Series Time            #
            (0008,0032) AT S Acquisition Time       #
            (0008,0033) AT S Image Time             #
            (0008,0060) AT S Modality               # CT,NM,MR,DS,DR,US,OT

            (0010,0010) AT S Patient Name
            (0010,0020) AT S Patient ID
            (0010,0030) AT S Patient Birthdate      #
            (0010,0040) AT S Patient Sex            # M, F, O for other
            (0010,1010) AT S Patient Age            # xxxD or W or M or Y

            (0018,0010) AT M Contrast/Bolus Agent   # or NONE
            (0018,0030) AT M Radionuclide
            (0018,0050) AN S Slice Thickness        # mm
            (0018,0060) AN M KVP
            (0018,0080) AN S Repetition Time        # ms
            (0018,0081) AN S Echo Time              # ms
            (0018,0082) AN S Inversion Time         # ms
            (0018,1120) AN S Gantry Tilt            # degrees

            (0020,1040) AT S Position Reference     # eg. iliac crest
            (0020,1041) AN S Slice Location         # in mm (signed)

            (0028,0010) BI S Rows
            (0028,0011) BI S Columns
            (0028,0030) AN M Pixel Size             # row\col in mm
            (0028,0100) BI S Bits Allocated         # eg. 16 bit
            (0028,0101) BI S Bits Stored            # eg. 12 bit for CT
            (0028,0102) BI S High Bit               # eg. 11
            (0028,0103) BI S Pixel Representation   # 1 signed, 0 unsigned

            (7FE0,0010) BI M Pixel Data             # as described by grp 0028

The way in which the pixel data is stored can vary tremendously, though thankfully most users and vendors use the simple unimaginative scheme that is shown above, ie. one 12 bit pixel stored in the low order part of a 16 bit word with no attempt at packing more compactly. Following are some examples shown in Appendix E of the standard. Note that when one adds the little/big endian question the permutations mount!

        Bits Allocated = 16
        Bits Stored    = 12
        High Bit       = 11

                          |<------------------ pixel ----------------->|
            ______________ ______________ ______________ ______________
           |XXXXXXXXXXXXXX|              |              |              |
            15          12 11           8 7            4 3            0


        Bits Allocated = 16
        Bits Stored    = 12
        High Bit       = 15

           |<------------------ pixel ----------------->|
            ______________ ______________ ______________ ______________
           |              |              |              |XXXXXXXXXXXXXX|
            15          12 11           8 7            4 3            0


        Bits Allocated = 12
        Bits Stored    = 12
        High Bit       = 11

           ------ 2 ----->|<------------------ pixel 1 --------------->|
            ______________ ______________ ______________ ______________
           |              |              |              |              |
            15          12 11           8 7            4 3            0

           -------------- 3 ------------>|<------------ 2 --------------
            ______________ ______________ ______________ ______________
           |              |              |              |              |
            15          12 11           8 7            4 3            0

           |<------------------ pixel 4 --------------->|<----- 3 ------
            ______________ ______________ ______________ ______________
           |              |              |              |              |
            15          12 11           8 7            4 3            0


And so on ... refer to the standard itself for more detail.
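
The three layouts drawn above can be decoded along the following lines. This is a sketch, not taken from the standard: the function names are invented, the words are assumed to have already been read as 16 bit integers in the correct byte order, signed pixels are assumed to be two's complement, and the packed case assumes pixels fill successive words from the least significant bit upward, as the third diagram suggests.

```python
def stored_value(word, bits_stored, high_bit, signed=False):
    """Extract one pixel from a 16 bit allocated word (first two layouts)."""
    low_bit = high_bit - bits_stored + 1
    if low_bit < 0:
        raise ValueError("High Bit is too small for Bits Stored")
    value = (word >> low_bit) & ((1 << bits_stored) - 1)
    # Assumption: Pixel Representation 1 means two's complement.
    if signed and value >= 1 << (bits_stored - 1):
        value -= 1 << bits_stored
    return value


def unpack_12bit(words):
    """Unpack 12 bit pixels packed end to end into 16 bit words (third layout)."""
    pixels = []
    accumulator = 0
    bits_held = 0
    for word in words:
        # New bits go above whatever is left over from the previous word.
        accumulator |= (word & 0xFFFF) << bits_held
        bits_held += 16
        while bits_held >= 12:
            pixels.append(accumulator & 0xFFF)
            accumulator >>= 12
            bits_held -= 12
    return pixels
```

Three packed words yield four pixels, so a row of 256 pixels occupies 192 words. Any bits left in the accumulator at the end are treated as padding and discarded.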

The next part is part2 - standard formats (continued).