INPUT DATA

`INPUT DATA`

Input of data from defined file formats into FIT2D's internal data-base. A number of different formats are available and more will be added as appropriate. Different formats have different levels of sophistication (or lack of sophistication) and hence the amount of user input varies. At the time of writing the following formats are available:

: 1-D ASCII FREE FORMAT This allows 1-D data from ASCII text files to be input
: 2-D ASCII FREE FORMAT This allows flexible input of 2-D data from ASCII text files containing lists of pixel values
: BINARY (UNFORMATTED) This allows unformatted binary data to be input
: BSL FORMAT This is the Daresbury BSL format based on the Hamburg OTOKO format
: CHIPLOT FORMAT Simple ASCII X/Y column format data as used by the program CHIPLOT
: COMPRESSED DIFFRACTION DATA Inputs data which has been stored in the ``Compacted Diffraction Data'' format
: DIP-2000 (MAC SCIENCE) This allows a special two byte binary 2500 $\times$ 2500 image to be input
: ESRF DATA FORMAT Inputs files which have been stored using a limited sub-set of the ``ESRF data format'' (files from the SAXS end-station on BL-4).
: FIT2D STANDARD FORMAT This is the standard format, which is a flexible self-describing format
: FUJI BAS-2000 Fuji BAS-2000 image plate scanner format (also BAS-1500)
: GAS 2-D DETECTOR (ESRF) Direct input of raw format files written on the beam-lines (separate ASCII header file and a binary ``histogramming memory'' file)
: HAMAMATSU PHOTONICS Integer*2 binary data described by a short header. Produced by the Hamamatsu Photonics K.K. C4880 CCD cameras which are used at the Photon Factory for the X-ray image intensifier/CCD read-out detectors.
: IMAGEQUANT For input of data from the Molecular Dynamics Imaging plate PC systems (a TIFF based format file)
: MAR RESEARCH FORMAT MarResearch image plate system format.
: NEW MAR CODE Same as MAR RESEARCH FORMAT
: PDS FORMAT The ``Powder Diffraction Standard'' format [24]
: PHOTOMETRICS CCD FORMAT Integer*2 binary data described by a short header. Produced by the X-ray image intensifier/CCD read-out detector system.
: PMC FORMAT PhotoMetrics Compressed XRII/CCD data. This is data from the Photometrics CCD camera, but which has been compressed using the program pmi2pmc.
: PRINCETON CCD FORMAT Integer*2, Integer*4, or IEEE REAL*4 binary data described by a short header. Produced by the X-ray image intensifier/CCD read-out detector system. (Very similar but slightly different format to the Photometrics format.)
: TIFF Input of simple 1 and 2-byte per pixel TIFF files
: UNKNOWN Attempt to deduce format on non-compressed data
: USER INTENSITIES Interactive entry of data values
: WESS FORMAT For input of film densitometer data (Unformatted unsigned byte data)

Some of these formats are described in greater detail below:

`1-D ASCII FREE FORMAT`

This allows 1-D data to be input from an ASCII file with a large amount of flexibility in treating column information.

The program prompts for the file name:

Enter name of file containing 1-D data
FILE NAME [fe_2.scan]:

A sample of the start of the file is output to help the user choose the correct columns for input of the angular and intensity data:

Sample of start of the input file:
 -30.0000    39689        0          0
 -28.0000    40200        0          0
 -26.0000    40682        0          0
 -24.0000    40892        0          0
 -22.0000    40914        0          0
 -20.0000    40569        0          0

A variety of questions allow the user to specify the starting line, and character position for the data, and which columns of data to use for X-coordinates and the Y-coordinates. By entering 0 for the number of coordinates to be input the program automatically inputs as many coordinates as possible. Alternatively the number of coordinates to input may be specified.

Here is an example of the program dialogue:

Main menu: ENTER COMMAND [INPUT DATA]:
FILE FORMAT [FIT2D STANDARD FORMAT]:1-D
Enter name of file containing 1-D data
FILE NAME [source.scan]:/scratch/EXPERIMENTS/bl10/cell_1_edit.scan
Sample of start of the input file:
  #  DETECTOR vct6 Ch5  Monitor    Seconds
  0  -27.0000     7394        0         45
  1  -26.0000     7491        0         45
  2  -25.0000     7672        0         45
  3  -24.0000     7680        0         45
  4  -23.0000     7675        0         45
TYPE OF DATA FORMAT [VERTICAL COLUMNS]:
NUMBER OF LINES TO IGNORE (Range: 0 to 1000) [0]:1
NUMBER OF CHARACTERS TO IGNORE (Range: 0 to 255) [0]:
NUMBER OF COORDINATES (Range: 0 to 500) [500]:0
Columns of numbers for data-set  1
COLUMN NUMBER FOR X-COORDINATES (Range: 0 to 80) [1]:2
COLUMN NUMBER FOR Y-COORDINATES (Range: 1 to 80) [3]:
INFO:   56 X/Y coordinates per data-set have been found
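
The column selection in the dialogue above can be illustrated with a short Python sketch. This is not FIT2D's own code: the function name `read_xy_columns` and its parameters are hypothetical, columns are assumed to be separated by white space and counted from 1, and the "number of characters to ignore" option is left out. As in the dialogue, a coordinate count of 0 is taken to mean "input as many coordinates as possible".

```python
def read_xy_columns(text, lines_to_ignore=0, x_column=1, y_column=2, max_points=0):
    """Extract X/Y pairs from white-space separated columns of ASCII text.

    Columns are counted from 1 (an assumption made for this sketch).
    max_points=0 means "read as many coordinates as possible".
    """
    if x_column < 1 or y_column < 1:
        raise ValueError("column numbers are counted from 1 in this sketch")
    xs, ys = [], []
    for line in text.splitlines()[lines_to_ignore:]:
        fields = line.split()
        if len(fields) < max(x_column, y_column):
            continue  # blank or short line: nothing to read
        try:
            x = float(fields[x_column - 1])
            y = float(fields[y_column - 1])
        except ValueError:
            continue  # non-numeric field, e.g. a stray header line
        xs.append(x)
        ys.append(y)
        if max_points and len(xs) >= max_points:
            break
    return xs, ys


sample = """\
  #  DETECTOR vct6 Ch5  Monitor    Seconds
  0  -27.0000     7394        0         45
  1  -26.0000     7491        0         45
  2  -25.0000     7672        0         45
"""
# Ignore the one-line header, take X from column 2 and Y from column 3.
xs, ys = read_xy_columns(sample, lines_to_ignore=1, x_column=2, y_column=3)
```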

`2-D ASCII FREE FORMAT`

This allows a 2-D image to be defined and input from numbers in an ASCII file. The numbers may be input in free format, so a variety of column formats may be input.

The numbers may be separated by one or more spaces and TABS, or by commas, spaces, and TABS. Most common forms of scientific notation are supported.

The program prompts for the file name:

Enter name of file containing 1-D data
FILE NAME [file.ascii]:

The size of the image to be defined is then input:

X NUMBER PIXELS (Range: 1 to 10000000) [30]:
Y NUMBER PIXELS (Range: 1 to 10000000) [30]:

This defines the width and the height of the image to be input.

A sample of the start of the file is output to help the user choose the correct line to start input:

Sample of start of the input file:
Test File for 2-D input
1.0 2.0 3.0
3.0 2.0
3.0
4.0
5.9

You are asked how many lines to ignore, so that a short header section may be ``jumped'' over. In the above example, the first line is a text line, so it should be ignored, e.g.

NUMBER OF LINES TO IGNORE (Range: 0 to 1000) [0]: 1

Every ``number'' which can be converted will be input and used to define the next pixel value. Thus, the number of values per line may be variable, as in the above example.
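
The free-format behaviour described above can be sketched in Python as follows. This is only an illustration, not FIT2D's implementation: the function name `read_free_format_image` is hypothetical, and it is an assumption of the sketch that pixels are filled along X first and then row by row in Y.

```python
import re


def read_free_format_image(text, x_pixels, y_pixels, lines_to_ignore=0):
    """Build a y_pixels by x_pixels image from free-format numbers in text.

    Tokens are separated by commas, spaces and TABs. Every token that can be
    converted to a number defines the next pixel value; other tokens are
    skipped. The fill order (X fastest) is an assumption of this sketch.
    """
    values = []
    for line in text.splitlines()[lines_to_ignore:]:
        for token in re.split(r"[,\s]+", line.strip()):
            if not token:
                continue
            try:
                values.append(float(token))
            except ValueError:
                continue  # token is not a number: skip it
    needed = x_pixels * y_pixels
    if len(values) < needed:
        raise ValueError("file holds %d values, %d needed" % (len(values), needed))
    return [values[row * x_pixels:(row + 1) * x_pixels] for row in range(y_pixels)]


sample = """\
Test File for 2-D input
1.0 2.0 3.0
3.0 2.0
3.0
4.0
5.9
"""
# A 3 x 2 image from the sample file above, ignoring the one-line header.
image = read_free_format_image(sample, x_pixels=3, y_pixels=2, lines_to_ignore=1)
```

Because each convertible token simply becomes the next pixel, the varying number of values per line in the sample file does not matter.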

`BINARY (UNFORMATTED)`

Unformatted binary data can be input provided the user knows the size of the data image. Presently single byte integer, ``Integer*2'', ``Integer*4'' and ``Real*4'' data can be input using this option. The single byte, ``Integer*2'' and ``Integer*4'' data may be signed or unsigned, and either byte order may be treated. FIT2D prompts for the data size and for the byte order and signed/unsigned options.

If the size of the data is not known it may be possible to set a small value for the Y-direction and vary the X-direction until the image wraps properly (provided of course it is possible to identify a true image). Similarly using the wrong byte order should produce very strange results when the image is viewed.

(It should be noted that the GUI input form can additionally specify an offset from the start of the file in order to jump over headers of known length.)

The following example shows a binary image of 1242 $\times$ 1152 pixels being input in 2-byte unsigned integers, without byte swapping:

Main menu: ENTER COMMAND [INPUT DATA]:input
FILE FORMAT [PRINCETON CCD FORMAT]:bin
INPUT FILE NAME [no_data.dat]:grid.bin
WARNING: No '.info' file, so unknown image size, and pixel sizes
         (Pixel sizes defaulted to 100 x 100 microns.)
X NUMBER PIXELS (Range: 1 to 10000000) [1242]:
Y NUMBER PIXELS (Range: 1 to 10000000) [1152]:
DATA TYPE [INTEGER (2-BYTE)]:?
4-BYTE INTEGER: Signed or unsigned, either byte order
BYTE VALUES: Single signed or unsigned integers
INTEGER (2-BYTE): Signed or unsigned either byte order
REAL (4-BYTE IEEE): Floating point real numbers
DATA TYPE [INTEGER (2-BYTE)]:
PERFORM BYTE SWAPPING [NO]:n
SIGNED DATA [NO]:n
Main menu: ENTER COMMAND [IMAGE]:plo
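
To make the byte swapping and signed/unsigned options concrete, here is a minimal Python sketch of decoding a headerless block of integer pixels. It is an illustration under stated assumptions, not FIT2D's code: the function name `decode_integer_image` is hypothetical, "byte swapping" is taken to mean reading the data in the opposite byte order to that of the host machine, and only integer data types are covered.

```python
import sys


def decode_integer_image(raw, x_pixels, y_pixels, bytes_per_pixel=2,
                         byte_swap=False, signed=False):
    """Decode raw bytes into a list of rows of integer pixel values."""
    expected = x_pixels * y_pixels * bytes_per_pixel
    if len(raw) < expected:
        raise ValueError("need %d bytes, got %d" % (expected, len(raw)))
    # Without swapping, the data is read in the host's own byte order.
    order = sys.byteorder
    if byte_swap:
        order = "big" if order == "little" else "little"
    image = []
    for row in range(y_pixels):
        start = row * x_pixels * bytes_per_pixel
        image.append([
            int.from_bytes(
                raw[start + i * bytes_per_pixel:start + (i + 1) * bytes_per_pixel],
                order, signed=signed)
            for i in range(x_pixels)
        ])
    return image
```

Reading the same two bytes with and without swapping gives different values (for example 258 and 513 for the bytes 01 02), which is why an image read with the wrong byte order looks very strange when displayed.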

`BSL FORMAT`

The BSL format is essentially the same as the OTOKO format developed at Hamburg. It consists of an ASCII header file and a number of binary image data files. The images are stored in floating point reals. (This may cause problems when files are moved between systems using non-IEEE and IEEE floating point reals.)

This may also cause problems between little-endian and big-endian systems since the endianness of the stored data is not recorded.

FIT2D will open the header file and obtain information concerning the images which may be input. The user will be prompted for the number of the image to input.

`CHIPLOT FORMAT`

The standard $\chi$ PLOT data format is a very simple format allowing input of one or more data-sets together with title and axis labels, and accompanying text. Error estimates may be defined.

The file should be a formatted variable length record ASCII file. Such a file may be created with an editor, or a BASIC program using PRINT # statements, or a Fortran program using formatted WRITE statements.

The minimum file format is as follows:

Title and Axes Labels The first three lines contain character strings (i.e. text) that will be used to label the graph. The first line is the title of the graph, the second is the label to be written for the X-axis, and the third line is the label for the Y-axis.

Number of Data points and Data-Sets The fourth line specifies how many data points are present in each of the data-sets, and how many data-sets to input. This line should contain two integers separated by one or more spaces or a comma. If the second integer is missing it is assumed that only one curve is required.

The Data Points The next '*' lines of the file should contain the data to be plotted; where '*' is the number of data points specified in line four. Each line should contain the value for the X-coordinate, and the values of the Y-coordinates for the different curves. (Note: All the curves share the same X-coordinates.) e.g. If only one curve is to be drawn each line should have two values, the values being separated by a blank space or a comma. Such a line may be as follows:

1 30.3

or equivalent

1.0, 3.03e1

Here each line represents the coordinate (1.0, 30.3). If a number does not contain a decimal point it is assumed to be a whole number.

The following example shows a file which produces a single curve. Note on line four of the file the number of curves is not specified so one is assumed.

Figure 1. SINGLE DATA-SET
X AXIS UNITS
Y AXIS UNITS
10
1 5e-2
2., 5.9e-2
3. 6.7e-2
4 6.8e-2
5 6.6e-2
6 5e-2
7 4.1e-2
8 2e-2
9 0.6e-2
10 -2.6e-2

ASCII X/Y column output from other programs can be easily converted to this format for input.
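
The minimum file layout described above is simple enough to parse in a few lines. The following Python sketch is an illustration only (the function name `parse_chiplot` and the returned dictionary keys are hypothetical, and error estimates are not handled):

```python
import re


def split_numbers(line):
    """Split a line on commas and white space into non-empty tokens."""
    return [tok for tok in re.split(r"[,\s]+", line.strip()) if tok]


def parse_chiplot(text):
    """Parse the minimum CHIPLOT layout: 3 label lines, a counts line, data."""
    lines = text.splitlines()
    title, x_label, y_label = (line.strip() for line in lines[:3])
    counts = [int(tok) for tok in split_numbers(lines[3])]
    n_points = counts[0]
    n_sets = counts[1] if len(counts) > 1 else 1  # one curve if not given
    x = []
    y_sets = [[] for _ in range(n_sets)]
    for line in lines[4:4 + n_points]:
        numbers = [float(tok) for tok in split_numbers(line)]
        x.append(numbers[0])  # all curves share the same X-coordinates
        for k in range(n_sets):
            y_sets[k].append(numbers[1 + k])
    return {"title": title, "x_label": x_label, "y_label": y_label,
            "x": x, "y": y_sets}


example = """\
Figure 1. SINGLE DATA-SET
X AXIS UNITS
Y AXIS UNITS
3
1 5e-2
2., 5.9e-2
3. 6.7e-2
"""
data = parse_chiplot(example)
```

Here `data["x"]` holds the three X-coordinates and `data["y"][0]` the single curve, since line four gives no number of data-sets.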

`DIP-2000 (Mac Science)`

The DIP-2000 Mac Science scanner produces images of 2500 $\times$ 2500 pixels with each pixel intensity stored in two bytes. As the scanner uses two different photo-multiplier tubes the ``gain'' is different for different pixels.

The intensity values are decoded according to the following scheme:

If $I_e \geq 0$: $I_d = I_e$

Where $I_e$ are the intensity values of the input encoded pixels, and $I_d$ are the intensities of the corrected values.

The user is asked for the name of the file to input, and whether byte swapping is necessary. For a Sun, Silicon Graphics, or HP workstation byte swapping will generally not be necessary.

(Note: The orientation of the image is not necessarily correct at present. When the input orientation is correctly known, this input may be adjusted to conform with the normal ESRF image display convention (looking from the sample towards the detector).)

`ESRF DATA FORMAT`

FIT2D now contains its own code to input a limited sub-set of the so-called ``ESRF data format''. Unfortunately the format is very poorly defined, and in practice is defined more by the software which writes the data than by the documentation, the two being completely different in a number of fundamental aspects. This option should only be used on the understanding that it is unsupported (unsupportable).

This routine should hopefully be able to input files produced by the SAXS end-station of ESRF Beam-line 4. However, some older files may fail owing to 4-byte floating point numbers being written across word boundaries. If this happens a warning message will explain the problem, and inform the user of the number of spaces which should be added to the header to make the file conform to word boundaries. These should be added at the end of the header section before the curly bracket (}).

Only files with all the header information occurring before the binary image data are ``supported'', and no data compression is supported.

As one file can potentially contain many images, the user is prompted for the number of the required image.

`FUJI BAS-2000`

The Fuji BAS-2000 image plate scanners produce an ASCII header file and a binary data file. This option will read the header file and find out the size of the image and parameters necessary to convert the stored logarithmic scaled data to a linear scale. The binary data is input and automatically converted to the linear scale.

This input option will also work for BAS-1500 scanners.

`GAS 2-D DETECTOR (ESRF)`

WARNING: The data format used to output data from the ESRF 2-D Gas-filled detectors is poorly defined and may change. This option is believed to work, but no guarantee is given that it is either presently correct, or will continue to be correct. Please check input data carefully. If problems are encountered, or the format changes FIT2D will be modified accordingly.

This format is used to input data directly as produced by the 2-D Gas-filled multi-wire proportional counter detectors used at the ESRF (the detectors themselves are produced by André Gabriel at the EMBL Grenoble). The data is output within two files: an ASCII header file describing the size, type of data, and the number of images held within a binary image file, and the binary image file itself. The binary image file is generally referred to as the ``histogramming memory'', and may contain many images.

The user is prompted for the file name of the header file:

INPUT HEADER FILE NAME [wes019-004]:wes019-004

FIT2D checks that the header file exists and is a valid header file. FIT2D outputs the number of images in the file, the size of the images and the number of bits used to store each pixel value. The program array size must be big enough to input an entire image.

If more than one image is contained in the binary image file FIT2D prompts for the number of the image to be input. The choice of byte-swapping and of inputting signed or unsigned data is given. Since the data is written by a Motorola 68000 series processor the raw data is written in big endian byte order. Byte-swapping should not be necessary on HP, Sun, and Silicon Graphics workstations, but should be necessary on Vax workstations (and on PCs).

From the header file name the name of the binary image file is constructed (hm is concatenated to the header file name). The data is input in the format defined in the header file.

This is an example of the FIT2D log for the input of a gas-filled detector image:

Main menu: ENTER COMMAND [INPUT DATA]:
FILE FORMAT [FIT2D STANDARD FORMAT]:gas
INPUT HEADER FILE NAME [wes019-004]:wes019-004
INFO: The file contains one image (frame) of   1024 *   1024 pixels.
      Each pixel is stored using 16 bits.
PERFORM BYTE SWAPPING [NO]:?
Enter ``YES'' to swap the byte order on input. Normally byte swapping
should not be necessary for HP, Sun, and Silicon Graphics
workstations. It will normally be necessary for VAX workstations.
PERFORM BYTE SWAPPING [NO]:
SIGNED DATA [NO]:

`HAMAMATSU PHOTONICS`

The HAMAMATSU PHOTONICS format inputs data from the Hamamatsu Photonics K.K. C4880 CCD cameras which are used at the Photon Factory for the X-ray image intensifier/CCD read-out detectors. This is an Integer*2 binary data format described by a short header. Any byte swapping which needs to be performed will be carried out automatically.

If the image is too big for the current program arrays, then a warning message will be produced and only part of the image will be input.

`IMAGEQUANT`

This is the file format used by the Molecular Dynamics Imaging Plate scanner. It is a TIFF based file format, but does not strictly follow the TIFF standard. The Molecular Dynamics scanner produces images from A4 imaging plates which are 1152 $\times$ 1482 pixels for the coarse resolution scan (176 $\mu$ m), and 2304 $\times$ 2964 pixels for the fine resolution scan (88 $\mu$ m).

The Molecular Dynamics scanner produces and displays the image as seen from behind the detector, but the ESRF standard is for images as seen from the sample. Thus the image is left to right reversed when viewed by FIT2D as compared to the PC.

FIT2D offers the possibility to input only a region of the total image, and the possibility to re-bin pixels on input to make the very large images smaller and more manageable. An example prompt and user input is as follows:

DATA FILE NAME [lysip1.gel]:data.gel
X REBIN NUMBER (Range: 1 to 1152) [1]:2
Y REBIN NUMBER (Range: 1 to 1482) [1]:2
LEFT-HAND PIXEL OF IMAGE REGION [1]:500
LOWER PIXEL OF IMAGE REGION [1]:500
RIGHT-HAND PIXEL OF IMAGE REGION (Range: 500 to 1152) [1152]:1000
UPPER PIXEL OF IMAGE REGION (Range: 500 to 1482) [1482]:1000
INFO: Full image size =    1152 *    1482 pixels
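
The combination of region selection and re-binning can be sketched in Python as below. This is an illustration, not FIT2D's code: the function name `rebin_region` is hypothetical, the region limits are taken to be 1-based, inclusive, and expressed in unbinned pixels (as the ranges in the prompts suggest), and it is an assumption of this sketch that re-binned pixels are the sum of the input pixels and that incomplete blocks at the right-hand and upper edges are dropped.

```python
def rebin_region(image, left, lower, right, upper, x_rebin=1, y_rebin=1):
    """Cut a region out of image[y][x] and re-bin it by summing blocks.

    Region limits are 1-based and inclusive. Summation (rather than
    averaging) and dropping of incomplete edge blocks are assumptions.
    """
    nx_out = (right - left + 1) // x_rebin
    ny_out = (upper - lower + 1) // y_rebin
    out = []
    for j in range(ny_out):
        row = []
        for i in range(nx_out):
            total = 0
            for dy in range(y_rebin):
                for dx in range(x_rebin):
                    y = lower - 1 + j * y_rebin + dy
                    x = left - 1 + i * x_rebin + dx
                    total += image[y][x]
            row.append(total)
        out.append(row)
    return out


# A 4 x 4 test image re-binned 2 x 2 over its full area.
test_image = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
small = rebin_region(test_image, 1, 1, 4, 4, x_rebin=2, y_rebin=2)
```

With the values from the prompt above (region 500 to 1000 in both directions, re-binning by 2), this scheme would give a 250 $\times$ 250 pixel output, since the 501st row and column do not fill a complete block.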

`MAR RESEARCH FORMAT / NEW MAR CODE`

The MarResearch on-line image plate scanners produce a number of different file formats. The raw output from the scanner is a spiral read-out scan. This is not suitable for input to FIT2D, but usually this is immediately transformed to a Cartesian raster file.

The original file format contained a fixed length header of one record, followed by the data in a simple binary format, and maybe a number of overload records at the end of the file. FIT2D will input this format including information such as the sample to detector distance from the header and the overload records.

A new format now exists and there is the possibility to store compressed data. From FIT2D V9.114 support exists to input these new files, including the compressed data.

(Note: At present the orientation of the images may still be wrong, since there appears to be a complicated historical scheme where different sized images are stored in a different manner.)

`PMC FORMAT`

This is data from the Photometrics CCD camera, but which has been compressed using the program pmi2pmc. pmi2pmc is a program which can run on a PC or workstation which keeps the same header (presently the first 172 bytes) but compresses the image data according to a simple compression algorithm, with the value of the first image pixel being stored immediately after the header bytes.

pmi2pmc is designed to allow loss-less data compression at source prior to data transfer over ethernet. A typical transfer time is 20 seconds per uncompressed image, and the compressed images may be nearly one third of the size of the raw data. Thus, the transfer may be much faster if the data can be compressed on the PC. Once the data is compressed it is also faster to read it in its compressed form, so FIT2D has this input option.

pmi2pmc does not change the byte raster order, which is normally from left to right as seen by a camera-man, and from top to bottom. On input FIT2D will correct the image so that it is seen from the crystal side and the byte order is from bottom to top.
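
As a rough illustration of this kind of orientation correction, the Python sketch below reverses the row order (top-to-bottom becomes bottom-to-top) and mirrors each row (a view from behind becomes a view from the other side). The function name `flip_to_sample_view` is hypothetical, and whether FIT2D applies exactly these two operations is an assumption based on the description above.

```python
def flip_to_sample_view(rows):
    """Reverse the row order and mirror each row left to right."""
    return [list(reversed(row)) for row in reversed(rows)]


# Rows are listed in storage order; the first stored row ends up last.
corrected = flip_to_sample_view([[1, 2], [3, 4]])
```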

The raw data can suffer from appreciable spatial distortion and non-uniformity of response. Commands within the CALIBRATION sub-menu exist to help calibrate and correct these effects (see Section 16).

(Note: At present the orientation of the CCD camera can change which will also change the sense of the output image. Soon the mechanical holder should stop this uncertainty.)

`PRINCETON CCD FORMAT`

This is the file format produced by the Princeton CCD camera; one of three CCD cameras used with the ESRF X-ray Image Intensifier systems. It is a simple binary format with a 4100 byte header followed by signed or unsigned two byte integer data, signed 4-byte integer data, or IEEE floating point real data. There is an old version of the format and a newer, similar but slightly different version. Both are automatically recognised and input (V9.104). All input types are now supported (V9.136). The full size image of the present ESRF systems is 1242 $\times$ 1152 pixels, but smaller images may be produced (the header contains the size information).

The Princeton camera and software produce and display the image as seen from behind the detector, but the ESRF standard is for images as seen from the sample. Thus the image is left to right reversed when viewed by FIT2D as compared to the PC. The orientation may be changed by the PC software and by the actual orientation of the CCD camera. FIT2D will display up with the same sense as up in the PC images, but this may or may not be up in the sense of the experiment.

If more than one image has been stored in a file, FIT2D will output a prompt asking for the required image (V7.32), e.g.:

IMAGE NUMBER (Range: 1 to 3) [1]:

`TIFF`

This allows input of simple TIFF files. The TIFF ``standard'' allows many different options, many of which are not supported. Simple 16 bit per pixel and 8 bit per pixel formats are presently supported. Note: no image compression is presently supported. This format has been added primarily to allow input of data from the EMBL Drum scanner which is used for neutron diffraction at the ILL.

NOTE: This option should not be used for input of data from the Molecular Dynamics 400E scanner. The option IMAGEQUANT should be used instead (see Section 15.51.11).

DATA FILE NAME [test.tiff]:data.tiff
INFO: Full image size =    4000 *    4000 pixels
X REBIN NUMBER (Range: 1 to 4000) [1]:2
Y REBIN NUMBER (Range: 1 to 4000) [1]:2
LEFT-HAND PIXEL OF IMAGE REGION [1]:500
LOWER PIXEL OF IMAGE REGION [1]:500
RIGHT-HAND PIXEL OF IMAGE REGION (Range: 500 to 4000) [4000]:3000
UPPER PIXEL OF IMAGE REGION (Range: 500 to 4000) [4000]:3000

Note: If TIFF files need to be input into FIT2D which use options not presently supported, please inform me and I will try to cater for them.

`UNKNOWN`

This option will try to deduce the data format purely from byte to byte and then pixel to pixel correlations within the data. This cannot work with compressed data.

With raw non-compressed data the bytes should show regular correlations so that it is possible to estimate with a good success rate the number of bytes used in each pixel.

Having determined the number of bytes in a pixel it is relatively easy for integer data to deduce the byte ordering since the lower significant bytes tend to change more rapidly than the high significant bytes.
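
The byte-order argument above can be illustrated for 2-byte integer data with a small Python sketch. It is not FIT2D's algorithm: the function name `guess_byte_order` and the use of the summed absolute pixel-to-pixel difference as the measure of "changing rapidly" are assumptions made for this example.

```python
def guess_byte_order(raw):
    """Guess the byte order of 2-byte integer pixels held in raw bytes.

    The byte position whose values vary most from pixel to pixel is taken
    to be the least significant byte.
    """
    first = raw[0::2]   # first byte of every pixel
    second = raw[1::2]  # second byte of every pixel

    def variation(values):
        return sum(abs(a - b) for a, b in zip(values, values[1:]))

    # Low byte stored first means little-endian storage.
    return "little" if variation(first) > variation(second) else "big"


# A smooth ramp of intensities, stored in both byte orders.
ramp = [1000 + 3 * i for i in range(200)]
little = b"".join(v.to_bytes(2, "little") for v in ramp)
big = b"".join(v.to_bytes(2, "big") for v in ramp)
```

For the smooth ramp used here the low byte changes at every pixel while the high byte changes only a few times, so the two orders are easily told apart; very noisy or nearly constant data would give a less reliable answer.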

The number of pixels in a row can then be determined by peaks in the 1-D linear autocorrelation function, although very structured images can lead to false results.
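
A minimal Python sketch of the autocorrelation idea is given below. It is an illustration only: the function name `guess_row_length` is hypothetical, and the need to supply a minimum lag (to step over the strong correlation between neighbouring pixels in smooth images) is a simplification made for this example rather than a description of FIT2D's method.

```python
def guess_row_length(pixels, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation.

    pixels is the image as one long 1-D sequence. The mean is subtracted
    first; sums are not normalised, so of several equally good multiples of
    the row length the shortest lag scores highest.
    """
    n = len(pixels)
    mean = sum(pixels) / n
    centred = [p - mean for p in pixels]
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, min(max_lag, n - 1) + 1):
        score = sum(centred[i] * centred[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Eight rows of 16 pixels with a bright vertical line in column 6.
line_image = ([0] * 5 + [100] + [0] * 10) * 8
row_length = guess_row_length(line_image, min_lag=2, max_lag=40)
```

For this test image the strongest peak is at a lag of 16, the true row length. Images with strong repeating structure within a row could produce a peak at a shorter lag, which is the kind of false result mentioned above.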

Finally visual inspection can determine the start of the image.

`USER INTENSITIES`

Data values may be input interactively to form a 1-D line of data values. This can be very useful for quickly calculating statistical quantities using the STATISTICS command (see Section 15.109) after the data has been defined. Data values can also be easily entered and plotted using the PLOT command (see Section 15.69).

The program prompts continually for more data values; the user must use ``user escape'' to terminate entry of data values. ``User escape'' is two backslashes (\\). After all the data values have been entered the user may change the entered values to correct mistakes. A typical prompt and user input session may be as follows:

INFO: Continue entering numbers as required, then use "USER ESCAPE"
INFO: "USER ESCAPE" is double backslash (\\)
ENTER DATA VALUE:10.5
ENTER DATA VALUE:15
ENTER DATA VALUE:1.1e2
ENTER DATA VALUE:.45
ENTER DATA VALUE:13.6
ENTER DATA VALUE:\\
DATA VALUE TO CHANGE (0 = quit) (Range: 0 to 5) [0]:4
ENTER NEW DATA VALUE [0.450000]:45.0
DATA VALUE TO CHANGE (0 = quit) (Range: 0 to 5) [5]:0

Where ``user escape'' was entered for the last ENTER DATA VALUE prompt.

Andrew Hammersley
2004-01-09