8: Integrating raw data

How to integrate (or dis-integrate!) EISCAT raw data.

While data recorded at the basic time resolution of the EISCAT systems (typically 5 or 10 seconds) can be useful for studying high time-resolution phenomena, they are often too noisy for routine data analysis and therefore have to be post-integrated to time scales of 1 minute or greater.
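As a rough illustration of what time-based post-integration does, the sketch below groups dumps into fixed time bins and averages them. It is not the code used by any EISCAT program: the function name, the representation of a dump as a `(start_time, values)` pair, and the choice to average rather than sum are all assumptions made for the example.

```python
from collections import defaultdict

def post_integrate(dumps, period_s):
    """Average dumps into fixed time bins of period_s seconds.

    dumps: iterable of (start_time_s, values) pairs, where values is a
    list of numbers standing in for one raw data dump (hypothetical layout).
    Returns a time-ordered list of (bin_start_s, mean_values, n_dumps).
    """
    bins = defaultdict(list)
    for start_time, values in dumps:
        bin_start = int(start_time // period_s) * period_s
        bins[bin_start].append(values)

    integrated = []
    for bin_start in sorted(bins):
        group = bins[bin_start]
        n = len(group)
        # Average each data point across all dumps in the bin.
        mean = [sum(column) / n for column in zip(*group)]
        integrated.append((bin_start, mean, n))
    return integrated

# Twelve 5-second dumps become one 60-second record.
dumps = [(5 * i, [float(i), 2.0]) for i in range(12)]
result = post_integrate(dumps, 60)
```

Keeping the number of dumps alongside each integrated record is useful later, since the reliability of any measured variance depends on how many dumps went into it.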

Alternatively, some experiments whose data are dumped at 5- or 10-second resolution contain "time slices", in which data from sequential sub-periods within the integration are written to different parts of the result memory. For example, an experiment with a 5 second pre-integration period could contain 10 identically formatted data blocks, one from each of the 0.5 second intervals within the pre-integration. Because of this, one might want to split up the data dump before doing further analysis. We sometimes refer to this as "disintegration"!
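The splitting described above can be sketched as follows. This is only an illustration under stated assumptions: the slices are taken to be stored contiguously and to have equal length, which will not hold for every experiment, and the function name is hypothetical.

```python
def split_time_slices(dump, n_slices, pre_integration_s):
    """Split one dump into n_slices equal, contiguous blocks.

    Returns a list of (offset_s, block) pairs, where offset_s is the start
    of the slice relative to the start of the pre-integration period.
    Assumes contiguous, equal-length slices (an assumption, not a rule).
    """
    if len(dump) % n_slices != 0:
        raise ValueError("dump length is not a multiple of the slice count")
    size = len(dump) // n_slices
    slice_s = pre_integration_s / n_slices
    return [(i * slice_s, dump[i * size:(i + 1) * size])
            for i in range(n_slices)]

# A 5-second dump holding 10 slices of 3 data points each.
dump = list(range(30))
slices = split_time_slices(dump, n_slices=10, pre_integration_s=5.0)
```

Each block can then be treated as a separate 0.5 second dump in later processing.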

Data Integration

One possibility, if you are only interested in reducing your data to plasma parameters, is to do the data integration internally within the GUISDAP analysis program. Section 10 describes how to do this. However, it is often better to prepare an integrated data set before analysing it, because the stand-alone integration programs offer more flexibility than GUISDAP, and because having an explicit copy of the integrated data allows the user to do more with it, such as plotting spectra.

We offer two programs for integrating raw data: integrate2 and intpipe.

Integrate2

The integrate2 program is written in python, and designed to handle matlab-format raw data, i.e. raw data from the ESR or post-renovation mainland radars. The program outputs integrated data, also in matlab format, which can be used (among other things) as input to the GUISDAP analysis program.

Integrate2 can be run from the command line, and can also be configured to take input from an idef file. In normal operation, it is a stand-alone program, not used as part of a unix pipe.

In order to run integrate2, it is necessary to specify the format and location of the input and output files, and the strategy to be used to integrate them. Simple integration strategies, such as time-based or scan-based integration, can be done from the command line, e.g.:

NB. You need to type the curly brackets in this command line, and substitute your own input and output directories for the ones shown. The output directory must already exist.

integrate2 -s "gupmat { dir /stager/c/ian/test/input;}" -i "integrate time { period 5:00;} output gup { dir /stager/c/ian/test/output;}"

The above command takes matlab-format raw data from the directory /stager/c/ian/test/input, and produces five-minute integrated data in matlab format in the directory /stager/c/ian/test/output. These data can then be analysed using GUISDAP.
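If the command is launched from a script, the two requirements noted above (literal curly brackets and a pre-existing output directory) are easy to handle in code. The sketch below builds the same command as an argument list; the helper names are hypothetical, and using period values other than the 5:00 shown above is an assumption about the syntax.

```python
import os
import shlex
import subprocess

def integrate2_time_args(input_dir, output_dir, period="5:00"):
    """Build the argument list for a time-based integrate2 run."""
    source = "gupmat { dir %s;}" % input_dir
    strategy = ("integrate time { period %s;} "
                "output gup { dir %s;}" % (period, output_dir))
    return ["integrate2", "-s", source, "-i", strategy]

def run_time_integration(input_dir, output_dir, period="5:00"):
    """Create the output directory if needed, then run integrate2."""
    os.makedirs(output_dir, exist_ok=True)
    subprocess.run(integrate2_time_args(input_dir, output_dir, period),
                   check=True)

args = integrate2_time_args("/stager/c/ian/test/input",
                            "/stager/c/ian/test/output")
print(shlex.join(args))  # show the command without running it
```

Passing the arguments as a list means the shell never sees the braces or semicolons, so no extra quoting is needed.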

To integrate on antenna movement (i.e. to add together the data dumps made in each position while the antenna was stationary, but to throw away dumps made while it was moving between positions), the following syntax can be used:

integrate2 -s "gupmat { dir /stager/c/ian/test/input;}" -i "integrate move {} output gup { dir /stager/c/ian/test/output;}"
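The logic of movement-based integration can be sketched as below. This is not how integrate2 is implemented: in particular, the assumption that every dump carries an explicit "moving" flag alongside its azimuth and elevation is made purely for the example, as are the function name and the use of averaging.

```python
def integrate_on_move(dumps):
    """Average consecutive stationary dumps; discard dumps made while moving.

    dumps: time-ordered list of (azimuth, elevation, moving, values) tuples
    (hypothetical layout). Returns a list of (position, mean_values, n_dumps).
    """
    groups = []
    current = None  # (position, list of value lists) for the open group

    for az, el, moving, values in dumps:
        if moving:
            # A moving dump closes the open group and is thrown away.
            if current is not None:
                groups.append(current)
                current = None
            continue
        position = (az, el)
        if current is None or current[0] != position:
            if current is not None:
                groups.append(current)
            current = (position, [])
        current[1].append(values)

    if current is not None:
        groups.append(current)

    return [(pos, [sum(col) / len(vals) for col in zip(*vals)], len(vals))
            for pos, vals in groups]

dumps = [
    (180.0, 30.0, False, [1.0, 2.0]),
    (180.0, 30.0, False, [3.0, 4.0]),
    (185.0, 30.0, True,  [9.0, 9.0]),   # antenna moving: discarded
    (190.0, 30.0, False, [5.0, 6.0]),
]
result = integrate_on_move(dumps)
```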

integrate2 can also be used as part of a unix pipe, provided that it is told to expect input from stdin, e.g.

find {directory name} | sort | integrate2 -s "gupmat {stdin;}" -i "integrate move {} output gup { dir /stager/c/ian/test/output;}"

It can also read input from a matlab filelist (filelist.dat) in a data directory, i.e.

integrate2 -s "gupmat { dir /stager/c/ian/test/input; filelist;}" -i "integrate move {} output gup { dir /stager/c/ian/test/output;}"

More complex integration strategies can be specified using idef files, which have the same format as for intpipe. Quite complex strategies can also be specified on the command line, but the command lines become unwieldy as the complexity grows.

The integrate2 utility is fully documented at the Leicester website.

Intpipe

The C-based program intpipe is fully compatible with LDR-format data from the pre-renovation mainland system, and has the added advantage that integration done using intpipe can be linked to other commands via unix pipes.

The intpipe routine takes input in integrated data format, and by default outputs integrated data in the same format. It requires an "integration definition file", also known as an "idef" file, which specifies how the integration should be done. Idef files are human-readable, and are easy to edit and re-use. The directory /soft/eiscat/etc/integrate includes a number of example idef files, covering time-bound integration (such as fixed_5min_uhf.idef) and scan-type integration (see scan_uhf.idef). In scan-type integration, the data are integrated while the radar is stationary, and dumps taken while the radar is moving are discarded.

Full documentation of all the integration programs available for LDR-format data is available here.

The syntax for producing a file in integrated data format, integrated to the user's specifications using an idef file, is:

getLR < {catalogue file} | LDR2int | intpipe -id {idef file} > {integrated data file}

Because intpipe is designed to take piped input and to write output suitable for a unix pipe, the integrated data it produces can be piped directly into other utilities, such as the RAL Analysis Program. Unix pipes have already been discussed in Section 7 and explicit examples of their use in analysis are given in Section 10.

Intpipe can also be used where a pre-existing integrated data file or LDR-file is being re-integrated at coarser time resolution. The syntax for using this utility is:

intpipe -id {idef file} < {input integrated data file} > {output integrated data file}
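When many files have to be processed, it can be convenient to generate the two intpipe command lines above from a script. The sketch below only assembles the pipeline strings, using the commands exactly as given in this section; the helper names and the example file names run1.cat, run1.int and run1_5min.int are hypothetical.

```python
import shlex

def ldr_integration_pipeline(catalogue_file, idef_file, output_file):
    """Return the shell pipeline that integrates LDR-format data."""
    return "getLR < %s | LDR2int | intpipe -id %s > %s" % (
        shlex.quote(catalogue_file),
        shlex.quote(idef_file),
        shlex.quote(output_file),
    )

def reintegration_pipeline(idef_file, input_file, output_file):
    """Return the shell command that re-integrates an integrated data file."""
    return "intpipe -id %s < %s > %s" % (
        shlex.quote(idef_file),
        shlex.quote(input_file),
        shlex.quote(output_file),
    )

idef = "/soft/eiscat/etc/integrate/fixed_5min_uhf.idef"
first_pass = ldr_integration_pipeline("run1.cat", idef, "run1.int")
second_pass = reintegration_pipeline(idef, "run1.int", "run1_5min.int")
```

The resulting strings can be written to a shell script or handed to a shell for execution; `shlex.quote` protects any file names that contain spaces.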

An important note about variances

The most common usage of the integration programs is to provide input for the data analysis programs (GUISDAP or RAL Analysis). It is very important that this input (i.e. the integration output) conforms to the format that the analysis programs expect, which is:

  • For GUISDAP - matlab-format files, which if integrated beyond the basic pre-integration should contain the measured variance of the data.
  • For RAL Analysis (or Powerprofile) - integrated data format files, not containing the measured variance, since this is calculated internally to the program.

If you ask GUISDAP to analyse the data using measured variances (see Section 10) but provide it with input data which do not contain the variance information, then the program will crash. For integrations containing only 2 or 3 data dumps, the measured variance of the data is unlikely to be a good indicator of the true variance. In these circumstances, it might be better to ask GUISDAP to use calculated theoretical variances, but be warned that this can make the program very slow!
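To see why a measured variance based on only 2 or 3 dumps is unreliable, consider the standard sample statistics for a single data point across the dumps in one integration. The sketch below is a generic illustration, not the calculation performed inside GUISDAP or integrate2; the function names are hypothetical, and the error estimate assumes Gaussian-distributed data.

```python
import math

def mean_and_variance_of_mean(samples):
    """Return the mean of the samples and the measured variance of that mean."""
    n = len(samples)
    if n < 2:
        raise ValueError("at least two dumps are needed to measure a variance")
    mean = sum(samples) / n
    # Unbiased sample variance of the individual dumps.
    sample_var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return mean, sample_var / n

def relative_error_of_variance(n):
    """Approximate relative uncertainty of a variance measured from n
    Gaussian samples: sqrt(2 / (n - 1))."""
    return math.sqrt(2.0 / (n - 1))

mean, var_of_mean = mean_and_variance_of_mean([1.0, 2.0, 3.0])
err_3_dumps = relative_error_of_variance(3)    # about 100 per cent
err_12_dumps = relative_error_of_variance(12)  # about 43 per cent
```

With three dumps the measured variance is uncertain by roughly its own size, which is why theoretical variances may be preferable for very short integrations.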

Data dis-integration

As discussed above, it is sometimes necessary to divide data dumps before further processing, for example in cases where the data contain "time slices" corresponding to the storage of results from consecutive subsets of the pre-integration period. Because of the variety of time-slicing techniques used in EISCAT experiments, there is no single piece of software to do this. Some discussion of the best way to approach this problem, and the experiments for which it is appropriate, is given on the data disintegration page.