CarbonTracker Lagrange Documentation
Stage 1 - Creating H Slices
H Slices
The footprint data from the NetCDF footprint files is extracted for each receptor and placed in “H slice” files by the script hsplit.py.
Prerequisites
Three things are required before running hsplit.py:
- Make sure a configuration file is created.
- Create a landmask for the region under study.
- Create a receptor file with a list of footprint filenames to use.
Usage
hsplit.py is run with:
python hsplit.py [-c configfile] receptor_file

where
    receptor_file : file containing footprint file names, 1 line per receptor
    -c configfile : configuration file. If not specified, use 'config.ini'.

Required input files:
    landmask_na.npy - Landmask file
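As an illustration of the receptor file layout described in the usage text (one footprint file name per line), the following sketch reads such a file and numbers the receptors by line. The function name `read_receptor_file` and the file names are hypothetical and are not taken from hsplit.py.

```python
import os
import tempfile


def read_receptor_file(path):
    """Return the footprint file names listed in a receptor file, in file order."""
    with open(path) as f:
        return f.read().splitlines()


# Write a small hypothetical receptor file for demonstration.
tmpdir = tempfile.mkdtemp()
receptor_path = os.path.join(tmpdir, "receptors.txt")
with open(receptor_path, "w") as f:
    f.write("footprint_a.nc\nfootprint_b.nc\n")

names = read_receptor_file(receptor_path)

# The position of each name in the list is its 0-based line number in the file.
for obs_num, name in enumerate(names):
    print(obs_num, name)
```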
Description
The objective of hsplit.py is to extract the time-dependent non-zero footprint values from each NetCDF footprint file and store them in multiple files, one for each timestep used in the inversion. To minimize the opening and closing of footprint files, only one pass through the receptor list is made, and data from each footprint file is appended to intermediate text files. When the pass is complete, the text files are converted to NumPy savez files (or optionally FORTRAN compatible binary files) containing the locations and values of the non-zero footprint data. These ‘H slice’ files are sparse: only the indices and values of non-zero grid cells are stored. The grid cell indices are defined relative to the landmask in use, so they can be converted to latitude and longitude if desired.
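The single-pass strategy can be sketched as follows. This is a minimal illustration and not the code in hsplit.py: the function name `append_footprint`, the `Hnnnn.txt` naming of the intermediate files, and the in-memory arrays standing in for NetCDF footprints are all assumptions.

```python
import os
import tempfile

import numpy as np


def append_footprint(outdir, obs_num, footprint):
    """Append one receptor's non-zero footprint values to per-timestep text files.

    footprint: hypothetical array of shape (ntimesteps, nlandcells).
    Each output line holds: observation number, land cell number, value.
    """
    for step, row in enumerate(footprint):
        cells = np.flatnonzero(row)            # indices of non-zero land cells
        if cells.size == 0:
            continue                           # nothing to store for this timestep
        path = os.path.join(outdir, f"H{step:04d}.txt")  # assumed naming
        with open(path, "a") as f:             # append, so each receptor is read once
            for cell in cells:
                f.write(f"{obs_num} {cell} {row[cell]:.6e}\n")


# Two synthetic receptors, 3 timesteps, 4 land cells (stand-ins for NetCDF data).
footprints = [
    np.array([[0.0, 1.5, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0], [2.0, 0.0, 0.0, 0.5]]),
    np.array([[0.0, 0.0, 3.0, 0.0], [0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]),
]

outdir = tempfile.mkdtemp()
for obs_num, fp in enumerate(footprints):      # single pass through the receptor list
    append_footprint(outdir, obs_num, fp)

written = sorted(os.listdir(outdir))
```

In this sketch no file is created for a timestep in which every receptor has an all-zero footprint. Whether hsplit.py behaves the same way is not stated here.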
H slice files
Think of a single H slice file as an array of dimension nobs x nlandcells. Because only non-zero values are wanted, the data is saved as a sparse array that keeps just the indices and values of the non-zero entries rather than the full array. Each line in the intermediate text files has three values: the observation number (i.e. the line number of the footprint file in the receptor file), the land cell number, and the footprint value. These text files are converted to FORTRAN compatible binary files in hsplit.py. The format of the binary files consists of:
- a single 4 byte integer with the number of entries n,
- n 4 byte integers of observation numbers,
- n 4 byte integers of landcell numbers,
- and n 8 byte reals of footprint values.
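The layout listed above can be written and read back with NumPy as in the sketch below. It assumes a raw byte stream in native byte order with no Fortran record markers, which this page does not confirm. The function names `write_hslice` and `read_hslice` are hypothetical.

```python
import os
import tempfile

import numpy as np


def write_hslice(path, obs, cells, values):
    """Write n, then n observation numbers, n landcell numbers and n values."""
    obs = np.asarray(obs, dtype=np.int32)          # 4 byte integers
    cells = np.asarray(cells, dtype=np.int32)      # 4 byte integers
    values = np.asarray(values, dtype=np.float64)  # 8 byte reals
    if not (obs.size == cells.size == values.size):
        raise ValueError("obs, cells and values must have the same length")
    with open(path, "wb") as f:
        np.array([obs.size], dtype=np.int32).tofile(f)  # entry count n
        obs.tofile(f)
        cells.tofile(f)
        values.tofile(f)


def read_hslice(path):
    """Read the arrays back in the same order they were written."""
    with open(path, "rb") as f:
        n = int(np.fromfile(f, dtype=np.int32, count=1)[0])
        obs = np.fromfile(f, dtype=np.int32, count=n)
        cells = np.fromfile(f, dtype=np.int32, count=n)
        values = np.fromfile(f, dtype=np.float64, count=n)
    return obs, cells, values


path = os.path.join(tempfile.mkdtemp(), "H0022.bin")
write_hslice(path, obs=[1, 1, 2], cells=[3, 7, 2], values=[0.25, 1.5, 4.0])
obs, cells, values = read_hslice(path)
```

With n = 3 the file occupies 4 + 3*4 + 3*4 + 3*8 = 52 bytes under these assumptions.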
We also need to know the size of the full, non-sparse array, so in the binary files the last entry is the last observation number, the last landcell number, and either an actual observation value or 0 if no observation value is available. These values are required to expand the sparse array into the full non-sparse array, with 0’s in place of the missing data.
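A minimal sketch of that expansion, assuming the arrays have already been read from a binary file with 1-based indices and that the last entry carries the full array dimensions as described. The function name `expand_hslice` is hypothetical.

```python
import numpy as np


def expand_hslice(obs, cells, values):
    """Expand sparse (obs, cell, value) triplets into a dense nobs x nlandcells array."""
    obs = np.asarray(obs)
    cells = np.asarray(cells)
    values = np.asarray(values, dtype=np.float64)
    # The last entry holds the last observation number and last landcell number,
    # which give the shape of the full array.
    nobs, nlandcells = int(obs[-1]), int(cells[-1])
    full = np.zeros((nobs, nlandcells))
    full[obs - 1, cells - 1] = values   # shift 1-based indices to 0-based
    return full


# The final triplet (3, 4, 0.0) only marks the array size.
full = expand_hslice(obs=[1, 2, 3], cells=[2, 1, 4], values=[1.5, 3.0, 0.0])
```

Here `full` has shape (3, 4), with 1.5 at row 0, column 1 and 3.0 at row 1, column 0. All other cells are 0.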
In the binary files, the observation numbers and land cell numbers are 1-based, i.e. they range from 1 to nobs and from 1 to nlandcells respectively. In the intermediate text files, the observation numbers and cell numbers are 0-based, so they range from 0 to nobs-1 and from 0 to nlandcells-1.
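The shift between the two conventions might look like the following sketch. The sample rows are made up, and the whitespace-separated column layout of the text file is an assumption.

```python
import io

import numpy as np

# Hypothetical intermediate text content: observation number, land cell number, value.
text = io.StringIO("0 1 1.5\n1 2 3.0\n")
rows = np.loadtxt(text, ndmin=2)             # ndmin=2 keeps a 2-D shape for one row

obs_bin = rows[:, 0].astype(np.int32) + 1    # 0-based text index -> 1-based binary index
cells_bin = rows[:, 1].astype(np.int32) + 1
values = rows[:, 2]
```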
Files are named Hnnnn.bin, where nnnn is the zero-padded timestep number, e.g. H0022.bin for timestep 22.
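The naming rule can be expressed with a format string, as in this small sketch. The helper names are hypothetical.

```python
def hslice_filename(timestep):
    """Build the H slice file name for a timestep, zero-padded to four digits."""
    return f"H{timestep:04d}.bin"


def hslice_timestep(filename):
    """Recover the timestep number from a name of the form Hnnnn.bin."""
    return int(filename[1:5])


name = hslice_filename(22)
step = hslice_timestep(name)
```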