Package org.tridas.io.formats.catras

This is a binary format for software written by Aniol, first released in 1983.


Class Summary
CatrasFile  
CatrasReader  
CatrasToTridasDefaults  
CatrasWriter  
TridasToCatrasDefaults  
 

Enum Summary
CatrasToTridasDefaults.CATRASFileType  
CatrasToTridasDefaults.CATRASLastRing  
CatrasToTridasDefaults.CATRASProtection  
CatrasToTridasDefaults.CATRASScope  
CatrasToTridasDefaults.CATRASSource  
CatrasToTridasDefaults.CATRASVariableType  
CatrasToTridasDefaults.DefaultFields  
 

Package org.tridas.io.formats.catras Description

This is a binary format for software written by Aniol, first released in 1983.

Several versions of CATRAS were released over the years; the most recent we have seen is v4.35, released in 2003. It is uncertain whether the file format changed between versions. The code in this library is based on Matlab, Fortran and C code by Ronald Visser, Henri Grissino-Mayer and Ian Tyers.

Reading byte code

Reading byte code is more complicated than reading text files. Each byte is 8 bits and can therefore represent up to 256 values. Depending on the type of information they contain, the bytes are interpreted in one of three ways:

Strings

Some of the bytes in CATRAS files contain character information. In this case each byte represents a single character. In Java an array of bytes can be decoded directly into a string.
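As a minimal sketch of decoding such a run of bytes: the class and method names below are hypothetical, and the US-ASCII character set and the trimming of padding are assumptions, since this description does not say which character set CATRAS uses or how unused characters are filled.

```java
import java.nio.charset.StandardCharsets;

final class CatrasStringSketch {

    // Decodes 'length' bytes starting at 'offset' as text.
    // ASSUMPTION: US-ASCII; the character set used by CATRAS is not documented here.
    // ASSUMPTION: unused positions are padded with spaces or NUL bytes, which trim() removes.
    static String decode(byte[] file, int offset, int length) {
        return new String(file, offset, length, StandardCharsets.US_ASCII).trim();
    }

    public static void main(String[] args) {
        byte[] bytes = {'O', 'A', 'K', '0', '1', ' ', ' ', ' '};
        System.out.println(decode(bytes, 0, bytes.length));
    }
}
```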

Integers

Because a byte can represent only 256 values, CATRAS stores integers as byte pairs. Each pair consists of a least significant byte (LSB) and a most significant byte (MSB). The order in which the two bytes appear in a file varies between platforms and is known as 'endianness'. As CATRAS runs solely on Microsoft (x86) systems, we can safely assume that all CATRAS files are little-endian (i.e. LSB then MSB). Counting in a byte pair therefore works as follows:

LSB   MSB
0     0
1     0
2     0
...   ...
255   0
0     1
1     1
2     1
...   ...
255   255

A byte pair can therefore store 256 x 256 = 65536 distinct values (more than enough for most number fields). Matters are complicated, though, by the need to store negative numbers. In CATRAS, pairs with an MSB below 128 are zero or positive, while pairs with an MSB ranging from 255 down to 128 (counting backwards) represent negative values.
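The byte pair arithmetic can be sketched as follows. The class and method names are hypothetical and this is not necessarily how CatrasReader does it. It assumes the usual 16-bit two's complement layout, so that a pair with an MSB of 128 or more is negative and (255, 255) is -1.

```java
final class CatrasIntSketch {

    // Combines a little-endian byte pair (LSB first, MSB second) into a signed integer.
    // ASSUMPTION: standard 16-bit two's complement, i.e. MSB >= 128 means negative.
    static int fromBytePair(byte lsb, byte msb) {
        int low = lsb & 0xFF;   // Java bytes are signed, so mask to get 0..255
        int high = msb & 0xFF;
        int unsigned = low + high * 256;
        return unsigned >= 32768 ? unsigned - 65536 : unsigned;
    }

    public static void main(String[] args) {
        System.out.println(fromBytePair((byte) 255, (byte) 0));   // 255
        System.out.println(fromBytePair((byte) 0, (byte) 1));     // 256
        System.out.println(fromBytePair((byte) 255, (byte) 255)); // -1
    }
}
```

The same result could be obtained with java.nio.ByteBuffer set to ByteOrder.LITTLE_ENDIAN and getShort(); the explicit arithmetic is shown here only because it mirrors the counting table above.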

Categories

Categories are typically recorded as single bytes, as most categories have just a few possible values. They can therefore be thought of as integers where 0 = first option, 1 = second option, and so on. The exception is species: because there are more than 256 species, a byte pair is used, in exactly the same way as described for integers above. The difficulty with species is that the codes are unique to each laboratory and refer to values enumerated in a separate '.wnm' file. Without this dictionary the species code is of little use.
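A sketch of both cases is given below. The enum ExampleCategory and its values are purely hypothetical placeholders (they are not the values of any of the CATRAS enums in this package), as are the class and method names. Looking the species code up in a '.wnm' dictionary is not shown.

```java
final class CatrasCategorySketch {

    // HYPOTHETICAL category used only for illustration.
    enum ExampleCategory { FIRST_OPTION, SECOND_OPTION, THIRD_OPTION }

    // Maps a single byte to a category: 0 = first option, 1 = second option, etc.
    static ExampleCategory categoryFromByte(byte code) {
        int index = code & 0xFF; // treat the byte as 0..255
        ExampleCategory[] options = ExampleCategory.values();
        if (index >= options.length) {
            throw new IllegalArgumentException("Unknown category code: " + index);
        }
        return options[index];
    }

    // Species codes use a little-endian byte pair, read like any other integer.
    static int speciesCode(byte lsb, byte msb) {
        return (short) (((msb & 0xFF) << 8) | (lsb & 0xFF));
    }

    public static void main(String[] args) {
        System.out.println(categoryFromByte((byte) 1));
        System.out.println(speciesCode((byte) 44, (byte) 1));
    }
}
```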

Dates

Dates are stored as three single bytes: one for day, one for month and one for year. With only 256 values available for 'year', all dates are stored with two-digit years, e.g. 25/12/84. When converting to TRiDaS, all years >70 are treated as 20th century, whereas years <70 are treated as 21st century. This is an arbitrary decision for use in this library, as CATRAS does not care either way.
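The century rule can be sketched as below, with hypothetical class and method names. The text above does not say what happens to a year of exactly 70; treating it as 1970 is an assumption made here only so that the example is complete. The day, month, year order of the arguments simply follows the description.

```java
import java.time.LocalDate;

final class CatrasDateSketch {

    // Converts the three date bytes into a full date.
    // Years above 70 become 19xx and years below 70 become 20xx.
    // ASSUMPTION: a year of exactly 70 is treated as 1970 (not stated in the description).
    static LocalDate fromBytes(byte day, byte month, byte twoDigitYear) {
        int yy = twoDigitYear & 0xFF;
        int year = (yy >= 70) ? 1900 + yy : 2000 + yy;
        // LocalDate.of throws DateTimeException if day or month are out of range
        return LocalDate.of(year, month & 0xFF, day & 0xFF);
    }

    public static void main(String[] args) {
        System.out.println(fromBytes((byte) 25, (byte) 12, (byte) 84)); // 1984-12-25
        System.out.println(fromBytes((byte) 1, (byte) 3, (byte) 3));    // 2003-03-01
    }
}
```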

Metadata

The first 128 bytes contain the file header information, and the remainder of the file contains the ring width data and sample depth data. Our current understanding of the header bytes is as follows, although we are not convinced that all of these are correct. Deciphering them is painstaking work, because we must ascertain how each byte is being used (e.g. as part of a byte pair, as a single byte or as part of a string):

Data

The remaining bytes in the file contain the actual data values, stored as integer byte pairs. It appears that older versions of CATRAS included one or more padding values of -1; these should be ignored. The end of the data values is indicated by a stop value of 999.

Following the ring width data values there are 42 bytes of unknown meaning. These are then followed by byte pairs representing the counts/sample depth for each ring if the series is a chronology.
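Reading the ring width values described above might look like the sketch below. The names are hypothetical and this is not the CatrasReader implementation. It assumes that the data start immediately after the 128-byte header and are aligned on byte pairs. The 42 unknown bytes and the sample depth values that follow the stop value are not handled.

```java
import java.util.ArrayList;
import java.util.List;

final class CatrasDataSketch {

    static final int HEADER_LENGTH = 128; // header size given in the description
    static final int STOP_VALUE = 999;    // marks the end of the ring width values
    static final int PADDING_VALUE = -1;  // padding seen in older files, to be ignored

    // Returns the ring width values from a complete CATRAS file held in memory.
    // ASSUMPTION: values begin at byte 128 and are stored as consecutive LSB/MSB pairs.
    static List<Integer> readRingWidths(byte[] file) {
        List<Integer> values = new ArrayList<>();
        for (int i = HEADER_LENGTH; i + 1 < file.length; i += 2) {
            // little-endian signed 16-bit value
            int value = (short) ((file[i] & 0xFF) | ((file[i + 1] & 0xFF) << 8));
            if (value == STOP_VALUE) {
                break;
            }
            if (value == PADDING_VALUE) {
                continue;
            }
            values.add(value);
        }
        return values;
    }

    public static void main(String[] args) {
        byte[] file = new byte[HEADER_LENGTH + 8];
        int[] raw = {100, 0, 255, 255, 44, 1, 231, 3}; // 100, -1, 300, 999
        for (int i = 0; i < raw.length; i++) {
            file[HEADER_LENGTH + i] = (byte) raw[i];
        }
        System.out.println(readRingWidths(file)); // [100, 300]
    }
}
```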

Unknown bytes

There are a number of bytes in both the header and data sections that are unaccounted for and are therefore likely to contain data that we are ignoring:



Copyright © 2011. All Rights Reserved.