dbdreader package

Submodules

dbdreader.dbdreader module

class dbdreader.dbdreader.DBD(filename, cacheDir=None, skip_initial_line=True)

Bases: object

Class to read a single DBD type file

Parameters:
  • filename (str) – dbd filename

  • cacheDir (str or None, optional) – path to CAC file cache directory. If None, the default path is used.

  • skip_initial_line (bool, default: True) – controls the behaviour of the binary reader: if True, the first line of data in each binary file is skipped; otherwise it is read. The default is True, as the data in the initial line usually have no scientific merit (random values or arbitrarily old); reading the initial data line is typically useful for debugging purposes only.

SKIP_INITIAL_LINE = True
close()

Closes a DBD file

get(*parameters, decimalLatLon=True, discardBadLatLon=True, return_nans=False, max_values_to_read=-1)

Returns time and parameter data for requested parameter

This method reads the requested parameter and optionally converts it to decimal format if the parameter is latitude-like or longitude-like.

Parameters:
  • *parameters (variable length list of str) – parameter names

  • decimalLatLon (bool, optional) – If True (default), latitude and longitude related parameters are converted to decimal format, as opposed to NMEA format.

  • discardBadLatLon (bool, optional) – If True (default), bogus latitude and longitude values are ignored.

  • return_nans (bool, optional) – if True, NaNs are returned for timestamps at which the variable was not updated or changed.

  • max_values_to_read (int, optional) – if > 0, reading is stopped after this many values have been read.

Returns:

time vector (in seconds) and value vector

Return type:

tuple of (ndarray, ndarray) for each parameter requested.

Raises:

DbdError when the requested parameter(s) cannot be read.

Changed in version 0.4.0: Multiple parameters can be passed, giving a (time, value) tuple for each parameter.

Changed in version 0.5.5: For a single parameter request, the number of values to be read can be limited.

get_fileopen_time()

Returns the time stamp of opening the file in UTC

get_list(*parameters, decimalLatLon=True, discardBadLatLon=True, return_nans=False)

Returns time and value tuples for a list of requested parameters

This method returns time and values tuples for a list of parameters. It is basically a short-hand for a looped get() method.

Note that each parameter comes with its own time base. No interpolation is done. Use get_sync() for that instead.

Parameters:
  • *parameters (list of str) – list of parameter names

  • decimalLatLon (bool, optional) – If True (default), latitude and longitude related parameters are converted to decimal format, as opposed to NMEA format.

  • discardBadLatLon (bool, optional) – If True (default), bogus latitude and longitude values are ignored.

  • return_nans (bool) – If True, nan’s are returned for those timestamps where no new value is available. Default value: False

Returns:

list of tuples of time and value vectors for each parameter requested.

Return type:

list of (ndarray, ndarray)

Deprecated since version 0.4.0: This function will be removed in a future version. Use get() instead.

get_mission_name()

Returns the mission name such as micro.mi

get_sync(*sync_parameters, decimalLatLon=True, discardBadLatLon=True)

Returns a list of values from parameters, all interpolated to the time base of the first parameter

This method is used if a number of parameters should be interpolated onto the same time base.

Parameters:
  • *sync_parameters (variable length list of str) – parameter names. Minimal length is 2. The time base of the first parameter is used to interpolate all other parameters onto.

  • decimalLatLon (bool, optional) – If True (default), latitude and longitude related parameters are converted to decimal format, as opposed to NMEA format.

  • discardBadLatLon (bool, optional) – If True (default), bogus latitude and longitude values are ignored.

Returns:

Time vector (of first parameter), values of first parameter, and interpolated values of subsequent parameters.

Return type:

(ndarray, ndarray, …)

Example

get_sync('m_water_pressure', 'm_water_cond', 'm_water_temp')

Notes

Changed in version 0.4.0: Calling signature has changed from the sync parameters passed on as a list, to passed on as parameters.
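
The idea of interpolating one parameter onto the time base of another can be pictured with the minimal sketch below. It is not dbdreader's implementation: the helper name interpolate_onto is hypothetical, plain lists stand in for the ndarrays the library returns, and holding the first and last values outside the sampled range is an assumption of this sketch.

```python
from bisect import bisect_left


def interpolate_onto(t_base, t, v):
    """Linearly interpolate samples (t, v) onto the time stamps in t_base.

    t must be strictly increasing. Outside the range of t the nearest
    end value is held (an assumption made for this sketch).
    """
    result = []
    for tb in t_base:
        i = bisect_left(t, tb)
        if i == 0:
            result.append(v[0])       # at or before the first sample
        elif i == len(t):
            result.append(v[-1])      # after the last sample
        else:
            t0, t1 = t[i - 1], t[i]
            w = (tb - t0) / (t1 - t0)  # fractional position between samples
            result.append(v[i - 1] + w * (v[i] - v[i - 1]))
    return result


# Time base of the first parameter, and a second parameter sampled less often.
t_pressure = [0, 5, 10]
t_temp, temp = [0, 10], [0.0, 100.0]
temp_on_pressure_time = interpolate_onto(t_pressure, t_temp, temp)
```

With real data, get_sync() performs this step for every parameter after the first, so that all returned value vectors share one time vector.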

get_xy(parameter_x, parameter_y, decimalLatLon=True, discardBadLatLon=True)

Returns values of parameter_x and parameter_y

For parameters parameter_x and parameter_y this method returns a tuple with the values of both parameters. If necessary, the time base of parameter_y is interpolated onto the one of parameter_x.

Parameters:
  • parameter_x (str) – parameter name of x-parameter

  • parameter_y (str) – parameter name of y-parameter

  • decimalLatLon (bool, optional) – If True (default), latitude and longitude related parameters are converted to decimal format, as opposed to NMEA format.

  • discardBadLatLon (bool, optional) – If True (default), bogus latitude and longitude values are ignored.

Returns:

tuple of value vectors

Return type:

(ndarray, ndarray)

has_parameter(parameter)

Checks whether this file contains parameter

Parameters:

parameter (str) – parameter to check

Returns:

True if parameter is in the list, or False if not

Return type:

bool

class dbdreader.dbdreader.DBDHeader

Bases: object

Class to read the headers of DBD files. This class is typically used by DBD and MultiDBD, and not directly.

property factored
parse(line)
read_cache(fp, fpcopy=None)

Reads the cache file

read_header(fp, filename='')

Reads the header of the file given by fp

class dbdreader.dbdreader.DBDList(*p)

Bases: list

List that properly sorts dbd files.

Object subclassed from list. The sort method defaults to sorting dbd files and friends in the right order.

Parameters:

*p (variable length list of str) – filenames

REGEX = re.compile('-[0-9]*-[0-9]*-[0-9]*\\.[demnstDEMNST][bB][dD]')
sort(cmp=None, key=None, reverse=False)

Sorts filenames in place, ensuring dbd files are in chronological order

Parameters:
  • cmp – ignored keyword (for compatibility reasons only)

  • key – ignored keyword (for compatibility reasons only)

  • reverse (bool) – If True, performs a reverse sort.
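
Why a dedicated sort is needed can be seen from a small sketch. It assumes, which this page does not state, that file names follow the layout <glider>-<year>-<day>-<mission>-<segment>.<ext>; the names FIELDS and chronological_key are hypothetical and this is not the code used by DBDList. A plain alphabetical sort would place segment 10 before segment 9, whereas comparing the numeric fields as integers does not.

```python
import re

# Assumed layout: <glider>-<year>-<day>-<mission>-<segment>.<ext>
FIELDS = re.compile(r"-(\d+)-(\d+)-(\d+)-(\d+)\.[demnst]bd$", re.IGNORECASE)


def chronological_key(filename):
    """Sort key comparing the numeric name fields as integers."""
    match = FIELDS.search(filename)
    if match is None:
        # Names that do not match the assumed layout sort last, alphabetically.
        return (1, (), filename)
    return (0, tuple(int(field) for field in match.groups()), filename)


names = [
    "glider-2014-204-5-10.sbd",
    "glider-2014-204-5-9.sbd",
    "glider-2014-204-5-9.dbd",
]
ordered = sorted(names, key=chronological_key)
```

Files with identical numeric fields are ordered by their full name, so a dbd file precedes its sbd counterpart in this sketch.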

class dbdreader.dbdreader.DBDPatternSelect(date_format='%d %m %Y', cacheDir=None)

Bases: object

Selecting DBD files.

A class for selecting dbd files based on a date condition. The class opens files and reads the headers only.

Parameters:

date_format (str, optional) – date format used to interpret date strings.

Note

Times are based on the opening time of the file only.

bins(pattern=None, filenames=None, binsize=86400, t_start=None, t_end=None)

Return a list of filenames, in time bins

The method makes a list of all filenames matching either pattern or filenames, and bins these in time windows of width binsize. If t_start and t_end are not given, they are computed from the first and last timestamps of the files specified, respectively.

This method returns a list of tuples, where each tuple contains the centred time of the bin, and a list of all filenames that fall within this bin.

Parameters:
  • pattern (str) – search pattern (as used in glob)

  • filenames (list of str) – filename list

  • binsize (float) – bin size in seconds

  • t_start (None or float) – Timestamp in seconds since 1/1/1970

  • t_end (None or float) – Timestamp in seconds since 1/1/1970

Returns:

list of filenames, grouped per bin

Return type:

list of list of str

Raises:

ValueError if neither pattern nor filenames is given.
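
The binning described above can be sketched as follows. This is an illustration, not the library code: the function bin_by_time is hypothetical, it takes (timestamp, filename) pairs instead of opening files, and the treatment of bin edges is an assumption.

```python
def bin_by_time(stamped_files, binsize=86400, t_start=None, t_end=None):
    """Group (timestamp, filename) pairs into bins of binsize seconds.

    Returns a list of (bin centre time, [filenames]) tuples.
    """
    if not stamped_files:
        return []
    times = [t for t, _ in stamped_files]
    if t_start is None:
        t_start = min(times)
    if t_end is None:
        t_end = max(times)
    n_bins = int((t_end - t_start) // binsize) + 1
    bins = [(t_start + (i + 0.5) * binsize, []) for i in range(n_bins)]
    for t, filename in sorted(stamped_files):
        if t_start <= t <= t_end:
            index = int((t - t_start) // binsize)
            bins[index][1].append(filename)
    return bins


# Three files: two opened on the first day, one on the second.
files = [(0, "a.dbd"), (100, "b.dbd"), (86405, "c.dbd")]
daily = bin_by_time(files)
```

Each tuple in daily pairs the centre time of a one-day bin with the files whose opening time falls inside it.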

cache = {}
get_date_format()

Returns date format string.

Returns:

date format string

Return type:

str

get_filenames(pattern, filenames, cacheDir=None)

Get filenames (sorted) and update CAC cache directory.

Parameters:
  • pattern (str) – search pattern (as used in glob)

  • filenames (list of str) – list of filenames

Returns:

sorted list of filenames.

Return type:

list of str

select(pattern=None, filenames=[], from_date=None, until_date=None)

Select file names from pattern or list.

This method selects the filenames given a filename list or search pattern and given time limits.

Parameters:
  • pattern (str) – search pattern (passed to glob) to find filenames

  • filenames (list of str) – filename list

  • from_date (None or str, optional) – date used as start date criterion. If None, all files are included until the until_date.

  • until_date (None or str, optional) – date used as end date criterion. If None, all files after from_date are included.

Returns:

list of filenames that match the criteria

Raises:

ValueError if neither pattern nor filenames is given.

Note

Either pattern or filenames should be supplied, and at least one of from_date and until_date.

set_date_format(date_format)

Set date format

Sets the date format used to interpret the from_date and until_date strings.

Parameters:
  • date_format (str) – format to interpret date strings. Example “%H %d %m %Y”

  • cachedDir (str or None, optional) – path to CAC file cache directory. If None, the default path is used.

exception dbdreader.dbdreader.DbdError(value=9, mesg=None, data=None)

Bases: Exception

class MissingCacheFileData(missing_cache_files, cache_dir)

Bases: tuple

cache_dir

Alias for field number 1

missing_cache_files

Alias for field number 0

class dbdreader.dbdreader.MultiDBD(filenames=None, pattern=None, cacheDir=None, complemented_files_only=False, complement_files=False, banned_missions=[], missions=[], max_files=None, skip_initial_line=True)

Bases: object

Opens multiple dbd files for reading

This class is intended for reading multiple dbd files and treating them as one.

Parameters:
  • filenames (list of str or None) – list of filenames to open

  • pattern (str or None) – search pattern as passed to glob

  • cacheDir (str or None) – path to directory with CAC cache files (None: the default directory is used)

  • complemented_files_only (bool) – if True, only those files are retained for which both engineering and science data files are available.

  • complement_files (bool) – If True automatically include matching [de]bd files

  • banned_missions (list of str) – List of mission names that should be disregarded.

  • missions (list of str) – List of mission names to which the selection is restricted.

  • max_files (int) –

    maximum number of files to be read, where

    >0: the first n files are read

    <0: the last n files are read.

  • skip_initial_line (bool (default: True)) – If True, the first data line in each dbd file (and friends) is not read.

Notes

Upon creating the dbd file, when starting a new mission or dive segment, all parameters are written and marked as updated. In reality, most parameters are NOT updated, and the value written is the value in memory, which may be several minutes old, or even older. It has been pointed out to me that a handful of parameters are set only once, before creating the dbd file. Since these parameters are not of interest for normal data processing, the first line of data is skipped by default, but can be read if required.

Changed in version 0.4.0: ensure_paired and included_paired keywords have been replaced by complemented_files_only and complement_files, respectively.

close()

Close all open files

determine_ctd_type()

Determines CTD type installed from the presence of CTD specific name for the time stamp.

Returns:

One of {"ctd41cp", "rbrctd"}. If unable to get a positive CTD identification, it is assumed the CTD installed is a Seabird CTD, returning "ctd41cp".

Return type:

str

Notes

New in version 0.5.5.

get(*parameters, decimalLatLon=True, discardBadLatLon=True, return_nans=False, include_source=False, max_values_to_read=-1)

Returns time and value tuple(s) for requested parameter(s)

This method returns time and values tuples for a list of parameters.

Note that each parameter comes with its own time base. No interpolation is done. Use get_sync() for that instead.

Parameters:
  • *parameters (variable length list of str) – parameter names

  • decimalLatLon (bool, optional) – If True (default), latitude and longitude related parameters are converted to decimal format, as opposed to NMEA format.

  • discardBadLatLon (bool, optional) – If True (default), bogus latitude and longitude values are ignored.

  • return_nans (bool) – If True, nan’s are returned for those timestamps where no new value is available. Default value: False

  • include_source (bool, optional) –

    If True, a list is returned in addition, holding for each data point a reference to the DBD object from which the data point originated. If called with a single parameter, a tuple of an Nx2 array with data and a list of N elements with references to a DBD object is returned. If called with more parameters, a list of such tuples is returned.

    Default value: False

  • max_values_to_read (int, optional) – if > 1, reading is stopped after this many values have been read. Default value: -1

Returns:

  • (ndarray, ndarray) – for a single parameter

  • ((ndarray, ndarray), list) – for a single parameter, including source file list

  • [(ndarray, ndarray), (ndarray, ndarray), …] – for multiple parameters

  • [((ndarray, ndarray), list), ((ndarray, ndarray), list), …] – for multiple parameters, including source file list

Changed in version 0.5.5: For a single parameter request, the number of values to be read can be limited.

get_CTD_sync(*parameters, decimalLatLon=True, discardBadLatLon=True)

Returns a list of values from CTD and optionally other parameters, all interpolated to the time base of the CTD timestamp.

Parameters:
  • *parameters (variable length list of str) – names of parameters to be read additionally

  • decimalLatLon (bool, optional) – If True (default), latitude and longitude related parameters are converted to decimal format, as opposed to NMEA format.

  • discardBadLatLon (bool, optional) – If True (default), bogus latitude and longitude values are ignored.

Returns:

Time vector (of first parameter), C, T and P values, and interpolated values of subsequent parameters.

Return type:

(ndarray, ndarray, …)

Notes

New in version 0.4.0.

get_global_time_range(fmt='%d %b %Y %H:%M')

Returns start and end dates of data set (all files)

Parameters:

fmt (str) – String that determines how the time string is formatted.

Returns:

tuple with formatted time strings

Return type:

(str, str)

get_sync(*parameters, decimalLatLon=True, discardBadLatLon=True)

Returns a list of values from parameters, all interpolated to the time base of the first parameter

This method is used if a number of parameters should be interpolated onto the same time base.

Parameters:
  • *parameters (variable length list of str) – parameter names. Minimal length is 2. The time base of the first parameter is used to interpolate all other parameters onto.

  • decimalLatLon (bool, optional) – If True (default), latitude and longitude related parameters are converted to decimal format, as opposed to NMEA format.

  • discardBadLatLon (bool, optional) – If True (default), bogus latitude and longitude values are ignored.

Returns:

Time vector (of first parameter), values of first parameter, and interpolated values of subsequent parameters.

Return type:

(ndarray, ndarray, …)

Example

get_sync('m_water_pressure', 'm_water_cond', 'm_water_temp')

Notes

Changed in version 0.4.0: Calling signature has changed from the sync parameters passed on as a list, to passed on as parameters.

get_time_range(fmt='%d %b %Y %H:%M')

Get start and end date of the time range selection set

Parameters:

fmt (str) – String that determines how the time string is formatted

Returns:

Tuple with formatted time strings

Return type:

(str, str)

get_xy(parameter_x, parameter_y, decimalLatLon=True, discardBadLatLon=True)

Returns values of parameter_x and parameter_y

For parameters parameter_x and parameter_y this method returns a tuple with the values of both parameters. If necessary, the time base of parameter_y is interpolated onto the one of parameter_x.

Parameters:
  • parameter_x (str) – parameter name of x-parameter

  • parameter_y (str) – parameter name of y-parameter

  • decimalLatLon (bool, optional) – If True (default), latitude and longitude related parameters are converted to decimal format, as opposed to NMEA format.

  • discardBadLatLon (bool, optional) – If True (default), bogus latitude and longitude values are ignored.

Returns:

tuple of value vectors

Return type:

(ndarray, ndarray)

has_parameter(parameter)

Checks whether this instance has parameter

Returns:

True if this instance has found parameter

Return type:

bool

classmethod isScienceDataFile(fn)

Is file a science file?

Parameters:

fn (str) – filename

Returns:

True if file fn is a science file

Return type:

bool
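
A file-type check of this kind can be sketched from the file extension alone. The sketch assumes the usual Slocum glider convention that science data files carry the extensions ebd, tbd and nbd, while dbd, sbd and mbd hold engineering data; this page does not spell out that mapping, and the function is_science_data_file is hypothetical, not the classmethod documented above. Compressed file variants are not considered.

```python
import os

# Assumed convention: science bay files end in .ebd, .tbd or .nbd.
SCIENCE_EXTENSIONS = {".ebd", ".tbd", ".nbd"}


def is_science_data_file(filename):
    """Return True if the extension of filename marks a science data file."""
    extension = os.path.splitext(filename)[1].lower()
    return extension in SCIENCE_EXTENSIONS


science = is_science_data_file("01600000.ebd")
engineering = is_science_data_file("01600000.dbd")
```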

set_skip_initial_line(skip_initial_line)

Sets the reading mode of the binary reader to skip the initial data entry or not.

Parameters:

skip_initial_line (bool) – Sets the attribute skip_initial_line of each DBD instance, controlling the reading of the first data entry of each binary file.

set_time_limits(minTimeUTC=None, maxTimeUTC=None)

Set time limits for data to be returned by get() and friends.

Parameters:
  • minTimeUTC (str) – start time in UTC

  • maxTimeUTC (str) – end time in UTC

Notes

{minTimeUTC, maxTimeUTC} are expected in one of these formats:

“%d %b %Y” 3 Mar 2014

or

“%d %b %Y %H:%M” 4 Apr 2014 12:21
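
Accepting either of the two formats can be sketched by trying the more specific format first and falling back to the date-only one. The helper parse_time_limit is hypothetical and only illustrates the idea; how set_time_limits() parses its arguments internally is not documented here. Note that %b follows the month abbreviations of the current locale.

```python
from datetime import datetime, timezone

# Formats listed in the notes above, most specific first.
FORMATS = ("%d %b %Y %H:%M", "%d %b %Y")


def parse_time_limit(text):
    """Convert a time string in one of FORMATS to seconds since Epoch (UTC)."""
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next format
        return parsed.replace(tzinfo=timezone.utc).timestamp()
    raise ValueError(f"unrecognised time string: {text!r}")


t_min = parse_time_limit("3 Mar 2014")
t_max = parse_time_limit("4 Apr 2014 12:21")
```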

dbdreader.dbdreader.epochToDateTimeStr(seconds, dateformat='%Y%m%d', timeformat='%H:%M')

Converts seconds since Epoch to date string

This function converts seconds since Epoch to a datestr and timestr with user configurable formats.

Parameters:
  • seconds (float or int) – seconds since Epoch

  • dateformat (str) – string defining how the date string should be formatted

  • timeformat (str) – string defining how the time string should be formatted

Returns:

datestring and timestring

Return type:

(str, str)
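
A conversion with this calling signature can be sketched with the standard library. The function epoch_to_date_time_str is a hypothetical stand-in, and interpreting the time stamp in UTC is an assumption of this sketch; the documentation above does not state which time zone is used.

```python
from datetime import datetime, timezone


def epoch_to_date_time_str(seconds, dateformat="%Y%m%d", timeformat="%H:%M"):
    """Format seconds since Epoch as a (date string, time string) tuple in UTC."""
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return moment.strftime(dateformat), moment.strftime(timeformat)


datestr, timestr = epoch_to_date_time_str(1393804800)
```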

dbdreader.dbdreader.strptimeToEpoch(datestr, fmt)

Converts datestr into seconds

Function to convert a date string into seconds since Epoch. This function is not affected by the time zone used by the OS and interprets the date string in UTC.

Parameters:
  • datestr (str) – A string representing the date, such as “2010 May 01”

  • fmt (str) – Format to interpret strings. Example: “%Y %b %d”

Returns:

time since epoch in seconds

Return type:

int
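
One way to obtain a conversion that does not depend on the time zone of the OS is to combine time.strptime() with calendar.timegm(), which treats the parsed time as UTC (time.mktime() would treat it as local time). The function utc_epoch below is a hypothetical sketch of that approach, not necessarily how strptimeToEpoch is implemented.

```python
import calendar
import time


def utc_epoch(datestr, fmt):
    """Interpret datestr according to fmt as UTC; return seconds since Epoch."""
    parsed = time.strptime(datestr, fmt)
    return calendar.timegm(parsed)  # timegm assumes UTC, unlike time.mktime


seconds = utc_epoch("01 05 2010", "%d %m %Y")
```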

dbdreader.dbdreader.toDec(x, y=None)

NMEA style to decimal degree converter

Parameters:
  • x (float) – latitude or longitude in NMEA format

  • y (float, optional) – latitude or longitude in NMEA format

Returns:

decimal latitude (longitude) or tuple of decimal latitude and longitude

Return type:

float or tuple of floats
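
For reference, an NMEA-style value encodes degrees and minutes as ddmm.mmm, so 5430.0 means 54 degrees and 30.0 minutes. The sketch below shows the arithmetic with a hypothetical function nmea_to_decimal that mirrors the calling signature of toDec; it is an illustration and not the library's code.

```python
def nmea_to_decimal(x, y=None):
    """Convert NMEA-style ddmm.mmm values to decimal degrees."""

    def convert(value):
        degrees = int(value / 100)        # truncates toward zero, keeps the sign
        minutes = value - degrees * 100   # remaining minutes, same sign as value
        return degrees + minutes / 60

    if y is None:
        return convert(x)
    return convert(x), convert(y)


lat = nmea_to_decimal(5430.0)             # 54 degrees 30 minutes north
lat_lon = nmea_to_decimal(5430.0, -730.0)  # second value: 7 degrees 30 minutes west
```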

dbdreader.decompress module

class dbdreader.decompress.BytesIORW(source)

Bases: object

Helper class implementing a BytesIO buffer that can be written to and read from.

Note that only the methods write() and readline() are implemented.

readline()
write(b)
class dbdreader.decompress.CompressedFile(filename)

Bases: object

Class to access a compressed file, providing a method readline() that returns a decompressed line of data. The compressed file is read block by block, as long as needed.

The main reason for the class is to be able to read the header of a compressed glider data file.

close()
readline()
readlines()
seek(offset)
tell()
class dbdreader.decompress.Decompressor(filename=None, fp=None)

Bases: object

Class to decompress glider files

Parameters:

filename (str) – name of file to decompress

This class is designed to be used with a context manager:

>>> with Decompressor(filename) as d:
...     data = d.decompress()

Alternatively, a file can be opened a priori:

>>> d = Decompressor()
>>> fd = open(filename, 'rb')
>>> data = d.decompress(fd)
CHUNKSIZE = 32768
COMPRESSION_FACTOR = 10
ENDIANESS = 'big'
SIZEFIELDSIZE = 2
decompress(fp=None)

Decompresses an entire file (in memory)

Parameters:

fp (file descriptor or None) – file descriptor to use. If None (default), the file descriptor assigned by the constructor is used.

Returns:

decompressed file data as bytes

Return type:

bytes

decompressed_blocks(n=None, fp=None)

Generator that returns decompressed data blocks

Parameters:
  • n (int or None) – limits the number of blocks read and returned. If None (default) all blocks are returned

  • fp (file descriptor or None) – file descriptor to use. If None (default), the file descriptor assigned by the constructor is used.

Yields:

bytes – decompressed data block
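
The class constants SIZEFIELDSIZE = 2 and ENDIANESS = 'big' suggest, although this page does not state it, that a compressed file consists of blocks that are each preceded by a two-byte big-endian size field. Under that assumption, walking through the blocks can be sketched as below. The generator raw_blocks is hypothetical and yields the blocks as stored; the decompression of each block is deliberately left out.

```python
import io

SIZEFIELDSIZE = 2   # bytes used for the size field (value taken from the class)
ENDIANESS = "big"   # byte order of the size field (value taken from the class)


def raw_blocks(fp, n=None):
    """Yield size-prefixed blocks from fp, at most n of them if n is given."""
    count = 0
    while n is None or count < n:
        size_field = fp.read(SIZEFIELDSIZE)
        if len(size_field) < SIZEFIELDSIZE:
            return  # end of stream
        size = int.from_bytes(size_field, ENDIANESS)
        yield fp.read(size)  # still compressed; decompression not shown
        count += 1


# Two blocks of 3 and 2 bytes, each preceded by its size.
stream = io.BytesIO(b"\x00\x03abc\x00\x02de")
blocks = list(raw_blocks(stream))
```

Limiting the number of blocks, as the n argument of decompressed_blocks() does, is useful when only the header at the start of a file is needed.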

class dbdreader.decompress.FileDecompressor

Bases: object

Class that provides an easy way to automatically decompress a compressed glider data file and write it into a normal binary data file.

The actual decompression is done by the decompress() method.

Example

>>> FileDecompressor.decompress("01600000.dcd")

which would result in the writing of a decompressed file 01600000.dbd.

decompress(filename)

Decompresses a file

Parameters:

filename (str) – (compressed) filename

Returns:

uncompressed filename

Return type:

str

dbdreader.decompress.decompress_file(filename)

Decompresses a glider data file and writes the normal binary file.

dbdreader.decompress.is_compressed(filename)

dbdreader.scripts module

dbdreader.scripts.cac_gen()

A script to generate cac files

Run cac_gen -h for more help.

dbdreader.scripts.dbdrename()

Standalone script to rename dbd files

Use dbdrename -h for help.

Module contents