dbdreader package
Submodules
dbdreader.dbdreader module
- class dbdreader.dbdreader.DBD(filename, cacheDir=None, skip_initial_line=True)
Bases:
object
Class to read a single DBD type file
- Parameters:
filename (str) – dbd filename
cacheDir (str or None, optional) – path to CAC file cache directory. If None, the default path is used.
skip_initial_line (bool, default: True) – controls the behaviour of the binary reader: if True, the first line of data in the binary file is skipped; otherwise it is read. The default is True because the data in the initial line usually have no scientific merit (random or arbitrarily old values); reading the initial line is mainly useful for debugging.
- SKIP_INITIAL_LINE = True
- close()
Closes a DBD file
- get(*parameters, decimalLatLon=True, discardBadLatLon=True, return_nans=False, max_values_to_read=-1)
Returns time and parameter data for the requested parameter(s)
This method reads the requested parameter and optionally converts it to decimal format if the parameter is latitude-like or longitude-like.
- Parameters:
*parameters (variable length list of str) – parameter name
decimalLatLon (bool, optional) – If True (default), latitude and longitude related parameters are converted to decimal format, as opposed to NMEA format.
discardBadLatLon (bool, optional) – If True (default), bogus latitude and longitude values are ignored.
return_nans (bool, optional) – if True, NaNs are returned for timestamps at which the variable was not updated or changed.
max_values_to_read (int, optional) – if > 0, reading is stopped after this many values have been read.
- Returns:
time vector (in seconds) and value vector
- Return type:
tuple of (ndarray, ndarray) for each parameter requested.
- Raises:
DbdError – when the requested parameter(s) cannot be read.
Changed in version 0.4.0: Multi parameters can be passed, giving a time,value tuple for each parameter.
Changed in version 0.5.5: For a single parameter request, the number of values to be read can be limited.
- get_fileopen_time()
Returns the time stamp of opening the file in UTC
- get_list(*parameters, decimalLatLon=True, discardBadLatLon=True, return_nans=False)
Returns time and value tuples for a list of requested parameters
This method returns time and values tuples for a list of parameters. It is basically a short-hand for a looped get() method.
Note that each parameter comes with its own time base. No interpolation is done. Use get_sync() for that instead.
- Parameters:
*parameters (list of str) – list of parameter names
decimalLatLon (bool, optional) – If True (default), latitude and longitude related parameters are converted to decimal format, as opposed to NMEA format.
discardBadLatLon (bool, optional) – If True (default), bogus latitude and longitude values are ignored.
return_nans (bool) – If True, NaNs are returned for those timestamps where no new value is available. Default value: False
- Returns:
list of (ndarray, ndarray) – list of tuples of time and value vectors for each parameter requested.
Deprecated since version 0.4.0: This function will be removed in a future version. Use get() instead.
- get_mission_name()
Returns the mission name such as micro.mi
- get_sync(*sync_parameters, decimalLatLon=True, discardBadLatLon=True)
Returns a list of values from parameters, all interpolated to the time base of the first parameter
This method is used if a number of parameters should be interpolated onto the same time base.
- Parameters:
*sync_parameters (variable length list of str) – parameter names. Minimal length is 2. The time base of the first parameter is used to interpolate all other parameters onto.
decimalLatLon (bool, optional) – If True (default), latitude and longitude related parameters are converted to decimal format, as opposed to NMEA format.
discardBadLatLon (bool, optional) – If True (default), bogus latitude and longitude values are ignored.
- Returns:
(ndarray, ndarray, …) – Time vector (of first parameter), values of first parameter, and interpolated values of subsequent parameters.
Example: get_sync('m_water_pressure', 'm_water_cond', 'm_water_temp')
Notes
Changed in version 0.4.0: Calling signature has changed from the sync parameters passed on as a list, to passed on as parameters.
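The idea behind get_sync() can be illustrated without the library. The sketch below is not dbdreader's implementation: it assumes linear interpolation, strictly increasing time vectors, and clamping to the end values outside the sampled range (none of which is stated above). The names interpolate and sync are hypothetical helpers.

```python
from bisect import bisect_right


def interpolate(t_new, t, v):
    """Linearly interpolate the series (t, v) onto the time stamps t_new.

    Assumption: t is strictly increasing; outside [t[0], t[-1]] the
    nearest end value is used (clamping).
    """
    result = []
    for x in t_new:
        if x <= t[0]:
            result.append(v[0])
        elif x >= t[-1]:
            result.append(v[-1])
        else:
            i = bisect_right(t, x)          # t[i-1] <= x < t[i]
            t0, t1 = t[i - 1], t[i]
            w = (x - t0) / (t1 - t0)
            result.append(v[i - 1] + w * (v[i] - v[i - 1]))
    return result


def sync(first, *others):
    """Return (t, v_first, v_other_1, ...) on the time base of `first`.

    Each argument is a (time, values) pair, as returned by get().
    """
    t0, v0 = first
    return (t0, v0) + tuple(interpolate(t0, t, v) for t, v in others)


# Two series with different time bases (made-up numbers).
pressure = ([0.0, 10.0, 20.0], [1.0, 2.0, 3.0])
temperature = ([5.0, 15.0], [10.0, 20.0])
t, p, temp = sync(pressure, temperature)
```

The first series is returned untouched; only the subsequent series are resampled, which mirrors the return value described above.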
- get_xy(parameter_x, parameter_y, decimalLatLon=True, discardBadLatLon=True)
Returns values of parameter_x and parameter_y
For parameters parameter_x and parameter_y this method returns a tuple with the values of both parameters. If necessary, the time base of parameter_y is interpolated onto the one of parameter_x.
- Parameters:
parameter_x (str) – parameter name of x-parameter
parameter_y (str) – parameter name of y-parameter
decimalLatLon (bool, optional) – If True (default), latitude and longitude related parameters are converted to decimal format, as opposed to NMEA format.
discardBadLatLon (bool, optional) – If True (default), bogus latitude and longitude values are ignored.
- Returns:
tuple of value vectors
- Return type:
(ndarray, ndarray)
- has_parameter(parameter)
Check whether this file contains parameter
- Parameters:
parameter (str) – parameter to check
- Returns:
True if parameter is in the list, or False if not
- Return type:
bool
- class dbdreader.dbdreader.DBDHeader
Bases:
object
Class to read the headers of DBD files. This class is typically used by DBD and MultiDBD and not directly.
- property factored
- parse(line)
- read_cache(fp, fpcopy=None)
read cache file
- read_header(fp, filename='')
read the header of the file, given by fp
- class dbdreader.dbdreader.DBDList(*p)
Bases:
list
List that properly sorts dbd files.
Object subclassed from list. The sort method defaults to sorting dbd files and friends in the right order.
- Parameters:
*p (variable length list of str) – filenames
- REGEX = re.compile('-[0-9]*-[0-9]*-[0-9]*\\.[demnstDEMNST][bB][dD]')
- sort(cmp=None, key=None, reverse=False)
sorts filenames ensuring dbd files are in chronological order in place
- Parameters:
cmp – ignored keyword (for compatibility reasons only)
key – ignored keyword (for compatibility reasons only)
reverse (bool) – If True, performs a reverse sort.
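Plain string sorting does not order glider files chronologically, because the numeric fields in the filename are not zero-padded consistently. The sketch below shows one way to build a numeric sort key. It is not the DBDList implementation: the pattern (four numeric fields, assumed to be year, day of year, mission number and segment number) and the helper name chronological_key are assumptions made for illustration.

```python
import re

# Hypothetical pattern: <glider>-<year>-<day>-<mission>-<segment>.<ext>
# with the same family of extensions as the REGEX attribute above.
_FIELDS = re.compile(r"-(\d+)-(\d+)-(\d+)-(\d+)\.[demnst]bd$", re.IGNORECASE)


def chronological_key(filename):
    """Sort key comparing the numeric fields as integers.

    Names that do not match the pattern sort after all matching names,
    in alphabetical order.
    """
    match = _FIELDS.search(filename)
    if match is None:
        return (1, (), filename)
    return (0, tuple(int(group) for group in match.groups()), filename)


names = [
    "a-2014-204-05-010.sbd",
    "a-2014-204-05-002.sbd",
    "a-2014-99-0-1.sbd",
    "notes.txt",
]
ordered = sorted(names, key=chronological_key)
```

A plain sorted(names) would place day 204 before day 99; the integer comparison avoids that.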
- class dbdreader.dbdreader.DBDPatternSelect(date_format='%d %m %Y', cacheDir=None)
Bases:
object
Selecting DBD files.
A class for selecting dbd files based on a date condition. The class opens files and reads the headers only.
- Parameters:
date_format (str, optional) – date format used to interpret date strings.
Note
Times are based on the opening time of the file only.
- bins(pattern=None, filenames=None, binsize=86400, t_start=None, t_end=None)
Return a list of filenames, in time bins
The method makes a list of all filenames, matching either pattern or filenames, and bins these in time windows of width binsize. If t_start and t_end are not given, they are computed from the first and last timestamps of the files specified, respectively.
This method returns a list of tuples, where each tuple contains the centred time of the bin, and a list of all filenames that fall within this bin.
- Parameters:
pattern (str) – search pattern (as used in glob)
filenames (list of str) – filename list
binsize (float) – bin size in seconds
t_start (None or float) – Timestamp in seconds since 1/1/1970
t_end (None or float) – Timestamp in seconds since 1/1/1970
- Returns:
list of (bin centre time, filenames) tuples, one per bin
- Return type:
list of (float, list of str)
- Raises:
ValueError – if neither pattern nor filenames is given.
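The binning described for bins() can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's code: the input is a list of (opening time, filename) pairs that the caller has already obtained, empty bins are omitted, and bin_by_time is a hypothetical helper name.

```python
def bin_by_time(stamped_files, binsize=86400, t_start=None, t_end=None):
    """Group (timestamp, filename) pairs into bins of `binsize` seconds.

    Returns a list of (bin centre time, [filenames]) tuples.
    Assumption: bins without any file are left out of the result.
    """
    if not stamped_files:
        return []
    times = [t for t, _ in stamped_files]
    t_start = min(times) if t_start is None else t_start
    t_end = max(times) if t_end is None else t_end
    bins = {}
    for t, name in sorted(stamped_files):
        if t < t_start or t > t_end:
            continue                        # outside the requested window
        index = int((t - t_start) // binsize)
        bins.setdefault(index, []).append(name)
    return [(t_start + (i + 0.5) * binsize, names)
            for i, names in sorted(bins.items())]


# Made-up opening times in seconds since 1/1/1970.
files = [(0, "a.sbd"), (100, "b.sbd"), (86400, "c.sbd")]
daily = bin_by_time(files)
```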
- cache = {}
- get_date_format()
Returns date format string.
- Returns:
date format string
- Return type:
str
- get_filenames(pattern, filenames, cacheDir=None)
Get filenames (sorted) and update CAC cache directory.
- Parameters:
pattern (str) – search pattern (as used in glob)
filenames (list of str) – list of filenames
- Returns:
sorted list of filenames.
- Return type:
list of str
- select(pattern=None, filenames=[], from_date=None, until_date=None)
Select file names from pattern or list.
This method selects the filenames given a filename list or search pattern and given time limits.
- Parameters:
pattern (str) – search pattern (passed to glob) to find filenames
filenames (list of str) – filename list
from_date (None or str, optional) – date used as start date criterion. If None, all files are included until the until_date.
until_date (None or str, optional) – date used as end date criterion. If None, all files after from_date are included.
- Returns:
list of filenames that match the criteria
- Raises:
ValueError – if neither pattern nor filenames is given.
Note
Either pattern or filenames should be supplied, and at least one of from_date and until_date.
- set_date_format(date_format)
Set date format
Sets the date format used to interpret the from_date and until_date strings.
- Parameters:
date_format (str) – format to interpret date strings. Example “%H %d %m %Y”
cacheDir (str or None, optional) – path to CAC file cache directory. If None, the default path is used.
- exception dbdreader.dbdreader.DbdError(value=9, mesg=None, data=None)
Bases:
Exception
- class dbdreader.dbdreader.MultiDBD(filenames=None, pattern=None, cacheDir=None, complemented_files_only=False, complement_files=False, banned_missions=[], missions=[], max_files=None, skip_initial_line=True)
Bases:
object
Opens multiple dbd files for reading
This class is intended for reading multiple dbd files and treating them as one.
- Parameters:
filenames (list of str or None) – list of filenames to open
pattern (str or None) – search pattern as passed to glob
cacheDir (str or None) – path to directory with CAC cache files (None: the default directory is used)
complemented_files_only (bool) – if True, only those files are retained for which both engineering and science data files are available.
complement_files (bool) – If True automatically include matching [de]bd files
banned_missions (list of str) – List of mission names that should be disregarded.
missions (list of str) – List of missions names that should be considered only.
max_files (int) – maximum number of files to be read, where >0: the first n files are read; <0: the last n files are read.
skip_initial_line (bool (default: True)) – If True, the first data line in each dbd file (and friends) is not read.
Notes
Upon creating the dbd file, when starting a new mission or dive segment, all parameters are written and marked as updated. In reality, most parameters are NOT updated, and the value written is the value in memory, which may be several minutes old, or even older. It has been pointed out to me that a handful of parameters are set only once, before the dbd file is created. Since these parameters are not of interest for normal data processing, the first line of data is skipped by default, but can be read if required.
Changed in version 0.4.0: ensure_paired and included_paired keywords have been replaced by complemented_files_only and complement_files, respectively.
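The sign convention of the max_files keyword in the constructor signature (positive: first n files, negative: last n files) maps directly onto list slicing. The helper below is a hypothetical illustration, not library code; treating None and 0 as "no limit" is an assumption.

```python
def limit_files(filenames, max_files=None):
    """Return the first n (max_files > 0) or last n (max_files < 0) names.

    Assumption: None or 0 means that no limit is applied.
    """
    if not max_files:
        return list(filenames)
    if max_files > 0:
        return list(filenames[:max_files])
    return list(filenames[max_files:])


names = ["seg-000.sbd", "seg-001.sbd", "seg-002.sbd"]
first_two = limit_files(names, 2)
last_two = limit_files(names, -2)
```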
- close()
Close all open files
- determine_ctd_type()
Determines CTD type installed from the presence of CTD specific name for the time stamp.
- Returns:
string – {“ctd41cp”, “rbrctd”}
If unable to get a positive CTD identification, it is assumed the CTD installed is a Seabird CTD, returning “ctd41cp”.
Notes
New in version 0.5.5.
- get(*parameters, decimalLatLon=True, discardBadLatLon=True, return_nans=False, include_source=False, max_values_to_read=-1)
Returns time and value tuple(s) for requested parameter(s)
This method returns time and values tuples for a list of parameters.
Note that each parameter comes with its own time base. No interpolation is done. Use get_sync() for that instead.
- Parameters:
*parameters (variable length list of str) – parameter names
decimalLatLon (bool, optional) – If True (default), latitude and longitude related parameters are converted to decimal format, as opposed to NMEA format.
discardBadLatLon (bool, optional) – If True (default), bogus latitude and longitude values are ignored.
return_nans (bool) – If True, NaNs are returned for those timestamps where no new value is available. Default value: False
include_source (bool, optional) –
If True, each result is accompanied by a list that holds, for each data point, a reference to the DBD object the data point originated from. When called with a single parameter, a tuple is returned of an Nx2 array with the data and a list of N references to DBD objects. When called with more parameters, a list of such tuples is returned.
Default value: False
max_values_to_read (int, optional) – if > 1, then reading is stopped after this many values have been read. Default value : -1
- Returns:
(ndarray, ndarray) – for a single parameter
((ndarray, ndarray), list) – for a single parameter, including the source file list
[(ndarray, ndarray), (ndarray, ndarray), …] – for multiple parameters
[((ndarray, ndarray), list), ((ndarray, ndarray), list), …] – for multiple parameters, including the source file list
Changed in version 0.5.5: For a single parameter request, the number of values to be read can be limited.
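Because get() returns a single (time, values) tuple for one parameter but a list of tuples for several, calling code usually normalises the two shapes. The sketch below shows one way to do that. The helper summarise and the stand-in class FakeDBD are hypothetical, and the parameter names m_depth and m_pitch are examples only; with the real library the object passed in would be a DBD or MultiDBD instance.

```python
def summarise(dbd, *parameters):
    """Return {name: (n_samples, first_time, last_time)} for each parameter.

    `dbd` is any object whose get() behaves as documented above.
    """
    results = dbd.get(*parameters)
    if len(parameters) == 1:
        results = [results]                 # normalise the single-parameter case
    summary = {}
    for name, (t, values) in zip(parameters, results):
        if len(t):
            summary[name] = (len(values), t[0], t[-1])
        else:
            summary[name] = (0, None, None)
    return summary


class FakeDBD:
    """Stand-in used for illustration only; it mimics the return shapes of get()."""

    def __init__(self, data):
        self._data = data

    def get(self, *parameters):
        out = [self._data[p] for p in parameters]
        return out[0] if len(out) == 1 else out


fake = FakeDBD({
    "m_depth": ([0.0, 4.0], [1.5, 2.5]),
    "m_pitch": ([], []),
})
single = summarise(fake, "m_depth")
both = summarise(fake, "m_depth", "m_pitch")
```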
- get_CTD_sync(*parameters, decimalLatLon=True, discardBadLatLon=True)
Returns a list of values from CTD and optionally other parameters, all interpolated to the time base of the CTD timestamp.
- Parameters:
*parameters (variable length list of str) – names of parameters to be read additionally
decimalLatLon (bool, optional) – If True (default), latitude and longitude related parameters are converted to decimal format, as opposed to NMEA format.
discardBadLatLon (bool, optional) – If True (default), bogus latitude and longitude values are ignored.
- Returns:
Time vector (of first parameter), C, T and P values, and interpolated values of subsequent parameters.
- Return type:
(ndarray, ndarray, …)
Notes
New in version 0.4.0.
- get_global_time_range(fmt='%d %b %Y %H:%M')
Returns start and end dates of data set (all files)
- Parameters:
fmt (str) – String that determines how the time string is formatted.
- Returns:
tuple with formatted time strings
- Return type:
(str, str)
- get_sync(*parameters, decimalLatLon=True, discardBadLatLon=True)
Returns a list of values from parameters, all interpolated to the time base of the first parameter
This method is used if a number of parameters should be interpolated onto the same time base.
- Parameters:
*parameters (variable length list of str) – parameter names. Minimal length is 2. The time base of the first parameter is used to interpolate all other parameters onto.
decimalLatLon (bool, optional) – If True (default), latitude and longitude related parameters are converted to decimal format, as opposed to NMEA format.
discardBadLatLon (bool, optional) – If True (default), bogus latitude and longitude values are ignored.
- Returns:
(ndarray, ndarray, …) – Time vector (of first parameter), values of first parameter, and interpolated values of subsequent parameters.
Example: get_sync('m_water_pressure', 'm_water_cond', 'm_water_temp')
Notes
Changed in version 0.4.0: Calling signature has changed from the sync parameters passed on as a list, to passed on as parameters.
- get_time_range(fmt='%d %b %Y %H:%M')
Get start and end date of the time range selection set
- Parameters:
fmt (str) – String that determines how the time string is formatted
- Returns:
Tuple with formatted time strings
- Return type:
(str, str)
- get_xy(parameter_x, parameter_y, decimalLatLon=True, discardBadLatLon=True)
Returns values of parameter_x and parameter_y
For parameters parameter_x and parameter_y this method returns a tuple with the values of both parameters. If necessary, the time base of parameter_y is interpolated onto the one of parameter_x.
- Parameters:
parameter_x (str) – parameter name of x-parameter
parameter_y (str) – parameter name of y-parameter
decimalLatLon (bool, optional) – If True (default), latitude and longitude related parameters are converted to decimal format, as opposed to NMEA format.
discardBadLatLon (bool, optional) – If True (default), bogus latitude and longitude values are ignored.
- Returns:
tuple of value vectors
- Return type:
(ndarray, ndarray)
- has_parameter(parameter)
Checks whether this instance has the parameter
- Returns:
True if this instance has found parameter
- Return type:
bool
- classmethod isScienceDataFile(fn)
Is file a science file?
- Parameters:
fn (str) – filename
- Returns:
True if file fn is a science file
- Return type:
bool
- set_skip_initial_line(skip_initial_line)
Sets the reading mode of the binary reader to skip the initial data entry or not.
- Parameters:
skip_initial_line (bool) – Sets the attribute skip_initial_line of each DBD instance, controlling the reading of the first data entry of each binary file.
- set_time_limits(minTimeUTC=None, maxTimeUTC=None)
Set time limits for data to be returned by get() and friends.
- Parameters:
minTimeUTC (str) – start time in UTC
maxTimeUTC (str) – end time in UTC
Notes
{minTimeUTC, maxTimeUTC} are expected in one of these formats:
“%d %b %Y” 3 Mar 2014
or
“%d %b %Y %H:%M” 4 Apr 2014 12:21
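A caller that wants to validate time strings before passing them to set_time_limits() can try the two documented formats in turn. The sketch below does this with the standard library; parse_time_limit is a hypothetical helper, the strings are interpreted as UTC as the parameter names suggest, and %b depends on the active locale (English month names are assumed).

```python
from datetime import datetime, timezone

# The two formats listed in the notes above, most specific first.
TIME_LIMIT_FORMATS = ("%d %b %Y %H:%M", "%d %b %Y")


def parse_time_limit(text):
    """Return seconds since 1/1/1970 (UTC) for a time-limit string.

    Raises ValueError if the string matches neither format.
    """
    for fmt in TIME_LIMIT_FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue
        return parsed.replace(tzinfo=timezone.utc).timestamp()
    raise ValueError(f"unsupported time format: {text!r}")


start = parse_time_limit("3 Mar 2014")
end = parse_time_limit("4 Apr 2014 12:21")
```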
- dbdreader.dbdreader.epochToDateTimeStr(seconds, dateformat='%Y%m%d', timeformat='%H:%M')
Converts seconds since Epoch to date string
This function converts seconds since Epoch to a datestr and timestr with user configurable formats.
- Parameters:
seconds (float or int) – seconds since Epoch
dateformat (str) – string defining how the date string should be formatted
timeformat (str) – string defining how the time string should be formatted
- Returns:
datestring and timestring
- Return type:
(str, str)
- dbdreader.dbdreader.strptimeToEpoch(datestr, fmt)
Converts datestr into seconds
Function to convert a date string into seconds since Epoch. This function is not affected by the time zone used by the OS and interprets the date string in UTC.
- Parameters:
datestr (str) – A string presenting the date, such as “2010 May 01”
fmt (str) – Format to interpret strings. Example: “%Y %b %d”
- Returns:
time since epoch in seconds
- Return type:
int
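The behaviour described for the two conversion functions, in particular the independence from the operating system's time zone, can be reproduced with calendar.timegm and time.gmtime. The functions below are stand-ins written for illustration, not the library's code; that epochToDateTimeStr formats its output in UTC is an assumption.

```python
import calendar
import time


def strptime_to_epoch(datestr, fmt):
    """Interpret datestr as UTC and return whole seconds since Epoch."""
    return calendar.timegm(time.strptime(datestr, fmt))


def epoch_to_date_time_str(seconds, dateformat="%Y%m%d", timeformat="%H:%M"):
    """Return (date string, time string) for seconds since Epoch, in UTC."""
    tm = time.gmtime(seconds)
    return time.strftime(dateformat, tm), time.strftime(timeformat, tm)


# Round trip using the example date string from the parameter description.
epoch = strptime_to_epoch("2010 May 01", "%Y %b %d")
date_str, time_str = epoch_to_date_time_str(epoch + 13 * 3600 + 5 * 60)
```

time.mktime would apply the local time zone instead, which is why calendar.timegm is used here.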
- dbdreader.dbdreader.toDec(x, y=None)
NMEA style to decimal degree converter
- Parameters:
x (float) – latitude or longitude in NMEA format
y (float, optional) – latitude or longitude in NMEA format
- Returns:
decimal latitude (longitude) or tuple of decimal latitude and longitude
- Return type:
float or tuple of floats
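NMEA-style coordinates pack degrees and minutes into one number (ddmm.mmm), so the conversion splits off the minutes and divides them by 60. The sketch below illustrates that arithmetic for scalar inputs; it is not the library's toDec, and the names nmea_to_decimal and to_dec are hypothetical.

```python
def nmea_to_decimal(x):
    """Convert one ddmm.mmm value to decimal degrees, preserving the sign."""
    sign = -1.0 if x < 0 else 1.0
    degrees, minutes = divmod(abs(x), 100.0)
    return sign * (degrees + minutes / 60.0)


def to_dec(x, y=None):
    """Convert one value, or a pair of values when y is given."""
    if y is None:
        return nmea_to_decimal(x)
    return nmea_to_decimal(x), nmea_to_decimal(y)


# 54 degrees 30 minutes north, 7 degrees 15 minutes west (made-up position).
lat, lon = to_dec(5430.0, -715.0)
```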
dbdreader.decompress module
- class dbdreader.decompress.BytesIORW(source)
Bases:
object
Helper class implementing a BytesIO buffer that can be written to and read from.
Note that the methods write() and readline() are implemented only.
- readline()
- write(b)
- class dbdreader.decompress.CompressedFile(filename)
Bases:
object
Class to access a compressed file, providing a method readline() that returns a decompressed line of data. The compressed file is read block by block, as long as needed.
The main reason for the class is to be able to read the header of a compressed glider data file.
- close()
- readline()
- readlines()
- seek(offset)
- tell()
- class dbdreader.decompress.Decompressor(filename=None, fp=None)
Bases:
object
Class to decompress glider files
- Parameters:
filename (str) – name of file to decompress
This class is designed to be used with a context manager:
>>> with Decompressor(filename) as d:
...     data = d.decompress()
Alternatively, a file can be opened a priori:
>>> d = Decompressor()
>>> fd = open(filename, 'rb')
>>> data = d.decompress(fd)
- CHUNKSIZE = 32768
- COMPRESSION_FACTOR = 10
- ENDIANESS = 'big'
- SIZEFIELDSIZE = 2
- decompress(fp=None)
Decompresses an entire file (in memory)
- Parameters:
fp (file descriptor or None) – file descriptor to use. If None (default), the file descriptor assigned by the constructor is used.
- Returns:
decompressed file data as bytes
- Return type:
bytes
- decompressed_blocks(n=None, fp=None)
Generator that returns decompressed data blocks
- Parameters:
n (int or None) – limits the number of blocks read and returned. If None (default) all blocks are returned
fp (file descriptor or None) – file descriptor to use. If None (default), the file descriptor assigned by the constructor is used.
- Yields:
bytes – decompressed data block
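The class attributes SIZEFIELDSIZE = 2 and ENDIANESS = 'big' suggest that each compressed block is preceded by a two-byte big-endian length field. The sketch below shows only that framing, under this assumption; the compression codec is not stated in this reference, so decoding is left to a caller-supplied function (the default passes the bytes through unchanged). iter_blocks and frame are hypothetical helpers.

```python
import io

SIZEFIELDSIZE = 2      # bytes per length field (value taken from the class attributes)
ENDIANESS = "big"


def iter_blocks(fp, n=None, decode=bytes):
    """Yield up to n blocks from fp, each read as <length field><payload>.

    `decode` stands in for the real decompression step, which is not
    reproduced here.
    """
    count = 0
    while n is None or count < n:
        size_field = fp.read(SIZEFIELDSIZE)
        if len(size_field) < SIZEFIELDSIZE:
            break                           # end of file
        size = int.from_bytes(size_field, ENDIANESS)
        block = fp.read(size)
        if len(block) < size:
            raise ValueError("truncated block")
        yield decode(block)
        count += 1


def frame(blocks):
    """Build a byte string in the assumed framing (used for the demonstration)."""
    return b"".join(len(b).to_bytes(SIZEFIELDSIZE, ENDIANESS) + b for b in blocks)


payload = frame([b"abc", b"de"])
all_blocks = list(iter_blocks(io.BytesIO(payload)))
first_only = list(iter_blocks(io.BytesIO(payload), n=1))
```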
- class dbdreader.decompress.FileDecompressor
Bases:
object
Class that provides an easy way to automatically decompress a compressed glider data file and write it into a normal binary data file.
The actual decompression is done by the decompress() method.
Example
>>> FileDecompressor.decompress("01600000.dcd")
which would result in the writing of a decompressed file 01600000.dbd.
- decompress(filename)
Decompresses a file
- Parameters:
filename (str) – (compressed) filename
- Returns:
uncompressed filename
- Return type:
str
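The example above maps 01600000.dcd to 01600000.dbd, that is, the middle letter of the extension changes from c to b. The sketch below generalises this to other three-letter extensions with a c in the middle; that generalisation, and the helper name decompressed_name, are assumptions made for illustration and are not taken from the library.

```python
import os


def decompressed_name(filename):
    """Return the filename with the middle extension letter c replaced by b.

    Names whose extension does not look like '.?c?' are returned unchanged.
    """
    root, ext = os.path.splitext(filename)
    if len(ext) == 4 and ext[2] in "cC":
        middle = "b" if ext[2] == "c" else "B"
        return root + ext[:2] + middle + ext[3]
    return filename


target = decompressed_name("01600000.dcd")
```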
- dbdreader.decompress.decompress_file(filename)
Decompresses a glider data file and writes the normal binary file.
- dbdreader.decompress.is_compressed(filename)
dbdreader.scripts module
- dbdreader.scripts.cac_gen()
A script to generate cac files
run cac_gen -h for more help.
- dbdreader.scripts.dbdrename()
Standalone script to rename dbd files
Use dbdrename -h for help.