data.readers.bufr.bufr

Classes

BUFRList

Represents a list of BUFRMessages.

BUFRMessage

Represents a BUFR message in a BUFR file.

Module Contents

class data.readers.bufr.bufr.BUFRList(*args, **kwargs)

Bases: BUFRListMixIn, earthkit.data.core.index.Index

Represents a list of BUFRMessages.

batched(n)

Iterate through the object in batches of n.

Parameters:

n (int) – Batch size.

Returns:

Returns an iterator yielding batches of n elements. Each batch is a new object containing a view to the data in the original object, so no data is copied. The last batch may contain fewer than n elements.

Return type:

object
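
The batching behaviour described above can be modelled in plain Python. This is a minimal sketch, not the library's implementation: the free function `batched` and the `messages` list are illustrative stand-ins, and the sketch copies items into lists, whereas the documented method yields views on the original object.

```python
from itertools import islice


def batched(items, n):
    """Yield consecutive batches of at most n items.

    Stand-in for the documented behaviour; it builds plain lists
    rather than views on the original object.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    it = iter(items)
    while True:
        batch = list(islice(it, n))
        if not batch:
            return
        yield batch


# Hypothetical stand-in for a list of seven messages.
messages = [f"msg{i}" for i in range(7)]
batches = list(batched(messages, 3))
batch_sizes = [len(b) for b in batches]  # the last batch is shorter
```

With seven items and n=3, the final batch holds the single remaining item.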

abstract bounding_box()

Return the bounding box.

config(name)
abstract datetime()

Return datetime.

default_encoder()
group_by(*keys, sort=True)

Iterate through the object in groups defined by metadata keys.

Parameters:
  • *keys (tuple) – Positional arguments specifying the metadata keys to group by. Keys can be a single or multiple str, or a list or tuple of str.

  • sort (bool, optional) – If True (default), the object is sorted by the metadata keys before grouping. Sorting is only applied if the object supports the sorting operation.

Returns:

Returns an iterator yielding batches of elements grouped by the metadata keys. Each batch is a new object containing a view to the data in the original object, so no data is copied. A new group is generated every time the value of the keys changes.

Return type:

object
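
The effect of sort on grouping can be illustrated with a small stand-alone sketch. It assumes that grouping behaves like `itertools.groupby` (a new group whenever the key values change), which matches the description above but is not confirmed to be how the library does it. The dict records are hypothetical stand-ins for messages.

```python
from itertools import groupby


def group_by(records, *keys, sort=True):
    """Yield lists of records that share the same values for keys.

    A new group starts whenever the key values change, so unsorted
    input can produce several groups with the same key values.
    """
    def key_of(record):
        return tuple(record[k] for k in keys)

    if sort:
        records = sorted(records, key=key_of)
    for _, group in groupby(records, key=key_of):
        yield list(group)


# Hypothetical header metadata for three messages.
records = [
    {"dataCategory": 2, "dataSubCategory": 101},
    {"dataCategory": 0, "dataSubCategory": 1},
    {"dataCategory": 2, "dataSubCategory": 101},
]
sorted_groups = list(group_by(records, "dataCategory"))
unsorted_groups = list(group_by(records, "dataCategory", sort=False))
```

With sort=True the two dataCategory=2 records end up in one group. With sort=False they form two separate groups because a different value sits between them.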

head(n=5, **kwargs)

Generates a list-like summary of the first n BUFRMessages using a set of metadata keys. Same as calling ls with n.

Parameters:
  • n (int, None) – The number of messages (n > 0) to be printed from the front.

  • **kwargs (dict, optional) – Other keyword arguments passed to ls.

Returns:

See ls.

Return type:

Pandas DataFrame

Notes

The following calls are equivalent:

ds.head()
ds.head(5)
ds.head(n=5)
ds.ls(5)
ds.ls(n=5)
abstract isel(*args, **kwargs)

Uses metadata value indices to select a subset of the elements from a fieldlist-like object.

Parameters:
  • *args (tuple) – Positional arguments specifying the filter conditions. (See below for details).

  • **kwargs (dict, optional) – Other keyword arguments specifying the metadata keys to perform the filtering on. (See below for details).

Returns:

Returns a new object with the filtered elements. It contains a view to the data in the original object, so no data is copied.

Return type:

object

Notes

isel works similarly to sel, but conditions are specified by indices of metadata keys. A metadata index stores the unique, sorted values of the corresponding metadata key from all the fields in the input data. If the object is a FieldList (data.readers.grib.index.FieldList), use FieldList.indices() to list the indices that have more than one value, or FieldList.index() to find the values of a specific index.

Filter conditions are specified by a set of metadata keys, either as a dictionary (in *args) or as a set of **kwargs. Single or multiple keys can be used, and each can specify one of the following types of filter values:

  • single index:

    ds.isel(param=1)
    
  • list of indices:

    ds.isel(param=[1, 3])
    
  • slice of indices (behaves like normal Python indexing, stop value not included):

    # filter levels on level indices 1 and 2
    ds.isel(level=slice(1,3))
    

Examples

>>> import earthkit.data
>>> ds = earthkit.data.from_source("file", "docs/examples/tuv_pl.grib")
>>> len(ds)
18
>>> ds.indices
{'levelist': (1000, 850, 700, 500, 400, 300), 'param': ('t', 'u', 'v')}
>>> subset = ds.isel(param=0)
>>> len(subset)
6
>>> for f in subset:
...     print(f)
...
GribField(t,1000,20180801,1200,0,0)
GribField(t,850,20180801,1200,0,0)
GribField(t,700,20180801,1200,0,0)
GribField(t,500,20180801,1200,0,0)
GribField(t,400,20180801,1200,0,0)
GribField(t,300,20180801,1200,0,0)
>>> subset = ds.isel(param=[1, 2], level=slice(2, 4))
>>> len(subset)
4
>>> for f in subset:
...     print(f)
...
GribField(u,700,20180801,1200,0,0)
GribField(v,700,20180801,1200,0,0)
GribField(u,500,20180801,1200,0,0)
GribField(v,500,20180801,1200,0,0)
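
How an index-based condition maps onto metadata values can be sketched without the library. The helper `resolve_index_condition` is hypothetical and exists only for illustration; the `indices` dict copies the values shown by `ds.indices` in the example above.

```python
def resolve_index_condition(index_values, condition):
    """Translate an isel-style condition into the metadata values it selects.

    condition may be a single int, a list or tuple of ints, or a slice,
    and follows normal Python indexing (the stop value is excluded).
    """
    if isinstance(condition, slice):
        return list(index_values[condition])
    if isinstance(condition, (list, tuple)):
        return [index_values[i] for i in condition]
    return [index_values[condition]]


# Index values as printed by ds.indices in the example above.
indices = {
    "levelist": (1000, 850, 700, 500, 400, 300),
    "param": ("t", "u", "v"),
}

params = resolve_index_condition(indices["param"], [1, 2])
levels = resolve_index_condition(indices["levelist"], slice(2, 4))
first_param = resolve_index_condition(indices["param"], 0)
```

Here `params` and `levels` correspond to the u/v fields on levels 700 and 500 listed in the second example.
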
ls(*args, **kwargs)

Generates a list-like summary of the BUFR message list using a set of metadata keys.

Parameters:
  • n (int, None) – The number of BUFRMessages to be listed. None means all the messages, n > 0 means messages from the front, while n < 0 means messages from the back of the list.

  • keys (list of str, dict, None) –

    Metadata keys. To specify a column title for each key in the output use a dict. If keys is None the following dict will be used to define the titles and the keys:

    {
        "edition": "edition",
        "type": "dataCategory",
        "subtype": "dataSubCategory",
        "c": "bufrHeaderCentre",
        "mv": "masterTablesVersionNumber",
        "lv": "localTablesVersionNumber",
        "subsets": "numberOfSubsets",
        "compr": "compressedData",
        "typicalDate": "typicalDate",
        "typicalTime": "typicalTime",
        "ident": "ident",
        "lat": "localLatitude",
        "lon": "localLongitude",
    }
    

  • extra_keys (list of str, dict, None) – Additional keys to list on top of keys. To specify a column title for each key in the output use a dict.

Returns:

DataFrame with one row per BUFRMessage.

Return type:

Pandas DataFrame

Examples

BUFR: using TEMP data
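
The meaning of n, keys and extra_keys can be modelled in a few lines of plain Python. This is only a sketch under stated assumptions: messages are represented by hypothetical dicts, rows are returned as a list of dicts instead of a pandas DataFrame, `DEFAULT_KEYS` is a shortened version of the default dict shown above, and n == 0 (not described in the text) is rejected.

```python
# Shortened version of the default title -> key mapping listed above.
DEFAULT_KEYS = {
    "edition": "edition",
    "type": "dataCategory",
    "subtype": "dataSubCategory",
}


def _as_columns(keys):
    """Normalise a list of keys or a title -> key dict into a dict."""
    if isinstance(keys, dict):
        return dict(keys)
    return {k: k for k in keys}


def ls(messages, n=None, keys=None, extra_keys=None):
    """Return one row (a dict of title -> value) per selected message."""
    columns = dict(DEFAULT_KEYS) if keys is None else _as_columns(keys)
    if extra_keys is not None:
        columns.update(_as_columns(extra_keys))

    if n is None:
        chosen = messages          # all messages
    elif n > 0:
        chosen = messages[:n]      # from the front, like head(n)
    elif n < 0:
        chosen = messages[n:]      # from the back, like tail(-n)
    else:
        raise ValueError("n == 0 is not covered by this sketch")

    return [{title: msg.get(key) for title, key in columns.items()} for msg in chosen]


# Hypothetical header metadata for seven messages.
messages = [
    {"edition": 3, "dataCategory": 2, "dataSubCategory": 101, "numberOfSubsets": i}
    for i in range(1, 8)
]
head_rows = ls(messages, 2)
tail_rows = ls(messages, -2, extra_keys={"subsets": "numberOfSubsets"})
all_rows = ls(messages)
```

The dict form of keys maps a column title to the metadata key it displays, which is why the row holds "type" rather than "dataCategory".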

metadata(*args, **kwargs)

Returns the metadata values for each message.

Parameters:
Returns:

List with one item per BUFRMessage

Return type:

list

order_by(*args, **kwargs)

Changes the order of the messages in a BUFRList object.

Parameters:
  • *args (tuple) – Positional arguments specifying the metadata keys to perform the ordering on. (See below for details)

  • **kwargs (dict, optional) – Other keyword arguments specifying the metadata keys to perform the ordering on. (See below for details)

Returns:

Returns a new object with reordered messages. It contains a view to the data in the original object, so no data is copied.

Return type:

object

property parent
sel(*args, **kwargs)

Uses header metadata values to select only certain messages from a BUFRList object.

Parameters:
  • *args (tuple) – Positional arguments specifying the filter condition as dict. (See below for details).

  • **kwargs (dict, optional) – Other keyword arguments specifying the filter conditions. (See below for details).

Returns:

Returns a new object with the filtered elements. It contains a view to the data in the original object, so no data is copied.

Return type:

object

Notes

Filter conditions are specified by a set of metadata keys, either as a dictionary (in *args) or as a set of **kwargs. Single or multiple keys can be used, and each can specify one of the following types of filter values:

  • single value:

    ds.sel(dataCategory="2")
    
  • list of values:

    ds.sel(dataCategory=[1, 2])
    
  • slice of values (defines a closed interval, so treated as inclusive of both the start and stop values, unlike normal Python indexing):

    # filter dataCategory between 1 and 4 inclusively
    ds.sel(dataCategory=slice(1,4))
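
The three kinds of filter values, and in particular the closed-interval meaning of a slice, can be sketched as follows. The functions `matches` and `sel` and the dict records are illustrative stand-ins rather than the library's code. Treating a missing slice bound as unbounded is an assumption of this sketch.

```python
def matches(value, condition):
    """Check one metadata value against a sel-style condition."""
    if isinstance(condition, slice):
        # Closed interval: both start and stop are included.
        # Assumption: a bound of None means "unbounded" on that side.
        lower_ok = condition.start is None or value >= condition.start
        upper_ok = condition.stop is None or value <= condition.stop
        return lower_ok and upper_ok
    if isinstance(condition, (list, tuple)):
        return value in condition
    return value == condition


def sel(records, **conditions):
    """Keep the records that satisfy every condition."""
    return [
        r for r in records
        if all(matches(r[key], cond) for key, cond in conditions.items())
    ]


# Hypothetical header metadata: dataCategory values 0..6.
records = [{"dataCategory": c} for c in range(7)]

by_slice = [r["dataCategory"] for r in sel(records, dataCategory=slice(1, 4))]
by_list = [r["dataCategory"] for r in sel(records, dataCategory=[1, 2])]
by_value = [r["dataCategory"] for r in sel(records, dataCategory=2)]
```

Note that `slice(1, 4)` keeps the value 4 here, whereas the same slice used for Python list indexing would stop before it.
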
tail(n=5, **kwargs)

Generates a list-like summary of the last n BUFRMessages using a set of metadata keys. Same as calling ls with -n.

Parameters:
  • n (int, None) – The number of messages (n > 0) to be printed from the back.

  • **kwargs (dict, optional) – Other keyword arguments passed to ls.

Returns:

See ls.

Return type:

Pandas DataFrame

Notes

The following calls are equivalent:

ds.tail()
ds.tail(5)
ds.tail(n=5)
ds.ls(-5)
ds.ls(n=-5)
abstract to_pandas(**kwargs)

Convert into a pandas dataframe

to_target(target, *args, **kwargs)

Write data into the specified target.

abstract to_xarray(**kwargs)

Convert into an xarray dataset

unique_values(*coords, remapping=None, patches=None, progress_bar=False)

Given a list of metadata attributes, such as date, param, levels, returns the list of unique values for each attribute.

class data.readers.bufr.bufr.BUFRMessage(path, offset, length)

Bases: earthkit.data.core.Base

Represents a BUFR message in a BUFR file.

Parameters:
  • path (str) – Path to the BUFR file

  • offset (number) – File offset of the message (in bytes)

  • length (number) – Size of the message (in bytes)
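
The (path, offset, length) triple identifies a byte range inside a file. The sketch below shows how such a range can be read with standard file operations. It is not the class's implementation: `read_message` is a hypothetical helper, and the payload is placeholder bytes that only mimic the "BUFR" start and "7777" end markers of a real message.

```python
import os
import tempfile


def read_message(path, offset, length):
    """Return length bytes starting at byte offset in the file at path."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


# Placeholder payload, NOT a decodable BUFR message.
payload = b"BUFR" + b"\x00" * 8 + b"7777"

# Write some leading bytes followed by the payload to a temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"junk" + payload)
    path = tmp.name

raw = read_message(path, offset=4, length=len(payload))
os.remove(path)
```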

batched(*args)

Return iterator for batches of data

abstract bounding_box()

Return the bounding box.

abstract datetime()

Return datetime.

dump(subset=1)

Generates a dump with the message content represented as a tree view in a Jupyter notebook.

Parameters:

subset (int) – Subset to dump. Please note that subset indexing starts at 1. Use None to dump all the subsets in the message.

Returns:

Dump contents represented as a tree view in a Jupyter notebook.

Return type:

HTML

Examples

BUFR: using TEMP data

group_by(*args)

Return iterator for batches of data grouped by metadata keys

property handle

Gets an object providing access to the low level BUFR message structure.

Type:

CodesHandle

is_compressed()

Checks if the BUFR message contains compressed subsets.

is_coord(key)

Check if the specified key is a BUFR coordinate descriptor

Parameters:

key (str) – Key name (can contain ecCodes rank)

Returns:

True if the specified key is a BUFR coordinate descriptor

Return type:

bool

is_uncompressed()

Checks if the BUFR message contains uncompressed subsets.

abstract isel(*args, **kwargs)
message()

Returns a buffer containing the encoded message.

Return type:

bytes

metadata(*keys, astype=None, **kwargs)

Returns metadata values from the BUFR message. When the message is packed (default state), only the header keys are available. To access the data section keys you need to call unpack.

Parameters:
  • *keys (tuple) – Positional arguments specifying metadata keys. Only ecCodes BUFR keys can be used here. It can contain a single str or a list or tuple. Can be empty, in this case all the keys will be used.

  • astype (type name, list or tuple) – Return types for keys. A single value is accepted and applied to all the keys. Otherwise, it must have the same number of elements as keys. Only used when keys is not empty.

  • **kwargs (dict, optional) –

    Other keyword arguments:

    • default: value, optional

      Specifies the same default value for all the keys specified. When default is not present and a key is not found or its value is a missing value, metadata will raise KeyError.

Returns:

  • when keys is not empty:
    • single value when keys is a str

    • otherwise the same type as that of keys (list or tuple)

  • when keys is empty:
    • returns a dict with one item per key

Return type:

single value, list, tuple or dict

Raises:

KeyError – If no default is set and a key is not found in the message or it has a missing value.

Examples

>>> import earthkit.data
>>> ds = earthkit.data.from_source("file", "docs/examples/temp_10.bufr")
>>> ds[0].metadata("edition")
3
>>> ds[0].metadata("dataCategory", "dataSubCategory")
(2, 101)
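
The return-type and default rules listed above can be modelled with a small stand-alone function. This is a sketch, not the method itself: the `header` dict is a hypothetical stand-in for a message, a missing value is represented by None (an assumption), and astype is left out.

```python
_NO_DEFAULT = object()  # sentinel: distinguishes "no default" from default=None


def metadata(header, *keys, default=_NO_DEFAULT):
    """Model of the documented return rules for metadata()."""
    def lookup(key):
        value = header.get(key)
        if value is not None:
            return value
        if default is _NO_DEFAULT:
            # Key not found or value missing, and no default was given.
            raise KeyError(key)
        return default

    if not keys:
        return dict(header)                      # all keys as a dict
    if len(keys) == 1 and isinstance(keys[0], str):
        return lookup(keys[0])                   # single value for a single str
    if len(keys) == 1 and isinstance(keys[0], (list, tuple)):
        return type(keys[0])(lookup(k) for k in keys[0])  # same type as input
    return tuple(lookup(k) for k in keys)        # several positional keys


# Values taken from the example above.
header = {"edition": 3, "dataCategory": 2, "dataSubCategory": 101}

single = metadata(header, "edition")
pair = metadata(header, "dataCategory", "dataSubCategory")
as_list = metadata(header, ["edition", "dataCategory"])
fallback = metadata(header, "notAKey", default=-1)
```

Calling `metadata(header, "notAKey")` without default raises KeyError in this sketch, mirroring the Raises entry above.
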
abstract order_by(*args, **kwargs)

Reorder the elements of the object.

pack()

Encodes the data section of the message. After calling pack, only the header keys are available via metadata. To access the data section you need to call unpack again.

See also

unpack
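
The packed and unpacked states can be pictured as a simple toggle that controls which keys are visible. The class below is a hypothetical model written for illustration: `MessageSketch`, `available_keys` and the sample key names are not part of the library.

```python
class MessageSketch:
    """Toy model of a message whose data keys are visible only when unpacked."""

    def __init__(self, header, data):
        self._header = dict(header)
        self._data = dict(data)
        self._unpacked = False  # packed is the default state

    def unpack(self):
        """Make the data section keys visible."""
        self._unpacked = True

    def pack(self):
        """Hide the data section keys again."""
        self._unpacked = False

    def available_keys(self):
        keys = list(self._header)
        if self._unpacked:
            keys += list(self._data)
        return keys


# Illustrative header and data section keys.
msg = MessageSketch(
    header={"edition": 3, "dataCategory": 2},
    data={"airTemperature": 273.15},
)

packed_keys = msg.available_keys()
msg.unpack()
unpacked_keys = msg.available_keys()
msg.pack()
repacked_keys = msg.available_keys()
```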

abstract sel(*args, **kwargs)

Filter the object based on metadata.

subset_count()

Returns the number of subsets in the given BUFR message.

abstract to_pandas(**kwargs)

Convert into a pandas dataframe

abstract to_target(*args, **kwargs)

Write data into the specified target.

abstract to_xarray(**kwargs)

Convert into an xarray dataset

unique_values(*coords, remapping=None, patches=None, progress_bar=False)

Given a list of metadata attributes, such as date, param, levels, returns the list of unique values for each attribute.

unpack()

Decodes the data section of the message. When a message is unpacked all the keys in the data section become available via metadata.

See also

pack

write(f)

Writes the message to a file object.

Parameters:

f (file object) – The target file object.