earthkit.data.readers.bufr.file¶
Module Contents¶
- class earthkit.data.readers.bufr.file.BUFRList(*args, **kwargs)¶
Bases: earthkit.data.featurelist.indexed.IndexFeatureListBase
Represent a list of BUFRMessages.
- batched(n)¶
Iterate through the object in batches of n.
- Parameters:
n (int) – Batch size.
- Returns:
An iterator yielding batches of n elements. Each batch is a new object containing a view of the data in the original object, so no data is copied. The last batch may contain fewer than n elements.
- Return type:
object
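The batching semantics described above can be sketched in plain Python. This is only an illustration, not earthkit-data's implementation: `batched_items` is a hypothetical helper, and unlike the real method it copies each batch into a list instead of returning a view.

```python
from itertools import islice


def batched_items(items, n):
    """Yield consecutive batches of at most n items (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, n))
        if not batch:
            return
        yield batch


# Plain strings stand in for BUFR messages.
messages = ["m0", "m1", "m2", "m3", "m4"]
batches = list(batched_items(messages, 2))
```

With five items and a batch size of two, the last batch holds the single remaining item.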
- describe(*args, **kwargs)¶
Generate a summary of the fieldlist.
- get(keys, default=None, astype=None, raise_on_missing=False, output='auto', group_by_key=False, flatten_dict=False)¶
Return values for the specified keys from all the messages.
- Parameters:
keys (str, list, tuple) – Specify the metadata keys to extract. Can be a single key (str) or multiple keys as a list/tuple of str. Keys are assumed to be of the form "component.key", for example "time.valid_datetime" or "parameter.name". It is also allowed to specify just the component name, like "time" or "parameter"; in this case the corresponding component's to_dict() method is called and its result is returned. For other keys, the method looks for them in the private components of the fields (if any) and returns the value from the first private component that contains it.
default (Any, None) – Specify the default value(s) for keys. Returned when the given key is not found and raise_on_missing is False. When default is a single value, it is used for all the keys. Otherwise it must be a list/tuple of the same length as keys.
astype (type as str, int or float) – Return type for keys. When astype is a single type, it is used for all the keys. Otherwise it must be a list/tuple of the same length as keys.
raise_on_missing (bool) – When True, raises KeyError if any of keys is not found.
output (type, str) – Specify the output structure type in conjunction with group_by_key. When group_by_key is False (default) the output is a list with one item per field and output has the following effect on the items:
- "auto" (default): when keys is a str, returns a single value per field; when keys is a list/tuple, returns a list/tuple of values per field.
- list or "list": returns a list of values per field.
- tuple or "tuple": returns a tuple of values per field.
- dict or "dict": returns a dictionary with keys and their values per field.
When group_by_key is True the output is grouped by key and contains one item per key. Each item holds the list of values for that key from all the fields. When output is dict a dict is returned, otherwise a list.
group_by_key (bool) – When True the output is grouped by key as described in output.
flatten_dict (bool) – When True and output is dict, for each field if any of the values in the returned dict is itself a dict, it is flattened to depth 1 by concatenating the keys with a dot. For example, if the returned dict is {"a": {"x": 1, "y": 2}, "b": 3}, it becomes {"a.x": 1, "a.y": 2, "b": 3}. This option is ignored when output is not dict.
remapping (dict, optional) – Create new metadata keys from existing ones. E.g. to define a new key "param_level" as the concatenated value of the "parameter.variable" and "vertical.level" keys use:
remapping={"param_level": "{parameter.variable}{vertical.level}"}
patch (dict, optional) – A dictionary of patches to be applied to the returned values.
- Returns:
The returned value depends on the output and group_by_key parameters. See above.
- Return type:
list, dict
- Raises:
KeyError – If raise_on_missing is True and any of keys is not found.
Examples
>>> import earthkit.data
>>> ds = earthkit.data.from_source("file", "docs/how-tos/test.grib")
>>> ds.get("parameter.variable")
['2t', 'msl']
>>> ds.get(["parameter.variable", "parameter.units"])
[('2t', 'K'), ('msl', 'Pa')]
>>> ds.get(("parameter.variable", "parameter.units"))
[['2t', 'K'], ['msl', 'Pa']]
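To make the interplay of output and group_by_key easier to picture, here is a small pure-Python sketch. It is an assumption-laden illustration, not the library code: `shape_output` is a hypothetical helper, it only covers the explicit list, tuple and dict settings (not "auto"), and it works on values that have already been extracted.

```python
def shape_output(rows, keys, output="list", group_by_key=False):
    """Arrange extracted values; rows[i][j] is the value of keys[j] in message i.

    Hypothetical helper that mimics the documented output shapes.
    """
    as_dict = output in (dict, "dict")
    if group_by_key:
        # One item per key, each holding the values from all messages.
        columns = [[row[j] for row in rows] for j in range(len(keys))]
        return dict(zip(keys, columns)) if as_dict else columns
    if as_dict:
        return [dict(zip(keys, row)) for row in rows]
    if output in (tuple, "tuple"):
        return [tuple(row) for row in rows]
    return [list(row) for row in rows]


keys = ["dataCategory", "edition"]
rows = [[2, 4], [0, 3]]  # made-up values for two messages

per_message = shape_output(rows, keys, output="dict")
per_key = shape_output(rows, keys, output="dict", group_by_key=True)
```

Here `per_message` has one dict per message, while `per_key` is a single dict with one list of values per key.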
- graph(depth=0)¶
- group_by(*keys, sort=True)¶
Iterate through the object in groups defined by metadata keys.
- Parameters:
*keys (tuple) – Positional arguments specifying the metadata keys to group by. Keys can be a single or multiple str, or a list or tuple of str.
sort (bool, optional) – If True (default), the object is sorted by the metadata keys before grouping. Sorting is only applied if the object supports the sorting operation.
- Returns:
An iterator yielding batches of elements grouped by the metadata keys. Each batch is a new object containing a view of the data in the original object, so no data is copied. A new group is generated every time the value of the keys changes.
- Return type:
object
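The effect of sort on grouping can be illustrated with itertools.groupby, which also starts a new group whenever the key value changes. This is a sketch under stated assumptions: `group_records` is a hypothetical helper working on plain dicts, not the earthkit-data implementation.

```python
from itertools import groupby
from operator import itemgetter


def group_records(records, *keys, sort=True):
    """Yield lists of records sharing the same values for keys (hypothetical helper)."""
    key_func = itemgetter(*keys)
    if sort:
        # Sorting first guarantees one group per distinct key value.
        records = sorted(records, key=key_func)
    for _, group in groupby(records, key=key_func):
        yield list(group)


# Made-up header values; "ident" only labels the records.
records = [
    {"dataCategory": 2, "ident": "a"},
    {"dataCategory": 0, "ident": "b"},
    {"dataCategory": 2, "ident": "c"},
]

sorted_groups = [[r["ident"] for r in g] for g in group_records(records, "dataCategory")]
unsorted_groups = [
    [r["ident"] for r in g] for g in group_records(records, "dataCategory", sort=False)
]
```

Without sorting, the two records with dataCategory 2 end up in separate groups because they are not adjacent.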
- head(n=5, **kwargs)¶
Generate a list-like summary of the first n BUFRMessages using a set of metadata keys. Same as calling ls with n.
- Parameters:
n (int, None) – The number of messages (n > 0) to be printed from the front.
**kwargs (dict, optional) – Other keyword arguments passed to ls.
- Returns:
See ls.
- Return type:
Pandas DataFrame
Notes
The following calls are equivalent:
ds.head()
ds.head(5)
ds.head(n=5)
ds.ls(5)
ds.ls(n=5)
- ignore()¶
Indicates whether this source should be ignored in concatenation/merging.
- Return type:
bool
- ls(n=None, keys='default', extra_keys=None)¶
Generate a list-like summary of the BUFR message list using a set of metadata keys.
- Parameters:
n (int, None) – The number of BUFRMessages to be listed. None means all the messages, n > 0 means messages from the front, while n < 0 means messages from the back of the list.
keys (list of str, dict, None) – Metadata keys. To specify a column title for each key in the output use a dict with keys as the metadata keys and values as the column titles. If keys is None the following list is used to define the keys and the titles:
[
    "edition",
    "dataCategory",
    "dataSubCategory",
    "bufrHeaderCentre",
    "masterTablesVersionNumber",
    "localTablesVersionNumber",
    "numberOfSubsets",
    "compressedData",
    "typicalDate",
    "typicalTime",
    "ident",
    "localLatitude",
    "localLongitude",
]
extra_keys (list of str, dict, None) – List of keys to use in addition to keys. To specify a column title for each key in the output use a dict.
- Returns:
DataFrame with one row per BUFRMessage.
- Return type:
Pandas DataFrame
Examples
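As a rough illustration of how n selects messages in ls (and therefore in head and tail), consider the pure-Python sketch below. `select_for_listing` is a hypothetical helper, and rejecting n = 0 is a choice made for this sketch only; the documentation above does not say how the library treats that value.

```python
def select_for_listing(messages, n=None):
    """Pick the messages to list, following the documented meaning of n."""
    if n is None:
        return list(messages)  # all messages
    if n > 0:
        return list(messages[:n])  # first n messages
    if n < 0:
        return list(messages[n:])  # last |n| messages
    raise ValueError("n must be None or a non-zero integer (sketch assumption)")


messages = ["m0", "m1", "m2", "m3"]
front = select_for_listing(messages, 2)
back = select_for_listing(messages, -2)
```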
- classmethod merge(sources)¶
- metadata(*args, **kwargs)¶
Return the metadata values for each message.
- Parameters:
*args (tuple) – Positional arguments defining the metadata keys. Passed to BUFRMessage.metadata().
**kwargs (dict, optional) – Keyword arguments passed to BUFRMessage.metadata().
- Returns:
List with one item per BUFRMessage.
- Return type:
list
- mutate()¶
- mutate_source()¶
- name = None¶
- classmethod new_mask_index(*args, **kwargs)¶
- order_by(*args, **kwargs)¶
Change the order of the messages in a BUFRList object.
- Parameters:
*args (tuple) – Positional arguments specifying the metadata keys to perform the ordering on.
**kwargs (dict, optional) – Other keyword arguments specifying the metadata keys to perform the ordering on.
- Returns:
A new object with reordered messages. It contains a view of the data in the original object, so no data is copied.
- Return type:
object
- property parent¶
The parent source, if any.
- sel(*args, remapping=None, **kwargs)¶
Use header metadata values to select only certain messages from a BUFRList object.
- Parameters:
*args (tuple) – Positional arguments specifying the filter conditions as a dict. (See below for details).
**kwargs (dict, optional) – Other keyword arguments specifying the filter conditions. (See below for details).
- Returns:
A new object with the filtered elements. It contains a view of the data in the original object, so no data is copied.
- Return type:
object
Notes
Filter conditions are specified by a set of metadata keys, either as a dictionary (in *args) or as a set of **kwargs. Single or multiple keys can be used, and each can specify the following types of filter values:
single value:
ds.sel(dataCategory="2")
list of values:
ds.sel(dataCategory=[1, 2])
slice of values (defines a closed interval, so treated as inclusive of both the start and stop values, unlike normal Python indexing):
# filter dataCategory between 1 and 4 inclusively
ds.sel(dataCategory=slice(1, 4))
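The three kinds of filter value, and in particular the closed-interval reading of slice, can be sketched as follows. `matches` and `select` are hypothetical helpers operating on plain dicts; they compare values as given and make no attempt at the type handling the library may perform (for example for the string "2" above).

```python
def matches(value, condition):
    """Return True if value satisfies a single, list or slice condition."""
    if value is None:
        return False
    if isinstance(condition, slice):
        # Closed interval: both start and stop are included.
        lower_ok = condition.start is None or value >= condition.start
        upper_ok = condition.stop is None or value <= condition.stop
        return lower_ok and upper_ok
    if isinstance(condition, (list, tuple)):
        return value in condition
    return value == condition


def select(records, *args, **kwargs):
    """Keep the records that satisfy every condition (dicts in args, or kwargs)."""
    conditions = {}
    for arg in args:
        conditions.update(arg)
    conditions.update(kwargs)
    return [
        rec
        for rec in records
        if all(matches(rec.get(key), cond) for key, cond in conditions.items())
    ]


records = [{"dataCategory": c} for c in (0, 2, 4, 7)]
in_range = [r["dataCategory"] for r in select(records, dataCategory=slice(1, 4))]
in_list = [r["dataCategory"] for r in select(records, {"dataCategory": [0, 7]})]
```

Note that the value 4 is kept by `slice(1, 4)`, which would not be the case with ordinary Python slicing.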
- source_filename = None¶
- tail(n=5, **kwargs)¶
Generate a list-like summary of the last n BUFRMessages using a set of metadata keys. Same as calling ls with -n.
- Parameters:
n (int, None) – The number of messages (n > 0) to be printed from the back.
**kwargs (dict, optional) – Other keyword arguments passed to ls.
- Returns:
See ls.
- Return type:
Pandas DataFrame
Notes
The following calls are equivalent:
ds.tail()
ds.tail(5)
ds.tail(n=5)
ds.ls(-5)
ds.ls(n=-5)
- to_data_object()¶
Convert this source into a data object, if possible.
- to_numpy(*args, **kwargs)¶
- to_pandas(columns=None, filters=None, **kwargs)¶
Extract BUFR data into a pandas DataFrame using pdbufr.
- Parameters:
columns (str, sequence[str]) – List of ecCodes BUFR keys to extract for each BUFR message/subset. See pdbufr.read_bufr() for details.
filters (dict) – Defines the conditions under which the specified columns are extracted. See pdbufr.read_bufr() for details.
**kwargs (dict, optional) – Other keyword arguments passed to pdbufr.read_bufr().
- Return type:
Pandas DataFrame
Examples
- to_target(target, *args, **kwargs)¶
- unique(*args, sort=False, drop_none=True, squeeze=False, unwrap_single=False, remapping=None, patch=None, progress_bar=False, cache=True)¶
Given a list of metadata attributes, such as date, param, levels, return the list of unique values for each attribute.
- Parameters:
*args (tuple) – Positional arguments specifying the metadata keys to collect unique values for.
sort (bool, optional) – Whether to sort the collected unique values. Default is False.
drop_none (bool, optional) – Whether to drop None values from the collected unique values. Default is True.
squeeze (bool, optional) – Whether to return a single value instead of a list if there is only one unique value for a key. Default is False.
remapping (dict, optional) – A dictionary for remapping keys or values during collection. Default is None.
patch (dict, optional) – A dictionary for patching key values during collection. Default is None.
progress_bar (bool, optional) – Whether to display a progress bar during collection. Default is False.
cache (bool, optional) – Whether to use a cached collector. Default is True.
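The sort, drop_none and squeeze options can be pictured with the sketch below. It is a simplified stand-in, not the library's collector: `unique_values` is a hypothetical helper on plain dicts, and returning a dict keyed by metadata key is an assumption of this sketch.

```python
def unique_values(records, *keys, sort=False, drop_none=True, squeeze=False):
    """Collect the unique values of each key, in first-seen order unless sorted."""
    result = {}
    for key in keys:
        seen = {}  # dict keeps insertion order and removes duplicates
        for rec in records:
            value = rec.get(key)
            if value is None and drop_none:
                continue
            seen.setdefault(value, None)
        values = sorted(seen) if sort else list(seen)
        if squeeze and len(values) == 1:
            values = values[0]
        result[key] = values
    return result


# Made-up header values for three messages.
records = [
    {"dataCategory": 2, "edition": 4},
    {"dataCategory": 0, "edition": 4},
    {"dataCategory": 2, "edition": None},
]
summary = unique_values(records, "dataCategory", "edition", sort=True, squeeze=True)
```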
- class earthkit.data.readers.bufr.file.BUFRListInFile(path, parts=None, positions=None)¶
Bases: BUFRList, earthkit.data.readers.bufr.core.BUFRReaderBase
Represent a list of BUFRMessages.
- property appendable¶
- batched(n)¶
Iterate through the object in batches of n.
- Parameters:
n (int) – Batch size.
- Returns:
An iterator yielding batches of n elements. Each batch is a new object containing a view of the data in the original object, so no data is copied. The last batch may contain fewer than n elements.
- Return type:
object
- property binary¶
- describe(*args, **kwargs)¶
Generate a summary of the fieldlist.
- property filter¶
- get(keys, default=None, astype=None, raise_on_missing=False, output='auto', group_by_key=False, flatten_dict=False)¶
Return values for the specified keys from all the messages.
- Parameters:
keys (str, list, tuple) – Specify the metadata keys to extract. Can be a single key (str) or multiple keys as a list/tuple of str. Keys are assumed to be of the form "component.key", for example "time.valid_datetime" or "parameter.name". It is also allowed to specify just the component name, like "time" or "parameter"; in this case the corresponding component's to_dict() method is called and its result is returned. For other keys, the method looks for them in the private components of the fields (if any) and returns the value from the first private component that contains it.
default (Any, None) – Specify the default value(s) for keys. Returned when the given key is not found and raise_on_missing is False. When default is a single value, it is used for all the keys. Otherwise it must be a list/tuple of the same length as keys.
astype (type as str, int or float) – Return type for keys. When astype is a single type, it is used for all the keys. Otherwise it must be a list/tuple of the same length as keys.
raise_on_missing (bool) – When True, raises KeyError if any of keys is not found.
output (type, str) – Specify the output structure type in conjunction with group_by_key. When group_by_key is False (default) the output is a list with one item per field and output has the following effect on the items:
- "auto" (default): when keys is a str, returns a single value per field; when keys is a list/tuple, returns a list/tuple of values per field.
- list or "list": returns a list of values per field.
- tuple or "tuple": returns a tuple of values per field.
- dict or "dict": returns a dictionary with keys and their values per field.
When group_by_key is True the output is grouped by key and contains one item per key. Each item holds the list of values for that key from all the fields. When output is dict a dict is returned, otherwise a list.
group_by_key (bool) – When True the output is grouped by key as described in output.
flatten_dict (bool) – When True and output is dict, for each field if any of the values in the returned dict is itself a dict, it is flattened to depth 1 by concatenating the keys with a dot. For example, if the returned dict is {"a": {"x": 1, "y": 2}, "b": 3}, it becomes {"a.x": 1, "a.y": 2, "b": 3}. This option is ignored when output is not dict.
remapping (dict, optional) – Create new metadata keys from existing ones. E.g. to define a new key "param_level" as the concatenated value of the "parameter.variable" and "vertical.level" keys use:
remapping={"param_level": "{parameter.variable}{vertical.level}"}
patch (dict, optional) – A dictionary of patches to be applied to the returned values.
- Returns:
The returned value depends on the output and group_by_key parameters. See above.
- Return type:
list, dict
- Raises:
KeyError – If raise_on_missing is True and any of keys is not found.
Examples
>>> import earthkit.data
>>> ds = earthkit.data.from_source("file", "docs/how-tos/test.grib")
>>> ds.get("parameter.variable")
['2t', 'msl']
>>> ds.get(["parameter.variable", "parameter.units"])
[('2t', 'K'), ('msl', 'Pa')]
>>> ds.get(("parameter.variable", "parameter.units"))
[['2t', 'K'], ['msl', 'Pa']]
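The depth-1 flattening described for flatten_dict can be written in a few lines of plain Python. `flatten_depth_one` is a hypothetical helper that reproduces the documented example; it is not taken from the library.

```python
def flatten_depth_one(values):
    """Flatten nested dict values by one level, joining keys with a dot."""
    flat = {}
    for key, value in values.items():
        if isinstance(value, dict):
            for sub_key, sub_value in value.items():
                flat[f"{key}.{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat


flattened = flatten_depth_one({"a": {"x": 1, "y": 2}, "b": 3})
```

Only one level is flattened, so a dict nested inside "a" would be kept as a dict value.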
- graph(depth=0)¶
- group_by(*keys, sort=True)¶
Iterate through the object in groups defined by metadata keys.
- Parameters:
*keys (tuple) – Positional arguments specifying the metadata keys to group by. Keys can be a single or multiple str, or a list or tuple of str.
sort (bool, optional) – If True (default), the object is sorted by the metadata keys before grouping. Sorting is only applied if the object supports the sorting operation.
- Returns:
An iterator yielding batches of elements grouped by the metadata keys. Each batch is a new object containing a view of the data in the original object, so no data is copied. A new group is generated every time the value of the keys changes.
- Return type:
object
- head(n=5, **kwargs)¶
Generate a list-like summary of the first n BUFRMessages using a set of metadata keys. Same as calling ls with n.
- Parameters:
n (int, None) – The number of messages (n > 0) to be printed from the front.
**kwargs (dict, optional) – Other keyword arguments passed to ls.
- Returns:
See ls.
- Return type:
Pandas DataFrame
Notes
The following calls are equivalent:
ds.head()
ds.head(5)
ds.head(n=5)
ds.ls(5)
ds.ls(n=5)
- ignore()¶
Indicates whether this source should be ignored in concatenation/merging.
- Return type:
bool
- ls(n=None, keys='default', extra_keys=None)¶
Generate a list-like summary of the BUFR message list using a set of metadata keys.
- Parameters:
n (int, None) – The number of BUFRMessages to be listed. None means all the messages, n > 0 means messages from the front, while n < 0 means messages from the back of the list.
keys (list of str, dict, None) – Metadata keys. To specify a column title for each key in the output use a dict with keys as the metadata keys and values as the column titles. If keys is None the following list is used to define the keys and the titles:
[
    "edition",
    "dataCategory",
    "dataSubCategory",
    "bufrHeaderCentre",
    "masterTablesVersionNumber",
    "localTablesVersionNumber",
    "numberOfSubsets",
    "compressedData",
    "typicalDate",
    "typicalTime",
    "ident",
    "localLatitude",
    "localLongitude",
]
extra_keys (list of str, dict, None) – List of keys to use in addition to keys. To specify a column title for each key in the output use a dict.
- Returns:
DataFrame with one row per BUFRMessage.
- Return type:
Pandas DataFrame
Examples
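The way keys and extra_keys map metadata keys to column titles can be sketched without pandas. Everything here is hypothetical: `as_title_map`, `listing_columns` and `listing_rows` are illustrative helpers, the default key list is shortened, and rows are plain dicts rather than DataFrame rows.

```python
def as_title_map(keys):
    """Turn a list of keys or a {key: title} dict into a {key: title} dict."""
    if keys is None:
        return {}
    if isinstance(keys, dict):
        return dict(keys)
    return {key: key for key in keys}


def listing_columns(default_keys, keys=None, extra_keys=None):
    """Combine the (default or given) keys with extra_keys, keeping their order."""
    columns = as_title_map(default_keys if keys is None else keys)
    columns.update(as_title_map(extra_keys))
    return columns


def listing_rows(records, columns):
    """Build one row per record, using the column titles as row keys."""
    return [{title: rec.get(key) for key, title in columns.items()} for rec in records]


default_keys = ["edition", "dataCategory"]  # shortened stand-in for the default list
columns = listing_columns(default_keys, extra_keys={"numberOfSubsets": "subsets"})
rows = listing_rows([{"edition": 4, "dataCategory": 2, "numberOfSubsets": 10}], columns)
```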
- classmethod merge(sources)¶
- property merger¶
- metadata(*args, **kwargs)¶
Return the metadata values for each message.
- Parameters:
*args (tuple) – Positional arguments defining the metadata keys. Passed to BUFRMessage.metadata().
**kwargs (dict, optional) – Keyword arguments passed to BUFRMessage.metadata().
- Returns:
List with one item per BUFRMessage.
- Return type:
list
- mutate()¶
- mutate_source()¶
- name = None¶
- classmethod new_mask_index(*args, **kwargs)¶
- number_of_parts()¶
- order_by(*args, **kwargs)¶
Change the order of the messages in a BUFRList object.
- Parameters:
*args (tuple) – Positional arguments specifying the metadata keys to perform the ordering on.
**kwargs (dict, optional) – Other keyword arguments specifying the metadata keys to perform the ordering on.
- Returns:
A new object with reordered messages. It contains a view of the data in the original object, so no data is copied.
- Return type:
object
- property parent¶
The parent source, if any.
- part(n)¶
- property parts¶
- path¶
- sel(*args, remapping=None, **kwargs)¶
Use header metadata values to select only certain messages from a BUFRList object.
- Parameters:
*args (tuple) – Positional arguments specifying the filter conditions as a dict. (See below for details).
**kwargs (dict, optional) – Other keyword arguments specifying the filter conditions. (See below for details).
- Returns:
A new object with the filtered elements. It contains a view of the data in the original object, so no data is copied.
- Return type:
object
Notes
Filter conditions are specified by a set of metadata keys, either as a dictionary (in *args) or as a set of **kwargs. Single or multiple keys can be used, and each can specify the following types of filter values:
single value:
ds.sel(dataCategory="2")
list of values:
ds.sel(dataCategory=[1, 2])
slice of values (defines a closed interval, so treated as inclusive of both the start and stop values, unlike normal Python indexing):
# filter dataCategory between 1 and 4 inclusively
ds.sel(dataCategory=slice(1, 4))
- property source¶
- source_filename = None¶
- property stream¶
- tail(n=5, **kwargs)¶
Generate a list-like summary of the last n BUFRMessages using a set of metadata keys. Same as calling ls with -n.
- Parameters:
n (int, None) – The number of messages (n > 0) to be printed from the back.
**kwargs (dict, optional) – Other keyword arguments passed to ls.
- Returns:
See ls.
- Return type:
Pandas DataFrame
Notes
The following calls are equivalent:
ds.tail()
ds.tail(5)
ds.tail(n=5)
ds.ls(-5)
ds.ls(n=-5)
- to_data_object()¶
Convert this source into a data object, if possible.
- to_numpy(*args, **kwargs)¶
- to_pandas(columns=None, filters=None, **kwargs)¶
Extract BUFR data into a pandas DataFrame using pdbufr.
- Parameters:
columns (str, sequence[str]) – List of ecCodes BUFR keys to extract for each BUFR message/subset. See pdbufr.read_bufr() for details.
filters (dict) – Defines the conditions under which the specified columns are extracted. See pdbufr.read_bufr() for details.
**kwargs (dict, optional) – Other keyword arguments passed to pdbufr.read_bufr().
- Return type:
Pandas DataFrame
Examples
- to_target(target, *args, **kwargs)¶
- unique(*args, sort=False, drop_none=True, squeeze=False, unwrap_single=False, remapping=None, patch=None, progress_bar=False, cache=True)¶
Given a list of metadata attributes, such as date, param, levels, return the list of unique values for each attribute.
- Parameters:
*args (tuple) – Positional arguments specifying the metadata keys to collect unique values for.
sort (bool, optional) – Whether to sort the collected unique values. Default is False.
drop_none (bool, optional) – Whether to drop None values from the collected unique values. Default is True.
squeeze (bool, optional) – Whether to return a single value instead of a list if there is only one unique value for a key. Default is False.
remapping (dict, optional) – A dictionary for remapping keys or values during collection. Default is None.
patch (dict, optional) – A dictionary for patching key values during collection. Default is None.
progress_bar (bool, optional) – Whether to display a progress bar during collection. Default is False.
cache (bool, optional) – Whether to use a cached collector. Default is True.
- class earthkit.data.readers.bufr.file.BUFRReader(source, path, parts=None, positions=None)¶
Bases: earthkit.data.sources.Source, earthkit.data.readers.bufr.core.BUFRReaderBase
Base class for all sources.
- property appendable¶
- property binary¶
- property filter¶
- graph(depth=0)¶
- ignore()¶
Indicates whether this source should be ignored in concatenation/merging.
- Return type:
bool
- is_streamable_file()¶
- classmethod merge(sources)¶
- property merger¶
- mutate()¶
- mutate_source()¶
- name = None¶
- property parent¶
The parent source, if any.
- property parts¶
- path¶
- property source¶
- source_filename = None¶
- property stream¶
- to_data_object()¶
Convert this source into a data object, if possible.
- to_featurelist()¶
- to_pandas(*args, **kwargs)¶
- to_target(target, *args, **kwargs)¶
- earthkit.data.readers.bufr.file.BUFR_LS_KEYS = ['edition', 'dataCategory', 'dataSubCategory', 'bufrHeaderCentre', 'masterTablesVersionNumber',...¶
- earthkit.data.readers.bufr.file.COLUMNS = ('latitude', 'longitude', 'data_datetime')¶
- class earthkit.data.readers.bufr.file.MaskBUFRList(*args, **kwargs)¶
Bases: BUFRList, earthkit.data.core.index.MaskIndex
Represent a list of BUFRMessages.
- batched(n)¶
Iterate through the object in batches of n.
- Parameters:
n (int) – Batch size.
- Returns:
An iterator yielding batches of n elements. Each batch is a new object containing a view of the data in the original object, so no data is copied. The last batch may contain fewer than n elements.
- Return type:
object
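The recurring phrase "a view of the data in the original object" can be illustrated with an index mask: the view stores positions in a parent sequence and looks items up on demand. `MaskView` below is a hypothetical class written for this illustration; it is not the MaskIndex implementation this class derives from.

```python
class MaskView:
    """Expose selected positions of a parent sequence without copying its items."""

    def __init__(self, parent, indices):
        self._parent = parent
        self._indices = list(indices)

    def __len__(self):
        return len(self._indices)

    def __getitem__(self, i):
        # Translate the view position into a parent position.
        return self._parent[self._indices[i]]

    def __iter__(self):
        return (self._parent[j] for j in self._indices)


parent = ["m0", "m1", "m2", "m3"]
view = MaskView(parent, [3, 1])  # a reordered subset of the parent
items = list(view)
```

Because only indices are stored, the items returned by the view are the very same objects held by the parent.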
- describe(*args, **kwargs)¶
Generate a summary of the fieldlist.
- get(keys, default=None, astype=None, raise_on_missing=False, output='auto', group_by_key=False, flatten_dict=False)¶
Return values for the specified keys from all the messages.
- Parameters:
keys (str, list, tuple) – Specify the metadata keys to extract. Can be a single key (str) or multiple keys as a list/tuple of str. Keys are assumed to be of the form "component.key", for example "time.valid_datetime" or "parameter.name". It is also allowed to specify just the component name, like "time" or "parameter"; in this case the corresponding component's to_dict() method is called and its result is returned. For other keys, the method looks for them in the private components of the fields (if any) and returns the value from the first private component that contains it.
default (Any, None) – Specify the default value(s) for keys. Returned when the given key is not found and raise_on_missing is False. When default is a single value, it is used for all the keys. Otherwise it must be a list/tuple of the same length as keys.
astype (type as str, int or float) – Return type for keys. When astype is a single type, it is used for all the keys. Otherwise it must be a list/tuple of the same length as keys.
raise_on_missing (bool) – When True, raises KeyError if any of keys is not found.
output (type, str) – Specify the output structure type in conjunction with group_by_key. When group_by_key is False (default) the output is a list with one item per field and output has the following effect on the items:
- "auto" (default): when keys is a str, returns a single value per field; when keys is a list/tuple, returns a list/tuple of values per field.
- list or "list": returns a list of values per field.
- tuple or "tuple": returns a tuple of values per field.
- dict or "dict": returns a dictionary with keys and their values per field.
When group_by_key is True the output is grouped by key and contains one item per key. Each item holds the list of values for that key from all the fields. When output is dict a dict is returned, otherwise a list.
group_by_key (bool) – When True the output is grouped by key as described in output.
flatten_dict (bool) – When True and output is dict, for each field if any of the values in the returned dict is itself a dict, it is flattened to depth 1 by concatenating the keys with a dot. For example, if the returned dict is {"a": {"x": 1, "y": 2}, "b": 3}, it becomes {"a.x": 1, "a.y": 2, "b": 3}. This option is ignored when output is not dict.
remapping (dict, optional) – Create new metadata keys from existing ones. E.g. to define a new key "param_level" as the concatenated value of the "parameter.variable" and "vertical.level" keys use:
remapping={"param_level": "{parameter.variable}{vertical.level}"}
patch (dict, optional) – A dictionary of patches to be applied to the returned values.
- Returns:
The returned value depends on the output and group_by_key parameters. See above.
- Return type:
list, dict
- Raises:
KeyError – If raise_on_missing is True and any of keys is not found.
Examples
>>> import earthkit.data
>>> ds = earthkit.data.from_source("file", "docs/how-tos/test.grib")
>>> ds.get("parameter.variable")
['2t', 'msl']
>>> ds.get(["parameter.variable", "parameter.units"])
[('2t', 'K'), ('msl', 'Pa')]
>>> ds.get(("parameter.variable", "parameter.units"))
[['2t', 'K'], ['msl', 'Pa']]
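The remapping template shown above concatenates existing key values into a new key. One way to picture the substitution is the sketch below; `apply_remapping` is a hypothetical helper and the metadata values are made up. A regular expression is used rather than str.format because the placeholder names contain dots.

```python
import re


def apply_remapping(metadata, remapping):
    """Build new keys by filling each {key} placeholder with the metadata value."""

    def render(template):
        return re.sub(
            r"\{([^{}]+)\}",
            lambda match: str(metadata[match.group(1)]),
            template,
        )

    return {new_key: render(template) for new_key, template in remapping.items()}


metadata = {"parameter.variable": "t", "vertical.level": 500}
remapped = apply_remapping(
    metadata, {"param_level": "{parameter.variable}{vertical.level}"}
)
```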
- graph(depth=0)¶
- group_by(*keys, sort=True)¶
Iterate through the object in groups defined by metadata keys.
- Parameters:
*keys (tuple) – Positional arguments specifying the metadata keys to group by. Keys can be a single or multiple str, or a list or tuple of str.
sort (bool, optional) – If True (default), the object is sorted by the metadata keys before grouping. Sorting is only applied if the object supports the sorting operation.
- Returns:
An iterator yielding batches of elements grouped by the metadata keys. Each batch is a new object containing a view of the data in the original object, so no data is copied. A new group is generated every time the value of the keys changes.
- Return type:
object
- head(n=5, **kwargs)¶
Generate a list-like summary of the first n BUFRMessages using a set of metadata keys. Same as calling ls with n.
- Parameters:
n (int, None) – The number of messages (n > 0) to be printed from the front.
**kwargs (dict, optional) – Other keyword arguments passed to ls.
- Returns:
See ls.
- Return type:
Pandas DataFrame
Notes
The following calls are equivalent:
ds.head()
ds.head(5)
ds.head(n=5)
ds.ls(5)
ds.ls(n=5)
- ignore()¶
Indicates whether this source should be ignored in concatenation/merging.
- Return type:
bool
- ls(n=None, keys='default', extra_keys=None)¶
Generate a list-like summary of the BUFR message list using a set of metadata keys.
- Parameters:
n (int, None) – The number of BUFRMessages to be listed. None means all the messages, n > 0 means messages from the front, while n < 0 means messages from the back of the list.
keys (list of str, dict, None) – Metadata keys. To specify a column title for each key in the output use a dict with keys as the metadata keys and values as the column titles. If keys is None the following list is used to define the keys and the titles:
[
    "edition",
    "dataCategory",
    "dataSubCategory",
    "bufrHeaderCentre",
    "masterTablesVersionNumber",
    "localTablesVersionNumber",
    "numberOfSubsets",
    "compressedData",
    "typicalDate",
    "typicalTime",
    "ident",
    "localLatitude",
    "localLongitude",
]
extra_keys (list of str, dict, None) – List of keys to use in addition to keys. To specify a column title for each key in the output use a dict.
- Returns:
DataFrame with one row per BUFRMessage.
- Return type:
Pandas DataFrame
Examples
- classmethod merge(sources)¶
- metadata(*args, **kwargs)¶
Return the metadata values for each message.
- Parameters:
*args (tuple) – Positional arguments defining the metadata keys. Passed to BUFRMessage.metadata().
**kwargs (dict, optional) – Keyword arguments passed to BUFRMessage.metadata().
- Returns:
List with one item per BUFRMessage.
- Return type:
list
- mutate()¶
- mutate_source()¶
- name = None¶
- classmethod new_mask_index(*args, **kwargs)¶
- order_by(*args, **kwargs)¶
Change the order of the messages in a BUFRList object.
- Parameters:
*args (tuple) – Positional arguments specifying the metadata keys to perform the ordering on.
**kwargs (dict, optional) – Other keyword arguments specifying the metadata keys to perform the ordering on.
- Returns:
A new object with reordered messages. It contains a view of the data in the original object, so no data is copied.
- Return type:
object
- property parent¶
The parent source, if any.
- sel(*args, remapping=None, **kwargs)¶
Use header metadata values to select only certain messages from a BUFRList object.
- Parameters:
*args (tuple) – Positional arguments specifying the filter conditions as a dict. (See below for details).
**kwargs (dict, optional) – Other keyword arguments specifying the filter conditions. (See below for details).
- Returns:
A new object with the filtered elements. It contains a view of the data in the original object, so no data is copied.
- Return type:
object
Notes
Filter conditions are specified by a set of metadata keys, either as a dictionary (in *args) or as a set of **kwargs. Single or multiple keys can be used, and each can specify the following types of filter values:
single value:
ds.sel(dataCategory="2")
list of values:
ds.sel(dataCategory=[1, 2])
slice of values (defines a closed interval, so treated as inclusive of both the start and stop values, unlike normal Python indexing):
# filter dataCategory between 1 and 4 inclusively
ds.sel(dataCategory=slice(1, 4))
- source_filename = None¶
- tail(n=5, **kwargs)¶
Generate a list-like summary of the last n BUFRMessages using a set of metadata keys. Same as calling ls with -n.
- Parameters:
n (int, None) – The number of messages (n > 0) to be printed from the back.
**kwargs (dict, optional) – Other keyword arguments passed to ls.
- Returns:
See ls.
- Return type:
Pandas DataFrame
Notes
The following calls are equivalent:
ds.tail()
ds.tail(5)
ds.tail(n=5)
ds.ls(-5)
ds.ls(n=-5)
- to_data_object()¶
Convert this source into a data object, if possible.
- to_numpy(*args, **kwargs)¶
- to_pandas(columns=None, filters=None, **kwargs)¶
Extract BUFR data into a pandas DataFrame using pdbufr.
- Parameters:
columns (str, sequence[str]) – List of ecCodes BUFR keys to extract for each BUFR message/subset. See pdbufr.read_bufr() for details.
filters (dict) – Defines the conditions under which the specified columns are extracted. See pdbufr.read_bufr() for details.
**kwargs (dict, optional) – Other keyword arguments passed to pdbufr.read_bufr().
- Return type:
Pandas DataFrame
Examples
- to_target(target, *args, **kwargs)¶
- unique(*args, sort=False, drop_none=True, squeeze=False, unwrap_single=False, remapping=None, patch=None, progress_bar=False, cache=True)¶
Given a list of metadata attributes, such as date, param, levels, return the list of unique values for each attribute.
- Parameters:
*args (tuple) – Positional arguments specifying the metadata keys to collect unique values for.
sort (bool, optional) – Whether to sort the collected unique values. Default is False.
drop_none (bool, optional) – Whether to drop None values from the collected unique values. Default is True.
squeeze (bool, optional) – Whether to return a single value instead of a list if there is only one unique value for a key. Default is False.
remapping (dict, optional) – A dictionary for remapping keys or values during collection. Default is None.
patch (dict, optional) – A dictionary for patching key values during collection. Default is None.
progress_bar (bool, optional) – Whether to display a progress bar during collection. Default is False.
cache (bool, optional) – Whether to use a cached collector. Default is True.
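The effect of the sort, drop_none and squeeze options can be illustrated on plain dictionaries. This is a minimal sketch and not the earthkit-data implementation: `unique_values` is a hypothetical helper, the records stand in for message metadata, and returning a dict keyed by metadata key is an assumption about the result shape.

```python
def unique_values(records, *keys, sort=False, drop_none=True, squeeze=False):
    """Collect the unique values of ``keys`` across ``records``.

    ``records`` is a list of plain dicts standing in for message metadata.
    Hypothetical helper for illustration only; not part of earthkit-data.
    """
    result = {}
    for key in keys:
        seen = []
        for record in records:
            value = record.get(key)
            if value is None and drop_none:
                continue
            if value not in seen:  # keep first-seen order
                seen.append(value)
        if sort:
            seen.sort()
        if squeeze and len(seen) == 1:
            seen = seen[0]  # single value instead of a one-item list
        result[key] = seen
    return result


records = [
    {"dataCategory": 2, "ident": "B"},
    {"dataCategory": 0, "ident": None},
    {"dataCategory": 2, "ident": "A"},
]

unsorted = unique_values(records, "dataCategory", "ident")
squeezed = unique_values(records[:1], "dataCategory", sort=True, squeeze=True)
print(unsorted)  # {'dataCategory': [2, 0], 'ident': ['B', 'A']}
print(squeezed)  # {'dataCategory': 2}
```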
- class earthkit.data.readers.bufr.file.MultiBUFRList(*args, **kwargs)¶
Bases:
BUFRList, earthkit.data.core.index.MultiIndex
Represent a list of BUFRMessages.
- batched(n)¶
Iterate through the object in batches of n.
- Parameters:
n (int) – Batch size.
- Returns:
Returns an iterator yielding batches of n elements. Each batch is a new object containing a view to the data in the original object, so no data is copied. The last batch may contain fewer than n elements.
- Return type:
object
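The batching behaviour, including the shorter final batch, can be sketched with the standard library. This is a plain-Python stand-in rather than the earthkit-data implementation: the function below is hypothetical and yields new lists, whereas the documented method yields views of the original object.

```python
from itertools import islice


def batched(items, n):
    """Yield successive batches of at most ``n`` items.

    Hypothetical stand-in for illustration only: batches here are new
    lists, not views of the original object.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    iterator = iter(items)
    # islice pulls up to n items at a time; an empty list ends the loop.
    while batch := list(islice(iterator, n)):
        yield batch


batches = list(batched(range(7), 3))
print(batches)  # [[0, 1, 2], [3, 4, 5], [6]]
```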
- describe(*args, **kwargs)¶
Generate a summary of the fieldlist.
- get(keys, default=None, astype=None, raise_on_missing=False, output='auto', group_by_key=False, flatten_dict=False)¶
Return values for the specified keys from all the messages.
- Parameters:
keys (str, list, tuple) – Specify the metadata keys to extract. Can be a single key (str) or multiple keys as a list/tuple of str. Keys are assumed to be of the form “component.key”, for example “time.valid_datetime” or “parameter.name”. It is also allowed to specify just the component name, like “time” or “parameter”. In this case the corresponding component’s to_dict() method is called and its result is returned. For other keys, the method looks for them in the private components of the fields (if any) and returns the value from the first private component that contains it.
default (Any, None) – Specify the default value(s) for keys. Returned when the given key is not found and raise_on_missing is False. When default is a single value, it is used for all the keys. Otherwise it must be a list/tuple of the same length as keys.
astype (type as str, int or float) – Return type for keys. When astype is a single type, it is used for all the keys. Otherwise it must be a list/tuple of the same length as keys.
raise_on_missing (bool) – When True, raises KeyError if any of keys is not found.
output (type, str) – Specify the output structure type in conjunction with group_by_key. When group_by_key is False (default) the output is a list with one item per field and output has the following effect on the items:
- “auto” (default): when keys is a str, returns a single value per field; when keys is a list/tuple, returns a list/tuple of values per field.
- list or “list”: returns a list of values per field.
- tuple or “tuple”: returns a tuple of values per field.
- dict or “dict”: returns a dictionary with keys and their values per field.
When group_by_key is True the output is grouped by key and an object with one item per key is returned. Each item contains the list of values for that key from all the fields. When output is dict a dict is returned, otherwise a list.
group_by_key (bool) – When True the output is grouped by key as described in output.
flatten_dict (bool) – When True and output is dict, for each field if any of the values in the returned dict is itself a dict, it is flattened to depth 1 by concatenating the keys with a dot. For example, if the returned dict is {"a": {"x": 1, "y": 2}, "b": 3}, it becomes {"a.x": 1, "a.y": 2, "b": 3}. This option is ignored when output is not dict.
remapping (dict, optional) – Create new metadata keys from existing ones. E.g. to define a new key “param_level” as the concatenated value of the “parameter.variable” and “vertical.level” keys use:
remapping={"param_level": "{parameter.variable}{vertical.level}"}
patch (dict, optional) – A dictionary of patches to be applied to the returned values.
- Returns:
The returned value depends on the output and group_by_key parameters. See above.
- Return type:
list, dict
- Raises:
KeyError – If raise_on_missing is True and any of keys is not found.
Examples
>>> import earthkit.data
>>> ds = earthkit.data.from_source("file", "docs/how-tos/test.grib")
>>> ds.get("parameter.variable")
['2t', 'msl']
>>> ds.get(["parameter.variable", "parameter.units"])
[('2t', 'K'), ('msl', 'Pa')]
>>> ds.get(("parameter.variable", "parameter.units"))
[['2t', 'K'], ['msl', 'Pa']]
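The depth-1 flattening described for flatten_dict can be sketched in a few lines of plain Python. This is an illustration only, not the earthkit-data implementation: `flatten_depth_one` is a hypothetical helper, and leaving anything nested deeper than one level untouched is an assumption about what "depth 1" means.

```python
def flatten_depth_one(d):
    """Flatten one level of nested dicts by joining keys with a dot.

    Hypothetical helper for illustration only; not part of earthkit-data.
    """
    flat = {}
    for key, value in d.items():
        if isinstance(value, dict):
            for sub_key, sub_value in value.items():
                # Assumption: deeper levels are kept as they are.
                flat[f"{key}.{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat


flat = flatten_depth_one({"a": {"x": 1, "y": 2}, "b": 3})
print(flat)  # {'a.x': 1, 'a.y': 2, 'b': 3}
```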
- graph(depth=0)¶
- group_by(*keys, sort=True)¶
Iterate through the object in groups defined by metadata keys.
- Parameters:
*keys (tuple) – Positional arguments specifying the metadata keys to group by. Keys can be a single or multiple str, or a list or tuple of str.
sort (bool, optional) – If True (default), the object is sorted by the metadata keys before grouping. Sorting is only applied if the object supports the sorting operation.
- Returns:
Returns an iterator yielding batches of elements grouped by the metadata keys. Each batch is a new object containing a view to the data in the original object, so no data is copied. It generates a new group every time the value of the keys changes.
- Return type:
object
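Because a new group starts whenever the key values change, the sort option decides whether equal values that are not adjacent end up in the same group. The sketch below shows this with `itertools.groupby` on plain dictionaries. It is not the earthkit-data implementation: the `group_by` function here is a hypothetical stand-in and the records stand in for message metadata.

```python
from itertools import groupby
from operator import itemgetter


def group_by(records, *keys, sort=True):
    """Yield lists of consecutive records sharing the same key values.

    Hypothetical stand-in for illustration only; not part of earthkit-data.
    """
    get_keys = itemgetter(*keys)
    if sort:
        # Sorting first makes equal key values adjacent.
        records = sorted(records, key=get_keys)
    # groupby starts a new group every time the key value changes.
    for _, group in groupby(records, key=get_keys):
        yield list(group)


records = [
    {"dataCategory": 2, "ident": "A"},
    {"dataCategory": 0, "ident": "B"},
    {"dataCategory": 2, "ident": "C"},
]

sorted_groups = list(group_by(records, "dataCategory"))
unsorted_groups = list(group_by(records, "dataCategory", sort=False))
print([len(g) for g in sorted_groups])    # [1, 2]
print([len(g) for g in unsorted_groups])  # [1, 1, 1]
```

Without sorting, the two records with dataCategory 2 are separated by a different value and therefore form two groups.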
- head(n=5, **kwargs)¶
Generate a list-like summary of the first n BUFRMessages using a set of metadata keys. Same as calling ls with n.
- Parameters:
n (int, None) – The number of messages (n > 0) to be printed from the front.
**kwargs (dict, optional) – Other keyword arguments passed to ls.
- Returns:
See ls.
- Return type:
Pandas DataFrame
Notes
The following calls are equivalent:
ds.head()
ds.head(5)
ds.head(n=5)
ds.ls(5)
ds.ls(n=5)
- ignore()¶
Indicates to ignore this source in concatenation/merging.
- Return type:
bool
- ls(n=None, keys='default', extra_keys=None)¶
Generate a list-like summary of the BUFR message list using a set of metadata keys.
- Parameters:
n (int, None) – The number of BUFRMessages to be listed. None means all the messages, n > 0 means messages from the front, while n < 0 means messages from the back of the list.
keys (list of str, dict, None) – Metadata keys. To specify a column title for each key in the output use a dict with keys as the metadata keys and values as the column titles. If keys is None the following list will be used to define the titles and the keys:
[
    "edition",
    "dataCategory",
    "dataSubCategory",
    "bufrHeaderCentre",
    "masterTablesVersionNumber",
    "localTablesVersionNumber",
    "numberOfSubsets",
    "compressedData",
    "typicalDate",
    "typicalTime",
    "ident",
    "localLatitude",
    "localLongitude",
]
extra_keys (list of str, dict, None) – List of additional keys to add to keys. To specify a column title for each key in the output use a dict.
- Returns:
DataFrame with one row per BUFRMessage.
- Return type:
Pandas DataFrame
- classmethod merge(sources)¶
- metadata(*args, **kwargs)¶
Return the metadata values for each message.
- Parameters:
*args (tuple) – Positional arguments defining the metadata keys. Passed to BUFRMessage.metadata().
**kwargs (dict, optional) – Keyword arguments passed to BUFRMessage.metadata().
- Returns:
List with one item per BUFRMessage.
- Return type:
list
- mutate()¶
- mutate_source()¶
- name = None¶
- classmethod new_mask_index(*args, **kwargs)¶
- order_by(*args, **kwargs)¶
Change the order of the messages in a BUFRList object.
- Parameters:
*args (tuple) – Positional arguments specifying the metadata keys to perform the ordering on. (See below for details)
**kwargs (dict, optional) – Other keyword arguments specifying the metadata keys to perform the ordering on. (See below for details)
- Returns:
Returns a new object with reordered messages. It contains a view to the data in the original object, so no data is copied.
- Return type:
object
- property parent¶
The parent source, if any.
- sel(*args, remapping=None, **kwargs)¶
Use header metadata values to select only certain messages from a BUFRList object.
- Parameters:
*args (tuple) – Positional arguments specifying the filter condition as dict. (See below for details).
**kwargs (dict, optional) – Other keyword arguments specifying the filter conditions. (See below for details).
- Returns:
Returns a new object with the filtered elements. It contains a view to the data in the original object, so no data is copied.
- Return type:
object
Notes
Filter conditions are specified by a set of metadata keys, either by a dictionary (in *args) or a set of **kwargs. Both single and multiple keys are allowed, and each can specify one of the following types of filter value:
single value:
ds.sel(dataCategory="2")
list of values:
ds.sel(dataCategory=[1, 2])
slice of values (defines a closed interval, so treated as inclusive of both the start and stop values, unlike normal Python indexing):
# filter dataCategory between 1 and 4 inclusively
ds.sel(dataCategory=slice(1, 4))
- source_filename = None¶
- tail(n=5, **kwargs)¶
Generate a list-like summary of the last n BUFRMessages using a set of metadata keys. Same as calling ls with -n.
- Parameters:
n (int, None) – The number of messages (n > 0) to be printed from the back.
**kwargs (dict, optional) – Other keyword arguments passed to ls.
- Returns:
See ls.
- Return type:
Pandas DataFrame
Notes
The following calls are equivalent:
ds.tail()
ds.tail(5)
ds.tail(n=5)
ds.ls(-5)
ds.ls(n=-5)
- to_data_object()¶
Convert this source into a data object, if possible.
- to_numpy(*args, **kwargs)¶
- to_pandas(columns=None, filters=None, **kwargs)¶
Extract BUFR data into a pandas DataFrame using pdbufr.
- Parameters:
columns (str, sequence[str]) – List of ecCodes BUFR keys to extract for each BUFR message/subset. See: pdbufr.read_bufr() for details.
filters (dict) – Defines the conditions when to extract the specified columns. See: pdbufr.read_bufr() for details.
**kwargs (dict, optional) – Other keyword arguments passed to pdbufr.read_bufr().
- Return type:
Pandas DataFrame
- to_target(target, *args, **kwargs)¶
- unique(*args, sort=False, drop_none=True, squeeze=False, unwrap_single=False, remapping=None, patch=None, progress_bar=False, cache=True)¶
Given a list of metadata attributes, such as date, param, levels, return the list of unique values for each attribute.
- Parameters:
*args (tuple) – Positional arguments specifying the metadata keys to collect unique values for.
sort (bool, optional) – Whether to sort the collected unique values. Default is False.
drop_none (bool, optional) – Whether to drop None values from the collected unique values. Default is True.
squeeze (bool, optional) – Whether to return a single value instead of a list if there is only one unique value for a key. Default is False.
remapping (dict, optional) – A dictionary for remapping keys or values during collection. Default is None.
patch (dict, optional) – A dictionary for patching key values during collection. Default is None.
progress_bar (bool, optional) – Whether to display a progress bar during collection. Default is False.
cache (bool, optional) – Whether to use a cached collector. Default is True.
- class earthkit.data.readers.bufr.file.MultiBUFRReader(sources)¶
Bases:
earthkit.data.sources.Source, earthkit.data.readers.bufr.core.BUFRReaderBase
Base class for all sources.
- property appendable¶
- property binary¶
- property filter¶
- graph(depth=0)¶
- ignore()¶
Indicates to ignore this source in concatenation/merging.
- Return type:
bool
- classmethod merge(sources)¶
- property merger¶
- mutate()¶
- mutate_source()¶
- name = None¶
- property parent¶
The parent source, if any.
- property parts¶
- path¶
- property source¶
- source_filename = None¶
- sources¶
- property stream¶
- to_data_object()¶
Convert this source into a data object, if possible.
- to_featurelist()¶
- to_pandas(*args, **kwargs)¶
- to_target(target, *args, **kwargs)¶