earthkit.data.core.index¶
Classes¶
Index | Base class for all sources.
MaskIndex | Base class for all sources.
MultiIndex | Base class for all sources.
Module Contents¶
- class earthkit.data.core.index.Index(**kwargs)¶
Bases:
earthkit.data.sources.Source, earthkit.data.core.Encodable
Base class for all sources.
- batched(n)¶
Iterate through the object in batches of n.
- Parameters:
n (int) – Batch size.
- Returns:
Returns an iterator yielding batches of n elements. Each batch is a new object containing a view to the data in the original object, so no data is copied. The last batch may contain fewer than n elements.
- Return type:
object
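The batching behaviour described above can be illustrated without earthkit. The following is a plain-Python sketch, not the library's implementation: the `batched` function and the `fields` list are hypothetical stand-ins, and the sketch builds new lists rather than views onto the original data.

```python
from itertools import islice


def batched(items, n):
    """Yield consecutive batches of at most n elements from items."""
    if n < 1:
        raise ValueError("n must be at least 1")
    it = iter(items)
    while True:
        batch = list(islice(it, n))
        if not batch:
            return
        yield batch


# Hypothetical stand-in for a fieldlist with five elements.
fields = ["t1000", "u1000", "v1000", "t850", "u850"]
batches = list(batched(fields, 2))
print(batches)  # the last batch holds the single leftover element
```

With five elements and a batch size of 2, the last batch is shorter than the others, which matches the note that the final batch may contain fewer than n elements.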
- graph(depth=0)¶
- group_by(*keys, sort=True)¶
Iterate through the object in groups defined by metadata keys.
- Parameters:
*keys (tuple) – Positional arguments specifying the metadata keys to group by. Keys can be a single str, multiple str, or a list or tuple of str.
sort (bool, optional) – If True (default), the object is sorted by the metadata keys before grouping. Sorting is only applied if the object supports the sorting operation.
- Returns:
Returns an iterator yielding batches of elements grouped by the metadata keys. Each batch is a new object containing a view to the data in the original object, so no data is copied. A new group is generated every time the value of the keys changes.
- Return type:
object
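Because a new group starts whenever the key values change between consecutive elements, the sort option determines how many groups are produced. The sketch below mimics this with `itertools.groupby` on plain dictionaries. It is an analogue under stated assumptions, not earthkit code: the `group_by` helper, the `records` list and the short key names "param" and "level" are all hypothetical.

```python
from itertools import groupby

# Hypothetical metadata records standing in for fields.
records = [
    {"param": "t", "level": 1000},
    {"param": "u", "level": 1000},
    {"param": "t", "level": 850},
    {"param": "u", "level": 850},
]


def group_by(items, *keys, sort=True):
    """Yield lists of consecutive items that share the same values for keys."""

    def key_func(item):
        return tuple(item[k] for k in keys)

    if sort:
        # Sorting first brings equal key values next to each other.
        items = sorted(items, key=key_func)
    for _, group in groupby(items, key=key_func):
        yield list(group)


sorted_groups = [[r["level"] for r in g] for g in group_by(records, "param")]
unsorted_groups = [
    [r["level"] for r in g] for g in group_by(records, "param", sort=False)
]
print(sorted_groups)    # two groups: one per parameter
print(unsorted_groups)  # four groups: the key changes at every element
```

Without sorting, the alternating "t" and "u" records produce a separate group at every change of value, so the same data yields four groups instead of two.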
- ignore()¶
Indicates to ignore this source in concatenation/merging.
- Return type:
bool
- classmethod merge(sources)¶
- mutate()¶
- mutate_source()¶
- name = None¶
- order_by(*args, remapping=None, patch=None, **kwargs)¶
Change the order of the elements in a fieldlist.
- Parameters:
*args (tuple) – Positional arguments specifying the metadata keys to perform the ordering on. (See below for details.)
remapping (dict) – Defines new metadata keys from existing ones that we can refer to in *args and **kwargs. E.g. to define a new key “param_level” as the concatenated value of the “param” and “level” keys use:
remapping={"param_level": "{param}{level}"}
See below for a more elaborate example.
**kwargs (dict, optional) – Other keyword arguments specifying the metadata keys to perform the ordering on. (See below for details.)
- Returns:
Returns a new object with reordered elements. It contains a view to the data in the original object, so no data is copied.
- Return type:
object
Examples
Ordering by a single metadata key (“parameter.variable”). The default ordering direction is ascending:
>>> import earthkit.data as ekd
>>> ds = ekd.from_source("sample", "test6.grib").to_fieldlist()
>>> for f in ds.order_by("parameter.variable"):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Ordering by multiple keys (first by “vertical.level” then by “parameter.variable”):
>>> for f in ds.order_by(["vertical.level", "parameter.variable"]):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Specifying the ordering direction:
>>> for f in ds.order_by(**{"parameter.variable": "ascending", "vertical.level": "descending"}):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Using the list of all the values of a key (“parameter.variable”) to define the order:
>>> for f in ds.order_by(**{"parameter.variable": ["u", "t", "v"]}):
...     print(f)
...
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Using remapping to specify the order by a key created from two other keys (we created key “param_level” from “parameter.variable” and “vertical.level”):
>>> ordering = ["t850", "t1000", "u1000", "v850", "v1000", "u850"]
>>> remapping = {"param_level": "{parameter.variable}{vertical.level}"}
>>> for f in ds.order_by(param_level=ordering, remapping=remapping):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
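The three kinds of ordering specification (a direction, an explicit list of values, and a remapped key) can be sketched in plain Python. This is an illustration of the semantics shown in the examples, not the library's implementation: the `order_by` helper, the `records` list and the short key names "param" and "level" are hypothetical, and the remapping is assumed to behave like `str.format` applied to each record's metadata.

```python
# Hypothetical metadata records standing in for six fields.
records = [
    {"param": p, "level": lev} for lev in (1000, 850) for p in ("t", "u", "v")
]


def order_by(items, remapping=None, **kwargs):
    """Return items sorted by the keys in kwargs.

    Each value is "ascending", "descending", or a list giving the
    explicit order of that key's values.
    """
    remapping = remapping or {}

    def value(item, key):
        # A remapped key is built from existing keys with a format string.
        if key in remapping:
            return remapping[key].format(**item)
        return item[key]

    result = list(items)
    # Sort by the least significant key first; Python's sort is stable,
    # so earlier keys in kwargs end up taking precedence.
    for key, spec in reversed(list(kwargs.items())):
        if isinstance(spec, (list, tuple)):
            rank = {v: i for i, v in enumerate(spec)}
            result.sort(key=lambda it: rank[value(it, key)])
        else:
            result.sort(
                key=lambda it: value(it, key), reverse=(spec == "descending")
            )
    return result


by_direction = order_by(records, param="ascending", level="descending")
direction_labels = [f"{r['param']}{r['level']}" for r in by_direction]

ordering = ["t850", "t1000", "u1000", "v850", "v1000", "u850"]
by_remap = order_by(
    records, remapping={"param_level": "{param}{level}"}, param_level=ordering
)
remap_labels = [f"{r['param']}{r['level']}" for r in by_remap]
print(direction_labels)
print(remap_labels)
```

In this sketch an explicit list must contain every value that occurs for the key; a missing value raises a `KeyError`.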
- property parent¶
The parent source, if any.
- sel(*args, remapping=None, **kwargs)¶
Uses metadata values to select a subset of the elements from a fieldlist object.
- Parameters:
*args (tuple) – Positional arguments specifying the filter condition as dict. (See below for details.)
remapping (dict) – Creates new metadata keys from existing ones that we can refer to in *args and **kwargs. E.g. to define a new key “param_level” as the concatenated value of the “param” and “level” keys use:
remapping={"param_level": "{param}{level}"}
See below for a more elaborate example.
**kwargs (dict, optional) – Other keyword arguments specifying the filter conditions. (See below for details.)
- Returns:
Returns a new object with the filtered elements. It contains a view to the data in the original object, so no data is copied.
- Return type:
object
Notes
Filter conditions are specified by a set of metadata keys, given either as a dictionary (in *args) or as a set of **kwargs. Single or multiple keys can be used, and each key can specify one of the following types of filter value:
single value:
ds.sel({"parameter.variable": "t"})
list of values:
ds.sel({"parameter.variable": ["u", "v"]})
slice of values (defines a closed interval, so it is treated as inclusive of both the start and stop values, unlike normal Python indexing):
# filter levels between 300 and 500 inclusively
ds.sel({"vertical.level": slice(300, 500)})
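The three condition types, and in particular the closed-interval treatment of slices, can be sketched in plain Python. This is an illustrative analogue, not earthkit's implementation: the `matches` and `sel` helpers, the `records` list and the short key names are hypothetical, and treating a `None` bound as open-ended is an assumption of this sketch.

```python
def matches(value, condition):
    """Return True if value satisfies a single filter condition."""
    if isinstance(condition, slice):
        # Closed interval: both bounds are included.
        # Assumption of this sketch: a None bound leaves that side open.
        if condition.start is not None and value < condition.start:
            return False
        if condition.stop is not None and value > condition.stop:
            return False
        return True
    if isinstance(condition, (list, tuple)):
        return value in condition
    return value == condition


def sel(items, conditions):
    """Keep the items whose metadata satisfy every condition."""
    return [
        item
        for item in items
        if all(matches(item[key], cond) for key, cond in conditions.items())
    ]


# Hypothetical metadata records standing in for fields.
records = [
    {"param": p, "level": lev} for lev in (700, 500, 400, 300) for p in ("t", "u")
]
picked = sel(records, {"param": ["u"], "level": slice(300, 500)})
levels = [r["level"] for r in picked]
print(levels)  # both 300 and 500 are included
```

Unlike `range(300, 500)` or list slicing, the stop value 500 is part of the selection here.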
Examples
>>> import earthkit.data
>>> fl = earthkit.data.from_source("sample", "tuv_pl.grib").to_fieldlist()
>>> len(fl)
18
Selecting by a single key (“parameter.variable”) with a single value:
>>> fl1 = fl.sel({"parameter.variable": "t"})
>>> for f in fl1:
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 700, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 500, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 400, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 300, pressure, 0, regular_ll)
Selecting by multiple keys (“parameter.variable”, “vertical.level”) with a list and slice of values:
>>> fl1 = fl.sel({"parameter.variable": ["u", "v"], "vertical.level": slice(400, 700)})
>>> for f in fl1:
...     print(f)
...
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 700, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 700, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 500, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 500, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 400, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 400, pressure, 0, regular_ll)
Using remapping to specify the selection by a key created from two other keys (we created key “param_level” from “parameter.variable” and “vertical.level”):
>>> fl1 = fl.sel(
...     {"param_level": ["t850", "u1000"]},
...     remapping={"param_level": "{parameter.variable}{vertical.level}"},
... )
>>> for f in fl1:
...     print(f)
...
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
- source_filename = None¶
- to_data_object()¶
Convert this source into a data object, if possible.
- to_numpy(*args, **kwargs)¶
- abstractmethod to_target(target, *args, **kwargs)¶
- unique(*args, sort=False, drop_none=True, squeeze=False, unwrap_single=False, remapping=None, patch=None, progress_bar=False, cache=True)¶
Given a list of metadata attributes, such as date, param, levels, returns the list of unique values for each attribute.
- Parameters:
*args (tuple) – Positional arguments specifying the metadata keys to collect unique values for.
sort (bool, optional) – Whether to sort the collected unique values. Default is False.
drop_none (bool, optional) – Whether to drop None values from the collected unique values. Default is True.
squeeze (bool, optional) – Whether to return a single value instead of a list if there is only one unique value for a key. Default is False.
remapping (dict, optional) – A dictionary for remapping keys or values during collection. Default is None.
patch (dict, optional) – A dictionary for patching key values during collection. Default is None.
progress_bar (bool, optional) – Whether to display a progress bar during collection. Default is False.
cache (bool, optional) – Whether to use a cached collector. Default is True.
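The effect of the sort, drop_none and squeeze options can be sketched on plain dictionaries. This is a hedged illustration, not the library's code: the `unique` helper and the `records` list are hypothetical, and returning a dict that maps each key to its values is an assumption, since the return type is not stated above.

```python
def unique(items, *keys, sort=False, drop_none=True, squeeze=False):
    """Collect the unique values of each key, in order of first appearance."""
    result = {}
    for key in keys:
        values = []
        for item in items:
            v = item.get(key)
            if v is None and drop_none:
                continue
            if v not in values:
                values.append(v)
        if sort:
            # Sorting assumes the collected values are mutually comparable.
            values.sort()
        if squeeze and len(values) == 1:
            values = values[0]
        result[key] = values
    return result


# Hypothetical metadata records standing in for fields.
records = [
    {"param": "u", "level": 1000, "step": 0},
    {"param": "t", "level": 1000, "step": 0},
    {"param": "u", "level": 850, "step": 0},
]
plain = unique(records, "param", "level")
squeezed = unique(records, "param", "step", sort=True, squeeze=True)
print(plain)
print(squeezed)
```

With squeeze enabled, the single "step" value is returned as a bare value rather than a one-element list, while "param" still yields a list because it has two distinct values.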
- class earthkit.data.core.index.MaskIndex(index, indices)¶
Bases:
Index
Base class for all sources.
- batched(n)¶
Iterate through the object in batches of n.
- Parameters:
n (int) – Batch size.
- Returns:
Returns an iterator yielding batches of n elements. Each batch is a new object containing a view to the data in the original object, so no data is copied. The last batch may contain fewer than n elements.
- Return type:
object
- graph(depth=0)¶
- group_by(*keys, sort=True)¶
Iterate through the object in groups defined by metadata keys.
- Parameters:
*keys (tuple) – Positional arguments specifying the metadata keys to group by. Keys can be a single str, multiple str, or a list or tuple of str.
sort (bool, optional) – If True (default), the object is sorted by the metadata keys before grouping. Sorting is only applied if the object supports the sorting operation.
- Returns:
Returns an iterator yielding batches of elements grouped by the metadata keys. Each batch is a new object containing a view to the data in the original object, so no data is copied. A new group is generated every time the value of the keys changes.
- Return type:
object
- ignore()¶
Indicates to ignore this source in concatenation/merging.
- Return type:
bool
- classmethod merge(sources)¶
- mutate()¶
- mutate_source()¶
- name = None¶
- order_by(*args, remapping=None, patch=None, **kwargs)¶
Change the order of the elements in a fieldlist.
- Parameters:
*args (tuple) – Positional arguments specifying the metadata keys to perform the ordering on. (See below for details.)
remapping (dict) – Defines new metadata keys from existing ones that we can refer to in *args and **kwargs. E.g. to define a new key “param_level” as the concatenated value of the “param” and “level” keys use:
remapping={"param_level": "{param}{level}"}
See below for a more elaborate example.
**kwargs (dict, optional) – Other keyword arguments specifying the metadata keys to perform the ordering on. (See below for details.)
- Returns:
Returns a new object with reordered elements. It contains a view to the data in the original object, so no data is copied.
- Return type:
object
Examples
Ordering by a single metadata key (“parameter.variable”). The default ordering direction is ascending:
>>> import earthkit.data as ekd
>>> ds = ekd.from_source("sample", "test6.grib").to_fieldlist()
>>> for f in ds.order_by("parameter.variable"):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Ordering by multiple keys (first by “vertical.level” then by “parameter.variable”):
>>> for f in ds.order_by(["vertical.level", "parameter.variable"]):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Specifying the ordering direction:
>>> for f in ds.order_by(**{"parameter.variable": "ascending", "vertical.level": "descending"}):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Using the list of all the values of a key (“parameter.variable”) to define the order:
>>> for f in ds.order_by(**{"parameter.variable": ["u", "t", "v"]}):
...     print(f)
...
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Using remapping to specify the order by a key created from two other keys (we created key “param_level” from “parameter.variable” and “vertical.level”):
>>> ordering = ["t850", "t1000", "u1000", "v850", "v1000", "u850"]
>>> remapping = {"param_level": "{parameter.variable}{vertical.level}"}
>>> for f in ds.order_by(param_level=ordering, remapping=remapping):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
- property parent¶
The parent source, if any.
- sel(*args, remapping=None, **kwargs)¶
Uses metadata values to select a subset of the elements from a fieldlist object.
- Parameters:
*args (tuple) – Positional arguments specifying the filter condition as dict. (See below for details.)
remapping (dict) – Creates new metadata keys from existing ones that we can refer to in *args and **kwargs. E.g. to define a new key “param_level” as the concatenated value of the “param” and “level” keys use:
remapping={"param_level": "{param}{level}"}
See below for a more elaborate example.
**kwargs (dict, optional) – Other keyword arguments specifying the filter conditions. (See below for details.)
- Returns:
Returns a new object with the filtered elements. It contains a view to the data in the original object, so no data is copied.
- Return type:
object
Notes
Filter conditions are specified by a set of metadata keys, given either as a dictionary (in *args) or as a set of **kwargs. Single or multiple keys can be used, and each key can specify one of the following types of filter value:
single value:
ds.sel({"parameter.variable": "t"})
list of values:
ds.sel({"parameter.variable": ["u", "v"]})
slice of values (defines a closed interval, so it is treated as inclusive of both the start and stop values, unlike normal Python indexing):
# filter levels between 300 and 500 inclusively
ds.sel({"vertical.level": slice(300, 500)})
Examples
>>> import earthkit.data
>>> fl = earthkit.data.from_source("sample", "tuv_pl.grib").to_fieldlist()
>>> len(fl)
18
Selecting by a single key (“parameter.variable”) with a single value:
>>> fl1 = fl.sel({"parameter.variable": "t"})
>>> for f in fl1:
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 700, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 500, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 400, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 300, pressure, 0, regular_ll)
Selecting by multiple keys (“parameter.variable”, “vertical.level”) with a list and slice of values:
>>> fl1 = fl.sel({"parameter.variable": ["u", "v"], "vertical.level": slice(400, 700)})
>>> for f in fl1:
...     print(f)
...
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 700, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 700, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 500, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 500, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 400, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 400, pressure, 0, regular_ll)
Using remapping to specify the selection by a key created from two other keys (we created key “param_level” from “parameter.variable” and “vertical.level”):
>>> fl1 = fl.sel(
...     {"param_level": ["t850", "u1000"]},
...     remapping={"param_level": "{parameter.variable}{vertical.level}"},
... )
>>> for f in fl1:
...     print(f)
...
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
- source_filename = None¶
- to_data_object()¶
Convert this source into a data object, if possible.
- to_numpy(*args, **kwargs)¶
- abstractmethod to_target(target, *args, **kwargs)¶
- unique(*args, sort=False, drop_none=True, squeeze=False, unwrap_single=False, remapping=None, patch=None, progress_bar=False, cache=True)¶
Given a list of metadata attributes, such as date, param, levels, returns the list of unique values for each attribute.
- Parameters:
*args (tuple) – Positional arguments specifying the metadata keys to collect unique values for.
sort (bool, optional) – Whether to sort the collected unique values. Default is False.
drop_none (bool, optional) – Whether to drop None values from the collected unique values. Default is True.
squeeze (bool, optional) – Whether to return a single value instead of a list if there is only one unique value for a key. Default is False.
remapping (dict, optional) – A dictionary for remapping keys or values during collection. Default is None.
patch (dict, optional) – A dictionary for patching key values during collection. Default is None.
progress_bar (bool, optional) – Whether to display a progress bar during collection. Default is False.
cache (bool, optional) – Whether to use a cached collector. Default is True.
- class earthkit.data.core.index.MultiIndex(indexes, *args, **kwargs)¶
Bases:
Index
Base class for all sources.
- batched(n)¶
Iterate through the object in batches of n.
- Parameters:
n (int) – Batch size.
- Returns:
Returns an iterator yielding batches of n elements. Each batch is a new object containing a view to the data in the original object, so no data is copied. The last batch may contain fewer than n elements.
- Return type:
object
- graph(depth=0)¶
- group_by(*keys, sort=True)¶
Iterate through the object in groups defined by metadata keys.
- Parameters:
*keys (tuple) – Positional arguments specifying the metadata keys to group by. Keys can be a single str, multiple str, or a list or tuple of str.
sort (bool, optional) – If True (default), the object is sorted by the metadata keys before grouping. Sorting is only applied if the object supports the sorting operation.
- Returns:
Returns an iterator yielding batches of elements grouped by the metadata keys. Each batch is a new object containing a view to the data in the original object, so no data is copied. A new group is generated every time the value of the keys changes.
- Return type:
object
- ignore()¶
Indicates to ignore this source in concatenation/merging.
- Return type:
bool
- classmethod merge(sources)¶
- mutate()¶
- mutate_source()¶
- name = None¶
- order_by(*args, remapping=None, patch=None, **kwargs)¶
Change the order of the elements in a fieldlist.
- Parameters:
*args (tuple) – Positional arguments specifying the metadata keys to perform the ordering on. (See below for details.)
remapping (dict) – Defines new metadata keys from existing ones that we can refer to in *args and **kwargs. E.g. to define a new key “param_level” as the concatenated value of the “param” and “level” keys use:
remapping={"param_level": "{param}{level}"}
See below for a more elaborate example.
**kwargs (dict, optional) – Other keyword arguments specifying the metadata keys to perform the ordering on. (See below for details.)
- Returns:
Returns a new object with reordered elements. It contains a view to the data in the original object, so no data is copied.
- Return type:
object
Examples
Ordering by a single metadata key (“parameter.variable”). The default ordering direction is ascending:
>>> import earthkit.data as ekd
>>> ds = ekd.from_source("sample", "test6.grib").to_fieldlist()
>>> for f in ds.order_by("parameter.variable"):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Ordering by multiple keys (first by “vertical.level” then by “parameter.variable”):
>>> for f in ds.order_by(["vertical.level", "parameter.variable"]):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Specifying the ordering direction:
>>> for f in ds.order_by(**{"parameter.variable": "ascending", "vertical.level": "descending"}):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Using the list of all the values of a key (“parameter.variable”) to define the order:
>>> for f in ds.order_by(**{"parameter.variable": ["u", "t", "v"]}):
...     print(f)
...
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Using remapping to specify the order by a key created from two other keys (we created key “param_level” from “parameter.variable” and “vertical.level”):
>>> ordering = ["t850", "t1000", "u1000", "v850", "v1000", "u850"]
>>> remapping = {"param_level": "{parameter.variable}{vertical.level}"}
>>> for f in ds.order_by(param_level=ordering, remapping=remapping):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
- property parent¶
The parent source, if any.
- sel(*args, **kwargs)¶
Uses metadata values to select a subset of the elements from a fieldlist object.
- Parameters:
*args (tuple) – Positional arguments specifying the filter condition as dict. (See below for details.)
remapping (dict) – Creates new metadata keys from existing ones that we can refer to in *args and **kwargs. E.g. to define a new key “param_level” as the concatenated value of the “param” and “level” keys use:
remapping={"param_level": "{param}{level}"}
See below for a more elaborate example.
**kwargs (dict, optional) – Other keyword arguments specifying the filter conditions. (See below for details.)
- Returns:
Returns a new object with the filtered elements. It contains a view to the data in the original object, so no data is copied.
- Return type:
object
Notes
Filter conditions are specified by a set of metadata keys, given either as a dictionary (in *args) or as a set of **kwargs. Single or multiple keys can be used, and each key can specify one of the following types of filter value:
single value:
ds.sel({"parameter.variable": "t"})
list of values:
ds.sel({"parameter.variable": ["u", "v"]})
slice of values (defines a closed interval, so it is treated as inclusive of both the start and stop values, unlike normal Python indexing):
# filter levels between 300 and 500 inclusively
ds.sel({"vertical.level": slice(300, 500)})
Examples
>>> import earthkit.data
>>> fl = earthkit.data.from_source("sample", "tuv_pl.grib").to_fieldlist()
>>> len(fl)
18
Selecting by a single key (“parameter.variable”) with a single value:
>>> fl1 = fl.sel({"parameter.variable": "t"})
>>> for f in fl1:
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 700, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 500, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 400, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 300, pressure, 0, regular_ll)
Selecting by multiple keys (“parameter.variable”, “vertical.level”) with a list and slice of values:
>>> fl1 = fl.sel({"parameter.variable": ["u", "v"], "vertical.level": slice(400, 700)})
>>> for f in fl1:
...     print(f)
...
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 700, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 700, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 500, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 500, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 400, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 400, pressure, 0, regular_ll)
Using remapping to specify the selection by a key created from two other keys (we created key “param_level” from “parameter.variable” and “vertical.level”):
>>> fl1 = fl.sel(
...     {"param_level": ["t850", "u1000"]},
...     remapping={"param_level": "{parameter.variable}{vertical.level}"},
... )
>>> for f in fl1:
...     print(f)
...
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
- source_filename = None¶
- to_data_object()¶
Convert this source into a data object, if possible.
- to_numpy(*args, **kwargs)¶
- abstractmethod to_target(target, *args, **kwargs)¶
- unique(*args, sort=False, drop_none=True, squeeze=False, unwrap_single=False, remapping=None, patch=None, progress_bar=False, cache=True)¶
Given a list of metadata attributes, such as date, param, levels, returns the list of unique values for each attribute.
- Parameters:
*args (tuple) – Positional arguments specifying the metadata keys to collect unique values for.
sort (bool, optional) – Whether to sort the collected unique values. Default is False.
drop_none (bool, optional) – Whether to drop None values from the collected unique values. Default is True.
squeeze (bool, optional) – Whether to return a single value instead of a list if there is only one unique value for a key. Default is False.
remapping (dict, optional) – A dictionary for remapping keys or values during collection. Default is None.
patch (dict, optional) – A dictionary for patching key values during collection. Default is None.
progress_bar (bool, optional) – Whether to display a progress bar during collection. Default is False.
cache (bool, optional) – Whether to use a cached collector. Default is True.
- class earthkit.data.core.index.Order(kwargs, remapping)¶
Bases:
OrderBase
- actions¶
- build_actions(kwargs)¶
- compare_elements(a, b)¶
- property is_empty¶
- remapping¶