earthkit.data.core.index

Classes

Index

Base class for all sources.

MaskIndex

Base class for all sources.

MultiIndex

Base class for all sources.

Order

OrderBase

OrderOrSelection

Selection

Module Contents

class earthkit.data.core.index.Index(**kwargs)

Bases: earthkit.data.sources.Source, earthkit.data.core.Encodable

Base class for all sources.

batched(n)

Iterate through the object in batches of n.

Parameters:

n (int) – Batch size.

Returns:

Returns an iterator yielding batches of n elements. Each batch is a new object containing a view to the data in the original object, so no data is copied. The last batch may contain fewer than n elements.

Return type:

object

graph(depth=0)
group_by(*keys, sort=True)

Iterate through the object in groups defined by metadata keys.

Parameters:
  • *keys (tuple) – Positional arguments specifying the metadata keys to group by. Keys can be a single or multiple str, or a list or tuple of str.

  • sort (bool, optional) – If True (default), the object is sorted by the metadata keys before grouping. Sorting is only applied if the object is supporting the sorting operation.

Returns:

Returns an iterator yielding batches of elements grouped by the metadata keys. Each batch is a new object containing a view to the data in the original object, so no data is copied. It generates a new group every time the value of the keys change.

Return type:

object

ignore()

Indicates to ignore this source in concatenation/merging.

Return type:

bool

classmethod merge(sources)
mutate()
mutate_source()
name = None
order_by(*args, remapping=None, patch=None, **kwargs)

Change the order of the elements in a fieldlist.

Parameters:
  • *args (tuple) – Positional arguments specifying the metadata keys to perform the ordering on. (See below for details)

  • remapping (dict) –

    Defines new metadata keys from existing ones that we can refer to in *args and **kwargs. E.g. to define a new key “param_level” as the concatenated value of the “param” and “level” keys use:

    remapping={"param_level": "{param}{level}"}
    

    See below for a more elaborate example.

  • **kwargs (dict, optional) – Other keyword arguments specifying the metadata keys to perform the ordering on. (See below for details)

Returns:

Returns a new object with reordered elements. It contains a view to the data in the original object, so no data is copied.

Return type:

object

Examples

Ordering by a single metadata key (“param”). The default ordering direction is ascending:

>>> import earthkit.data as ekd
>>> ds = ekd.from_source("sample", "test6.grib").to_fieldlist()
>>> for f in ds.order_by("parameter.variable"):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)

Ordering by multiple keys (first by “vertical.level” then by “parameter.variable”):

>>> for f in ds.order_by(["vertical.level", "parameter.variable"]):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)

Specifying the ordering direction:

>>> for f in ds.order_by(**{"parameter.variable": "ascending", "vertical.level": "descending"}):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)

Using the list of all the values of a key (“parameter.variable”) to define the order:

>>> for f in ds.order_by(**{"parameter.variable": ["u", "t", "v"]}):
...     print(f)
...
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)

Using remapping to specify the order by a key created from two other keys (we created key “param_level” from “param” and “levelist”):

>>> ordering = ["t850", "t1000", "u1000", "v850", "v1000", "u850"]
>>> remapping = {"param_level": "{parameter.variable}{vertical.level}"}
>>> for f in ds.order_by(param_level=ordering, remapping=remapping):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
property parent

The parent source, if any.

sel(*args, remapping=None, **kwargs)

Uses metadata values to select a subset of the elements from a fieldlist object.

Parameters:
  • *args (tuple) – Positional arguments specifying the filter condition as dict. (See below for details).

  • remapping (dict) –

    Creates new metadata keys from existing ones that we can refer to in *args and **kwargs. E.g. to define a new key “param_level” as the concatenated value of the “param” and “level” keys use:

    remapping={"param_level": "{param}{level}"}
    

    See below for a more elaborate example.

  • **kwargs (dict, optional) – Other keyword arguments specifying the filter conditions. (See below for details).

Returns:

Returns a new object with the filtered elements. It contains a view to the data in the original object, so no data is copied.

Return type:

object

Notes

Filter conditions are specified by a set of metadata keys either by a dictionary (in *args) or a set of **kwargs. Both single or multiple keys are allowed to use and each can specify the following type of filter values:

  • single value:

    ds.sel({"parameter.variable": "t"})
    
  • list of values:

    ds.sel({"parameter.variable": ["u", "v"]})
    
  • slice of values (defines a closed interval, so treated as inclusive of both the start

and stop values, unlike normal Python indexing):

# filter levels between 300 and 500 inclusively
ds.sel({"vertical.level": slice(300, 500)})

Examples

>>> import earthkit.data
>>> fl = earthkit.data.from_source("sample", "tuv_pl.grib").to_fieldlist()
>>> len(fl)
18

Selecting by a single key (“parameter.variable”) with a single value:

>>> fl1 = fl.sel({"parameter.variable": "t"})
>>> for f in fl1:
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 700, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 500, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 400, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 300, pressure, 0, regular_ll)

Selecting by multiple keys (“parameter.variable”, “vertical.level”) with a list and slice of values:

>>> fl1 = fl.sel({"parameter.variable": ["u", "v"], "vertical.level": slice(400, 700)})
>>> for f in fl1:
...     print(f)
...
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 700, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 700, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 500, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 500, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 400, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 400, pressure, 0, regular_ll)

Using remapping to specify the selection by a key created from two other keys (we created key “param_level” from “parameter.variable” and “vertical.level”):

>>> fl1 = fl.sel(
...    {"param_level": ["t850", "u1000"],
...    "remapping": {"param_level": "{parameter.variable}{vertical.level}"}})
... )
>>> for f in fl1:
...     print(f)
...
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
source_filename = None
to_data_object()

Convert this source into a data object, if possible.

to_numpy(*args, **kwargs)
abstractmethod to_target(target, *args, **kwargs)
unique(*args, sort=False, drop_none=True, squeeze=False, unwrap_single=False, remapping=None, patch=None, progress_bar=False, cache=True)

Given a list of metadata attributes, such as date, param, levels, returns the list of unique values for each attributes.

Parameters:
  • *args (tuple) – Positional arguments specifying the metadata keys to collect unique values for.

  • sort (bool, optional) – Whether to sort the collected unique values. Default is False.

  • drop_none (bool, optional) – Whether to drop None values from the collected unique values. Default is True.

  • squeeze (bool, optional) – Whether to return a single value instead of a list if there is only one unique value for a key. Default is False.

  • remapping (dict, optional) – A dictionary for remapping keys or values during collection. Default is None.

  • patch (dict, optional) – A dictionary for patching key values during collection. Default is None.

  • progress_bar (bool, optional) – Whether to display a progress bar during collection. Default is False.

  • cache (bool, optional) – Whether to use a cached collector. Default is False.

class earthkit.data.core.index.MaskIndex(index, indices)

Bases: Index

Base class for all sources.

batched(n)

Iterate through the object in batches of n.

Parameters:

n (int) – Batch size.

Returns:

Returns an iterator yielding batches of n elements. Each batch is a new object containing a view to the data in the original object, so no data is copied. The last batch may contain fewer than n elements.

Return type:

object

graph(depth=0)
group_by(*keys, sort=True)

Iterate through the object in groups defined by metadata keys.

Parameters:
  • *keys (tuple) – Positional arguments specifying the metadata keys to group by. Keys can be a single or multiple str, or a list or tuple of str.

  • sort (bool, optional) – If True (default), the object is sorted by the metadata keys before grouping. Sorting is only applied if the object is supporting the sorting operation.

Returns:

Returns an iterator yielding batches of elements grouped by the metadata keys. Each batch is a new object containing a view to the data in the original object, so no data is copied. It generates a new group every time the value of the keys change.

Return type:

object

ignore()

Indicates to ignore this source in concatenation/merging.

Return type:

bool

classmethod merge(sources)
mutate()
mutate_source()
name = None
order_by(*args, remapping=None, patch=None, **kwargs)

Change the order of the elements in a fieldlist.

Parameters:
  • *args (tuple) – Positional arguments specifying the metadata keys to perform the ordering on. (See below for details)

  • remapping (dict) –

    Defines new metadata keys from existing ones that we can refer to in *args and **kwargs. E.g. to define a new key “param_level” as the concatenated value of the “param” and “level” keys use:

    remapping={"param_level": "{param}{level}"}
    

    See below for a more elaborate example.

  • **kwargs (dict, optional) – Other keyword arguments specifying the metadata keys to perform the ordering on. (See below for details)

Returns:

Returns a new object with reordered elements. It contains a view to the data in the original object, so no data is copied.

Return type:

object

Examples

Ordering by a single metadata key (“param”). The default ordering direction is ascending:

>>> import earthkit.data as ekd
>>> ds = ekd.from_source("sample", "test6.grib").to_fieldlist()
>>> for f in ds.order_by("parameter.variable"):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)

Ordering by multiple keys (first by “vertical.level” then by “parameter.variable”):

>>> for f in ds.order_by(["vertical.level", "parameter.variable"]):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)

Specifying the ordering direction:

>>> for f in ds.order_by(**{"parameter.variable": "ascending", "vertical.level": "descending"}):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)

Using the list of all the values of a key (“parameter.variable”) to define the order:

>>> for f in ds.order_by(**{"parameter.variable": ["u", "t", "v"]}):
...     print(f)
...
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)

Using remapping to specify the order by a key created from two other keys (we created key “param_level” from “param” and “levelist”):

>>> ordering = ["t850", "t1000", "u1000", "v850", "v1000", "u850"]
>>> remapping = {"param_level": "{parameter.variable}{vertical.level}"}
>>> for f in ds.order_by(param_level=ordering, remapping=remapping):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
property parent

The parent source, if any.

sel(*args, remapping=None, **kwargs)

Uses metadata values to select a subset of the elements from a fieldlist object.

Parameters:
  • *args (tuple) – Positional arguments specifying the filter condition as dict. (See below for details).

  • remapping (dict) –

    Creates new metadata keys from existing ones that we can refer to in *args and **kwargs. E.g. to define a new key “param_level” as the concatenated value of the “param” and “level” keys use:

    remapping={"param_level": "{param}{level}"}
    

    See below for a more elaborate example.

  • **kwargs (dict, optional) – Other keyword arguments specifying the filter conditions. (See below for details).

Returns:

Returns a new object with the filtered elements. It contains a view to the data in the original object, so no data is copied.

Return type:

object

Notes

Filter conditions are specified by a set of metadata keys either by a dictionary (in *args) or a set of **kwargs. Both single or multiple keys are allowed to use and each can specify the following type of filter values:

  • single value:

    ds.sel({"parameter.variable": "t"})
    
  • list of values:

    ds.sel({"parameter.variable": ["u", "v"]})
    
  • slice of values (defines a closed interval, so treated as inclusive of both the start

and stop values, unlike normal Python indexing):

# filter levels between 300 and 500 inclusively
ds.sel({"vertical.level": slice(300, 500)})

Examples

>>> import earthkit.data
>>> fl = earthkit.data.from_source("sample", "tuv_pl.grib").to_fieldlist()
>>> len(fl)
18

Selecting by a single key (“parameter.variable”) with a single value:

>>> fl1 = fl.sel({"parameter.variable": "t"})
>>> for f in fl1:
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 700, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 500, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 400, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 300, pressure, 0, regular_ll)

Selecting by multiple keys (“parameter.variable”, “vertical.level”) with a list and slice of values:

>>> fl1 = fl.sel({"parameter.variable": ["u", "v"], "vertical.level": slice(400, 700)})
>>> for f in fl1:
...     print(f)
...
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 700, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 700, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 500, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 500, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 400, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 400, pressure, 0, regular_ll)

Using remapping to specify the selection by a key created from two other keys (we created key “param_level” from “parameter.variable” and “vertical.level”):

>>> fl1 = fl.sel(
...    {"param_level": ["t850", "u1000"],
...    "remapping": {"param_level": "{parameter.variable}{vertical.level}"}})
... )
>>> for f in fl1:
...     print(f)
...
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
source_filename = None
to_data_object()

Convert this source into a data object, if possible.

to_numpy(*args, **kwargs)
abstractmethod to_target(target, *args, **kwargs)
unique(*args, sort=False, drop_none=True, squeeze=False, unwrap_single=False, remapping=None, patch=None, progress_bar=False, cache=True)

Given a list of metadata attributes, such as date, param, levels, returns the list of unique values for each attributes.

Parameters:
  • *args (tuple) – Positional arguments specifying the metadata keys to collect unique values for.

  • sort (bool, optional) – Whether to sort the collected unique values. Default is False.

  • drop_none (bool, optional) – Whether to drop None values from the collected unique values. Default is True.

  • squeeze (bool, optional) – Whether to return a single value instead of a list if there is only one unique value for a key. Default is False.

  • remapping (dict, optional) – A dictionary for remapping keys or values during collection. Default is None.

  • patch (dict, optional) – A dictionary for patching key values during collection. Default is None.

  • progress_bar (bool, optional) – Whether to display a progress bar during collection. Default is False.

  • cache (bool, optional) – Whether to use a cached collector. Default is False.

class earthkit.data.core.index.MultiIndex(indexes, *args, **kwargs)

Bases: Index

Base class for all sources.

batched(n)

Iterate through the object in batches of n.

Parameters:

n (int) – Batch size.

Returns:

Returns an iterator yielding batches of n elements. Each batch is a new object containing a view to the data in the original object, so no data is copied. The last batch may contain fewer than n elements.

Return type:

object

graph(depth=0)
group_by(*keys, sort=True)

Iterate through the object in groups defined by metadata keys.

Parameters:
  • *keys (tuple) – Positional arguments specifying the metadata keys to group by. Keys can be a single or multiple str, or a list or tuple of str.

  • sort (bool, optional) – If True (default), the object is sorted by the metadata keys before grouping. Sorting is only applied if the object is supporting the sorting operation.

Returns:

Returns an iterator yielding batches of elements grouped by the metadata keys. Each batch is a new object containing a view to the data in the original object, so no data is copied. It generates a new group every time the value of the keys change.

Return type:

object

ignore()

Indicates to ignore this source in concatenation/merging.

Return type:

bool

classmethod merge(sources)
mutate()
mutate_source()
name = None
order_by(*args, remapping=None, patch=None, **kwargs)

Change the order of the elements in a fieldlist.

Parameters:
  • *args (tuple) – Positional arguments specifying the metadata keys to perform the ordering on. (See below for details)

  • remapping (dict) –

    Defines new metadata keys from existing ones that we can refer to in *args and **kwargs. E.g. to define a new key “param_level” as the concatenated value of the “param” and “level” keys use:

    remapping={"param_level": "{param}{level}"}
    

    See below for a more elaborate example.

  • **kwargs (dict, optional) – Other keyword arguments specifying the metadata keys to perform the ordering on. (See below for details)

Returns:

Returns a new object with reordered elements. It contains a view to the data in the original object, so no data is copied.

Return type:

object

Examples

Ordering by a single metadata key (“param”). The default ordering direction is ascending:

>>> import earthkit.data as ekd
>>> ds = ekd.from_source("sample", "test6.grib").to_fieldlist()
>>> for f in ds.order_by("parameter.variable"):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)

Ordering by multiple keys (first by “vertical.level” then by “parameter.variable”):

>>> for f in ds.order_by(["vertical.level", "parameter.variable"]):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)

Specifying the ordering direction:

>>> for f in ds.order_by(**{"parameter.variable": "ascending", "vertical.level": "descending"}):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)

Using the list of all the values of a key (“parameter.variable”) to define the order:

>>> for f in ds.order_by(**{"parameter.variable": ["u", "t", "v"]}):
...     print(f)
...
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)

Using remapping to specify the order by a key created from two other keys (we created key “param_level” from “param” and “levelist”):

>>> ordering = ["t850", "t1000", "u1000", "v850", "v1000", "u850"]
>>> remapping = {"param_level": "{parameter.variable}{vertical.level}"}
>>> for f in ds.order_by(param_level=ordering, remapping=remapping):
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
property parent

The parent source, if any.

sel(*args, **kwargs)

Uses metadata values to select a subset of the elements from a fieldlist object.

Parameters:
  • *args (tuple) – Positional arguments specifying the filter condition as dict. (See below for details).

  • remapping (dict) –

    Creates new metadata keys from existing ones that we can refer to in *args and **kwargs. E.g. to define a new key “param_level” as the concatenated value of the “param” and “level” keys use:

    remapping={"param_level": "{param}{level}"}
    

    See below for a more elaborate example.

  • **kwargs (dict, optional) – Other keyword arguments specifying the filter conditions. (See below for details).

Returns:

Returns a new object with the filtered elements. It contains a view to the data in the original object, so no data is copied.

Return type:

object

Notes

Filter conditions are specified by a set of metadata keys either by a dictionary (in *args) or a set of **kwargs. Both single or multiple keys are allowed to use and each can specify the following type of filter values:

  • single value:

    ds.sel({"parameter.variable": "t"})
    
  • list of values:

    ds.sel({"parameter.variable": ["u", "v"]})
    
  • slice of values (defines a closed interval, so treated as inclusive of both the start

and stop values, unlike normal Python indexing):

# filter levels between 300 and 500 inclusively
ds.sel({"vertical.level": slice(300, 500)})

Examples

>>> import earthkit.data
>>> fl = earthkit.data.from_source("sample", "tuv_pl.grib").to_fieldlist()
>>> len(fl)
18

Selecting by a single key (“parameter.variable”) with a single value:

>>> fl1 = fl.sel({"parameter.variable": "t"})
>>> for f in fl1:
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 700, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 500, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 400, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 300, pressure, 0, regular_ll)

Selecting by multiple keys (“parameter.variable”, “vertical.level”) with a list and slice of values:

>>> fl1 = fl.sel({"parameter.variable": ["u", "v"], "vertical.level": slice(400, 700)})
>>> for f in fl1:
...     print(f)
...
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 700, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 700, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 500, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 500, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 400, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 400, pressure, 0, regular_ll)

Using remapping to specify the selection by a key created from two other keys (we created key “param_level” from “parameter.variable” and “vertical.level”):

>>> fl1 = fl.sel(
...    {"param_level": ["t850", "u1000"],
...    "remapping": {"param_level": "{parameter.variable}{vertical.level}"}})
... )
>>> for f in fl1:
...     print(f)
...
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
source_filename = None
to_data_object()

Convert this source into a data object, if possible.

to_numpy(*args, **kwargs)
abstractmethod to_target(target, *args, **kwargs)
unique(*args, sort=False, drop_none=True, squeeze=False, unwrap_single=False, remapping=None, patch=None, progress_bar=False, cache=True)

Given a list of metadata attributes, such as date, param, levels, returns the list of unique values for each attributes.

Parameters:
  • *args (tuple) – Positional arguments specifying the metadata keys to collect unique values for.

  • sort (bool, optional) – Whether to sort the collected unique values. Default is False.

  • drop_none (bool, optional) – Whether to drop None values from the collected unique values. Default is True.

  • squeeze (bool, optional) – Whether to return a single value instead of a list if there is only one unique value for a key. Default is False.

  • remapping (dict, optional) – A dictionary for remapping keys or values during collection. Default is None.

  • patch (dict, optional) – A dictionary for patching key values during collection. Default is None.

  • progress_bar (bool, optional) – Whether to display a progress bar during collection. Default is False.

  • cache (bool, optional) – Whether to use a cached collector. Default is False.

class earthkit.data.core.index.Order(kwargs, remapping)

Bases: OrderBase

actions
build_actions(kwargs)
compare_elements(a, b)
property is_empty
remapping
class earthkit.data.core.index.OrderBase(kwargs, remapping)

Bases: OrderOrSelection

actions
abstractmethod build_actions(kwargs)
compare_elements(a, b)
property is_empty
remapping
class earthkit.data.core.index.OrderOrSelection
actions
property is_empty
class earthkit.data.core.index.Selection(kwargs, remapping=None)

Bases: OrderOrSelection

actions
property is_empty
match_element(element)
remapping = None