data.sources.array_list

Classes

ArrayField

Represent a field consisting of an array and metadata object.

Module Contents

class data.sources.array_list.ArrayField(array, metadata)

Bases: earthkit.data.core.fieldlist.Field

Represent a field consisting of an array and metadata object.

Parameters:
  • array (array) – Array storing the values of the field

  • metadata (Metadata) – Metadata object describing the field.

an_datetime()
property array_backend

Return the array namespace of the field.

Type:

ArrayBackend

property array_namespace

Return the array namespace of the field.

Type:

ArrayBackend

base_datetime()
batched(*args)

Return iterator for batches of data
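
As a rough illustration of what batching means, the sketch below splits any iterable into fixed-size lists using only the standard library. The helper name `batched_sketch` and its `n` argument are assumptions made for illustration; this reference documents only `*args`, so the real signature may differ.

```python
from itertools import islice


def batched_sketch(items, n):
    """Yield successive lists of at most n items (hypothetical helper, not earthkit code)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, n))
        if not batch:
            return
        yield batch


# Five placeholder items split into batches of two; the last batch is shorter.
batches = list(batched_sketch(["f0", "f1", "f2", "f3", "f4"], 2))
```
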

bounding_box()

Return the bounding box of the field.

Return type:

BoundingBox

clone(**kwargs)

Create a new ClonedField with updated values and/or metadata.

Parameters:
  • values (array-like or None) – The values to be stored in the new ClonedField. When it is None the resulting ClonedField will access the values from the original field.

  • metadata (dict, Metadata or None) – If it is a dictionary, it is merged with **kwargs and interpreted in the same way as **kwargs. If it is a Metadata object, it is used as the new metadata. In this case **kwargs cannot be used.

  • **kwargs (dict, optional) – Keys and values to update the metadata with. Metadata values can also be callables with the following positional arguments: original_field, key, original_metadata. The new ClonedField will contain a reference to the original metadata object and keys not present in kwargs will be accessed from the original field.

Returns:

The new field with updated values and/or metadata keeping a reference to the original field.

Return type:

ClonedField

Raises:

ValueError – If metadata is a Metadata object and **kwargs is not empty.
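
The lookup rule described for **kwargs can be easier to follow with a small model. The sketch below is a plain-Python stand-in, not earthkit code: `resolve_metadata` and the sample dictionaries are hypothetical. It mimics the documented behaviour, where overrides win, callables receive (original_field, key, original_metadata), and keys that are not overridden come from the original metadata.

```python
def resolve_metadata(key, overrides, original_field, original_metadata):
    """Look up a key following the rule described for clone().

    Hypothetical helper for illustration only; not part of earthkit.
    """
    if key in overrides:
        value = overrides[key]
        # Callables are invoked with the three documented positional arguments.
        if callable(value):
            return value(original_field, key, original_metadata)
        return value
    # Keys not present in the overrides are read from the original metadata.
    return original_metadata[key]


original_metadata = {"param": "2t", "level": 0, "units": "K"}
original_field = object()  # placeholder standing in for the original field

overrides = {
    "units": "degC",
    "param": lambda field, key, md: md[key].upper(),
}

units = resolve_metadata("units", overrides, original_field, original_metadata)
param = resolve_metadata("param", overrides, original_field, original_metadata)
level = resolve_metadata("level", overrides, original_field, original_metadata)
```

Here `units` is a plain override, `param` is computed from the original metadata, and `level` falls through to the original value.
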

copy(*, values=None, flatten=False, dtype=None, array_backend=None, array_namespace=None, device=None, metadata=None)

Create a new ArrayField by copying the values and metadata.

Parameters:
  • values (array-like or None) – The values to be stored in the new Field. When it is None, the values are extracted from the original field using to_array with flatten, dtype and array_backend, and copied to the new field.

  • flatten (bool) – Control the shape of the values when they are extracted from the original field. When True, flatten the array, otherwise the field’s shape is kept. Only used when values is not provided.

  • dtype (str, array.dtype or None) – Control the typecode or data-type of the values when they are extracted from the original field. If None, the default type used by the underlying data accessor is used. For GRIB it is float64. Only used when values is not provided.

  • array_backend (str, array_namespace or None) – Control the array namespace of the values when they are extracted from the original field. If None, the underlying array format of the field is used. Only used when values is not provided. Deprecated since version 0.19.0. Use array_namespace instead. In versions before 0.19.0 an ArrayBackend was also accepted here, which is no longer the case.

  • array_namespace (str, array_namespace or None) – Control the array namespace of the values when they are extracted from the original field. When it is None the underlying array format of the field is used. New in version 0.19.0.

  • device (str or None) – The device where the array will be allocated. When it is None the default device is used.

  • metadata (Metadata or None) – The metadata to be stored in the new Field. When it is None a copy of the metadata of the original field is used.

Return type:

ArrayField

data(keys=('lat', 'lon', 'value'), flatten=False, dtype=None, index=None)

Return the values and/or the geographical coordinates for each grid point.

Parameters:
  • keys (str, list or tuple) – Specifies the type of data to be returned. Any combination of “lat”, “lon” and “value” is allowed here.

  • flatten (bool) – When it is True a flat array per key is returned. Otherwise an array with the field’s shape is returned for each key.

  • dtype (str, array.dtype or None) – Typecode or data-type of the arrays. When it is None the default type used by the underlying data accessor is used. For GRIB it is float64.

  • index (array indexing object, optional) – The index of the values and/or the latitudes/longitudes to be extracted. When it is None all the values and/or coordinates are extracted.

Returns:

A multi-dimensional array containing one array per key is returned (following the order in keys). The underlying array format of the field is used. When keys is a single value only the array belonging to the key is returned.

Return type:

array-like

Examples

  • /examples/grib_lat_lon_value.ipynb

>>> import earthkit.data
>>> ds = earthkit.data.from_source("file", "docs/examples/test6.grib")
>>> d = ds[0].data()
>>> d.shape
(3, 7, 12)
>>> d[0, 0, 0]  # first latitude
90.0
>>> d[1, 0, 0]  # first longitude
0.0
>>> d[2, 0, 0]  # first value
272.56417847
>>> d = ds[0].data(keys="lon")
>>> d.shape
(7, 12)
>>> d[0, 0]  # first longitude
0.0
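
The layout of the returned array can be pictured with nested lists. The sketch below is a pure-Python model, not earthkit code: `data_sketch` is a hypothetical helper and every number in the 2 x 3 toy grid is made up. It only illustrates the documented rule that several keys are stacked in the order given, while a single key returns just that array.

```python
# Toy 2 x 3 grid; all numbers are made up for illustration.
lat = [[90.0, 90.0, 90.0], [60.0, 60.0, 60.0]]
lon = [[0.0, 30.0, 60.0], [0.0, 30.0, 60.0]]
value = [[272.5, 272.9, 273.1], [280.0, 281.2, 279.8]]
arrays = {"lat": lat, "lon": lon, "value": value}


def data_sketch(keys=("lat", "lon", "value")):
    """Mimic the documented layout (hypothetical helper, not earthkit code)."""
    if isinstance(keys, str):
        # A single key returns just the array belonging to that key.
        return arrays[keys]
    # Several keys are stacked along a new leading axis, in the order given.
    return [arrays[k] for k in keys]


d = data_sketch()
shape = (len(d), len(d[0]), len(d[0][0]))
lon_only = data_sketch(keys="lon")
swapped = data_sketch(keys=("value", "lat"))
```
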
datetime()

Return the date and time of the field.

Returns:

Dict with items “base_time” and “valid_time”.

Return type:

dict of datetime.datetime

Examples

>>> import earthkit.data
>>> ds = earthkit.data.from_source("file", "tests/data/t_time_series.grib")
>>> ds[4].datetime()
{'base_time': datetime.datetime(2020, 12, 21, 12, 0),
'valid_time': datetime.datetime(2020, 12, 21, 18, 0)}
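
In the example output, valid_time is six hours after base_time. The sketch below only does the arithmetic on those two values with the standard library. Reading the difference as a forecast step is an assumption and is not stated in this reference.

```python
import datetime

# The dictionary shown in the example output above.
result = {
    "base_time": datetime.datetime(2020, 12, 21, 12, 0),
    "valid_time": datetime.datetime(2020, 12, 21, 18, 0),
}

# Difference between the two items, as a timedelta and in hours.
step = result["valid_time"] - result["base_time"]
step_hours = step.total_seconds() / 3600
```
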
default_encoder()
describe(*args, **kwargs)

Generate a summary of the Field.

dump(namespace=all, **kwargs)

Generate dump with all the metadata keys belonging to namespace.

In a Jupyter notebook it is represented as a tabbed interface.

Parameters:
  • namespace (str, list, tuple, None or all) –

    The namespace to dump. The following namespace values have a special meaning:

    • all: all the available namespaces will be used.

    • None or empty str: all the available keys will be used (without a namespace qualifier)

  • **kwargs (dict, optional) – Other keyword arguments used for testing only

Returns:

Dict-like object with one item per namespace. In a Jupyter notebook it is represented as a tabbed interface to browse the dump contents.

Return type:

NamespaceDump

Examples

GRIB: inspecting contents

grid_points()
grid_points_unrotated()
group_by(*args)

Return iterator for batches of data grouped by metadata keys
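
A minimal sketch of grouping by metadata keys, using `itertools.groupby` on plain dictionaries. The helper `group_by_sketch` and the records are hypothetical. Whether the real method groups only consecutive elements, as this sketch does, is an assumption.

```python
from itertools import groupby

# Hypothetical metadata records standing in for fields.
records = [
    {"param": "t", "level": 500},
    {"param": "t", "level": 850},
    {"param": "u", "level": 500},
]


def group_by_sketch(items, *keys):
    """Yield lists of consecutive items that share the same values for keys."""
    def key_values(item):
        return tuple(item[k] for k in keys)

    for _, group in groupby(items, key=key_values):
        yield list(group)


groups = list(group_by_sketch(records, "param"))
```
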

h_datetime()
property handle
indexing_datetime()
abstract isel(*args, **kwargs)
ls(*args, **kwargs)

Generate a list-like summary using a set of metadata keys.

Parameters:
  • *args (tuple) – Positional arguments passed to FieldList.ls.

  • **kwargs (dict, optional) – Other keyword arguments passed to FieldList.ls.

Returns:

DataFrame with one row.

Return type:

Pandas DataFrame

property mars_area
property mars_grid
classmethod merge(*args, **kwargs)

Merge the object with other ones.

abstract message()

Return a buffer containing the encoded message for message based formats (e.g. GRIB).

Return type:

bytes

metadata(*keys, astype=None, remapping=None, patches=None, **kwargs)

Return metadata values from the field.

When called without any arguments returns a Metadata object.

Parameters:
  • *keys (tuple) – Positional arguments specifying metadata keys. Can be empty, in this case all the keys from the specified namespace will be used. (See examples below).

  • astype (type name, list or tuple) – Return types for keys. A single value is accepted and applied to all the keys. Otherwise, it must have the same number of elements as keys. Only used when keys is not empty.

  • remapping (dict, optional) –

    Creates new metadata keys from existing ones that we can refer to in *args and **kwargs. E.g. to define a new key “param_level” as the concatenated value of the “param” and “level” keys use:

    remapping={"param_level": "{param}{level}"}
    

  • **kwargs (dict, optional) –

    Other keyword arguments:

    • namespace: str, list, tuple, None or all

      The namespace to choose the keys from. When keys is empty and namespace is all, all the available namespaces will be used. When keys is non-empty, namespace cannot specify multiple values and it cannot be all. When namespace is None or an empty str, all the available keys will be used (without a namespace qualifier).

    • default: value, optional

      Specifies the same default value for all the keys specified. When default is not present and a key is not found, or its value is a missing value, metadata will raise KeyError.

Returns:

  • when called without any arguments returns a Metadata object

  • when keys is not empty:
    • returns single value when keys is a str

    • otherwise returns the same type as that of keys (list or tuple)

  • when keys is empty:
    • when namespace is None or an empty str returns a dict with all the available keys and values

    • when namespace is str returns a dict with the keys and values in that namespace

    • otherwise returns a dict with one item per namespace (dict of dict)

Return type:

single value, list, tuple, dict or Metadata

Raises:

KeyError – If no default is set and a key is not found in the message or it has a missing value.

Examples

>>> import earthkit.data
>>> ds = earthkit.data.from_source("file", "docs/examples/test.grib")

Calling without arguments:

>>> r = ds[0].metadata()
>>> r
<earthkit.data.readers.grib.metadata.GribMetadata object at 0x164ace170>
>>> r["name"]
'2 metre temperature'

Getting keys with their native type:

>>> ds[0].metadata("param")
'2t'
>>> ds[0].metadata("param", "units")
('2t', 'K')
>>> ds[0].metadata(("param", "units"))
('2t', 'K')
>>> ds[0].metadata(["param", "units"])
['2t', 'K']
>>> ds[0].metadata(["param"])
['2t']
>>> ds[0].metadata("badkey")
KeyError: 'badkey'
>>> ds[0].metadata("badkey", default=None)

Prescribing key types:

>>> ds[0].metadata("centre", astype=int)
98
>>> ds[0].metadata(["paramId", "centre"], astype=int)
[167, 98]
>>> ds[0].metadata(["centre", "centre"], astype=[int, str])
[98, 'ecmf']

Using namespaces:

>>> ds[0].metadata(namespace="parameter")
{'centre': 'ecmf', 'paramId': 167, 'units': 'K', 'name': '2 metre temperature', 'shortName': '2t'}
>>> ds[0].metadata(namespace=["parameter", "vertical"])
{'parameter': {'centre': 'ecmf', 'paramId': 167, 'units': 'K', 'name': '2 metre temperature',
 'shortName': '2t'},
 'vertical': {'typeOfLevel': 'surface', 'level': 0}}
>>> r = ds[0].metadata(namespace=all)
>>> r.keys()
dict_keys(['default', 'ls', 'geography', 'mars', 'parameter', 'statistics', 'time', 'vertical'])
>>> r = ds[0].metadata(namespace=None)
>>> len(r)
186
>>> r["name"]
'2 metre temperature'
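
The remapping parameter described above uses templates with key names in curly braces. The sketch below shows how such a template could be expanded with `str.format`. It is an illustration only: `remap_sketch` and the metadata values are hypothetical, and the library's own expansion may work differently.

```python
def remap_sketch(metadata, remapping):
    """Build new keys from format templates (hypothetical helper, not earthkit code)."""
    return {
        new_key: template.format(**metadata)
        for new_key, template in remapping.items()
    }


# Hypothetical metadata values for one field.
metadata = {"param": "t", "level": 500}
derived = remap_sketch(metadata, {"param_level": "{param}{level}"})
```

With these values, the new key "param_level" is the concatenation of "param" and "level".
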
mutate()
abstract order_by(*args, **kwargs)

Reorder the elements of the object.

projection()

Return information about the projection.

Return type:

Projection

Examples

>>> import earthkit.data
>>> ds = earthkit.data.from_source("file", "docs/examples/test.grib")
>>> ds.projection()
<Projected CRS: +proj=eqc +ellps=WGS84 +a=6378137.0 +lon_0=0.0 +to ...>
Name: unknown
Axis Info [cartesian]:
- E[east]: Easting (unknown)
- N[north]: Northing (unknown)
- h[up]: Ellipsoidal height (metre)
Area of Use:
- undefined
Coordinate Operation:
- name: unknown
- method: Equidistant Cylindrical
Datum: Unknown based on WGS 84 ellipsoid
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
>>> ds.projection().to_proj_string()
'+proj=eqc +ellps=WGS84 +a=6378137.0 +lon_0=0.0 +to_meter=111319.4907932736 +no_defs +type=crs'
property resolution
property rotation
save(filename, append=False, **kwargs)

Write the field into a file.

Parameters:
  • filename (str, optional) – The target file path. If it is not defined, attempts will be made to detect the filename.

  • append (bool, optional) – When it is True, append data to the target file. Otherwise the target file will be overwritten if it already exists. Default is False.

  • **kwargs (dict, optional) – Other keyword arguments passed to write.
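
The difference between appending and overwriting can be sketched with ordinary file modes. The helper `save_sketch` below is hypothetical and writes raw bytes; it stands in for the real encoder only to show the effect of the append flag on the target file.

```python
import os
import tempfile


def save_sketch(payload, filename, append=False):
    """Write bytes to filename, appending or overwriting (hypothetical helper)."""
    mode = "ab" if append else "wb"
    with open(filename, mode) as f:
        f.write(payload)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "out.bin")
    save_sketch(b"AAAA", path)               # creates the file
    save_sketch(b"BBBB", path, append=True)  # adds to the existing content
    size_after_append = os.path.getsize(path)
    save_sketch(b"CC", path)                 # default: replaces the content
    size_after_overwrite = os.path.getsize(path)
```
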

abstract sel(*args, **kwargs)

Filter the object based on metadata.

property shape

Get the shape of the field.

For structured grids the shape is a tuple in the form of (Nj, Ni) where:

  • Ni: the number of gridpoints in i direction (longitude for a regular latitude-longitude grid)

  • Nj: the number of gridpoints in j direction (latitude for a regular latitude-longitude grid)

For other grid types the number of gridpoints is returned as (num,)

Type:

tuple
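
To relate the (Nj, Ni) shape to a flattened array, the sketch below maps a grid index to a flat position. It assumes row-major ordering, which this reference does not state, and `flat_index` is a hypothetical helper rather than part of the API.

```python
def flat_index(j, i, shape):
    """Map a (j, i) grid index to a position in the flattened array.

    Assumes row-major ordering; hypothetical helper, not earthkit code.
    """
    nj, ni = shape
    if not (0 <= j < nj and 0 <= i < ni):
        raise IndexError("grid index out of range")
    return j * ni + i


shape = (7, 12)  # (Nj, Ni)
num_points = shape[0] * shape[1]
first = flat_index(0, 0, shape)
second_row_start = flat_index(1, 0, shape)
last = flat_index(6, 11, shape)
```
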

to_array(flatten=False, dtype=None, array_backend=None, array_namespace=None, device=None, index=None)

Return the values stored in the field.

Parameters:
  • flatten (bool) – When it is True a flat array is returned. Otherwise an array with the field’s shape is returned.

  • dtype (str, array.dtype or None) – Typecode or data-type of the array. When it is None the default type used by the underlying data accessor is used. For GRIB it is float64.

  • array_backend (str, array_namespace or None) – The array namespace to be used. When it is None the underlying array format of the field is used. Deprecated since version 0.19.0. Use array_namespace instead. In versions before 0.19.0 an ArrayBackend was also accepted here, which is no longer the case.

  • array_namespace (str, array_namespace or None) – The array namespace to be used. When it is None the underlying array format of the field is used. New in version 0.19.0.

  • device (str or None) – The device where the array will be allocated. When it is None the default device is used.

  • index (array indexing object, optional) – The index of the values to be extracted. When it is None all the values are extracted.

Returns:

Field values.

Return type:

array-like

to_latlon(flatten=False, dtype=None, index=None)

Return the latitudes/longitudes of all the gridpoints in the field.

Parameters:
  • flatten (bool) – When it is True 1D arrays are returned. Otherwise arrays with the field’s shape are returned.

  • dtype (str, array.dtype or None) – Typecode or data-type of the arrays. When it is None the default type used by the underlying data accessor is used. For GRIB it is float64.

  • index (array indexing object, optional) – The index of the latitudes/longitudes to be extracted. When it is None all the values are extracted.

Returns:

Dictionary with items “lat” and “lon”, containing the arrays of the latitudes and longitudes, respectively. The underlying array format of the field is used.

Return type:

dict

See also

to_points
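
The structure of the returned dictionary, and the effect of flatten, can be modelled for a regular latitude-longitude grid. Everything in the sketch below is an assumption made for illustration: `regular_latlon_sketch` is a hypothetical helper, and the grid parameters (a 7 x 12 grid with 30-degree spacing starting at 90N, 0E) are chosen by hand rather than read from any file.

```python
def regular_latlon_sketch(north, west, dlat, dlon, nj, ni, flatten=False):
    """Build "lat" and "lon" arrays for a regular grid (hypothetical helper)."""
    # Latitude is constant along a row; longitude is constant along a column.
    lat = [[north - j * dlat for _ in range(ni)] for j in range(nj)]
    lon = [[west + i * dlon for i in range(ni)] for _ in range(nj)]
    if flatten:
        # Collapse the (nj, ni) nested lists into 1D lists, row by row.
        lat = [v for row in lat for v in row]
        lon = [v for row in lon for v in row]
    return {"lat": lat, "lon": lon}


grid = regular_latlon_sketch(90.0, 0.0, 30.0, 30.0, nj=7, ni=12)
flat = regular_latlon_sketch(90.0, 0.0, 30.0, 30.0, nj=7, ni=12, flatten=True)
```
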

to_numpy(flatten=False, dtype=None, index=None)

Return the values stored in the field as an ndarray.

Parameters:
  • flatten (bool) – When it is True a flat ndarray is returned. Otherwise an ndarray with the field’s shape is returned.

  • dtype (str, numpy.dtype or None) – Typecode or data-type of the array. When it is None the default type used by the underlying data accessor is used. For GRIB it is float64.

  • index (ndarray indexing object, optional) – The index of the values to be extracted. When it is None all the values are extracted.

Returns:

Field values

Return type:

ndarray

abstract to_pandas(**kwargs)

Convert into a pandas DataFrame.

to_points(flatten=False, dtype=None, index=None)

Return the geographical coordinates in the data’s original Coordinate Reference System (CRS).

Parameters:
  • flatten (bool) – When it is True 1D arrays are returned. Otherwise arrays with the field’s shape are returned.

  • dtype (str, array.dtype or None) – Typecode or data-type of the arrays. When it is None the default type used by the underlying data accessor is used. For GRIB it is float64.

  • index (array indexing object, optional) – The index of the coordinates to be extracted. When it is None all the values are extracted.

Returns:

Dictionary with items “x” and “y”, containing the arrays of the x and y coordinates, respectively. The underlying array format of the field is used.

Return type:

dict

Raises:

ValueError – When the coordinates in the data’s original CRS are not available.

See also

to_latlon

to_target(target, *args, **kwargs)

Write the field into a target object.

Parameters:
  • target (object) – The target object to write the field into.

  • *args (tuple) – Positional arguments used to specify the target object.

  • **kwargs (dict, optional) – Other keyword arguments used to write the field into the target object.

to_xarray(*args, **kwargs)

Convert the Field into an Xarray Dataset.

Parameters:
  • *args (tuple) – Positional arguments passed to FieldList.to_xarray.

  • **kwargs (dict, optional) – Other keyword arguments passed to FieldList.to_xarray.

Return type:

Xarray Dataset

unique_values(*coords, remapping=None, patches=None, progress_bar=False)

Given a list of metadata attributes, such as date, param or levels, return the list of unique values for each attribute.
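
A minimal sketch of collecting unique values per attribute from plain dictionaries. The helper `unique_values_sketch`, the records, the dict return type and the first-seen ordering are all assumptions made for illustration; the real method may order or package its result differently.

```python
def unique_values_sketch(records, *coords):
    """Collect the unique values of each attribute, in first-seen order."""
    result = {coord: [] for coord in coords}
    for record in records:
        for coord in coords:
            value = record[coord]
            if value not in result[coord]:
                result[coord].append(value)
    return result


# Hypothetical metadata records standing in for fields.
records = [
    {"param": "t", "level": 500},
    {"param": "t", "level": 850},
    {"param": "u", "level": 500},
]
unique = unique_values_sketch(records, "param", "level")
```
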

valid_datetime()
property values

Get the values stored in the field as a 1D array.

Type:

array-like

write(f, **kwargs)