Data objects

Warning

This guide is currently under construction and may be incomplete or inaccurate.

The from_source() and from_object() methods return a Data object. This object provides only basic information about the data; its primary purpose is to allow conversion to representations suitable for further work. Actual data loading is deferred as long as possible, until the data is converted into a given type.
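The deferred loading described above can be pictured with a small stand-alone sketch. The LazyData class, its loader argument and the load_count counter are hypothetical and are not part of earthkit-data; they only illustrate the idea that creating the object is cheap and that the data is read when a conversion is requested.

```python
class LazyData:
    """Toy stand-in for a Data object; NOT the earthkit-data implementation."""

    def __init__(self, loader):
        # Keep only a reference to the loader; nothing is read here.
        self._loader = loader
        self.load_count = 0

    @property
    def available_types(self):
        # This toy object supports a single conversion target.
        return ["list"]

    def to_list(self):
        # The loader runs only when a conversion is requested.
        self.load_count += 1
        return list(self._loader())


def read_values():
    # Hypothetical loader, standing in for reading a file.
    return (x * 2 for x in range(3))


data = LazyData(read_values)
count_before = data.load_count  # nothing has been loaded yet
values = data.to_list()         # loading happens at conversion time
count_after = data.load_count
```

In this sketch the loader is not called until to_list() is invoked, which mirrors the behaviour the guide describes for conversions.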

The available conversion types are listed by the available_types property of the returned object. A conversion is performed by calling the corresponding to_* method. For example, to convert GRIB data to a fieldlist:

>>> import earthkit.data as ekd
>>> data = ekd.from_source("file", "test6.grib")
>>> data.available_types
['fieldlist', 'xarray', 'pandas', 'numpy', 'array']
>>> # convert to a fieldlist
>>> fl = data.to_fieldlist()
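When the target type is not known in advance, one option is to check available_types before calling a to_* method. The following is a minimal sketch, assuming that every name in available_types has a matching to_<name>() method, as the text above suggests. The convert() helper and the StubData class are hypothetical; StubData only exists so that the example runs without earthkit-data or a data file.

```python
def convert(data, preferred):
    """Convert data to the first type in preferred that it supports."""
    for name in preferred:
        if name in data.available_types:
            # Assumption: a type called "x" is produced by a to_x() method.
            return name, getattr(data, f"to_{name}")()
    raise ValueError(f"none of {preferred} found in {data.available_types}")


class StubData:
    """Hypothetical stand-in for a Data object, used for illustration only."""

    available_types = ["fieldlist", "numpy"]

    def to_fieldlist(self):
        return ["field-1", "field-2"]

    def to_numpy(self):
        return [[1.0, 2.0]]


# "xarray" is not offered by the stub, so the helper falls back to "fieldlist".
chosen, result = convert(StubData(), ["xarray", "fieldlist"])
```

With a real Data object the same helper would be called as convert(data, ["xarray", "fieldlist"]), under the naming assumption stated above.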

Data object returned by from_source

File input

When from_source() reads file input (data on disk, at a URL, in memory, or from a remote service), one of the following objects is returned:

Streams

When the data is read as a stream with from_source(), one of the following objects is returned:

Types related to file formats

Input data type    Resulting data object
GRIB               earthkit.data.data.stream.StreamFieldListData
CovJSON            earthkit.data.data.stream.StreamFeatureListData

To access the stream, convert the data into a stream fieldlist (GRIB) with to_fieldlist() or a stream featurelist (CovJSON) with to_featurelist(). The resulting object can then be used to iterate through the stream once.

>>> import earthkit.data as ekd
>>> url = "https://sites.ecmwf.int/repository/earthkit-data/how-tos/test4.grib"
>>> ds = ekd.from_source("url", url, stream=True)
>>> fl = ds.to_fieldlist()
>>> for f in fl:
...     print(f)
...
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
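The "iterate once" behaviour can be illustrated with a plain Python generator. This is a sketch under the assumption that a stream fieldlist behaves like a one-shot iterator; field_stream() is a hypothetical stand-in and does not use earthkit-data.

```python
def field_stream():
    """Hypothetical stand-in for a stream of fields."""
    for name in ["t", "u", "v"]:
        yield name


stream = field_stream()

# The first pass consumes every item in the stream.
first_pass = [f for f in stream]

# A second pass over the same object finds nothing left to read.
second_pass = [f for f in stream]
```

If the fields are needed more than once, they have to be stored during the first pass, for example by collecting them in a list as first_pass does here.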

Examples

Special cases

There are complex cases with mixed input data types, where the returned object might be one of the following:

Data object returned by from_object

The from_object() method turns a Python object into an earthkit Data object. When it is called with an earthkit-data object, it returns the object itself. Otherwise, it returns one of the following objects, depending on the input: