FieldList

A FieldList is an ordered, indexable collection of Field objects. It is the primary interface returned by from_source() and acts as the main entry point for working with multi-field datasets in EarthKit Data.

>>> import earthkit.data as ekd
>>> ds = ekd.from_source("sample", "tuv_pl.grib").to_fieldlist()
>>> len(ds)
18
>>> ds[0]
GribField(u, 1000, 2020-01-01 00:00, 0, None, None)

Indexing and slicing

FieldLists support integer indexing and Python slice notation. Slicing returns a new FieldList containing the selected fields:

>>> ds[0]           # single field
>>> ds[0:3]         # slice — returns a FieldList
>>> ds[-1]          # last field
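
The indexing rules above can be illustrated without EarthKit Data installed. The sketch below uses a hypothetical MiniFieldList class (not part of the library) that holds plain strings instead of Field objects; it only models the behaviour described here, namely that an integer returns one item and a slice returns a new container of the same type:

```python
# Toy stand-in for a FieldList; NOT the EarthKit Data implementation.
class MiniFieldList:
    def __init__(self, fields):
        self._fields = list(fields)

    def __len__(self):
        return len(self._fields)

    def __getitem__(self, index):
        if isinstance(index, slice):
            # A slice yields a new container of the same type.
            return MiniFieldList(self._fields[index])
        # An integer (including negative values) yields a single item.
        return self._fields[index]


# Placeholder strings stand in for Field objects.
fields = MiniFieldList(["u1000", "v1000", "t1000", "u850"])
first = fields[0]      # single item
last = fields[-1]      # last item
subset = fields[0:3]   # new MiniFieldList with three items
print(first, last, type(subset).__name__, len(subset))
```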

Iteration

Iterating over a FieldList yields individual Field objects one at a time:

>>> for field in ds:
...     print(field.parameter.variable(), field.vertical.level())

Selection

sel() filters a FieldList by metadata values and returns a new FieldList containing only the matching fields. Keys follow the "component.key" convention used throughout EarthKit Data:

>>> pl500 = ds.sel({"vertical.level": 500})
>>> wind = ds.sel({"parameter.variable": ["u", "v"]})

Source-native keys (e.g. GRIB shortName, level) can also be used by prefixing them with "metadata.":

>>> plt500 = ds.sel({"metadata.shortName": "t", "metadata.level": 500})
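
To make the matching rules concrete, here is a minimal pure-Python model of the filtering logic. It is a sketch, not the library's implementation: the sel() function and the dict records below are hypothetical stand-ins for fields. It assumes that a scalar value matches by equality, a list matches any of its members, and that several keys must all match:

```python
def sel(records, conditions):
    """Return the records matching every condition (toy model of sel())."""

    def matches(record):
        for key, wanted in conditions.items():
            # Treat a scalar as a one-element list of allowed values.
            allowed = wanted if isinstance(wanted, (list, tuple)) else [wanted]
            if record.get(key) not in allowed:
                return False
        return True

    return [r for r in records if matches(r)]


# Nine hypothetical records: three variables on three levels.
records = [
    {"parameter.variable": var, "vertical.level": level}
    for level in (1000, 850, 500)
    for var in ("u", "v", "t")
]

pl500 = sel(records, {"vertical.level": 500})
wind = sel(records, {"parameter.variable": ["u", "v"]})
t500 = sel(records, {"parameter.variable": "t", "vertical.level": 500})
print(len(pl500), len(wind), len(t500))
```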

Ordering

order_by() returns a new FieldList sorted by one or more metadata keys. Multiple keys can be passed as a list and are applied in order:

>>> ds_sorted = ds.order_by(["parameter.variable", "vertical.level"])
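
"Applied in order" means the first key is the primary sort key and later keys break ties. The sketch below models this with Python's built-in sorted() on hypothetical dict records; ascending order is an assumption made for the illustration, and the code is not how EarthKit Data implements order_by():

```python
# Hypothetical records standing in for fields.
records = [
    {"parameter.variable": "u", "vertical.level": 500},
    {"parameter.variable": "t", "vertical.level": 1000},
    {"parameter.variable": "u", "vertical.level": 1000},
    {"parameter.variable": "t", "vertical.level": 500},
]

keys = ["parameter.variable", "vertical.level"]

# Sort on a tuple of values: the first key dominates, the second breaks ties.
# sorted() returns a new list and leaves `records` untouched.
ordered = sorted(records, key=lambda r: tuple(r[k] for k in keys))

for r in ordered:
    print(r["parameter.variable"], r["vertical.level"])
```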

Metadata access

get() returns a list of metadata values, one per field, using the "component.key" keys described in the component pages:

>>> ds.get("parameter.variable")
['u', 'v', 't', ...]
>>> ds.get(["parameter.variable", "vertical.level"])
[('u', 1000), ('v', 1000), ('t', 1000), ...]

For source-native keys (e.g. GRIB shortName, level), metadata() can be used directly without any prefix:

>>> ds.metadata("shortName")
['u', 'v', 't', ...]

The ls() method provides a quick tabular summary of the most commonly used metadata keys. In Jupyter notebooks the output is rendered as a table automatically; in other environments (e.g. a terminal), wrap the call in print() to display the table:

>>> ds.ls() # in Jupyter notebook, this is rendered as a table

>>> print(ds.ls()) # in terminal, an extra print() is needed to render the table

Extracting data

to_numpy() returns the field values stacked into a NumPy array and accepts dtype, copy, and other arguments. The shape of the result depends on the shape of the individual fields:

for fields with a 1-D grid (e.g. unstructured grids), the result has shape (number_of_fields, number_of_grid_points);
for fields with a 2-D grid (e.g. regular lat/lon grids), the result has shape (number_of_fields, Ny, Nx).

By default a copy of the data is returned. Pass copy=False to avoid the copy when the data layout allows it; note that a copy may still be made if the underlying source cannot provide a zero-copy view:

>>> ds.to_numpy(dtype="float32").shape
(18, 19, 36)
>>> ds.to_numpy(copy=False).shape
(18, 19, 36)

values is a convenience property that always returns a 2-D array of shape (number_of_fields, number_of_grid_points), where each row is the flat 1-D array of values for one field. The array type matches the native array format of the underlying data (e.g. NumPy for GRIB and NetCDF fields, or a GPU array for array-backed FieldLists). Each access returns a copy:

>>> ds.values.shape
(18, 684)
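
The two shapes shown above are related by flattening every grid axis into one. The short sketch below checks that arithmetic in plain Python; flat_shape() is a hypothetical helper written for this illustration, not an EarthKit Data function:

```python
import math


def flat_shape(stacked_shape):
    """Shape after flattening every axis except the leading field axis."""
    n_fields, *grid = stacked_shape
    return (n_fields, math.prod(grid))


# 2-D grid: 19 x 36 points per field collapse to 684 values per row.
shape_2d = flat_shape((18, 19, 36))

# 1-D grid: the shape is already (number_of_fields, number_of_grid_points).
shape_1d = flat_shape((18, 684))

print(shape_2d, shape_1d)
```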

Converting to Xarray

to_xarray() converts the FieldList to an xarray.Dataset. EarthKit Data uses a dedicated Xarray engine that maps field metadata to dataset dimensions and coordinates:

>>> xr_ds = ds.to_xarray()
>>> xr_ds
<xarray.Dataset>
Dimensions:  ...

Concatenation

Two FieldLists can be concatenated with the + operator, producing a new FieldList that contains all fields from both operands in order:

>>> combined = ds1 + ds2
>>> len(combined) == len(ds1) + len(ds2)
True

FieldList types

There are several concrete FieldList implementations, each suited to a different access pattern:

SimpleFieldList — in-memory list, produced by most operations such as sel(), order_by(), and +.
StreamFieldList — backed by a streaming source (e.g. FDB or URL stream). Supports forward iteration only; use to_fieldlist() with read_all=True to materialise into memory.
ArrayFieldList — backed by NumPy arrays, used when constructing fields programmatically.
FileFieldList — backed by an on-disk file (e.g. a cached GRIB file).
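
The forward-only constraint of a streaming FieldList is loosely analogous to a Python generator, which can be consumed once and has no random access until its items are collected into a list. The sketch below shows that analogy only; stream_fields() is a hypothetical stand-in, and whether StreamFieldList behaves exactly like an exhausted generator on a second pass is an assumption:

```python
def stream_fields():
    """Hypothetical stand-in for a streaming source: yields items once."""
    for name in ("u", "v", "t"):
        yield name


stream = stream_fields()
first_pass = list(stream)   # consumes the stream
second_pass = list(stream)  # nothing is left to read

# Materialising into memory restores indexing and repeated access.
materialised = list(stream_fields())
picked = [materialised[i] for i in (2, 0)]

print(first_pass, second_pass, picked)
```

In this analogy, materialising corresponds to the to_fieldlist() step mentioned in the list above: the cost is holding every item in memory at once.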