Using GRIB data
We will work with a GRIB file containing 6 messages. First we ensure the example file is available, then read the file with from_source().
[1]:
import earthkit.data as ekd
ekd.download_example_file("test6.grib")
[2]:
ds = ekd.from_source("file", "test6.grib")
No GRIB data is actually loaded at this point.
Iteration
A GRIB data object is represented by a GribFieldList consisting of GribFields. During iteration each GribField is created on demand and released as soon as it goes out of scope. As a result, only one GRIB message at a time is kept in memory:
[3]:
for f in ds:
    print(f)
GribField(t,1000,20180801,1200,0,0)
GribField(u,1000,20180801,1200,0,0)
GribField(v,1000,20180801,1200,0,0)
GribField(t,850,20180801,1200,0,0)
GribField(u,850,20180801,1200,0,0)
GribField(v,850,20180801,1200,0,0)
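The on-demand behaviour described above can be sketched with a toy model. This is only an illustration of the idea, not earthkit-data's implementation: ToyField and ToyFieldList are hypothetical names, and the live counter relies on CPython releasing objects as soon as their last reference disappears.

```python
# Toy model of on-demand field creation. ToyField and ToyFieldList are
# hypothetical names and are not part of the earthkit-data API.
class ToyField:
    live = 0  # number of ToyField objects currently alive

    def __init__(self, name):
        self.name = name
        ToyField.live += 1

    def __del__(self):
        # Called when the object is released (immediately in CPython).
        ToyField.live -= 1


class ToyFieldList:
    def __init__(self, names):
        # Only lightweight descriptions are stored, no field objects.
        self._names = list(names)

    def __iter__(self):
        for name in self._names:
            # A field object is built only when the loop asks for it.
            yield ToyField(name)


peak = 0
for f in ToyFieldList(["t", "u", "v"]):
    # Record how many fields are alive while the loop body runs.
    peak = max(peak, ToyField.live)

del f  # drop the last reference still held by the loop variable
```

In this sketch the loop body never sees more than one live field, because rebinding the loop variable releases the previous object.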
Inspecting the contents
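We can use len() to get the number of fields, ls() to list the key metadata of each field, and describe() to get a summary per parameter.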
[4]:
len(ds)
[4]:
6
[5]:
ds.ls()
[5]:
| | centre | shortName | typeOfLevel | level | dataDate | dataTime | stepRange | dataType | number | gridType |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ecmf | t | isobaricInhPa | 1000 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 1 | ecmf | u | isobaricInhPa | 1000 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 2 | ecmf | v | isobaricInhPa | 1000 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 3 | ecmf | t | isobaricInhPa | 850 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 4 | ecmf | u | isobaricInhPa | 850 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 5 | ecmf | v | isobaricInhPa | 850 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
[6]:
ds.describe()
[6]:
| shortName | typeOfLevel | level | date | time | step | paramId | class | stream | type | experimentVersionNumber |
|---|---|---|---|---|---|---|---|---|---|---|
| t | isobaricInhPa | 1000,850 | 20180801 | 1200 | 0 | 130 | od | oper | an | 0001 |
| u | isobaricInhPa | 1000,850 | 20180801 | 1200 | 0 | 131 | od | oper | an | 0001 |
| v | isobaricInhPa | 1000,850 | 20180801 | 1200 | 0 | 132 | od | oper | an | 0001 |
Slicing
Standard Python slicing is available. It does not involve any loading/copying of GRIB data.
[7]:
g = ds[1]
g
[7]:
GribField(u,1000,20180801,1200,0,0)
[8]:
g = ds[1:3]
g.ls()
[8]:
| | centre | shortName | typeOfLevel | level | dataDate | dataTime | stepRange | dataType | number | gridType |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ecmf | u | isobaricInhPa | 1000 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 1 | ecmf | v | isobaricInhPa | 1000 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
[9]:
g = ds[-1]
g
[9]:
GribField(v,850,20180801,1200,0,0)
Getting data values
Using values
The values property always returns a flat array per field:
[10]:
v = ds[0].values
v.shape
[10]:
(84,)
[11]:
v[0:4]
[11]:
array([272.56417847, 272.56417847, 272.56417847, 272.56417847])
When called on the whole fieldlist, values returns a 2D array with one row per field:
[12]:
v = ds.values
v.shape
[12]:
(6, 84)
Using to_numpy()
to_numpy() returns an array that has the shape of the field:
[13]:
v = ds[0].to_numpy()
print(v.shape)
print(ds[0].shape)
(7, 12)
(7, 12)
to_numpy() behaves in the same way when called on a fieldlist:
[14]:
v = ds.to_numpy()
v.shape
[14]:
(6, 7, 12)
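The shapes above suggest how the two access methods relate. The following is a minimal sketch on synthetic data, assuming that the flat array is the row-major flattening of the grid (the notebook does not state this explicitly); flat and field_shape are stand-ins for ds.values and ds[0].shape.

```python
import numpy as np

# Synthetic stand-in for ds.values: 6 fields with 84 grid points each.
flat = np.arange(6 * 84, dtype=np.float64).reshape(6, 84)

# Assumed grid shape (7 latitudes x 12 longitudes), as reported by ds[0].shape.
field_shape = (7, 12)

# Restore the grid dimensions while keeping the field dimension first.
shaped = flat.reshape(flat.shape[0], *field_shape)

# Flattening the grid dimensions again recovers the original layout.
roundtrip = shaped.reshape(shaped.shape[0], -1)

print(shaped.shape)     # (6, 7, 12)
print(roundtrip.shape)  # (6, 84)
```

Under this assumption, choosing between values and to_numpy() is a matter of convenience: the flat form suits pointwise statistics, while the shaped form suits operations along latitude or longitude.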
Metadata
metadata() access works both on individual fields and slices:
[15]:
ds[0].metadata("typeOfLevel")
[15]:
'isobaricInhPa'
[16]:
ds[0:2].metadata(["level", "paramId"])
[16]:
[[1000, 130], [1000, 131]]
We can also call metadata() on a fieldlist:
[17]:
ds.metadata("level")
[17]:
[1000, 1000, 1000, 850, 850, 850]
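Because the metadata list follows the field order, it can be combined with the values array. Here is a small sketch with synthetic data; levels and values are stand-ins for the results of ds.metadata("level") and ds.values, not real GRIB content.

```python
import numpy as np

# Stand-ins for ds.metadata("level") and ds.values from the cells above.
levels = [1000, 1000, 1000, 850, 850, 850]
values = np.zeros((6, 84))  # synthetic data, one row per field

# Boolean mask marking the rows that belong to the 850 hPa level.
mask = np.array(levels) == 850

# Keep only the matching rows.
values_850 = values[mask]

print(mask.tolist())     # [False, False, False, True, True, True]
print(values_850.shape)  # (3, 84)
```

Note that boolean indexing on a NumPy array returns a copy of the selected rows.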
The astype argument can be used to prescribe the type in which each key is returned:
[18]:
ds[0].metadata(["centre", "centre", "centre"], astype=(None, int, str))
[18]:
['ecmf', 98, 'ecmf']
For each field we can get the metadata as an object:
[19]:
md = ds[0].metadata()
md
[19]:
<earthkit.data.readers.grib.metadata.GribFieldMetadata at 0x2b49be2c0>
[20]:
md["level"]
[20]:
1000
Selection
sel() offers metadata-based selection and always creates a “view”, so no copying of GRIB data is involved.
[21]:
g = ds.sel(shortName=["u", "v"], level=850)
g.ls()
[21]:
| | centre | shortName | typeOfLevel | level | dataDate | dataTime | stepRange | dataType | number | gridType |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ecmf | u | isobaricInhPa | 850 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 1 | ecmf | v | isobaricInhPa | 850 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
[22]:
g = ds.sel(param="t")
g.ls()
[22]:
| | centre | shortName | typeOfLevel | level | dataDate | dataTime | stepRange | dataType | number | gridType |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ecmf | t | isobaricInhPa | 1000 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 1 | ecmf | t | isobaricInhPa | 850 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
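The idea of a metadata-based "view" can be illustrated with a toy index. This is a sketch of the concept only and not how earthkit-data implements sel(): the index list and the select() helper are hypothetical, and the selection is represented purely by positions, so no data is copied.

```python
# Toy metadata index mirroring the six fields listed by ls() above.
# The structure and the select() helper are illustrative assumptions.
index = [
    {"shortName": "t", "level": 1000},
    {"shortName": "u", "level": 1000},
    {"shortName": "v", "level": 1000},
    {"shortName": "t", "level": 850},
    {"shortName": "u", "level": 850},
    {"shortName": "v", "level": 850},
]


def select(index, **criteria):
    """Return the positions of entries matching every criterion.

    A list or tuple value means "any of these values".
    """
    def matches(entry):
        for key, wanted in criteria.items():
            allowed = wanted if isinstance(wanted, (list, tuple)) else [wanted]
            if entry.get(key) not in allowed:
                return False
        return True

    return [i for i, entry in enumerate(index) if matches(entry)]


print(select(index, shortName=["u", "v"], level=850))  # [4, 5]
print(select(index, shortName="t"))                    # [0, 3]
```

The returned positions refer back to the original list, which is what makes the selection a view rather than a copy.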
Xarray
Xarray conversion with to_xarray() does not involve disk writing. Under the hood it uses cfgrib.
[23]:
xds = ds.to_xarray(engine="cfgrib")
xds
[23]:
<xarray.Dataset> Size: 2kB
Dimensions: (number: 1, time: 1, step: 1, isobaricInhPa: 2, latitude: 7,
longitude: 12)
Coordinates:
* number (number) int64 8B 0
* time (time) datetime64[ns] 8B 2018-08-01T12:00:00
* step (step) timedelta64[ns] 8B 00:00:00
* isobaricInhPa (isobaricInhPa) float64 16B 1e+03 850.0
* latitude (latitude) float64 56B 90.0 60.0 30.0 0.0 -30.0 -60.0 -90.0
* longitude (longitude) float64 96B 0.0 30.0 60.0 ... 270.0 300.0 330.0
valid_time (time, step) datetime64[ns] 8B ...
Data variables:
t (number, time, step, isobaricInhPa, latitude, longitude) float32 672B ...
u (number, time, step, isobaricInhPa, latitude, longitude) float32 672B ...
v (number, time, step, isobaricInhPa, latitude, longitude) float32 672B ...
Attributes:
GRIB_edition: 1
GRIB_centre: ecmf
GRIB_centreDescription: European Centre for Medium-Range Weather Forecasts
GRIB_subCentre: 0
Conventions: CF-1.7
institution: European Centre for Medium-Range Weather Forecasts
history: 2025-03-10T15:18 GRIB to CDM+CF via cfgrib-0.9.1...
We can pass all the keyword arguments that cfgrib accepts. On top of that, earthkit-data provides the ignore_keys option to omit keys from the default set of keys used by cfgrib:
[24]:
xds = ds.to_xarray(engine="cfgrib", xarray_open_dataset_kwargs={"backend_kwargs": {"ignore_keys": ["number"]}})
xds
[24]:
<xarray.Dataset> Size: 2kB
Dimensions: (time: 1, step: 1, isobaricInhPa: 2, latitude: 7,
longitude: 12)
Coordinates:
* time (time) datetime64[ns] 8B 2018-08-01T12:00:00
* step (step) timedelta64[ns] 8B 00:00:00
* isobaricInhPa (isobaricInhPa) float64 16B 1e+03 850.0
* latitude (latitude) float64 56B 90.0 60.0 30.0 0.0 -30.0 -60.0 -90.0
* longitude (longitude) float64 96B 0.0 30.0 60.0 ... 270.0 300.0 330.0
valid_time (time, step) datetime64[ns] 8B ...
Data variables:
t (time, step, isobaricInhPa, latitude, longitude) float32 672B ...
u (time, step, isobaricInhPa, latitude, longitude) float32 672B ...
v (time, step, isobaricInhPa, latitude, longitude) float32 672B ...
Attributes:
GRIB_edition: 1
GRIB_centre: ecmf
GRIB_centreDescription: European Centre for Medium-Range Weather Forecasts
GRIB_subCentre: 0
Conventions: CF-1.7
institution: European Centre for Medium-Range Weather Forecasts
history: 2025-03-10T15:18 GRIB to CDM+CF via cfgrib-0.9.1...