Xarray engine: overview
Earthkit-data comes with its own Xarray engine called “earthkit” to perform conversions between GRIB and Xarray data.
To start with, we get the example data we will use in this notebook and read it into a GRIB fieldlist.
[1]:
import earthkit.data as ekd
ds_fl = ekd.from_source("sample", "pl.grib")
Creating Xarray
To convert a GRIB fieldlist to Xarray we need to use to_xarray().
[2]:
ds = ds_fl.to_xarray()
ds
[2]:
<xarray.Dataset> Size: 176kB
Dimensions: (forecast_reference_time: 4, step: 2, level: 2,
latitude: 19, longitude: 36)
Coordinates:
* forecast_reference_time (forecast_reference_time) datetime64[ns] 32B 202...
* step (step) timedelta64[ns] 16B 00:00:00 06:00:00
* level (level) int64 16B 500 700
* latitude (latitude) float64 152B 90.0 80.0 ... -80.0 -90.0
* longitude (longitude) float64 288B 0.0 10.0 ... 340.0 350.0
Data variables:
r (forecast_reference_time, step, level, latitude, longitude) float64 88kB ...
t (forecast_reference_time, step, level, latitude, longitude) float64 88kB ...
Attributes:
class: od
stream: oper
levtype: pl
type: fc
expver: 0001
date: 20240603
time: 0
domain: g
number: 0
Conventions: CF-1.8
institution: ECMWF
to_xarray() has a large number of keyword arguments that control how the Xarray dataset is generated. To simplify usage, we can define profiles that provide custom defaults for most of these keyword arguments. At the moment, there are two pre-defined profiles available: “mars” (the default) and “grib”. We can pass them to to_xarray() via the profile kwarg.
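As a rough sketch of how the profile kwarg might be used, the helper below converts the same fieldlist once per profile and records the resulting dimension sizes so they can be compared side by side. The function name compare_profiles is hypothetical and not part of earthkit-data; the only calls assumed are to_xarray(profile=...) as described above and the standard sizes mapping of an Xarray dataset.

```python
def compare_profiles(fieldlist, profiles=("mars", "grib")):
    """Convert ``fieldlist`` with each profile and collect dimension sizes.

    ``compare_profiles`` is a hypothetical helper for illustration only.
    ``fieldlist`` is expected to offer ``to_xarray(profile=...)``.
    """
    result = {}
    for name in profiles:
        ds = fieldlist.to_xarray(profile=name)
        # ``sizes`` maps each dimension name to its length.
        result[name] = dict(ds.sizes)
    return result
```

With the fieldlist from the first cell this could be called as compare_profiles(ds_fl); what differs between the two profiles depends on their defaults, which are not listed here.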
Writing back to GRIB
This is an experimental feature!
In order to write the Xarray dataset back to GRIB, it has to keep the original variable attributes that the earthkit engine generated. By default, variable attributes are not kept in Xarray computations, so we need to set the global Xarray keep_attrs option to enable this, as shown in the following cell:
[3]:
import xarray as xr
xr.set_options(keep_attrs=True)
ds = ds_fl.to_xarray()
ds += 1
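The effect of keep_attrs can be seen in isolation with plain Xarray, without any GRIB data. The sketch below sets the option explicitly in both directions using set_options as a context manager; the array and its "units" attribute are made up for illustration and are not the attributes the earthkit engine writes.

```python
import numpy as np
import xarray as xr

# Illustrative array; the attribute is invented for this sketch.
da = xr.DataArray(np.zeros((2, 3)), dims=("y", "x"), attrs={"units": "K"})

# With keep_attrs disabled, arithmetic returns a result without attributes.
with xr.set_options(keep_attrs=False):
    dropped = (da + 1).attrs

# With keep_attrs enabled, the attributes of the operand are carried over.
with xr.set_options(keep_attrs=True):
    kept = (da + 1).attrs

print(dropped)
print(kept)
```

Using the context manager limits the setting to one block, whereas the cell above changes it globally for the rest of the session.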
Generating a fieldlist
To create a GRIB fieldlist we need to call to_fieldlist() on the “earthkit” accessor. The result is an array fieldlist holding all the data in memory.
[4]:
ds_fl1 = ds.earthkit.to_fieldlist()
ds_fl1.head()
[4]:
| | centre | shortName | typeOfLevel | level | dataDate | dataTime | stepRange | dataType | number | gridType |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ecmf | r | isobaricInhPa | 500 | 20240603 | 0 | 0 | fc | 0 | regular_ll |
| 1 | ecmf | r | isobaricInhPa | 700 | 20240603 | 0 | 0 | fc | 0 | regular_ll |
| 2 | ecmf | r | isobaricInhPa | 500 | 20240603 | 0 | 6 | fc | 0 | regular_ll |
| 3 | ecmf | r | isobaricInhPa | 700 | 20240603 | 0 | 6 | fc | 0 | regular_ll |
| 4 | ecmf | r | isobaricInhPa | 500 | 20240603 | 1200 | 0 | fc | 0 | regular_ll |
We can see that the GRIB field values changed as expected if we compare the original and resulting fieldlists.
[8]:
m_0 = ds_fl.sel(param="t", step=6, level=500)[0].values.mean()
m_1 = ds_fl1.sel(param="t", step=6, level=500)[0].values.mean()
m_0, m_1
[8]:
(254.25649845948692, 255.25649845948692)
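To make the comparison explicit rather than reading it off the printed tuple, the difference of the two means can be checked against the expected shift of 1. This is a minimal sketch that reuses the two values printed above; the tolerance is an arbitrary assumption, chosen because floating-point means rarely differ by exactly the expected amount.

```python
import math

# Means printed in the cell above for t at step=6, level=500.
m_0 = 254.25649845948692
m_1 = 255.25649845948692

# The dataset was incremented by 1, so the mean should shift by about 1.
shift = m_1 - m_0
matches = math.isclose(shift, 1.0, rel_tol=0.0, abs_tol=1e-9)
print(matches)
```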
Generating a GRIB file
Once we have the GRIB fieldlist, it can be saved to disk using the to_target() method.
[6]:
ds_fl1.to_target("file", "_from_xr_1.grib")
ekd.from_source("file", "_from_xr_1.grib").head()
[6]:
| | centre | shortName | typeOfLevel | level | dataDate | dataTime | stepRange | dataType | number | gridType |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ecmf | r | isobaricInhPa | 500 | 20240603 | 0 | 0 | fc | 0 | regular_ll |
| 1 | ecmf | r | isobaricInhPa | 700 | 20240603 | 0 | 0 | fc | 0 | regular_ll |
| 2 | ecmf | r | isobaricInhPa | 500 | 20240603 | 0 | 6 | fc | 0 | regular_ll |
| 3 | ecmf | r | isobaricInhPa | 700 | 20240603 | 0 | 6 | fc | 0 | regular_ll |
| 4 | ecmf | r | isobaricInhPa | 500 | 20240603 | 1200 | 0 | fc | 0 | regular_ll |
It is also possible to write the Xarray dataset directly into a GRIB file by calling to_grib() on the earthkit accessor. This is more memory-efficient than generating a fieldlist first.
[7]:
ds.earthkit.to_grib("_from_xr_2.grib")
ekd.from_source("file", "_from_xr_2.grib").head()
[7]:
| | centre | shortName | typeOfLevel | level | dataDate | dataTime | stepRange | dataType | number | gridType |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ecmf | r | isobaricInhPa | 500 | 20240603 | 0 | 0 | fc | 0 | regular_ll |
| 1 | ecmf | r | isobaricInhPa | 700 | 20240603 | 0 | 0 | fc | 0 | regular_ll |
| 2 | ecmf | r | isobaricInhPa | 500 | 20240603 | 0 | 6 | fc | 0 | regular_ll |
| 3 | ecmf | r | isobaricInhPa | 700 | 20240603 | 0 | 6 | fc | 0 | regular_ll |
| 4 | ecmf | r | isobaricInhPa | 500 | 20240603 | 1200 | 0 | fc | 0 | regular_ll |