Xarray engine: writing back to GRIB¶
Warning: converting back Xarray to GRIB is an experimental feature and is not yet fully supported.
First, we get some example GRIB data and convert it into Xarray.
[1]:
import earthkit.data as ekd
ds_fl = ekd.from_source("sample", "pl.grib").to_fieldlist()
ds_xr = ds_fl.to_xarray()
ds_xr
[1]:
<xarray.Dataset> Size: 176kB
Dimensions: (forecast_reference_time: 4, step: 2, level: 2,
latitude: 19, longitude: 36)
Coordinates:
* forecast_reference_time (forecast_reference_time) datetime64[ns] 32B 202...
* step (step) timedelta64[ns] 16B 00:00:00 06:00:00
* level (level) int64 16B 500 700
* latitude (latitude) float64 152B 90.0 80.0 ... -80.0 -90.0
* longitude (longitude) float64 288B 0.0 10.0 ... 340.0 350.0
Data variables:
r (forecast_reference_time, step, level, latitude, longitude) float64 88kB ...
t (forecast_reference_time, step, level, latitude, longitude) float64 88kB ...
Attributes:
Conventions: CF-1.8
institution: ECMWFBy default, add_earthkit_attrs=True in to_xarray() and some special earthkit attributes are added to the dataset. This is needed for the Xarray to GRIB conversion. For this reason, if the Xarray is modified we must ensure the variable attributes are copied to the new Xarray dataset. By default, variable attributes are not kept in Xarray computations so we need to set the global Xarray keep_attrs option to enable it.
[2]:
# ensure earthkit attributes are set
import xarray as xr
xr.set_options(keep_attrs=True)
# modify values
ds_xr += 1
Using to_target()¶
It is possible to directly write the Xarray dataset created with the earthkit engine into a GRIB file with to_target(). This is a memory efficient way to write GRIB to disk since only one field is loaded into memory at a time. We can call to_target() either on the earthkit accessor or as a top level function.
First, we write a datarray into a GRIB file.
[3]:
# option1: writing to GRIB file using the accessor
ds_xr["t"].earthkit.to_target("file", "_from_xr_1.grib")
# option2: writing to GRIB file using the top level function
ekd.to_target("file", "_from_xr_1a.grib", data=ds_xr["t"])
# check the results
ds_tmp1 = ekd.from_source("file", "_from_xr_1.grib").to_fieldlist()
ds_tmp1[0]
[3]:
| number_of_values | 684 |
| array_type | ndarray |
| array_dtype | float64 |
| variable | t |
| standard_name | air_temperature |
| long_name | Temperature |
| units | kelvin |
| chem_variable | None |
| valid_datetime | 2024-06-03 00:00:00 |
| base_datetime | 2024-06-03 00:00:00 |
| step | 0:00:00 |
| level | 500 |
| layer | None |
| level_type | pressure |
| member | 0 |
| grid_spec | {'grid': [10, 10]} |
| grid_type | regular_ll |
| shape | (19, 36) |
| area | (90.0, 0.0, -90.0, 350.0) |
Next, we write the whole dataset into a GRIB file.
[4]:
# option1: writing to GRIB file using the accessor
ds_xr.earthkit.to_target("file", "_from_xr_2.grib")
# option2: writing to GRIB file using the top level function
ekd.to_target("file", "_from_xr_2a.grib", data=ds_xr)
# check the results
ds_tmp2 = ekd.from_source("file", "_from_xr_2.grib").to_fieldlist()
ds_tmp2.head()
[4]:
| parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type | |
|---|---|---|---|---|---|---|---|---|
| 0 | r | 2024-06-03 00:00:00 | 2024-06-03 00:00:00 | 0 days 00:00:00 | 500 | pressure | 0 | regular_ll |
| 1 | r | 2024-06-03 00:00:00 | 2024-06-03 00:00:00 | 0 days 00:00:00 | 700 | pressure | 0 | regular_ll |
| 2 | r | 2024-06-03 06:00:00 | 2024-06-03 00:00:00 | 0 days 06:00:00 | 500 | pressure | 0 | regular_ll |
| 3 | r | 2024-06-03 06:00:00 | 2024-06-03 00:00:00 | 0 days 06:00:00 | 700 | pressure | 0 | regular_ll |
| 4 | r | 2024-06-03 12:00:00 | 2024-06-03 12:00:00 | 0 days 00:00:00 | 500 | pressure | 0 | regular_ll |
We check if the computation results were correctly written to the generated GRIB data.
[5]:
# original GRIB data
print(ds_fl.sel({"parameter.variable": "t", "time.step": 0, "vertical.level": 500})[0].values[0])
# GRIB data converted from the modified xarray object
print(ds_tmp1.sel({"parameter.variable": "t", "time.step": 0, "vertical.level": 500})[0].values[0])
250.22500610351562
251.22500610351562
Using to_fieldlist()¶
We can also convert the Xarray dataset into a GRIB fieldlist by using to_fieldlist() on the earthkit accessor of the Xarray object. Please note that this will generate a fieldlist entirely stored in memory.
First, we convert a dataarray to a GRIB fieldlist.
[6]:
ds_fl1 = ds_xr["t"].earthkit.to_fieldlist()
ds_fl1.head()
[6]:
| parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type | |
|---|---|---|---|---|---|---|---|---|
| 0 | t | 2024-06-03 00:00:00 | 2024-06-03 00:00:00 | 0 days 00:00:00 | 500 | pressure | 0 | regular_ll |
| 1 | t | 2024-06-03 00:00:00 | 2024-06-03 00:00:00 | 0 days 00:00:00 | 700 | pressure | 0 | regular_ll |
| 2 | t | 2024-06-03 06:00:00 | 2024-06-03 00:00:00 | 0 days 06:00:00 | 500 | pressure | 0 | regular_ll |
| 3 | t | 2024-06-03 06:00:00 | 2024-06-03 00:00:00 | 0 days 06:00:00 | 700 | pressure | 0 | regular_ll |
| 4 | t | 2024-06-03 12:00:00 | 2024-06-03 12:00:00 | 0 days 00:00:00 | 500 | pressure | 0 | regular_ll |
Next, we convert back the whole dataset into a GRIB fieldlist.
[7]:
ds_fl2 = ds_xr.earthkit.to_fieldlist()
ds_fl2.head()
[7]:
| parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type | |
|---|---|---|---|---|---|---|---|---|
| 0 | r | 2024-06-03 00:00:00 | 2024-06-03 00:00:00 | 0 days 00:00:00 | 500 | pressure | 0 | regular_ll |
| 1 | r | 2024-06-03 00:00:00 | 2024-06-03 00:00:00 | 0 days 00:00:00 | 700 | pressure | 0 | regular_ll |
| 2 | r | 2024-06-03 06:00:00 | 2024-06-03 00:00:00 | 0 days 06:00:00 | 500 | pressure | 0 | regular_ll |
| 3 | r | 2024-06-03 06:00:00 | 2024-06-03 00:00:00 | 0 days 06:00:00 | 700 | pressure | 0 | regular_ll |
| 4 | r | 2024-06-03 12:00:00 | 2024-06-03 12:00:00 | 0 days 00:00:00 | 500 | pressure | 0 | regular_ll |
The generated GRIB fieldlist can be saved to disk using the to_target() method.
[8]:
out_name = "_from_xr_3.grib"
ds_fl1.to_target("file", out_name)
# read back and check the saved GRIB
ds_tmp = ekd.from_source("file", out_name).to_fieldlist()
ds_tmp[0]
[8]:
| number_of_values | 684 |
| array_type | ndarray |
| array_dtype | float64 |
| variable | t |
| standard_name | air_temperature |
| long_name | Temperature |
| units | kelvin |
| chem_variable | None |
| valid_datetime | 2024-06-03 00:00:00 |
| base_datetime | 2024-06-03 00:00:00 |
| step | 0:00:00 |
| level | 500 |
| layer | None |
| level_type | pressure |
| member | 0 |
| grid_spec | {'grid': [10, 10]} |
| grid_type | regular_ll |
| shape | (19, 36) |
| area | (90.0, 0.0, -90.0, 350.0) |
[ ]: