Xarray engine: writing back to GRIB¶

Warning: converting back Xarray to GRIB is an experimental feature and is not yet fully supported.

First, we get some example GRIB data and convert it into Xarray.

[1]:

import earthkit.data as ekd

ds_fl = ekd.from_source("sample", "pl.grib").to_fieldlist()
ds_xr = ds_fl.to_xarray()
ds_xr

By default, add_earthkit_attrs=True in to_xarray() and some special earthkit attributes are added to the dataset. This is needed for the Xarray to GRIB conversion. For this reason, if the Xarray is modified we must ensure the variable attributes are copied to the new Xarray dataset. By default, variable attributes are not kept in Xarray computations so we need to set the global Xarray keep_attrs option to enable it.

[2]:

# ensure earthkit attributes are set
import xarray as xr

xr.set_options(keep_attrs=True)

# modify values
ds_xr += 1

Using `to_target()`¶

It is possible to directly write the Xarray dataset created with the earthkit engine into a GRIB file with to_target(). This is a memory efficient way to write GRIB to disk since only one field is loaded into memory at a time. We can call to_target() either on the earthkit accessor or as a top level function.

First, we write a datarray into a GRIB file.

[3]:

# option1: writing to GRIB file using the accessor
ds_xr["t"].earthkit.to_target("file", "_from_xr_1.grib")

# option2: writing to GRIB file using the top level function
ekd.to_target("file", "_from_xr_1a.grib", data=ds_xr["t"])

# check the results
ds_tmp1 = ekd.from_source("file", "_from_xr_1.grib").to_fieldlist()
ds_tmp1[0]

[3]:

Field

data

number_of_values	684
array_type	ndarray
array_dtype	float64

parameter

variable	t
standard_name	air_temperature
long_name	Temperature
units	kelvin
chem_variable	None

time

valid_datetime	2024-06-03 00:00:00
base_datetime	2024-06-03 00:00:00
step	0:00:00

vertical

level	500
layer	None
level_type	pressure

ensemble

member

geography

grid_spec	{'grid': [10, 10]}
grid_type	regular_ll
shape	(19, 36)
area	(90.0, 0.0, -90.0, 350.0)

Next, we write the whole dataset into a GRIB file.

[4]:

# option1: writing to GRIB file using the accessor
ds_xr.earthkit.to_target("file", "_from_xr_2.grib")

# option2: writing to GRIB file using the top level function
ekd.to_target("file", "_from_xr_2a.grib", data=ds_xr)

# check the results
ds_tmp2 = ekd.from_source("file", "_from_xr_2.grib").to_fieldlist()
ds_tmp2.head()

[4]:

	parameter.variable	time.valid_datetime	time.base_datetime	time.step	vertical.level	vertical.level_type	geography.grid_type
0	r	2024-06-03 00:00:00	2024-06-03 00:00:00	0 days 00:00:00	500	pressure	regular_ll
1	r	2024-06-03 00:00:00	2024-06-03 00:00:00	0 days 00:00:00	700	pressure	regular_ll
2	r	2024-06-03 06:00:00	2024-06-03 00:00:00	0 days 06:00:00	500	pressure	regular_ll
3	r	2024-06-03 06:00:00	2024-06-03 00:00:00	0 days 06:00:00	700	pressure	regular_ll
4	r	2024-06-03 12:00:00	2024-06-03 12:00:00	0 days 00:00:00	500	pressure	regular_ll

We check if the computation results were correctly written to the generated GRIB data.

[5]:

# original GRIB data
print(ds_fl.sel({"parameter.variable": "t", "time.step": 0, "vertical.level": 500})[0].values[0])
# GRIB data converted from the modified xarray object
print(ds_tmp1.sel({"parameter.variable": "t", "time.step": 0, "vertical.level": 500})[0].values[0])

250.22500610351562
251.22500610351562

Using `to_fieldlist()`¶

We can also convert the Xarray dataset into a GRIB fieldlist by using to_fieldlist() on the earthkit accessor of the Xarray object. Please note that this will generate a fieldlist entirely stored in memory.

First, we convert a dataarray to a GRIB fieldlist.

[6]:

ds_fl1 = ds_xr["t"].earthkit.to_fieldlist()
ds_fl1.head()

[6]:

	parameter.variable	time.valid_datetime	time.base_datetime	time.step	vertical.level	vertical.level_type	geography.grid_type
0	t	2024-06-03 00:00:00	2024-06-03 00:00:00	0 days 00:00:00	500	pressure	regular_ll
1	t	2024-06-03 00:00:00	2024-06-03 00:00:00	0 days 00:00:00	700	pressure	regular_ll
2	t	2024-06-03 06:00:00	2024-06-03 00:00:00	0 days 06:00:00	500	pressure	regular_ll
3	t	2024-06-03 06:00:00	2024-06-03 00:00:00	0 days 06:00:00	700	pressure	regular_ll
4	t	2024-06-03 12:00:00	2024-06-03 12:00:00	0 days 00:00:00	500	pressure	regular_ll

Next, we convert back the whole dataset into a GRIB fieldlist.

[7]:

ds_fl2 = ds_xr.earthkit.to_fieldlist()
ds_fl2.head()

[7]:

	parameter.variable	time.valid_datetime	time.base_datetime	time.step	vertical.level	vertical.level_type	geography.grid_type
0	r	2024-06-03 00:00:00	2024-06-03 00:00:00	0 days 00:00:00	500	pressure	regular_ll
1	r	2024-06-03 00:00:00	2024-06-03 00:00:00	0 days 00:00:00	700	pressure	regular_ll
2	r	2024-06-03 06:00:00	2024-06-03 00:00:00	0 days 06:00:00	500	pressure	regular_ll
3	r	2024-06-03 06:00:00	2024-06-03 00:00:00	0 days 06:00:00	700	pressure	regular_ll
4	r	2024-06-03 12:00:00	2024-06-03 12:00:00	0 days 00:00:00	500	pressure	regular_ll

The generated GRIB fieldlist can be saved to disk using the to_target() method.

[8]:

out_name = "_from_xr_3.grib"
ds_fl1.to_target("file", out_name)
# read back and check the saved GRIB
ds_tmp = ekd.from_source("file", out_name).to_fieldlist()
ds_tmp[0]

[8]:

Field

data

number_of_values	684
array_type	ndarray
array_dtype	float64

parameter

variable	t
standard_name	air_temperature
long_name	Temperature
units	kelvin
chem_variable	None

time

valid_datetime	2024-06-03 00:00:00
base_datetime	2024-06-03 00:00:00
step	0:00:00

vertical

level	500
layer	None
level_type	pressure

ensemble

member

geography

grid_spec	{'grid': [10, 10]}
grid_type	regular_ll
shape	(19, 36)
area	(90.0, 0.0, -90.0, 350.0)

[ ]:

Xarray engine: writing back to GRIB¶

Using to_target()¶

Using to_fieldlist()¶

Using `to_target()`¶

Using `to_fieldlist()`¶