Xarray engine: remapping
[1]:
import earthkit.data as ekd
Remapping used to define a custom dimension
Let us consider a field list with 3 ensemble members, 1 control (cf) and 2 perturbed (pf), each at 2 forecast steps.
[2]:
ds_fl = ekd.from_source("sample", "ens_cf_pf.grib").to_fieldlist()
ds_fl.ls(extra_keys="metadata.dataType")
[2]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type | metadata.dataType |
|---|---|---|---|---|---|---|---|---|---|
| 0 | t | 2024-06-03 00:00:00 | 2024-06-03 | 0 days 00:00:00 | 500 | pressure | 0 | regular_ll | cf |
| 1 | t | 2024-06-03 06:00:00 | 2024-06-03 | 0 days 06:00:00 | 500 | pressure | 0 | regular_ll | cf |
| 2 | t | 2024-06-03 00:00:00 | 2024-06-03 | 0 days 00:00:00 | 500 | pressure | 1 | regular_ll | pf |
| 3 | t | 2024-06-03 00:00:00 | 2024-06-03 | 0 days 00:00:00 | 500 | pressure | 2 | regular_ll | pf |
| 4 | t | 2024-06-03 06:00:00 | 2024-06-03 | 0 days 06:00:00 | 500 | pressure | 1 | regular_ll | pf |
| 5 | t | 2024-06-03 06:00:00 | 2024-06-03 | 0 days 06:00:00 | 500 | pressure | 2 | regular_ll | pf |
Suppose we want to organise this field list along a custom dimension called "custom_member", whose coordinates are constructed by combining the metadata keys "metadata.dataType" and "ensemble.member", for example: ["cf_0", "pf_1", "pf_2"].
To achieve this, we use the remapping option to define a virtual key "custom_member", and declare "custom_member" as a new dimension with the extra_dims option.
[3]:
ds = ds_fl.to_xarray(
    remapping={"custom_member": "{metadata.dataType}_{ensemble.member}"},
    extra_dims="custom_member",
    add_earthkit_attrs=False,
)
ds
[3]:
<xarray.Dataset> Size: 33kB
Dimensions: (custom_member: 3, step: 2, latitude: 19, longitude: 36)
Coordinates:
* custom_member (custom_member) <U4 48B 'cf_0' 'pf_1' 'pf_2'
* step (step) timedelta64[ns] 16B 00:00:00 06:00:00
* latitude (latitude) float64 152B 90.0 80.0 70.0 ... -70.0 -80.0 -90.0
* longitude (longitude) float64 288B 0.0 10.0 20.0 ... 330.0 340.0 350.0
Data variables:
t (custom_member, step, latitude, longitude) float64 33kB ...
Attributes:
Conventions: CF-1.8
institution: ECMWF
Note that it is not necessary to explicitly remove the predefined dimension "member" using the drop_dims option. The Xarray engine drops it automatically because it is already incorporated into another dimension, in this case "custom_member".
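The way the "custom_member" coordinates are built can be pictured with a small pure-Python sketch. It is only an illustration and not how earthkit-data implements the remapping option: the helper apply_template and the hand-written fields list are hypothetical stand-ins for the metadata of the six fields listed earlier. Each {key} placeholder in the template is replaced by the value of that metadata key, and the distinct results become the coordinates of the new dimension.

```python
import re

# Hypothetical stand-in for the metadata of the six fields in the listing:
# one control member and two perturbed members, each at two steps.
fields = [
    {"metadata.dataType": "cf", "ensemble.member": 0, "time.step": "0h"},
    {"metadata.dataType": "cf", "ensemble.member": 0, "time.step": "6h"},
    {"metadata.dataType": "pf", "ensemble.member": 1, "time.step": "0h"},
    {"metadata.dataType": "pf", "ensemble.member": 2, "time.step": "0h"},
    {"metadata.dataType": "pf", "ensemble.member": 1, "time.step": "6h"},
    {"metadata.dataType": "pf", "ensemble.member": 2, "time.step": "6h"},
]


def apply_template(template, metadata):
    """Replace every {key} placeholder with str(metadata[key])."""
    return re.sub(r"\{([^{}]+)\}", lambda m: str(metadata[m.group(1)]), template)


template = "{metadata.dataType}_{ensemble.member}"

# One virtual key value per field ...
virtual_keys = [apply_template(template, f) for f in fields]

# ... and the distinct values play the role of the dimension coordinates.
custom_member = sorted(set(virtual_keys))
print(custom_member)  # ['cf_0', 'pf_1', 'pf_2']
```

Six fields collapse onto three distinct virtual key values, which matches the size of the "custom_member" dimension in the dataset above.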
Below, we present a more elaborate example illustrating how remapping can be used in conjunction with the extra_dims and dims_as_attrs options.
[4]:
ds2 = ds_fl.to_xarray(
    squeeze=True,
    remapping={
        "custom_member": "{metadata.dataType}_{ensemble.member}",
        "mars": "{metadata.class}_{metadata.stream}",
    },
    extra_dims=["custom_member", "mars"],
    dims_as_attrs="mars",
    add_earthkit_attrs=False,
)
ds2
[4]:
<xarray.Dataset> Size: 33kB
Dimensions: (custom_member: 3, step: 2, latitude: 19, longitude: 36)
Coordinates:
* custom_member (custom_member) <U4 48B 'cf_0' 'pf_1' 'pf_2'
* step (step) timedelta64[ns] 16B 00:00:00 06:00:00
* latitude (latitude) float64 152B 90.0 80.0 70.0 ... -70.0 -80.0 -90.0
* longitude (longitude) float64 288B 0.0 10.0 20.0 ... 330.0 340.0 350.0
Data variables:
t (custom_member, step, latitude, longitude) float64 33kB ...
Attributes:
Conventions: CF-1.8
institution: ECMWF
Above, we declared "mars" as a new dimension whose coordinates combine the "class" and "stream" metadata keys. Because this dimension has size 1, it is squeezed by default. However, the "dims_as_attrs" option causes the coordinate value of this dimension to be preserved as a variable attribute.
[5]:
ds2["t"].attrs
[5]:
{'standard_name': 'air_temperature',
'long_name': 'Temperature',
'units': 'kelvin',
'level_type': 'pressure',
'mars': 'od_enfo'}
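As a rough mental model of this behaviour (a sketch only, not the engine's actual code), a dimension with a single coordinate value is removed from the hypercube, and its value is retained as an attribute only when the dimension is named in dims_as_attrs. The function squeeze_dims and its inputs below are hypothetical.

```python
def squeeze_dims(coords, dims_as_attrs=()):
    """Split candidate dimensions into kept dimensions and attributes.

    coords maps a dimension name to the list of its distinct coordinate values.
    """
    kept, attrs = {}, {}
    for name, values in coords.items():
        if len(values) > 1:
            kept[name] = values      # more than one value: a real dimension
        elif name in dims_as_attrs:
            attrs[name] = values[0]  # size 1: squeezed, value kept as attribute
        # otherwise: size 1, squeezed and not recorded
    return kept, attrs


# Hypothetical candidate dimensions, typed in from the output above.
coords = {
    "custom_member": ["cf_0", "pf_1", "pf_2"],
    "mars": ["od_enfo"],
    "step": ["00:00:00", "06:00:00"],
}

kept, attrs = squeeze_dims(coords, dims_as_attrs=["mars"])
print(list(kept))  # ['custom_member', 'step']
print(attrs)       # {'mars': 'od_enfo'}
```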
Remapping used to define a custom variable name
The following GRIB dataset contains the parameters t and u on both pressure levels and hybrid (model) levels.
[6]:
ds_fl2 = ekd.from_source("sample", "mixed_pl_ml.grib").to_fieldlist()
ds_fl2.ls()
[6]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | t | 2024-06-03 00:00:00 | 2024-06-03 00:00:00 | 0 days 00:00:00 | 700 | pressure | 0 | regular_ll |
| 1 | u | 2024-06-03 00:00:00 | 2024-06-03 00:00:00 | 0 days 00:00:00 | 700 | pressure | 0 | regular_ll |
| 2 | t | 2024-06-03 00:00:00 | 2024-06-03 00:00:00 | 0 days 00:00:00 | 500 | pressure | 0 | regular_ll |
| 3 | u | 2024-06-03 00:00:00 | 2024-06-03 00:00:00 | 0 days 00:00:00 | 500 | pressure | 0 | regular_ll |
| 4 | t | 2024-06-03 06:00:00 | 2024-06-03 00:00:00 | 0 days 06:00:00 | 700 | pressure | 0 | regular_ll |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 59 | u | 2024-06-04 12:00:00 | 2024-06-04 12:00:00 | 0 days 00:00:00 | 137 | hybrid | 0 | regular_ll |
| 60 | t | 2024-06-04 18:00:00 | 2024-06-04 12:00:00 | 0 days 06:00:00 | 90 | hybrid | 0 | regular_ll |
| 61 | u | 2024-06-04 18:00:00 | 2024-06-04 12:00:00 | 0 days 06:00:00 | 90 | hybrid | 0 | regular_ll |
| 62 | t | 2024-06-04 18:00:00 | 2024-06-04 12:00:00 | 0 days 06:00:00 | 137 | hybrid | 0 | regular_ll |
| 63 | u | 2024-06-04 18:00:00 | 2024-06-04 12:00:00 | 0 days 06:00:00 | 137 | hybrid | 0 | regular_ll |
64 rows × 8 columns
When converting this field list into an Xarray dataset, we must handle the fact that the same variables are defined on two incompatible level types. One possible approach is to create a separate variable for each combination of parameter, level type and level, for example: "t_hybrid_90", "t_hybrid_137", "t_pressure_500", "t_pressure_700", and similarly for u.
[7]:
ds3 = ds_fl2.to_xarray(
    remapping={"my_custom_var_key": "{parameter.variable}_{vertical.level_type}_{vertical.level}"},
    variable_key="my_custom_var_key",
    add_earthkit_attrs=False,
)
ds3
[7]:
<xarray.Dataset> Size: 351kB
Dimensions: (forecast_reference_time: 4, step: 2,
latitude: 19, longitude: 36)
Coordinates:
* forecast_reference_time (forecast_reference_time) datetime64[ns] 32B 202...
* step (step) timedelta64[ns] 16B 00:00:00 06:00:00
* latitude (latitude) float64 152B 90.0 80.0 ... -80.0 -90.0
* longitude (longitude) float64 288B 0.0 10.0 ... 340.0 350.0
Data variables:
t_hybrid_137 (forecast_reference_time, step, latitude, longitude) float64 44kB ...
t_hybrid_90 (forecast_reference_time, step, latitude, longitude) float64 44kB ...
t_pressure_500 (forecast_reference_time, step, latitude, longitude) float64 44kB ...
t_pressure_700 (forecast_reference_time, step, latitude, longitude) float64 44kB ...
u_hybrid_137 (forecast_reference_time, step, latitude, longitude) float64 44kB ...
u_hybrid_90 (forecast_reference_time, step, latitude, longitude) float64 44kB ...
u_pressure_500 (forecast_reference_time, step, latitude, longitude) float64 44kB ...
u_pressure_700 (forecast_reference_time, step, latitude, longitude) float64 44kB ...
Attributes:
Conventions: CF-1.8
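institution: ECMWF
An alternative approach, which results in a more compact hypercube structure, is to give each level type its own dimension with level_dim_mode="level_per_type" and to build the variable name from the parameter and the level type only: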
[8]:
ds4 = ds_fl2.to_xarray(
    level_dim_mode="level_per_type",
    remapping={"my_custom_var_key": "{parameter.variable}_{vertical.level_type}"},
    variable_key="my_custom_var_key",
    add_earthkit_attrs=False,
)
ds4
[8]:
<xarray.Dataset> Size: 351kB
Dimensions: (forecast_reference_time: 4, step: 2, hybrid: 2,
latitude: 19, longitude: 36, pressure: 2)
Coordinates:
* forecast_reference_time (forecast_reference_time) datetime64[ns] 32B 202...
* step (step) timedelta64[ns] 16B 00:00:00 06:00:00
* hybrid (hybrid) int64 16B 90 137
* latitude (latitude) float64 152B 90.0 80.0 ... -80.0 -90.0
* longitude (longitude) float64 288B 0.0 10.0 ... 340.0 350.0
* pressure (pressure) int64 16B 500 700
Data variables:
t_hybrid (forecast_reference_time, step, hybrid, latitude, longitude) float64 88kB ...
t_pressure (forecast_reference_time, step, pressure, latitude, longitude) float64 88kB ...
u_hybrid (forecast_reference_time, step, hybrid, latitude, longitude) float64 88kB ...
u_pressure (forecast_reference_time, step, pressure, latitude, longitude) float64 88kB ...
Attributes:
Conventions: CF-1.8
institution: ECMWF
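To see why the second template gives a more compact result, the variable names produced by the two templates can be enumerated in plain Python. This is an illustrative sketch, not earthkit-data code: the combos list is typed in by hand from the field listing above, and f-strings stand in for the remapping templates.

```python
# (parameter, level type, level) combinations present in the sample data,
# written out by hand from the field listing (an assumption of this sketch).
combos = [
    (param, level_type, level)
    for param in ("t", "u")
    for level_type, levels in (("pressure", (500, 700)), ("hybrid", (90, 137)))
    for level in levels
]

# Template "{parameter.variable}_{vertical.level_type}_{vertical.level}"
per_level = sorted({f"{p}_{lt}_{lv}" for p, lt, lv in combos})

# Template "{parameter.variable}_{vertical.level_type}"
per_type = sorted({f"{p}_{lt}" for p, lt, lv in combos})

print(len(per_level), per_level)
print(len(per_type), per_type)  # 4 ['t_hybrid', 't_pressure', 'u_hybrid', 'u_pressure']
```

With the first template the level is encoded in each of the 8 variable names. With the second, only 4 variables remain and the levels become coordinates of the "hybrid" and "pressure" dimensions, as in the ds4 output above.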