Xarray engine: variable key

The variable_key option of to_xarray() controls which metadata key is used to name the variables of the Xarray Dataset. In the default profile, earthkit, it is set to "parameter.variable".

It is also possible to generate an Xarray Dataset containing a single DataArray that holds all the parameters of a fieldlist. See the "Xarray engine: mono variable" notebook for details.

[1]:
import earthkit.data as ekd

ds_fl = ekd.from_source("sample", "pl.grib").to_fieldlist()
ds = ds_fl.to_xarray()
ds

[1]:
<xarray.Dataset> Size: 176kB
Dimensions:                  (forecast_reference_time: 4, step: 2, level: 2,
                              latitude: 19, longitude: 36)
Coordinates:
  * forecast_reference_time  (forecast_reference_time) datetime64[ns] 32B 202...
  * step                     (step) timedelta64[ns] 16B 00:00:00 06:00:00
  * level                    (level) int64 16B 500 700
  * latitude                 (latitude) float64 152B 90.0 80.0 ... -80.0 -90.0
  * longitude                (longitude) float64 288B 0.0 10.0 ... 340.0 350.0
Data variables:
    r                        (forecast_reference_time, step, level, latitude, longitude) float64 88kB ...
    t                        (forecast_reference_time, step, level, latitude, longitude) float64 88kB ...
Attributes:
    Conventions:  CF-1.8
    institution:  ECMWF
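
As a rough mental model of what the variable key does, the sketch below groups a few fields by the value of one metadata key, so that each distinct value becomes one variable name. It uses plain Python only. The metadata dictionaries and the helper group_by_key() are hypothetical and are not part of earthkit-data; the actual engine may work differently.

```python
from collections import defaultdict

# Hypothetical per-field metadata, loosely modelled on the dataset above.
fields = [
    {"parameter.variable": "t", "vertical.level": 500},
    {"parameter.variable": "t", "vertical.level": 700},
    {"parameter.variable": "r", "vertical.level": 500},
    {"parameter.variable": "r", "vertical.level": 700},
]


def group_by_key(fields, variable_key):
    """Group field metadata by the value stored under variable_key.

    Illustrative helper only, not an earthkit-data function.
    """
    groups = defaultdict(list)
    for metadata in fields:
        groups[metadata[variable_key]].append(metadata)
    return dict(groups)


# Each distinct value of the chosen key plays the role of a variable name.
variables = group_by_key(fields, "parameter.variable")
print(sorted(variables))
```

In this toy model, the fields that share a value of the key end up in the same variable, and the keys that still differ within a group (here the level) are what the remaining dimensions have to cover.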

Using remapping

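The remapping option is useful when the input data contains parameters on different level types. It defines a new metadata key from a template of existing keys, and that new key can then be passed as the variable_key.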

[2]:
ds_fl = ekd.from_source("sample", "mixed_pl_sfc.grib").to_fieldlist()
ds = ds_fl.to_xarray(variable_key="p_l", remapping={"p_l": "{parameter.variable}_{vertical.level}"})
ds

[2]:
<xarray.Dataset> Size: 1MB
Dimensions:                  (forecast_reference_time: 4, step: 2,
                              latitude: 19, longitude: 36)
Coordinates:
  * forecast_reference_time  (forecast_reference_time) datetime64[ns] 32B 202...
  * step                     (step) timedelta64[ns] 16B 00:00:00 06:00:00
  * latitude                 (latitude) float64 152B 90.0 80.0 ... -80.0 -90.0
  * longitude                (longitude) float64 288B 0.0 10.0 ... 340.0 350.0
Data variables: (12/32)
    2t_0                     (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    msl_0                    (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    r_1000                   (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    r_300                    (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    r_400                    (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    r_500                    (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    ...                       ...
    z_1000                   (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    z_300                    (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    z_400                    (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    z_500                    (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    z_700                    (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    z_850                    (forecast_reference_time, step, latitude, longitude) float64 44kB ...
Attributes:
    Conventions:  CF-1.8
    institution:  ECMWF
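
The template used in the remapping above can be read as a simple placeholder substitution. The following sketch reproduces that idea in plain Python for a single field. The metadata values and the helper expand_template() are assumptions made for illustration and do not reflect how earthkit-data implements remapping internally.

```python
import re

# Hypothetical metadata of one field; the values are made up for illustration.
field_metadata = {
    "parameter.variable": "r",
    "vertical.level": 500,
}

# Matches "{some.key}" and captures the key name between the braces.
_PLACEHOLDER = re.compile(r"\{([^{}]+)\}")


def expand_template(template, metadata):
    """Replace every {key} in template with str(metadata[key]).

    Illustrative helper only, not an earthkit-data function.
    """
    return _PLACEHOLDER.sub(lambda match: str(metadata[match.group(1)]), template)


# Same shape as the remapping argument passed to to_xarray() above.
remapping = {"p_l": "{parameter.variable}_{vertical.level}"}

names = {key: expand_template(template, field_metadata) for key, template in remapping.items()}
print(names)
```

A regular expression is used here instead of str.format() because str.format() would interpret the dot in a name such as parameter.variable as attribute access.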

We can take it one step further and define a metadata key that combines the parameter name, the level and the level type into a single key.

[3]:
ds_fl = ekd.from_source("sample", "mixed_pl_sfc.grib").to_fieldlist()
ds = ds_fl.to_xarray(
    variable_key="p_l_t", remapping={"p_l_t": "{parameter.variable}_{vertical.level}_{vertical.level_type}"}
)
ds

[3]:
<xarray.Dataset> Size: 1MB
Dimensions:                  (forecast_reference_time: 4, step: 2,
                              latitude: 19, longitude: 36)
Coordinates:
  * forecast_reference_time  (forecast_reference_time) datetime64[ns] 32B 202...
  * step                     (step) timedelta64[ns] 16B 00:00:00 06:00:00
  * latitude                 (latitude) float64 152B 90.0 80.0 ... -80.0 -90.0
  * longitude                (longitude) float64 288B 0.0 10.0 ... 340.0 350.0
Data variables: (12/32)
    2t_0_surface             (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    msl_0_surface            (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    r_1000_pressure          (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    r_300_pressure           (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    r_400_pressure           (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    r_500_pressure           (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    ...                       ...
    z_1000_pressure          (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    z_300_pressure           (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    z_400_pressure           (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    z_500_pressure           (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    z_700_pressure           (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    z_850_pressure           (forecast_reference_time, step, latitude, longitude) float64 44kB ...
Attributes:
    Conventions:  CF-1.8
    institution:  ECMWF

This technique is particularly useful when the same parameter is available on multiple level types in the input data. In that case, using just the parameter name as the variable_key (the default) does not result in a full hypercube, whereas the remappings used above do.

[4]:
ds_fl = ekd.from_source("sample", "mixed_pl_ml.grib").to_fieldlist()
ds = ds_fl.to_xarray(
    variable_key="p_l_t", remapping={"p_l_t": "{parameter.variable}_{vertical.level}_{vertical.level_type}"}
)
ds

[4]:
<xarray.Dataset> Size: 351kB
Dimensions:                  (forecast_reference_time: 4, step: 2,
                              latitude: 19, longitude: 36)
Coordinates:
  * forecast_reference_time  (forecast_reference_time) datetime64[ns] 32B 202...
  * step                     (step) timedelta64[ns] 16B 00:00:00 06:00:00
  * latitude                 (latitude) float64 152B 90.0 80.0 ... -80.0 -90.0
  * longitude                (longitude) float64 288B 0.0 10.0 ... 340.0 350.0
Data variables:
    t_137_hybrid             (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    t_500_pressure           (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    t_700_pressure           (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    t_90_hybrid              (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    u_137_hybrid             (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    u_500_pressure           (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    u_700_pressure           (forecast_reference_time, step, latitude, longitude) float64 44kB ...
    u_90_hybrid              (forecast_reference_time, step, latitude, longitude) float64 44kB ...
Attributes:
    Conventions:  CF-1.8
    institution:  ECMWF
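
To make the hypercube argument concrete, the sketch below checks, in plain Python, whether a set of fields covers every combination of its coordinate values. The field list mimics the example above (t and u on two pressure and two hybrid levels) but is written by hand. The helper is_full_hypercube() is hypothetical, and the sketch assumes that level and level type both remain as dimensions when only the parameter name is used as the variable key.

```python
from itertools import product

# Hand-written (parameter, level, level_type) triples mimicking the data above.
vertical = [(500, "pressure"), (700, "pressure"), (90, "hybrid"), (137, "hybrid")]
fields = [(param, level, level_type) for param in ("t", "u") for level, level_type in vertical]


def is_full_hypercube(points):
    """Return True if points cover every combination of their per-axis values.

    Illustrative helper only, not an earthkit-data function.
    """
    points = set(points)
    axes = [set(axis) for axis in zip(*points)]
    return points == set(product(*axes))


# Parameter name as variable key: level and level_type are left as dimensions
# (an assumption), and only 4 of the 4 x 2 combinations are present.
by_param = {}
for param, level, level_type in fields:
    by_param.setdefault(param, []).append((level, level_type))
default_check = {name: is_full_hypercube(points) for name, points in by_param.items()}

# Combined key as variable key: every variable holds exactly one vertical
# position, so no vertical dimension is left to fill.
by_combined = {}
for param, level, level_type in fields:
    by_combined.setdefault(f"{param}_{level}_{level_type}", []).append(())
remapped_check = {name: is_full_hypercube(points) for name, points in by_combined.items()}

print(default_check)
print(all(remapped_check.values()))
```

The first check fails for both parameters because combinations such as level 500 on a hybrid level type do not exist in the data, while the second check passes for all eight combined variable names.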