Xarray engine: extra dimensions
Quantiles in a probabilistic forecast
[1]:
import earthkit.data as ekd
Let us now consider a probabilistic forecast of 2-metre temperature.
[2]:
fl = ekd.from_source("sample", "quantiles_pd.grib").to_fieldlist()
In this dataset, the fields are indexed by the GRIB metadata key "quantile".
[3]:
fl.ls(
keys=[
"parameter.variable",
"time.base_datetime",
"time.step",
"ensemble.member",
"metadata.number",
"metadata.numberOfForecastsInEnsemble",
"metadata.quantile",
]
)
[3]:
| | parameter.variable | time.base_datetime | time.step | ensemble.member | metadata.number | metadata.numberOfForecastsInEnsemble | metadata.quantile |
|---|---|---|---|---|---|---|---|
| 0 | 2tp | 2025-12-09 | 7 days | 1 | 1 | 3 | 1:3 |
| 1 | 2tp | 2025-12-09 | 7 days | 1 | 1 | 5 | 1:5 |
| 2 | 2tp | 2025-12-09 | 7 days | 1 | 1 | 10 | 1:10 |
| 3 | 2tp | 2025-12-09 | 7 days | 2 | 2 | 3 | 2:3 |
| 4 | 2tp | 2025-12-09 | 7 days | 2 | 2 | 5 | 2:5 |
| 5 | 2tp | 2025-12-09 | 7 days | 2 | 2 | 10 | 2:10 |
| 6 | 2tp | 2025-12-09 | 7 days | 3 | 3 | 3 | 3:3 |
| 7 | 2tp | 2025-12-09 | 7 days | 3 | 3 | 5 | 3:5 |
| 8 | 2tp | 2025-12-09 | 7 days | 3 | 3 | 10 | 3:10 |
| 9 | 2tp | 2025-12-09 | 7 days | 4 | 4 | 5 | 4:5 |
| 10 | 2tp | 2025-12-09 | 7 days | 4 | 4 | 10 | 4:10 |
| 11 | 2tp | 2025-12-09 | 7 days | 5 | 5 | 5 | 5:5 |
| 12 | 2tp | 2025-12-09 | 7 days | 5 | 5 | 10 | 5:10 |
| 13 | 2tp | 2025-12-09 | 7 days | 6 | 6 | 10 | 6:10 |
| 14 | 2tp | 2025-12-09 | 7 days | 7 | 7 | 10 | 7:10 |
| 15 | 2tp | 2025-12-09 | 7 days | 8 | 8 | 10 | 8:10 |
| 16 | 2tp | 2025-12-09 | 7 days | 9 | 9 | 10 | 9:10 |
| 17 | 2tp | 2025-12-09 | 7 days | 10 | 10 | 10 | 10:10 |
By default, the ensemble dimension "member" is derived from the "ensemble.member" key, which is in turn extracted from the GRIB key "number".
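In the GRIB listing above we can see that the usual meaning of the GRIB key "number" (and of the related "numberOfForecastsInEnsemble") is overridden by "quantile": the two values together form the quantile label, so a "number" of 1 with a "numberOfForecastsInEnsemble" of 3 appears as "1:3". As a result, the ensemble dimension "member" is no longer applicable.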
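The structure of these labels can be sketched in plain Python. This is only an illustration and not part of earthkit-data: `parse_quantile` is a hypothetical helper, and reading a label "k:n" as the fraction k/n is an assumption about what the labels mean rather than something the listing states.

```python
from fractions import Fraction


def parse_quantile(label):
    """Split a 'k:n' label into the integers (k, n). Hypothetical helper."""
    k, n = label.split(":")
    return int(k), int(n)


# A few labels taken from the "metadata.quantile" column of the listing.
labels = ["10:10", "1:10", "2:3", "1:3", "4:5"]

# Sort numerically by n, then by k. Plain string sorting would instead
# place "10:10" before "1:10".
ordered = sorted(labels, key=lambda s: parse_quantile(s)[::-1])

# Assumed interpretation: "k:n" read as the fraction k/n.
fractions = {label: Fraction(*parse_quantile(label)) for label in labels}

print(ordered)
```

Because the labels are strings, any coordinate built from them is ordered as text; a numeric key like the one above is one way to recover the natural order if it is needed.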
For this reason, we must:
- declare "quantile" as an extra dimension, and
- remove the predefined ensemble dimension "member", since it would otherwise conflict with the "quantile" dimension.
[4]:
ds = fl.to_xarray(
squeeze=False,
extra_dims="metadata.quantile",
drop_dims="member",
add_earthkit_attrs=False,
)
ds
[4]:
<xarray.Dataset> Size: 13kB
Dimensions: (quantile: 18, forecast_reference_time: 1,
step: 1, level: 1, level_type: 1, latitude: 7,
longitude: 12)
Coordinates:
* quantile (quantile) <U5 360B '10:10' '1:10' ... '9:10'
* forecast_reference_time (forecast_reference_time) datetime64[ns] 8B 2025...
* step (step) timedelta64[ns] 8B 7 days
* level (level) int64 8B 0
* level_type (level_type) <U7 28B 'surface'
* latitude (latitude) float64 56B 90.0 60.0 ... -60.0 -90.0
* longitude (longitude) float64 96B 0.0 30.0 ... 300.0 330.0
Data variables:
2tp (quantile, forecast_reference_time, step, level, level_type, latitude, longitude) float64 12kB ...
Attributes:
Conventions: CF-1.8
institution: ECMWF
The option ensure_dims vs extra_dims
The extra_dims and ensure_dims options partially overlap. When introducing a new dimension that must not be squeezed, it is sufficient to list it in ensure_dims; there is no need to repeat the same dimension in extra_dims.
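The interaction between squeezing and ensure_dims can be pictured with a toy model. The sketch below is not earthkit-data's implementation: `squeeze_dims` is a hypothetical function that assumes squeezing simply drops every dimension of size 1 unless it is explicitly protected. The dimension sizes are those of this dataset after selecting a single quantile.

```python
def squeeze_dims(sizes, squeeze=True, ensure_dims=()):
    """Toy model: drop size-1 dimensions unless they are listed in ensure_dims."""
    if not squeeze:
        return dict(sizes)
    return {
        name: size
        for name, size in sizes.items()
        if size > 1 or name in ensure_dims
    }


# Dimension sizes for a selection that contains a single quantile.
sizes = {
    "quantile": 1,
    "forecast_reference_time": 1,
    "step": 1,
    "level": 1,
    "level_type": 1,
    "latitude": 7,
    "longitude": 12,
}

# "quantile" survives although its size is 1; the other size-1 dimensions go.
kept = squeeze_dims(sizes, ensure_dims=("quantile",))

# Without protection, "quantile" is squeezed away as well.
unprotected = squeeze_dims(sizes)

print(kept)
print(unprotected)
```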
[5]:
ds2 = fl.sel({"metadata.quantile": "2:3"}).to_xarray(
squeeze=True,
ensure_dims="metadata.quantile",
drop_dims="member",
add_earthkit_attrs=False,
)
ds2
[5]:
<xarray.Dataset> Size: 836B
Dimensions: (quantile: 1, latitude: 7, longitude: 12)
Coordinates:
* quantile (quantile) <U3 12B '2:3'
* latitude (latitude) float64 56B 90.0 60.0 30.0 0.0 -30.0 -60.0 -90.0
* longitude (longitude) float64 96B 0.0 30.0 60.0 90.0 ... 270.0 300.0 330.0
Data variables:
2tp (quantile, latitude, longitude) float64 672B ...
Attributes:
Conventions: CF-1.8
institution: ECMWF