Xarray engine: temporal dimensions

We can convert a GRIB fieldlist to Xarray with to_xarray(). This notebook discusses the temporal options used by this method.

First, we get some forecast GRIB data defined on pressure levels and read it into a fieldlist.

[1]:
import earthkit.data as ekd

ds_fl = ekd.from_source("sample", "pl.grib").to_fieldlist()

Temporal dimensions

The temporal dimensions are based on the dim_roles and time_dims options. The dim_roles are a mapping between predefined dimension roles and metadata keys used to build the given dimensions. See Predefined dimensions and dimension roles for details:

With regards to the temporal structure the following roles are involved:

  • “forecast_reference_time”

  • “step”

  • “valid_time”

  • “date”

  • “time”

These roles can be mapped to different metadata keys according to the profile (see the profiles).

time_dims=[“forecast_reference_time”, “step”]

When time_dims=["forecast_reference_time", "step"] the the dimension “forecast_reference_time” and “step” are used.

[2]:
# default mode
ds = ds_fl.to_xarray(time_dims=["forecast_reference_time", "step"])
ds
[2]:
<xarray.Dataset> Size: 176kB
Dimensions:                  (forecast_reference_time: 4, step: 2, level: 2,
                              latitude: 19, longitude: 36)
Coordinates:
  * forecast_reference_time  (forecast_reference_time) datetime64[ns] 32B 202...
  * step                     (step) timedelta64[ns] 16B 00:00:00 06:00:00
  * level                    (level) int64 16B 500 700
  * latitude                 (latitude) float64 152B 90.0 80.0 ... -80.0 -90.0
  * longitude                (longitude) float64 288B 0.0 10.0 ... 340.0 350.0
Data variables:
    r                        (forecast_reference_time, step, level, latitude, longitude) float64 88kB ...
    t                        (forecast_reference_time, step, level, latitude, longitude) float64 88kB ...
Attributes:
    Conventions:  CF-1.8
    institution:  ECMWF

time_dims=”valid_time”

When time_dims=["valid_time"] the only temporal dimension is “valid_time”. This dimension can only be generated if each field has a distinct valid time, so it typically fits for analysis/climate data.

[3]:
ds_fl_an = ekd.from_source("sample", "msl_analysis.grib")
ds_xr = ds_fl_an.to_xarray(time_dims=["valid_time"])
ds_xr

[3]:
<xarray.Dataset> Size: 6kB
Dimensions:     (valid_time: 8, latitude: 7, longitude: 12)
Coordinates:
  * valid_time  (valid_time) datetime64[ns] 64B 2016-09-25 ... 2016-09-26T18:...
  * latitude    (latitude) float64 56B 80.0 70.0 60.0 50.0 40.0 30.0 20.0
  * longitude   (longitude) float64 96B -70.0 -60.0 -50.0 ... 20.0 30.0 40.0
Data variables:
    msl         (valid_time, latitude, longitude) float64 5kB ...
Attributes:
    Conventions:  CF-1.8
    institution:  ECMWF

This mode can also be used for suitable forecasts data. To use it for the original forecast data first we need to filter it.

[4]:
ds = ds_fl.sel({"time.valid_datetime": ["2024-06-03T00", "2024-06-03T06"]}).to_xarray(time_dims=["valid_time"])
ds
[4]:
<xarray.Dataset> Size: 44kB
Dimensions:     (valid_time: 2, level: 2, latitude: 19, longitude: 36)
Coordinates:
  * valid_time  (valid_time) datetime64[ns] 16B 2024-06-03 2024-06-03T06:00:00
  * level       (level) int64 16B 500 700
  * latitude    (latitude) float64 152B 90.0 80.0 70.0 ... -70.0 -80.0 -90.0
  * longitude   (longitude) float64 288B 0.0 10.0 20.0 ... 330.0 340.0 350.0
Data variables:
    r           (valid_time, level, latitude, longitude) float64 22kB ...
    t           (valid_time, level, latitude, longitude) float64 22kB ...
Attributes:
    Conventions:  CF-1.8
    institution:  ECMWF

time_dims=”raw”

When time_dims=["date", "time", "step"] the “date”, “time” and “step” roles are used to form the temporal dimensions.

[5]:
ds = ds_fl.to_xarray(time_dims=["date", "time", "step"])
ds
[5]:
<xarray.Dataset> Size: 176kB
Dimensions:    (date: 2, time: 2, step: 2, level: 2, latitude: 19, longitude: 36)
Coordinates:
  * date       (date) datetime64[ns] 16B 2024-06-03 2024-06-04
  * time       (time) timedelta64[ns] 16B 00:00:00 12:00:00
  * step       (step) timedelta64[ns] 16B 00:00:00 06:00:00
  * level      (level) int64 16B 500 700
  * latitude   (latitude) float64 152B 90.0 80.0 70.0 60.0 ... -70.0 -80.0 -90.0
  * longitude  (longitude) float64 288B 0.0 10.0 20.0 30.0 ... 330.0 340.0 350.0
Data variables:
    r          (date, time, step, level, latitude, longitude) float64 88kB ...
    t          (date, time, step, level, latitude, longitude) float64 88kB ...
Attributes:
    Conventions:  CF-1.8
    institution:  ECMWF

Adding valid_time coord

When add_valid_time_coord=True it adds the coordinate valid_time containing the valid times for all the different temporal dimensions as datetime64. When time_dims=["valid_time"] this coordinate is always added irrespective of the value of add_valid_time_coord.

[6]:
ds = ds_fl.to_xarray(time_dims=["date", "time", "step"], add_valid_time_coord=True)
ds
[6]:
<xarray.Dataset> Size: 176kB
Dimensions:     (date: 2, time: 2, step: 2, level: 2, latitude: 19,
                 longitude: 36)
Coordinates:
  * date        (date) datetime64[ns] 16B 2024-06-03 2024-06-04
  * time        (time) timedelta64[ns] 16B 00:00:00 12:00:00
  * step        (step) timedelta64[ns] 16B 00:00:00 06:00:00
    valid_time  (date, time, step) datetime64[ns] 64B ...
  * level       (level) int64 16B 500 700
  * latitude    (latitude) float64 152B 90.0 80.0 70.0 ... -70.0 -80.0 -90.0
  * longitude   (longitude) float64 288B 0.0 10.0 20.0 ... 330.0 340.0 350.0
Data variables:
    r           (date, time, step, level, latitude, longitude) float64 88kB ...
    t           (date, time, step, level, latitude, longitude) float64 88kB ...
Attributes:
    Conventions:  CF-1.8
    institution:  ECMWF

Decoding temporal coords

When decode_times=True (the default) the following coordinates will be stored as datetime64:

  • coordinates representing the date-like roles (“e.g. “date”) or keys (e.g. “metadata.dataDate”)

  • datetime coordinates (e.g. “forecast_reference_time” etc.)

When decode_timedelta=True (the default) the following coordinates will be stored as timedelta64:

  • coordinates representing the time-like roles (e.g. “time”) or keys (e.g. “metadata.time”)

  • duration-like coordinates (e.g. “step”)

[7]:
ds = ds_fl.to_xarray(time_dims=["date", "time", "step"], decode_times=True, decode_timedelta=True)
ds.coords
[7]:
Coordinates:
  * date       (date) datetime64[ns] 16B 2024-06-03 2024-06-04
  * time       (time) timedelta64[ns] 16B 00:00:00 12:00:00
  * step       (step) timedelta64[ns] 16B 00:00:00 06:00:00
  * level      (level) int64 16B 500 700
  * latitude   (latitude) float64 152B 90.0 80.0 70.0 60.0 ... -70.0 -80.0 -90.0
  * longitude  (longitude) float64 288B 0.0 10.0 20.0 30.0 ... 330.0 340.0 350.0