Xarray engine: temporal options

We can convert a GRIB fieldlist to Xarray with to_xarray(). This notebook discusses the temporal options used by this method.

First, we get some forecast GRIB data defined on pressure levels and read it into a GRIB fieldlist.

[1]:
import earthkit.data as ekd
ds_fl = ekd.from_source("sample", "pl.grib")

Temporal dimensions

The temporal dimensions are based on the dim_roles and time_dim_mode options. The dim_roles are a mapping between predefined dimension roles and metadata keys used to build the given dimensions. With regards to the temporal structure the following roles are involved:

  • “date”: metadata key interpreted as date part of the “forecast_reference_time”. By default, mapped to the “date” key.

  • “time”: metadata key interpreted as time part of the “forecast_reference_time”. By default, mapped to the “time” key.

  • “step”: metadata key interpreted as the forecast step. By default, mapped to the “step_timedelta” metadata key that is built from the “endStep” key.

  • “forecast_reference_time”: by default, built using the “date” and “time” roles

  • “valid_time”: by default, built using the “validityDate” and “validityTime” keys

These roles can be mapped to different GRIB keys according to the profile, but for the current built-in profiles the actual mappings are the same as the default (see the profiles).

time_dim_mode=raw

When time_dim_mode="raw" the “date”, “time” and “step” roles are used to form the temporal dimensions.

[2]:
ds = ds_fl.to_xarray(time_dim_mode="raw")
ds
[2]:
<xarray.Dataset> Size: 176kB
Dimensions:    (date: 2, time: 2, step: 2, level: 2, latitude: 19, longitude: 36)
Coordinates:
  * date       (date) datetime64[ns] 16B 2024-06-03 2024-06-04
  * time       (time) timedelta64[ns] 16B 00:00:00 12:00:00
  * step       (step) timedelta64[ns] 16B 00:00:00 06:00:00
  * level      (level) int64 16B 500 700
  * latitude   (latitude) float64 152B 90.0 80.0 70.0 60.0 ... -70.0 -80.0 -90.0
  * longitude  (longitude) float64 288B 0.0 10.0 20.0 30.0 ... 330.0 340.0 350.0
Data variables:
    r          (date, time, step, level, latitude, longitude) float64 88kB ...
    t          (date, time, step, level, latitude, longitude) float64 88kB ...
Attributes:
    class:        od
    stream:       oper
    levtype:      pl
    type:         fc
    expver:       0001
    domain:       g
    number:       0
    Conventions:  CF-1.8
    institution:  ECMWF

time_dim_mode=forecast

When time_dim_mode="forecast" the “date” and “time” roles are merged to form the dimension “forecats_reference_time”. It also adds the “step” dimension.

[3]:
# default mode
ds = ds_fl.to_xarray(time_dim_mode="forecast")
ds
[3]:
<xarray.Dataset> Size: 176kB
Dimensions:                  (forecast_reference_time: 4, step: 2, level: 2,
                              latitude: 19, longitude: 36)
Coordinates:
  * forecast_reference_time  (forecast_reference_time) datetime64[ns] 32B 202...
  * step                     (step) timedelta64[ns] 16B 00:00:00 06:00:00
  * level                    (level) int64 16B 500 700
  * latitude                 (latitude) float64 152B 90.0 80.0 ... -80.0 -90.0
  * longitude                (longitude) float64 288B 0.0 10.0 ... 340.0 350.0
Data variables:
    r                        (forecast_reference_time, step, level, latitude, longitude) float64 88kB ...
    t                        (forecast_reference_time, step, level, latitude, longitude) float64 88kB ...
Attributes:
    class:        od
    stream:       oper
    levtype:      pl
    type:         fc
    expver:       0001
    date:         20240603
    time:         0
    domain:       g
    number:       0
    Conventions:  CF-1.8
    institution:  ECMWF

time_dim_mode=valid_time

When time_dim_mode="valid_time" the only temporal dimension is “valid_time”. By default, it is built from the values of the “validityDate” and “validityTime” ecCodes GRIB keys. This dimension can only be generated if each GRIB field has a distinct valid time, so it typically fits for analysis/climate data.

[4]:
ds_fl_an = ekd.from_source("sample", "msl_analysis.grib")
ds_xr = ds_fl_an.to_xarray(time_dim_mode="valid_time")
ds_xr
[4]:
<xarray.Dataset> Size: 6kB
Dimensions:     (valid_time: 8, latitude: 7, longitude: 12)
Coordinates:
  * valid_time  (valid_time) datetime64[ns] 64B 2016-09-25 ... 2016-09-26T18:...
  * latitude    (latitude) float64 56B 80.0 70.0 60.0 50.0 40.0 30.0 20.0
  * longitude   (longitude) float64 96B -70.0 -60.0 -50.0 ... 20.0 30.0 40.0
Data variables:
    msl         (valid_time, latitude, longitude) float64 5kB ...
Attributes: (12/13)
    param:        msl
    paramId:      151
    class:        od
    stream:       oper
    levtype:      sfc
    type:         an
    ...           ...
    date:         20160925
    time:         0
    domain:       g
    number:       0
    Conventions:  CF-1.8
    institution:  ECMWF

This mode can also be used for suitable forecasts data. To use it for the original forecast data first we need to filter it.

[5]:
ds = ds_fl.sel(date=20240603, time=0).to_xarray(time_dim_mode="valid_time")
ds
[5]:
<xarray.Dataset> Size: 44kB
Dimensions:     (valid_time: 2, level: 2, latitude: 19, longitude: 36)
Coordinates:
  * valid_time  (valid_time) datetime64[ns] 16B 2024-06-03 2024-06-03T06:00:00
  * level       (level) int64 16B 500 700
  * latitude    (latitude) float64 152B 90.0 80.0 70.0 ... -70.0 -80.0 -90.0
  * longitude   (longitude) float64 288B 0.0 10.0 20.0 ... 330.0 340.0 350.0
Data variables:
    r           (valid_time, level, latitude, longitude) float64 22kB ...
    t           (valid_time, level, latitude, longitude) float64 22kB ...
Attributes:
    class:        od
    stream:       oper
    levtype:      pl
    type:         fc
    expver:       0001
    date:         20240603
    time:         0
    domain:       g
    number:       0
    Conventions:  CF-1.8
    institution:  ECMWF

Adding valid_time coord

When add_valid_time_coord=True it adds the coordinate valid_time containing the valid times for all the different temporal dimensions as datetime64. When time_dim_mode="valid_time" this coordinate is always added irrespective of the value of add_valid_time_coord.

[6]:
ds = ds_fl.to_xarray(time_dim_mode="raw", add_valid_time_coord=True)
ds
[6]:
<xarray.Dataset> Size: 176kB
Dimensions:     (date: 2, time: 2, step: 2, level: 2, latitude: 19,
                 longitude: 36)
Coordinates:
  * date        (date) datetime64[ns] 16B 2024-06-03 2024-06-04
  * time        (time) timedelta64[ns] 16B 00:00:00 12:00:00
  * step        (step) timedelta64[ns] 16B 00:00:00 06:00:00
  * level       (level) int64 16B 500 700
    valid_time  (date, time, step) datetime64[ns] 64B ...
  * latitude    (latitude) float64 152B 90.0 80.0 70.0 ... -70.0 -80.0 -90.0
  * longitude   (longitude) float64 288B 0.0 10.0 20.0 ... 330.0 340.0 350.0
Data variables:
    r           (date, time, step, level, latitude, longitude) float64 88kB ...
    t           (date, time, step, level, latitude, longitude) float64 88kB ...
Attributes:
    class:        od
    stream:       oper
    levtype:      pl
    type:         fc
    expver:       0001
    domain:       g
    number:       0
    Conventions:  CF-1.8
    institution:  ECMWF

Decoding temporal coords

When decode_times=True (the default) the follwing coordinates will be stored as datetime64:

  • coordinates representing the date-like roles or GRIB keys (e.g. “date”, “validityDate” etc.)

  • datetime coordinates (e.g. “forecast_reference_time” etc.)

When decode_timedelta=True (the default) the following coordinates will be stored as timedelta64:

  • coordinates representing the time-like roles or GRIB keys (e.g. “time”, “validityTime” etc.)

  • duration-like coordinates (e.g. “step”, “endStep”)

[7]:
ds = ds_fl.to_xarray(time_dim_mode="raw", decode_times=True, decode_timedelta=True)
ds.coords
[7]:
Coordinates:
  * date       (date) datetime64[ns] 16B 2024-06-03 2024-06-04
  * time       (time) timedelta64[ns] 16B 00:00:00 12:00:00
  * step       (step) timedelta64[ns] 16B 00:00:00 06:00:00
  * level      (level) int64 16B 500 700
  * latitude   (latitude) float64 152B 90.0 80.0 70.0 60.0 ... -70.0 -80.0 -90.0
  * longitude  (longitude) float64 288B 0.0 10.0 20.0 30.0 ... 330.0 340.0 350.0

When decode_times=False the following rules apply:

  • coordinates representing date-like GRIB keys (e.g. “date”, “validityDate” etc.) will store the native GRIB int values (as yyyymmdd)

  • datetime coordinates (e.g. “forecast_reference_time” etc.) will store datetime64 values

When decode_timedelta=False the following rules apply:

  • coordinates representing the time-like GRIB keys (e.g. “time”, “validityTime” etc.) will store the native GRIB int values (as 100*hours + minutes)

  • duration-like (e.g. “step”) coordinates will store int values with units indicated by the coordinate attribute “units”

[8]:
ds = ds_fl.to_xarray(time_dim_mode="raw", decode_times=False, decode_timedelta=False)
ds.coords
[8]:
Coordinates:
  * date       (date) int64 16B 20240603 20240604
  * time       (time) int64 16B 0 1200
  * step       (step) int64 16B 0 6
  * level      (level) int64 16B 500 700
  * latitude   (latitude) float64 152B 90.0 80.0 70.0 60.0 ... -70.0 -80.0 -90.0
  * longitude  (longitude) float64 288B 0.0 10.0 20.0 30.0 ... 330.0 340.0 350.0
[9]:
ds.coords["step"].attrs
[9]:
{'units': 'hours'}
[ ]: