Dimensions

One of the most important aspects of the Xarray engine is how it generates the dimensions of the Xarray dataset created by to_xarray().

Predefined dimensions and dimension roles

By default, the following predefined dimensions are generated, in the following order:

  • ensemble forecast member dimension

  • temporal dimensions, controlled by time_dims (see "Temporal dimensions" below)

  • vertical dimensions, controlled by level_dim_mode (see "Vertical dimension modes" below)

The predefined dimensions are based on dim_roles, a mapping between “roles” and metadata keys. This mapping is defined for each profile and can be customised by the user. The possible roles are as follows:

Default dimension roles

| Dimension role | Description | Metadata key (in default profile: earthkit) |
| --- | --- | --- |
| "member" | Ensemble forecast member | "ensemble.member" |
| "forecast_reference_time" | Forecast reference time (base datetime). Can be a single metadata key, or a list/tuple of two metadata keys representing the "date" and "time" parts of the forecast reference time. Alternatively, it can be a dict with "date" and "time" keys specifying the corresponding metadata keys. Used when "forecast_reference_time" is in time_dims. | "time.forecast_reference_time" |
| "step" | Forecast step | "time.step" |
| "valid_time" | Valid datetime. Used when "valid_time" is in time_dims or add_valid_time_coord=True. | "time.valid_datetime" |
| "date" | Date part of the forecast reference time. Used when "date" is in time_dims. | "time.base_date" |
| "time" | Time part of the forecast reference time. Used when "time" is in time_dims. | "time.base_time" |
| "level" | Level | "vertical.level" |
| "level_type" | Level type | "vertical.level_type" |
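The three accepted forms of the "forecast_reference_time" role (single key, pair of keys, dict with "date" and "time") can be pictured with a small normalisation helper. This is only an illustrative sketch: the function name and the "datetime" entry in its output are hypothetical and are not part of the library; only the metadata keys are taken from the default profile.

```python
def normalise_reference_time_role(spec):
    """Hypothetical helper: reduce the three accepted forms to one dict."""
    if isinstance(spec, str):
        # A single metadata key that already holds the full base datetime.
        return {"datetime": spec}
    if isinstance(spec, (list, tuple)) and len(spec) == 2:
        # Two metadata keys: first the "date" part, then the "time" part.
        return {"date": spec[0], "time": spec[1]}
    if isinstance(spec, dict) and set(spec) == {"date", "time"}:
        # Explicit dict naming the "date" and "time" metadata keys.
        return dict(spec)
    raise ValueError(f"unsupported forecast_reference_time role: {spec!r}")


single = normalise_reference_time_role("time.forecast_reference_time")
pair = normalise_reference_time_role(("time.base_date", "time.base_time"))
mapping = normalise_reference_time_role(
    {"date": "time.base_date", "time": "time.base_time"}
)
print(single)
print(pair)
print(mapping)
```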

By default, the dimension names are the same as the role names. To use the associated metadata keys as dimension names instead, set dim_name_from_role_name=False.
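The effect of dim_name_from_role_name on the dimension names can be modelled in a few lines. The sketch below is a toy model rather than the engine's code: the dimension_names helper is hypothetical, and it assumes that every role maps to a single metadata key, as in the default earthkit profile.

```python
# Role-to-key mapping of the default "earthkit" profile, as documented above.
DEFAULT_DIM_ROLES = {
    "member": "ensemble.member",
    "forecast_reference_time": "time.forecast_reference_time",
    "step": "time.step",
    "valid_time": "time.valid_datetime",
    "date": "time.base_date",
    "time": "time.base_time",
    "level": "vertical.level",
    "level_type": "vertical.level_type",
}


def dimension_names(roles, dim_roles=DEFAULT_DIM_ROLES, dim_name_from_role_name=True):
    """Hypothetical helper: name each dimension after its role or its metadata key."""
    if dim_name_from_role_name:
        return list(roles)
    return [dim_roles[role] for role in roles]


roles = ["member", "forecast_reference_time", "step", "level"]
by_role = dimension_names(roles)
by_key = dimension_names(roles, dim_name_from_role_name=False)
print(by_role)
print(by_key)
```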

Ensemble member dimension

The ensemble member dimension is a single dimension named "member" by default, unless dim_name_from_role_name=False and dim_roles defines it differently.

Temporal dimensions

The temporal dimensions can be generated in multiple ways, and can be represented by multiple individual dimensions in an Xarray dataset. The time_dims option explicitly specifies which time dimensions are constructed and their order, while dim_roles (together with dim_name_from_role_name) controls their names and the way their coordinates are formed.

Each element of time_dims is a role name (e.g. "forecast_reference_time", "step", "valid_time", "date", "time").

Common time_dims configurations

| time_dims | Dimensions generated |
| --- | --- |
| ["forecast_reference_time", "step"] (default) | "forecast_reference_time", "step" |
| ["valid_time"] | "valid_time" |
| ["date", "time", "step"] | "date", "time", "step" |
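How the same set of fields yields different temporal coordinates under each configuration can be illustrated without the library. The sketch below is a toy model, not the engine's implementation: the fields list and the time_coords helper are hypothetical, and it assumes the usual convention that the valid time equals the forecast reference time plus the step.

```python
from datetime import datetime, timedelta

# Hypothetical per-field metadata: two forecast runs with two steps each.
fields = [
    {"base": datetime(2024, 1, 1, 0), "step": timedelta(hours=6)},
    {"base": datetime(2024, 1, 1, 0), "step": timedelta(hours=12)},
    {"base": datetime(2024, 1, 1, 12), "step": timedelta(hours=6)},
    {"base": datetime(2024, 1, 1, 12), "step": timedelta(hours=12)},
]

# How each time role derives its coordinate value from a field (toy model).
EXTRACTORS = {
    "forecast_reference_time": lambda f: f["base"],
    "step": lambda f: f["step"],
    "valid_time": lambda f: f["base"] + f["step"],
    "date": lambda f: f["base"].date(),
    "time": lambda f: f["base"].time(),
}


def time_coords(fields, time_dims):
    """Return {dimension name: sorted unique coordinate values}, in time_dims order."""
    return {dim: sorted({EXTRACTORS[dim](f) for f in fields}) for dim in time_dims}


default = time_coords(fields, ["forecast_reference_time", "step"])
valid = time_coords(fields, ["valid_time"])
split = time_coords(fields, ["date", "time", "step"])

for name, coords in (("default", default), ("valid", valid), ("split", split)):
    print(name, {dim: len(values) for dim, values in coords.items()})
```

In this toy data set the four fields form a 2 x 2 grid under the default configuration, collapse onto a single axis of four distinct valid times with ["valid_time"], and form a 1 x 2 x 2 grid with ["date", "time", "step"].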

The following examples demonstrate the temporal dimensions modes:

Vertical dimension modes

The vertical dimensions can be generated in multiple ways, and can be represented by multiple individual dimensions in an Xarray dataset. The level_dim_mode option controls what vertical dimensions are generated in the Xarray dataset, while dim_roles (together with dim_name_from_role_name) controls their names and the way their coordinates are formed.

Vertical dimension modes

| level_dim_mode | Dimensions generated | Remarks |
| --- | --- | --- |
| "level" (default) | "level", "level_type" | The "level_type" dimension usually has size 1, so it is squeezed by default. |
| "level_per_type" | "<level_per_type>" | Uses a template dimension that is materialised in the Xarray dataset under the name given by the value of the metadata key referenced by dim_roles["level_type"] (for example "surface", "mean_sea", "pressure", "hybrid"). |
| "level_and_type" | "level_and_type" | Creates a single dimension whose coordinates are formed by concatenating the values of the metadata keys dim_roles["level"] and dim_roles["level_type"] (for example "850pressure", "137hybrid", "0surface"). |
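The three modes can be compared on a handful of fields with a small stand-alone model. This is a sketch under stated assumptions, not the engine's implementation: the vertical_coords helper and the fields list are hypothetical, and the metadata is reduced to plain "level" and "level_type" entries.

```python
# Hypothetical per-field vertical metadata.
fields = [
    {"level": 850, "level_type": "pressure"},
    {"level": 500, "level_type": "pressure"},
    {"level": 137, "level_type": "hybrid"},
]


def vertical_coords(fields, level_dim_mode="level"):
    """Return {dimension name: sorted unique coordinate values} for a mode."""
    if level_dim_mode == "level":
        # Two separate dimensions: the level value and the level type.
        return {
            "level": sorted({f["level"] for f in fields}),
            "level_type": sorted({f["level_type"] for f in fields}),
        }
    if level_dim_mode == "level_per_type":
        # One dimension per level type, named after the level type value.
        coords = {}
        for f in fields:
            coords.setdefault(f["level_type"], set()).add(f["level"])
        return {name: sorted(levels) for name, levels in coords.items()}
    if level_dim_mode == "level_and_type":
        # A single dimension whose labels concatenate level and level type.
        labels = {f"{f['level']}{f['level_type']}" for f in fields}
        return {"level_and_type": sorted(labels)}
    raise ValueError(f"unknown level_dim_mode: {level_dim_mode!r}")


by_level = vertical_coords(fields, "level")
per_type = vertical_coords(fields, "level_per_type")
combined = vertical_coords(fields, "level_and_type")
print(by_level)
print(per_type)
print(combined)
```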

The following example demonstrates the vertical dimensions modes:

Squeezing/ensuring dimensions

By default, the dimensions are squeezed. This means that if a dimension has only one value, it is removed from the dataset. This can be controlled with the squeeze option. Alternatively, the ensure_dims option can be used to ensure that certain dimensions are always present in the dataset, even if they have only one value. This is useful when you want to keep the dimensions for consistency or for further processing.
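The interplay between squeeze and ensure_dims can be summarised as a rule over dimension sizes. The sketch below is a toy model of that rule as described here; the kept_dims helper and the sizes dict are hypothetical and do not call the library.

```python
def kept_dims(sizes, squeeze=True, ensure_dims=()):
    """Return the dimensions that remain, given a {name: size} mapping."""
    kept = []
    for name, size in sizes.items():
        # A dimension survives if it has more than one value, if squeezing
        # is disabled, or if it is explicitly ensured.
        if size > 1 or not squeeze or name in ensure_dims:
            kept.append(name)
    return kept


sizes = {"member": 1, "forecast_reference_time": 1, "step": 4, "level": 2}

squeezed = kept_dims(sizes)
unsqueezed = kept_dims(sizes, squeeze=False)
ensured = kept_dims(sizes, ensure_dims=["member"])
print(squeezed)
print(unsqueezed)
print(ensured)
```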

See the following notebook for examples of how this works:

Size-1 dimensions as variable attributes

As an alternative to squeezing, a size-1 dimension can be converted into a variable attribute using the dims_as_attrs option. This is particularly useful when working with single-level variables defined on different vertical levels (for example, "mean_sea": 0).

Like squeeze=True, this approach avoids issues caused by incompatible coordinates on size-1 dimensions. In addition, it preserves the associated coordinate information by storing it as a variable attribute.

The dims_as_attrs option can also be combined with ensure_dims, allowing a size-1 dimension to be both preserved as a dimension and exposed as a variable attribute.
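The behaviour described above can be approximated by a small stand-alone model. It is a sketch only: the split_dims_and_attrs helper and the coords dict are hypothetical, and the exact precedence between squeeze, ensure_dims and dims_as_attrs in the engine is assumed from the description rather than taken from its code.

```python
def split_dims_and_attrs(coords, dims_as_attrs=(), ensure_dims=(), squeeze=True):
    """Return (kept dimensions, variable attributes) for a {name: values} mapping."""
    dims, attrs = [], {}
    for name, values in coords.items():
        size_one = len(values) == 1
        as_attr = size_one and name in dims_as_attrs
        if as_attr:
            # Keep the coordinate information as a variable attribute.
            attrs[name] = values[0]
        # Assumed rule: ensured dimensions always stay; otherwise a size-1
        # dimension is dropped when squeezed or converted to an attribute.
        if name in ensure_dims or not size_one or (not squeeze and not as_attr):
            dims.append(name)
    return dims, attrs


coords = {"step": [0, 6, 12], "level": [0], "level_type": ["mean_sea"]}

dims, attrs = split_dims_and_attrs(coords, dims_as_attrs=["level", "level_type"])
dims_ensured, attrs_ensured = split_dims_and_attrs(
    coords, dims_as_attrs=["level", "level_type"], ensure_dims=["level"]
)
print(dims, attrs)
print(dims_ensured, attrs_ensured)
```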

For a detailed discussion and examples, see the following notebook:

Extra dimensions

The extra_dims option allows additional dimensions to be introduced into the resulting Xarray dataset, beyond the predefined dimensions.

Each entry in extra_dims refers to a metadata key whose values are used as the coordinates of a newly created dimension.

Extra dimensions are handled in the same way as predefined dimensions: if an extra dimension has size 1, it can be squeezed or ensured, or converted into a variable attribute.

Collision with predefined dimensions

When an extra_dims entry refers to a metadata key that belongs to a predefined dimension that is not part of the current time_dims or level_dim_mode selection, the entry is silently ignored. This prevents the same underlying metadata from appearing twice — once through the predefined dimension machinery and once as an extra dimension — which would lead to conflicts or duplicate axes.

For example, when time_dims=["valid_time"] is used, only the "valid_time" role is selected. The other time-related roles ("forecast_reference_time", "step", "date", "time") and all their associated metadata keys are excluded. If any of those metadata keys are listed in extra_dims, they will not be added as dimensions:

# "time.step" belongs to the "step" time role, which is not in
# time_dims, so it is suppressed even though it appears in extra_dims.
ds = fl.to_xarray(
    profile="earthkit",
    time_dims=["valid_time"],
    extra_dims=["time.step"],
)
assert "valid_time" in ds.dims
assert "step" not in ds.dims       # blocked – collides with excluded predefined dim
assert "time.step" not in ds.dims  # blocked as well

This rule applies to all predefined dimension roles listed in the dimension roles table, not only temporal ones.
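The suppression rule can be expressed as a filter over extra_dims. The sketch below models only the temporal case and only the rule stated above; the filter_extra_dims helper and the "my.custom_key" entry are hypothetical, and keys belonging to active roles are deliberately left untouched.

```python
# Metadata keys of the temporal roles in the default "earthkit" profile.
TIME_ROLE_KEYS = {
    "forecast_reference_time": "time.forecast_reference_time",
    "step": "time.step",
    "valid_time": "time.valid_datetime",
    "date": "time.base_date",
    "time": "time.base_time",
}


def filter_extra_dims(extra_dims, time_dims, role_keys=TIME_ROLE_KEYS):
    """Drop extra_dims entries whose key belongs to a time role not in time_dims."""
    excluded_keys = {key for role, key in role_keys.items() if role not in time_dims}
    return [dim for dim in extra_dims if dim not in excluded_keys]


# "my.custom_key" is a made-up metadata key used only for illustration.
remaining = filter_extra_dims(["time.step", "my.custom_key"], time_dims=["valid_time"])
print(remaining)
```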

For a detailed discussion and examples, see the following notebook:

Remapping

The remapping option allows virtual metadata keys to be defined by combining existing metadata keys. These virtual keys can then be used in the same way as native metadata keys throughout the Xarray engine configuration.

In particular, a virtual metadata key can be used:

  • as a dimension, by including it in extra_dims or ensure_dims; once defined as a dimension, it can also be referenced in dims_as_attrs like any other dimension;

  • as a custom variable name, by specifying it in the variable_key option.
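The idea of a virtual key built from existing keys can be illustrated with plain Python. This sketch assumes, purely for illustration, that a virtual key is described by a format-string template over existing keys; the apply_remapping helper, the flat key names "param" and "level", and the template syntax are hypothetical and may differ from what the remapping option actually accepts.

```python
def apply_remapping(metadata, remapping):
    """Return a copy of metadata with virtual keys filled in from templates."""
    result = dict(metadata)
    for virtual_key, template in remapping.items():
        # Each template refers to existing metadata keys by name.
        result[virtual_key] = template.format_map(metadata)
    return result


metadata = {"param": "t", "level": 850}
remapping = {"param_level": "{param}_{level}"}

remapped = apply_remapping(metadata, remapping)
print(remapped["param_level"])
```

Once such a key exists, it can play the same part as a native key in the model, for instance as the source of coordinate values for an extra dimension.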

For a detailed discussion and examples, see the following notebook: