earthkit.data.xr_engine.engine¶
Classes¶
Module Contents¶
- class earthkit.data.xr_engine.engine.EarthkitBackendEntrypoint¶
Bases: xarray.backends.BackendEntrypoint
BackendEntrypoint is a class container and it is the main interface for the backend plugins, see BackendEntrypoint subclassing. It shall implement:
  - open_dataset method: it shall implement reading from file, variables decoding and it returns an instance of Dataset. It shall take in input at least filename_or_obj argument and drop_variables keyword argument. For more details see open_dataset.
  - guess_can_open method: it shall return True if the backend is able to open filename_or_obj, False otherwise. The implementation of this method is not mandatory.
  - open_datatree method: it shall implement reading from file, variables decoding and it returns an instance of DataTree. It shall take in input at least filename_or_obj argument. The implementation of this method is not mandatory. For more details see <reference to open_datatree documentation>.
- open_dataset_parameters¶
A list of open_dataset method parameters. The setting of this attribute is not mandatory.
- Type:
tuple, default:None
- description¶
A short string describing the engine. The setting of this attribute is not mandatory.
- Type:
str, default:""
- url¶
A string with the URL to the backend’s documentation. The setting of this attribute is not mandatory.
- Type:
str, default:""
- supports_groups¶
Whether the backend supports opening groups (via open_datatree and open_groups_as_dict) or not.
- Type:
bool, default:False
- description: ClassVar[str] = ''¶
- classmethod guess_can_open(filename_or_obj)¶
Backend open_dataset method used by Xarray in open_dataset().
- open_dataset(filename_or_obj, source_type='file', profile='earthkit', variable_key=None, drop_variables=None, rename_variables=None, mono_variable=None, extra_dims=None, drop_dims=None, ensure_dims=None, fixed_dims=None, dim_roles=None, dim_name_from_role_name=None, rename_dims=None, dims_as_attrs=None, time_dims=None, level_dim_mode=None, squeeze=None, add_valid_time_coord=None, decode_times=None, decode_timedelta=None, aux_coords=None, add_geo_coords=None, attrs_mode=None, attrs=None, variable_attrs=None, global_attrs=None, coord_attrs=None, add_earthkit_attrs=None, rename_attrs=None, fill_metadata=None, remapping=None, flatten_values=None, lazy_load=None, release_source=None, allow_holes=None, strict=None, dtype=None, array_namespace=None, errors=None)¶
- filename_or_obj: str, Path or earthkit object
Input GRIB file or object to be converted to an Xarray dataset.
- profile: str, dict or None
Provide custom default values for most of the kwargs. The default profile is “earthkit”. An explicit dict can be used. None is equivalent to an empty dict. When a kwarg is specified it will update the corresponding profile value if it is a dict otherwise it will overwrite it. See: Xarray engine: profiles for more information.
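The rule for combining a profile with explicitly passed kwargs can be pictured with a small pure-Python sketch. It only illustrates the rule as worded above and is not earthkit-data's implementation: the helper name `merge_kwargs_into_profile`, the sample profile values, and the choice to treat a kwarg left as None as "not specified" are all assumptions made for illustration.

```python
def merge_kwargs_into_profile(profile, **kwargs):
    """Illustrative only: combine profile defaults with explicit kwargs.

    Assumed rule: when both the profile value and the kwarg are dicts the
    kwarg updates the profile value, otherwise the kwarg overwrites it.
    Kwargs left as None are treated as "not specified" (an assumption).
    """
    merged = dict(profile or {})  # None is equivalent to an empty dict
    for name, value in kwargs.items():
        if value is None:
            continue
        current = merged.get(name)
        if isinstance(current, dict) and isinstance(value, dict):
            # Build a new dict so the original profile is left untouched.
            merged[name] = {**current, **value}
        else:
            merged[name] = value
    return merged


# Hypothetical profile content, used only to exercise the helper.
profile = {
    "squeeze": True,
    "dim_roles": {"member": "ensemble.member", "step": "time.step"},
}
result = merge_kwargs_into_profile(
    profile,
    squeeze=False,
    dim_roles={"member": "metadata.perturbationNumber"},
    strict=None,
)
```

Here `squeeze` is overwritten, while `dim_roles` keeps its `"step"` entry and only changes `"member"`.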
- variable_key: str, None
The metadata key which will be used to name the Xarray Dataset variables. Default is “parameter.variable” (which in the case of GRIB data is the same as “metadata.shortName” and “metadata.param”). The same key cannot be used to define any dimension. Only enabled when mono_variable is False or None.
- drop_variables: str, or iterable of str, None
A variable or list of variables to drop from the dataset. Default is None. Only used when variable_key is enabled.
- rename_variables: dict, None
Mapping to rename variables. Default is None. Only used when variable_key is enabled.
- mono_variable: bool, str, None
If True or str, the dataset will contain a single variable called “data” (or the value of the mono_variable kwarg when it is a str). If False, the dataset will contain one variable for each distinct value of the variable_key metadata key. The default value (None) expands to False unless the profile overwrites it.
- extra_dims: str, or list of str, dict or tuple, or None
Define additional dimensions on top of the predefined dimensions. Only enabled when no fixed_dims is specified. Default is None. It can be a single metadata key or a list. If a list, each item is either a metadata key, or a dict/tuple defining mapping between the dimension name and the metadata key. The whole option can be a dict. E.g.

    # use GRIB key "expver" as a dimension
    extra_dims = "metadata.expver"

    # use keys "metadata.expver" and "metadata.stream" as dimensions
    extra_dims = ["metadata.expver", "metadata.stream"]

    # define dimensions "expver", "mars_stream" and "mars_type" from
    # GRIB keys "expver", "stream" and "type"
    extra_dims = [
        "metadata.expver",
        {"mars_stream": "metadata.stream"},
        ("mars_type", "metadata.type"),
    ]
    extra_dims = [
        {
            "expver": "metadata.expver",
            "mars_stream": "metadata.stream",
            "mars_type": "metadata.type",
        }
    ]
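The accepted shapes of extra_dims (a single key, a list of keys, or dict/tuple items) can be reduced to one uniform list of pairs. The following is a minimal sketch of such a normalisation, not the engine's own code: `normalise_dim_spec` is a hypothetical helper, and a plain metadata key is paired with None because how the engine derives the dimension name in that case is not shown here.

```python
def normalise_dim_spec(spec):
    """Return a list of (dimension_name, metadata_key) pairs.

    dimension_name is None for plain metadata keys, meaning "name chosen
    by the engine" (how it is derived is outside this sketch).
    """
    if spec is None:
        return []
    if isinstance(spec, (str, dict)):
        spec = [spec]  # a single key or a whole-option dict
    pairs = []
    for item in spec:
        if isinstance(item, str):
            pairs.append((None, item))
        elif isinstance(item, dict):
            pairs.extend(item.items())  # {name: key, ...}
        elif isinstance(item, tuple) and len(item) == 2:
            pairs.append(item)  # (name, key)
        else:
            raise TypeError(f"unsupported dimension item: {item!r}")
    return pairs


pairs = normalise_dim_spec(
    [
        "metadata.expver",
        {"mars_stream": "metadata.stream"},
        ("mars_type", "metadata.type"),
    ]
)
```

Both the mixed list form and the single-dict form end up as the same kind of pair list, which is why the two notations in the examples above are interchangeable.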
- drop_dims: str, or iterable of str, None
Single or multiple dimensions to be ignored. Default is None.
- ensure_dims: str, or iterable of str, None
Every item may be one of the following:
  - Dimension name: A dimension that must always be preserved in the output, even when squeeze=True and its size is 1, or when it appears in dims_as_attrs.
  - Metadata key: A key whose value defines an additional, non-squeezable dimension. When a metadata key is listed here, it does not need to be repeated in extra_dims.
Default is None.
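The interplay between squeeze and ensure_dims for the dimension-name case can be sketched in a few lines. This is an illustration under stated assumptions, not the engine's logic: `dims_after_squeeze` and the sample sizes are hypothetical, and the metadata-key case (which adds a new dimension) is not covered.

```python
def dims_after_squeeze(dim_sizes, squeeze=True, ensure_dims=None):
    """Return the names of the dimensions that survive squeezing.

    dim_sizes maps dimension names to their sizes. A size-1 dimension is
    dropped when squeeze is True, unless it is listed in ensure_dims.
    """
    if ensure_dims is None:
        ensure_dims = []
    elif isinstance(ensure_dims, str):
        ensure_dims = [ensure_dims]
    kept = []
    for name, size in dim_sizes.items():
        if not squeeze or size > 1 or name in ensure_dims:
            kept.append(name)
    return kept


# Hypothetical dimension sizes for illustration.
sizes = {"member": 1, "step": 4, "level": 1}
default_dims = dims_after_squeeze(sizes)
with_member = dims_after_squeeze(sizes, ensure_dims="member")
unsqueezed = dims_after_squeeze(sizes, squeeze=False)
```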
- fixed_dims: str, or iterable of str, None
Define all the dimensions to be generated. When used no other dimensions will be created. Might be incompatible with other settings. Default is None. It can be a single item or a list. Each item is either a metadata key, or a dict/tuple defining mapping between the dimension name and the metadata key. The whole option can be a dict. E.g.

    # use key "time.step" as a dimension
    fixed_dims = "time.step"

    # use keys "time.step" and "vertical.level" as dimensions
    fixed_dims = ["time.step", "vertical.level"]

    # define dimensions "step", "level" and "level_type" from
    # metadata keys "metadata.step", "metadata.levelist" and "metadata.levtype"
    fixed_dims = [
        "metadata.step",
        {"level": "metadata.levelist"},
        ("level_type", "metadata.levtype"),
    ]
    fixed_dims = [
        {"step": "metadata.step", "level": "metadata.levelist", "level_type": "metadata.levtype"}
    ]
- dim_roles: dict, None
Specify the “roles” used to form the predefined dimensions. The predefined dimensions are automatically generated when no fixed_dims is specified and comprise the following (in a fixed order):
  - ensemble forecast member dimension
  - temporal dimensions (controlled by time_dims)
  - vertical dimensions (controlled by level_dim_mode)
dim_roles is a mapping between the “roles” and the metadata keys representing the roles. The possible roles are as follows:
  - “member”: metadata key interpreted as ensemble forecast members
  - “forecast_reference_time”: metadata key interpreted as forecast reference time. Can be a single metadata key, or a list/tuple of two metadata keys representing the date and time parts of the forecast reference time. Alternatively, it can be a dict with “date” and “time” keys specifying the corresponding metadata keys. Used when "forecast_reference_time" is in time_dims.
  - “step”: metadata key interpreted as forecast step
  - “valid_time”: metadata key interpreted as valid time. Used when "valid_time" is in time_dims or add_valid_time_coord is True.
  - “date”: metadata key interpreted as base date. Used when "date" is in time_dims.
  - “time”: metadata key interpreted as base time. Used when "time" is in time_dims.
  - “level”: metadata key interpreted as level
  - “level_type”: metadata key interpreted as level type
The default values are as follows:

    {
        "member": "ensemble.member",
        "forecast_reference_time": "time.forecast_reference_time",
        "step": "time.step",
        "valid_time": "time.valid_datetime",
        "date": "time.base_date",
        "time": "time.base_time",
        "level": "vertical.level",
        "level_type": "vertical.level_type",
    }

dim_roles behaves differently from the other kwargs in the sense that it does not override but updates the default values. So e.g. to change only “member” in the default it is enough to specify: dim_roles={"member": "metadata.perturbationNumber"}.
- dim_name_from_role_name: bool, None
If True, the dimension names are formed from the role names. Otherwise, the dimension names are formed from the metadata keys specified in dim_roles. Its default value (None) expands to True unless the profile overwrites it. Only used when no fixed_dims are specified. New in version 0.15.0.
- rename_dims: dict, None
Mapping to rename dimensions. Default is None.
- dims_as_attrs: str, or iterable of str, None
A dimension name or a list of dimension names that should be converted into variable attributes when they have only a single value for the corresponding variable. Note that such size-1 dimensions are still preserved if they are explicitly listed in ensure_dims. The default is None.
- time_dims: str, list of str, or None
Explicitly specify the time dimension(s) to construct, together with their order. Each element is a role name from dim_roles. The default is ["forecast_reference_time", "step"]. Common configurations:
  - ["forecast_reference_time", "step"]: two dimensions for forecast reference time and step (default)
  - ["valid_time"]: a single valid-time dimension
  - ["date", "time", "step"]: three separate raw dimensions
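Since every element of time_dims is a role name, the metadata keys behind the time dimensions follow from a lookup in dim_roles. The sketch below shows that lookup together with the "update, not override" behaviour of dim_roles. It is an illustration only: `resolve_time_keys` is a hypothetical helper, and the defaults are copied from the values documented above.

```python
DEFAULT_DIM_ROLES = {
    "member": "ensemble.member",
    "forecast_reference_time": "time.forecast_reference_time",
    "step": "time.step",
    "valid_time": "time.valid_datetime",
    "date": "time.base_date",
    "time": "time.base_time",
    "level": "vertical.level",
    "level_type": "vertical.level_type",
}
DEFAULT_TIME_DIMS = ["forecast_reference_time", "step"]


def resolve_time_keys(time_dims=None, dim_roles=None):
    """Return (role, metadata_key) pairs for the requested time dimensions."""
    # User-supplied roles update the defaults instead of replacing them.
    roles = {**DEFAULT_DIM_ROLES, **(dim_roles or {})}
    if time_dims is None:
        time_dims = DEFAULT_TIME_DIMS
    elif isinstance(time_dims, str):
        time_dims = [time_dims]
    unknown = [role for role in time_dims if role not in roles]
    if unknown:
        raise ValueError(f"unknown role(s) in time_dims: {unknown}")
    # The order of time_dims is preserved.
    return [(role, roles[role]) for role in time_dims]


default_keys = resolve_time_keys()
valid_time_keys = resolve_time_keys("valid_time")
raw_keys = resolve_time_keys(["date", "time", "step"], {"step": "metadata.endStep"})
```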
- level_dim_mode: str, None
Controls how predefined vertical dimensions are constructed. The default is "level". Valid values are:
  - "level": Creates two separate dimensions, "level" and "level_type", as defined by the corresponding roles in dim_roles.
  - "level_per_type": Uses a template dimension "<level_per_type>" that is expanded into one or more vertical dimensions. The dimension name is taken from the metadata key with the role "level_type" (e.g. "pressure"), and the coordinate values come from the metadata key with the role "level" (e.g. [500, 700, 850, 1000]).
  - "level_and_type": Produces a single combined dimension, "level_and_type", in which the level value and the level type are merged.
- squeeze: bool, None
Remove dimensions which have only one valid value. Does not apply to dimensions in ensure_dims. Its default value (None) expands to True unless the profile overwrites it.
- add_valid_time_coord: bool, None
Add the valid_time coordinate containing np.datetime64 values to the dataset. Only takes effect when "valid_time" is not in time_dims. Its default value (None) expands to False unless the profile overwrites it.
- decode_times: bool, None
If True, decode date and datetime coordinates into datetime64 values. If False, leave the coordinates in their native type (e.g. int if the coordinates come from a GRIB key like “date” or “validityDate”). The default value (None) expands to True unless the profile overwrites it.
- decode_timedelta: bool, None
If True, decode time-like or duration-like coordinates into timedelta64 values. If False, leave the coordinates in their native type (e.g. int if the coordinates come from a GRIB key like “time”, “validityTime”, “step”); additionally, the duration-like coordinates (e.g. derived from a GRIB key like “step”, “endStep”, etc.) will have the attribute “units” appropriately set (to “minutes”, “hours”, etc.). If None (default), assume the same value of decode_times unless the profile overwrites it.
- aux_coords: dict, None
Mapping from an auxiliary coordinate label to a tuple: (metadata_key: str, dataset_dimension(s): str or iterable of str). The default value is None.
- add_geo_coords: bool, None
If True, add geographic coordinates to the dataset when field values are represented by a single “values” dimension. Its default value (None) expands to True unless the profile overwrites it.
- flatten_values: bool, None
If True, flatten the values per field resulting in a single dimension called “values” representing a field. Otherwise the field shape is used to form the field dimensions. When the fields are defined on an unstructured grid (e.g. reduced Gaussian) or are spectral (e.g. spherical harmonics) this option is ignored and the field values are always represented by a single “values” dimension. Its default value (None) expands to False unless the profile overwrites it.
- attrs_mode: str, None
Define how attributes are generated. Default is “fixed”. The possible values are:
  - “fixed”: Use the attributes defined in variable_attrs as variable attributes and global_attrs as global attributes.
  - “unique”: Use all the attributes defined in attrs, variable_attrs and global_attrs. When an attribute from attrs has a unique value for a dataset it will be a global attribute, otherwise it will be a variable attribute. However, this logic is only applied if a unique variable attribute can be a global attribute according to the CF conventions Appendix A (e.g. “units” cannot be a global attribute). Additionally, keys in variable_attrs are always used as variable attributes, while keys in global_attrs are always used as global attributes.
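The “unique” mode can be pictured as a split of candidate attributes into global and per-variable ones. The sketch below is a simplified illustration, not the engine's algorithm. `split_unique_attrs` is a hypothetical helper that compares values per variable rather than per field, and it replaces the CF Appendix A check with a small `not_global` tuple holding only the “units” example mentioned above.

```python
def split_unique_attrs(per_variable_attrs, not_global=("units",)):
    """Split attributes into (global_attrs, variable_attrs).

    per_variable_attrs maps variable names to {attribute: value}.
    An attribute becomes global when every variable has the same value for
    it and it is allowed to be global; otherwise it stays on the variables.
    """
    global_attrs = {}
    variable_attrs = {name: {} for name in per_variable_attrs}

    # Collect attribute names in first-seen order.
    all_keys = []
    for attrs in per_variable_attrs.values():
        for key in attrs:
            if key not in all_keys:
                all_keys.append(key)

    for key in all_keys:
        values = [attrs.get(key) for attrs in per_variable_attrs.values()]
        is_unique = all(value == values[0] for value in values)
        if is_unique and key not in not_global:
            global_attrs[key] = values[0]
        else:
            for name, attrs in per_variable_attrs.items():
                if key in attrs:
                    variable_attrs[name][key] = attrs[key]
    return global_attrs, variable_attrs


# Hypothetical attribute values for two variables.
global_attrs, variable_attrs = split_unique_attrs(
    {
        "t": {"units": "K", "centre": "ecmf", "long_name": "Temperature"},
        "u": {"units": "m s**-1", "centre": "ecmf", "long_name": "U component of wind"},
    }
)
```

In this example only "centre" is shared by both variables and allowed to be global, so it is the only global attribute. "units" would stay on the variables even if both variables used the same unit.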
- attrs: str, number, callable, dict or list of these, None
Attribute or list of attributes. Only used when attrs_mode is unique. Its default value (None) expands to [] unless the profile overwrites it. The following attributes are supported:
  - str: Name of the attribute used as a metadata key to generate the value of the attribute. Can also be specified by prefixing with “key=” (e.g. “key=vertical.level”). When prefixed with “namespace=” it specifies a metadata namespace (e.g. “namespace=parameter”), which will be added as a dict to the attribute.
  - callable: A callable that takes a Field object and returns a dict of attributes, e.g.:

        def rounded_wavelength(field):
            wl = field.get("metadata.wavelength")
            if wl is not None:
                return {"wavelength": round(wl)}
            else:
                return {}

  - dict: A dictionary of attributes with the keys as the attribute names. If the value is a callable it takes the attribute name and a Field object and returns the value of the attribute, e.g.:

        def ensure_rounded(key, field):
            val = field.get(key)
            try:
                return round(val)
            except Exception:
                return val

    A str value prefixed with “key=” or “namespace=” is interpreted as explained above. Any other values are used as the pre-defined value for the attribute.
- variable_attrs: str, number, callable, dict or list of these, None
Variable attribute or attributes. For the allowed values see attrs. Its default value (None) expands to [] unless the profile overwrites it.
- global_attrs: str, number, dict or list of these, None
Global attribute or attributes. For the allowed values see attrs. Its default value (None) expands to [] unless the profile overwrites it.
- coord_attrs: dict, None
To be documented. Default is None.
- add_earthkit_attrs: bool, None
If True, add earthkit specific attributes to the dataset. Its default value (None) expands to True unless the profile overwrites it.
- rename_attrs: dict, None
A dictionary of attributes to rename. Default is None.
- fill_metadata: dict, None
Define fill values for metadata keys. Default is None.
- remapping: dict, None
Define new metadata keys for indexing. Any key provided in remapping may be referenced when specifying options such as variable_key, extra_dims, ensure_dims, aux_coords and others. Default is None.
- lazy_load: bool, None
If True, the resulting Dataset will load data lazily from the underlying data source. If False, a Dataset holding all the data in memory and decoupled from the backend source will be created. Using lazy_load=False with release_source=True can provide optimised memory usage in certain cases. The default value of lazy_load (None) expands to True unless the profile overwrites it.
- release_source: bool, None
Only used when lazy_load=False. If True, memory held in the input fields is released as soon as their values are copied into the resulting Dataset. This is done per field to avoid memory spikes. The release operation is currently only supported for GRIB fields stored entirely in memory, e.g. when read from a stream. When a field does not support the release operation, this option is ignored. After to_xarray has been run, the input data becomes unusable, so use this option carefully. The default value of release_source (None) expands to False unless the profile overwrites it.
- allow_holes: bool, None
If False, GRIB fields must form a full hypercube (without holes). If True, a dataset will be created from any GRIB fields and its coordinates will be a union of coordinates of the fields (outer join). Values corresponding to missing GRIB fields will be filled with NaN. The default value of allow_holes (None) expands to False unless the profile overwrites it.
- strict: bool, None
If True, perform stricter checks on hypercube consistency. Its default value (None) expands to False unless the profile overwrites it.
- dtype: str, numpy.dtype or None
Typecode or data-type of the array data.
- array_namespace: str, array namespace, None
The array namespace to use for array operations. The default value (None) is expanded to “numpy”.
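Of the options above, allow_holes is perhaps the easiest to misread, so here is a minimal pure-Python sketch of the difference between requiring a full hypercube and building an outer join filled with NaN. It is an illustration only and does not reflect how the engine stores or checks its data: `build_cube`, the dict-based "fields" and the sample values are all hypothetical.

```python
import itertools
import math


def build_cube(fields, dims, allow_holes=False):
    """Arrange fields on the outer product of their coordinate values.

    fields is a list of dicts holding one value per dimension plus "value".
    Returns (coords, cube), where cube maps index tuples to values.
    """
    # Coordinates are the union of the values seen in the fields.
    coords = {d: sorted({f[d] for f in fields}) for d in dims}
    lookup = {tuple(f[d] for d in dims): f["value"] for f in fields}
    cube = {}
    for index in itertools.product(*(coords[d] for d in dims)):
        if index in lookup:
            cube[index] = lookup[index]
        elif allow_holes:
            cube[index] = math.nan  # missing field: fill with NaN
        else:
            raise ValueError(f"missing field for {dict(zip(dims, index))}")
    return coords, cube


# Hypothetical fields: the (step=6, level=850) combination is missing.
fields = [
    {"step": 0, "level": 500, "value": 1.0},
    {"step": 0, "level": 850, "value": 2.0},
    {"step": 6, "level": 500, "value": 3.0},
]
dims = ["step", "level"]
coords, cube = build_cube(fields, dims, allow_holes=True)
```

With allow_holes=False the same input raises an error, because the three fields do not fill the 2 x 2 hypercube.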
- open_dataset_parameters: ClassVar[tuple | None] = None¶
- abstractmethod open_datatree(filename_or_obj, *, drop_variables=None)¶
Backend open_datatree method used by Xarray in open_datatree().
If implemented, set the class variable supports_groups to True.
- abstractmethod open_groups_as_dict(filename_or_obj, *, drop_variables=None)¶
Opens a dictionary mapping from group names to Datasets.
Called by open_groups(). This function exists to provide a universal way to open all groups in a file, before applying any additional consistency checks or requirements necessary to create a DataTree object (typically done using from_dict()).
If implemented, set the class variable supports_groups to True.
- supports_groups: ClassVar[bool] = False¶
- url: ClassVar[str] = ''¶
- class earthkit.data.xr_engine.engine.XarrayEarthkit¶
- to_fieldlist()¶
- to_target(target, *args, **kwargs)¶
- class earthkit.data.xr_engine.engine.XarrayEarthkitDataArray(xarray_obj)¶
Bases: XarrayEarthkit
- property grid_spec: dict | None¶
Return the grid specification of the DataArray.
- property reference_field¶
- set(*args, **kwargs)¶
Return a new DataArray with updated attributes.
- to_device(device, *, array_backend=None, array_namespace=None, **kwargs)¶
Return a new DataArray whose data live on device.
- to_fieldlist()¶
- to_netcdf(*args, **kwargs)¶
Remove earthkit attributes before writing to netcdf.
- to_target(target, *args, **kwargs)¶
- class earthkit.data.xr_engine.engine.XarrayEarthkitDataSet(xarray_obj)¶
Bases: XarrayEarthkit
- property grid_spec¶
Return the grid specification of the DataSet.
- set(*args, **kwargs)¶
Return a new Dataset with updated attributes.
- to_device(device, *, array_backend=None, array_namespace=None, **kwargs)¶
Return a new Dataset with every data variable on the specified device.
- to_fieldlist()¶
- to_netcdf(*args, **kwargs)¶
Remove earthkit attributes before writing to netcdf.
- to_target(target, *args, **kwargs)¶