Retieving data from WEkEO

[1]:
import earthkit.data as ekd

The wekeo and wekeocds sources provide access to WEkEO.

Using the WEkEO grammar

This example demonstrates how to access data from the WEkEO catalogue.

To retrieve data using the WEkEO grammar (see the hda documentation), use the wekeo source.

The dataset used here is Vegetation Phenology and Productivity, yearly, UTM projection.

The example retrieves the Total Productivity variable, representing the integral of vegetation productivity accumulated over the growing season as the sum of daily values. The product is derived from Copernicus Sentinel-2 observations and distributed in GeoTIFF format using 1°×1° tiles at a spatial resolution of 10 × 10 m.

[2]:
d = ekd.from_source(
    "wekeo",
    "EO:EEA:DAT:CLMS_HRVPP_VPP",
    request={
        "dataset_id": "EO:EEA:DAT:CLMS_HRVPP_VPP",
        "productType": "TPROD",
        "productGroupId": "s1",
        "tileId": "32TQR",
        "resolution": "10",
        "startdate": "2023-01-01T00:00:00.000Z",
        "enddate": "2023-01-01T23:59:59.999Z",
        "bbox": [12.28091069558975, 45.40759897973081, 12.384715295944709, 45.45978517718846],
    },
)

To inspect the downloaded data we convert it into Xarray.

[3]:
d.available_types
[3]:
['xarray', 'pandas', 'fieldlist']
[4]:
ds = d.to_xarray()
ds
[4]:
<xarray.Dataset> Size: 482MB
Dimensions:      (x: 10980, y: 10980)
Coordinates:
  * x            (x) float64 88kB 7e+05 7e+05 7e+05 ... 8.097e+05 8.098e+05
  * y            (y) float64 88kB 5.1e+06 5.1e+06 5.1e+06 ... 4.99e+06 4.99e+06
    spatial_ref  int64 8B 0
Data variables:
    band_1       (y, x) float32 482MB ...
Attributes:
    TIFFTAG_COPYRIGHT:  Copernicus service information 2024
    file_creation:      2024:03:16 10:35:37
    input_time_window:  2018-10-01 to 2022-02-29
    TIIFTAG_SOFTWARE:   Timesat : TIMESAT4.1.8
    AREA_OR_POINT:      Area
[5]:
ds.band_1.plot()
[5]:
<matplotlib.collections.QuadMesh at 0x1218767b0>
../../_images/how-tos_source_wekeo_9_1.png

Using the CDSAPI grammar

We can use the wekeocds source to access datasets from the Copernicus Climate Data Store (CDS) available through WEkEO, using the standard cdsapi request grammar.

Compared to the wekeo source, the wekeocds source accepts queries directly in cdsapi format, without requiring the dataset_id to be repeated in the request payload. This allows users to copy and paste requests directly from the dataset forms and examples provided in CDS-compatible interfaces.

Dataset identifiers can be derived automatically by prepending "EO:ECMWF:DAT:" to the CDS dataset name and converting it to the WEkEO naming convention. For example, the CDS dataset name "reanalysis-era5-single-levels" becomes:

"EO:ECMWF:DAT:REANALYSIS_ERA5_SINGLE_LEVELS"

In practice, this corresponds to replacing hyphens (-) with underscores (_) and converting the name to uppercase.

The following example retrieves ERA5 surface data in GRIB format:

[6]:
import earthkit.data as ekd

d = ekd.from_source(
    "wekeocds",
    "EO:ECMWF:DAT:REANALYSIS_ERA5_SINGLE_LEVELS_MONTHLY_MEANS",
    request=dict(
        variable=["2m_temperature", "mean_sea_level_pressure"],
        product_type=["monthly_averaged_reanalysis_by_hour_of_day"],
        year="2012",
        month="12",
        time="12:00",
        data_format="grib",
        download_format="zip",
    ),
)

d.to_fieldlist().ls()

[6]:
parameter.variable time.valid_datetime time.base_datetime time.step vertical.level vertical.level_type ensemble.member geography.grid_type
0 2t 2012-12-01 12:00:00 2012-12-01 12:00:00 0 days 0 surface 0 regular_ll
1 msl 2012-12-01 12:00:00 2012-12-01 12:00:00 0 days 0 surface 0 regular_ll