{ "cells": [ { "cell_type": "markdown", "id": "7b1f3038-fbf6-424e-bfc1-14f23e5559d1", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "## Xarray engine: splitting options" ] }, { "cell_type": "raw", "id": "c679384d-3b81-4901-9a0a-8e551189f944", "metadata": { "editable": true, "raw_mimetype": "text/restructuredtext", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "The GRIB data in this example contains both pressure level and model level fields. Since these fields cannot form a single hypercube, :py:meth:`~data.readers.grib.index.GribFieldList.to_xarray` fails." ] }, { "cell_type": "code", "execution_count": 1, "id": "f2d686a3-d2b9-4a25-8315-c956bc46cad4", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "6c861eb4640740fab0436a6893174a95", "version_major": 2, "version_minor": 0 }, "text/plain": [ "mixed_pl_ml.grib: 0%| | 0.00/176k [00:00, ?B/s]" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Dimension 'level_type' of variable 't' cannot have multiple values=['hybrid', 'isobaricInhPa']\n" ] } ], "source": [ "import earthkit.data as ekd\n", "ds_fl = ekd.from_source(\"sample\", \"mixed_pl_ml.grib\")\n", "try:\n", " ds_xr = ds_fl.to_xarray(profile=\"grib\")\n", "except Exception as e:\n", " print(e)" ] }, { "cell_type": "raw", "id": "3782ced9-318a-4245-8cd9-3daafa78ae88", "metadata": { "editable": true, "raw_mimetype": "text/restructuredtext", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "In this case we can use the ``split_dims`` option to split the hypercube along the problematic dimensions. ``split_dims`` does not use dimension names; instead it takes one or more GRIB keys to split on. 
The result is a tuple of two lists:\n", "\n", "- the first list contains the Xarray datasets\n", "- the second list contains the corresponding dictionaries with the splitting keys/values (one dictionary per dataset)\n", "\n", "Please note that this option cannot be used when the Xarray dataset is generated directly via :py:meth:`xarray.open_dataset`." ] }, { "cell_type": "code", "execution_count": 2, "id": "914015b8-7709-480a-9fda-c83c1abd6cca", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds_xr, keys_xr = ds_fl.to_xarray(profile=\"grib\", split_dims=\"typeOfLevel\")\n", "len(ds_xr)" ] }, { "cell_type": "markdown", "id": "baeb4411-c768-4f1b-bab7-f8f8d734a5b1", "metadata": {}, "source": [ "The first dataset:" ] }, { "cell_type": "code", "execution_count": 3, "id": "15c63049-f138-44cb-a54e-a85a52f808ad", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
<xarray.Dataset> Size: 176kB\n",
"Dimensions: (forecast_reference_time: 4, step: 2, level: 2,\n",
" latitude: 19, longitude: 36)\n",
"Coordinates:\n",
" * forecast_reference_time (forecast_reference_time) datetime64[ns] 32B 202...\n",
" * step (step) timedelta64[ns] 16B 00:00:00 06:00:00\n",
" * level (level) int64 16B 90 137\n",
" * latitude (latitude) float64 152B 90.0 80.0 ... -80.0 -90.0\n",
" * longitude (longitude) float64 288B 0.0 10.0 ... 340.0 350.0\n",
"Data variables:\n",
" t (forecast_reference_time, step, level, latitude, longitude) float64 88kB ...\n",
" u (forecast_reference_time, step, level, latitude, longitude) float64 88kB ...\n",
"Attributes:\n",
" Conventions: CF-1.8\n",
" institution: ECMWF