{ "cells": [ { "cell_type": "markdown", "id": "3cd3659f-bd6d-49d9-821a-9183cbe84655", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "## Xarray engine: writing back to GRIB" ] }, { "cell_type": "markdown", "id": "4394dbab-dd68-4523-8581-28fc4001048d", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "
\n", "Warning: converting Xarray back to GRIB is an experimental feature and is not yet fully supported.
" ] }, { "cell_type": "markdown", "id": "359a59ce-e285-4202-84ed-995efeea4dda", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "First, we get some example GRIB data and convert it into Xarray." ] }, { "cell_type": "code", "execution_count": 1, "id": "a2ef916d-79aa-4c59-9bff-b92c0dafca1f", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7ddf1047ae7f43cb94d42c6c8bfa4d81", "version_major": 2, "version_minor": 0 }, "text/plain": [ "pl.grib: 0%| | 0.00/48.8k [00:00\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "\n", "
<xarray.Dataset> Size: 176kB\n",
       "Dimensions:                  (forecast_reference_time: 4, step: 2, level: 2,\n",
       "                              latitude: 19, longitude: 36)\n",
       "Coordinates:\n",
       "  * forecast_reference_time  (forecast_reference_time) datetime64[ns] 32B 202...\n",
       "  * step                     (step) timedelta64[ns] 16B 00:00:00 06:00:00\n",
       "  * level                    (level) int64 16B 500 700\n",
       "  * latitude                 (latitude) float64 152B 90.0 80.0 ... -80.0 -90.0\n",
       "  * longitude                (longitude) float64 288B 0.0 10.0 ... 340.0 350.0\n",
       "Data variables:\n",
       "    r                        (forecast_reference_time, step, level, latitude, longitude) float64 88kB ...\n",
       "    t                        (forecast_reference_time, step, level, latitude, longitude) float64 88kB ...\n",
       "Attributes:\n",
       "    class:        od\n",
       "    stream:       oper\n",
       "    levtype:      pl\n",
       "    type:         fc\n",
       "    expver:       0001\n",
       "    date:         20240603\n",
       "    time:         0\n",
       "    domain:       g\n",
       "    number:       0\n",
       "    Conventions:  CF-1.8\n",
       "    institution:  ECMWF
" ], "text/plain": [ "<xarray.Dataset> Size: 176kB\n", "Dimensions: (forecast_reference_time: 4, step: 2, level: 2,\n", " latitude: 19, longitude: 36)\n", "Coordinates:\n", " * forecast_reference_time (forecast_reference_time) datetime64[ns] 32B 202...\n", " * step (step) timedelta64[ns] 16B 00:00:00 06:00:00\n", " * level (level) int64 16B 500 700\n", " * latitude (latitude) float64 152B 90.0 80.0 ... -80.0 -90.0\n", " * longitude (longitude) float64 288B 0.0 10.0 ... 340.0 350.0\n", "Data variables:\n", " r (forecast_reference_time, step, level, latitude, longitude) float64 88kB ...\n", " t (forecast_reference_time, step, level, latitude, longitude) float64 88kB ...\n", "Attributes:\n", " class: od\n", " stream: oper\n", " levtype: pl\n", " type: fc\n", " expver: 0001\n", " date: 20240603\n", " time: 0\n", " domain: g\n", " number: 0\n", " Conventions: CF-1.8\n", " institution: ECMWF" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import earthkit.data as ekd\n", "\n", "ds_fl = ekd.from_source(\"sample\", \"pl.grib\")\n", "ds_xr = ds_fl.to_xarray()\n", "ds_xr" ] }, { "cell_type": "raw", "id": "28af4ff2-aaf3-4952-855d-3b91588e2de5", "metadata": { "editable": true, "raw_mimetype": "text/restructuredtext", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "By default, :py:meth:`~data.readers.grib.index.GribFieldList.to_xarray` is called with ``add_earthkit_attrs=True``, which adds some special earthkit attributes to the dataset. These attributes are required for the Xarray to GRIB conversion, so if the Xarray object is modified we must ensure that the variable attributes are copied to the new Xarray dataset. Xarray computations do not keep variable attributes by default, so we need to set the global Xarray ``keep_attrs`` option to preserve them."
] }, { "cell_type": "code", "execution_count": 2, "id": "b98402ad-8e81-42af-9426-c8c59d0d1a45", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "# keep variable attributes (including the earthkit ones) in Xarray computations\n", "import xarray as xr\n", "xr.set_options(keep_attrs=True)\n", "\n", "# modify values\n", "ds_xr += 1" ] }, { "cell_type": "markdown", "id": "40c77039-21e0-48f4-a809-9530672da724", "metadata": {}, "source": [ "#### Using to_target()" ] }, { "cell_type": "raw", "id": "d6c82521-859e-4bc9-ad3e-686071af88e8", "metadata": { "editable": true, "raw_mimetype": "text/restructuredtext", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "An Xarray dataset created with the earthkit engine can be written directly into a GRIB file with :func:`to_target`. This is a memory-efficient way to write GRIB to disk since only one field is loaded into memory at a time. We can call :func:`to_target` either on the ``earthkit`` accessor or as a top-level function." ] }, { "cell_type": "markdown", "id": "fcf81205-624e-4d4c-ad86-0595ff3e588f", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "First, we write a DataArray into a GRIB file."
] }, { "cell_type": "code", "execution_count": 3, "id": "68d4ec97-2429-44f8-8f80-5ffd58639222", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "GribField(t,500,20240603,0,0,0)" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# option1: writing to GRIB file using the accessor\n", "ds_xr[\"t\"].earthkit.to_target(\"file\", \"_from_xr_1.grib\")\n", "\n", "# option2: writing to GRIB file using the top level function\n", "ekd.to_target(\"file\", \"_from_xr_1a.grib\", data=ds_xr[\"t\"])\n", "\n", "# check the results\n", "ds_tmp1 = ekd.from_source(\"file\", \"_from_xr_1.grib\")\n", "ds_tmp1[0]" ] }, { "cell_type": "markdown", "id": "d31c0a87-ce0f-4365-a9d4-f9fb88e67eaa", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "Next, we write the whole dataset into a GRIB file." ] }, { "cell_type": "code", "execution_count": 4, "id": "ced38f9d-20c6-43d7-9e91-32b7bff93a6a", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   centre  shortName  typeOfLevel    level  dataDate  dataTime  stepRange  dataType  number  gridType
0  ecmf    r          isobaricInhPa  500    20240603  0         0          fc        0       regular_ll
1  ecmf    r          isobaricInhPa  700    20240603  0         0          fc        0       regular_ll
2  ecmf    r          isobaricInhPa  500    20240603  0         6          fc        0       regular_ll
3  ecmf    r          isobaricInhPa  700    20240603  0         6          fc        0       regular_ll
4  ecmf    r          isobaricInhPa  500    20240603  1200      0          fc        0       regular_ll
\n", "
" ], "text/plain": [ " centre shortName typeOfLevel level dataDate dataTime stepRange \\\n", "0 ecmf r isobaricInhPa 500 20240603 0 0 \n", "1 ecmf r isobaricInhPa 700 20240603 0 0 \n", "2 ecmf r isobaricInhPa 500 20240603 0 6 \n", "3 ecmf r isobaricInhPa 700 20240603 0 6 \n", "4 ecmf r isobaricInhPa 500 20240603 1200 0 \n", "\n", " dataType number gridType \n", "0 fc 0 regular_ll \n", "1 fc 0 regular_ll \n", "2 fc 0 regular_ll \n", "3 fc 0 regular_ll \n", "4 fc 0 regular_ll " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# option1: writing to GRIB file using the accessor\n", "ds_xr.earthkit.to_target(\"file\", \"_from_xr_2.grib\")\n", "\n", "# option2: writing to GRIB file using the top level function\n", "ekd.to_target(\"file\", \"_from_xr_2a.grib\", data=ds_xr)\n", "\n", "# check the results\n", "ds_tmp2 = ekd.from_source(\"file\", \"_from_xr_2.grib\")\n", "ds_tmp2.head()" ] }, { "cell_type": "markdown", "id": "c75cbad6-a383-4bf2-add9-98a6c313aada", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "We check if the computation results were correctly written to the generated GRIB data." 
] }, { "cell_type": "code", "execution_count": 5, "id": "0c0bafa4-e3ca-4361-b28c-851b8447f323", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "250.22500610351562\n", "251.22500610351562\n" ] } ], "source": [ "# original GRIB data\n", "print(ds_fl.sel(param=\"t\", step=0, level=500)[0].values[0])\n", "# GRIB data converted from the modified Xarray object\n", "print(ds_tmp1.sel(param=\"t\", step=0, level=500)[0].values[0])" ] }, { "cell_type": "markdown", "id": "be7471ec-bb6c-4873-bd32-babb4b4fc2ba", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "#### Using to_fieldlist()" ] }, { "cell_type": "raw", "id": "30ad2d99-db56-4302-99cf-277602202459", "metadata": { "editable": true, "raw_mimetype": "text/restructuredtext", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "We can also convert the Xarray dataset into a GRIB fieldlist by using :py:meth:`~data.utils.xarray.engine.XarrayEarthkit.to_fieldlist` on the ``earthkit`` accessor of the Xarray object. Note that the resulting fieldlist is stored entirely in memory." ] }, { "cell_type": "markdown", "id": "07ee1067-6476-450b-bc12-d90b1dc2fc05", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "First, we convert a DataArray to a GRIB fieldlist." ] }, { "cell_type": "code", "execution_count": 6, "id": "3784d2d3-8254-48bf-8ee2-9aa2b5d69686", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   centre  shortName  typeOfLevel    level  dataDate  dataTime  stepRange  dataType  number  gridType
0  ecmf    t          isobaricInhPa  500    20240603  0         0          fc        0       regular_ll
1  ecmf    t          isobaricInhPa  700    20240603  0         0          fc        0       regular_ll
2  ecmf    t          isobaricInhPa  500    20240603  0         6          fc        0       regular_ll
3  ecmf    t          isobaricInhPa  700    20240603  0         6          fc        0       regular_ll
4  ecmf    t          isobaricInhPa  500    20240603  1200      0          fc        0       regular_ll
\n", "
" ], "text/plain": [ " centre shortName typeOfLevel level dataDate dataTime stepRange \\\n", "0 ecmf t isobaricInhPa 500 20240603 0 0 \n", "1 ecmf t isobaricInhPa 700 20240603 0 0 \n", "2 ecmf t isobaricInhPa 500 20240603 0 6 \n", "3 ecmf t isobaricInhPa 700 20240603 0 6 \n", "4 ecmf t isobaricInhPa 500 20240603 1200 0 \n", "\n", " dataType number gridType \n", "0 fc 0 regular_ll \n", "1 fc 0 regular_ll \n", "2 fc 0 regular_ll \n", "3 fc 0 regular_ll \n", "4 fc 0 regular_ll " ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds_fl1 = ds_xr[\"t\"].earthkit.to_fieldlist()\n", "ds_fl1.head()" ] }, { "cell_type": "markdown", "id": "b81cc6de-0b91-442b-9df3-4f3c332aa09c", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "Next, we convert the whole dataset back into a GRIB fieldlist." ] }, { "cell_type": "code", "execution_count": 7, "id": "b39c4d26-dd9f-4a76-9559-6fb8b6676698", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   centre  shortName  typeOfLevel    level  dataDate  dataTime  stepRange  dataType  number  gridType
0  ecmf    r          isobaricInhPa  500    20240603  0         0          fc        0       regular_ll
1  ecmf    r          isobaricInhPa  700    20240603  0         0          fc        0       regular_ll
2  ecmf    r          isobaricInhPa  500    20240603  0         6          fc        0       regular_ll
3  ecmf    r          isobaricInhPa  700    20240603  0         6          fc        0       regular_ll
4  ecmf    r          isobaricInhPa  500    20240603  1200      0          fc        0       regular_ll
\n", "
" ], "text/plain": [ " centre shortName typeOfLevel level dataDate dataTime stepRange \\\n", "0 ecmf r isobaricInhPa 500 20240603 0 0 \n", "1 ecmf r isobaricInhPa 700 20240603 0 0 \n", "2 ecmf r isobaricInhPa 500 20240603 0 6 \n", "3 ecmf r isobaricInhPa 700 20240603 0 6 \n", "4 ecmf r isobaricInhPa 500 20240603 1200 0 \n", "\n", " dataType number gridType \n", "0 fc 0 regular_ll \n", "1 fc 0 regular_ll \n", "2 fc 0 regular_ll \n", "3 fc 0 regular_ll \n", "4 fc 0 regular_ll " ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds_fl2 = ds_xr.earthkit.to_fieldlist()\n", "ds_fl2.head()" ] }, { "cell_type": "raw", "id": "d67d36c7-e0e4-41d5-b0b6-e8b8743f3d10", "metadata": { "editable": true, "raw_mimetype": "text/restructuredtext", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "The generated GRIB fieldlist can be saved to disk using the :func:`to_target` method." ] }, { "cell_type": "code", "execution_count": 8, "id": "36423326-cf54-494b-8932-72d986b908bf", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "GribField(t,500,20240603,0,0,0)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "out_name = \"_from_xr_3.grib\"\n", "ds_fl1.to_target(\"file\", out_name)\n", "# read back and check the saved GRIB\n", "ds_tmp = ekd.from_source(\"file\", out_name)\n", "ds_tmp[0]" ] }, { "cell_type": "code", "execution_count": null, "id": "3c7a3570-2514-42f9-8b7c-0a087fcc9205", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "dev", "language": "python", "name": "dev" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.12" } }, 
"nbformat": 4, "nbformat_minor": 5 }