{ "cells": [ { "cell_type": "markdown", "id": "6b0660f7-ca38-420e-b6db-64326c0b5266", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "## GRIB: modifying fields" ] }, { "cell_type": "raw", "id": "f65ece3d-feaa-4ea7-a1cb-ca0eef8056d2", "metadata": { "editable": true, "raw_mimetype": "text/restructuredtext", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "This notebook explains how to use :py:meth:`~data.core.fieldlist.Field.clone` and :py:meth:`~data.core.fieldlist.Field.copy` to create new fields. The main differences between these methods can be summarised as follows:\n", "\n", "- :py:meth:`~data.core.fieldlist.Field.clone`: the new field keeps a reference to the original field and allows more flexible metadata modification than :py:meth:`~data.core.fieldlist.Field.copy`\n", "- :py:meth:`~data.core.fieldlist.Field.copy`: always creates a deep copy. The resulting field is always an :py:class:`~data.sources.array_list.ArrayField` storing all the data in memory" ] }, { "cell_type": "markdown", "id": "5b501f0b-1b9b-4c03-9cf5-7bf475d60916", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "We will use the first field of the input data in the examples." ] }, { "cell_type": "code", "execution_count": 1, "id": "2f3fb883-020f-4050-9026-0a3d5c553899", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
centre shortName typeOfLevel level dataDate dataTime stepRange dataType number gridType
0 ecmf t isobaricInhPa 500 20070101 1200 0 an 0 regular_ll
\n", "
" ], "text/plain": [ " centre shortName typeOfLevel level dataDate dataTime stepRange \\\n", "0 ecmf t isobaricInhPa 500 20070101 1200 0 \n", "\n", " dataType number gridType \n", "0 an 0 regular_ll " ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import earthkit.data as ekd\n", "\n", "ds = ekd.from_source(\"file\", \"test4.grib\")\n", "f_ori = ds[0]\n", "f_ori.ls()" ] }, { "cell_type": "markdown", "id": "abf1ffe6-7c49-4cbc-b5c7-a267a5d6a68e", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "### Using clone()" ] }, { "cell_type": "markdown", "id": "cfc1e1d7-9220-4e8f-ba55-ad6928432b55", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "#### Modifying metadata" ] }, { "cell_type": "code", "execution_count": 2, "id": "939aa250-f5bb-4752-82e1-2f9327c3761f", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
centre shortName typeOfLevel level dataDate dataTime stepRange dataType number gridType
0 ecmf t isobaricInhPa 700 20070101 1200 0 an 0 regular_ll
\n", "
" ], "text/plain": [ " centre shortName typeOfLevel level dataDate dataTime stepRange \\\n", "0 ecmf t isobaricInhPa 700 20070101 1200 0 \n", "\n", " dataType number gridType \n", "0 an 0 regular_ll " ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "f_new = f_ori.clone(level=700)\n", "f_new.ls()" ] }, { "cell_type": "raw", "id": "b49e6507-c7e5-4081-9923-990b41692fd5", "metadata": { "editable": true, "raw_mimetype": "text/restructuredtext", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "Please note that the metadata object in the new field is a wrapper around the original one and \n", ":py:meth:`~data.readers.grib.metadata.GribMetadata.override` is not called to update it with the newly set keys. This means that there is no guarantee that the newly set keys are compatible with the original metadata: arbitrary metadata keys can be set, and no checks are performed on their validity. This is generally not a problem for most of the field methods (e.g. :py:meth:`sel`) but can be an issue when we try to write the modified fields into a GRIB file (see below)." ] }, { "cell_type": "markdown", "id": "5b47c4f6-56a3-4e30-ab33-fb00773723a3", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "To demonstrate what was said above we set \"level\" to an invalid value and add a custom (non-GRIB) key:" ] }, { "cell_type": "code", "execution_count": 3, "id": "af960845-678a-4e7b-9eb2-3cfd1b314655", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
centre shortName typeOfLevel level dataDate dataTime stepRange dataType number gridType my_key
0 ecmf t isobaricInhPa abc 20070101 1200 0 an 0 regular_ll 123
\n", "
" ], "text/plain": [ " centre shortName typeOfLevel level dataDate dataTime stepRange \\\n", "0 ecmf t isobaricInhPa abc 20070101 1200 0 \n", "\n", " dataType number gridType my_key \n", "0 an 0 regular_ll 123 " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "f_new = f_ori.clone(level=\"abc\", my_key=\"123\")\n", "f_new.ls(extra_keys=\"my_key\")" ] }, { "cell_type": "raw", "id": "fc9105b5-4eda-4a5d-9a6b-35ce699d43e7", "metadata": { "editable": true, "raw_mimetype": "text/restructuredtext", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "Metadata values passed to :py:meth:`~data.core.fieldlist.Field.clone` can be callables." ] }, { "cell_type": "code", "execution_count": 4, "id": "a0b2a842-0b11-479a-8f7e-d2160e18c72e", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
centre shortName typeOfLevel level dataDate dataTime stepRange dataType number gridType custom_name
0 ecmf t isobaricInhPa 500 20070101 1200 0 an 0 regular_ll t500
\n", "
" ], "text/plain": [ " centre shortName typeOfLevel level dataDate dataTime stepRange \\\n", "0 ecmf t isobaricInhPa 500 20070101 1200 0 \n", "\n", " dataType number gridType custom_name \n", "0 an 0 regular_ll t500 " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def _f(field, key, original_metadata):\n", " return original_metadata[\"param\"] + str(original_metadata[\"level\"])\n", "\n", "f_new = f_ori.clone(custom_name=_f)\n", "f_new.ls(extra_keys=\"custom_name\")" ] }, { "cell_type": "markdown", "id": "5dbf44c9-8439-4da0-9caf-1c9c176c62ed", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "#### Modifying values" ] }, { "cell_type": "code", "execution_count": 5, "id": "51a4a474-afa4-48cd-b0ab-55ee295ddd92", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "(array([229.04600525, 229.04600525, 229.04600525]),\n", " array([228.04600525, 228.04600525, 228.04600525]))" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "f_new = f_ori.clone(values=f_ori.values + 1)\n", "f_new.values[0:3], f_ori.values[0:3]" ] }, { "cell_type": "markdown", "id": "601f588b-1c90-431b-89de-889e55d1c521", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "#### Modifying both metadata and values" ] }, { "cell_type": "code", "execution_count": 6, "id": "0419a041-3972-415c-aab9-fbb1dbbfa642", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
centre shortName typeOfLevel level dataDate dataTime stepRange dataType number gridType
0 ecmf t isobaricInhPa 700 20070101 1200 0 an 0 regular_ll
\n", "
" ], "text/plain": [ " centre shortName typeOfLevel level dataDate dataTime stepRange \\\n", "0 ecmf t isobaricInhPa 700 20070101 1200 0 \n", "\n", " dataType number gridType \n", "0 an 0 regular_ll " ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "f_new = f_ori.clone(values=f_ori.values + 1, level=700)\n", "f_new.ls()" ] }, { "cell_type": "code", "execution_count": 7, "id": "bf72674e-dc3f-4533-aa89-ab54e9a11ca0", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "array([229.04600525, 229.04600525, 229.04600525])" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "f_new.values[0:3]" ] }, { "cell_type": "markdown", "id": "21e1dc03-4c9b-46bc-9785-1b0c6b5b2691", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "#### Saving the modified field into a GRIB file" ] }, { "cell_type": "code", "execution_count": 8, "id": "84aaadd9-901e-4b87-a886-a197b5b504eb", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/var/folders/93/w0p869rx17q98wxk83gn9ys40000gn/T/ipykernel_19855/457062347.py:1: DeprecatedWarning: save is deprecated as of 0.13.0 and will be removed in 0.14.0. Use to_target() instead\n", " f_new.save(\"_modified_field.grib\")\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
centre shortName typeOfLevel level dataDate dataTime stepRange dataType number gridType
0 ecmf t isobaricInhPa 700 20070101 1200 0 an 0 regular_ll
\n", "
" ], "text/plain": [ " centre shortName typeOfLevel level dataDate dataTime stepRange \\\n", "0 ecmf t isobaricInhPa 700 20070101 1200 0 \n", "\n", " dataType number gridType \n", "0 an 0 regular_ll " ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "f_new.save(\"_modified_field.grib\")\n", "ds_new = ekd.from_source(\"file\", \"_modified_field.grib\")\n", "ds_new[0].ls()" ] }, { "cell_type": "code", "execution_count": 9, "id": "1805823b-9904-4e17-93d3-fafbad6df6f8", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "array([229.04600525, 229.04600525, 229.04600525])" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds_new[0].values[0:3]" ] }, { "cell_type": "markdown", "id": "01626f16-47fe-4ad7-9db4-6ce031a208e2", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "#### Saving is not always possible" ] }, { "cell_type": "markdown", "id": "7905e7d5-887f-4fb9-b08f-1158ae7828d3", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "Saving the modified field into a GRIB is only possible if the modified metadata is GRIB compatible. To demonstrate this we add a custom metadata key called \"_level\"." 
] }, { "cell_type": "code", "execution_count": 10, "id": "ee9caaf2-4626-4acb-a849-0d8460af5237", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "(700, 500)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "f_new = f_ori.clone(_level=700)\n", "f_new.metadata(\"_level\", \"level\")" ] }, { "cell_type": "markdown", "id": "f280dbf1-d6a7-4d7b-83b1-bdee92aaf249", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "Writing to GRIB is not possible because \"_level\" is not a valid GRIB key and ecCodes raises an exception." ] }, { "cell_type": "code", "execution_count": 11, "id": "5baa362d-ea7d-4b41-ace9-0631dba2f756", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Error setting _level=700\n", "Key/value not found\n", "Traceback (most recent call last):\n", " File \"/Users/cgr/git/earthkit-data/src/earthkit/data/utils/message.py\", line 274, in set\n", " return eccodes.codes_set(self._handle, name, value)\n", " File \"/Users/cgr/git/eccodes-python/gribapi/gribapi.py\", line 2140, in grib_set\n", " grib_set_long(msgid, key, value)\n", " File \"/Users/cgr/git/eccodes-python/gribapi/gribapi.py\", line 1006, in grib_set_long\n", " GRIB_CHECK(lib.grib_set_long(h, key.encode(ENC), value))\n", " File \"/Users/cgr/git/eccodes-python/gribapi/gribapi.py\", line 226, in GRIB_CHECK\n", " errors.raise_grib_error(errid)\n", " File \"/Users/cgr/git/eccodes-python/gribapi/errors.py\", line 381, in raise_grib_error\n", " raise ERROR_MAP[errid](errid)\n", "gribapi.errors.KeyValueNotFoundError: Key/value not found\n" ] }, { "name": "stdout", "output_type": "stream", "text": [ "Key/value not found\n" ] } ], "source": [ "try:\n", " f_new.to_target(\"file\", \"_modified_field1.grib\")\n", "except Exception as e:\n", " print(e)" ] }, { 
"cell_type": "markdown", "id": "bce26496-7c9b-4ea4-8660-bacd5ef06ef6", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "### Using copy()" ] }, { "cell_type": "raw", "id": "ff2c9b5a-f9ba-4c47-b3b2-db2c6790c9c0", "metadata": { "editable": true, "raw_mimetype": "text/restructuredtext", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ ":py:meth:`~data.core.fieldlist.Field.copy` performs a deep copy to generate an :py:class:`~data.sources.array_list.ArrayField` storing all the data in memory. \n", "\n", "When :py:meth:`~data.core.fieldlist.Field.copy` is called without arguments, both the values and the metadata object of the original field are copied into the new field. The metadata copy is created by calling :py:meth:`~data.readers.grib.metadata.GribMetadata.override` on the original metadata object." ] }, { "cell_type": "code", "execution_count": 12, "id": "1265f65d-c39b-4b22-ab41-3d03d532e08d", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "(array([228.04600525, 228.04600525, 228.04600525]),\n", " array([228.04600525, 228.04600525, 228.04600525]))" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "f_new = f_ori.copy()\n", "f_new.values[:3], f_ori.values[:3]" ] }, { "cell_type": "raw", "id": "de3c15ba-770b-4f79-a134-5086f5aad9a3", "metadata": { "editable": true, "raw_mimetype": "text/restructuredtext", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "We can pass new values to :py:meth:`~data.core.fieldlist.Field.copy`."
] }, { "cell_type": "code", "execution_count": 13, "id": "43e186a1-195e-424c-b778-356da142e163", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "(array([229.04600525, 229.04600525, 229.04600525]),\n", " array([228.04600525, 228.04600525, 228.04600525]))" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "f_new = f_ori.copy(values=f_ori.values + 1)\n", "f_new.values[:3], f_ori.values[:3]" ] }, { "cell_type": "markdown", "id": "51aae760-50e3-44be-aff6-06cc3b5ec641", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "We can also pass a new metadata object." ] }, { "cell_type": "code", "execution_count": 14, "id": "20fb42a2-8d98-43dd-a270-42710922b19f", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
centre shortName typeOfLevel level dataDate dataTime stepRange dataType number gridType
0 ecmf t isobaricInhPa 300 20070101 1200 0 an 0 regular_ll
\n", "
" ], "text/plain": [ " centre shortName typeOfLevel level dataDate dataTime stepRange \\\n", "0 ecmf t isobaricInhPa 300 20070101 1200 0 \n", "\n", " dataType number gridType \n", "0 an 0 regular_ll " ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "f_new = f_ori.copy(metadata=f_ori.metadata().override(level=300))\n", "f_new.ls()" ] }, { "cell_type": "code", "execution_count": null, "id": "c4d862b1-21c7-4c8c-9d80-c2510080ca01", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "dev_ecc", "language": "python", "name": "dev_ecc" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.13" } }, "nbformat": 4, "nbformat_minor": 5 }