{ "cells": [ { "cell_type": "markdown", "id": "22408e0c-ddae-4924-b352-e7d796602c14", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "## GRIB: using array fieldlists" ] }, { "cell_type": "markdown", "id": "05675a78-99c4-404f-aae4-12c1d8ee1ced", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "In this example we will use a GRIB file containing 4 messages. First we ensure the file is available and read it into a fieldlist." ] }, { "cell_type": "code", "execution_count": 1, "id": "076ae474-e6c4-4a0d-a66c-91b03965627a", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "import earthkit.data as ekd\n", "ekd.download_example_file(\"test4.grib\")\n", "ds_in = ekd.from_source(\"file\", \"test4.grib\")" ] }, { "cell_type": "raw", "id": "847c90c4-5928-4481-abc9-c2ed9aada29f", "metadata": { "editable": true, "raw_mimetype": "text/restructuredtext", "slideshow": { "slide_type": "" }, "tags": [], "vscode": { "languageId": "raw" } }, "source": [ "Using the :meth:`~data.core.fieldlist.FieldList.to_fieldlist` method, we can convert this object into an array fieldlist where each field contains an array (holding the field values) and a :py:class:`~data.readers.grib.metadata.RestrictedGribMetadata` object representing the related metadata. Array fieldlists are stored entirely in memory. The resulting array format is controlled by the ``array_namespace`` keyword argument of :meth:`~data.core.fieldlist.FieldList.to_fieldlist`. When using its default value (``None``), the underlying array format of the original fieldlist is kept. For GRIB data read from a file or stream this will be \"numpy\". 
" ] }, { "cell_type": "markdown", "id": "3374ab00-054b-4aba-a355-f119b619be1e", "metadata": {}, "source": [ "### Numpy array fieldlist" ] }, { "cell_type": "markdown", "id": "2d99d5aa-8e9e-4c75-8394-188891c75c29", "metadata": {}, "source": [ "The \"numpy\" fieldlist we generate in the cell below works exactly in the same way as the original one but stores all the data in memory." ] }, { "cell_type": "code", "execution_count": 2, "id": "6ec4489f-2030-4a55-8292-7f50fb845677", "metadata": {}, "outputs": [], "source": [ "ds = ds_in.to_fieldlist()" ] }, { "cell_type": "code", "execution_count": 3, "id": "07e0823d-ff1d-43d1-8ffb-e5a6d0616ce4", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(ds)" ] }, { "cell_type": "markdown", "id": "e4106883-8035-4da5-90eb-b83f0d903c4b", "metadata": {}, "source": [ "### Pytorch array fieldlist" ] }, { "cell_type": "markdown", "id": "e34b8492-60d5-4052-abf9-b02d1a692c1d", "metadata": {}, "source": [ "For the next example we choose the \"torch\" array namespace. Since pytorch is an optional dependency for earthkit-data we need to ensure it is installed in the environment." ] }, { "cell_type": "code", "execution_count": 4, "id": "b3b30d9f-0edb-4938-baec-7026acd70192", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "!pip install torch --quiet" ] }, { "cell_type": "code", "execution_count": 5, "id": "5d2174d7-0f36-4b20-8ad5-bd93fd12f91b", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [], "source": [ "ds = ds_in.to_fieldlist(array_namespace=\"torch\")" ] }, { "cell_type": "code", "execution_count": 6, "id": "3c108a25-5f41-422f-9adb-98c932205dce", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"  <thead>\n",
"    <tr><th></th><th>centre</th><th>shortName</th><th>typeOfLevel</th><th>level</th><th>dataDate</th><th>dataTime</th><th>stepRange</th><th>dataType</th><th>number</th><th>gridType</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>ecmf</td><td>t</td><td>isobaricInhPa</td><td>500</td><td>20070101</td><td>1200</td><td>0</td><td>an</td><td>0</td><td>regular_ll</td></tr>\n",
"    <tr><th>1</th><td>ecmf</td><td>z</td><td>isobaricInhPa</td><td>500</td><td>20070101</td><td>1200</td><td>0</td><td>an</td><td>0</td><td>regular_ll</td></tr>\n",
"    <tr><th>2</th><td>ecmf</td><td>t</td><td>isobaricInhPa</td><td>850</td><td>20070101</td><td>1200</td><td>0</td><td>an</td><td>0</td><td>regular_ll</td></tr>\n",
"    <tr><th>3</th><td>ecmf</td><td>z</td><td>isobaricInhPa</td><td>850</td><td>20070101</td><td>1200</td><td>0</td><td>an</td><td>0</td><td>regular_ll</td></tr>\n",
"  </tbody>\n",
"</table>
" ], "text/plain": [ " centre shortName typeOfLevel level dataDate dataTime stepRange \\\n", "0 ecmf t isobaricInhPa 500 20070101 1200 0 \n", "1 ecmf z isobaricInhPa 500 20070101 1200 0 \n", "2 ecmf t isobaricInhPa 850 20070101 1200 0 \n", "3 ecmf z isobaricInhPa 850 20070101 1200 0 \n", "\n", " dataType number gridType \n", "0 an 0 regular_ll \n", "1 an 0 regular_ll \n", "2 an 0 regular_ll \n", "3 an 0 regular_ll " ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds.ls()" ] }, { "cell_type": "markdown", "id": "fae43976-e6c9-4520-a28a-13099a44f08d", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "#### values" ] }, { "cell_type": "raw", "id": "0019f9b6-3607-48df-a018-d8435cdac15e", "metadata": { "editable": true, "raw_mimetype": "text/restructuredtext", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "When we use either :py:attr:`Field.values ` or :py:attr:`FieldList.values ` now we get a pytorch Tensor." 
] }, { "cell_type": "code", "execution_count": 7, "id": "21a1e8b1-f0e5-4de6-bdbf-e92c5df5989e", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "tensor([228.0460, 228.0460, 228.0460, 228.0460, 228.0460, 228.0460, 228.0460,\n", " 228.0460, 228.0460, 228.0460], dtype=torch.float64)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds[0].values[:10]" ] }, { "cell_type": "code", "execution_count": 8, "id": "fbd92126-bb5b-47e5-80a7-d4bed3097764", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "torch.Size([65160])" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds[0].values.shape" ] }, { "cell_type": "code", "execution_count": 9, "id": "714b320c-90ea-4326-bc5f-340dda66daab", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "torch.Size([4, 65160])" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds.values.shape" ] }, { "cell_type": "markdown", "id": "c9197df9-3b35-4220-b189-2439de0a4ea9", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "#### to_array()" ] }, { "cell_type": "raw", "id": "fe0f41b3-15f6-4a09-a4ef-8b6b36686142", "metadata": { "editable": true, "raw_mimetype": "text/restructuredtext", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ ":py:meth:`Field.to_array() ` and :py:meth:`FieldList.to_array() ` return the values in the array format of the underlying array namespace (here, PyTorch tensors). Unlike ``values``, which is flattened per field, they keep the field shape unless ``flatten=True`` is passed. 
" ] }, { "cell_type": "code", "execution_count": 10, "id": "f60797eb-d578-4638-b1d0-bd18949dd249", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "tensor([[228.0460, 228.0460],\n", " [228.6085, 228.5792]], dtype=torch.float64)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds[0].to_array()[:2,:2]" ] }, { "cell_type": "code", "execution_count": 11, "id": "f4d053fa-2acf-4949-9bbd-a0b2ccf30318", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "torch.Size([4, 181, 360])" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds.to_array().shape" ] }, { "cell_type": "code", "execution_count": 12, "id": "04cd31df-34cd-47a9-90cc-833b9805bd55", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "torch.Size([4, 65160])" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds.to_array(flatten=True).shape" ] }, { "cell_type": "markdown", "id": "87cff063-95af-45b9-bae1-91cb28ca23ea", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "#### to_numpy()" ] }, { "cell_type": "raw", "id": "23eba4ad-a0f3-4f94-aff3-83b7a0dd07d6", "metadata": { "editable": true, "raw_mimetype": "text/restructuredtext", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ ":py:meth:`Field.to_numpy() ` and :py:meth:`FieldList.to_numpy() ` still return ndarrays." 
] }, { "cell_type": "code", "execution_count": 13, "id": "b60eea0f-0da1-48f1-a64e-f1b75e36a737", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "array([[228.04600525, 228.04600525],\n", " [228.60850525, 228.57920837]])" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds[0].to_numpy()[:2,:2]" ] }, { "cell_type": "code", "execution_count": 14, "id": "ae6bd7cd-b043-4993-b715-e5ea2531490a", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "(4, 181, 360)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "ds.to_numpy().shape" ] }, { "cell_type": "markdown", "id": "f0efdd2b-e078-43e2-936e-35db3a645a6c", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "### Building array fieldlists with from_array()" ] }, { "cell_type": "raw", "id": "3cdcbb75-86b1-4c38-b667-73ac563c8d97", "metadata": { "editable": true, "raw_mimetype": "text/restructuredtext", "slideshow": { "slide_type": "" }, "tags": [], "vscode": { "languageId": "raw" } }, "source": [ "We can build a new array fieldlist straight from metadata and array values using :meth:`~data.core.fieldlist.FieldList.from_array`. This can be used for computations when we want to alter the values and store the result in a new FieldList." ] }, { "cell_type": "code", "execution_count": 15, "id": "a92db7cb-e2e1-472e-92b6-0f42360d2105", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"  <thead>\n",
"    <tr><th></th><th>centre</th><th>shortName</th><th>typeOfLevel</th><th>level</th><th>dataDate</th><th>dataTime</th><th>stepRange</th><th>dataType</th><th>number</th><th>gridType</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>ecmf</td><td>t</td><td>isobaricInhPa</td><td>500</td><td>20070101</td><td>1200</td><td>0</td><td>an</td><td>0</td><td>regular_ll</td></tr>\n",
"    <tr><th>1</th><td>ecmf</td><td>z</td><td>isobaricInhPa</td><td>500</td><td>20070101</td><td>1200</td><td>0</td><td>an</td><td>0</td><td>regular_ll</td></tr>\n",
"    <tr><th>2</th><td>ecmf</td><td>t</td><td>isobaricInhPa</td><td>850</td><td>20070101</td><td>1200</td><td>0</td><td>an</td><td>0</td><td>regular_ll</td></tr>\n",
"    <tr><th>3</th><td>ecmf</td><td>z</td><td>isobaricInhPa</td><td>850</td><td>20070101</td><td>1200</td><td>0</td><td>an</td><td>0</td><td>regular_ll</td></tr>\n",
"  </tbody>\n",
"</table>
" ], "text/plain": [ " centre shortName typeOfLevel level dataDate dataTime stepRange \\\n", "0 ecmf t isobaricInhPa 500 20070101 1200 0 \n", "1 ecmf z isobaricInhPa 500 20070101 1200 0 \n", "2 ecmf t isobaricInhPa 850 20070101 1200 0 \n", "3 ecmf z isobaricInhPa 850 20070101 1200 0 \n", "\n", " dataType number gridType \n", "0 an 0 regular_ll \n", "1 an 0 regular_ll \n", "2 an 0 regular_ll \n", "3 an 0 regular_ll " ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "md = ds.metadata()\n", "v = ds.to_array() + 2\n", "r1 = ekd.FieldList.from_array(v, md)\n", "r1.ls()" ] }, { "cell_type": "markdown", "id": "920f6b98-f0e6-4ffc-a5e4-f8f643c23f76", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "As expected, the values in *r1* are now differing by 2 from the ones in the original fieldlist (*r*)." ] }, { "cell_type": "code", "execution_count": 16, "id": "5b78ea8a-64e9-4bfa-9d11-8b390995994b", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/plain": [ "tensor([230.0460, 230.0460, 230.0460, 230.0460, 230.0460, 230.0460, 230.0460,\n", " 230.0460, 230.0460, 230.0460], dtype=torch.float64)" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "r1[0].values[:10]" ] }, { "cell_type": "markdown", "id": "48f6f5a4-1304-43ef-b357-563cfa071309", "metadata": {}, "source": [ "### Building an array fieldlist in a loop" ] }, { "cell_type": "code", "execution_count": 17, "id": "a4b086a5-406d-4d1c-bd93-1edbde84bf81", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"  <thead>\n",
"    <tr><th></th><th>centre</th><th>shortName</th><th>typeOfLevel</th><th>level</th><th>dataDate</th><th>dataTime</th><th>stepRange</th><th>dataType</th><th>number</th><th>gridType</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>ecmf</td><td>t</td><td>isobaricInhPa</td><td>500</td><td>20070101</td><td>1200</td><td>0</td><td>an</td><td>0</td><td>regular_ll</td></tr>\n",
"    <tr><th>1</th><td>ecmf</td><td>z</td><td>isobaricInhPa</td><td>500</td><td>20070101</td><td>1200</td><td>0</td><td>an</td><td>0</td><td>regular_ll</td></tr>\n",
"    <tr><th>2</th><td>ecmf</td><td>t</td><td>isobaricInhPa</td><td>850</td><td>20070101</td><td>1200</td><td>0</td><td>an</td><td>0</td><td>regular_ll</td></tr>\n",
"    <tr><th>3</th><td>ecmf</td><td>z</td><td>isobaricInhPa</td><td>850</td><td>20070101</td><td>1200</td><td>0</td><td>an</td><td>0</td><td>regular_ll</td></tr>\n",
"  </tbody>\n",
"</table>
" ], "text/plain": [ " centre shortName typeOfLevel level dataDate dataTime stepRange \\\n", "0 ecmf t isobaricInhPa 500 20070101 1200 0 \n", "1 ecmf z isobaricInhPa 500 20070101 1200 0 \n", "2 ecmf t isobaricInhPa 850 20070101 1200 0 \n", "3 ecmf z isobaricInhPa 850 20070101 1200 0 \n", "\n", " dataType number gridType \n", "0 an 0 regular_ll \n", "1 an 0 regular_ll \n", "2 an 0 regular_ll \n", "3 an 0 regular_ll " ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "md = ds.metadata()\n", "v = ds.to_array() + 2\n", "\n", "r1 = ekd.SimpleFieldList()\n", "for k in range(len(md)):\n", " r1.append(ekd.ArrayField(v[k], md[k]))\n", "r1.ls()" ] }, { "cell_type": "markdown", "id": "78b1c896-a892-4192-99a1-bc9dc1cfcd4f", "metadata": { "editable": true, "raw_mimetype": "", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "### Saving to GRIB" ] }, { "cell_type": "raw", "id": "dc2332e6-434c-406f-9f03-567e16afc876", "metadata": { "editable": true, "raw_mimetype": "text/restructuredtext", "slideshow": { "slide_type": "" }, "tags": [] }, "source": [ "We can save array fieldlists into GRIB." ] }, { "cell_type": "code", "execution_count": 18, "id": "9ceea7b3-5059-4378-9a26-d282caa3b74a", "metadata": { "editable": true, "slideshow": { "slide_type": "" }, "tags": [] }, "outputs": [ { "data": { "text/html": [ "
<table>\n",
"  <thead>\n",
"    <tr><th></th><th>centre</th><th>shortName</th><th>typeOfLevel</th><th>level</th><th>dataDate</th><th>dataTime</th><th>stepRange</th><th>dataType</th><th>number</th><th>gridType</th></tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr><th>0</th><td>ecmf</td><td>t</td><td>isobaricInhPa</td><td>500</td><td>20070101</td><td>1200</td><td>0</td><td>an</td><td>0</td><td>regular_ll</td></tr>\n",
"    <tr><th>1</th><td>ecmf</td><td>z</td><td>isobaricInhPa</td><td>500</td><td>20070101</td><td>1200</td><td>0</td><td>an</td><td>0</td><td>regular_ll</td></tr>\n",
"    <tr><th>2</th><td>ecmf</td><td>t</td><td>isobaricInhPa</td><td>850</td><td>20070101</td><td>1200</td><td>0</td><td>an</td><td>0</td><td>regular_ll</td></tr>\n",
"    <tr><th>3</th><td>ecmf</td><td>z</td><td>isobaricInhPa</td><td>850</td><td>20070101</td><td>1200</td><td>0</td><td>an</td><td>0</td><td>regular_ll</td></tr>\n",
"  </tbody>\n",
"</table>
" ], "text/plain": [ " centre shortName typeOfLevel level dataDate dataTime stepRange \\\n", "0 ecmf t isobaricInhPa 500 20070101 1200 0 \n", "1 ecmf z isobaricInhPa 500 20070101 1200 0 \n", "2 ecmf t isobaricInhPa 850 20070101 1200 0 \n", "3 ecmf z isobaricInhPa 850 20070101 1200 0 \n", "\n", " dataType number gridType \n", "0 an 0 regular_ll \n", "1 an 0 regular_ll \n", "2 an 0 regular_ll \n", "3 an 0 regular_ll " ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "path = \"_from_pytroch.grib\"\n", "r1.to_target(\"file\", path)\n", "ds1 = ekd.from_source(\"file\", path)\n", "ds1.ls()" ] } ], "metadata": { "kernelspec": { "display_name": "dev", "language": "python", "name": "dev" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.1" } }, "nbformat": 4, "nbformat_minor": 5 }