GRIB: using array fieldlists

In this example we will use a GRIB file containing 4 messages. First we ensure the file is available and read it into a fieldlist.

[1]:
import earthkit.data as ekd
ekd.download_example_file("test4.grib")
ds_in = ekd.from_source("file", "test4.grib")

Using the to_fieldlist() method we can convert this object into an array fieldlist, where each field contains an array (holding the field values) and a RestrictedGribMetadata object representing the related metadata. Array fieldlists are stored entirely in memory. The resulting array format is controlled by the array_namespace keyword argument of to_fieldlist(). With its default value (None), the underlying array format of the original fieldlist is kept. For GRIB data read from a file or stream this will be “numpy”.

Numpy array fieldlist

The “numpy” fieldlist we generate in the cell below works in exactly the same way as the original one, but stores all the data in memory.

[2]:
ds = ds_in.to_fieldlist()
[3]:
len(ds)
[3]:
4

PyTorch array fieldlist

For the next example we choose the “torch” array namespace. Since PyTorch is an optional dependency of earthkit-data, we need to ensure it is installed in the environment.

[4]:
!pip install torch --quiet
[5]:
ds = ds_in.to_fieldlist(array_namespace="torch")
[6]:
ds.ls()
[6]:
centre shortName typeOfLevel level dataDate dataTime stepRange dataType number gridType
0 ecmf t isobaricInhPa 500 20070101 1200 0 an 0 regular_ll
1 ecmf z isobaricInhPa 500 20070101 1200 0 an 0 regular_ll
2 ecmf t isobaricInhPa 850 20070101 1200 0 an 0 regular_ll
3 ecmf z isobaricInhPa 850 20070101 1200 0 an 0 regular_ll

values

With the “torch” namespace, both Field.values and FieldList.values now return a PyTorch tensor.

[7]:
ds[0].values[:10]
[7]:
tensor([228.0460, 228.0460, 228.0460, 228.0460, 228.0460, 228.0460, 228.0460,
        228.0460, 228.0460, 228.0460], dtype=torch.float64)
[8]:
ds[0].values.shape
[8]:
torch.Size([65160])
[9]:
ds.values.shape
[9]:
torch.Size([4, 65160])

to_array()

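Field.to_array() and FieldList.to_array() return the values in the array format of the underlying namespace, here a PyTorch tensor. By default each field keeps its grid shape; with flatten=True each field is returned as a one-dimensional array.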

[10]:
ds[0].to_array()[:2,:2]
[10]:
tensor([[228.0460, 228.0460],
        [228.6085, 228.5792]], dtype=torch.float64)
[11]:
ds.to_array().shape
[11]:
torch.Size([4, 181, 360])
[12]:
ds.to_array(flatten=True).shape
[12]:
torch.Size([4, 65160])
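
The two shapes above are consistent with each other: 181 x 360 grid points give 65160 values per field. The sketch below illustrates that relationship with a plain NumPy stand-in array. It does not use earthkit-data, and it assumes that flattening is equivalent to a row-major reshape, which the source does not state explicitly.

```python
import numpy as np

# Stand-in for 4 fields on a 181 x 360 grid (arbitrary values,
# not the GRIB data from the notebook).
stacked = np.arange(4 * 181 * 360, dtype=np.float64).reshape(4, 181, 360)

# Collapse the two grid dimensions into one, keeping the field dimension.
flat = stacked.reshape(stacked.shape[0], -1)

print(stacked.shape)
print(flat.shape)
```

The field dimension is kept as the first axis in both layouts, so indexing with [k] selects the same field either way.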

to_numpy()

Field.to_numpy() and FieldList.to_numpy() still return NumPy ndarrays, regardless of the array namespace of the fieldlist.

[13]:
ds[0].to_numpy()[:2,:2]
[13]:
array([[228.04600525, 228.04600525],
       [228.60850525, 228.57920837]])
[14]:
ds.to_numpy().shape
[14]:
(4, 181, 360)

Building array fieldlists with from_array()

We can build a new array fieldlist straight from metadata and array values using from_array(). This can be used for computations when we want to alter the values and store the result in a new FieldList.

[15]:
md = ds.metadata()
v = ds.to_array() + 2
r1 = ekd.FieldList.from_array(v, md)
r1.ls()
[15]:
centre shortName typeOfLevel level dataDate dataTime stepRange dataType number gridType
0 ecmf t isobaricInhPa 500 20070101 1200 0 an 0 regular_ll
1 ecmf z isobaricInhPa 500 20070101 1200 0 an 0 regular_ll
2 ecmf t isobaricInhPa 850 20070101 1200 0 an 0 regular_ll
3 ecmf z isobaricInhPa 850 20070101 1200 0 an 0 regular_ll

As expected, the values in r1 now differ by 2 from the ones in the original fieldlist (ds).

[16]:
r1[0].values[:10]
[16]:
tensor([230.0460, 230.0460, 230.0460, 230.0460, 230.0460, 230.0460, 230.0460,
        230.0460, 230.0460, 230.0460], dtype=torch.float64)
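
Rather than comparing printed values by eye, the offset can be checked over every grid point at once. The following is a minimal sketch using NumPy stand-in arrays; the names original and shifted are hypothetical, and with the notebook objects they would presumably correspond to ds.to_numpy() and r1.to_numpy().

```python
import numpy as np

# Stand-in arrays; the constant 228.046 is only a placeholder value
# and not the actual content of the GRIB file.
original = np.full((4, 181, 360), 228.046)
shifted = original + 2

# Every grid point should differ by the same offset, up to rounding.
offset_ok = np.allclose(shifted - original, 2.0)
print(offset_ok)
```

np.allclose is used instead of an exact comparison because floating-point addition may not reproduce the offset bit for bit.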

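Building an array fieldlist in a loop

Alternatively, we can start from an empty SimpleFieldList and append ArrayField objects to it one at a time, each built from an array and its metadata.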

[17]:
md = ds.metadata()
v = ds.to_array() + 2

r1 = ekd.SimpleFieldList()
for k in range(len(md)):
    r1.append(ekd.ArrayField(v[k], md[k]))
r1.ls()
[17]:
centre shortName typeOfLevel level dataDate dataTime stepRange dataType number gridType
0 ecmf t isobaricInhPa 500 20070101 1200 0 an 0 regular_ll
1 ecmf z isobaricInhPa 500 20070101 1200 0 an 0 regular_ll
2 ecmf t isobaricInhPa 850 20070101 1200 0 an 0 regular_ll
3 ecmf z isobaricInhPa 850 20070101 1200 0 an 0 regular_ll

Saving to GRIB

We can save an array fieldlist to a GRIB file with to_target() and read it back with from_source().

[18]:
path = "_from_pytorch.grib"
r1.to_target("file", path)
ds1 = ekd.from_source("file", path)
ds1.ls()
[18]:
centre shortName typeOfLevel level dataDate dataTime stepRange dataType number gridType
0 ecmf t isobaricInhPa 500 20070101 1200 0 an 0 regular_ll
1 ecmf z isobaricInhPa 500 20070101 1200 0 an 0 regular_ll
2 ecmf t isobaricInhPa 850 20070101 1200 0 an 0 regular_ll
3 ecmf z isobaricInhPa 850 20070101 1200 0 an 0 regular_ll