GRIB: using array namespaces
In this example we will use a GRIB file containing 4 messages.
[1]:
import earthkit.data as ekd
fl_in = ekd.from_source("sample", "test4.grib").to_fieldlist()
Using the to_fieldlist() method, we can convert this object into an in-memory fieldlist where each field stores its values as an array. The array format is controlled by the array_namespace keyword argument of to_fieldlist(). With its default value (None), the underlying array format of the original fieldlist is kept. For GRIB data read from a file or stream this is “numpy”.
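To see which array backend a given values array belongs to, one option is to inspect the package that defines its type. The sketch below is a generic illustration and not part of earthkit-data; backend_name is a hypothetical helper, and the NumPy array is only a stand-in for a field's values.

```python
import numpy as np


def backend_name(array):
    """Return the top-level package that defines the array's type.

    Hypothetical helper for illustration; not an earthkit-data function.
    """
    return type(array).__module__.split(".")[0]


# Stand-in for the values of a field in a "numpy" fieldlist.
sample = np.zeros(4)
print(backend_name(sample))
```

For a NumPy array this reports numpy; for a PyTorch tensor the same check would be expected to report torch.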
NumPy array fieldlist
The “numpy” fieldlist generated in the cell below works in exactly the same way as the original one, but stores all the data in memory.
[2]:
fl = fl_in.to_fieldlist()
len(fl)
[2]:
4
PyTorch array fieldlist
For the next example we choose the “torch” array namespace. Since PyTorch is an optional dependency of earthkit-data, we need to ensure it is installed in the environment.
[3]:
!pip install torch --quiet
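Outside a notebook it can be handy to check whether the optional dependency is available before requesting the “torch” namespace. A minimal sketch using only the standard library follows; is_installed is a hypothetical helper name introduced here for illustration.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be located without importing it."""
    return importlib.util.find_spec(package_name) is not None


# The printed value depends on the current environment.
print(is_installed("torch"))
```

If the check returns False, the package can be installed as in the cell above.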
[4]:
fl = fl_in.to_fieldlist(array_namespace="torch")
[5]:
fl.ls()
[5]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | t | 2007-01-01 12:00:00 | 2007-01-01 12:00:00 | 0 days | 500 | pressure | 0 | regular_ll |
| 1 | z | 2007-01-01 12:00:00 | 2007-01-01 12:00:00 | 0 days | 500 | pressure | 0 | regular_ll |
| 2 | t | 2007-01-01 12:00:00 | 2007-01-01 12:00:00 | 0 days | 850 | pressure | 0 | regular_ll |
| 3 | z | 2007-01-01 12:00:00 | 2007-01-01 12:00:00 | 0 days | 850 | pressure | 0 | regular_ll |
values
When we use either Field.values or FieldList.values we now get a PyTorch tensor.
[6]:
fl[0].values[:10]
[6]:
tensor([228.0460, 228.0460, 228.0460, 228.0460, 228.0460, 228.0460, 228.0460,
228.0460, 228.0460, 228.0460], dtype=torch.float64)
[7]:
fl[0].values.shape
[7]:
torch.Size([65160])
[8]:
fl.values.shape
[8]:
torch.Size([4, 65160])
to_array()
Field.to_array() and FieldList.to_array() return the values as an array of the underlying namespace, which here is a PyTorch tensor.
[9]:
fl[0].to_array()[:2, :2]
[9]:
tensor([[228.0460, 228.0460],
[228.6085, 228.5792]], dtype=torch.float64)
[10]:
fl.to_array().shape
[10]:
torch.Size([4, 181, 360])
[11]:
fl.to_array(flatten=True).shape
[11]:
torch.Size([4, 65160])
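The shapes above are related by a simple reshape: 181 × 360 = 65160 grid points per field. The sketch below reproduces this with NumPy stand-in data, assuming the flattened layout is row-major; it does not call earthkit-data, and the array contents are made up for illustration.

```python
import numpy as np

# Dimensions taken from the shapes printed above.
n_fields, n_lat, n_lon = 4, 181, 360

# Stand-in data; a real fieldlist would supply the actual field values.
grid = np.arange(n_fields * n_lat * n_lon, dtype=np.float64)
grid = grid.reshape(n_fields, n_lat, n_lon)

# Flatten each field to one dimension, keeping the field axis.
flat = grid.reshape(n_fields, -1)
print(flat.shape)  # (4, 65160)

# Restore the per-field 2D grid.
restored = flat.reshape(n_fields, n_lat, n_lon)
print(np.array_equal(restored, grid))  # True
```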
to_numpy()
Field.to_numpy() and FieldList.to_numpy() still return NumPy ndarrays, regardless of the array namespace of the fieldlist.
[12]:
fl[0].to_numpy()[:2, :2]
[12]:
array([[228.04600525, 228.04600525],
[228.60850525, 228.57920837]])
[13]:
fl.to_numpy().shape
[13]:
(4, 181, 360)
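When only the values array is at hand rather than the field itself, a generic conversion to NumPy can be sketched as below. This is an illustration and not how earthkit-data implements to_numpy(); as_numpy is a hypothetical helper, and the detach/cpu branch assumes a tensor-like object that offers those methods, as PyTorch tensors do.

```python
import numpy as np


def as_numpy(array):
    """Convert an array-like object to a numpy.ndarray.

    Hypothetical helper for illustration only.
    """
    if hasattr(array, "detach"):
        # Looks like a PyTorch tensor: drop autograd tracking and move to CPU.
        array = array.detach().cpu()
    return np.asarray(array)


# Works for plain sequences and NumPy arrays as well.
print(type(as_numpy([1.0, 2.0])).__name__)  # ndarray
```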
Building a fieldlist in a loop
The following cell adds 2 to each field value and creates a new fieldlist from the modified fields.
[18]:
fields = []
for f in fl:
    fields.append(f.set(values=f.values + 2.0))
r1 = ekd.create_fieldlist(fields)
r1.ls()
[18]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | t | 2007-01-01 12:00:00 | 2007-01-01 12:00:00 | 0 days | 500 | pressure | 0 | regular_ll |
| 1 | z | 2007-01-01 12:00:00 | 2007-01-01 12:00:00 | 0 days | 500 | pressure | 0 | regular_ll |
| 2 | t | 2007-01-01 12:00:00 | 2007-01-01 12:00:00 | 0 days | 850 | pressure | 0 | regular_ll |
| 3 | z | 2007-01-01 12:00:00 | 2007-01-01 12:00:00 | 0 days | 850 | pressure | 0 | regular_ll |
As expected, the values in r1 now differ by 2 from those in the original fieldlist (fl).
[15]:
r1[0].values[:10]
[15]:
tensor([230.0460, 230.0460, 230.0460, 230.0460, 230.0460, 230.0460, 230.0460,
230.0460, 230.0460, 230.0460], dtype=torch.float64)
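Rather than comparing printed values by eye, the offset can be checked numerically. The sketch below uses NumPy stand-in arrays; in the notebook the two arrays could come from fl.to_numpy() and r1.to_numpy() instead.

```python
import numpy as np

# Stand-in arrays for the original and the modified values.
original = np.full((4, 10), 228.0460)
modified = original + 2.0

# Compare with a tolerance to allow for floating-point rounding.
offset_ok = np.allclose(modified - original, 2.0)
print(offset_ok)  # True
```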
Saving to GRIB
We can save these fieldlists into GRIB.
[16]:
path = "_from_pytorch.grib"
r1.to_target("file", path)
fl1 = ekd.from_source("file", path).to_fieldlist()
fl1.ls()
[16]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | t | 2007-01-01 12:00:00 | 2007-01-01 12:00:00 | 0 days | 500 | pressure | 0 | regular_ll |
| 1 | z | 2007-01-01 12:00:00 | 2007-01-01 12:00:00 | 0 days | 500 | pressure | 0 | regular_ll |
| 2 | t | 2007-01-01 12:00:00 | 2007-01-01 12:00:00 | 0 days | 850 | pressure | 0 | regular_ll |
| 3 | z | 2007-01-01 12:00:00 | 2007-01-01 12:00:00 | 0 days | 850 | pressure | 0 | regular_ll |
[17]:
# the modified values were correctly written to the GRIB file
fl1[0].values[:10]
[17]:
array([230.04600525, 230.04600525, 230.04600525, 230.04600525,
230.04600525, 230.04600525, 230.04600525, 230.04600525,
230.04600525, 230.04600525])
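GRIB typically stores values with a limited number of bits, so values read back from a file are best compared with a tolerance rather than for exact equality. The sketch below uses stand-in arrays built from the numbers printed above (the tensor output is rounded for display); max_abs_diff and the 1e-4 tolerance are assumptions chosen for illustration.

```python
import numpy as np


def max_abs_diff(a, b):
    """Return the largest absolute element-wise difference between two arrays."""
    return float(np.max(np.abs(np.asarray(a) - np.asarray(b))))


# Stand-ins for r1[0].values[:10] (as displayed) and fl1[0].values[:10].
in_memory = np.full(10, 230.0460)
from_file = np.full(10, 230.04600525)

# Hypothetical tolerance; pick one that suits the packing precision of the data.
close_enough = max_abs_diff(in_memory, from_file) < 1e-4
print(close_enough)  # True
```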