GRIB: modifying metadata

This notebook demonstrates how to modify the metadata in GRIB fields.

First we read some GRIB data containing pressure level fields.

[1]:
import datetime

import earthkit.data as ekd

fl = ekd.from_source("sample", "tuv_pl.grib").to_fieldlist()

We will use the first field in the rest of the notebook.

[2]:
f = fl[0]
f.ls()
[2]:
parameter.variable time.valid_datetime time.base_datetime time.step vertical.level vertical.level_type ensemble.member geography.grid_type
0 t 2018-08-01 12:00:00 2018-08-01 12:00:00 0 days 1000 pressure 0 regular_ll

Using set()

A field can be modified by using set(). It does not change the field in place but returns a new field with the updated metadata.

[3]:
f1 = f.set({"parameter.variable": "u", "parameter.units": "m/s", "vertical.level": 500})
f1.ls()
[3]:
parameter.variable time.valid_datetime time.base_datetime time.step vertical.level vertical.level_type ensemble.member geography.grid_type
0 u 2018-08-01 12:00:00 2018-08-01 12:00:00 0 days 500 pressure 0 regular_ll

Only field component metadata keys can be used in set(); raw metadata keys are not allowed. For example, since the field was created from GRIB data it has the raw (GRIB) metadata key metadata.shortName, but we cannot set it. If you need to change the GRIB metadata see the “Changing raw GRIB metadata” section below.

[4]:
print(f.get("metadata.shortName"))
try:
    f.set({"metadata.shortName": "u"})
except Exception as e:
    print(e)
t
'Key metadata.shortName cannot be set on the field.'
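
When a set of changes mixes both kinds of keys, it can help to separate them before calling set(). The following is a minimal sketch in plain Python and not part of earthkit-data: the helper name split_changes is hypothetical, and it assumes that raw keys are exactly the ones prefixed with "metadata.", as with metadata.shortName above.

```python
def split_changes(changes):
    """Split a dict of changes into field component keys and raw keys.

    Assumption: raw keys are the ones prefixed with "metadata.".
    The helper name is hypothetical and not part of earthkit-data.
    """
    prefix = "metadata."
    component, raw = {}, {}
    for key, value in changes.items():
        if key.startswith(prefix):
            # drop the prefix to get the plain raw key name, e.g. "shortName"
            raw[key[len(prefix):]] = value
        else:
            component[key] = value
    return component, raw


component, raw = split_changes({"vertical.level": 500, "metadata.shortName": "u"})
print(component)
print(raw)
```

The component part can then be passed to set(), while the raw part has to go through the encoder-based route described in the “Changing raw GRIB metadata” section.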

Setting time

Keys of the “time” field component accept multiple formats. By default a “datetime” key takes a datetime.datetime object and a “step” key takes a datetime.timedelta object.

[5]:
f1 = f.set({"time.base_datetime": datetime.datetime(2000, 12, 18, 12), "time.step": datetime.timedelta(hours=6)})
f1.ls()
[5]:
parameter.variable time.valid_datetime time.base_datetime time.step vertical.level vertical.level_type ensemble.member geography.grid_type
0 t 2000-12-18 18:00:00 2000-12-18 12:00:00 0 days 06:00:00 1000 pressure 0 regular_ll

On top of that, we can also use many compatible formats, e.g.:

  • for datetime: ISO date strings, numpy datetime64 values, integers as yyyymmdd (the hour is assumed to be 0 in this case)

  • for timedelta: integers (as hours), strings like “6s”, “6m”, “6h” (for seconds, minutes or hours)
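
To make these conventions concrete, here is a rough sketch of how some of the listed inputs map to standard library objects. It only illustrates the conventions above and is not earthkit-data's actual conversion code; the helper names to_datetime and to_timedelta are hypothetical, and numpy datetime64 values are left out to keep the sketch dependency-free.

```python
import datetime
import re

# maps the unit suffix to the matching datetime.timedelta keyword
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours"}


def to_datetime(value):
    """Convert an ISO date string or a yyyymmdd integer to datetime.datetime."""
    if isinstance(value, datetime.datetime):
        return value
    if isinstance(value, int):
        # yyyymmdd integer: the hour is taken as 0
        return datetime.datetime.strptime(str(value), "%Y%m%d")
    return datetime.datetime.fromisoformat(value)


def to_timedelta(value):
    """Convert an integer (hours) or a string like "6s", "6m", "6h"."""
    if isinstance(value, datetime.timedelta):
        return value
    if isinstance(value, int):
        return datetime.timedelta(hours=value)
    match = re.fullmatch(r"(\d+)([smh])", value)
    if match is None:
        raise ValueError(f"Unsupported step: {value!r}")
    amount, unit = match.groups()
    return datetime.timedelta(**{_UNITS[unit]: int(amount)})


print(to_datetime(20001218))
print(to_datetime("2000-12-18T12:00"))
print(to_timedelta(6))
print(to_timedelta("10s"))
```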

[6]:
f1 = f.set({"time.base_datetime": "2000-12-18T12", "time.step": 6})
f1.ls()
[6]:
parameter.variable time.valid_datetime time.base_datetime time.step vertical.level vertical.level_type ensemble.member geography.grid_type
0 t 2000-12-18 18:00:00 2000-12-18 12:00:00 0 days 06:00:00 1000 pressure 0 regular_ll

Setting the step automatically updates the valid time too.

[7]:
f1 = f.set({"time.step": "10s"})
f1.ls()
[7]:
parameter.variable time.valid_datetime time.base_datetime time.step vertical.level vertical.level_type ensemble.member geography.grid_type
0 t 2018-08-01 12:00:10 2018-08-01 12:00:00 0 days 00:00:10 1000 pressure 0 regular_ll
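
The valid time shown above is consistent with adding the step to the base time. As a quick check with the standard library only (no earthkit-data calls involved):

```python
import datetime

base = datetime.datetime(2018, 8, 1, 12)  # time.base_datetime of the field
step = datetime.timedelta(seconds=10)     # the "10s" step set above

valid = base + step
print(valid)  # 2018-08-01 12:00:10
```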

Setting components

Whole field components can also be set with set(). The simplest way is to specify them as a dict. E.g. the following cell sets a new “time” component on the field.

[8]:
f1 = f.set(time={"base_datetime": "2000-12-18T12", "step": 6})
f1.ls()
[8]:
parameter.variable time.valid_datetime time.base_datetime time.step vertical.level vertical.level_type ensemble.member geography.grid_type
0 t 2000-12-18 18:00:00 2000-12-18 12:00:00 0 days 06:00:00 1000 pressure 0 regular_ll

If the dict does not fully specify the component, an exception is raised. E.g. “step” on its own does not define a time component.

[9]:
try:
    f.set(time={"step": 6})
except Exception as e:
    print(e)
Cannot create ForecastTime from keys: ['step'].

Saving the modified field to disk

We change the level and save the modified field into a GRIB file.

[10]:
f1 = f.set({"vertical.level": 500})
f1.to_target("file", "_res_lev.grib")

# read the data back and list the metadata of the field
f1_w = ekd.from_source("file", "_res_lev.grib").to_fieldlist()
f1_w.ls()
[10]:
parameter.variable time.valid_datetime time.base_datetime time.step vertical.level vertical.level_type ensemble.member geography.grid_type
0 t 2018-08-01 12:00:00 2018-08-01 12:00:00 0 days 500 pressure 0 regular_ll

Modified fields and the associated GRIB message

When a field was created from GRIB data the associated GRIB message can be accessed via the field with message().

[11]:
f.message()[:10]
[11]:
b'GRIB\x00\x00\x96\x01\x00\x00'

After modifying the field metadata, the GRIB message is not updated and can no longer be accessed from the new field. The same is true for any raw GRIB metadata.

[12]:
f1 = f.set({"vertical.level": 500})
f1.message()
[13]:
f1.get("metadata.shortName")
[14]:
try:
    f1.metadata("shortName")
except KeyError as e:
    print(e)
'Key metadata.shortName not found in field'

If we want to keep a valid associated GRIB message in the modified field we need to call sync(). This will create a new GRIB handle, update the relevant metadata in it and create a new field out of it.

[15]:
f1 = f1.sync()
f1.get(["metadata.shortName", "metadata.level"])
[15]:
['t', 500]
[16]:
f1.metadata(["shortName", "level"])
[16]:
['t', 500]
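
If the modified field always needs a valid GRIB message, the two steps can be wrapped into one small helper. This is only a sketch: the name set_and_sync is hypothetical, and it assumes nothing beyond the set() and sync() calls used in this notebook.

```python
def set_and_sync(field, changes):
    """Return a modified copy of ``field`` with a matching GRIB message.

    Assumption: ``field`` offers set() and sync() as used in this notebook.
    The helper name is hypothetical and not part of earthkit-data.
    """
    # set() returns a new field whose GRIB message is not available
    modified = field.set(changes)
    # sync() returns a new field created from an updated GRIB handle
    return modified.sync()
```

With the field from above it could be called as set_and_sync(f, {"vertical.level": 500}).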

Alternatively, if your workflow is strictly GRIB-bound you can carry out the field modification via the GribEncoder as shown in the next section.

Changing raw GRIB metadata

Currently, changing the (raw) GRIB metadata in a field requires a GribEncoder. When we call its encode() method it clones the underlying GRIB message, sets the GRIB metadata on the clone and returns an object that can be converted to a field.

[17]:
encoder = ekd.create_encoder("grib")
r = encoder.encode(template=f, metadata={"shortName": "u"})
f1 = r.to_field()
f1.ls()
[17]:
parameter.variable time.valid_datetime time.base_datetime time.step vertical.level vertical.level_type ensemble.member geography.grid_type
0 u 2018-08-01 12:00:00 2018-08-01 12:00:00 0 days 1000 pressure 0 regular_ll