GRIB: modifying metadata
This notebook demonstrates how to modify the metadata in GRIB fields.
First we read some GRIB data containing pressure level fields.
[1]:
import datetime
import earthkit.data as ekd
fl = ekd.from_source("sample", "tuv_pl.grib").to_fieldlist()
We will use the first field in the rest of the notebook.
[2]:
f = fl[0]
f.ls()
[2]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | t | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 1000 | pressure | 0 | regular_ll |
Using set()
A field can be modified with set(). The original field is left unchanged; set() returns a new field with the updated metadata.
[3]:
f1 = f.set({"parameter.variable": "u", "parameter.units": "m/s", "vertical.level": 500})
f1.ls()
[3]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | u | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 500 | pressure | 0 | regular_ll |
Only field component metadata keys can be used in set(); raw metadata keys are not allowed. For example, since the field was created from GRIB data, it has the raw (GRIB) metadata key metadata.shortName, but we cannot set it. If you need to change the GRIB metadata, see the “Changing raw GRIB metadata” section below.
[4]:
print(f.get("metadata.shortName"))
try:
    f.set({"metadata.shortName": "u"})
except Exception as e:
    print(e)
t
'Key metadata.shortName cannot be set on the field.'
Setting time
Keys of the “time” field component can be set using multiple formats. By default, a “datetime” key takes a datetime.datetime object and a “step” key takes a datetime.timedelta object.
[5]:
f1 = f.set({"time.base_datetime": datetime.datetime(2000, 12, 18, 12), "time.step": datetime.timedelta(hours=6)})
f1.ls()
[5]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | t | 2000-12-18 18:00:00 | 2000-12-18 12:00:00 | 0 days 06:00:00 | 1000 | pressure | 0 | regular_ll |
In addition, several compatible formats are accepted, e.g.:
for datetime: ISO date strings, numpy datetime64 values, integers as yyyymmdd (the hour is assumed to be 0 in this case)
for timedelta: integers (as hours), strings like “6s”, “6m”, “6h” (for seconds, minutes or hours)
[6]:
f1 = f.set({"time.base_datetime": "2000-12-18T12", "time.step": 6})
f1.ls()
[6]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | t | 2000-12-18 18:00:00 | 2000-12-18 12:00:00 | 0 days 06:00:00 | 1000 | pressure | 0 | regular_ll |
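To make the conversions listed above concrete, here is a minimal standard-library sketch of how such inputs could map to datetime.datetime and datetime.timedelta objects. The helpers to_datetime and to_timedelta are hypothetical names introduced only for illustration; they are not part of earthkit-data and do not claim to reproduce its actual parsing rules (numpy datetime64 values, for instance, are not handled here).

```python
import datetime


def to_datetime(value):
    """Hypothetical helper: ISO string or yyyymmdd integer -> datetime."""
    if isinstance(value, datetime.datetime):
        return value
    if isinstance(value, int):
        # yyyymmdd integer: the hour is taken to be 0
        year, rest = divmod(value, 10000)
        month, day = divmod(rest, 100)
        return datetime.datetime(year, month, day)
    return datetime.datetime.fromisoformat(value)


def to_timedelta(value):
    """Hypothetical helper: integer hours or "6s"/"6m"/"6h" -> timedelta."""
    if isinstance(value, datetime.timedelta):
        return value
    if isinstance(value, int):
        # plain integers are interpreted as hours
        return datetime.timedelta(hours=value)
    units = {"s": "seconds", "m": "minutes", "h": "hours"}
    number, unit = int(value[:-1]), value[-1]
    return datetime.timedelta(**{units[unit]: number})


print(to_datetime(20001218), to_timedelta("6h"))
print(to_datetime("2000-12-18T12:00:00"), to_timedelta(6))
```

In this sketch an integer step and the string “6h” produce the same timedelta, which mirrors the two equivalent ways of specifying a six-hour step.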
Setting the step automatically updates the valid time too.
[7]:
f1 = f.set({"time.step": "10s"})
f1.ls()
[7]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | t | 2018-08-01 12:00:10 | 2018-08-01 12:00:00 | 0 days 00:00:10 | 1000 | pressure | 0 | regular_ll |
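The valid time in the output above is consistent with adding the step to the base datetime. The following standard-library lines reproduce that arithmetic for the values shown; they only illustrate the relationship and make no assumption about how the field computes it internally.

```python
import datetime

# Values taken from the table above
base_datetime = datetime.datetime(2018, 8, 1, 12)
step = datetime.timedelta(seconds=10)

# valid time = base time + step
valid_datetime = base_datetime + step
print(valid_datetime)
```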
Setting components
Whole individual field components can also be set with set(). The simplest way is to specify them as a dict. For example, the following cell sets a new “time” component on the field.
[8]:
f1 = f.set(time={"base_datetime": "2000-12-18T12", "step": 6})
f1.ls()
[8]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | t | 2000-12-18 18:00:00 | 2000-12-18 12:00:00 | 0 days 06:00:00 | 1000 | pressure | 0 | regular_ll |
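The component form used here yields the same output as the dotted-key form used earlier with "time.base_datetime" and "time.step". As a rough illustration of how the two spellings relate, the sketch below regroups dotted keys into one dict per component. The function group_by_component is a hypothetical helper written for this example; it is not an earthkit-data function and says nothing about how set() is implemented.

```python
def group_by_component(flat):
    """Hypothetical helper: {"time.step": 6} -> {"time": {"step": 6}}."""
    grouped = {}
    for key, value in flat.items():
        # split "component.name" at the first dot
        component, _, name = key.partition(".")
        grouped.setdefault(component, {})[name] = value
    return grouped


flat = {"time.base_datetime": "2000-12-18T12", "time.step": 6}
print(group_by_component(flat))
```

The "time" entry of the result matches the dict passed as time=... in the cell above.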
If the dict does not fully specify the component, an exception is raised. For example, “step” on its own does not define a time component.
[9]:
try:
    f.set(time={"step": 6})
except Exception as e:
    print(e)
Cannot create ForecastTime from keys: ['step'].
Saving the modified field to disk
We change the level and save the modified field into a GRIB file.
[10]:
f1 = f.set({"vertical.level": 500})
f1.to_target("file", "_res_lev.grib")
# read back the data and list the metadata of the saved field
f1_w = ekd.from_source("file", "_res_lev.grib").to_fieldlist()
f1_w.ls()
[10]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | t | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 500 | pressure | 0 | regular_ll |
Modified fields and the associated GRIB message
When a field was created from GRIB data, the associated GRIB message can be accessed from the field with message().
[11]:
f.message()[:10]
[11]:
b'GRIB\x00\x00\x96\x01\x00\x00'
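The bytes above are the start of the GRIB indicator section. As a small sketch, the function below decodes the edition number and total message length from such a header, following the standard section 0 layout of GRIB editions 1 and 2. The name parse_indicator is a hypothetical helper for this example only, and the sketch ignores special cases such as non-standard length encodings for very large edition 1 messages.

```python
def parse_indicator(header: bytes):
    """Hypothetical helper: return (edition, total_length) from section 0."""
    if len(header) < 8 or header[:4] != b"GRIB":
        raise ValueError("not the start of a GRIB message")
    edition = header[7]  # octet 8 holds the edition number
    if edition == 1:
        # edition 1: octets 5-7 hold the total message length
        total_length = int.from_bytes(header[4:7], "big")
    elif edition == 2:
        if len(header) < 16:
            raise ValueError("edition 2 needs the first 16 bytes")
        # edition 2: octets 9-16 hold the total message length
        total_length = int.from_bytes(header[8:16], "big")
    else:
        raise ValueError(f"unsupported GRIB edition: {edition}")
    return edition, total_length


# The first 10 bytes shown in the output above
edition, total_length = parse_indicator(b"GRIB\x00\x00\x96\x01\x00\x00")
print(edition, total_length)
```

For the ten bytes shown, this reports edition 1 and a message length of 150 bytes.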
After the field metadata has been modified, the GRIB message is not updated and can no longer be accessed from the new field. The same is true for any raw GRIB metadata.
[12]:
f1 = f.set({"vertical.level": 500})
f1.message()
[13]:
f1.get("metadata.shortName")
[14]:
try:
    f1.metadata("shortName")
except KeyError as e:
    print(e)
'Key metadata.shortName not found in field'
If we want to keep a valid associated GRIB message in the modified field, we need to call sync(). This creates a new GRIB handle, updates the relevant metadata in it and creates a new field from it.
[15]:
f1 = f1.sync()
f1.get(["metadata.shortName", "metadata.level"])
[15]:
['t', 500]
[16]:
f1.metadata(["shortName", "level"])
[16]:
['t', 500]
Alternatively, if your workflow is strictly GRIB-bound, you can carry out the field modification via the GribEncoder as shown in the next section.
Changing raw GRIB metadata
Currently, changing the (raw) GRIB metadata in a field requires using a GribEncoder.
When we call its encode() method, it clones the underlying GRIB message, sets the GRIB metadata on it and returns an object that can be converted to a field.
[17]:
encoder = ekd.create_encoder("grib")
r = encoder.encode(template=f, metadata={"shortName": "u"})
f1 = r.to_field()
f1.ls()
[17]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | u | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 1000 | pressure | 0 | regular_ll |