GRIB: modifying metadata¶
This notebook demonstrates how to modify the metadata in GRIB fields.
First we read some GRIB data containing pressure level fields.
[1]:
import datetime
import earthkit.data as ekd
fl = ekd.from_source("sample", "tuv_pl.grib").to_fieldlist()
We will use the first field in the rest of the notebook.
[2]:
f = fl[0]
f.ls()
[2]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | t | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 1000 | pressure | 0 | regular_ll |
Using set()¶
The metadata of a field can be changed with set(). It does not modify the field in place but returns a new field with the updated metadata.
The preferred way is to use the high-level field metadata keys whenever possible.
[3]:
f1 = f.set({"parameter.variable": "u", "parameter.units": "m/s", "vertical.level": 500})
f1.ls()
[3]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | u | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 500 | pressure | 0 | regular_ll |
Raw (ecCodes GRIB) metadata keys can also be used if needed, but only for fields created from GRIB data.
[4]:
f1 = f.set({"metadata.shortName": "u", "metadata.level": 500})
f1.ls()
[4]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | u | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 500 | pressure | 0 | regular_ll |
The two types of keys can be mixed. In this case the high-level keys are applied first, followed by the raw keys (those prefixed with “metadata.”).
[5]:
f1 = f.set({"parameter.variable": "u", "parameter.units": "m/s", "metadata.level": 500})
f1.ls()
[5]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | u | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 500 | pressure | 0 | regular_ll |
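The ordering rule for mixed keys can be pictured with a small standalone sketch. The helper `order_keys` below is hypothetical and is not part of earthkit-data; it only illustrates the rule stated above (high-level keys first, then keys prefixed with “metadata.”), under the assumption that the order within each group follows the order in the dict.

```python
# Illustrative only: order_keys is a hypothetical helper, not an earthkit-data function.
def order_keys(updates):
    """Return the keys of ``updates`` with high-level keys first,
    followed by raw keys (those prefixed with "metadata.")."""
    high_level = [k for k in updates if not k.startswith("metadata.")]
    raw = [k for k in updates if k.startswith("metadata.")]
    return high_level + raw


updates = {
    "metadata.level": 500,
    "parameter.variable": "u",
    "parameter.units": "m/s",
}
ordered = order_keys(updates)
print(ordered)
```

Whatever position the raw key has in the dict, it ends up after the high-level keys in the returned list.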
Note that the high-level keys and the raw ecCodes GRIB keys behave differently in set() with respect to the associated GRIB message. See the “Modified fields and the associated GRIB message” section below for details.
Setting time¶
Keys of the “time” field component accept values in multiple formats. By default a “datetime” key takes a datetime.datetime object and a “step” key takes a datetime.timedelta object.
[6]:
f1 = f.set({"time.base_datetime": datetime.datetime(2000, 12, 18, 12), "time.step": datetime.timedelta(hours=6)})
f1.ls()
[6]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | t | 2000-12-18 18:00:00 | 2000-12-18 12:00:00 | 0 days 06:00:00 | 1000 | pressure | 0 | regular_ll |
On top of that, many compatible formats can also be used, e.g.:
for datetime: ISO date strings, numpy datetime64 values, integers as yyyymmdd (the hour is assumed to be 0 in this case)
for timedelta: integers (as hours), strings like “6s”, “6m”, “6h” (for seconds, minutes or hours)
[7]:
f1 = f.set({"time.base_datetime": "2000-12-18T12", "time.step": 6})
f1.ls()
[7]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | t | 2000-12-18 18:00:00 | 2000-12-18 12:00:00 | 0 days 06:00:00 | 1000 | pressure | 0 | regular_ll |
Setting the step will automatically update the valid time too.
[8]:
f1 = f.set({"time.step": "10s"})
f1.ls()
[8]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | t | 2018-08-01 12:00:10 | 2018-08-01 12:00:00 | 0 days 00:00:10 | 1000 | pressure | 0 | regular_ll |
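To make the accepted formats more concrete, here is a minimal standard-library sketch of how such values could be normalised and how the valid time follows from the base time and the step. The helpers `to_datetime` and `to_timedelta` are hypothetical names introduced for illustration; earthkit-data performs its own conversion internally, and numpy datetime64 input is left out here to keep the example dependency-free.

```python
import datetime


# Hypothetical helpers for illustration only; not earthkit-data functions.
def to_datetime(value):
    """Convert a datetime, an ISO date string or a yyyymmdd integer to a datetime."""
    if isinstance(value, datetime.datetime):
        return value
    if isinstance(value, int):
        # yyyymmdd integer: the hour is assumed to be 0
        year, rest = divmod(value, 10000)
        month, day = divmod(rest, 100)
        return datetime.datetime(year, month, day)
    if isinstance(value, str):
        return datetime.datetime.fromisoformat(value)
    raise TypeError(f"Unsupported datetime value: {value!r}")


def to_timedelta(value):
    """Convert a timedelta, an integer (hours) or a string like "6s", "6m", "6h"."""
    if isinstance(value, datetime.timedelta):
        return value
    if isinstance(value, int):
        return datetime.timedelta(hours=value)
    if isinstance(value, str):
        units = {"s": "seconds", "m": "minutes", "h": "hours"}
        number, unit = int(value[:-1]), value[-1]
        return datetime.timedelta(**{units[unit]: number})
    raise TypeError(f"Unsupported step value: {value!r}")


# The valid time is the base time plus the step
base = to_datetime("2000-12-18T12")
step = to_timedelta(6)
valid = base + step
print(valid)
```

With the base time of the sample field (2018-08-01 12:00:00) and a step of “10s”, the same arithmetic gives the valid time shown in the table above.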
Setting components¶
Whole field components can also be set with set(). The simplest way is to specify a component as a dict. For example, the following cell sets a new “time” component on the field.
[9]:
f1 = f.set(time={"base_datetime": "2000-12-18T12", "step": 6})
f1.ls()
[9]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | t | 2000-12-18 18:00:00 | 2000-12-18 12:00:00 | 0 days 06:00:00 | 1000 | pressure | 0 | regular_ll |
If the dict does not fully specify the component, an exception is raised. For example, “step” on its own does not define a time component.
[10]:
try:
    f.set(time={"step": 6})
except Exception as e:
    print(e)
Cannot create ForecastTime from keys: ['step'].
Saving the modified field to disk¶
We change the level and save the modified field into a GRIB file.
[11]:
f1 = f.set({"vertical.level": 500})
f1.to_target("file", "_res_lev.grib")
# read the data back and list its metadata to confirm the new level
f1_w = ekd.from_source("file", "_res_lev.grib").to_fieldlist()
f1_w.ls()
[11]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | t | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 500 | pressure | 0 | regular_ll |
Modified fields and the associated GRIB message¶
When a field was created from GRIB data, the associated GRIB message can be accessed with message().
[12]:
f.message()[:10]
[12]:
b'GRIB\x00\x00\x96\x01\x00\x00'
When high-level field metadata is modified, the GRIB message is not updated, so it would be out of sync with the high-level field components. As a result, the message is no longer accessible from the new field. The same applies to the raw ecCodes GRIB metadata.
[13]:
f1 = f.set({"vertical.level": 500})
f1.message() # returns None
[14]:
f1.get("metadata.level") # returns None
[15]:
try:
    f1.metadata("level")
except KeyError as e:
    print(e)
'Key metadata.level not found in field'
If we want to keep a valid associated GRIB message in the modified field we need to call sync(). This will create a new GRIB handle, update the relevant metadata in it and create a new field out of it.
[16]:
f1 = f1.sync()
f1.get("metadata.level")
[16]:
500
[17]:
f1.metadata("level")
[17]:
500
Alternatively, we can use the sync=True kwarg in set() to execute the syncing as part of the setting process.
[18]:
f1 = f.set({"vertical.level": 500}, sync=True)
f1.message()[:10]
[18]:
b'GRIB\x00\x00\x96\x01\x00\x00'
[19]:
f1.get("metadata.level")
[19]:
500
When only raw metadata keys are used in set(), the GRIB message stays available in the new field, so there is no need for syncing.
[20]:
f1 = f.set({"metadata.level": 500})
f1.message()[:10]
[20]:
b'GRIB\x00\x00\x96\x01\x00\x00'
[21]:
f1.get("metadata.level")
[21]:
500
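The behaviour described in this section can be summarised with a toy model. The class `ToyField` below is purely hypothetical and does not reflect how earthkit-data is implemented; it only mimics the observable rules shown above: a high-level change drops the raw side, sync() restores it, and a raw change keeps both views consistent.

```python
from dataclasses import dataclass
from typing import Optional


# A conceptual toy model; ToyField is hypothetical and not an earthkit-data class.
@dataclass(frozen=True)
class ToyField:
    level: int                # stands in for the high-level "vertical.level"
    raw_level: Optional[int]  # stands in for the raw "metadata.level" (None: no message)

    def set_level(self, level, sync=False):
        # A high-level change returns a new object without the raw side,
        # unless syncing is requested as part of the call.
        new = ToyField(level=level, raw_level=None)
        return new.sync() if sync else new

    def set_raw_level(self, level):
        # A raw change updates the message, and the high-level view follows it.
        return ToyField(level=level, raw_level=level)

    def sync(self):
        # Rebuild the raw side from the high-level value.
        return ToyField(level=self.level, raw_level=self.level)


f = ToyField(level=1000, raw_level=1000)

f1 = f.set_level(500)             # raw side is gone
f2 = f1.sync()                    # raw side restored
f3 = f.set_level(500, sync=True)  # set and sync in one step
f4 = f.set_raw_level(500)         # raw key only, no syncing needed

print(f1.raw_level, f2.raw_level, f3.raw_level, f4.raw_level)
```

In every case the original object `f` is left unchanged, which matches the way set() returns a new field instead of modifying the existing one.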