Using GRIB data¶
We will work with a GRIB file containing 6 messages. First we ensure the example file is available, then read the file with from_source().
[1]:
import earthkit.data as ekd
ekd.download_example_file("test6.grib")
[2]:
ds_in = ekd.from_source("file", "test6.grib")
ds_in
[2]:
| path | test6.grib |
| size | 1.4 KiB |
| types | fieldlist, pandas, xarray, numpy, array |
We load the GRIB data into a fieldlist.
[3]:
ds = ds_in.to_fieldlist()
Iteration¶
A GRIB data object is represented by a FieldList, which is a collection of Fields. During iteration, the GRIB data is loaded automatically and released when it goes out of scope (assuming the default settings are used). As a result, only one GRIB message is kept in memory at a time during the iteration:
[4]:
for f in ds:
    print(f)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 1000, pressure, 0, regular_ll)
Field(t, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(u, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Field(v, 2018-08-01 12:00:00, 2018-08-01 12:00:00, 0:00:00, 850, pressure, 0, regular_ll)
Inspecting the contents¶
We can use len() to get the number of fields and ls() to list the fields in the fieldlist.
[5]:
len(ds)
[5]:
6
[6]:
ds.ls()
[6]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | t | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 1000 | pressure | 0 | regular_ll |
| 1 | u | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 1000 | pressure | 0 | regular_ll |
| 2 | v | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 1000 | pressure | 0 | regular_ll |
| 3 | t | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 850 | pressure | 0 | regular_ll |
| 4 | u | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 850 | pressure | 0 | regular_ll |
| 5 | v | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 850 | pressure | 0 | regular_ll |
Slicing¶
Standard Python indexing and slicing are available. Neither involves loading or copying GRIB data.
[7]:
g = ds[1]
g.ls()
[7]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | u | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 1000 | pressure | 0 | regular_ll |
[8]:
g = ds[1:3]
g.ls()
[8]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | u | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 1000 | pressure | 0 | regular_ll |
| 1 | v | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 1000 | pressure | 0 | regular_ll |
[9]:
g = ds[-1]
g.ls()
[9]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | v | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 850 | pressure | 0 | regular_ll |
Getting data values¶
Using values¶
The values property always returns a flat array per field:
[10]:
v = ds[0].values
v.shape
[10]:
(84,)
[11]:
v[0:4]
[11]:
array([272.56417847, 272.56417847, 272.56417847, 272.56417847])
When called on the whole fieldlist, values returns a 2D array with one row per field:
[12]:
v = ds.values
v.shape
[12]:
(6, 84)
Using to_numpy()¶
With to_numpy(), the returned array has the field's shape:
[13]:
v = ds[0].to_numpy()
v.shape
[13]:
(7, 12)
to_numpy() behaves in the same way when called on a fieldlist:
[14]:
v = ds.to_numpy()
v.shape
[14]:
(6, 7, 12)
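The difference between the two accessors is only one of shape, which can be illustrated with plain NumPy. The sketch below uses synthetic numbers rather than the GRIB file, and it assumes the flat array is laid out row by row (C order) so that a reshape recovers the 2D grid; the memory layout is not stated here, so treat that as an assumption. The names flat and shaped are illustrative.

```python
import numpy as np

# Sizes taken from the outputs above: 6 fields on a 7 x 12 grid.
n_fields, n_lat, n_lon = 6, 7, 12

# Synthetic stand-in for ds.values: one flat row of 84 numbers per field.
flat = np.arange(n_fields * n_lat * n_lon, dtype=np.float64).reshape(
    n_fields, n_lat * n_lon
)

# Assumed C-order layout: reshaping each row gives a (7, 12) grid,
# mirroring the shape reported by to_numpy().
shaped = flat.reshape(n_fields, n_lat, n_lon)

print(flat.shape)    # (6, 84)
print(shaped.shape)  # (6, 7, 12)

# Flattening the shaped array gives back the original flat values.
assert np.array_equal(shaped.reshape(n_fields, -1), flat)
```

Under this assumption, element [k, i, j] of the shaped array corresponds to element [k, i * 12 + j] of the flat one.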
Fields and metadata¶
A fieldlist is made up of fields. We can use the automatic display to get a quick overview of what a field contains.
[15]:
ds[0]
[15]:
| number_of_values | 84 |
| array_type | ndarray |
| array_dtype | float64 |
| variable | t |
| standard_name | air_temperature |
| long_name | Temperature |
| units | kelvin |
| chem_variable | None |
| valid_datetime | 2018-08-01 12:00:00 |
| base_datetime | 2018-08-01 12:00:00 |
| step | 0:00:00 |
| level | 1000 |
| layer | None |
| level_type | pressure |
| member | 0 |
| grid_spec | {'grid': [30, 30]} |
| grid_type | regular_ll |
| shape | (7, 12) |
| area | (90.0, 0.0, -90.0, 330.0) |
We use get() to access metadata. It works on individual fields, slices, and whole fieldlists. The supported metadata keys are format independent.
[16]:
ds[0].get("vertical.level")
[16]:
1000
[17]:
ds[0:2].get(["vertical.level", "parameter.variable"])
[17]:
[[1000, 't'], [1000, 'u']]
We can also call get() on a fieldlist:
[18]:
ds.get("vertical.level")
[18]:
[1000, 1000, 1000, 850, 850, 850]
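The output above holds one value per field, in the same order as the ls() listing, so it can be combined with ordinary Python tools. The following sketch hard-codes the values shown in the outputs instead of reading the file, and groups field positions by pressure level. The names levels, variables, and by_level are illustrative only.

```python
from collections import defaultdict

# Values copied from the get() and ls() outputs above (not read from the file).
levels = [1000, 1000, 1000, 850, 850, 850]
variables = ["t", "u", "v", "t", "u", "v"]

# Map each level to the (index, variable) pairs found on it.
by_level = defaultdict(list)
for index, (level, variable) in enumerate(zip(levels, variables)):
    by_level[level].append((index, variable))

# Print the levels from the highest pressure down.
for level in sorted(by_level, reverse=True):
    print(level, by_level[level])
```

The indices collected this way are plain integer positions in the fieldlist.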
The underlying ecCodes GRIB metadata keys can also be accessed.
[19]:
ds[0:2].metadata(["level", "shortName", "centre"])
[19]:
[[1000, 't', 'ecmf'], [1000, 'u', 'ecmf']]
Selection¶
sel() offers metadata-based selection and always creates a “view”, so no copying of GRIB data is involved.
[20]:
g = ds.sel({"parameter.variable": ["u", "v"], "vertical.level": 850})
g.ls()
[20]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | u | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 850 | pressure | 0 | regular_ll |
| 1 | v | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 850 | pressure | 0 | regular_ll |
[21]:
g = ds.sel({"parameter.variable": "t"})
g.ls()
[21]:
| | parameter.variable | time.valid_datetime | time.base_datetime | time.step | vertical.level | vertical.level_type | ensemble.member | geography.grid_type |
|---|---|---|---|---|---|---|---|---|
| 0 | t | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 1000 | pressure | 0 | regular_ll |
| 1 | t | 2018-08-01 12:00:00 | 2018-08-01 12:00:00 | 0 days | 850 | pressure | 0 | regular_ll |
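The matching rule used in the two calls above (a scalar means equality, a list means "any of these values", and all keys must match) can be sketched in pure Python. This is a toy model over hard-coded metadata, not earthkit-data's implementation; the helper matches() and the records list are hypothetical.

```python
def matches(record, criteria):
    """Return True if the record satisfies every criterion."""
    for key, wanted in criteria.items():
        # A list or tuple means "any of these values"; a scalar means equality.
        allowed = wanted if isinstance(wanted, (list, tuple)) else [wanted]
        if record[key] not in allowed:
            return False
    return True


# Metadata copied from the ls() output: t, u, v on 1000, then t, u, v on 850.
records = [
    {"parameter.variable": variable, "vertical.level": level}
    for level in (1000, 850)
    for variable in ("t", "u", "v")
]

criteria = {"parameter.variable": ["u", "v"], "vertical.level": 850}

# Keep only the positions of matching records; nothing is copied,
# loosely analogous to a view.
selected = [i for i, record in enumerate(records) if matches(record, criteria)]
print(selected)  # [4, 5]

t_only = [i for i, r in enumerate(records) if matches(r, {"parameter.variable": "t"})]
print(t_only)  # [0, 3]
```

The selected positions correspond to the rows shown in the two outputs above: u and v on 850, and t on both levels.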
Xarray¶
Converting GRIB to Xarray is based on the earthkit Xarray engine (part of earthkit-data).
[22]:
ds.to_xarray()
[22]:
<xarray.Dataset> Size: 4kB
Dimensions:    (level: 2, latitude: 7, longitude: 12)
Coordinates:
  * level      (level) int64 16B 850 1000
  * latitude   (latitude) float64 56B 90.0 60.0 30.0 0.0 -30.0 -60.0 -90.0
  * longitude  (longitude) float64 96B 0.0 30.0 60.0 90.0 ... 270.0 300.0 330.0
Data variables:
    t          (level, latitude, longitude) float64 1kB ...
    u          (level, latitude, longitude) float64 1kB ...
    v          (level, latitude, longitude) float64 1kB ...
Attributes:
    Conventions:  CF-1.8
    institution:  ECMWF
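The latitude and longitude coordinates in this Dataset are consistent with the area and grid increment reported in the field summary earlier. As a rough check, the sketch below rebuilds them in pure Python from those values. It assumes the area tuple is ordered north, west, south, east, which matches the coordinate values shown but is not stated explicitly; all variable names are illustrative.

```python
# Values copied from the field summary (area and grid_spec).
north, west, south, east = 90.0, 0.0, -90.0, 330.0  # assumed N, W, S, E order
dlon, dlat = 30, 30  # grid increment in degrees

# Number of points along each axis, end points included.
n_lat = int(round((north - south) / dlat)) + 1
n_lon = int(round((east - west) / dlon)) + 1

# Latitudes run north to south, longitudes west to east.
latitudes = [north - i * dlat for i in range(n_lat)]
longitudes = [west + j * dlon for j in range(n_lon)]

print((n_lat, n_lon))    # (7, 12)
print(n_lat * n_lon)     # 84
print(latitudes)
print(longitudes[:4], longitudes[-1])
```

The resulting 7 by 12 grid accounts for the 84 values per field seen in the values examples.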