Writing GRIB data to Zarr
[1]:
# get input GRIB data
import earthkit.data as ekd
ds = ekd.from_source("sample", "pl.grib").to_fieldlist()
This data contains 32 fields: forecasts of 2 parameters (r and t) on 2 pressure levels (500 and 700), for 4 forecast reference times and 2 steps.
Using to_target() on the data object
We use to_target() to write a GRIB fieldlist or field into a Zarr store. Internally, the data is first converted to Xarray, then xarray.Dataset.to_zarr() is called to generate the store. The options for these two steps are passed via earthkit_to_xarray_kwargs and xarray_to_zarr_kwargs, respectively.
[2]:
# with these options each field will be a separate chunk
ds.to_target(
    "zarr",
    earthkit_to_xarray_kwargs={"chunks": {"forecast_reference_time": 1, "step": 1, "level": 1}},
    xarray_to_zarr_kwargs={"store": "_pl.zarr", "mode": "w"},
)
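To see why these options give one chunk per field, the chunk layout can be worked out by hand. The following is a minimal sketch using only the standard library; it is not part of earthkit-data. The helper name `chunk_shape` is hypothetical, the dimension sizes are copied from the outputs shown later on this page, and the sketch assumes that a dimension not listed in `chunks` is kept whole in a single chunk.

```python
from math import ceil, prod

# Dimension sizes of each variable, copied from the outputs on this page
dims = {
    "forecast_reference_time": 4,
    "step": 2,
    "level": 2,
    "latitude": 19,
    "longitude": 36,
}

# Same chunk specification as passed in earthkit_to_xarray_kwargs
chunks = {"forecast_reference_time": 1, "step": 1, "level": 1}


def chunk_shape(dims, chunks):
    """Hypothetical helper: dimensions missing from `chunks` stay whole (assumption)."""
    return tuple(chunks.get(name, size) for name, size in dims.items())


shape = tuple(dims.values())
cshape = chunk_shape(dims, chunks)

# Number of chunks per variable: one per (forecast_reference_time, step, level)
n_chunks = prod(ceil(s / c) for s, c in zip(shape, cshape))

# Each chunk holds one full latitude-longitude grid, i.e. one GRIB field
points_per_chunk = prod(cshape)

print(cshape, n_chunks, points_per_chunk)
```

Under these assumptions each variable is split into 16 chunks of 19 x 36 values. With two variables (r and t) that is 32 chunks in total, one for each of the 32 input fields.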
[3]:
import zarr
root = zarr.group("_pl.zarr")
root.tree()
[4]:
root["t"].info
[4]:
| Name | /t |
|---|---|
| Type | zarr.core.Array |
| Data type | float64 |
| Shape | (4, 2, 2, 19, 36) |
| Chunk shape | (1, 1, 1, 19, 36) |
| Order | C |
| Read-only | False |
| Compressor | Blosc(cname='lz4', clevel=5, shuffle=SHUFFLE, blocksize=0) |
| Store type | zarr.storage.DirectoryStore |
| No. bytes | 87552 (85.5K) |
| No. bytes stored | 26897 (26.3K) |
| Storage ratio | 3.3 |
| Chunks initialized | 16/16 |
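The size figures in this table can be cross-checked with a little arithmetic. The sketch below uses only the standard library and values copied from the table; the variable names are illustrative.

```python
from math import prod

shape = (4, 2, 2, 19, 36)  # "Shape" row
itemsize = 8               # a float64 value takes 8 bytes
stored_bytes = 26897       # "No. bytes stored" row

# Uncompressed size: number of values times bytes per value
n_bytes = prod(shape) * itemsize

# Storage ratio: uncompressed size divided by the size on disk
ratio = n_bytes / stored_bytes

print(n_bytes, round(ratio, 1))
```

This reproduces the 87552 bytes reported under "No. bytes" and a storage ratio of about 3.3.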
The Zarr store can be loaded into Xarray to check its contents.
[5]:
import xarray
xarray.open_dataset("_pl.zarr", engine="zarr")
[5]:
<xarray.Dataset> Size: 176kB
Dimensions: (forecast_reference_time: 4, step: 2, level: 2,
latitude: 19, longitude: 36)
Coordinates:
* forecast_reference_time (forecast_reference_time) datetime64[ns] 32B 202...
* step (step) timedelta64[ns] 16B 00:00:00 06:00:00
* level (level) int64 16B 500 700
* latitude (latitude) float64 152B 90.0 80.0 ... -80.0 -90.0
* longitude (longitude) float64 288B 0.0 10.0 ... 340.0 350.0
Data variables:
r (forecast_reference_time, step, level, latitude, longitude) float64 88kB ...
t (forecast_reference_time, step, level, latitude, longitude) float64 88kB ...
Attributes:
Conventions: CF-1.8
institution: ECMWF