Reading data from URLs as a stream
[1]:
import earthkit.data as ekd
earthkit-data can read GRIB data from a URL as a stream without writing anything to disk. To enable this, pass stream=True to from_source().
[2]:
ds = ekd.from_source("url",
                     "https://sites.ecmwf.int/repository/earthkit-data/examples/test4.grib",
                     stream=True)
The resulting object supports only a single iteration. Once the iteration has finished, the stream is consumed and no more data is available.
[3]:
for f in ds:
    # f is a GribField object. It gets deleted when going out of scope
    print(f)
GribField(t,500,20070101,1200,0,0)
GribField(z,500,20070101,1200,0,0)
GribField(t,850,20070101,1200,0,0)
GribField(z,850,20070101,1200,0,0)
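The single-pass behaviour is analogous to a plain Python generator. The sketch below makes no earthkit-data calls; `field_stream` and its hard-coded labels are hypothetical stand-ins for the streamed fields, used only to illustrate what "consumed" means.

```python
def field_stream():
    # Hypothetical stand-in for a stream source: yields each item once, in order
    for label in ["t 500", "z 500", "t 850", "z 850"]:
        yield label


stream = field_stream()

first_pass = list(stream)   # reads every item and exhausts the stream
second_pass = list(stream)  # nothing is left to read

print(first_pass)
print(second_pass)
```

To iterate over the data again, the stream has to be recreated with from_source(), as the following cells do.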
The iteration can be done in batches by using batched or group_by.
[4]:
ds = ekd.from_source("url",
                     "https://sites.ecmwf.int/repository/earthkit-data/examples/test4.grib",
                     stream=True)
for f in ds.batched(2):
    # f is a fieldlist
    print(f"len={len(f)} {f.metadata(('param', 'level'))}")
len=2 [('t', 500), ('z', 500)]
len=2 [('t', 850), ('z', 850)]
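The cell above demonstrates batched. As a rough mental model for group_by, the sketch below groups consecutive items that share a key using itertools.groupby. It is plain Python rather than earthkit-data code, and it assumes that a new group starts whenever the key value changes between neighbouring fields; the (param, level) tuples are taken from the output above.

```python
from itertools import groupby

# (param, level) pairs in the order they arrive in the stream
fields = [("t", 500), ("z", 500), ("t", 850), ("z", 850)]

# Assumption: a new group starts whenever the level changes between neighbours
groups = [list(g) for _, g in groupby(fields, key=lambda f: f[1])]

for g in groups:
    print(f"len={len(g)} {g}")
```

Under this assumption only the current group has to be held in memory, which is why grouping of this kind is compatible with a single pass over a stream.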
Reading the whole stream into memory
We can load the whole stream into memory by using read_all=True in from_source(). The resulting object will be a FieldList storing all the GRIB messages in memory.
[5]:
ds = ekd.from_source("url",
                     "https://sites.ecmwf.int/repository/earthkit-data/examples/test4.grib",
                     stream=True, read_all=True)
len(ds)
[5]:
4
[6]:
ds.ls()
[6]:
| | centre | shortName | typeOfLevel | level | dataDate | dataTime | stepRange | dataType | number | gridType |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ecmf | t | isobaricInhPa | 500 | 20070101 | 1200 | 0 | an | 0 | regular_ll |
| 1 | ecmf | z | isobaricInhPa | 500 | 20070101 | 1200 | 0 | an | 0 | regular_ll |
| 2 | ecmf | t | isobaricInhPa | 850 | 20070101 | 1200 | 0 | an | 0 | regular_ll |
| 3 | ecmf | z | isobaricInhPa | 850 | 20070101 | 1200 | 0 | an | 0 | regular_ll |
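Conceptually, read_all=True trades memory for repeatable access. The plain-Python sketch below (hypothetical stand-in data, no earthkit-data calls) shows the difference: once the items are materialised in a list, the length is known and the data can be traversed more than once.

```python
def field_stream():
    # Hypothetical stand-in for a stream source
    yield from [("t", 500), ("z", 500), ("t", 850), ("z", 850)]


# Materialise the whole stream in memory, analogous in spirit to read_all=True
fields = list(field_stream())

count = len(fields)                        # the length is now available
levels_first = [lev for _, lev in fields]  # first pass over the data
levels_again = [lev for _, lev in fields]  # a second pass sees the same data

print(count, levels_first, levels_again)
```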
Multiple URLs
The stream option works even when the input is a list of URLs.
[7]:
ds = ekd.from_source("url",
                     ["https://sites.ecmwf.int/repository/earthkit-data/examples/test4.grib",
                      "https://sites.ecmwf.int/repository/earthkit-data/examples/test6.grib"],
                     stream=True)
for f in ds.batched(3):
    # f is a fieldlist
    print(f"len={len(f)} {f.metadata(('param', 'level'))}")
len=3 [('t', 500), ('z', 500), ('t', 850)]
len=3 [('z', 850), ('t', 1000), ('u', 1000)]
len=3 [('v', 1000), ('t', 850), ('u', 850)]
len=1 [('v', 850)]
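The output above shows that batches are filled across file boundaries: the second batch holds the last field of test4.grib followed by the first two fields of test6.grib. The plain-Python sketch below reproduces this pattern, assuming the URLs are read one after another as a single sequence. The `batched` function here is a hypothetical local helper, not the earthkit-data method, and the per-file field lists are inferred from the printed output.

```python
from itertools import chain, islice


def batched(iterable, n):
    # Hypothetical helper: yield lists of up to n items until the input is exhausted
    it = iter(iterable)
    while batch := list(islice(it, n)):
        yield batch


# (param, level) pairs per file, inferred from the printed output
test4 = [("t", 500), ("z", 500), ("t", 850), ("z", 850)]
test6 = [("t", 1000), ("u", 1000), ("v", 1000),
         ("t", 850), ("u", 850), ("v", 850)]

# Chain the two sources into one sequence, then cut it into batches of 3
batches = list(batched(chain(test4, test6), 3))
for b in batches:
    print(f"len={len(b)} {b}")
```

Because the ten fields do not divide evenly by three, the final batch contains a single field, matching the last line of the output above.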