BUFR: using SYNOP data
We load a BUFR file with SYNOP observations taken from the ECMWF MARS archive. First we ensure the example file is available.
[1]:
import earthkit.data as ekd
ekd.download_example_file("synop_10.bufr")
[2]:
ds = ekd.from_source("file", "synop_10.bufr")
ds
[2]:
| path | synop_10.bufr |
| size | 2.1 KiB |
| types | pandas, featurelist |
To inspect BUFR data we need to convert it into a featurelist. A featurelist is similar to a FieldList, but it is an iterable of “features”, where a “feature” can be anything. In a BUFR featurelist each feature is a BUFR message (a BUFRMessage object).
[3]:
fl = ds.to_featurelist()
[4]:
len(fl)
[4]:
10
We use ls() to see metadata from the header section of each BUFR message:
[5]:
fl.ls()
[5]:
| | edition | dataCategory | dataSubCategory | bufrHeaderCentre | masterTablesVersionNumber | localTablesVersionNumber | numberOfSubsets | compressedData | typicalDate | typicalTime | ident | localLatitude | localLongitude |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 3 | 0 | 1 | 98 | 13 | 1 | 1 | 0 | 20230602 | 120000 | 91648 | -10.75 | 179.50 |
| 1 | 3 | 0 | 1 | 98 | 13 | 1 | 1 | 0 | 20230602 | 120000 | 89514 | -70.77 | 11.75 |
| 2 | 3 | 0 | 1 | 98 | 13 | 1 | 1 | 0 | 20230602 | 120000 | 60545 | 33.77 | 2.93 |
| 3 | 3 | 0 | 1 | 98 | 13 | 1 | 1 | 0 | 20230602 | 120000 | 30823 | 51.83 | 107.60 |
| 4 | 3 | 0 | 1 | 98 | 13 | 1 | 1 | 0 | 20230602 | 120000 | 30846 | 51.35 | 112.47 |
| 5 | 3 | 0 | 1 | 98 | 13 | 1 | 1 | 0 | 20230602 | 120000 | 48352 | 17.86 | 102.75 |
| 6 | 3 | 0 | 1 | 98 | 13 | 1 | 1 | 0 | 20230602 | 120000 | 98747 | 8.41 | 124.61 |
| 7 | 3 | 0 | 1 | 98 | 13 | 1 | 1 | 0 | 20230602 | 120000 | 68267 | -26.50 | 29.98 |
| 8 | 3 | 0 | 1 | 98 | 13 | 1 | 1 | 0 | 20230602 | 120000 | 68592 | -29.60 | 31.12 |
| 9 | 3 | 0 | 1 | 98 | 13 | 1 | 1 | 0 | 20230602 | 120000 | 91701 | -2.77 | -171.72 |
Extracting 2m temperature
BUFR data can be extracted into a pandas DataFrame using to_pandas(), which passes all its arguments to the read_bufr() function from pdbufr.
SYNOP data can be encoded into BUFR in many different ways. For the data in this example, the station location and the 2m temperature can be extracted into a pandas DataFrame as follows:
[6]:
df = fl.to_pandas(columns=["latitude", "longitude", "heightOfStation", "airTemperatureAt2M"])
df
[6]:
| | latitude | longitude | heightOfStation | airTemperatureAt2M |
|---|---|---|---|---|
| 0 | -10.75 | 179.50 | 3.0 | 300.4 |
| 1 | -70.77 | 11.75 | NaN | 255.2 |
| 2 | 33.77 | 2.93 | 763.0 | 296.3 |
| 3 | 51.83 | 107.60 | 515.0 | 291.6 |
| 4 | 51.35 | 112.47 | 743.0 | 287.4 |
| 5 | 17.86 | 102.75 | 176.0 | 307.9 |
| 6 | 8.41 | 124.61 | 188.0 | 299.4 |
| 7 | -26.50 | 29.98 | 1774.0 | 281.9 |
| 8 | -29.60 | 31.12 | 105.0 | 299.8 |
| 9 | -2.77 | -171.72 | 2.0 | 302.1 |
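The temperatures in this table are in kelvin (the filter examples later in this notebook give the unit as K). As a small illustration that is not part of the original notebook, the values can be converted to degrees Celsius by subtracting 273.15. The sketch below copies the values from the table into a plain list so that it runs without the BUFR file; `kelvin_to_celsius` is a hypothetical helper name.

```python
# Hypothetical helper, not part of earthkit-data or pdbufr.
KELVIN_OFFSET = 273.15


def kelvin_to_celsius(temp_k: float) -> float:
    """Convert a temperature from kelvin to degrees Celsius, rounded to 2 decimals."""
    return round(temp_k - KELVIN_OFFSET, 2)


# airTemperatureAt2M values copied from the table above (rows 0 to 9).
temps_k = [300.4, 255.2, 296.3, 291.6, 287.4, 307.9, 299.4, 281.9, 299.8, 302.1]

temps_c = [kelvin_to_celsius(t) for t in temps_k]
print(temps_c)
```

With the DataFrame itself, the same conversion can be written as `df["airTemperatureAt2M"] - 273.15`.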
Using filters
Specify station WMO IDs:
[7]:
df = fl.to_pandas(
    columns=["latitude", "longitude", "heightOfStation", "airTemperatureAt2M", "WMO_station_id"],
    filters={"WMO_station_id": [30846, 89514]},
)
df
[7]:
| | latitude | longitude | heightOfStation | airTemperatureAt2M | WMO_station_id |
|---|---|---|---|---|---|
| 0 | -70.77 | 11.75 | NaN | 255.2 | 89514 |
| 1 | 51.35 | 112.47 | 743.0 | 287.4 | 30846 |
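As far as we know, WMO_station_id is not stored in the BUFR message itself but is a key computed by pdbufr from the blockNumber and stationNumber keys as blockNumber * 1000 + stationNumber; please treat this as an assumption to check against the pdbufr documentation. The sketch below illustrates that arithmetic for the two stations selected above; both function names are hypothetical.

```python
# Hypothetical helpers illustrating how a five-digit WMO station ID
# relates to the WMO block number and station number.
# Assumption: the ID is block_number * 1000 + station_number.


def wmo_station_id(block_number: int, station_number: int) -> int:
    """Combine a WMO block number and station number into one station ID."""
    return block_number * 1000 + station_number


def split_wmo_station_id(station_id: int) -> tuple[int, int]:
    """Split a station ID back into (block_number, station_number)."""
    return divmod(station_id, 1000)


# The two station IDs used in the filter above.
for station_id in (30846, 89514):
    block, station = split_wmo_station_id(station_id)
    print(station_id, "-> block", block, "station", station)
```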
Temperature values <= 290 K:
[8]:
df = fl.to_pandas(
    columns=["latitude", "longitude", "heightOfStation", "airTemperatureAt2M"],
    filters={"airTemperatureAt2M": slice(None, 290)},
)
df
[8]:
| | latitude | longitude | heightOfStation | airTemperatureAt2M |
|---|---|---|---|---|
| 0 | -70.77 | 11.75 | NaN | 255.2 |
| 1 | 51.35 | 112.47 | 743.0 | 287.4 |
| 2 | -26.50 | 29.98 | 1774.0 | 281.9 |
Temperature values >= 290 K and <= 300 K:
[9]:
df = fl.to_pandas(
    columns=["latitude", "longitude", "heightOfStation", "airTemperatureAt2M"],
    filters={"airTemperatureAt2M": slice(290, 300)},
)
df
[9]:
| | latitude | longitude | heightOfStation | airTemperatureAt2M |
|---|---|---|---|---|
| 0 | 33.77 | 2.93 | 763 | 296.3 |
| 1 | 51.83 | 107.60 | 515 | 291.6 |
| 2 | 8.41 | 124.61 | 188 | 299.4 |
| 3 | -29.60 | 31.12 | 105 | 299.8 |
Temperature values >= 300 K:
[10]:
df = fl.to_pandas(
    columns=["latitude", "longitude", "heightOfStation", "airTemperatureAt2M"],
    filters={"airTemperatureAt2M": slice(300, None)},
)
df
[10]:
| | latitude | longitude | heightOfStation | airTemperatureAt2M |
|---|---|---|---|---|
| 0 | -10.75 | 179.50 | 3 | 300.4 |
| 1 | 17.86 | 102.75 | 176 | 307.9 |
| 2 | -2.77 | -171.72 | 2 | 302.1 |
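The three slice filters above behave as closed intervals: both bounds are included, and None leaves that side open. This differs from ordinary Python slicing, where the stop value is excluded. The pure-Python sketch below mimics that behaviour on the temperature values from the first table so that it runs without the BUFR file. `in_closed_range` is a hypothetical helper written for illustration, not a pdbufr function.

```python
# Hypothetical helper mimicking the closed-interval meaning of a slice filter.


def in_closed_range(value: float, bounds: slice) -> bool:
    """Return True if bounds.start <= value <= bounds.stop (None means unbounded)."""
    lower_ok = bounds.start is None or value >= bounds.start
    upper_ok = bounds.stop is None or value <= bounds.stop
    return lower_ok and upper_ok


# airTemperatureAt2M values (in K) copied from the first table in this notebook.
temps_k = [300.4, 255.2, 296.3, 291.6, 287.4, 307.9, 299.4, 281.9, 299.8, 302.1]

cold = [t for t in temps_k if in_closed_range(t, slice(None, 290))]
mild = [t for t in temps_k if in_closed_range(t, slice(290, 300))]
warm = [t for t in temps_k if in_closed_range(t, slice(300, None))]

print(cold)
print(mild)
print(warm)
```

Because both bounds are inclusive, a value of exactly 300 K would match both slice(290, 300) and slice(300, None).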