BUFR: using SYNOP data

We load a BUFR file with SYNOP observations taken from the ECMWF MARS archive. First we ensure the example file is available.

[1]:

import earthkit.data as ekd

ekd.download_example_file("synop_10.bufr")

[2]:

ds = ekd.from_source("file", "synop_10.bufr")
ds

[2]:

BUFR file

path	synop_10.bufr
size	2.1 KiB
types	pandas, featurelist

To inspect BUFR data we need to convert it into a featurelist. This object is similar to a fieldlist, but it is an iterable of “features”, where a “feature” can be anything. In a BUFR featurelist each feature is a BUFR message (a BUFRMessage object).

[3]:

fl = ds.to_featurelist()

[4]:

len(fl)

[4]:

We use ls() to see metadata from the header section of each BUFR message:

[5]:

fl.ls()

[5]:

	edition	dataSubCategory	bufrHeaderCentre	masterTablesVersionNumber	localTablesVersionNumber	numberOfSubsets	typicalDate	typicalTime	ident	localLatitude	localLongitude
0	3	1	98	13	1	1	20230602	120000	91648	-10.75	179.50
1	3	1	98	13	1	1	20230602	120000	89514	-70.77	11.75
2	3	1	98	13	1	1	20230602	120000	60545	33.77	2.93
3	3	1	98	13	1	1	20230602	120000	30823	51.83	107.60
4	3	1	98	13	1	1	20230602	120000	30846	51.35	112.47
5	3	1	98	13	1	1	20230602	120000	48352	17.86	102.75
6	3	1	98	13	1	1	20230602	120000	98747	8.41	124.61
7	3	1	98	13	1	1	20230602	120000	68267	-26.50	29.98
8	3	1	98	13	1	1	20230602	120000	68592	-29.60	31.12
9	3	1	98	13	1	1	20230602	120000	91701	-2.77	-171.72
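
The typicalDate and typicalTime values in the listing above are digit strings in YYYYMMDD and HHMMSS form. As a minimal sketch, they can be combined into a Python datetime using only the standard library. The helper name to_datetime is hypothetical and not part of earthkit-data, and the zero-padding step assumes the values may arrive as integers.

```python
from datetime import datetime


def to_datetime(typical_date, typical_time):
    """Combine header-style date (YYYYMMDD) and time (HHMMSS) values."""
    # Zero-pad in case the values are integers (e.g. time 0 -> "000000").
    date_str = str(typical_date).zfill(8)
    time_str = str(typical_time).zfill(6)
    return datetime.strptime(date_str + time_str, "%Y%m%d%H%M%S")


# Values copied from the first row of the listing above.
obs_time = to_datetime(20230602, 120000)
print(obs_time.isoformat())
```

All ten messages in this file share the same typical date and time, so a single conversion covers the whole listing.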

Extracting 2m temperature

BUFR data can be extracted into a pandas DataFrame using to_pandas(), which passes all its arguments to the read_bufr() function of pdbufr.

SYNOP data can be encoded into BUFR in many different ways. For the data in this file, the location and the 2m temperature can be extracted into a pandas DataFrame as follows:

[6]:

df = fl.to_pandas(columns=["latitude", "longitude", "heightOfStation", "airTemperatureAt2M"])
df

[6]:

	latitude	longitude	heightOfStation	airTemperatureAt2M
0	-10.75	179.50	3.0	300.4
1	-70.77	11.75	NaN	255.2
2	33.77	2.93	763.0	296.3
3	51.83	107.60	515.0	291.6
4	51.35	112.47	743.0	287.4
5	17.86	102.75	176.0	307.9
6	8.41	124.61	188.0	299.4
7	-26.50	29.98	1774.0	281.9
8	-29.60	31.12	105.0	299.8
9	-2.77	-171.72	2.0	302.1
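
The airTemperatureAt2M values are in kelvin, and one station has no heightOfStation (shown as NaN). The following is a small sketch of post-processing such rows in plain Python, using three rows copied from the table above. The function name kelvin_to_celsius and the tuple layout are illustrative assumptions, not part of earthkit-data or pdbufr.

```python
import math

KELVIN_OFFSET = 273.15


def kelvin_to_celsius(t_kelvin):
    """Convert a temperature from kelvin to degrees Celsius (2 decimals)."""
    return round(t_kelvin - KELVIN_OFFSET, 2)


# (latitude, longitude, heightOfStation, airTemperatureAt2M) from the table above.
rows = [
    (-10.75, 179.50, 3.0, 300.4),
    (-70.77, 11.75, float("nan"), 255.2),
    (33.77, 2.93, 763.0, 296.3),
]

celsius = [kelvin_to_celsius(t2m) for _, _, _, t2m in rows]

for (lat, lon, height, _), t_c in zip(rows, celsius):
    # A missing station height is reported explicitly instead of printing NaN.
    height_txt = "unknown" if math.isnan(height) else f"{height:.0f} m"
    print(f"{lat:7.2f} {lon:8.2f}  height={height_txt:>8}  t2m={t_c:6.2f} C")
```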

Using filters

Specify station WMO IDs:

[7]:

df = fl.to_pandas(
    columns=["latitude", "longitude", "heightOfStation", "airTemperatureAt2M", "WMO_station_id"],
    filters={"WMO_station_id": [30846, 89514]},
)
df

[7]:

	latitude	longitude	heightOfStation	airTemperatureAt2M	WMO_station_id
0	-70.77	11.75	NaN	255.2	89514
1	51.35	112.47	743.0	287.4	30846
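
WMO_station_id is a five-digit identifier. Assuming pdbufr's convention that it is computed as blockNumber * 1000 + stationNumber (worth checking against the pdbufr documentation for your version), the two parts can be recovered as sketched below. Both helper functions are hypothetical names introduced for illustration.

```python
def split_wmo_station_id(wmo_id):
    """Split a WMO station ID into (block number, station number).

    Assumes the ID was built as blockNumber * 1000 + stationNumber.
    """
    block_number, station_number = divmod(wmo_id, 1000)
    return block_number, station_number


def make_wmo_station_id(block_number, station_number):
    """Inverse of split_wmo_station_id under the same assumption."""
    return block_number * 1000 + station_number


# The two station IDs used in the filter above.
for wmo_id in (30846, 89514):
    block, station = split_wmo_station_id(wmo_id)
    print(wmo_id, "-> block", block, "station", station)
```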

Temperature values <= 290 K:

[8]:

df = fl.to_pandas(
    columns=["latitude", "longitude", "heightOfStation", "airTemperatureAt2M"],
    filters={"airTemperatureAt2M": slice(None, 290)},
)
df

[8]:

	latitude	longitude	heightOfStation	airTemperatureAt2M
0	-70.77	11.75	NaN	255.2
1	51.35	112.47	743.0	287.4
2	-26.50	29.98	1774.0	281.9

Temperature values >= 290 K and <= 300 K:

[9]:

df = fl.to_pandas(
    columns=["latitude", "longitude", "heightOfStation", "airTemperatureAt2M"],
    filters={"airTemperatureAt2M": slice(290, 300)},
)
df

[9]:

	latitude	longitude	heightOfStation	airTemperatureAt2M
0	33.77	2.93	763	296.3
1	51.83	107.60	515	291.6
2	8.41	124.61	188	299.4
3	-29.60	31.12	105	299.8

Temperature values >= 300 K:

[10]:

df = fl.to_pandas(
    columns=["latitude", "longitude", "heightOfStation", "airTemperatureAt2M"],
    filters={"airTemperatureAt2M": slice(300, None)},
)
df

[10]:

	latitude	longitude	heightOfStation	airTemperatureAt2M
0	-10.75	179.50	3	300.4
1	17.86	102.75	176	307.9
2	-2.77	-171.72	2	302.1
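
The filters used in this section follow two patterns: a list selects rows whose value is in the list, and a slice selects a closed interval in which a bound of None leaves that side open. The pure-Python sketch below mimics that behaviour for illustration only. matches() is a hypothetical helper, not pdbufr's implementation, and the temperatures are copied from the first table.

```python
def matches(value, condition):
    """Return True if value satisfies a list- or slice-style filter condition."""
    if isinstance(condition, slice):
        # Both bounds are inclusive; None means "no bound on this side".
        lower_ok = condition.start is None or value >= condition.start
        upper_ok = condition.stop is None or value <= condition.stop
        return lower_ok and upper_ok
    if isinstance(condition, (list, tuple, set)):
        return value in condition
    # Fall back to a plain equality test for a single value.
    return value == condition


# airTemperatureAt2M values (K) from the first table, in message order.
t2m = [300.4, 255.2, 296.3, 291.6, 287.4, 307.9, 299.4, 281.9, 299.8, 302.1]

cold = [t for t in t2m if matches(t, slice(None, 290))]
mild = [t for t in t2m if matches(t, slice(290, 300))]
warm = [t for t in t2m if matches(t, slice(300, None))]

print(cold)
print(mild)
print(warm)
```

Because both bounds are inclusive, a value of exactly 290 K or 300 K would be selected by two of the three filters. None of the ten observations here falls on a boundary, so the three results partition the data.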
