Using pandas data

[1]:
import numpy as np
import pandas as pd

import earthkit.data as ekd

Construct sample pandas objects for demonstration.

[2]:
t2m_series = pd.Series(np.linspace(273.15, 293, 20), name="t2m")
lat_series = pd.Series(np.arange(50, 52, 0.1), name="latitude")
lon_series = pd.Series(np.arange(-1, 1, 0.1), name="longitude")
date_series = pd.Series(pd.date_range("2022-01-01", "2022-01-20"), name="date")

date_series
t2m_df = pd.concat([t2m_series, lat_series, lon_series], axis=1).set_index(date_series)
t2m_df
[2]:
t2m latitude longitude
date
2022-01-01 273.150000 50.0 -1.000000e+00
2022-01-02 274.194737 50.1 -9.000000e-01
2022-01-03 275.239474 50.2 -8.000000e-01
2022-01-04 276.284211 50.3 -7.000000e-01
2022-01-05 277.328947 50.4 -6.000000e-01
2022-01-06 278.373684 50.5 -5.000000e-01
2022-01-07 279.418421 50.6 -4.000000e-01
2022-01-08 280.463158 50.7 -3.000000e-01
2022-01-09 281.507895 50.8 -2.000000e-01
2022-01-10 282.552632 50.9 -1.000000e-01
2022-01-11 283.597368 51.0 -2.220446e-16
2022-01-12 284.642105 51.1 1.000000e-01
2022-01-13 285.686842 51.2 2.000000e-01
2022-01-14 286.731579 51.3 3.000000e-01
2022-01-15 287.776316 51.4 4.000000e-01
2022-01-16 288.821053 51.5 5.000000e-01
2022-01-17 289.865789 51.6 6.000000e-01
2022-01-18 290.910526 51.7 7.000000e-01
2022-01-19 291.955263 51.8 8.000000e-01
2022-01-20 293.000000 51.9 9.000000e-01
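
The -2.220446e-16 in the longitude column is a floating-point artefact of np.arange with a non-integer step, not a real coordinate value. Below is a minimal sketch of two common ways to avoid it, assuming one decimal place is the intended precision (an assumption, the notebook does not state it). The names lon, lon_rounded and lon_linspace are illustrative only.

```python
import numpy as np

# Same grid as in the cell above: np.arange with a fractional step.
lon = np.arange(-1, 1, 0.1)

# Rounding to one decimal (assumed precision) removes the accumulated
# floating-point error in the value that should be zero.
lon_rounded = lon.round(1)

# Alternative: np.linspace takes the number of points explicitly
# instead of accumulating a fractional step.
lon_linspace = np.linspace(-1, 1, 20, endpoint=False)

print(lon[10], lon_rounded[10], lon_linspace[10])
```

Either array could be passed to pd.Series in place of the np.arange result; the rest of the notebook does not depend on the choice.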

Create an earthkit object from the pandas object with from_object().

[3]:
d = ekd.from_object(t2m_df)
d
[3]:
Pandas DataFrame data

types: pandas, xarray, numpy, featurelist
Convert the earthkit object back to a pandas DataFrame with to_pandas().

[4]:
df = d.to_pandas()
df.describe()
[4]:
t2m latitude longitude
count 20.000000 20.000000 20.000000
mean 283.075000 50.950000 -0.050000
std 6.180747 0.591608 0.591608
min 273.150000 50.000000 -1.000000
25% 278.112500 50.475000 -0.525000
50% 283.075000 50.950000 -0.050000
75% 288.037500 51.425000 0.425000
max 293.000000 51.900000 0.900000
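
Convert the earthkit object to a feature list with to_featurelist(). Each item corresponds to one row of the DataFrame.

[5]: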
fl = d.to_featurelist()

print(f"Number of rows in pandas: {len(fl)}")
print("Iterate over the first 2 rows:")
for f in fl[:2]:
    print(f)
Number of rows in pandas: 20
Iterate over the first 2 rows:
t2m          273.15
latitude      50.00
longitude     -1.00
Name: 2022-01-01 00:00:00, dtype: float64
t2m          274.194737
latitude      50.100000
longitude     -0.900000
Name: 2022-01-02 00:00:00, dtype: float64
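
Each item printed above looks like a pandas Series whose name is the row's index value. As a rough plain-pandas analogy (a sketch, not a description of how earthkit builds the feature list), DataFrame.iterrows() yields rows in the same shape. The small frame sample_df below is a hypothetical stand-in for t2m_df.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for t2m_df: three rows indexed by date.
dates = pd.date_range("2022-01-01", periods=3, name="date")
sample_df = pd.DataFrame(
    {
        "t2m": np.linspace(273.15, 275.15, 3),
        "latitude": [50.0, 50.1, 50.2],
        "longitude": [-1.0, -0.9, -0.8],
    },
    index=dates,
)

# iterrows() yields (index label, row as Series); the Series name is the label.
rows = [row for _, row in sample_df.iterrows()]
for row in rows[:2]:
    print(row)
```

This only illustrates the row-as-Series shape; whether the feature list is implemented this way is not something the output above confirms.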