Using pandas data
[1]:
import numpy as np
import pandas as pd
import earthkit.data as ekd
Construct sample pandas objects for demonstration.
[2]:
t2m_series = pd.Series(np.linspace(273.15, 293, 20), name="t2m")
lat_series = pd.Series(np.arange(50, 52, 0.1), name="latitude")
lon_series = pd.Series(np.arange(-1, 1, 0.1), name="longitude")
date_series = pd.Series(pd.date_range("2022-01-01", "2022-01-20"), name="date")
date_series
t2m_df = pd.concat([t2m_series, lat_series, lon_series], axis=1).set_index(date_series)
t2m_df
[2]:
| date | t2m | latitude | longitude |
|---|---|---|---|
| 2022-01-01 | 273.150000 | 50.0 | -1.000000e+00 |
| 2022-01-02 | 274.194737 | 50.1 | -9.000000e-01 |
| 2022-01-03 | 275.239474 | 50.2 | -8.000000e-01 |
| 2022-01-04 | 276.284211 | 50.3 | -7.000000e-01 |
| 2022-01-05 | 277.328947 | 50.4 | -6.000000e-01 |
| 2022-01-06 | 278.373684 | 50.5 | -5.000000e-01 |
| 2022-01-07 | 279.418421 | 50.6 | -4.000000e-01 |
| 2022-01-08 | 280.463158 | 50.7 | -3.000000e-01 |
| 2022-01-09 | 281.507895 | 50.8 | -2.000000e-01 |
| 2022-01-10 | 282.552632 | 50.9 | -1.000000e-01 |
| 2022-01-11 | 283.597368 | 51.0 | -2.220446e-16 |
| 2022-01-12 | 284.642105 | 51.1 | 1.000000e-01 |
| 2022-01-13 | 285.686842 | 51.2 | 2.000000e-01 |
| 2022-01-14 | 286.731579 | 51.3 | 3.000000e-01 |
| 2022-01-15 | 287.776316 | 51.4 | 4.000000e-01 |
| 2022-01-16 | 288.821053 | 51.5 | 5.000000e-01 |
| 2022-01-17 | 289.865789 | 51.6 | 6.000000e-01 |
| 2022-01-18 | 290.910526 | 51.7 | 7.000000e-01 |
| 2022-01-19 | 291.955263 | 51.8 | 8.000000e-01 |
| 2022-01-20 | 293.000000 | 51.9 | 9.000000e-01 |
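The longitude column above is shown in scientific notation because the entry that should be exactly 0 comes out as -2.220446e-16, a floating-point rounding effect of calling `np.arange` with a fractional step. The following is a minimal sketch, not part of the original notebook, of one possible way to avoid this: build the values from integers and scale them once. The variable name `lon_exact` is introduced here only for illustration.

```python
import numpy as np
import pandas as pd

# Integer steps are exact; dividing once by 10 gives each value as a single
# correctly rounded float instead of the result of repeated fractional steps.
lon_exact = pd.Series(np.arange(-10, 10) / 10, name="longitude")

# The middle entry is now exactly zero rather than a tiny negative number.
print(lon_exact.iloc[10])
```

The values agree with `np.arange(-1, 1, 0.1)` to within floating-point tolerance, so the rest of the example would behave the same way with either construction.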
Create an earthkit object from the pandas object with from_object().
[3]:
d = ekd.from_object(t2m_df)
d
[3]:
Pandas DataFrame data
| types | pandas, xarray, numpy, featurelist |
[4]:
df = d.to_pandas()
df.describe()
[4]:
| | t2m | latitude | longitude |
|---|---|---|---|
| count | 20.000000 | 20.000000 | 20.000000 |
| mean | 283.075000 | 50.950000 | -0.050000 |
| std | 6.180747 | 0.591608 | 0.591608 |
| min | 273.150000 | 50.000000 | -1.000000 |
| 25% | 278.112500 | 50.475000 | -0.525000 |
| 50% | 283.075000 | 50.950000 | -0.050000 |
| 75% | 288.037500 | 51.425000 | 0.425000 |
| max | 293.000000 | 51.900000 | 0.900000 |
[5]:
fl = d.to_featurelist()
print(f"Number of rows in pandas: {len(fl)}")
print("Iterate over the first 2 rows:")
for f in fl[:2]:
    print(f)
Number of rows in pandas: 20
Iterate over the first 2 rows:
t2m 273.15
latitude 50.00
longitude -1.00
Name: 2022-01-01 00:00:00, dtype: float64
t2m 274.194737
latitude 50.100000
longitude -0.900000
Name: 2022-01-02 00:00:00, dtype: float64
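Each feature printed above looks like a pandas row `Series` whose name is the row's date. For comparison, here is a pandas-only sketch that produces rows of the same shape with `DataFrame.iterrows()`. It is an illustration under the assumption that a feature corresponds to one DataFrame row; it does not claim to show how earthkit builds the feature list internally. The names `sample_df` and `rows` are hypothetical.

```python
import numpy as np
import pandas as pd

# Rebuild the sample DataFrame from the beginning of this notebook.
dates = pd.date_range("2022-01-01", "2022-01-20", name="date")
sample_df = pd.DataFrame(
    {
        "t2m": np.linspace(273.15, 293, 20),
        "latitude": np.arange(50, 52, 0.1),
        "longitude": np.arange(-1, 1, 0.1),
    },
    index=dates,
)

# iterrows() yields (index label, row Series) pairs; keep only the Series.
rows = [row for _, row in sample_df.head(2).iterrows()]
for row in rows:
    print(row)
```

Each `row` is a `Series` indexed by column name, so individual values can be read with `row["t2m"]` and the date with `row.name`.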