Using GeoJSON and GeoPandas data
[1]:
import geopandas as gpd
import earthkit.data as ekd
Reading from a file
Load a GeoJSON file through the “file” source using from_source().
[2]:
ekd.download_example_file("NUTS_RG_20M_2021_3035.geojson")
d = ekd.from_source("file", "NUTS_RG_20M_2021_3035.geojson")
d
[2]:
GeoJSON file
| path | NUTS_RG_20M_2021_3035.geojson |
| size | 2.4 MiB |
| types | geopandas, pandas, xarray, featurelist |
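For orientation, a GeoJSON file is a JSON document, typically a FeatureCollection whose features each pair a geometry with a properties mapping. The sketch below builds a tiny hand-made collection using only the standard library; the identifiers, property values and coordinates are made up for illustration and are not taken from the NUTS file.

```python
import json

# A tiny, hand-made FeatureCollection (illustrative values, not NUTS data).
collection = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"NUTS_ID": "AA", "LEVL_CODE": 0},
            "geometry": {
                "type": "Polygon",
                # One closed ring: the first and last positions are identical.
                "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 0]]],
            },
        },
        {
            "type": "Feature",
            "properties": {"NUTS_ID": "BB", "LEVL_CODE": 0},
            "geometry": {"type": "Point", "coordinates": [2, 3]},
        },
    ],
}

# Round-trip through text, as happens when a .geojson file is written and read.
parsed = json.loads(json.dumps(collection))

# Each feature corresponds to one row once the data is converted to a table.
ids = [feature["properties"]["NUTS_ID"] for feature in parsed["features"]]
geometry_types = [feature["geometry"]["type"] for feature in parsed["features"]]
print(ids)
print(geometry_types)
```

The property keys become the table columns, and the geometry of each feature ends up in the geometry column after conversion.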
We can convert this data into a geopandas GeoDataFrame.
[3]:
df = d.to_geopandas()
df.describe()
[3]:
| | LEVL_CODE | MOUNT_TYPE | URBN_TYPE | COAST_TYPE |
|---|---|---|---|---|
| count | 2010.000000 | 2009.000000 | 2010.000000 | 2010.000000 |
| mean | 2.654229 | 2.709308 | 1.536318 | 1.768159 |
| std | 0.679168 | 1.664941 | 1.094774 | 1.280081 |
| min | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 25% | 3.000000 | 2.000000 | 1.000000 | 1.000000 |
| 50% | 3.000000 | 4.000000 | 2.000000 | 2.000000 |
| 75% | 3.000000 | 4.000000 | 2.000000 | 3.000000 |
| max | 3.000000 | 4.000000 | 3.000000 | 3.000000 |
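Note that the count for MOUNT_TYPE (2009) is one lower than for the other columns (2010). In pandas, describe() counts only non-missing values, so this most likely points to a single missing MOUNT_TYPE entry. A minimal sketch of that behaviour, using a small made-up DataFrame rather than the NUTS data:

```python
import pandas as pd

# Made-up values; the NaN stands in for a missing MOUNT_TYPE entry.
sample = pd.DataFrame(
    {
        "LEVL_CODE": [0, 1, 2, 3],
        "MOUNT_TYPE": [0.0, float("nan"), 4.0, 2.0],
    }
)

# The "count" row of describe() reports non-missing values per column.
counts = sample.describe().loc["count"]
print(counts)

# isna().sum() gives the number of missing values per column.
missing = sample.isna().sum()
print(missing)
```

Running `df.isna().sum()` on the GeoDataFrame above would be one way to check this for the real data.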
We can also convert the GeoJSON data into a featurelist, which lets us iterate over the rows (features). By default, iterating over a pandas DataFrame yields its column names instead.
[4]:
fl = d.to_featurelist()
print(f"Number of polygons in featurelist: {len(fl)}")
print("Iterate over the first 2 polygons:")
for f in fl[:2]:
    print(f)
Number of polygons in featurelist: 2010
Iterate over the first 2 polygons:
id FR
NUTS_ID FR
LEVL_CODE 0
CNTR_CODE FR
NAME_LATN France
NUTS_NAME France
MOUNT_TYPE 0.0
URBN_TYPE 0
COAST_TYPE 0
FID FR
geometry MULTIPOLYGON (((9954236.1162 -3059379.3164, 99...
Name: 0, dtype: object
id HR
NUTS_ID HR
LEVL_CODE 0
CNTR_CODE HR
NAME_LATN Hrvatska
NUTS_NAME Hrvatska
MOUNT_TYPE 0.0
URBN_TYPE 0
COAST_TYPE 0
FID HR
geometry MULTIPOLYGON (((4827385.8894 2618351.326199999...
Name: 1, dtype: object
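The difference between column-wise and row-wise iteration mentioned above can be seen in plain pandas. The following is a minimal sketch with a made-up two-row DataFrame; it uses `iterrows()` as the pandas way to walk over rows and does not claim to show how the featurelist is implemented.

```python
import pandas as pd

# Made-up table with two rows (features) and two columns (properties).
table = pd.DataFrame({"NUTS_ID": ["AA", "BB"], "LEVL_CODE": [0, 1]})

# Iterating over a DataFrame directly yields the column labels.
column_names = [name for name in table]
print(column_names)

# iterrows() yields (index, row) pairs, where each row is a pandas Series.
rows = [row for _, row in table.iterrows()]
row_ids = [row["NUTS_ID"] for row in rows]
print(row_ids)
```

Each feature printed in the output above is likewise displayed as a pandas Series, with one entry per property plus the geometry.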
Loading geopandas object
It is also possible to create an earthkit-data object from an already instantiated geopandas GeoDataFrame.
[5]:
gpd_df = gpd.read_file("NUTS_RG_20M_2021_3035.geojson")
d = ekd.from_object(gpd_df)
d
[5]:
GeoPandas DataFrame data
| types | geopandas, pandas, xarray |
[6]:
df = d.to_geopandas()
df.describe()
[6]:
| | LEVL_CODE | MOUNT_TYPE | URBN_TYPE | COAST_TYPE |
|---|---|---|---|---|
| count | 2010.000000 | 2009.000000 | 2010.000000 | 2010.000000 |
| mean | 2.654229 | 2.709308 | 1.536318 | 1.768159 |
| std | 0.679168 | 1.664941 | 1.094774 | 1.280081 |
| min | 0.000000 | 0.000000 | 0.000000 | 0.000000 |
| 25% | 3.000000 | 2.000000 | 1.000000 | 1.000000 |
| 50% | 3.000000 | 4.000000 | 2.000000 | 2.000000 |
| 75% | 3.000000 | 4.000000 | 2.000000 | 3.000000 |
| max | 3.000000 | 4.000000 | 3.000000 | 3.000000 |
[7]:
fl = d.to_featurelist()
print(f"Number of polygons in featurelist: {len(fl)}")
print("Iterate over the first 2 polygons:")
for f in fl[:2]:
    print(f)
Number of polygons in featurelist: 2010
Iterate over the first 2 polygons:
id FR
NUTS_ID FR
LEVL_CODE 0
CNTR_CODE FR
NAME_LATN France
NUTS_NAME France
MOUNT_TYPE 0.0
URBN_TYPE 0
COAST_TYPE 0
FID FR
geometry MULTIPOLYGON (((9954236.1162 -3059379.3164, 99...
Name: 0, dtype: object
id HR
NUTS_ID HR
LEVL_CODE 0
CNTR_CODE HR
NAME_LATN Hrvatska
NUTS_NAME Hrvatska
MOUNT_TYPE 0.0
URBN_TYPE 0
COAST_TYPE 0
FID HR
geometry MULTIPOLYGON (((4827385.8894 2618351.326199999...
Name: 1, dtype: object