Using shapefiles
The code below reads a shapefile as a file source using from_source(). Internally, the data is represented as a geopandas GeoDataFrame, so geopandas must be installed for this example to work.
[1]:
!pip install geopandas --quiet
[2]:
import earthkit.data as ekd
ekd.download_example_file("major_basins.zip")
d = ekd.from_source("file", "./major_basins.zip")
d
Ignoring archive member 'major_basins/' because it is not a directory or a file
[2]:
Shapefile
| path | /var/folders/93/w0p869rx17q98wxk83gn9ys40000gn/T/earthkit-data-cgr/file-01a9a28d918afaccf918851ffd6cbe283aa6554692023480731e1dc0bac800a0.d/major_basins/Major_Basins_of_the_World.shp |
| size | 2.9 MiB |
| types | geopandas, pandas, xarray, numpy, featurelist |
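A shapefile is a set of companion files (.shp, .shx, .dbf and usually .prj), which is why it is distributed here as a zip archive. The message about 'major_basins/' above presumably refers to a directory entry stored in that archive. The sketch below uses only the Python standard library to show what such an archive can look like; the member names are hypothetical and the file contents are placeholders, and it is not a description of how earthkit-data unpacks archives.

```python
import io
import zipfile

# Build a small in-memory archive that mimics a zipped shapefile.
# The member names are hypothetical and the contents are placeholders.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("basins/", "")  # explicit directory entry
    for ext in (".shp", ".shx", ".dbf", ".prj"):
        zf.writestr(f"basins/basins{ext}", b"placeholder")

# List the members, separating directory entries from regular files.
with zipfile.ZipFile(buffer) as zf:
    infos = zf.infolist()
    directories = [i.filename for i in infos if i.is_dir()]
    files = [i.filename for i in infos if not i.is_dir()]

# The .shp member holds the geometries; the other files are its companions.
shp_members = [name for name in files if name.endswith(".shp")]
print(directories)
print(shp_members)
```

Running the same listing on a downloaded archive (by passing its path to zipfile.ZipFile) is a quick way to check which .shp file it contains.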
We can convert this data into a geopandas GeoDataFrame.
[3]:
df = d.to_geopandas()
df.describe()
[3]:
| | BASWC4_ID | ID | N | CONT | NN | FISH_ | ACRES | NO_COUNTRI | CHECKED | LAEA_HA | LAEA_ACRES | LAEA_PRMTR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 254.000000 | 254.000000 | 254.000000 | 254.000000 | 254.000000 | 254.000000 | 254.000000 | 254.0 | 254.000000 | 2.540000e+02 | 2.540000e+02 | 2.540000e+02 |
| mean | 141.346457 | 157.342520 | 23.295276 | 2.137795 | 1842.377953 | 67.381890 | 0.000622 | 0.0 | 3.181102 | 2.114910e+07 | 5.225943e+07 | 2.217893e+06 |
| std | 87.460742 | 134.465112 | 35.202343 | 2.060524 | 2046.534093 | 242.899156 | 0.001651 | 0.0 | 2.763161 | 5.536194e+07 | 1.367993e+08 | 3.281328e+06 |
| min | 1.000000 | 0.000000 | -1.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.0 | -1.000000 | 0.000000e+00 | 0.000000e+00 | 0.000000e+00 |
| 25% | 67.250000 | 50.500000 | 1.250000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.0 | -1.000000 | 9.978000e+01 | 2.466000e+02 | 3.995600e+03 |
| 50% | 130.500000 | 123.000000 | 12.000000 | 2.000000 | 1016.500000 | 0.000000 | 0.000000 | 0.0 | 5.000000 | 3.168993e+06 | 7.830582e+06 | 1.224777e+06 |
| 75% | 215.500000 | 223.500000 | 24.000000 | 4.000000 | 4012.000000 | 48.000000 | 0.000000 | 0.0 | 5.000000 | 1.080241e+07 | 2.669275e+07 | 2.334400e+06 |
| max | 297.000000 | 476.000000 | 160.000000 | 6.000000 | 6013.000000 | 2500.000000 | 0.011000 | 0.0 | 5.000000 | 4.331847e+08 | 1.070399e+09 | 2.042193e+07 |
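The summary above lists only numeric columns; text columns such as NAME do not appear, because describe() summarises numeric columns by default. A GeoDataFrame is a subclass of the pandas DataFrame, so the same behaviour can be sketched with plain pandas. The small table below is a stand-in built from three rows shown elsewhere in this notebook, not the real dataset.

```python
import pandas as pd

# Stand-in table with three rows copied from the outputs in this notebook.
df = pd.DataFrame(
    {
        "NAME": ["Indigirka", "Kolyma", "Yenisey"],
        "FISH_": [0, 29, 42],
        "ACRES": [0.002, 0.003, 0.007],
    }
)

# describe() keeps only the numeric columns by default, so NAME is dropped.
summary = df.describe()
print(list(summary.columns))

# Ordinary boolean indexing selects rows, here basins with a non-zero FISH_ value.
with_fish = df[df["FISH_"] > 0]
print(with_fish["NAME"].tolist())
```

The same indexing should apply to the GeoDataFrame returned by to_geopandas(), since it inherits these operations from pandas.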
We can also convert the shapefile data into a featurelist, which lets us iterate over the rows (features).
[4]:
fl = d.to_featurelist()
len(fl)
[4]:
254
[5]:
fl[2]
[5]:
BASWC4_ID 4
ID 38
N 36
NAME Yenisey
CONT 2
NN 2036
FISH_ 42
ACRES 0.007
SOURCETHM final_draft22.db
NO_COUNTRI 0
Q3 International catchments
CHECKED -1
LAEA_HA 191639888.63
LAEA_ACRES 473542164.8
LAEA_PRMTR 10368035.7
geometry POLYGON ((84.03548431396484 62.48704147338867,...
Name: 2, dtype: object
[6]:
fl[2].geometry
[6]:
[7]:
print(f"Number of polygons in shapefile: {len(fl)}")
print("Iterating over the first 2 polygons:")
for f in fl[:2]:
    print(f)
Number of polygons in shapefile: 254
Iterating over the first 2 polygons:
BASWC4_ID 2
ID 408
N 11
NAME Indigirka
CONT 2
NN 2011
FISH_ 0
ACRES 0.002
SOURCETHM geoff2.dbf
NO_COUNTRI 0
Q3 None
CHECKED -1
LAEA_HA 0.0
LAEA_ACRES 0.0
LAEA_PRMTR 0.0
geometry POLYGON ((139.6873016357422 63.933204650878906...
Name: 0, dtype: object
BASWC4_ID 3
ID 436
N 14
NAME Kolyma
CONT 2
NN 2014
FISH_ 29
ACRES 0.003
SOURCETHM geoff2.dbf
NO_COUNTRI 0
Q3 None
CHECKED -1
LAEA_HA 0.0
LAEA_ACRES 0.0
LAEA_PRMTR 0.0
geometry POLYGON ((153.32125854492188 70.87090301513672...
Name: 1, dtype: object
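Each printed feature ends in a line like `Name: 1, dtype: object`, which is how a pandas Series prints. Assuming the features behave like Series, individual fields can be read by key while iterating. The sketch below uses two hand-built Series as hypothetical stand-ins for features, with values taken from the output above, so it runs without the shapefile.

```python
import pandas as pd

# Hypothetical stand-ins for two features, using values printed above.
features = [
    pd.Series({"NAME": "Indigirka", "CONT": 2, "FISH_": 0}, name=0),
    pd.Series({"NAME": "Kolyma", "CONT": 2, "FISH_": 29}, name=1),
]

# Read a single field from every feature while iterating.
names = [f["NAME"] for f in features]

# Aggregate a numeric field across the features.
total_fish = sum(int(f["FISH_"]) for f in features)

print(names)
print(total_fish)
```

If the assumption holds, replacing `features` with `fl[:2]` would give the same result for the real data.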