Using shapefiles

The code below reads a zipped shapefile as a file source using from_source(). Internally, the data is represented as a geopandas GeoDataFrame. For this example to work, geopandas must be installed.

[1]:
!pip install geopandas --quiet
[2]:
import earthkit.data as ekd

ekd.download_example_file("major_basins.zip")
d = ekd.from_source("file", "./major_basins.zip")
d
Ignoring archive member 'major_basins/' because it is not a directory or a file
[2]:
Shapefile

path: /var/folders/93/w0p869rx17q98wxk83gn9ys40000gn/T/earthkit-data-cgr/file-01a9a28d918afaccf918851ffd6cbe283aa6554692023480731e1dc0bac800a0.d/major_basins/Major_Basins_of_the_World.shp
size: 2.9 MiB
types: geopandas, pandas, xarray, numpy, featurelist
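A shapefile is not a single file but a group of sibling files (.shp for geometry, .shx for the index, .dbf for attributes, and often .prj for the projection), which is why it is usually distributed as a zip archive. The sketch below builds a small in-memory archive and lists its members with the standard zipfile module to illustrate that layout. Only the .shp name and the 'major_basins/' entry appear in the output above; the other member names are assumptions, and the contents are empty placeholders.

```python
import io
import zipfile

# Hypothetical layout: only the directory entry and the .shp name are taken
# from the output above; the sidecar files are assumed for illustration.
members = [
    "major_basins/",  # explicit directory entry
    "major_basins/Major_Basins_of_the_World.shp",
    "major_basins/Major_Basins_of_the_World.shx",
    "major_basins/Major_Basins_of_the_World.dbf",
    "major_basins/Major_Basins_of_the_World.prj",
]

# Write the placeholder archive into memory instead of onto disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    for name in members:
        zf.writestr(name, b"")  # empty placeholder content

# Read it back and separate directory entries from the main .shp file.
with zipfile.ZipFile(buffer) as zf:
    infos = zf.infolist()
    directories = [i.filename for i in infos if i.is_dir()]
    shp_files = [i.filename for i in infos if i.filename.lower().endswith(".shp")]

print("directory entries:", directories)
print("main shapefile members:", shp_files)
```

Running the same listing on a real archive (by passing its path to zipfile.ZipFile) is a quick way to check which member a reader is expected to pick up.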

We can convert this data into a geopandas GeoDataFrame.

[3]:
df = d.to_geopandas()
df.describe()
[3]:
        BASWC4_ID          ID           N        CONT           NN        FISH_       ACRES  NO_COUNTRI     CHECKED       LAEA_HA    LAEA_ACRES    LAEA_PRMTR
count  254.000000  254.000000  254.000000  254.000000   254.000000   254.000000  254.000000       254.0  254.000000  2.540000e+02  2.540000e+02  2.540000e+02
mean   141.346457  157.342520   23.295276    2.137795  1842.377953    67.381890    0.000622         0.0    3.181102  2.114910e+07  5.225943e+07  2.217893e+06
std     87.460742  134.465112   35.202343    2.060524  2046.534093   242.899156    0.001651         0.0    2.763161  5.536194e+07  1.367993e+08  3.281328e+06
min      1.000000    0.000000   -1.000000    0.000000     0.000000     0.000000    0.000000         0.0   -1.000000  0.000000e+00  0.000000e+00  0.000000e+00
25%     67.250000   50.500000    1.250000    0.000000     0.000000     0.000000    0.000000         0.0   -1.000000  9.978000e+01  2.466000e+02  3.995600e+03
50%    130.500000  123.000000   12.000000    2.000000  1016.500000     0.000000    0.000000         0.0    5.000000  3.168993e+06  7.830582e+06  1.224777e+06
75%    215.500000  223.500000   24.000000    4.000000  4012.000000    48.000000    0.000000         0.0    5.000000  1.080241e+07  2.669275e+07  2.334400e+06
max    297.000000  476.000000  160.000000    6.000000  6013.000000  2500.000000    0.011000         0.0    5.000000  4.331847e+08  1.070399e+09  2.042193e+07
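describe() summarises only the numeric columns, so text and geometry columns do not appear in the table. To make the row labels concrete, the sketch below recomputes the same statistics with the standard library for a tiny sample: the FISH_ values of three features (0, 29 and 42) as they appear in the feature output on this page. It assumes the pandas defaults, namely a sample standard deviation and linearly interpolated percentiles; the variable names are illustrative only.

```python
import statistics

# FISH_ values of three features, copied from the feature output on this page.
fish = [0, 29, 42]

# method="inclusive" interpolates linearly between data points,
# which corresponds to the default percentile behaviour assumed here.
q1, median, q3 = statistics.quantiles(fish, n=4, method="inclusive")

summary = {
    "count": len(fish),
    "mean": statistics.mean(fish),
    "std": statistics.stdev(fish),  # sample standard deviation (n - 1)
    "min": min(fish),
    "25%": q1,
    "50%": median,
    "75%": q3,
    "max": max(fish),
}

for label, value in summary.items():
    print(f"{label:>5} {value:.6f}")
```

With the real data, the same numbers come from df["FISH_"].describe(), computed over all 254 rows.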

We can also convert the Shapefile data into a featurelist. With this we can iterate over the rows (features).

[4]:
fl = d.to_featurelist()
len(fl)
[4]:
254
[5]:
fl[2]
[5]:
BASWC4_ID                                                     4
ID                                                           38
N                                                            36
NAME                                                    Yenisey
CONT                                                          2
NN                                                         2036
FISH_                                                        42
ACRES                                                     0.007
SOURCETHM                                      final_draft22.db
NO_COUNTRI                                                    0
Q3                                     International catchments
CHECKED                                                      -1
LAEA_HA                                            191639888.63
LAEA_ACRES                                          473542164.8
LAEA_PRMTR                                           10368035.7
geometry      POLYGON ((84.03548431396484 62.48704147338867,...
Name: 2, dtype: object
[6]:
fl[2].geometry
[6]:
[Image: the polygon geometry of the selected feature (how-tos_shapefile_shapefile_9_0.svg)]
[7]:
print(f"Number of polygons in shapefile: {len(fl)}")
print("Iterating over the first 2 polygons:")
for f in fl[:2]:
    print(f)
Number of polygons in shapefile: 254
Iterating over the first 2 polygons:
BASWC4_ID                                                     2
ID                                                          408
N                                                            11
NAME                                                  Indigirka
CONT                                                          2
NN                                                         2011
FISH_                                                         0
ACRES                                                     0.002
SOURCETHM                                            geoff2.dbf
NO_COUNTRI                                                    0
Q3                                                         None
CHECKED                                                      -1
LAEA_HA                                                     0.0
LAEA_ACRES                                                  0.0
LAEA_PRMTR                                                  0.0
geometry      POLYGON ((139.6873016357422 63.933204650878906...
Name: 0, dtype: object
BASWC4_ID                                                     3
ID                                                          436
N                                                            14
NAME                                                     Kolyma
CONT                                                          2
NN                                                         2014
FISH_                                                        29
ACRES                                                     0.003
SOURCETHM                                            geoff2.dbf
NO_COUNTRI                                                    0
Q3                                                         None
CHECKED                                                      -1
LAEA_HA                                                     0.0
LAEA_ACRES                                                  0.0
LAEA_PRMTR                                                  0.0
geometry      POLYGON ((153.32125854492188 70.87090301513672...
Name: 1, dtype: object
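Iteration is typically combined with a test on the attributes. The sketch below shows that pattern on plain dictionaries used as stand-ins for the three features printed on this page, with a subset of their attributes copied from the output. The printed features look like pandas Series, so the same key access (f["NAME"]) is assumed to work on the real featurelist, but that is not confirmed here.

```python
# Stand-in records: attribute values copied from the feature output on this page.
features = [
    {"NAME": "Indigirka", "CONT": 2, "FISH_": 0, "LAEA_HA": 0.0},
    {"NAME": "Kolyma", "CONT": 2, "FISH_": 29, "LAEA_HA": 0.0},
    {"NAME": "Yenisey", "CONT": 2, "FISH_": 42, "LAEA_HA": 191639888.63},
]

# Keep only the features that carry a non-zero area attribute.
names_with_area = [f["NAME"] for f in features if f["LAEA_HA"] > 0]

# Find the feature with the largest FISH_ value.
richest = max(features, key=lambda f: f["FISH_"])

print("Features with an area value:", names_with_area)
print("Largest FISH_ value:", richest["NAME"], richest["FISH_"])
```

For larger selections, filtering the GeoDataFrame returned by to_geopandas() is usually more convenient than looping over individual features.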