Using shapefiles

The code below reads a shapefile as a file source using from_source(). Internally, the data is represented as a geopandas GeoDataFrame, so geopandas must be installed for this example to work.

[1]:
!pip install geopandas --quiet

[2]:
import earthkit.data as ekd
ekd.download_example_file("major_basins.zip")
ds = ekd.from_source("file", "./major_basins.zip")
WARNING:earthkit.data.readers.archive:Ignoring archive member 'major_basins/' because it is not a directory or a file

The data contains 254 features in total.

[3]:
len(ds)
WARNING:fiona._env:/var/folders/93/w0p869rx17q98wxk83gn9ys40000gn/T/earthkit-data-cgr/file-48ce4ac918e854d69bd7776b38e05c25eb66b4235111d1fa9cf48f78e4294d94.d/major_basins/Major_Basins_of_the_World.shp contains polygon(s) with rings with invalid winding order. Autocorrecting them, but that shapefile should be corrected using ogr2ogr for example.
[3]:
254
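The warning above mentions rings with an invalid winding order, that is, the direction (clockwise or counter-clockwise) in which a ring's vertices are listed. As a minimal sketch of what this means, the snippet below checks and normalises the orientation of a polygon with shapely, which geopandas uses for its geometries. The square is a hypothetical stand-in and not taken from the shapefile, and the snippet does not claim which direction a given file format expects.

```python
from shapely.geometry import Polygon
from shapely.geometry.polygon import orient

# Hypothetical unit square whose exterior ring is listed clockwise.
square = Polygon([(0, 0), (0, 1), (1, 1), (1, 0)])
print(square.exterior.is_ccw)

# orient() returns a new polygon; sign=1.0 requests a counter-clockwise
# exterior ring (interior rings, if any, become clockwise).
ccw_square = orient(square, sign=1.0)
print(ccw_square.exterior.is_ccw)

# Reordering the vertices does not change the shape itself.
print(ccw_square.area == square.area)
```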
[4]:
ds
[4]:

ShapeFileReader(represented as a geopandas object):

| | BASWC4_ID | ID | N | NAME | CONT | NN | FISH_ | ACRES | SOURCETHM | NO_COUNTRI | Q3 | CHECKED | LAEA_HA | LAEA_ACRES | LAEA_PRMTR | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2 | 408 | 11 | Indigirka | 2 | 2011 | 0 | 0.002 | geoff2.dbf | 0 | None | -1 | 0.000000e+00 | 0.0 | 0.0 | POLYGON ((139.68730 63.93320, 139.82028 64.030... |
| 1 | 3 | 436 | 14 | Kolyma | 2 | 2014 | 29 | 0.003 | geoff2.dbf | 0 | None | -1 | 0.000000e+00 | 0.0 | 0.0 | POLYGON ((153.32126 70.87090, 153.33440 70.873... |
| 2 | 4 | 38 | 36 | Yenisey | 2 | 2036 | 42 | 0.007 | final_draft22.db | 0 | International catchments | -1 | 1.916399e+08 | 473542164.8 | 10368035.7 | POLYGON ((84.03548 62.48704, 84.03149 62.49136... |
| 3 | 5 | 148 | 78 | Tana | 0 | 0 | 0 | 0.000 | final_draft22.db | 0 | International catchments | -1 | 1.698563e+06 | 4197148.9 | 750506.4 | POLYGON ((25.51510 68.64600, 25.40198 68.62086... |
| 4 | 6 | 104 | 11 | Mackenzie | 5 | 5011 | 53 | 0.006 | final_draft22.db | 0 | None | 5 | 0.000000e+00 | 0.0 | 0.0 | POLYGON ((-136.92070 68.20810, -136.88443 68.2... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 249 | 251 | 0 | 0 | Void | 0 | 0 | 0 | 0.000 | None | 0 | International catchments | 5 | 7.292745e+06 | 18020373.2 | 1867904.1 | POLYGON ((33.95228 -6.69550, 33.95243 -6.70446... |
| 250 | 279 | 0 | 0 | Mangoky | 0 | 0 | 0 | 0.000 | None | 0 | None | 5 | 4.310280e+06 | 10650702.0 | 2127189.5 | POLYGON ((46.82328 -21.32725, 46.82447 -21.336... |
| 251 | 1 | 39 | 16 | Lena | 2 | 2016 | 43 | 0.009 | final_draft22.db | 0 | None | -1 | 1.016640e+08 | 251211777.3 | 13423660.9 | POLYGON ((102.72000 65.76000, 102.85008 65.906... |
| 252 | 10 | 37 | 24 | Ob | 2 | 2024 | 43 | 0.010 | final_draft22.db | 0 | International catchments | -1 | 3.070918e+08 | 758823924.1 | 13761196.3 | POLYGON ((84.03548 62.48704, 84.06540 62.45478... |
| 253 | 59 | 214 | 108 | Connecticut | 0 | 0 | 0 | 0.000 | final_draft22.db | 0 | International catchments | 5 | 2.775322e+06 | 6857819.8 | 1427586.8 | POLYGON ((-71.16196 45.24421, -71.12241 45.182... |

254 rows × 16 columns

[5]:
ds[2]
[5]:
BASWC4_ID                                                     4
ID                                                           38
N                                                            36
NAME                                                    Yenisey
CONT                                                          2
NN                                                         2036
FISH_                                                        42
ACRES                                                     0.007
SOURCETHM                                      final_draft22.db
NO_COUNTRI                                                    0
Q3                                     International catchments
CHECKED                                                      -1
LAEA_HA                                            191639888.63
LAEA_ACRES                                          473542164.8
LAEA_PRMTR                                           10368035.7
geometry      POLYGON ((84.03548431396484 62.48704147338867,...
Name: 2, dtype: object
[6]:
ds[2].geometry
[6]:
../_images/examples_shapefile_8_0.svg
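The geometry of a feature is printed above as a POLYGON, which suggests it is a shapely geometry object, so its properties can be inspected as well as drawn. The sketch below shows a few standard shapely attributes on a hypothetical rectangle used in place of the real basin outline; the coordinates are made up for illustration.

```python
from shapely.geometry import Polygon

# Hypothetical stand-in for ds[2].geometry: a small lon/lat rectangle.
basin = Polygon([(84.0, 62.0), (86.0, 62.0), (86.0, 63.0), (84.0, 63.0)])

# Kind of geometry (a shapefile may also hold e.g. MultiPolygon features).
print(basin.geom_type)

# Bounding box as (minx, miny, maxx, maxy).
print(basin.bounds)

# Geometric centre of the polygon.
print(basin.centroid.x, basin.centroid.y)

# Number of vertices in the outer ring; the first vertex is repeated
# at the end to close the ring.
print(len(basin.exterior.coords))
```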

We can access the data as a geopandas object by using to_geopandas().

[7]:
gdf = ds.to_geopandas()
type(gdf)
[7]:
geopandas.geodataframe.GeoDataFrame
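Once the data is a GeoDataFrame, the usual pandas and geopandas operations apply. The following is a minimal sketch of selecting features by attribute and computing the overall extent. To keep it runnable without the example file, it builds a small stand-in frame by hand: the NAME, CONT and Q3 values are copied from the table above, while the rectangular geometries are hypothetical and do not correspond to the real basin outlines.

```python
import geopandas
from shapely.geometry import box

# Hypothetical stand-in for ds.to_geopandas(): real attribute values,
# made-up rectangular geometries (minx, miny, maxx, maxy).
basins = geopandas.GeoDataFrame(
    {
        "NAME": ["Indigirka", "Kolyma", "Yenisey"],
        "CONT": [2, 2, 2],
        "Q3": [None, None, "International catchments"],
    },
    geometry=[box(139, 63, 150, 72), box(150, 61, 162, 71), box(84, 46, 110, 72)],
)

# Select a single basin by name with a boolean mask, as in pandas.
yenisey = basins[basins["NAME"] == "Yenisey"]
print(yenisey[["NAME", "CONT"]])

# Select all features flagged as international catchments.
international = basins[basins["Q3"] == "International catchments"]
print(len(international))

# Bounding box of all geometries: [minx, miny, maxx, maxy].
print(basins.total_bounds)
```

With the real data, the same expressions can be applied to the object returned by to_geopandas().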