Reading data from URLs

Using individual URLs

We can read individual files from URLs with from_source():

[1]:

import earthkit.data as ekd

fs = ekd.from_source(
    "url",
    "https://sites.ecmwf.int/repository/earthkit-data/examples/test.grib",
)

[2]:

fs.ls()

[2]:

	centre	shortName	typeOfLevel	level	dataDate	dataTime	stepRange	dataType	number	gridType
0	ecmf	2t	surface	0	20200513	1200	0	an	0	regular_ll
1	ecmf	msl	surface	0	20200513	1200	0	an	0	regular_ll

Tar and zip archives can also be loaded from a URL:

[3]:

fs = ekd.from_source(
    "url",
    "https://sites.ecmwf.int/repository/earthkit-data/examples/test_gribs.tar",
)

[4]:

fs.ls()

[4]:

	centre	shortName	typeOfLevel	level	dataDate	dataTime	dataType	gridType
0	ecmf	2t	surface	0	20200513	1200	an	regular_ll
1	ecmf	msl	surface	0	20200513	1200	an	regular_ll
2	ecmf	t	isobaricInhPa	500	20070101	1200	an	regular_ll
3	ecmf	z	isobaricInhPa	500	20070101	1200	an	regular_ll
4	ecmf	t	isobaricInhPa	850	20070101	1200	an	regular_ll
5	ecmf	z	isobaricInhPa	850	20070101	1200	an	regular_ll

Using multiple URLs

We can read a list of URLs in one go. In the example below the first file contains 2 fields and the second contains 4, so the resulting fieldlist has 6 fields.

[5]:

fs = ekd.from_source(
    "url",
    [
        "https://sites.ecmwf.int/repository/earthkit-data/examples/test.grib",
        "https://sites.ecmwf.int/repository/earthkit-data/examples/test4.grib",
    ],
)
fs.ls()

[5]:

	centre	shortName	typeOfLevel	level	dataDate	dataTime	dataType	gridType
0	ecmf	2t	surface	0	20200513	1200	an	regular_ll
1	ecmf	msl	surface	0	20200513	1200	an	regular_ll
2	ecmf	t	isobaricInhPa	500	20070101	1200	an	regular_ll
3	ecmf	z	isobaricInhPa	500	20070101	1200	an	regular_ll
4	ecmf	t	isobaricInhPa	850	20070101	1200	an	regular_ll
5	ecmf	z	isobaricInhPa	850	20070101	1200	an	regular_ll

Using URL patterns

URLs can also be specified with the “url-pattern” source. In the example below, substituting the values 4 and 6 for the “id” parameter yields two files, test4.grib and test6.grib:

[6]:

fs = ekd.from_source(
    "url-pattern",
    "https://sites.ecmwf.int/repository/earthkit-data/examples/test{id}.grib",
    {"id": [4, 6]},
)
fs.ls()

[6]:

	centre	shortName	typeOfLevel	level	dataDate	dataTime	dataType	gridType
0	ecmf	t	isobaricInhPa	500	20070101	1200	an	regular_ll
1	ecmf	z	isobaricInhPa	500	20070101	1200	an	regular_ll
2	ecmf	t	isobaricInhPa	850	20070101	1200	an	regular_ll
3	ecmf	z	isobaricInhPa	850	20070101	1200	an	regular_ll
4	ecmf	t	isobaricInhPa	1000	20180801	1200	an	regular_ll
5	ecmf	u	isobaricInhPa	1000	20180801	1200	an	regular_ll
6	ecmf	v	isobaricInhPa	1000	20180801	1200	an	regular_ll
7	ecmf	t	isobaricInhPa	850	20180801	1200	an	regular_ll
8	ecmf	u	isobaricInhPa	850	20180801	1200	an	regular_ll
9	ecmf	v	isobaricInhPa	850	20180801	1200	an	regular_ll
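
To make the substitution concrete, the sketch below expands a pattern into one URL per value of “id”, using only the standard library. The helper name expand_pattern is hypothetical and is not part of earthkit-data; it also assumes that several list-valued parameters combine as a Cartesian product, which may not match the library's own expansion logic in every detail.

```python
import itertools


def expand_pattern(pattern, values):
    """Hypothetical helper: return one string per combination of values."""
    names = list(values)
    # Treat scalars as one-element lists so every parameter is iterable.
    choices = [v if isinstance(v, (list, tuple)) else [v] for v in values.values()]
    # Assumption: list-valued parameters combine as a Cartesian product.
    return [
        pattern.format(**dict(zip(names, combo)))
        for combo in itertools.product(*choices)
    ]


base = "https://sites.ecmwf.int/repository/earthkit-data/examples/test{id}.grib"
urls = expand_pattern(base, {"id": [4, 6]})
for url in urls:
    print(url)
```

The two printed URLs end in test4.grib and test6.grib, the same files that the “url-pattern” source reads in the cell above.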

We can specify a format for each pattern parameter. In this example “my_date” is the parameter name and “:date(%Y-%m-%d)” specifies how its value is formatted:

[7]:

import datetime

fs = ekd.from_source(
    "url-pattern",
    "https://sites.ecmwf.int/repository/earthkit-data/test-data/test_{my_date:date(%Y-%m-%d)}_{name}.grib",
    {"my_date": datetime.datetime(2020, 5, 13), "name": ["t2", "msl"]},
)
fs.ls()

[7]:

	centre	shortName	typeOfLevel	level	dataDate	dataTime	stepRange	dataType	number	gridType
0	ecmf	2t	surface	0	20200513	1200	0	an	0	regular_ll
1	ecmf	msl	surface	0	20200513	1200	0	an	0	regular_ll
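
The file names this pattern resolves to can be previewed with datetime.strftime, assuming the date(...) format takes the standard strftime codes, as the %Y-%m-%d directives suggest. This only illustrates the resulting names; it is not how earthkit-data performs the substitution internally.

```python
import datetime

my_date = datetime.datetime(2020, 5, 13)
names = ["t2", "msl"]

# Render the date with the same strftime codes used in the pattern.
date_str = my_date.strftime("%Y-%m-%d")

base = "https://sites.ecmwf.int/repository/earthkit-data/test-data/"
urls = [f"{base}test_{date_str}_{name}.grib" for name in names]
for url in urls:
    print(url)
```

Under that assumption the two URLs end in test_2020-05-13_t2.grib and test_2020-05-13_msl.grib, one per value of “name”.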
