NetCDF: working with fieldlists

We read a NetCDF file containing 3 variables on pressure levels on a 2D latitude-longitude grid. First we ensure the example file is available.

[1]:
import earthkit.data as ekd
ekd.download_example_file("tuv_pl.nc")
[2]:
ds = ekd.from_source("file", "tuv_pl.nc")

Our NetCDF data is represented as a FieldList consisting of NetCDFFields. A field in this context means a 2D geographical coverage (a horizontal slice).

Iteration

We can iterate through the fields (we use the first 3 fields for simplicity):

[3]:
for f in ds[:3]:
    print(f)
NetCDFField(t,level=1000,time=2018-08-01 12:00:00)
NetCDFField(t,level=850,time=2018-08-01 12:00:00)
NetCDFField(t,level=700,time=2018-08-01 12:00:00)

Inspecting the contents

[4]:
len(ds)
[4]:
18
[5]:
ds.ls()
[5]:
variable level valid_datetime units
0 t 1000 2018-08-01T12:00:00 K
1 t 850 2018-08-01T12:00:00 K
2 t 700 2018-08-01T12:00:00 K
3 t 500 2018-08-01T12:00:00 K
4 t 400 2018-08-01T12:00:00 K
5 t 300 2018-08-01T12:00:00 K
6 u 1000 2018-08-01T12:00:00 m s**-1
7 u 850 2018-08-01T12:00:00 m s**-1
8 u 700 2018-08-01T12:00:00 m s**-1
9 u 500 2018-08-01T12:00:00 m s**-1
10 u 400 2018-08-01T12:00:00 m s**-1
11 u 300 2018-08-01T12:00:00 m s**-1
12 v 1000 2018-08-01T12:00:00 m s**-1
13 v 850 2018-08-01T12:00:00 m s**-1
14 v 700 2018-08-01T12:00:00 m s**-1
15 v 500 2018-08-01T12:00:00 m s**-1
16 v 400 2018-08-01T12:00:00 m s**-1
17 v 300 2018-08-01T12:00:00 m s**-1

Slicing

Standard Python slicing is available.

[6]:
g = ds[1]
g
[6]:
NetCDFField(t,level=850,time=2018-08-01 12:00:00)
[7]:
g = ds[1:3]
g.ls()
[7]:
variable level valid_datetime units
0 t 850 2018-08-01T12:00:00 K
1 t 700 2018-08-01T12:00:00 K
[8]:
g = ds[-1]
g
[8]:
NetCDFField(v,level=300,time=2018-08-01 12:00:00)

Getting data values

Using values

The values property always returns a flat array per field:

[9]:
v = ds[0].values
v.shape
[9]:
(84,)
[10]:
v[0:4]
[10]:
array([272.56486405, 272.56486405, 272.56486405, 272.56486405])

When called on the whole fieldlist, values returns a 2D array with one row per field:

[11]:
v = ds.values
v.shape
[11]:
(18, 84)

Using to_numpy()

With to_numpy() the returned array keeps the shape of the field:

[12]:
v = ds[0].to_numpy()
print(v.shape)
print(ds[0].shape)
(7, 12)
(7, 12)
[13]:
v = ds.to_numpy()
v.shape
[13]:
(18, 7, 12)
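
The shapes above are related by a simple reshape: 84 = 7 x 12. The following is a minimal sketch with a synthetic NumPy array standing in for the real data (it does not read the example file). It assumes the flat values are the row-major flattening of the 2D field, which this notebook does not state explicitly.

```python
import numpy as np

# Synthetic stand-in for the data: 18 fields on a 7 x 12 latitude-longitude grid
n_fields, n_lat, n_lon = 18, 7, 12
gridded = np.arange(n_fields * n_lat * n_lon, dtype=float).reshape(
    n_fields, n_lat, n_lon
)

# Flat layout, one row per field: same shape as ds.values, i.e. (18, 84)
flat = gridded.reshape(n_fields, -1)

# Restore the grid shape: same shape as ds.to_numpy(), i.e. (18, 7, 12)
# (assumes row-major ordering of the flattened points)
restored = flat.reshape(n_fields, n_lat, n_lon)

print(flat.shape)
print(restored.shape)
```

Under that assumption, either layout can be derived from the other without copying the values twice, so the choice between values and to_numpy() is mostly a matter of which shape is more convenient downstream.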

Metadata

Metadata access works both on individual fields and slices:

[14]:
ds[0].metadata("variable")
[14]:
't'
[15]:
ds[0:2].metadata(["level", "variable"])
[15]:
[[1000, 't'], [850, 't']]

and on all the fields:

[16]:
ds.metadata("level")
[16]:
[1000,
 850,
 700,
 500,
 400,
 300,
 1000,
 850,
 700,
 500,
 400,
 300,
 1000,
 850,
 700,
 500,
 400,
 300]

For each field we can get the metadata as an object:

[17]:
md = ds[0].metadata()
md
[17]:
NetCDFMetadata({'units': 'K', 'long_name': 'Temperature', 'standard_name': 'air_temperature', 'date': 20180801, 'time': 1200, 'variable': 't', 'level': 1000, 'levtype': 'level'})
[18]:
md["level"]
[18]:
1000
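
Lists returned by metadata() can be post-processed with plain Python. As an illustrative sketch, the code below builds a synthetic list shaped like the result of ds.metadata(["level", "variable"]) for this file (it is a stand-in, not read from the data) and groups the field positions by variable. The name index_by_variable is hypothetical.

```python
from collections import defaultdict

# Synthetic stand-in for ds.metadata(["level", "variable"]):
# one [level, variable] pair per field, in the order shown by ds.ls()
levels = [1000, 850, 700, 500, 400, 300]
pairs = [[lev, var] for var in ("t", "u", "v") for lev in levels]

# Map each variable to the positions of its fields in the fieldlist
index_by_variable = defaultdict(list)
for position, (level, variable) in enumerate(pairs):
    index_by_variable[variable].append(position)

print(dict(index_by_variable))
```

The resulting positions could then be used to index the fieldlist one field at a time, for example ds[6] for the first u field.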

Selection

Selection by metadata always creates a “view”; no data is copied.

[19]:
g = ds.sel(variable=["u", "v"], level=850)
g.ls()
[19]:
variable level valid_datetime units
0 u 850 2018-08-01T12:00:00 m s**-1
1 v 850 2018-08-01T12:00:00 m s**-1
[20]:
g = ds.sel(variable="t")
g.ls()
[20]:
variable level valid_datetime units
0 t 1000 2018-08-01T12:00:00 K
1 t 850 2018-08-01T12:00:00 K
2 t 700 2018-08-01T12:00:00 K
3 t 500 2018-08-01T12:00:00 K
4 t 400 2018-08-01T12:00:00 K
5 t 300 2018-08-01T12:00:00 K
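
The matching rule used in the two calls above (a scalar must match exactly, a list matches any of its items, and all keys must match) can be sketched in plain Python. The select helper and the records list below are hypothetical stand-ins for illustration only; this is not how earthkit-data implements sel().

```python
def select(records, **criteria):
    """Return the records whose metadata match every criterion.

    A list or tuple value matches if the record's value is any of its items.
    """
    def matches(record):
        for key, wanted in criteria.items():
            allowed = wanted if isinstance(wanted, (list, tuple)) else [wanted]
            if record.get(key) not in allowed:
                return False
        return True

    # The result refers to the original record objects; nothing is copied
    return [record for record in records if matches(record)]


# Synthetic metadata mirroring the 18 fields listed by ds.ls()
levels = [1000, 850, 700, 500, 400, 300]
records = [
    {"variable": var, "level": lev} for var in ("t", "u", "v") for lev in levels
]

subset = select(records, variable=["u", "v"], level=850)
print(subset)
```

Because the returned list holds references to the original records, it loosely mirrors the "view" behaviour described above, although the real fieldlist view is a library object rather than a list.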

Xarray

Xarray conversion does not involve disk writing.

[21]:
ds1 = ds.to_xarray()
ds1
[21]:
<xarray.Dataset> Size: 12kB
Dimensions:    (longitude: 12, latitude: 7, level: 6, time: 1)
Coordinates:
  * longitude  (longitude) float32 48B 0.0 30.0 60.0 90.0 ... 270.0 300.0 330.0
  * latitude   (latitude) float32 28B 90.0 60.0 30.0 0.0 -30.0 -60.0 -90.0
  * level      (level) int32 24B 1000 850 700 500 400 300
  * time       (time) datetime64[ns] 8B 2018-08-01T12:00:00
Data variables:
    t          (time, level, latitude, longitude) float64 4kB dask.array<chunksize=(1, 6, 7, 12), meta=np.ndarray>
    u          (time, level, latitude, longitude) float64 4kB dask.array<chunksize=(1, 6, 7, 12), meta=np.ndarray>
    v          (time, level, latitude, longitude) float64 4kB dask.array<chunksize=(1, 6, 7, 12), meta=np.ndarray>
Attributes:
    Conventions:  CF-1.6
    history:      2023-08-07 18:24:35 GMT by grib_to_netcdf-2.30.2: grib_to_n...