Xarray: using CuPy
This notebook demonstrates how to use Xarray on a GPU with CuPy. CuPy is not a dependency of earthkit-data, so it has to be installed separately. The notebook also requires a working CUDA-based GPU environment.
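Before running the cells below, it can help to confirm that both prerequisites are met. The following is a minimal sketch, not part of earthkit-data: the helper name `cupy_gpu_available` is hypothetical, and it relies only on CuPy's `cupy.cuda.runtime.getDeviceCount()` call. It returns `False` rather than raising when CuPy or a usable CUDA device is missing.

```python
import importlib.util


def cupy_gpu_available() -> bool:
    """Return True if CuPy imports and reports at least one CUDA device.

    Hypothetical helper for illustration; not part of earthkit-data.
    """
    # CuPy is installed separately, so check for it before importing.
    if importlib.util.find_spec("cupy") is None:
        return False
    try:
        import cupy

        # getDeviceCount() raises if the CUDA driver or runtime is unusable.
        return cupy.cuda.runtime.getDeviceCount() > 0
    except Exception:
        # Broken install, missing driver, or no visible device.
        return False


print("CuPy GPU available:", cupy_gpu_available())
```

If this prints `False`, the `to_device("cuda:0")` call used later in the notebook should not be expected to work.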
[1]:
# Get GRIB data on pressure levels
import earthkit.data as ekd
ds = ekd.from_source("sample", "pl.grib").to_fieldlist()
[2]:
# Create a lazily loaded Xarray dataset backed by NumPy arrays
r = ds.to_xarray()
r
[2]:
<xarray.Dataset> Size: 176kB
Dimensions: (forecast_reference_time: 4, step: 2, level: 2,
latitude: 19, longitude: 36)
Coordinates:
* forecast_reference_time (forecast_reference_time) datetime64[ns] 32B 202...
* step (step) timedelta64[ns] 16B 00:00:00 06:00:00
* level (level) int64 16B 500 700
* latitude (latitude) float64 152B 90.0 80.0 ... -80.0 -90.0
* longitude (longitude) float64 288B 0.0 10.0 ... 340.0 350.0
Data variables:
r (forecast_reference_time, step, level, latitude, longitude) float64 88kB ...
t (forecast_reference_time, step, level, latitude, longitude) float64 88kB ...
Attributes:
class: od
stream: oper
levtype: pl
type: fc
expver: 0001
date: 20240603
time: 0
domain: g
number: 0
Conventions: CF-1.8
institution: ECMWF
[3]:
type(r.t.data)
[3]:
numpy.ndarray
Move to the GPU as CuPy
We use the to_device() method, which is available on the earthkit Xarray accessor. The first argument specifies the device. When the device is not “cpu” and the array_backend keyword argument is not specified, array_backend is automatically set to “cupy”.
[4]:
r_cp = r.earthkit.to_device("cuda:0")
# equivalent code:
# r_cp = r.earthkit.to_device("cuda:0", array_backend="cupy")
[5]:
type(r_cp.t.data)
[5]:
cupy.ndarray
[6]:
# Xarray computations work
r_cp.t.mean()
[6]:
<xarray.DataArray 't' ()> Size: 8B array(261.56490497)
[7]:
# Alter the values
r_cp += 1
type(r_cp.t.data)
[7]:
cupy.ndarray
Move back to the CPU as NumPy
We use to_device() again to move the dataset back to the CPU. When the device is “cpu” and the array_backend keyword argument is not specified, array_backend is automatically set to “numpy”.
[8]:
r_np = r_cp.earthkit.to_device("cpu")
# equivalent code:
# r_np = r_cp.earthkit.to_device("cpu", array_backend="numpy")
[9]:
type(r_np.t.data)
[9]:
numpy.ndarray
[10]:
# The dataset contains the values altered on the GPU
r_np.t.mean()
[10]:
<xarray.DataArray 't' ()> Size: 8B array(262.56490497)
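The defaulting rule for array_backend described in the two sections above can be summarised in a few lines of plain Python. This is only a sketch of the documented behaviour, not earthkit-data's actual implementation; the function name `resolve_array_backend` is hypothetical.

```python
from typing import Optional


def resolve_array_backend(device: str, array_backend: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the defaulting rule described in the text."""
    # An explicitly given backend always takes precedence.
    if array_backend is not None:
        return array_backend
    # Otherwise "cpu" maps to NumPy and any other device to CuPy.
    return "numpy" if device == "cpu" else "cupy"


print(resolve_array_backend("cuda:0"))
print(resolve_array_backend("cpu"))
```

Under this rule, `to_device("cuda:0")` and `to_device("cuda:0", array_backend="cupy")` select the same backend, which matches the equivalent calls shown in the cells above.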