earthkit.data.core.caching

Attributes

Classes

Cache

Class controlling the cache.

CacheManager

A class that represents a thread of control.

CachePolicy

DiskUsage

EmptyCachePolicy

Future

NoCachePolicy

TmpCachePolicy

UserCachePolicy

Functions

auxiliary_cache_file(owner, path[, index, content, ...])

Create an auxiliary cache file.

cache_file(owner, create, args[, hash_extra, ...])

Creates a cache or temporary file in the folder specified by the cache-policy.

default_serialiser(o)

disk_usage(path)

Module Contents

earthkit.data.core.caching.CACHE = None
earthkit.data.core.caching.CACHE_DB = 'cache-2.db'
earthkit.data.core.caching.CONNECTION = None
class earthkit.data.core.caching.Cache

Class controlling the cache.

See Caching for details.

check_size(*args, **kwargs)

Check the cache size and trim it down when needed.

Runs automatically when a new entry is added to the cache or when the cache config parameters change. Does not work when the cache-policy is “off”.

The algorithm includes three steps:

  • first, the cache size is determined

  • next, if the size is larger than the limit defined by the maximum-cache-size config, the oldest cache entries are removed until the desired size is reached

  • finally, if the size is larger than the limit defined by the maximum-cache-disk-usage config, the oldest cache entries are removed until the desired size is reached
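
The trimming step described above can be illustrated with a small, self-contained sketch. It is not the library's implementation: the function name `trim_oldest` and the entry fields `path`, `size` and `last_access` are hypothetical, chosen only to show the "remove the oldest entries until the limit is met" idea. The disk-usage step would presumably follow the same pattern with a limit derived from the disk usage.

```python
def trim_oldest(entries, max_size):
    """Drop the oldest entries until the total size is at most max_size.

    entries: list of dicts with the hypothetical keys "path",
    "size" (bytes) and "last_access" (any sortable value).
    Returns a (kept, removed) pair of lists.
    """
    # Oldest first, so the least recently used entries are removed first.
    kept = sorted(entries, key=lambda e: e["last_access"])
    total = sum(e["size"] for e in kept)
    removed = []
    while kept and total > max_size:
        oldest = kept.pop(0)
        total -= oldest["size"]
        removed.append(oldest)
    return kept, removed


entries = [
    {"path": "a.cache", "size": 400, "last_access": 1},
    {"path": "b.cache", "size": 300, "last_access": 2},
    {"path": "c.cache", "size": 200, "last_access": 3},
]
kept, removed = trim_oldest(entries, max_size=600)
print([e["path"] for e in removed])  # ['a.cache']
```

Only the oldest entry is removed here, because dropping it already brings the total from 900 to 500 bytes, which is below the limit of 600.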

directory()

Return the path to the current (cache) directory.

Returns:

The cache directory when cache-policy is “user” or “temporary”. The temporary directory when cache-policy is “off”.

Return type:

str

entries(*args, **kwargs)

Dump the entries stored in the cache.

Does not work when the cache-policy is “off”.

Returns:

One dict per cache entry.

Return type:

list of dict

property policy

Get the current cache policy.

Type:

CachePolicy

purge(*args, **kwargs)

Delete entries from the cache.

Does not work when the cache-policy is “off”.

Parameters:

**kwargs (dict, optional) –

Other keyword arguments:

  • matcher: callable

    Callable used to select the cache entries to delete. It takes a cache entry as its only argument and should return True if the entry is to be deleted.

Examples

Delete all entries.

>>> from earthkit.data import cache
>>> cache.purge()

Delete all entries where the “owner” is “test_cache”.

>>> from earthkit.data import cache
>>> cache.purge(matcher=lambda e: e["owner"] == "test_cache")
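
To make the matcher contract concrete, the following stand-alone sketch applies a matcher to a list of plain dicts. It does not call earthkit-data; `select_for_deletion` is a hypothetical helper, and the only entry key it relies on is "owner", taken from the example above.

```python
def select_for_deletion(entries, matcher=None):
    """Return the entries a purge would target.

    With no matcher every entry is selected; otherwise only the
    entries for which matcher(entry) is truthy are selected.
    """
    if matcher is None:
        return list(entries)
    return [e for e in entries if matcher(e)]


# Made-up entries, for illustration only.
entries = [
    {"owner": "test_cache", "size": 10},
    {"owner": "url", "size": 20},
    {"owner": "test_cache", "size": 30},
]

everything = select_for_deletion(entries)
only_tests = select_for_deletion(entries, matcher=lambda e: e["owner"] == "test_cache")
print(len(everything), len(only_tests))  # 3 2
```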
size(*args, **kwargs)

Return the total number of bytes stored in the cache.

Does not work when the cache-policy is “off”.

summary_dump_database(*args, **kwargs)

Return the number of items and total size of the cache.

Does not work when the cache-policy is “off”.

Returns:

  • num (int) – number of items in the cache

  • size (int) – total number of bytes stored in the cache

Examples

>>> from earthkit.data import cache
>>> cache.summary_dump_database()
(40, 846785699)
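
The returned pair is easier to read once the byte count is converted to a larger unit. The helper below is a hypothetical convenience function, not part of earthkit-data; it only formats a `(num, size)` pair like the one shown above.

```python
def describe(num, size):
    """Format an (item count, byte count) pair using binary units."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(size)
    for unit in units:
        # Stop at the first unit that keeps the value below 1024,
        # or at the largest unit available.
        if value < 1024 or unit == units[-1]:
            break
        value /= 1024
    return f"{num} items, {value:.1f} {unit}"


print(describe(40, 846785699))  # 40 items, 807.6 MiB
```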
class earthkit.data.core.caching.CacheManager

Bases: threading.Thread

A class that represents a thread of control.

This class can be safely subclassed in a limited fashion. There are two ways to specify the activity: by passing a callable object to the constructor, or by overriding the run() method in a subclass.

property connection
enqueue(func, *args, **kwargs)
new_connection()
run()

Method representing the thread’s activity.

You may override this method in a subclass. The standard run() method invokes the callable object passed to the object’s constructor as the target argument, if any, with sequential and keyword arguments taken from the args and kwargs arguments, respectively.
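
The member names listed here (enqueue(), run(), and the separate Future class with execute() and result()) suggest a worker-thread pattern in which calls are queued and executed one at a time on a dedicated thread. The sketch below shows that general pattern only. It is an assumption about the design, not the library's code: `SketchFuture`, `SketchManager` and `stop()` are hypothetical names. One plausible motivation for such a design is to serialise access to a single database connection, but the reference does not state this.

```python
import queue
import threading


class SketchFuture:
    """Hypothetical stand-in: holds a call and lets the caller wait for it."""

    def __init__(self, func, args, kwargs):
        self.func, self.args, self.kwargs = func, args, kwargs
        self._done = threading.Event()
        self._result = None
        self._error = None

    def execute(self):
        # Runs on the worker thread; stores either the result or the error.
        try:
            self._result = self.func(*self.args, **self.kwargs)
        except Exception as exc:
            self._error = exc
        finally:
            self._done.set()

    def result(self):
        # Blocks the calling thread until execute() has finished.
        self._done.wait()
        if self._error is not None:
            raise self._error
        return self._result


class SketchManager(threading.Thread):
    """Hypothetical worker thread executing queued calls in order."""

    def __init__(self):
        super().__init__(daemon=True)
        self._queue = queue.Queue()

    def run(self):
        while True:
            future = self._queue.get()
            if future is None:  # sentinel placed by stop()
                break
            future.execute()

    def enqueue(self, func, *args, **kwargs):
        future = SketchFuture(func, args, kwargs)
        self._queue.put(future)
        return future

    def stop(self):
        self._queue.put(None)


manager = SketchManager()
manager.start()
total = manager.enqueue(sum, [1, 2, 3]).result()
manager.stop()
manager.join()
print(total)  # 6
```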

class earthkit.data.core.caching.CachePolicy
CACHE_KEYS = ['cache-policy', 'user-cache-directory', 'temporary-cache-directory-root',...
OUTDATED_CHECK_KEYS = None
abstractmethod directory()
file_in_cache_directory(path)
static from_config()
abstractmethod is_cache_size_managed()
abstractmethod managed()
abstractmethod maximum_cache_disk_usage()
abstractmethod maximum_cache_size()
property name
outdated()
update()
abstractmethod use_message_position_index_cache()
class earthkit.data.core.caching.DiskUsage(path)
path = b'.'
percent
class earthkit.data.core.caching.EmptyCachePolicy

Bases: CachePolicy

CACHE_KEYS = ['cache-policy', 'user-cache-directory', 'temporary-cache-directory-root',...
OUTDATED_CHECK_KEYS = None
directory()
file_in_cache_directory(path)
static from_config()
is_cache_size_managed()
managed()
maximum_cache_disk_usage()
maximum_cache_size()
property name
outdated()
update()
use_message_position_index_cache()
class earthkit.data.core.caching.Future(func, args, kwargs)
args
execute()
func
kwargs
result()
class earthkit.data.core.caching.NoCachePolicy

Bases: CachePolicy

CACHE_KEYS = ['cache-policy', 'user-cache-directory', 'temporary-cache-directory-root',...
OUTDATED_CHECK_KEYS = ['cache-policy', 'temporary-directory-root']
directory()
file_in_cache_directory(path)
static from_config()
is_cache_size_managed()
managed()
maximum_cache_disk_usage()
maximum_cache_size()
property name
outdated()
update()
use_message_position_index_cache()
class earthkit.data.core.caching.TmpCachePolicy

Bases: UserCachePolicy

CACHE_KEYS = ['cache-policy', 'user-cache-directory', 'temporary-cache-directory-root',...
OUTDATED_CHECK_KEYS = ['cache-policy', 'temporary-cache-directory-root']
directory()
file_in_cache_directory(path)
static from_config()
is_cache_size_managed()
managed()
maximum_cache_disk_usage()
maximum_cache_size()
property name
outdated()
update()
use_message_position_index_cache()
class earthkit.data.core.caching.UserCachePolicy

Bases: CachePolicy

CACHE_KEYS = ['cache-policy', 'user-cache-directory', 'temporary-cache-directory-root',...
OUTDATED_CHECK_KEYS = ['cache-policy', 'user-cache-directory']
directory()
file_in_cache_directory(path)
static from_config()
is_cache_size_managed()
managed()
maximum_cache_disk_usage()
maximum_cache_size()
property name
outdated()
update()
use_message_position_index_cache()
earthkit.data.core.caching.VERSION = 2
earthkit.data.core.caching.auxiliary_cache_file(owner, path, index=0, content=None, extension='.cache')

Create an auxiliary cache file.

It can be used, for example, to cache an index for a message-based format such as GRIB. The file is invalidated if path changes.

earthkit.data.core.caching.cache_file(owner, create, args, hash_extra=None, extension='.cache', force=None, replace=None)

Creates a cache or temporary file in the folder specified by the cache-policy.

Parameters:
  • owner (str) – The owner of the cache file, generally the name of the source that called cache_file().

  • create (callable) – The method to create the contents of the cache file.

  • args (list-like) – The parameters used to generate the cache key, which is also encoded into the file name and stored in the cache entry.

  • extension (str) – Filename extension (such as “.nc” for NetCDF), by default “.cache”.

  • force (callable, bool) – Method or flag to decide whether an already existing cache file should be regenerated.

Returns:

path – Full path to the cache file.

Return type:

str

Notes

The behaviour depends on the cache policy:

  • If the cache-policy is user or temporary, the file is created in the cache directory and the relevant entries are added to the cache database using Cache._register_cache_file().

  • If the cache-policy is off, the file is created in the temporary directory. No cache database or monitoring is available. The cache directory merely serves as temporary space.
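
The idea of deriving the file name from a key built out of owner and args, and calling create only when the file is missing, can be sketched without earthkit-data. Everything below is an assumption made for illustration: the function `sketch_cache_file`, the use of JSON plus SHA-256 for the key, the `create(path, args)` call signature, and the explicit `directory` argument are not taken from the library, and no cache database is involved.

```python
import hashlib
import json
import os
import tempfile


def sketch_cache_file(directory, owner, create, args, extension=".cache", force=False):
    """Return the path of a file whose name is derived from owner and args.

    create(path, args) is called only if the file does not exist yet
    or if force is true (hypothetical signature, see the text above).
    """
    # Same owner and args always produce the same key, hence the same file.
    key = json.dumps({"owner": owner, "args": args}, sort_keys=True, default=str)
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    path = os.path.join(directory, f"{owner}-{digest}{extension}")
    if force or not os.path.exists(path):
        create(path, args)
    return path


calls = []


def write_text(path, args):
    # Records each invocation so reuse of the cached file is visible.
    calls.append(path)
    with open(path, "w") as f:
        f.write(" ".join(args))


with tempfile.TemporaryDirectory() as tmp:
    first = sketch_cache_file(tmp, "demo", write_text, ["a", "b"], extension=".txt")
    second = sketch_cache_file(tmp, "demo", write_text, ["a", "b"], extension=".txt")
    same_path = first == second
    n_calls = len(calls)

print(same_path, n_calls)  # True 1
```

The second call returns the same path without invoking `write_text` again, which is the behaviour a caller would rely on when the cache-policy is user or temporary.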

earthkit.data.core.caching.default_serialiser(o)
earthkit.data.core.caching.disk_usage(path)