cache_on_disk

Persistent disk cache for immutable method return values.

Disk cache in the style of LBT elab_lib

class arte.dataelab.cache_on_disk.DiskCacher(f)

Bases: object

Class implementing a persistent disk cache

clear_cache()

Remove cache from disk and memory

execute(*args, **kwargs)

Cache lookup

fullpath()

Full path of cache file on disk, without extension

set_prefix(prefix)

Set prefix for tag directories

set_tag(tag, instance_name)

Set the tag used to make cached data persistent

set_tmpdir(tmpdir)

Set the directory where cached data is stored

exception arte.dataelab.cache_on_disk.TagNotSetException

Bases: Exception

Cache tag has not been set

arte.dataelab.cache_on_disk.cache_on_disk(f)

Persistent disk cache for immutable method return values.

Only use this decorator for methods whose return value never changes.

Methods decorated with cache_on_disk() write their return value to an npy or pickle file, depending on the value type. Subsequent calls on the same instance reuse the previous return value, which is kept in a memory buffer. New instances using the same tag read the data from disk on their first call.
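The lookup order described above (memory buffer first, then disk, then the real computation) can be sketched with the standard library alone. This is a simplified illustration and not the arte implementation: the decorator name simple_disk_cache, the use of plain functions instead of methods, and the use of pickle for every value type are all assumptions made for brevity.

```python
import os
import pickle
import tempfile
from functools import wraps


def simple_disk_cache(path):
    """Hypothetical decorator: cache a zero-argument function in memory and on disk."""
    memory = {}  # plays the role of the in-memory buffer

    def decorator(func):
        @wraps(func)
        def wrapper():
            if "value" in memory:        # 1. already in the memory buffer
                return memory["value"]
            if os.path.exists(path):     # 2. written to disk by an earlier user
                with open(path, "rb") as f:
                    memory["value"] = pickle.load(f)
                return memory["value"]
            value = func()               # 3. first call ever: compute and persist
            with open(path, "wb") as f:
                pickle.dump(value, f)
            memory["value"] = value
            return value
        return wrapper
    return decorator


calls = []  # records how many times the real computation runs
cache_file = os.path.join(tempfile.mkdtemp(), "demo.pkl")


@simple_disk_cache(cache_file)
def expensive():
    calls.append(1)
    return [1, 2, 3]


first = expensive()    # computed and written to disk
second = expensive()   # served from the memory buffer


# A second wrapper with an empty memory buffer stands in for a new instance
# that shares the same tag: it finds the file on disk and never recomputes.
@simple_disk_cache(cache_file)
def expensive_again():
    calls.append(1)
    return [1, 2, 3]


third = expensive_again()
```

The sketch also shows why the decorator is only safe for immutable return values: once the file exists, the wrapped code is never run again.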

Caching only starts after a global tag has been defined. Call set_tag(obj, tag) with obj set to the highest object in the hierarchy (typically the analyzer instance) to initialize disk caching for all inherited and composed (via attributes) objects.

Tags should uniquely identify the dataset to be stored. One suggestion is to combine a system identifier (like KAPA or LBTISX) with a timestamp, for example LBTISX20240410_112233.

Call clear_cache(obj) to delete all temporary files for all inherited and composed (via attributes) objects. The methods' code will then be run again, and its result stored again, on the next call.
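Both set_tag() and clear_cache() are described as acting on a root object and on everything composed into it via attributes. A minimal sketch of such a traversal follows. The helper names iter_hierarchy and set_tag_everywhere, the tag attribute, and the Analyzer and Sensor classes are hypothetical, and the real module may walk the hierarchy differently.

```python
def iter_hierarchy(root, _seen=None):
    """Yield root and every object reachable through instance attributes."""
    if _seen is None:
        _seen = set()
    if id(root) in _seen:  # guard against reference cycles
        return
    _seen.add(id(root))
    yield root
    # Copy the values so that callers may modify attributes while iterating.
    for value in list(vars(root).values()):
        if hasattr(value, "__dict__"):  # skip plain numbers, strings, etc.
            yield from iter_hierarchy(value, _seen)


class Sensor:
    def __init__(self):
        self.tag = None


class Analyzer:
    def __init__(self):
        self.tag = None
        self.sensor = Sensor()   # composed object, reached via an attribute
        self.exposure = 0.01     # plain value, ignored by the traversal


def set_tag_everywhere(root, tag):
    """Hypothetical stand-in for a tag setter that covers the whole hierarchy."""
    for obj in iter_hierarchy(root):
        obj.tag = tag


analyzer = Analyzer()
set_tag_everywhere(analyzer, "LBTISX20240410_112233")
```

After the call, both the analyzer and its sensor carry the same tag, which is the effect the text attributes to calling set_tag() once on the top-level object.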

Data is stored in tmpdir/prefix<tag>/filename.npy, where:
  • tmpdir is by default the system temporary directory

  • prefix is by default “cache”

  • tag must be set by the owner class by calling set_tag() at some point

  • “filename” identifies the method and class name
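As an illustration of the layout above, the path can be assembled as follows. The helper cache_file_path and the example filename "MyAnalyzer.slopes" are hypothetical; the exact way the library encodes the class and method name into the filename is not specified here.

```python
import os
import tempfile


def cache_file_path(tag, filename, tmpdir=None, prefix="cache"):
    """Hypothetical helper that builds tmpdir/prefix<tag>/filename.npy."""
    if tmpdir is None:
        tmpdir = tempfile.gettempdir()  # default: system temporary directory
    return os.path.join(tmpdir, prefix + tag, filename + ".npy")


# Explicit tmpdir, default prefix
path = cache_file_path(
    "LBTISX20240410_112233", "MyAnalyzer.slopes", tmpdir="/data/scratch"
)

# Default tmpdir, custom prefix
default_path = cache_file_path("KAPA20240410_112233", "MyAnalyzer.slopes", prefix="elab")
```

Note that the prefix and the tag are concatenated into a single directory name, with no separator between them.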

The file always has the extension “.npy”. Any value that is not a numpy array is pickled instead, using the same extension and leveraging numpy’s transparent object pickling.
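The numpy behaviour this relies on can be seen directly with np.save and np.load. The snippet below is a sketch of that mechanism only: the file name is arbitrary, and whether the library unwraps the loaded object in the same way is an assumption.

```python
import os
import tempfile

import numpy as np

path = os.path.join(tempfile.mkdtemp(), "value.npy")

# A numpy array is written in the native npy format.
array_value = np.arange(4)
np.save(path, array_value)
loaded_array = np.load(path, allow_pickle=True)

# Any other object is wrapped in a 0-d object array and pickled,
# still under the same ".npy" extension.
dict_value = {"gain": 0.5, "modes": [1, 2, 3]}
np.save(path, dict_value)
loaded = np.load(path, allow_pickle=True)  # allow_pickle is required to read it back
loaded_dict = loaded.item()                # unwrap the 0-d object array
```

Reading a pickled object requires allow_pickle=True, which should only be used on files from a trusted source.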

Defaults for tmpdir and prefix can be overridden using set_tmpdir() and set_prefix(), each of which takes two arguments:

  • the highest object in the hierarchy (typically the analyzer instance)

  • the new value for tmpdir or prefix

arte.dataelab.cache_on_disk.clear_cache(root_obj)

Clear cache for root_obj and child/member objects

arte.dataelab.cache_on_disk.get_disk_cacher(instance, method)

Return the DiskCacher that method uses for the given instance

arte.dataelab.cache_on_disk.set_logfile(filename, level=10, name='cache_on_disk')

Set the file where log output will be written

arte.dataelab.cache_on_disk.set_prefix(root_obj, prefix)

Set the prefix for each tag directory

arte.dataelab.cache_on_disk.set_tag(root_obj, tag)

Set up the disk cache tag for root_obj and child/member objects

arte.dataelab.cache_on_disk.set_tmpdir(root_obj, tmpdir)

Set the directory where cached data is stored