cache_on_disk
Persistent disk cache for immutable method return values.
Disk cache in the style of LBT elab_lib
- class arte.dataelab.cache_on_disk.DiskCacher(f)
Bases: object
Class implementing a persistent disk cache
Methods
clear_cache(): Remove cache from disk and memory
execute(*args, **kwargs): Cache lookup
fullpath(): Full path of cache file on disk, without extension
set_prefix(prefix): Set prefix for tag directories
set_tag(tag, instance_name): Set the tag used to make cached data persistent
set_tmpdir(tmpdir): Set the directory where cached data is stored
- clear_cache()
Remove cache from disk and memory
- execute(*args, **kwargs)
Cache lookup
- fullpath()
Full path of cache file on disk, without extension
- set_prefix(prefix)
Set prefix for tag directories
- set_tag(tag, instance_name)
Set the tag used to make cached data persistent
- set_tmpdir(tmpdir)
Set the directory where cached data is stored
- exception arte.dataelab.cache_on_disk.TagNotSetException
Bases: Exception
Cache tag has not been set
- arte.dataelab.cache_on_disk.cache_on_disk(f)
Persistent disk cache for immutable method return values.
Only use this decorator for methods whose return value never changes.
Methods decorated with cache_on_disk() write their return value to a file on disk, stored as a numpy array or as a pickle depending on the value type. Subsequent calls on the same instance reuse the previous return value, which is kept in a memory buffer. New instances using the same tag read the data from disk on their first call.
Caching only starts after a global tag has been defined. Call set_tag(obj, tag) with obj set to the highest object in the hierarchy (typically the analyzer instance) to initialize disk caching for all inherited and composed (via attributes) objects.
A tag should uniquely identify the dataset to be stored. One suggestion is to combine a system identifier (like KAPA or LBTISX) with a timestamp, for example LBTISX20240410_112233.
Call clear_cache(obj) to delete all temporary files for all inherited and composed (via attributes) objects. The next call to each cached method will run the method's code again and store its result anew.
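The lifecycle described above (memory buffer, disk file shared through a tag, explicit clearing) can be illustrated with a small stand-in. This is a minimal sketch and not the arte implementation: the names sketch_cache_on_disk, sketch_set_tag, sketch_clear_cache and Analyzer are hypothetical, it only handles methods without arguments, it stores plain pickle files, and it does not walk child or member objects.

```python
import functools
import os
import pickle
import shutil
import tempfile


def sketch_cache_on_disk(method):
    """Hypothetical stand-in: cache a zero-argument method in memory and on disk."""

    @functools.wraps(method)
    def wrapper(self):
        tag_dir = getattr(self, "_sketch_tag_dir", None)
        if tag_dir is None:
            # No tag defined yet: caching has not started, run the method as-is.
            return method(self)
        memory = self.__dict__.setdefault("_sketch_memory", {})
        name = method.__name__
        if name in memory:
            # Later calls on the same instance reuse the memory buffer.
            return memory[name]
        path = os.path.join(tag_dir, f"{type(self).__name__}.{name}.pkl")
        if os.path.exists(path):
            # A new instance with the same tag reads from disk on its first call.
            with open(path, "rb") as f:
                value = pickle.load(f)
        else:
            value = method(self)
            os.makedirs(tag_dir, exist_ok=True)
            with open(path, "wb") as f:
                pickle.dump(value, f)
        memory[name] = value
        return value

    return wrapper


def sketch_set_tag(obj, tag, tmpdir, prefix="cache"):
    """Hypothetical helper: enable caching for obj under tmpdir/prefix<tag>."""
    obj._sketch_tag_dir = os.path.join(tmpdir, prefix + tag)


def sketch_clear_cache(obj):
    """Hypothetical helper: drop the memory buffer and the files on disk."""
    obj.__dict__.pop("_sketch_memory", None)
    shutil.rmtree(obj._sketch_tag_dir, ignore_errors=True)


class Analyzer:
    """Toy owner class; counts how often the method body really runs."""

    def __init__(self):
        self.calls = 0

    @sketch_cache_on_disk
    def mean_flux(self):
        self.calls += 1
        return [1.0, 2.0, 3.0]


workdir = tempfile.mkdtemp()
tag = "LBTISX20240410_112233"

first = Analyzer()
sketch_set_tag(first, tag, workdir)
first.mean_flux()
first.mean_flux()
calls_after_two_calls = first.calls  # body ran once, second call hit memory

second = Analyzer()
sketch_set_tag(second, tag, workdir)
second_value = second.mean_flux()  # read from disk, body not run

sketch_clear_cache(first)
first.mean_flux()
calls_after_clear = first.calls  # body ran again after clearing

shutil.rmtree(workdir, ignore_errors=True)
```

The call counter makes the three paths visible: a computed value, a memory hit, and a disk hit from a second instance sharing the tag.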
- Data is stored in tmpdir/prefix<tag>/filename.npy, where:
tmpdir is by default the system temporary directory
prefix is by default “cache”
tag must be set by the owner class by calling set_tag() at some point
“filename” identifies the method and class name
The file always has the extension “.npy”. A value of any type other than a numpy array is pickled instead, using the same extension and leveraging numpy’s transparent object pickling.
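The shared ".npy" extension works because numpy can store arbitrary Python objects by pickling them inside the file. The following sketch shows that mechanism with plain numpy calls; how arte invokes numpy internally is not shown in this page, so the file names and the exact arguments here are assumptions.

```python
import os
import shutil
import tempfile

import numpy as np

workdir = tempfile.mkdtemp()

# A numpy array is stored natively and loads back without pickling.
array_file = os.path.join(workdir, "array_value.npy")
np.save(array_file, np.arange(4))
array_back = np.load(array_file)

# Any other object is wrapped in a 0-d object array and pickled,
# still under the ".npy" extension.
object_file = os.path.join(workdir, "object_value.npy")
np.save(object_file, {"gain": 0.5, "modes": [1, 2, 3]}, allow_pickle=True)

# Loading pickled content must be enabled explicitly; .item() unwraps
# the 0-d object array back into the original Python object.
object_back = np.load(object_file, allow_pickle=True).item()

shutil.rmtree(workdir, ignore_errors=True)
```

Because loading pickled data can execute arbitrary code, allow_pickle=True should only be used on cache files from a trusted source.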
Defaults for tmpdir and prefix can be overridden using set_tmpdir() and set_prefix(), each taking two arguments:
the highest object in the hierarchy (typically the analyzer instance)
the new value for tmpdir or prefix
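The path layout above can be reproduced with a few lines of standard-library code. This is only an illustration of the tmpdir/prefix<tag>/filename.npy scheme: the helper sketch_cache_path is hypothetical, and the filename "MyAnalyzer.my_method" is a made-up placeholder, since the exact way the method and class names are combined is not specified here.

```python
import os
import tempfile


def sketch_cache_path(tag, filename, tmpdir=None, prefix="cache"):
    """Hypothetical helper: build tmpdir/prefix<tag>/filename.npy."""
    if tmpdir is None:
        # Default described above: the system temporary directory.
        tmpdir = tempfile.gettempdir()
    # The prefix is prepended to the tag to form the directory name.
    return os.path.join(tmpdir, prefix + tag, filename + ".npy")


tag = "LBTISX20240410_112233"

# With the defaults: <system tmp>/cacheLBTISX20240410_112233/MyAnalyzer.my_method.npy
default_path = sketch_cache_path(tag, "MyAnalyzer.my_method")

# With both defaults overridden, as set_tmpdir() and set_prefix() would do.
custom_path = sketch_cache_path(
    tag,
    "MyAnalyzer.my_method",
    tmpdir=os.path.join("data", "scratch"),
    prefix="elab_",
)
```

Note that the prefix and the tag are concatenated without a separator, so a prefix such as "elab_" needs its own trailing underscore if one is wanted in the directory name.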
- arte.dataelab.cache_on_disk.clear_cache(root_obj)
Clear cache for root_obj and child/member objects
- arte.dataelab.cache_on_disk.get_disk_cacher(instance, method)
Return the DiskCacher used by method for the given instance
- arte.dataelab.cache_on_disk.set_logfile(filename, level=10, name='cache_on_disk')
Set the file where log output will be written
- arte.dataelab.cache_on_disk.set_prefix(root_obj, prefix)
Set the prefix for each tag directory
- arte.dataelab.cache_on_disk.set_tag(root_obj, tag)
Set up the disk cache tag for root_obj and child/member objects
- arte.dataelab.cache_on_disk.set_tmpdir(root_obj, tmpdir)
Set the directory where cached data is stored