[[!tag obnam python]]
During my recent vacations, I had some time to work on Obnam, my backup program. I've been rewriting it, and had gotten it to work again, but it was slow. Profiling showed the problem to be in the in-memory LRU cache for "backup objects". It's not relevant what backup objects are, except that they have an identifier field. During a backup run, these objects need to be cached in memory for speed, but not too many of them, so that memory requirements don't blow up.
My first version of the LRU cache was basically just a list, in which the most recently used object was kept at the end. Every time I accessed an object in the cache, it was moved to the end. Turns out, linear search is slow when you do it millions of times and the list is tens of thousands of items long. Who would've guessed.
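For illustration, here's roughly what that first version looked like. This is a minimal sketch of the idea, not the actual Obnam code; the class and attribute names are mine.

    class NaiveObjectCache(object):

        """List-based LRU cache: get() does a linear scan, which is slow."""

        max = 1000

        def __init__(self):
            self.objects = []

        def get(self, objid):
            for i, obj in enumerate(self.objects):
                if obj.id == objid:
                    # Move the object to the end to mark it as most
                    # recently used.
                    del self.objects[i]
                    self.objects.append(obj)
                    return obj
            return None

        def put(self, obj):
            self.objects.append(obj)
            if len(self.objects) > self.max:
                # The least recently used object sits at the front.
                del self.objects[0]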
I set out to write a better one. The API of the cache class is basically this:
    class ObjectCache(object):

        max = N  # max number of objects in cache

        def get(self, objid):
            """Return corresponding object, or None."""

        def put(self, obj):
            """Add object, maybe forget least recently used one."""
Both get and put need to be fast, preferably on the order of O(1), but certainly much faster than O(max).
My solution is based on the realization that Python's built-in hash table (dictionary) is pretty fast, so I might as well use two.
First, age is indicated by an integer that monotonically increases on every access. I call this counter. When I access an object, I store the current value of counter as the object's age, and then I increment counter. No two objects have the same age.

values maps object identifiers to tuples of object and counter value. ages maps counter value to object identifier. smallest keeps track of the smallest counter value still remembered in ages.

At any one time (invariant!), for a given object, values[obj.id] == (obj, age) and ages[age] == obj.id.
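To make the invariant concrete, here is a small debugging helper one could bolt onto the class. The _check_invariant method is hypothetical, not part of the real code; it just walks both dictionaries and verifies they agree.

    def _check_invariant(self):
        # values and ages must mirror each other exactly.
        assert len(self.values) == len(self.ages)
        for objid, (obj, age) in self.values.items():
            assert self.ages[age] == objid
        # smallest must point at the oldest remembered counter value.
        if self.ages:
            assert self.smallest == min(self.ages)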
get retrieves the tuple from values, removes the corresponding entry in ages, puts a new tuple (with a new age) back into values, and updates ages and smallest. That's several dictionary accesses, each O(1) on average, so it's a bit more work than a single lookup, but fast enough.
put works in a similar way, except that it may need to remove the least recently used object. It finds that object's identifier via ages[smallest].
I hope that makes sense. Perhaps the full code will help:
    class ObjectCache(object):

        """Cache objects in memory."""

        def __init__(self):
            self.counter = 0    # counter for age
            self.values = {}    # indexed by object id, gives (object, counter)
            self.ages = {}      # indexed by counter, gives object id
            self.smallest = -1  # smallest remembered counter value
            # Compute a default max cache size by assuming a one megabyte
            # block size and a 64 byte object size.
            self.max = 1000 * 1000 / 64

        def get(self, objid):
            """Return corresponding object, or None."""
            pair = self.values.get(objid)
            if pair is None:
                return None
            obj, counter = pair
            self.values[objid] = (obj, self.counter)
            del self.ages[counter]
            self.ages[self.counter] = obj.id
            self.counter += 1
            # Advance smallest past counter values no longer in ages.
            while self.smallest not in self.ages:
                self.smallest += 1
            return obj

        def put(self, obj):
            """Add object, maybe forget least recently used one."""
            if obj.id in self.values:
                del self.ages[self.values[obj.id][1]]
                self.values[obj.id] = (obj, self.counter)
                self.ages[self.counter] = obj.id
                self.counter += 1
                while self.smallest not in self.ages:
                    self.smallest += 1
            else:
                self.values[obj.id] = (obj, self.counter)
                self.ages[self.counter] = obj.id
                self.counter += 1
                while self.smallest not in self.ages:
                    self.smallest += 1
                if len(self.values) > self.max:
                    # Evict the least recently used object.
                    del self.values[self.ages[self.smallest]]
                    del self.ages[self.smallest]
                    while self.smallest not in self.ages:
                        self.smallest += 1
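To show how it's meant to be used, here's a quick usage sketch. The Obj class is just a hypothetical stand-in for the real backup objects, which only need an id attribute.

    class Obj(object):

        """Stand-in for a backup object; all the cache needs is .id."""

        def __init__(self, objid):
            self.id = objid

    cache = ObjectCache()
    cache.max = 3
    for i in range(5):
        cache.put(Obj('obj-%d' % i))
    cache.get('obj-2')                    # touch obj-2: most recently used now
    assert cache.get('obj-0') is None     # evicted: only three objects fit
    assert cache.get('obj-2').id == 'obj-2'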
Now that I blog about this, I'm sure someone will point out a horrible mistake in the code, or some data structure or algorithm that will make the whole thing O(1). However, the above code dropped my test case execution time from about 90 minutes to less than 5, so I'm already happy.