Garbage, Weakrefs, and Magic Closures

February 22, 2012

In the previous post, I showed a class that can be used to pass ‘weak references’ to bound methods as callbacks without creating extra refs that prevent garbage collection. For global functions, you can simply use a standard weakref, but very simple callbacks are often written as a lambda or a short function defined in the local scope, like so:

def register_callbacks(self):
    
    def callback(*x):
        ... do stuff ...
    
    some_object.connect('event', callback)
    
    some_object.connect('another-event', lambda *x: ...)
    

Since no external reference to these functions is normally held, a weakref used in their place becomes invalid immediately (or as soon as the enclosing function returns). This is rarely a concern unless the object storing the callbacks itself uses weakrefs (such as a WeakValueDictionary) to store them.

Exaile does this, which is why, if you write plugins for Exaile that use the event manager, you will either need to store references to these functions or use global functions or class methods.
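
The effect is easy to reproduce in isolation. The following is a minimal sketch, not Exaile's actual event manager: the callbacks registry and the register function are hypothetical names chosen for illustration, and the immediate cleanup it shows relies on CPython's reference counting.

```python
import gc
import weakref

# Hypothetical registry that stores its callbacks weakly.
callbacks = weakref.WeakValueDictionary()

def register():
    def callback(*args):
        return 'called'

    callbacks['event'] = callback
    # The local name 'callback' still holds a strong reference here.
    return 'event' in callbacks

alive_inside = register()

# A lambda passed straight to weakref.ref has no other owner at all.
lambda_ref = weakref.ref(lambda *args: None)

# Force a collection for interpreters that do not use reference counting.
gc.collect()

alive_after = 'event' in callbacks
lambda_alive = lambda_ref() is not None

print('inside register: %s' % alive_inside)
print('after register: %s' % alive_after)
print('lambda alive: %s' % lambda_alive)
```

The entry exists while register is still running and is gone once it returns, which is why such callbacks have to be kept alive by a reference stored somewhere else.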

The Magic Closure Problem
In most cases, you don’t have to worry about such things and can just pass a strong reference (the actual function) to the callback managers. There is one caveat, however, that I recently ran into that wasn’t very obvious at first: objects (like self) that are referenced inside the lambda or local function are kept alive by reference in the closure attached to it. I’ll explain with a few examples:

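import gc
import weakref
from pprint import pprint

# Report collectable objects and keep them in gc.garbage for inspection
gc.set_debug(gc.DEBUG_LEAK)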

class C(object):
    def __init__(self, x):
        self.x = x

class A(object):
    def __init__(self):
        def lfun():
            return 'lfun'
            
        self.c = C(lfun) 
    
    def fun(self):
        return 'fun'
        
    def cfun(self):
        return self.c.x()

def rec(x):
    print 'reclained',repr(x)

print 'start'
a = A()
b = weakref.ref(a, rec)
print 'post'
print a.fun()
print a.cfun()

print 'del a'
del a

print 'collecting'
n = gc.collect()
print 'garbage:', n
pprint(gc.garbage)

In this example I am using gc debugging (see the gc and weakref PyMOTW articles) to check for memory leaks. I am also using a weakref callback to notify me when its referent dies, which is a neat little debugging feature.

Note: You can also define the special method __del__ on the class to be notified when an instance dies, but beware: Python’s gc is smart enough to automatically recognize and clean up cyclic references (A holds a ref to B, which holds a ref to A) that have no external references. If any object in the cycle defines __del__, however, the cycle is not cleaned up automatically. (see gc docs)

A’s __init__ function creates an instance of C and passes it a local function, which it saves a reference to.

Looking at the output:

start
post
fun
lfun
del a
reclained <weakref at 0xcc2208; dead>
collecting
garbage: 0
[]

From the weakref callback, we see that the object dies as soon as we remove its reference, as expected. The local function lfun has no effect on the lifetime of a. If we modify lfun so that it references self, though:

        def lfun():
            return 'lfun',self

Then the output becomes:

start
post
fun
('lfun', <__main__.A object at 0x26c1890>)
del a
collecting
gc: collectable <A 0x26c1890>
gc: collectable <cell 0x26c6050>
gc: collectable <tuple 0x26c18d0>
gc: collectable <function 0x26c7230>
gc: collectable <C 0x26c1950>
gc: collectable <dict 0x2699af0>
gc: collectable <dict 0x26edd80>
reclained <weakref at 0x26c3208; dead>
garbage: 7
[<__main__.A object at 0x26c1890>,
 <cell at 0x26c6050: A object at 0x26c1890>,
 (<cell at 0x26c6050: A object at 0x26c1890>,),
 <function lfun at 0x26c7230>,
 <__main__.C object at 0x26c1950>,
 {'x': <function lfun at 0x26c7230>},
 {'c': <__main__.C object at 0x26c1950>}]

What this shows is that the object does not actually die when we ‘del a’. Since the function lfun uses self, its closure holds a reference to self, and the function is in turn stored in c. Since the only reference to c is held by a, this effectively creates a cyclic reference between C and A, which is only cleaned up when the gc runs its collection routine. The gc debug output shows that all the inaccessible objects were successfully collected.
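
The captured reference can also be inspected directly. As a small sketch reusing the A and C classes from the example, the function's __closure__ attribute exposes the cell that holds self (the free_vars and captured names are only for illustration):

```python
class C(object):
    def __init__(self, x):
        self.x = x

class A(object):
    def __init__(self):
        def lfun():
            return 'lfun', self

        self.c = C(lfun)

a = A()
func = a.c.x

# Names of the variables the function captured from the enclosing scope.
free_vars = func.__code__.co_freevars
# Each captured variable lives in a cell, and the cell holds a strong reference.
captured = func.__closure__[0].cell_contents

print(free_vars)
print(captured is a)
```

The cell and the one-element closure tuple seen here are the same kinds of objects that appear in the gc debug listing.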

The problem is that, unlike this simple contrived example, callbacks are most often passed to external objects that store and call them. In this case, that external object will hold a reference, through the closure, to self and prevent it from being disposed of. This is similar to the situation described in the previous post, but, as mentioned above, we cannot use weakrefs to the functions themselves as they have no external reference to keep them alive.
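
To make that concrete, here is a rough sketch of the leak. EventSource and Listener are hypothetical stand-ins for a long-lived callback manager and an object that registers with it. They are not taken from Exaile or any real library.

```python
import gc
import weakref

class EventSource(object):
    """Hypothetical long-lived object that stores callbacks strongly."""
    def __init__(self):
        self.handlers = []

    def connect(self, handler):
        self.handlers.append(handler)

class Listener(object):
    def __init__(self, source):
        # The lambda's closure captures self, so the source now keeps self alive.
        source.connect(lambda *args: self)

source = EventSource()
listener = Listener(source)
ref = weakref.ref(listener)

del listener
gc.collect()
still_alive = ref() is not None  # kept alive through source.handlers

# Only dropping the stored callback releases the listener.
del source.handlers[:]
gc.collect()
freed = ref() is None

print('alive after del: %s' % still_alive)
print('freed after disconnect: %s' % freed)
```

Unlike the cyclic case, no amount of garbage collection helps here, because the listener is still reachable from a live object.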

One simple solution that I recently discovered is to pass a strong reference to the function, but use a weakref to self inside the function itself:

class A(object):
    def __init__(self):
        pself = weakref.proxy(self)
        def lfun():
            return 'lfun',pself
            
        self.c = C(lfun) 

The output once again becomes:

start
post
fun
('lfun', <weakproxy at 0x7fb576c882b8 to A at 0x7fb576c86890>)
del a
reclained <weakref at 0x7fb576c88310; dead>
collecting
garbage: 0
[]

And the object dies when we ‘del a’. Now this is a simple example, and any real usage of weakrefs in this way needs to check that the referent is still alive before using it (or handle the ReferenceError that a dead proxy raises).
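
Both kinds of safeguard might look roughly like the following sketch. The checked and guarded functions and the callbacks attribute are illustrative names, and returning None for a dead object is just one possible policy.

```python
import gc
import weakref

class A(object):
    def __init__(self):
        wself = weakref.ref(self)     # must be called to get the object back
        pself = weakref.proxy(self)   # behaves like the object while it is alive

        def checked():
            obj = wself()
            if obj is None:           # referent is already gone
                return None
            return obj.fun()

        def guarded():
            try:
                return pself.fun()
            except ReferenceError:    # raised when a dead proxy is used
                return None

        self.callbacks = (checked, guarded)

    def fun(self):
        return 'fun'

a = A()
checked, guarded = a.callbacks
before = (checked(), guarded())

del a
gc.collect()
after = (checked(), guarded())

print(before)
print(after)
```

With weakref.ref the check is explicit, while with weakref.proxy the call site stays unchanged and the failure surfaces as an exception.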

Extra Bits
As I was researching this topic I came across two neat recipes using weakrefs that I can’t help but share:

weakattr – a weakly-referenced attribute. When the attribute is no longer referenced, it ‘disappears’ from the instance. Great for cyclic references.

weakmethod – unlike the previously defined WeakMethod, this is a decorator which makes it such that any reference to the decorated method automagically contains a weak reference to the instance object.
Think: fun(weakref.proxy(self), *args, **kwargs).
Decorated methods can then be passed to other objects without creating extra references. Neat.
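
To illustrate the idea behind such a decorator, here is my own minimal sketch. It is not the code of the recipe itself: it assumes a simple descriptor that binds the function to weakref.proxy(instance) on every attribute access, and Widget is a made-up class for the demonstration.

```python
import functools
import gc
import weakref

class weakmethod(object):
    """Sketch of a decorator whose bound methods hold only a weak proxy to self."""
    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner=None):
        if instance is None:
            return self.func
        # Roughly: fun(weakref.proxy(self), *args, **kwargs)
        return functools.partial(self.func, weakref.proxy(instance))

class Widget(object):
    def __init__(self, name):
        self.name = name

    @weakmethod
    def describe(self, prefix):
        return prefix + self.name

w = Widget('w')
bound = w.describe          # holds a proxy, not a strong reference to w
result = bound('widget ')

ref = weakref.ref(w)
del w
gc.collect()
dead = ref() is None        # the bound method did not keep w alive

try:
    bound('widget ')
    raised = False
except ReferenceError:
    raised = True

print(result)
print('dead: %s, raised: %s' % (dead, raised))
```

As with the proxy inside the closure, a caller that may outlive the instance has to be prepared for the ReferenceError.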