Posts Tagged ‘language python’

Garbage, Weakrefs, and Magic Closures

February 22, 2012

In the previous post, I showed a class that can be used to pass ‘weak references’ to bound methods as callbacks without creating extra refs that prevent garbage collection. For global functions, you can simply use a standard weakref, but very simple callbacks are often written as a lambda or a short function defined in the local scope, like so:

def register_callbacks(self):
    
    def callback(*x):
        ... do stuff ...
    
    some_object.connect('event', callback)
    
    some_object.connect('another-event', lambda *x: ...)
    

Since no external reference to these functions is normally held, a weakref used in their place becomes invalid immediately (or as soon as the enclosing function returns). This is rarely a concern unless the object storing the callbacks is itself using weakrefs (like a WeakValueDictionary) to store them.

Exaile does this, which is why, if you write plugins for Exaile that use the event manager, you will either need to store references to these functions or use global functions or class methods.
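As a minimal sketch of that failure mode (written for Python 3, with a bare WeakValueDictionary standing in for a weakref-based event manager; the names `callbacks` and `register_callbacks` are made up for illustration and are not Exaile's API):

```python
import gc
import weakref

# Hypothetical stand-in for an event manager that keeps only weak references.
callbacks = weakref.WeakValueDictionary()

def register_callbacks():
    def callback(*args):
        return 'called'

    callbacks['event'] = callback
    # The local name `callback` still holds a strong reference at this point.
    return 'event' in callbacks

alive_inside = register_callbacks()
gc.collect()  # not required on CPython; makes the result independent of refcounting
alive_after = 'event' in callbacks

print(alive_inside, alive_after)
```

The entry exists while the enclosing function is running and is gone once it returns, so the callback would never fire.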

The Magic Closure Problem
In most cases, you don’t have to worry about such things and can just pass a strong reference (the actual function) to the callback managers. There is one caveat, however, that I recently ran into that wasn’t very obvious at first: objects (like self) that are referenced inside the lambda or local function are kept alive by reference in the closure attached to it. I’ll explain with a few examples:

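import gc
import weakref
from pprint import pprint

# gc debugging; DEBUG_LEAK is an assumption, the original listing does not show this call
gc.set_debug(gc.DEBUG_LEAK)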

class C(object):
    def __init__(self, x):
        self.x = x

class A(object):
    def __init__(self):
        def lfun():
            return 'lfun'
            
        self.c = C(lfun) 
    
    def fun(self):
        return 'fun'
        
    def cfun(self):
        return self.c.x()

def rec(x):
    print 'reclained',repr(x)

print 'start'
a = A()
b = weakref.ref(a, rec)
print 'post'
print a.fun()
print a.cfun()

print 'del a'
del a

print 'collecting'
n = gc.collect()
print 'garbage:', n
pprint(gc.garbage)

In this example I am using gc debugging (see the gc and weakref PyMOTW articles) to check for memory leaks. I am also using a weakref callback to notify me when its referent dies, which is a neat little debugging feature.

Note: You can also define the special method __del__ on the class to be notified when an instance dies, but beware: Python’s gc is smart enough to automatically recognize and clean up cyclic references (A holds a ref to B which holds a ref to A) that have no external references. If any of the objects in the cycle defines a __del__ method, however, the cycle is not cleaned up automatically and its objects end up in gc.garbage instead. (see gc docs)

A’s __init__ method creates an instance of C and passes it a local function, to which C saves a reference.

Looking at the output:

start
post
fun
lfun
del a
reclained <weakref at 0xcc2208; dead>
collecting
garbage: 0
[]

From the weakref callback, we see that the object dies when we remove its reference, as expected. The local function lfun has no effect on the lifetime of a. If we modified it such that it references self, though:

        def lfun():
            return 'lfun',self

Then the output becomes:

start
post
fun
('lfun', <__main__.A object at 0x26c1890>)
del a
collecting
gc: collectable <A 0x26c1890>
gc: collectable <cell 0x26c6050>
gc: collectable <tuple 0x26c18d0>
gc: collectable <function 0x26c7230>
gc: collectable <C 0x26c1950>
gc: collectable <dict 0x2699af0>
gc: collectable <dict 0x26edd80>
reclained <weakref at 0x26c3208; dead>
garbage: 7
[<__main__.A object at 0x26c1890>,
 <cell at 0x26c6050: A object at 0x26c1890>,
 (<cell at 0x26c6050: A object at 0x26c1890>,),
 <function lfun at 0x26c7230>,
 <__main__.C object at 0x26c1950>,
 {'x': <function lfun at 0x26c7230>},
 {'c': <__main__.C object at 0x26c1950>}]

What this shows is that the object does not actually die when we ‘del a’. Since the function lfun uses self, its closure holds a reference to self, and the function is in turn stored in c. Since the only reference to c is in a, this effectively creates a cyclic reference between C and A, which is only cleaned up when the gc runs its collection routine. The gc debug output shows that all the inaccessible objects were successfully collected.
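The cycle can also be inspected directly through the function's closure cells. The following is a small sketch for Python 3 (the variable names are illustrative only) that checks the cell contents and the lifetime of the instance before and after a collection:

```python
import gc
import weakref

class C(object):
    def __init__(self, x):
        self.x = x

class A(object):
    def __init__(self):
        def lfun():
            return 'lfun', self   # `self` is captured in a closure cell
        self.c = C(lfun)

gc.disable()  # keep the automatic collector out of the way for the steps below
a = A()
# The stored function's only closure cell points back at the instance.
closure_holds_self = a.c.x.__closure__[0].cell_contents is a
r = weakref.ref(a)
del a
alive_after_del = r() is not None       # the cycle keeps the instance alive
gc.collect()
alive_after_collect = r() is not None   # the cycle collector has reclaimed it
gc.enable()

print(closure_holds_self, alive_after_del, alive_after_collect)
```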

The problem is that, unlike this simple contrived example, callbacks are most often passed to external objects that store and call them. In this case, that external object will hold a reference, through the closure, to self and prevent it from being disposed of. This is similar to the situation described in the previous post, but, as mentioned above, we cannot use weakrefs to the functions themselves as they have no external reference to keep them alive.
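To make the external-object case concrete, here is a sketch for Python 3 in which `EventSource` and `Listener` are hypothetical classes invented for the illustration; the source stores its handlers strongly, as most callback managers do:

```python
import gc
import weakref

class EventSource(object):
    """Hypothetical long-lived object that stores callbacks strongly."""
    def __init__(self):
        self.handlers = []

    def connect(self, handler):
        self.handlers.append(handler)

source = EventSource()

class Listener(object):
    def __init__(self):
        # The lambda's closure captures `self`.
        source.connect(lambda *args: self.handle(*args))

    def handle(self, *args):
        return 'handled'

listener = Listener()
r = weakref.ref(listener)
del listener
gc.collect()
still_alive = r() is not None   # kept alive through source -> lambda -> closure

print(still_alive)
```

Unlike the self-contained cycle above, the garbage collector cannot help here, because the listener is still reachable from the live `source` object.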

One simple solution that I recently discovered is to pass a strong reference to the function, but use a weakref to self inside the function itself:

class A(object):
    def __init__(self):
        pself = weakref.proxy(self)
        def lfun():
            return 'lfun',pself
            
        self.c = C(lfun) 

The output once again becomes:

start
post
fun
('lfun', <weakproxy at 0x7fb576c882b8 to A at 0x7fb576c86890>)
del a
reclained <weakref at 0x7fb576c88310; dead>
collecting
garbage: 0
[]

And the object dies when we ‘del a’. Now this is a simple example, and any real usage of weakrefs in this way needs to check that they are valid before using them (or handle their exceptions).
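Both ways of guarding against a dead referent can be sketched as follows (Python 3; the `name` attribute and the two local function names are made up for the example). A weakref.ref has to be called and checked for None, while a weakref.proxy raises ReferenceError when it is used after the referent has died:

```python
import gc
import weakref

class C(object):
    def __init__(self, x):
        self.x = x

class A(object):
    def __init__(self):
        self.name = 'a'
        rself = weakref.ref(self)
        pself = weakref.proxy(self)

        def with_ref():
            obj = rself()          # dereference, then check validity
            if obj is None:
                return None
            return obj.name

        def with_proxy():
            try:
                return pself.name  # raises ReferenceError once the referent is gone
            except ReferenceError:
                return None

        self.c = C((with_ref, with_proxy))

a = A()
with_ref, with_proxy = a.c.x       # keep the callbacks alive past the instance
before = (with_ref(), with_proxy())
del a
gc.collect()
after = (with_ref(), with_proxy())

print(before, after)
```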

Extra Bits
As I was researching this topic I came across two neat recipes using weakrefs that I can’t help but share:

weakattr – a weakly-referenced attribute. When the attribute is no longer referenced, it ‘disappears’ from the instance. Great for cyclic references.

weakmethod – unlike the previously defined WeakMethod, this is a decorator which makes it such that any reference to the decorated method automagically contains a weak reference to the instance object.
Think: fun(weakref.proxy(self), *args, **kwargs).
Decorated methods can then be passed to other objects without creating extra references. Neat.
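The idea behind such a decorator can be sketched with a small descriptor. This is not the recipe itself, only a guess at the general shape for Python 3; the `Player` class and its `on_event` method are hypothetical:

```python
import gc
import weakref

class weakmethod(object):
    """Illustrative decorator: attribute access returns a callable that
    holds only a weak proxy to the instance."""
    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner=None):
        if instance is None:
            return self.func
        func = self.func
        proxy = weakref.proxy(instance)

        def bound(*args, **kwargs):
            # Equivalent to fun(weakref.proxy(self), *args, **kwargs)
            return func(proxy, *args, **kwargs)
        return bound

class Player(object):
    def __init__(self, name):
        self.name = name

    @weakmethod
    def on_event(self, *args):
        return self.name

p = Player('p')
handler = p.on_event          # does not hold a strong reference to p
alive_result = handler()
r = weakref.ref(p)
del p
gc.collect()
instance_collected = r() is None
try:
    handler()
    dead_result = 'no error'
except ReferenceError:
    dead_result = 'ReferenceError'

print(alive_result, instance_collected, dead_result)
```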

Python weakmethods

February 22, 2012

Python is my favorite language. It is so simple to code and readable that doing just about anything is quicker and easier than any other language I’ve used. At the same time it is so advanced that I am always discovering new features/techniques/oddities.

One of the oddities I recently ran into is the inability of Python’s weakref module to create usable weak references to bound methods.

Problem
Any application where you pass methods as callbacks could run into this situation. Two recent occurrences involve GUI programming and network programming (with twisted). In both cases there is an event system with which you register callbacks that are called to notify your class. They usually look something like:

    eo = EventObject()

    class MyClass(object):
        def __init__(self):
            eo.register(self.callback)
           
        def callback(self):
            ...

This works great, and it’s usually what you’ll see in just about every tutorial on pygtk or twisted (using their own api of course). But what if your program is long running and the object that generates your events is expected to outlive your class instance? Your program will ‘leak memory’ because, unless you can explicitly unregister your callback, the event object will hold a reference to your bound method, which keeps your instance alive.

A real-world application where this is a problem is Exaile, which uses a global event manager that lives as long as the program runs.
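The leak is easy to reproduce. In this Python 3 sketch, `EventObject` is a hypothetical minimal event source (not the pygtk, twisted, or Exaile API) that stores its callbacks in a plain list:

```python
import gc
import weakref

class EventObject(object):
    """Hypothetical event source that stores callbacks strongly."""
    def __init__(self):
        self.callbacks = []

    def register(self, callback):
        self.callbacks.append(callback)

eo = EventObject()

class MyClass(object):
    def __init__(self):
        eo.register(self.callback)

    def callback(self):
        return 'called'

obj = MyClass()
r = weakref.ref(obj)
del obj
gc.collect()
leaked = r() is not None
# The stored bound method references the instance through __self__.
held_by_method = eo.callbacks[0].__self__ is r()

print(leaked, held_by_method)
```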

WeakRef
One (elegant) solution is provided by Python’s weakref module.

This module lets you pass weak references to other classes, so when you dispose of your object, it is not kept alive (good examples on that site). The problem with weakref, though, is that it doesn’t act as you might expect on bound methods.

Bound methods are ‘first class objects’, and a new one is created every time you look up a.fun on an instance. Unless you store a separate reference to the method (which requires extra bookkeeping), that temporary object is discarded right away, so the weakref created from a bound method is dead on arrival. The following example illustrates this:

    import weakref
    class A(object):
        def fun(self):
            return 'fun!'
            
    def notify_dead(ref):
        '''called by weakref when the referent dies'''
        print '{0} now dead'.format(repr(ref))

    print 'start'
    a = A()
    r = weakref.ref(a.fun, notify_dead) # dead on arrival
    print a.fun()   # fun!
    print r()       # None
    print r()()     # Exception
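The same behaviour can be checked in Python 3 with a few lines. This sketch also shows the bookkeeping workaround of holding the bound method in a variable (`m` is just an illustrative name):

```python
import weakref

class A(object):
    def fun(self):
        return 'fun!'

a = A()
# Each attribute lookup builds a new bound method object.
same_object = a.fun is a.fun

dead = weakref.ref(a.fun)   # the temporary bound method is discarded immediately
m = a.fun                   # extra bookkeeping: keep the bound method alive
live = weakref.ref(m)

print(same_object, dead(), live()())
```

Note that holding `m` also holds the instance, which is exactly the strong reference the weakref was meant to avoid.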

WeakMethod
I’ve seen a few ways to work around this, but the cleanest I’ve seen is what’s done in Exaile. The general idea is to create a class that holds a weakref to the instance object and generates the bound method when called if the referent is still alive. The following is a simplified version (no exception handling):

import types
import weakref

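class WeakMethod(object):
    def __init__(self, meth, notify=None):
        # im_self is None for unbound methods, which cannot be wrapped
        if meth.im_self is None:
            raise ValueError('unbound method')
        # keep only a weak reference to the instance
        self.obj = weakref.ref(meth.im_self, notify)
        self.func = meth.im_func
        self.cls = meth.im_class
        
    def __call__(self):
        obj = self.obj()
        if obj is None:
            # the instance has been collected
            return None
        else:
            # rebuild the bound method from the live instance
            return types.MethodType(self.func, obj, self.cls)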

The previous example then becomes:

    class A(object):
        def fun(self):
            return 'fun!'
            
    def notify_dead(ref):
        '''called by weakref when the referent dies'''
        print '{0} now dead'.format(repr(ref))

    print 'start'
    a = A()
    r = WeakMethod(a.fun, notify_dead)
    print a.fun()   # fun!
    print r()       # a bound method
    print r()()     # fun!
    del a           # dies
    print r()       # None
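For readers on Python 3.4 or later, the standard library ships a class with the same calling convention, weakref.WeakMethod. A short sketch of the example above using it:

```python
import gc
import weakref

class A(object):
    def fun(self):
        return 'fun!'

a = A()
r = weakref.WeakMethod(a.fun)   # holds weak references to the instance and function
before = r()()                  # r() rebuilds the bound method while `a` is alive
del a
gc.collect()
after = r()                     # None once the instance is gone

print(before, after)
```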

A similar WeakMethodProxy class can be made to behave like a proxy object, if you need to pass something that acts like a method.
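One possible shape for such a proxy, sketched for Python 3 on top of weakref.WeakMethod. The class body is an assumption about how it could look, not Exaile's implementation; raising ReferenceError for a dead referent mirrors what weakref.proxy does:

```python
import gc
import weakref

class WeakMethodProxy(object):
    """Illustrative callable that forwards to a method while its instance lives."""
    def __init__(self, meth):
        self._ref = weakref.WeakMethod(meth)

    def __call__(self, *args, **kwargs):
        meth = self._ref()
        if meth is None:
            raise ReferenceError('weakly-referenced method no longer exists')
        return meth(*args, **kwargs)

class A(object):
    def fun(self, x):
        return 'fun! %s' % x

a = A()
p = WeakMethodProxy(a.fun)
live_result = p(1)      # called like the method itself
del a
gc.collect()
try:
    p(2)
    dead_result = 'no error'
except ReferenceError:
    dead_result = 'ReferenceError'

print(live_result, dead_result)
```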

Note:
These classes are only valid for bound methods. Unbound methods, functions, lambdas, and closures/nested functions will not work. You can create weakrefs to these objects as you would any object with weakref.ref, but since these objects are usually created and used in-place (not stored), the weakrefs will be invalid beyond the scope that they are defined in.
Because of this, it is not very useful to create weakrefs of lambdas and closures. More on this in the next post.

One final note:
atexit
This is a convenient function for making sure resources are cleaned up on program termination. The problem is that, in Python 2, there is no way to ‘unregister’ a function, and if you pass an instance method, the instance is kept alive for the length of the program. You can pass a weakref proxy, but this will cause an exception when atexit tries to execute an invalid weakref proxy. Most solutions I’ve seen create a global ‘cleanup’ function or class that they then wrap their cleanup functions in, supplementing atexit with their own register/unregister or exception handling.
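A rough sketch of such a wrapper for Python 3 follows. `CleanupRegistry` and `Resource` are hypothetical names; the registry is registered with atexit once and keeps only weak references to the methods handed to it, so dead instances are simply skipped:

```python
import atexit
import gc
import weakref

class CleanupRegistry(object):
    """Illustrative cleanup manager with its own register/unregister."""
    def __init__(self):
        self._refs = []

    def register(self, method):
        self._refs.append(weakref.WeakMethod(method))

    def unregister(self, method):
        kept = []
        for ref in self._refs:
            live = ref()
            # Drop dead references and the method being unregistered.
            if live is not None and live != method:
                kept.append(ref)
        self._refs = kept

    def run(self):
        results = []
        for ref in self._refs:
            live = ref()
            if live is not None:   # skip instances that no longer exist
                results.append(live())
        return results

cleanup = CleanupRegistry()
atexit.register(cleanup.run)       # the only function atexit ever sees

class Resource(object):
    def __init__(self, name):
        self.name = name

    def close(self):
        return self.name

r1 = Resource('r1')
r2 = Resource('r2')
cleanup.register(r1.close)
cleanup.register(r2.close)
del r1
gc.collect()
after_del = cleanup.run()          # only r2 is still alive
cleanup.unregister(r2.close)
after_unregister = cleanup.run()

print(after_del, after_unregister)
```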