Sunday, May 22, 2011

Python, imp.load_source() trap

While writing a hook for WebApy, a lightweight RESTful Python web server and a recent project of mine, I ran into a funny issue with loading Python addons (actually it wasn't funny at all, since it was hard to debug). As you may know if you have already looked at the sources, WebApy uses the 'imp' module from Python's standard library to load hook files; "imp.load_source()", to be more precise.

Here is an example of the problem you may run into in Python when loading modules with the load_source() function of the imp module.

Let's look at a simplified example of what I ran into. Suppose our Python application supports external addons (aka plugins/extensions). Each addon must provide a class named "Addon" that implements a "runLogic()" method. Addons have priorities from 1 to N (N > 1); priority "1" is the highest. Suppose some of the addons also implement an internal Helper class for low-level dirty work.

Addon #1 (file: addon-1.addon.py)
class Helper(object):
    """ Internal, addon-specific helper class """
    def __init__(self, msg):
        self.__msg = msg

    def help(self):
        print self.__msg
        
class Addon(object):
    """ Addon entry point class """
    PRIORITY = 1
    
    def runLogic(self):
        addonHelper = Helper("module 1")
        addonHelper.help()

and Addon #2 (file: addon-2.addon.py)
class Helper(object):
    """ Internal, addon-specific helper class """
    def help(self):
        print "module 2"

class Addon(object):
    """ Addon entry point class """
    PRIORITY = 2
    
    def runLogic(self):
        addonHelper = Helper()
        addonHelper.help()

Addons are handled inside the main application by a class named "AddonLoader". "AddonLoader" is initialized with a single argument, the filename pattern of the addon modules; since our modules are called addon-1.addon.py and addon-2.addon.py, the pattern can be '*.addon.py'. "AddonLoader" also provides a single public method, "runMoreImportantAddon()", which executes the addon with the highest priority. Addon loading is implemented with Python's 'imp' module:

import glob  ### To find addons on local filesystem
import imp   ### To load addons

class AddonLoader(object):
    def __init__(self, addonPattern):
        self.__addonPattern = addonPattern

    def runMoreImportantAddon(self):
        """ Runs an addon with the highest priority (determined by PRIORITY property)  """
        moreImportantAddon = self.__getMoreImportantAddonInstance()
        moreImportantAddon.runLogic()

    def __getMoreImportantAddonInstance(self):
        """ Returns an addon instance with the highest (1 -- high, 10 -- low) priority """
        
        availableAddons = self.__enumerateAddonsOnFileSystem()
        sortedAvailableAddons = sorted(availableAddons,
                                       key = lambda addon: addon.PRIORITY)

        return sortedAvailableAddons[0]()

    def __enumerateAddonsOnFileSystem(self):
        """ Loads addons matching the pattern and returns a list of their Addon classes """
        
        addons = list()
        
        for addonPath in glob.glob(self.__addonPattern):
            addon = imp.load_source("addon", addonPath)
            addons.append(addon.Addon)

        return addons

def main():
    addonLoader = AddonLoader("*.addon.py")
    addonLoader.runMoreImportantAddon()

if "__main__" == __name__:
    main()

So can you predict what the output will be when the main app is run? I bet you can't. There will be an exception saying that the Helper class could not be instantiated because the wrong number of arguments was passed. Was it expected?

No, it was not; at least not for me. The problem is hidden in the "name" argument of imp.load_source(). It is the same for every enumerated addon (set explicitly to "addon"), so on each iteration everything that was already loaded under that name is overwritten. On the first iteration (assuming glob returns addon-1.addon.py first) we take and keep a reference to Addon #1 (do not forget that it uses the Helper #1 class). On the second (final) iteration we take a reference to Addon #2 (it depends on the Helper #2 class). Since both addons are loaded under the same name ("addon"), on the final iteration Helper #1 is overwritten with Helper #2 in the namespace of "addon". When runLogic() of Addon #1 is called, it looks up the Helper class, but as you remember it has been overwritten, and the new constructor does not take any parameters. Here we get the exception.
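The overwrite can be reproduced without any files at all. The sketch below is a simulation, not the real imp code: it assumes that loading two sources under one name comes down to executing both of them in the same module namespace. The names SOURCE_1, SOURCE_2 and shared are made up for this illustration.

```python
import types

# Hypothetical stand-ins for the contents of the two addon files
SOURCE_1 = '''
class Helper(object):
    def __init__(self, msg):
        self.msg = msg

class Addon(object):
    def runLogic(self):
        return Helper("module 1").msg
'''

SOURCE_2 = '''
class Helper(object):
    pass

class Addon(object):
    def runLogic(self):
        return "module 2"
'''

# A single module object plays the role of the shared name "addon"
shared = types.ModuleType("addon")

exec(SOURCE_1, shared.__dict__)
addon1 = shared.Addon           # keep a reference, like the loader does
firstHelper = shared.Helper

exec(SOURCE_2, shared.__dict__)  # second "load" into the same namespace

# The name Helper now points to the class from the second source
helperWasReplaced = shared.Helper is not firstHelper

# Addon #1 looks Helper up in its module globals at call time; those
# globals now hold Helper #2, which takes no arguments -> TypeError
try:
    addon1().runLogic()
    error = None
except TypeError as exc:
    error = exc
```

The reference to Addon #1 itself survives, which is why the failure shows up only later, when runLogic() is called.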

The solution is not to pass a constant name to the load_source() function. Just replace "addon" there with a call to a function that returns a unique name for the module being loaded. Names must be unique!
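One possible way to get such names is to derive them from the path of the addon file. This is a minimal sketch, assuming that every addon lives in its own file; the function name uniqueModuleName and the "addon_" prefix are made up for this post and are not part of the imp module.

```python
import hashlib
import os

def uniqueModuleName(addonPath):
    """ Returns a module name derived from the addon's absolute path (hypothetical helper) """
    absolutePath = os.path.abspath(addonPath)
    # Hash the path so that the result contains only characters
    # that are safe in a module name
    digest = hashlib.sha256(absolutePath.encode("utf-8")).hexdigest()
    return "addon_" + digest
```

With such a helper the call inside the loader would become imp.load_source(uniqueModuleName(addonPath), addonPath). Unlike a random name, the name stays the same for the same file, so loading one addon twice still replaces the earlier copy of that addon only.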

Since the official documentation says nothing about the uniqueness of the "name" argument, I wonder why there is no hint about the trap you can fall into when using it.

1 comment:

  1. Thanks for this! I have a similar architecture (different apps configured via conf.py module each) and it took me a while to realize why it leads to this ridiculous bug. Changing the static module name in the load_source() call to str(uuid.uuid4()) helped ;)
