Showing posts with label code snippet. Show all posts

Tuesday, November 5, 2013

Pattern of Error Reporting in Python

Instead of Intro


It is not uncommon for an application that was supposed to be a prototype to suddenly become a tool for everyday use. The downside is that, in the rush to make the program more or less user-friendly, the developer has to hide the internals from the user while still preserving error messages with internal details for debugging. Here I am going to share the pattern I mostly use in my "so-called-prototype" Python scripts, especially when they are converted to binaries using py2exe.

The pattern allows you to introduce extended error reporting in Python scripts at no extra cost.

Own Errors


The only rule I follow when there is a need to raise an error is to forget about the standard ready-to-use Python exceptions, for the following reasons:

  • Need to distinguish where an error came from: from the "Batteries" or from the application's logic;
  • Need to express an application's domain in the code.
The rule holds even for spaghetti-style code which is supposed to be thrown away tomorrow or even today; it costs nothing but might help with debugging.

So introduce your own exception class:

class Error(Exception):
    def __init__(self, message, innerError = None):
        msg = message
        if innerError:
            msg += " *- {0}".format(str(innerError))

        Exception.__init__(self, msg)

The exception class here is deliberately straightforward for the sake of simplicity. In serious applications it is much better to introduce a field for the inner error, the environment etc.
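A minimal sketch of that idea, assuming Python 3 (the snippets in this post are Python 2). The inner attribute and the chain() helper are illustrative names, not part of the original class:

```python
class Error(Exception):
    """Application-level error that remembers the lower-layer error."""

    def __init__(self, message, inner=None):
        super().__init__(message)
        self.message = message
        self.inner = inner  # hypothetical field holding the wrapped error

    def chain(self):
        """Return the messages of this error and all inner ones, outermost first."""
        messages = []
        error = self
        while error is not None:
            if isinstance(error, Error):
                messages.append(error.message)
            else:
                # Foreign exceptions (e.g. from the standard library) end the chain
                messages.append(str(error))
            error = getattr(error, "inner", None)
        return messages

    def __str__(self):
        # Same " *- " separator as in the simple version above
        return " *- ".join(self.chain())
```

Keeping the inner error as a field lets the reporting code decide how much of the chain to show, instead of baking everything into one string at construction time.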

Respect Each Error


When one function of a script passes flow control to another, it is a sign that the flow is crossing into another virtual layer. Each layer deserves its own reported error if anything goes wrong there.

Suppose below that foo() is the first layer and bar() is the second one. The application might then look like:

import sys

def tryRussianRoulette():
    ### NOTE: it is very bad practice to put import statements in the middle of the logic
    import random
    
    isFired = (0 == random.randint(0, 5))
    if isFired:
        raise Error("bang!")
        
def bar():
    try:
        tryRussianRoulette()
    except Error as e:
        raise Error("Russian Roulette has fired", e)

def foo():
    try:
        bar()
    except Error as e:
        raise Error("Failed to bar-bar", e)

def main():
    foo()

def propagateToUser(error):
    # Report the error passed in as the argument, not a global name
    if isinstance(error, Error):
        print("[Error] {0}".format(error))
    else:
        print("[Unknown error] {0}".format(error))

if "__main__" == __name__:   
    try:
        main()
    except Exception as e:
        propagateToUser(e)
        sys.exit(1)

    sys.exit(0)


tryRussianRoulette() is the function which may cause an error.
When the application accidentally ends up in production, an error shown to the end user (!) would look like:

[Error] Failed to bar-bar *- Russian Roulette has fired *- bang!

In most cases (if you have not skipped error handling on each layer), the error is descriptive enough to understand the problem.

Pattern in Action


How can extended error tracing be introduced without writing tons of extra code? The answer is to vary the behavior of the try/except statement depending on an environment variable bound to the application.

The application's code above is just extended with the function:

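import os
import sys

def isDebug():
    # Keep letters only: "bing-bang.py" becomes "bingbangpy"
    withLettersOnly = lambda string: "".join(ch for ch in string if ch.isalpha())
    appName = os.path.basename(sys.argv[0])
    debugKey = "{appName}_DEBUG".format(appName = withLettersOnly(appName))

    # Any non-empty value of the variable turns the debug mode on
    return bool(os.environ.get(debugKey))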

and the try/except statement is replaced with the following code:

    try:
        main()
    except Exception as e:
        if isDebug():
            # Re-raise as is to get the raw stack trace
            raise
        else:
            propagateToUser(e)
            sys.exit(1)

If the newly developed application is run from a file named bing-bang.py, setting the environment variable bingbangpy_DEBUG to a non-empty value produces a raw Python stack trace instead of the user-friendly error. The same is true for a Python script compiled using py2exe; the bound variable name is derived from the executable's file name in the same way.
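To make the naming rule easy to check, here is a small Python 3 sketch that separates the key derivation from the environment lookup. The names debug_key and is_debug and the injectable environ and app_path parameters are assumptions made for the example, not part of the original script:

```python
import os
import sys


def debug_key(app_path):
    """Derive the variable name from the script name, keeping letters only."""
    app_name = os.path.basename(app_path)
    letters = "".join(ch for ch in app_name if ch.isalpha())
    return "{0}_DEBUG".format(letters)


def is_debug(environ=None, app_path=None):
    """Report whether the debug variable is set to a non-empty value."""
    environ = os.environ if environ is None else environ
    app_path = sys.argv[0] if app_path is None else app_path
    return bool(environ.get(debug_key(app_path)))
```

With this split, debug_key("bing-bang.py") gives "bingbangpy_DEBUG", and the lookup can be exercised with a plain dictionary instead of the real environment.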

Instead of Summary


  1. The pattern code has been intentionally left primitive for one reason: to allow you to play around and find a suitable implementation;
  2. The custom exception class could also carry locals() and globals() of the corresponding layer, or system details. That depends entirely on your imagination;
  3. The pattern works well for small scripts and for "proof-of-concept" applications which might be used in real life until RTM. Avoid the approach in more or less serious applications.
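As an illustration of the second point, a hedged Python 3 sketch of an exception that carries a snapshot of the local variables of the layer where it was raised. ContextError, its context field and the divide() function are hypothetical names invented for this example:

```python
class ContextError(Exception):
    """Error that carries a snapshot of the raising layer's variables."""

    def __init__(self, message, context=None):
        super().__init__(message)
        # Copy the mapping so later changes in the frame do not affect it
        self.context = dict(context or {})


def divide(numerator, denominator):
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # locals() here holds the arguments of this layer
        raise ContextError("Failed to divide", context=locals())
```

Be careful with such snapshots in real tools: local variables may hold passwords or other data that should not end up in a user-visible report.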


Tuesday, October 15, 2013

Python, argparse and Environment Variables

argparse is probably one of the most frequently used Python libraries. It covers all standard cases out of the box. When a case goes beyond the box, a developer is encouraged to extend the library through the API provided for that purpose.

What to do when an argument is marked as required and its value does not change for quite a long time? Make the argument's value persistent; store it somewhere. Otherwise your application has every chance of being recognized as unfriendly by an end user. You may want to store the value of the argument in a configuration file. But what should the file format be? How should the file be organized? Where should it be located? There are more questions than answers.

Why not pass the arguments to argparse's parser through the system environment? This approach is widely recognized and makes your application easily scriptable.

Optional Arguments


When an argument is marked as optional, the value from the system environment can be supplied as the default one:

parser.add_argument("-c", "--crt", type = str, default = os.environ.get("X509_CRT"), required = False, help = "Path to X509 Certificate")

Or, in a slightly more complicated way, when the value is not allowed to be empty:

parser.add_argument("-c", "--crt", type = str, default = os.environ.get("X509_CRT") or "~/.work/vpn.crt", required = False, help = "Path to X509 Certificate")
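For completeness, a self-contained Python 3 sketch of the same idea. The build_parser() function and its environ parameter are assumptions added so that the lookup can be tried with a plain dictionary; the argument itself follows the line above:

```python
import argparse
import os


def build_parser(environ=None):
    """Build a parser whose --crt default comes from X509_CRT, if set."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "-c", "--crt",
        # An empty or missing variable falls back to the hard-coded path
        default=environ.get("X509_CRT") or "~/.work/vpn.crt",
        help="Path to X509 Certificate")
    return parser
```

Calling build_parser({"X509_CRT": "/etc/env.crt"}).parse_args([]) yields crt equal to "/etc/env.crt", while an explicit -c on the command line still overrides it.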

Required Arguments


By design, a default value makes no sense for a mandatory argument in argparse: the parser reports an error whenever the argument is missing, so the default is never used.
There is an interface in argparse called Action which is associated with the argument being processed. The interface is intended to customize how an argument is processed and stored. Providing your own implementation of the interface allows you to look up the desired value of the argument in the system environment:

class FindValueInEnvironmentAction(argparse.Action):
    def __init__(self, varName, **kwargs):
        # The action is meant for mandatory arguments only
        assert kwargs.get("required")
         
        valueFromEnv = os.environ.get(varName)
        
        if valueFromEnv:
            # Pretend the argument is optional with a predefined default value
            kwargs["required"] = False
            kwargs["default"] = valueFromEnv
            
        argparse.Action.__init__(self, **kwargs)

    def __call__(self, parser, namespace, values, option_string = None):
        # A value passed through the command line replaces the default
        setattr(namespace, self.dest, values)
...
parser.add_argument("-c", "--crt", type = str, action = FindValueInEnvironmentAction, varName = "X509_CRT", required = True, help = "Path to X509 Certificate")

When the argument's value is found in the system environment, the built-in options in **kwargs are patched:
  • the required attribute is reset to False;
  • the default value is set to the one read from the environment.
These steps make the parser treat the argument as an optional one with a predefined default value. A value passed through the command line takes priority over the value set through the "X509_CRT" environment variable.

I would also inject into our implementation the dictionary to look the value up in; this would allow us to cover the class with unit tests. And if you are a user of Python 3, feel free to try an alternative way.
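A possible shape of that injection, sketched for Python 3. The environ keyword and the build_parser() helper are assumptions introduced for the example; the rest mirrors the action above:

```python
import argparse
import os


class FindValueInEnvironmentAction(argparse.Action):
    def __init__(self, var_name, environ=None, **kwargs):
        # Fall back to the real environment when no mapping is injected
        environ = os.environ if environ is None else environ
        value_from_env = environ.get(var_name)
        if value_from_env:
            # Treat the argument as optional with a predefined default
            kwargs["required"] = False
            kwargs["default"] = value_from_env
        super().__init__(**kwargs)

    def __call__(self, parser, namespace, values, option_string=None):
        setattr(namespace, self.dest, values)


def build_parser(environ):
    """Hypothetical helper: a parser reading X509_CRT from `environ`."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-c", "--crt", action=FindValueInEnvironmentAction,
                        var_name="X509_CRT", environ=environ, required=True,
                        help="Path to X509 Certificate")
    return parser
```

A unit test can now pass {"X509_CRT": "/etc/env.crt"} or an empty dictionary and never touch os.environ; with the empty dictionary and no -c option the parser exits with the usual "required argument" error.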

Wednesday, May 29, 2013

Pass arguments to BaseHTTPRequestHandler

Each time I deal with Python's built-in web server (BaseHTTPServer) I feel pain. The pain is caused by a strange architectural decision to pass a class, not an instance, as the request handler. At first glance it does not matter what kind of entity is passed, since you can extend the default implementation with your own logic. It does not matter until the moment when you want an externally configurable handler and/or have to inject a bunch of settings. Currently there is no easy way to do that.

There is no easy way since the developer who created this design most likely had fallen in love with the Template Method pattern or was affected by some forbidden stuff :). Let's take a brief look at how the current implementation of Python's BaseHTTPRequestHandler works. Then let's try to answer the question: how to pass arguments to BaseHTTPRequestHandler?

When a request comes to the server, the server creates an instance of the BaseHTTPRequestHandler class. The newly created instance is initialized with the received request in raw format (say, as plain, not yet parsed text; by the time it reaches our handler, it is already split into headers, body etc.). BaseHTTPRequestHandler's constructor dispatches an inner method (call it process_request()) responsible for the initial request handling, e.g. determining the kind of the request (GET/POST/etc). After the request is recognized, the corresponding method do_[GET/POST/DELETE/HEAD/PUT]() is called on self.

How BaseHTTPServer interacts with the handler
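The dispatching described above can be mimicked with a toy class. This is only an illustrative Python 3 sketch, not the standard library's code: ToyHandler, HelloHandler and the string-based request and response are made up for the example.

```python
class ToyHandler:
    """Toy model of a handler whose constructor does all the work."""

    def __init__(self, raw_request):
        self.raw_request = raw_request
        # The whole request is processed during construction
        self.response = self.handle()

    def handle(self):
        # The first word of the request line is the HTTP method
        command = self.raw_request.split(" ", 1)[0]
        method = getattr(self, "do_" + command, None)
        if method is None:
            return "501 Unsupported method"
        return method()


class HelloHandler(ToyHandler):
    def do_GET(self):
        return "200 hello"
```

Here HelloHandler("GET / HTTP/1.1").response is "200 hello": by the time the constructor returns, the request has already been served, which is why anything the handler needs must be available before the base constructor runs.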


Seems very straightforward. But this approach at least breaks the rule that one function should do one thing only. A constructor is responsible for object construction, not for serving business logic.

Let's look at the following code. The handler is supposed to output the current date and time in some format. With the current implementation the task could be implemented as:

import BaseHTTPServer

def tellTheDate():
  import time
  return time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime())

class RequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):
  def do_GET(self):
    self.__process()

  def do_POST(self):
    self.__process()

  def __process(self):
    self.__setupLayout()
    self.wfile.write(tellTheDate())

  def __setupLayout(self):
    self.send_response(200)
    self.send_header("Content-Type", "text/plain")
    self.end_headers()

def main():
  host = ("localhost", 8080)
  server = BaseHTTPServer.HTTPServer(host, RequestHandler)
  server.handle_request()

The problem with the code above is that it depends on the global scope. We need to avoid such a dependency by injecting the required logic into the handler. This is achieved by extending the handler's class.

class RequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):
  def do_GET(self):
    self.__process()

  def do_POST(self):
    self.__process()

  def __process(self):
    self.__setupLayout()
    self.logic()

  def __setupLayout(self):
    self.send_response(200)
    self.send_header("Content-Type", "text/plain")
    self.end_headers()

def tellTheDate(handler):
  import time
  currentDate = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime())
  handler.wfile.write(currentDate)

def main():
  host = ("localhost", 8080)
  handlerCls = RequestHandler

  handlerCls.logic = tellTheDate

  server = BaseHTTPServer.HTTPServer(host, handlerCls)
  server.handle_request()

After this modification the logic can be easily swapped and can pretend to be the handler's built-in method. But now it starts to break the encapsulation of the extended BaseHTTPRequestHandler: the outer function has to know inner details of the class, e.g. how the response is sent to the client. Also, the implementation of the handler is scattered across multiple locations: the function with the logic and the handler itself.

A cleaner option is to pass the logic through the handler's constructor and to give the server a factory function instead of the class itself:

class RequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):
  def __init__(self, tellTheDate, *args):
    self.tellTheDate = tellTheDate
    BaseHTTPServer.BaseHTTPRequestHandler.__init__(self, *args)
  
  def do_GET(self):
    self.__process()

  def do_POST(self):
    self.__process()

  def __process(self):
    self.__setupLayout()
    self.tellTheDate(self.wfile)

  def __setupLayout(self):
    self.send_response(200)
    self.send_header("Content-Type", "text/plain")
    self.end_headers()

def tellTheDate(output):
  import time
  currentDate = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime())
  output.write(currentDate)

def handleRequestsUsing(tellTheDateLogic):
  return lambda *args: RequestHandler(tellTheDateLogic, *args)

def main():
  host = ("localhost", 8080)

  handler = handleRequestsUsing(tellTheDate)
  server = BaseHTTPServer.HTTPServer(host, handler)
  server.handle_request()

The code above shows that postponed (aka lazy) initialization, combined with Python's ability to capture the context in which a function runs (a closure), makes the desired possible.
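For readers on Python 3, where the module is called http.server, the same trick might look like the sketch below. It swaps the lambda for functools.partial, which is a deliberate substitution rather than the approach shown above; serve_once() and the snake_case names are assumptions made for the example.

```python
import functools
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


class RequestHandler(BaseHTTPRequestHandler):
    def __init__(self, tell_the_date, *args, **kwargs):
        # Must be stored first: the base constructor already serves the request
        self.tell_the_date = tell_the_date
        super().__init__(*args, **kwargs)

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.tell_the_date(self.wfile)


def tell_the_date(output):
    current = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime())
    # wfile is a binary stream in Python 3
    output.write(current.encode("ascii"))


def handle_requests_using(logic):
    # The server calls this object exactly as it would call the class
    return functools.partial(RequestHandler, logic)


def serve_once(host="localhost", port=8080):
    """Serve a single request and shut the server down."""
    server = HTTPServer((host, port), handle_requests_using(tell_the_date))
    try:
        server.handle_request()
    finally:
        server.server_close()
```

The server only needs something callable with (request, client_address, server), so a partial object is accepted in place of the handler class.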

Friday, November 11, 2011

Python, make ConfigParser aware of spaces

There is a wonderful Python module called ConfigParser which makes it easy to process .ini-style configuration files. I prefer to use it everywhere rather than spend time implementing my own solution. Recently a bug was reported: values with leading and trailing spaces are read incorrectly, the spaces are lost. This might be important when an application is sensitive to such values, e.g. passwords.

It turned out that the current ConfigParser implementation cannot be tuned not to strip values while reading a configuration. A corresponding issue with an attached patch was also found. Unfortunately, the patch has not been applied to publicly available Python builds yet. And it is definitely not convenient to patch Python everywhere your application runs.

The solution is to preserve leading and trailing spaces by wrapping the value in quotes. Here is a helper code snippet to solve this issue:

import ConfigParser

class SpaceAwareConfigParser(ConfigParser.ConfigParser):
    def __init__(self, **args):
        KEEP_SPACES_KEYWORD = "keep_spaces"
        
        # pop() with a default does not fail when the keyword is omitted
        self.__keep_spaces = args.pop(KEEP_SPACES_KEYWORD, True)

        ConfigParser.ConfigParser.__init__(self, **args)

    def get(self, section, option):
        value = ConfigParser.ConfigParser.get(self, section, option)
        if self.__keep_spaces:
            value = self._unwrap_quotes(value)

        return value

    def set(self, section, option, value):
        if self.__keep_spaces:
            value = self._wrap_to_quotes(value)

        ConfigParser.ConfigParser.set(self, section, option, value)        

    @staticmethod
    def _unwrap_quotes(src):
        QUOTE_SYMBOLS = ('"', "'")
        for quote in QUOTE_SYMBOLS:
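            # Remove exactly one pair of quotes; strip() would remove every
            # leading and trailing quote character, not just the wrapping pair
            if len(src) >= 2 and src.startswith(quote) and src.endswith(quote):
                return src[1:-1]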

        return src

    @staticmethod
    def _wrap_to_quotes(src):
        # Quote the value when it has leading or trailing whitespace
        if src and (src[0].isspace() or src[-1].isspace()):
            return '"%s"' % src

        return src

The overridden get() method removes the quotes, if any. So it is transparent for ConfigParser's client whether the option has a value with quoted spaces or not; both double and single quotes are supported. set() does the reverse.

So it is enough to replace the instantiation of ConfigParser in your Python code with SpaceAwareConfigParser and it should work as expected.
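The quoting trick can be tried in isolation. Below is a sketch for Python 3, where the module is named configparser; wrap_in_quotes(), unwrap_quotes() and round_trip() are hypothetical helper names, and the [auth] section with its password option is made up for the demonstration:

```python
import configparser
import io


def wrap_in_quotes(value):
    """Quote a value that has leading or trailing whitespace."""
    if value and (value[0].isspace() or value[-1].isspace()):
        return '"{0}"'.format(value)
    return value


def unwrap_quotes(value):
    """Remove exactly one pair of matching quotes, if present."""
    for quote in ('"', "'"):
        if len(value) >= 2 and value[0] == quote and value[-1] == quote:
            return value[1:-1]
    return value


def round_trip(value):
    """Write the value to an in-memory .ini file and read it back."""
    writer = configparser.ConfigParser(interpolation=None)
    writer["auth"] = {"password": wrap_in_quotes(value)}
    buffer = io.StringIO()
    writer.write(buffer)

    reader = configparser.ConfigParser(interpolation=None)
    reader.read_string(buffer.getvalue())
    return unwrap_quotes(reader["auth"]["password"])
```

Interpolation is switched off here only so that values containing "%" do not interfere with the demonstration.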

Wednesday, April 28, 2010

'cp' command with a wget-like progress bar

I cannot help re-posting this: a trick to make the 'cp' command show a wget-like progress bar.

#!/bin/sh
cp_p()
{
   strace -q -ewrite cp -- "${1}" "${2}" 2>&1 \
      | awk '{
        count += $NF
            if (count % 10 == 0) {
               percent = count / total_size * 100
               printf "%3d%% [", percent
               for (i=0;i<=percent;i++)
                  printf "="
               printf ">"
               for (i=percent;i<100;i++)
                  printf " "
               printf "]\r"
            }
         }
         END { print "" }' total_size=$(stat -c '%s' "${1}") count=0
}

% cp_p /mnt/raid/pub/iso/debian/debian-2.2r4potato-i386-netinst.iso /dev/null
76% [===========================================>                    ]

Source: http://chris-lamb.co.uk/2008/01/24/can-you-get-cp-to-give-a-progress-bar-like-wget/

I think that's an amazing idea!

Thursday, March 11, 2010

Code snippet: QLabel to show remote pixmap by URL

Perhaps one of the most frequently used widgets in Qt is QLabel. It is mainly used to display plain text as well as rich text (which contains HTML markup). QLabel is also able to show graphics when a QPixmap instance is passed to it. But what if there is a need to display (using QLabel) a pixmap stored on a remote server? There is no such "out of the box" functionality in QLabel. But it can be easily done. Here I want to show you how.

We just derive a class from QLabel and add only one new public method, setRemotePixmap(const QString&), to let the label know that we want to display remote graphics. Then the label composes an HTTP GET request using the QNetworkAccessManager class to retrieve the file. When the image is retrieved, it is shown using setPixmap(const QPixmap&).

Let's see the source code.

Friday, February 26, 2010

QCryptographicHash: code snippet

  When developers of Qt applications need to obtain the MD5 hash of a string or an array of bytes, the first thing most of them do is reach for already implemented 3rd-party libraries like libxcrypt, OpenSSL and others. The talented ones try to get the output of the 'md5sum' command.
  But the 'true' way is to use built-in tools: not everyone knows that Qt already has the QCryptographicHash class which can help us. Let's see how:


QString password("our super secret password");
QByteArray hashed_password_ba(
     QCryptographicHash::hash(password.toAscii(),
                              QCryptographicHash::Md5));
/* 'md5_password' below will contain the hex-encoded
*  MD5 hash of our super secret password
*/
QString md5_password(hashed_password_ba.toHex().constData());

Very easy, isn't it? Apart from MD5, the class can also generate MD4 and SHA-1 hashes, which may be very helpful in some special cases.