Wednesday, May 29, 2013

Pass arguments to BaseHTTPRequestHandler

Every time I deal with Python's built-in web server (BaseHTTPServer), I feel pain. The pain comes from a strange architectural decision: the server takes a request handler class, not an instance. At first glance it does not matter which of the two you pass, since you can extend the default implementation with your own logic. It stops being irrelevant the moment you want an externally configurable handler or need to inject a bunch of settings. Currently there is no easy way to do that.

There is no easy way because whoever created this design was most likely in love with the Template Method pattern, or was under the influence of some forbidden stuff :). Let's take a brief look at how Python's current BaseHTTPRequestHandler implementation works, and then try to answer the question: how do you pass arguments to BaseHTTPRequestHandler?

When a request arrives, the server creates an instance of the BaseHTTPRequestHandler class. The new instance is initialized with the received request in raw form (plain, not yet parsed text; by the time it reaches our handler code it is already split into the request line, headers, etc.). The constructor of BaseHTTPRequestHandler then invokes the internal request handling itself (call it process_request(); in the actual source the constructor calls setup(), handle() and finish()), which, among other things, determines the kind of request (GET, POST, etc.). Once the request is recognized, the corresponding do_GET(), do_POST(), do_DELETE(), do_HEAD() or do_PUT() method is called on self.

How BaseHTTPServer interacts with the handler


This seems very straightforward. But the approach at least breaks the rule that a function should do only one thing: a constructor is responsible for constructing the object, not for running business logic.
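The constructor-driven flow described above can be observed without starting a server. The following is a minimal sketch for Python 3, where BaseHTTPServer was renamed to http.server. FakeSocket and TracingHandler are hypothetical names invented for this illustration; FakeSocket only imitates the two socket methods the handler needs here (makefile() and sendall()), assuming a reasonably recent Python 3.

```python
import io
from http.server import BaseHTTPRequestHandler


class FakeSocket:
    """In-memory stand-in for a client socket (hypothetical helper)."""

    def __init__(self, raw_request):
        self._raw_request = raw_request
        self.sent = b""

    def makefile(self, mode, bufsize=-1):
        # The handler reads the raw request through this file object.
        return io.BytesIO(self._raw_request)

    def sendall(self, data):
        # The handler's wfile pushes the response bytes here.
        self.sent += bytes(data)


class TracingHandler(BaseHTTPRequestHandler):
    events = []  # shared log of what happened, in order

    def __init__(self, *args):
        self.events.append("constructor starts")
        super().__init__(*args)  # the request is handled inside this call
        self.events.append("constructor returns")

    def do_GET(self):
        self.events.append("do_GET runs")
        self.send_response(200)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep stderr quiet for the demonstration


# Same positional arguments the server passes: request, client address, server.
sock = FakeSocket(b"GET / HTTP/1.0\r\n\r\n")
TracingHandler(sock, ("127.0.0.1", 0), None)
print(TracingHandler.events)
```

The recorded events show "do_GET runs" between "constructor starts" and "constructor returns": by the time the object exists, the request has already been served. This is why any extra state has to be set on the instance before the base constructor is called.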

Consider the following code. The handler is supposed to output the current date and time in some format. With the current implementation, the task could be implemented as:

import time
import BaseHTTPServer  # Python 2 module; renamed http.server in Python 3

def tellTheDate():
  return time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime())

class RequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):
  def do_GET(self):
    self.__process()

  def do_POST(self):
    self.__process()

  def __process(self):
    self.__setupLayout()
    # tellTheDate is looked up in the global scope
    self.wfile.write(tellTheDate())

  def __setupLayout(self):
    self.send_response(200)
    self.send_header("Content-Type", "text/plain")
    self.end_headers()

def main():
  host = ("localhost", 8080)
  server = BaseHTTPServer.HTTPServer(host, RequestHandler)
  server.handle_request()  # serve a single request, then return

The problem with the code above is that it depends on the global scope. We need to avoid this dependency by injecting the required logic into the handler. One way to achieve this is to extend the handler class with an extra attribute:

import time
import BaseHTTPServer  # Python 2 module; renamed http.server in Python 3

class RequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):
  def do_GET(self):
    self.__process()

  def do_POST(self):
    self.__process()

  def __process(self):
    self.__setupLayout()
    # logic is attached to the class in main(), so it is called as a method
    self.logic()

  def __setupLayout(self):
    self.send_response(200)
    self.send_header("Content-Type", "text/plain")
    self.end_headers()

def tellTheDate(handler):
  currentDate = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime())
  handler.wfile.write(currentDate)

def main():
  host = ("localhost", 8080)
  handlerCls = RequestHandler

  # inject the logic as a class attribute; the handler arrives as "self"
  handlerCls.logic = tellTheDate

  server = BaseHTTPServer.HTTPServer(host, handlerCls)
  server.handle_request()  # serve a single request, then return

After this modification the logic can easily be swapped and behaves like a built-in method of the handler. But it now breaks the encapsulation of the extended BaseHTTPRequestHandler: the outer function has to know internal details of the class, e.g. how the response is sent to the client. The handler's implementation is also scattered across several places: the function with the logic and the handler itself. Instead, the logic can be passed to the handler's constructor, with the server given a factory function in place of the class:

import time
import BaseHTTPServer  # Python 2 module; renamed http.server in Python 3

class RequestHandler(BaseHTTPServer.BaseHTTPRequestHandler):
  def __init__(self, tellTheDate, *args):
    # store the logic first: the base constructor already serves the request
    self.tellTheDate = tellTheDate
    BaseHTTPServer.BaseHTTPRequestHandler.__init__(self, *args)

  def do_GET(self):
    self.__process()

  def do_POST(self):
    self.__process()

  def __process(self):
    self.__setupLayout()
    self.tellTheDate(self.wfile)

  def __setupLayout(self):
    self.send_response(200)
    self.send_header("Content-Type", "text/plain")
    self.end_headers()

def tellTheDate(output):
  currentDate = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime())
  output.write(currentDate)

def handleRequestsUsing(tellTheDateLogic):
  # the server calls this lambda with (request, client_address, server)
  return lambda *args: RequestHandler(tellTheDateLogic, *args)

def main():
  host = ("localhost", 8080)

  handler = handleRequestsUsing(tellTheDate)
  server = BaseHTTPServer.HTTPServer(host, handler)
  server.handle_request()  # serve a single request, then return

The code above shows that postponed (aka lazy) initialization, combined with Python's ability to capture the context in which a function is created (a closure), makes the desired result possible.
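For readers on Python 3, where BaseHTTPServer became http.server, the same idea can be sketched as follows. This is an adaptation, not the code from this post: it swaps the lambda for functools.partial, which pre-binds the first constructor argument in the same way. The snake_case names (tell_the_date, handle_requests_using) are renamed for this sketch, and the date string is encoded because wfile expects bytes in Python 3.

```python
import functools
import time
from http.server import BaseHTTPRequestHandler, HTTPServer


def tell_the_date(output):
    """Write the current UTC date and time to a binary file-like object."""
    current_date = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime())
    output.write(current_date.encode("ascii"))  # wfile expects bytes


class RequestHandler(BaseHTTPRequestHandler):
    def __init__(self, logic, *args, **kwargs):
        # Store the dependency first: the base constructor already
        # handles the request, so do_GET() runs before it returns.
        self.logic = logic
        super().__init__(*args, **kwargs)

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.logic(self.wfile)

    do_POST = do_GET  # both methods share the same behaviour


def handle_requests_using(logic):
    # partial pre-binds `logic`; the server later supplies
    # (request, client_address, server) as the remaining arguments.
    return functools.partial(RequestHandler, logic)


def main():
    host = ("localhost", 8080)
    server = HTTPServer(host, handle_requests_using(tell_the_date))
    server.handle_request()  # serve a single request, then return
```

Calling main() serves exactly one request on localhost:8080, as in the examples above. The server only ever calls whatever it was given with three positional arguments, so any callable that returns a handler works in place of the class.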

1 comment:

  1. Related, possibly simpler solution: https://mail.python.org/pipermail/python-list/2012-March/621727.html
