Deployment Recipes

There are multiple ways to deploy web2py in a production environment; the details depend on the configuration and the services provided by the host.

In this chapter we consider the following issues:

• Configuration of production-quality web servers (Apache, Lighttpd, Cherokee)

• Security issues

• Scalability issues

• Deployment on the Google App Engine (GAE [12])

web2py comes with an SSL-enabled [20] web server, the CherryPy wsgiserver [21]. While this is a fast web server, it has limited configuration capabilities. For this reason it is best to deploy web2py behind Apache [71], Lighttpd [75] or Cherokee [76]. These are free and open-source web servers that are customizable and have proven reliable in high-traffic production environments. They can be configured to serve static files directly, deal with HTTPS, and pass control to web2py for dynamic content.

WEB2PY: Enterprise Web Framework /2nd Ed.. By Massimo Di Pierro Copyright © 2009

Until a few years ago, the standard interface for communication between web servers and web applications was the Common Gateway Interface (CGI) [70]. The main problem with CGI is that it creates a new process for each HTTP request. If the web application is written in an interpreted language, each HTTP request served by the CGI scripts starts a new instance of the interpreter. This is slow, and it should be avoided in a production environment. Moreover, CGI can only handle simple responses. It cannot handle, for example, file streaming.

web2py provides a file cgihandler.py to interface to CGI.

One solution to this problem is to use the mod_python module for Apache. mod_python starts one instance of the Python interpreter when Apache starts, and serves each HTTP request in its own thread without having to restart Python each time. This is a better solution than CGI, but it is not optimal, since mod_python uses its own interface for communication between the web server and the web application. In mod_python, all hosted applications run under the same user-id/group-id, which presents security issues.

web2py provides a file modpythonhandler.py to interface to mod_python.

In the last few years, the Python community has come together behind a new standard interface for communication between web servers and web applications written in Python. It is called Web Server Gateway Interface (WSGI) [17, 18]. web2py was built on WSGI, and it provides handlers for using other interfaces when WSGI is not available.
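To make the WSGI interface concrete, here is a minimal sketch of a WSGI application in Python. It is not web2py's own handler; the names `application` and `call_app` are chosen for illustration only. A WSGI server calls the application with the request environment and a `start_response` callback, and the application returns an iterable of byte strings.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A minimal WSGI application: echo the requested path as plain text."""
    body = ("Hello from %s\n" % environ.get("PATH_INFO", "/")).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path="/"):
    """Hypothetical helper: invoke a WSGI app in-process, without a web server."""
    environ = {}
    setup_testing_defaults(environ)  # fill in the keys the WSGI spec requires
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = response_headers

    body = b"".join(app(environ, start_response))
    return captured["status"], dict(captured["headers"]), body
```

Because the server only needs a callable with this signature, the same application can be hosted by any WSGI-compliant server without modification.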

Apache supports WSGI via the module mod_wsgi [74] developed by Graham Dumpleton.

web2py provides a file wsgihandler.py to interface to WSGI.

Some web hosting services do not support mod_wsgi. In this case, we must use Apache as a proxy and forward all incoming requests to the web2py built-in web server (running for example on localhost:8000).

In both cases, with mod_wsgi and/or mod_proxy, Apache can be configured to serve static files and deal with SSL encryption directly, taking the burden off web2py.

The Lighttpd web server does not currently support the WSGI interface, but it does support the FastCGI [77] interface, which is an improvement over CGI. FastCGI's main aim is to reduce the overhead associated with interfacing the web server and CGI programs, allowing a server to handle more HTTP requests at once.

According to the Lighttpd web site, "Lighttpd powers several popular Web 2.0 sites such as YouTube and Wikipedia. Its high speed IO-infrastructure allows them to scale several times better with the same hardware than with alternative web-servers". Lighttpd with FastCGI is, in fact, faster than Apache with mod_wsgi.

web2py provides a file fcgihandler.py to interface to FastCGI.

web2py also includes a gaehandler.py to interface with the Google App Engine (GAE). On GAE, web applications run "in the cloud". This means that the framework completely abstracts any hardware details. The web application is automatically replicated as many times as necessary to serve all concurrent requests. Replication in this case means more than multiple threads on a single server; it also means multiple processes on different servers. GAE achieves this level of scalability by blocking write access to the file system; all persistent information must be stored in the Google BigTable datastore or in memcache.

On non-GAE platforms, scalability is an issue that needs to be addressed, and it may require some tweaks in the web2py applications. The most common way to achieve scalability is by using multiple web servers behind a load-balancer (a simple round robin, or something more sophisticated, receiving heartbeat feedback from the servers).
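The load-balancing idea can be sketched in a few lines of Python. This is an illustrative toy, not a production balancer: the class `RoundRobinBalancer` and its methods are hypothetical names, and the heartbeat is modeled as an explicit call rather than a network probe. It rotates through the servers in order and skips any server whose last heartbeat reported it as down.

```python
import itertools


class RoundRobinBalancer:
    """Toy round-robin selector that skips servers reported as unhealthy."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)  # all servers start as healthy
        self._cycle = itertools.cycle(self.backends)

    def heartbeat(self, backend, alive):
        """Record the latest health report for one server."""
        if alive:
            self.healthy.add(backend)
        else:
            self.healthy.discard(backend)

    def next_backend(self):
        """Return the next healthy server in rotation."""
        # Look at each server at most once per call before giving up.
        for _ in range(len(self.backends)):
            candidate = next(self._cycle)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backend available")
```

For example, with three servers and the second one reported down, successive calls to `next_backend()` alternate between the first and the third until a later heartbeat marks the second as alive again.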

Even if there are multiple web servers, there must be one, and only one, database server. By default, web2py uses the file system for storing sessions, error tickets, uploaded files, and the cache. This means that in the default configuration, the corresponding folders have to be shared folders:

[Figure: multiple web servers connected to a single database (DB) and to a shared folder, mounted via Samba or NFS, that holds sessions, errors, cache, and uploads.]

In the rest of the chapter, we consider various recipes that may provide an improvement over this naive approach, including:

• Store sessions in the database, in cache or do not store sessions at all.

• Store tickets on local filesystems and move them into the database in batches.

• Use memcache instead of cache.ram and cache.disk.

• Store uploaded files in the database instead of the shared filesystem.

While we recommend following the first three recipes, the fourth recipe may provide an advantage mainly in the case of small files, but may be counterproductive for large files.
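As an illustration of the second recipe, the sketch below moves ticket files from a local folder into a database table in one batch. It is a generic example and not web2py's own mechanism: it uses the standard-library `sqlite3` module, and the function name `move_tickets_to_db`, the table name `ticket`, and its columns are assumptions made for this example.

```python
import os
import sqlite3


def move_tickets_to_db(errors_dir, conn):
    """Copy every ticket file in errors_dir into the database, then delete it.

    Returns the number of tickets moved.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ticket ("
        "ticket_id TEXT PRIMARY KEY, ticket_data BLOB)"
    )
    moved = 0
    for name in sorted(os.listdir(errors_dir)):
        path = os.path.join(errors_dir, name)
        if not os.path.isfile(path):
            continue  # ignore subfolders
        with open(path, "rb") as ticket_file:
            data = ticket_file.read()
        # Commit the row first; the local file is removed only afterwards,
        # so a failed insert never loses a ticket.
        with conn:
            conn.execute(
                "INSERT OR REPLACE INTO ticket (ticket_id, ticket_data) "
                "VALUES (?, ?)",
                (name, data),
            )
        os.remove(path)
        moved += 1
    return moved
```

Each web server could run such a job periodically against its own local errors folder, so that tickets are written locally at request time and no shared folder is needed for them.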
