The magazine of the Melbourne PC User Group

A Snake In the Web: Python and CGI
Myles Strous

CGI has been used as an abbreviation for a number of descriptive phrases. While Python has been involved in a range of Computer Generated Imagery topics (e.g. the Lightflow Rendering Interface by Jacopo Pantaleoni, at http://www.lightflowtech.com/), in this case I am referring to the Common Gateway Interface, the standard method for "interfacing external applications with information servers, such as HTTP or Web servers" (from the National Center for Supercomputing Applications definition at http://hoohoo.ncsa.uiuc.edu/cgi/intro.html). NCSA maintains the current CGI specs and was responsible for some important Web developments, such as the early Web browser NCSA Mosaic and the Web server NCSA httpd, which may be familiar to Web pioneers.

Most languages can be used for basic CGI work - the only real requirement is the ability to read standard input (stdin) and write to standard output (stdout). However, Python provides a number of advantages: cross-platform portability (should you ever wish to change your Web server operating system), source code readability (and thus maintainability), and extensive libraries enabling rapid Web development, i.e. "batteries are included" for communicating with Web servers, mail servers and FTP servers, reading XML, and other useful tasks.

Python doesn't have to be used in a traditional CGI manner to be used for Web work - these days techniques based on Object Request Broker technology allow Python objects to be "published" directly on the Web without the need for CGI or HTTP specific code. One major application server gaining in popularity is Zope, the Z Object Publishing Environment, written mostly in Python.
 
Zope makes it easy to maintain an extensive Web site with a standardised look, standard headers and footers, departmental permission controls, and the ability to add customised code, database interactivity, and portal applications. Zope may be found at http://www.zope.com/. The product is free, although support, consultation, or customisation will cost money, a common paradigm emerging in Open Source environments.

It is possible to use Zope for many things without writing any Python code at all, but if you want to extend its functionality then Python is a natural choice.

If you think Zope is overkill for your needs, you might wish to look at Poor Man's Zope, or PMZ, by Andreas Jung (see http://www.suxers.de/python/pmz.htm). This technique is similar to Microsoft's Active Server Pages (ASP) or the PHP Web development language, but can be run on any server and is implemented in pure Python. A sample of Python code embedded in HTML using PMZ is shown in Listing 1, producing the page shown in Figure 1. PMZ requires you to configure your server to recognize that .pmz files need to be passed to the main PMZ Python script and to modify the main PMZ script with the location of permitted .pmz files.

<html><body>

<h1>Testing Poor Man's Zope</h1>

<pmz>

for n in range(1,6):
    print "<H%d>Testing</H%d>" % (n, n)

</pmz>

</body></html>

Listing 1. Using Python, ASP-style, in Poor Man's Zope


Figure 1. Web page produced by PMZ code

Alternatively, after appropriate configuration, Python can also be used within standard ASP on a Microsoft server.

As in normal Python usage, indentation is significant in Python source code "embedded" in HTML Web pages: leading whitespace is what marks the start and end of each block.

Dyed-in-the-wool programmers may prefer to use something like HTMLgen, "a class library for the generation of HTML documents with Python scripts", written by Robin Friedrich (available at http://starship.python.net/lib.html). This enables you to generate HTML with code such as the snippet shown in Listing 2, which generates a simple HTML table.

import HTMLgen

mytable = HTMLgen.Table('Table caption')
myheaders = ['column heading 1', 'column heading 2', 'column heading 3']
mytable.heading = myheaders
mylist = ['data cell one', 'data cell two','data cell three']
mytable.body = [mylist]
print mytable

Listing 2. Generating HTML from HTMLgen

Alternatively for simple tasks you can just generate HTML directly from Python code - two snippets for generating a similar table are shown in Listing 3.

# first snippet: no string substitution is applied,
# so a single percent symbol is printed as-is
print '<TABLE BORDER=1 WIDTH=90%>'
print '<TR>'

mylist = ['one', 'two','three']
for eachitem in mylist :
    print '<TD>',eachitem,'</TD>'

print '</TR>'
print '</TABLE>'

# or alternatively,
# using a template and string substitution

mylist = ['one', 'two','three']

# construct a long string of data cells, including HTML
datacells = "" # a blank/empty string
for eachitem in mylist:
    datacells = datacells + '<TD>%s</TD>' % eachitem

# use a triple-quoted multiline "template"
# double percent symbol indicates literal % character
# single percent symbol used for string substitution
print '''
<TABLE BORDER=1 WIDTH=90%%>
<TR>
%s
</TR>
</TABLE>
''' % datacells
# the "datacells" variable substitutes into the %s string marker

Listing 3. Generating HTML code directly

The use of string "addition" as shown in the second part of Listing 3 is not the most efficient or Pythonic way of adding multiple short strings, but for the purposes of this article is perhaps easier to understand. Note that Listings 2 and 3 are only snippets from larger programs, not complete CGI scripts in themselves.
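
For comparison, the more usual idiom is to collect the pieces in a list and join them once at the end. The following is a minimal sketch of that idea, not a replacement for the listing; it is written with print as a function so that it runs on Python 3, and the names cells, template and table_html are only illustrative.

```python
# Sketch only: build the table row with a list and str.join
# instead of repeated string "addition".
mylist = ['one', 'two', 'three']

# one '<TD>...</TD>' string per item, joined once at the end
cells = ['<TD>%s</TD>' % item for item in mylist]
datacells = ''.join(cells)

# %% gives a literal percent sign because the % operator is applied below
template = '''<TABLE BORDER=1 WIDTH=90%%>
<TR>
%s
</TR>
</TABLE>'''

table_html = template % datacells
print(table_html)
```

With only three cells the difference is invisible, but joining avoids creating a new, longer string on every pass through the loop.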

A number of Web servers are available if you want to test CGI scripts on your own computer, before using your script on a live server.

Microsoft's Personal Web Server is supplied as an extra with various versions of Windows and with Microsoft FrontPage, and has also been available from Microsoft's Web site.

Xitami (available at http://www.xitami.com/) is a relatively small download but has a surprising range of features, including CGI and Server Side Include functionality.

Listing 4 is a brief guide to setting up Python CGI files on Windows on either of these two servers. It is possible to have several Web servers installed on your PC at once, but it is probably not wise to have more than one running at once unless you know what you are doing with port numbers and such.

Microsoft Personal Web Server
(details summarised from a more informative page by Aaron Watters at http://starship.python.net/crew/aaron_watters/pws.html)

Run regedit and go to HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W3SVC\Parameters\ScriptMap. Add a new string value named ".cgi" with a datastring of "c:\<path to python>\python.exe -u %s %s", then restart your PC.

This line will enable you to call your Python script with a simple reference like http://yourserver/scripts/yourscript.cgi, provided "scripts" is an "executable" directory for your server (the standard PWS "scripts" directory is executable by default). The "-u" flag specifies unbuffered binary mode, required for some Windows operations.

Xitami (for Windows)

Just put your scripts in a cgi-bin directory with a first line of "#! python -u", assuming python.exe is on your path.

Listing 4. Using Python for CGI on Windows

Whatever Web server you are using on your personal PC, I'd recommend you change the settings for it to start manually rather than default to automatically starting with your PC, and only run it when you are not online, assuming you are only using it for CGI testing.

If you are going to make any Web server available to the outside world, I recommend you regularly check the home page for the Web server supplier, to keep up with any security patches and updates. You may also wish to investigate firewall programs, if you don't already use one, for all your time online.

If you are more interested in playing with server code rather than CGI code, the libraries distributed with the standard Python installation include the source to a simple Web server.
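
As a rough illustration of how little code that library server needs, here is a minimal sketch. It assumes the Python 3 module name http.server (in Python releases of this article's vintage the equivalent modules were BaseHTTPServer and SimpleHTTPServer), and the function name make_test_server and the port number are arbitrary choices. It serves static files only, not CGI scripts.

```python
# Sketch only: a file server for local testing, built from the
# standard library. Python 3 module names are assumed.
from http.server import HTTPServer, SimpleHTTPRequestHandler

def make_test_server(port=8000):
    """Return a server that listens on this machine only (127.0.0.1)."""
    return HTTPServer(("127.0.0.1", port), SimpleHTTPRequestHandler)

# To serve the current directory until interrupted with Ctrl-C:
#     make_test_server().serve_forever()
```

Binding to 127.0.0.1 rather than to all addresses keeps the test server invisible to other machines.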
 
Testing CGI code in any language can be a trial - when a program called by a Web form or URL does something wrong, it often returns a terse message to the browser something like "Error 500 : Server error" - not very helpful. If you're lucky, error messages returned by the CGI program will be in the server's error log file, but in some cases they will be discarded, leaving you with no visible clue as to the error.

One way to capture these errors, as documented in the Python Library Reference, is to put all your CGI code into a try/except block, and redirect traceback errors to the Web page (see Listing 5, example output shown in Figure 2).

#! python -u
import sys, traceback
print 'Content-type: text/html'
print
sys.stderr = sys.stdout
try :
    # your CGI code goes here
    pass
except :
    print "\n\n<PRE>"
    traceback.print_exc()
    print "</PRE>"

Listing 5. Catching CGI errors with exceptions and traceback

#! python -u
import cgitb
try:
    # your CGI code goes here
    pass
except:
    cgitb.handler()

Listing 6. Catching CGI errors using cgitb.py from Ka-Ping Yee



Figure 2. Sample of CGI traceback shown in browser

If you want to get a little more sophisticated, you can extend this slightly to e-mail a copy of these errors to yourself (e-mail scripting is covered later in this article), so that if a CGI script you maintain runs into a complication you hadn't planned on, you will get automatic e-mail notification of errors from your Web page.
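
One possible shape for such an extension is sketched below. It is an assumption-laden example rather than a finished script: it uses the Python 3 email.message module, the function name build_error_report is made up for illustration, and the addresses and server name are placeholders. The sending step is left as a comment so that the sketch can be run without a mail server.

```python
import traceback
from email.message import EmailMessage

def build_error_report(sender, recipient, script_name):
    """Compose an e-mail holding the traceback of the exception
    currently being handled (call this inside an except block)."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = "CGI error in %s" % script_name
    message.set_content(traceback.format_exc())
    return message

try:
    1 / 0  # stand-in for your CGI code
except Exception:
    report = build_error_report("webmaster@example.com",
                                "me@example.com", "myscript.cgi")
    # To actually send it (hypothetical server name):
    #     import smtplib
    #     server = smtplib.SMTP("smtp.example.com")
    #     server.send_message(report)
    #     server.quit()
```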
 
As Tim Peters, probably second in the Python hierarchy to Python's creator, Guido van Rossum, once said in his partially serious Zen of Python, "Errors should never pass silently. Unless explicitly silenced." (you can normally find the Zen of Python at http://www.python.org/doc/Humor.html, although at the time of writing that server was suffering hardware problems, so you can see it instead at http://python.mirrors/netnumina.com/doc/Humor.html).

With the release of the CGI traceback module from Ka-Ping Yee, available at http://web.lfw.org/python/, you can add a simple bit of code (see Listing 6) to get nicely coloured and formatted traceback details with the values of any variables at the time of an error occurring (see Figure 3).


Figure 3. Formatted CGI traceback using Ka-Ping Yee's cgitb module

If you don't want to be entering the same data into a Web form over and over again while testing, you can fake the form input with a bit of code I cobbled together for testing, presented in Listing 7. You can also put the information directly into the URL, such as "http://aserver.com.au/cgi-bin/myscript.cgi?year=1999&town=Melbourne", but I find this quickly gets tedious when passing numerous values for testing.

#! python -u
import cgi

testing = 1 # 1 is true, 0 is false
if testing :
    # fake the HTML form input
    class fakefield:
        def __init__(self, formvalue):
            self.value = formvalue
    yearvalue = fakefield("1999")
    townvalue = fakefield("Melbourne")
    form = {"year":yearvalue,
            "town":townvalue}
else :
    # get the information from the HTML form
    form = cgi.FieldStorage()
# process the information (code not shown)

Listing 7. Faking input from a HTML form
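
When there are many test values, the same trick can be wrapped in a small helper that builds the fake form from an ordinary dictionary. This is only a sketch of that idea; the names FakeField and fake_form are illustrative, and it is written so that it runs on Python 3.

```python
class FakeField:
    """Stand-in for a form field: it just carries a .value attribute."""
    def __init__(self, formvalue):
        self.value = formvalue

def fake_form(values):
    """Turn {'name': 'text'} into {'name': FakeField('text')}."""
    return {name: FakeField(text) for name, text in values.items()}

form = fake_form({"year": "1999", "town": "Melbourne"})
print(form["year"].value, form["town"].value)
```

Code that only reads form["year"].value cannot tell this dictionary from the real form object, which is what makes the substitution work.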


Figure 4. Drop-down links list calling a CGI script

As well as generating repetitive HTML code, Python is useful for all sorts of little CGI utilities. For example, a useful feature on a Web page may be a drop-down box for a set of links, rather than providing a space-hungry full list of links (see Figure 4). Listing 8 shows simple Python code to go behind such a feature. Alternatively, you may provide more detailed HTML, such as the redirection function shown in Listing 9, combined from several such scripts discussed on the comp.lang.python newsgroup.

#! python -u

import cgi
form = cgi.FieldStorage()

if form.has_key('destination') :
    destination = form['destination'].value
    print 'Location:', destination
    print
else :
    # Error function not shown in this example
    Error('no destination given')

Listing 8. CGI script to jump to another web page

 

def redirect(url):
    print 'Status: 302 Found'
    print 'Uri:', url
    print 'Location:', url
    print 'Pragma: no-cache'
    print 'Content-type: text/html'
    print
    print '<html><head>'
    print '<title>Redirect (302)</title>'
    print '</head><body>'
    print '<h1>Redirecting to new location</h1>'
    new_url = cgi.escape(url, 1)
    print '<a href="%s">%s</a>' % (new_url, new_url)
    print '</body>'
    print '</html>'

Listing 9. Extract from a redirection script

A common use for CGI is to provide the back end for a Web-based e-mail submission form. Listing 10 shows a Python CGI file to send e-mail from a Web form via an SMTP server (SMTP = simple mail transfer protocol). Note that the communication with the SMTP server is four relatively simple lines of code. A complete CGI program will probably also include code for error handling e.g. user not entering all the required fields. Note that the brief examples shown in this article do not include significant data validation, error handling, or unit testing routines.

import cgi
# get the information from the HTML form
form = cgi.FieldStorage()
# create a dictionary to store information
mail_info = {}
# fill the dictionary from the form data
mail_info['from_address'] = form['sender'].value
mail_info['to_address'] = form['destination'].value
mail_info['subject'] = form['subject'].value
mail_info['message_body'] = form['messagebody'].value

# compose it into standard SMTP format
# fill the values with information from the dictionary
message = '''From: %(from_address)s
To: %(to_address)s
Subject: %(subject)s

%(message_body)s
''' % mail_info

# send it via the mail server
import smtplib
server=smtplib.SMTP('smtp.aserver.com.au')
server.sendmail(mail_info['from_address'], [mail_info['to_address']], message)
server.quit()

# create a web page confirmation for the sender
# (code not shown)

Listing 10. Sending email from a web page via SMTP.
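
The check for required fields mentioned above might look something like the following sketch. It is not part of the script in Listing 10: the function name missing_fields and the stand-in Field class are assumptions made so that the example can run on its own (on Python 3), and it ignores the case of a field submitted more than once.

```python
def missing_fields(form, required):
    """Return the required field names that are absent or blank."""
    missing = []
    for name in required:
        if name not in form or not form[name].value.strip():
            missing.append(name)
    return missing

class Field:
    """Minimal stand-in for a submitted form field."""
    def __init__(self, value):
        self.value = value

# a made-up submission with no subject and a blank message body
form = {"sender": Field("me@example.com"),
        "destination": Field("you@example.com"),
        "messagebody": Field("   ")}

problems = missing_fields(
    form, ["sender", "destination", "subject", "messagebody"])
print(problems)
```

If the returned list is not empty, the script can show the form again with a message naming the missing fields, rather than failing with an error when it looks up a field that is not there.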

One slightly unusual circumstance I ran into recently involved the creation of Web pages that used Server Side Include (SSI) techniques to include a standard header on all Web pages produced within an institution. The header snippet was centrally maintained and stored on a central Web server. When using Web pages on the same server, including the header was as simple as including a line in the HTML code stating '<!--#include virtual="/ssi/header.html" -->'.

Problems arose when one department wanted to use a customised version of the standard header - one title image and the link from it needed to be changed. Moreover, they wanted to use it on a different Web server, and they didn't want to manually maintain the non-customised portions (other links and images) when they were changed on the central server.

The main initial problem was getting the standard header snippet from the central server. Server Side Include technology just doesn't allow you to include snippets from another server. Python to the rescue - using the urllib library included with Python, I could easily grab the header from the server (see the snippet in Listing 11). It could then be modified using regular expressions and Python's string routines to provide the standard header with customised portions (code not shown in this article) before outputting it to the Web page. It is possible to call a CGI program from a Server Side Include command, so with the appropriate code, the SSI line given previously could be replaced with a line stating '<!--#include virtual="/ssi/header.cgi" -->', which now gets the standard header from a different server and modifies it if necessary.

def getHTML(url) :
    # open up a connection with the web server
    import urllib
    connection = urllib.urlopen(url)
    # get the contents of the given URL
    htmlText = connection.read()
    connection.close()
    return htmlText

Listing 11. Getting a HTML file from another server
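
The customising step itself is not shown in this article, but a sketch of the general approach, under invented assumptions, could look like this. The header text, the URLs, the image names and the function name customise_header are all made up for illustration; the real header and the real substitutions would differ. The sketch runs on Python 3.

```python
import re

# hypothetical header snippet, standing in for the text fetched
# from the central server
header = ('<a href="http://www.example.edu/">'
          '<img src="/images/title.gif"></a>\n'
          '<a href="/library/">Library</a>')

def customise_header(html_text):
    """Swap the title image and its link; leave everything else alone."""
    # plain string replacement is enough for a fixed piece of text
    html_text = html_text.replace('href="http://www.example.edu/"',
                                  'href="http://dept.example.edu/"')
    # a regular expression, with the dot escaped, for the image name
    html_text = re.sub(r'src="/images/title\.gif"',
                       'src="/images/dept_title.gif"', html_text)
    return html_text

customised = customise_header(header)
print(customised)
```

Because only the two named pieces are touched, any other links or images changed on the central server flow through to the departmental page untouched.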

Another relatively common use for CGI code, at least where I work, is to query a database and return the results as a Web page. This becomes relatively easy with Python, no matter what the platform. Listing 12 shows a snippet demonstrating querying a MySQL database. MySQL is an established Open Source database, available from http://www.mysql.com for a number of platforms, including Windows. This snippet uses the MySQLdb interface module written by Andy Dustman (available at http://sourceforge.net/projects/mysql-python, compiled Windows version provided at http://www.cs.fhm.edu/~ifw000065/ by Gerhard Häring). Similar code could be used with the appropriate interface module for many of the other major server-based database engines, both commercial and Open Source. If you wish to connect to a specific database, try looking at http://www.python.org/topics/database/ or http://dmoz.org/Computers/Programming/Languages/Python/Modules/Database/ for an appropriate interface module.

#! python -u

# specify the database module
import MySQLdb

# get details to search on 
# (code not shown)

# connect to database
connection = MySQLdb.connect(host="localhost", \
     user="nobody", db="Mybooks",port=3306)
cursor = connection.cursor()

# form the search details into an SQL search string
SQLcommandstring = \
"""SELECT *
FROM Fiction
WHERE author1 = '%s' and author2 = '%s'
""" % (author1, author2)

# do the search
cursor.execute(SQLcommandstring)
results = cursor.fetchall() # returns list of tuples

connection.close()

# do something with the results (code not shown)

Listing 12. Querying a MySQL database
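
When the search values come from a Web form, pasting them into the SQL string as in Listing 12 lets a visitor who types a quote character change the query. Database interface modules that follow the Python DB-API accept the values as a separate argument to execute() instead. The sketch below shows the idea using the sqlite3 module from the standard library (Python 3) purely so that it can be run without a database server; the table contents are made up. sqlite3 marks parameters with "?", whereas MySQLdb uses "%s" markers with the values passed as a second argument in the same way.

```python
import sqlite3

# an in-memory database standing in for the MySQL server
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute(
    "CREATE TABLE Fiction (title TEXT, author1 TEXT, author2 TEXT)")
cursor.execute("INSERT INTO Fiction VALUES (?, ?, ?)",
               ("Good Omens", "Pratchett", "Gaiman"))

# values as they might arrive from a form
author1, author2 = "Pratchett", "Gaiman"

# the values are passed separately; the module does the quoting
cursor.execute(
    "SELECT title FROM Fiction WHERE author1 = ? AND author2 = ?",
    (author1, author2))
results = cursor.fetchall()  # returns list of tuples

connection.close()
print(results)
```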

Alternatively, Gadfly is an SQL database implemented in pure Python code by Aaron Watters, which should therefore run on the extensive list of platforms that Python will run on. (Gadfly is available at http://www.chordate.com/gadfly.html)

It is possible to have the Python CGI code on either the same machine or a different machine to the MySQL database server. Python code querying a MySQL server could also be either part of a CGI script returning results to a Web page, or part of a client program returning results via a graphical interface.

#! python -u

import sys
import Tkinter

def hello(output):

    if output == "HTML":
        # output HTML code to stdout, for CGI purposes
        print 'Content-type: text/html'
        print
        print """<HTML>
        <HEAD><TITLE>Hello world !</TITLE></HEAD>
        <BODY>
        <H1>Hello, world !</H1>
        </BODY></HTML>
        """

    elif output == "GUI" :
        # output using Tkinter GUI toolkit
        class helloGUI:
            def __init__(self, master):
                self.main = Tkinter.Label(master, text="Hello, World !", \
                    font=("Arial", 30)).pack()
        root = Tkinter.Tk() # define a root GUI object/window
        root.protocol( "WM_DELETE_WINDOW", root.quit )
        app = helloGUI(root) # create an instance of the GUI class
        root.mainloop() # start the GUI event loop

    elif output == "CLI" :
        # output to stdout without HTML
        print "Hello, world !"

    else:
        # write to a file or stream
        # ("output" is assumed to be an open file or stream here)
        # EOL automatically appended to print statements
        print >> output, "Hello, world !"
        # or we could have said this
        # output.write("Hello, world !\n")

# if we're running this script as the main program
if __name__ == "__main__" :
    if len(sys.argv) == 1 :
        # script name provided without arguments, run as CGI
        hello("HTML")
    else:
        hello(sys.argv[1])

Listing 13. Python program with 3 interface choices

Listing 13 is a simple version of a "Hello World!" program that writes its output as HTML if run without arguments (as it would be when called as a CGI script) or with an argument of "HTML", to stdout as simple text if run with an argument of "CLI" (command line interface), and puts up a graphical interface if run with an argument of "GUI". Any other argument is assumed to be an open file or stream, and the output is written to it; this case is only useful when the function is called from other Python code, since an argument typed on the command line arrives as a plain string. This script is also written in a way that it may be imported by another Python program which can call its main routine.

I have used a variation of this technique to write a Python conversion utility for a certain graphical file format - the same Python program may be run from a Web page as a CGI script, compiled to a Windows executable, run as an interpreted GUI script on a Macintosh, or run in command-line form within a DOS/Windows batch file, or from a Unix/Linux or BeOS prompt. The GUI and CGI/HTML versions provide file browsing dialogues for input and output, while the command-line version requires the filename to be supplied as an argument when calling the script. It is also possible to write code that will automatically detect that it is running as a CGI script, e.g. by testing for the environment variable "REMOTE_ADDR".
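
A minimal sketch of such a test follows. The function name running_as_cgi is illustrative, and the optional environ argument exists only so that the check can be tried out without a Web server; it runs on Python 3.

```python
import os

def running_as_cgi(environ=None):
    """Guess whether a Web server started this script, by looking
    for the REMOTE_ADDR variable that CGI servers set."""
    if environ is None:
        environ = os.environ
    return "REMOTE_ADDR" in environ

# a made-up CGI environment versus an empty one
print(running_as_cgi({"REMOTE_ADDR": "127.0.0.1"}))
print(running_as_cgi({}))
```

A script could call this once at start-up and choose the HTML interface when it returns true, falling back to the command-line or GUI interface otherwise.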

It is possible to build on this even further, so the script could offer a choice of GUI toolkits at runtime, or put up a variety of Web pages, providing several Web pages as a program interface rather than a traditional GUI interface.

Listing 14 shows a simple demonstration of this - a CGI script that will put up either of two different Web pages. If called directly without arguments it will put up a query form, asking for an age in years.

#! python -u

import cgi

def HTMLheader(title):
    print "Content-type: text/html"
    print
    print """<HTML>
    <HEAD>
        <TITLE>%s</TITLE>
    </HEAD>
    <BODY>
    <H1>%s</H1>""" % (title, title)

def HTMLfooter():
    print "</BODY></HTML>"

def dog_years(age):
    HTMLheader("Your age in dog years:")
    print age*7
    HTMLfooter()

def ask_age():
    HTMLheader("What is your age ?")
    # the text field must be named "age" so that
    # the script can find it in the submitted form
    print """
    <FORM METHOD=POST ACTION="myscript.cgi">
    <INPUT TYPE=TEXT NAME="age" SIZE=10 MAXLENGTH=10>
    <INPUT TYPE=SUBMIT>
    </FORM>"""
    HTMLfooter()

form = cgi.FieldStorage()
# which web page do we show ?
if form.has_key("age"):
    years = int(form["age"].value)
    dog_years(years)
else:
    ask_age()

Listing 14. One Python script, two web pages

If called from a HTML form with arguments (either from the previous Web page, using the POST submission format, or directly using the GET submission format as a URL of the form http://myserver/mydirectory/myscript.cgi?age=21), it will instead put up a response, giving that number in "dog years" (see Figure 6).


Figure 6. Two web pages for the price of one Python script
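
To make the GET case concrete, the sketch below works through what happens to the "age=21" part of such a URL. It is a stand-alone illustration rather than part of the script: it uses urllib.parse.parse_qs from Python 3 to do the decoding that the form object does for you, and the function name dog_years_from_query is made up.

```python
from urllib.parse import parse_qs

def dog_years_from_query(query_string):
    """Return the age from a query string, in dog years,
    or None if no age was supplied."""
    fields = parse_qs(query_string)  # e.g. {'age': ['21']}
    if "age" not in fields:
        return None  # the script would show the query form instead
    return int(fields["age"][0]) * 7

print(dog_years_from_query("age=21"))
```

A query string of "age=21" therefore produces 21 x 7 = 147, while an empty query string produces None, matching the two branches at the end of the script.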

If you are interested in further details on using Python for Web work, you might want to start browsing at http://www.python.org/topics/web/.
 
I'll leave you with a quote from Mark Jackson, another Python user: "Python - why settle for snake oil when you can have the *whole* snake ?"
 
About the Author
Amongst other duties, Myles works in a PC support role and provides technical support for a number of small Web-based projects. His hobbies include a range of computer graphics interests and programming in Python.


Reprinted from the December 2001 issue of PC Update, the magazine of Melbourne PC User Group, Australia