This chapter provides a very brief introduction to HTML and CGI programming. HTML is a way to specify text formatting, including hypertext links to other pages on the World Wide Web. CGI is a standard for communication between a Web server that delivers documents and a program that computes documents for the server. There are many books on these subjects alone. CGI Developers Resource, Web Programming with Tcl and Perl (Prentice Hall, 1997) by John Ivler is a good reference for details that are left unexplained here.
A guestbook is a place for visitors to sign their name and perhaps provide other information. We will build a guestbook that takes advantage of the World Wide Web. Our guests can leave their address as a Uniform Resource Locator (URL). The guestbook will be presented as a page that has hypertext links to all these URLs so that other guests can visit them. The program keeps a simple database of the guests and generates the guestbook page from that database.
The Tcl scripts described in this chapter use commands and techniques that are described in more detail in later chapters. The goal of the examples is to demonstrate the power of Tcl without explaining every detail. If the examples in this chapter raise questions, you can follow the references to examples in other chapters that do go into more depth.
<TITLE>My Home Page</TITLE>
The tags provide general formatting guidelines, but the browsers that display HTML pages have freedom in how they display things. This keeps the markup simple. The general syntax for HTML tags is:
<tag parameters>normal text</tag>
As shown here, the tags usually come in pairs. The open tag may have some parameters, and the close tag name begins with a slash. Tag names are not case sensitive, so <title>, <Title>, and <TITLE> are all valid and mean the same thing. The corresponding close tag could be </title>, </Title>, </TITLE>, or even </TiTlE>.
The <A> tag defines hypertext links that reference other pages on the Web. The hypertext links connect pages into a Web so you can move from page to page and find related information. It is the flexibility of the links that makes the Web so interesting. The <A> tag takes an HREF parameter that defines the destination of the link. If you wanted to link to the Sun home page, you would put this in your page:
<A HREF="http://www.sun.com/">Sun Microsystems</A>
When this construct appears in a Web page, your browser typically displays "Sun Microsystems" in blue underlined text. When you click on that text, your browser switches to the page at the address "http://www.sun.com/". There is a lot more to HTML, of course, but this should give you a basic idea of what is going on in the examples. The following list summarizes the HTML tags that will be used in the examples:
CGI for Dynamic Pages
There are two classes of pages on the Web, static and dynamic. A static page is written and stored on a Web server, and the same thing is returned each time a user views the page. This is the easy way to think about Web pages. You have some information to share, so you compose a page and tinker with the HTML tags to get the information to look good. If you have a home page, it is probably in this class.
puts "Content-Type: text/html"
puts ""
puts "<TITLE>The Current Time</TITLE>"
puts "The time is <B>[clock format [clock seconds]]</B>"
The program computes a simple HTML page that has the current time. Each time a user visits the page they will see the current time on the server. The server that has the CGI program and the user viewing the page might be on different sides of the planet. The output of the program starts with a Content-Type line that tells your Web browser what kind of data comes next. This is followed by a blank line and then the contents of the page.
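The two nested clock calls can also be tried on their own in a Tcl shell. The following is a minimal sketch, not part of the CGI script; the -format string and the use of -gmt are choices made here for illustration so that one of the results is reproducible.

```tcl
# Seconds since the epoch; this value changes on every run.
set now [clock seconds]

# Default format, the same call that the CGI script embeds in its HTML.
puts [clock format $now]

# A fixed timestamp (0 seconds) with an explicit format string,
# rendered in GMT so the result does not depend on the local time zone.
set stamp [clock format 0 -format "%Y-%m-%d %H:%M:%S" -gmt 1]
puts $stamp   ;# 1970-01-01 00:00:00
```

The first puts prints a different string each time, which is exactly what makes the page dynamic.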
The clock command is used twice: once to get the current time in seconds, and a second time to format the time into a nice looking string. The clock command is described in detail on page 145. Fortunately there is no conflict between the markup syntax used by HTML and the Tcl syntax for embedded commands, so we can mix the two in the argument to the puts command. Double quotes are used to group the argument to puts so that the clock commands will be executed. When run, the output of the program will look like this:
Content-Type: text/html

<TITLE>The Current Time</TITLE>
The time is <B>Wed Oct 16 11:23:43 1996</B>
This example is a bit sloppy in its use of HTML, but it should display properly in most Web browsers. The next example includes all the required tags for a proper HTML document.
#!/bin/sh
# guestbook.cgi
# \
exec tclsh "$0" ${1+"$@"}
# Implement a simple guestbook page.
# The set of visitors is kept in a simple database.
# The newguest.cgi script will update the database.
#
source /usr/local/lib/cgilib.tcl
Cgi_Header "Brent's Guestbook" {BGCOLOR=white TEXT=black}
P
set datafile [file join \
    [file dirname [info script]] guestbook.data]
if {![file exists $datafile]} {
    puts "No registered guests, yet."
    P
    puts "Be the first [Link {registered guest!} newguest.html]"
} else {
    puts "The following folks have registered in my GuestBook."
    P
    puts [Link Register newguest.html]
    H2 Guests
    catch {source $datafile}
    foreach name [lsort [array names Guestbook]] {
        set item $Guestbook($name)
        set homepage [lindex $item 0]
        set markup [lindex $item 1]
        H3 [Link $name $homepage]
        puts $markup
    }
}
Cgi_End
source /usr/local/lib/cgilib.tcl
Cgi_Header {Brent's GuestBook} {bgcolor=white text=black}
The Cgi_Header procedure takes as arguments the title for the page and some optional parameters for the HTML <BODY> tag that set the page background and text color. Here we specify black text on a white background to avoid the standard grey background of most browsers. An empty default value is specified for the bodyparams parameter, so you do not have to pass it to Cgi_Header. Default values for procedure parameters are described on page 75.
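Before looking at the definition of Cgi_Header, here is a minimal sketch of how an empty default value behaves. BodyTag is a hypothetical procedure invented for this illustration; it is not part of cgilib.tcl.

```tcl
# Hypothetical helper, used only to illustrate default parameter values.
# The parameter list {{bodyparams {}}} gives bodyparams an empty default.
proc BodyTag {{bodyparams {}}} {
    return "<BODY $bodyparams>"
}

set plain   [BodyTag]                            ;# no argument: default is used
set colored [BodyTag {BGCOLOR=white TEXT=black}] ;# argument overrides default

puts $plain     ;# <BODY >
puts $colored   ;# <BODY BGCOLOR=white TEXT=black>
```

Note that the empty default still leaves a space inside the tag, which browsers ignore.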
proc Cgi_Header {title {bodyparams {}}} {
    puts stdout \
"Content-Type: text/html

<HTML>
<HEAD>
<TITLE>$title</TITLE>
</HEAD>
<BODY $bodyparams>
<H1>$title</H1>"
}
The Cgi_Header procedure just contains a single puts command that generates the standard boilerplate that appears at the beginning of the output. Note that several lines are grouped together with double quotes. Double quotes are used so that the variable references mixed into the HTML are substituted properly.
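The difference between grouping with double quotes and grouping with braces can be seen in a short sketch. The variable names here are chosen only for the illustration.

```tcl
set title "My Page"

# Double quotes group the words and still allow $title to be substituted.
set quoted "<TITLE>$title</TITLE>"

# Braces group the words but prevent substitution, so $title stays literal.
set braced {<TITLE>$title</TITLE>}

puts $quoted   ;# <TITLE>My Page</TITLE>
puts $braced   ;# <TITLE>$title</TITLE>
```

Had Cgi_Header used braces around its HTML, every page would carry the literal text $title as its title.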
The output begins with the CGI content-type information, a blank line, and then the HTML. The HTML is divided into a head and body part. The <TITLE> tag goes in the head section of an HTML document. Finally, browsers display the title in a different place than the rest of the page, so I always want to repeat the title as a level-one heading (i.e., H1) in the body of the page.
if {![file exists $datafile]} {
    puts "No registered guests, yet."
    P
    puts "Be the first [Link {registered guest!} newguest.html]"
The P command generates the HTML for a paragraph break. This trivial procedure saves us a few keystrokes:
proc P {} {
    puts <P>
}
The Link command formats and returns the HTML for a hypertext link. Instead of printing the HTML directly, it is returned so you can include it in-line with other text you are printing:
proc Link {text url} {
    return "<A HREF=\"$url\">$text</A>"
}
The output of the program would be this if there were no data:
Content-Type: text/html

<HTML>
<HEAD>
<TITLE>Brent's Guestbook</TITLE>
</HEAD>
<BODY BGCOLOR=white TEXT=black>
<H1>Brent's Guestbook</H1>
<P>
No registered guests, yet.
<P>
Be the first <A HREF="newguest.html">registered guest!</A>
</BODY>
</HTML>
If the database file exists, then the real work begins. We first generate a link to the registration page, and a level-two header to separate that from the guest list:
puts [Link Register newguest.html]
H2 Guests
The H2 procedure handles the detail of including the matching close tag:
proc H2 {string} {
    puts "<H2>$string</H2>"
}
set datafile [file join \
    [file dirname [info script]] guestbook.data]
By using Tcl commands to represent the data, we can load the data with the source command. The catch command is used to protect the script from a bad data file, which will show up as an error from the source command. Catching errors is described in detail on page 73:
catch {source $datafile}
The Guestbook variable is the array defined in guestbook.data. Array variables are the topic of Chapter 8. Each element of the array is defined with a Tcl command that looks like this:
set {Guestbook(Brent Welch)} {
    http://www.beedub.com/
    {<img src=http://www.beedub.com/welch.gif>}
}
The person's name is the array index, or key. The value of the array element is a Tcl list with two elements: their URL and some additional HTML markup that they can include in the guestbook. Tcl lists are the topic of Chapter 5. The spaces in the name result in some awkward syntax that is explained on page 84. Do not worry about this now. We will see on page 40 that all the braces in the previous statement are generated automatically. The main point is that the person's name is the key, and the value is a list with two elements.
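The claim that the value is a two-element list, regardless of the newlines and indentation between the elements, can be checked directly. This is a minimal sketch that reuses the entry shown above; the variable names name and item are chosen to match the script.

```tcl
# The same command that appears in guestbook.data.
set {Guestbook(Brent Welch)} {
    http://www.beedub.com/
    {<img src=http://www.beedub.com/welch.gif>}
}

# With the key held in a variable, the spaces need no special quoting.
set name "Brent Welch"
set item $Guestbook($name)

puts [llength $item]    ;# 2
puts [lindex $item 0]   ;# http://www.beedub.com/
puts [lindex $item 1]   ;# <img src=http://www.beedub.com/welch.gif>
```

The inner braces keep the markup together as one list element even though it contains a space.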
foreach name [lsort [array names Guestbook]] {
Given the key, we get the value like this:
set item $Guestbook($name)
The two list elements are extracted with lindex, which is described on page 57.
set homepage [lindex $item 0]
set markup [lindex $item 1]
We generate the HTML for the guestbook entry as a level-three header that contains a hypertext link to the guest's home page. We follow the link with any HTML markup text that the guest has supplied to embellish their entry. The H3 procedure is similar to the H2 procedure already shown, except it generates <H3> tags:
H3 [Link $name $homepage]
puts $markup
With that single entry in the database, the output of the program would be this:
Content-Type: text/html

<HTML>
<HEAD>
<TITLE>Brent's Guestbook</TITLE>
</HEAD>
<BODY BGCOLOR=white TEXT=black>
<H1>Brent's Guestbook</H1>
<P>
The following folks have registered in my GuestBook.
<P>
<A HREF="newguest.html">Register</A>
<H2>Guests</H2>
<H3><A HREF="http://www.beedub.com/">Brent Welch</A></H3>
<img src=http://www.beedub.com/welch.gif>
</BODY>
</HTML>
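The H3 procedure is not listed in this chapter. The following is a minimal sketch that assumes it mirrors H2 exactly, as the text describes; the real definition in cgilib.tcl may differ. Link is repeated from above so that the fragment runs on its own.

```tcl
# Link is copied from the listing shown earlier.
proc Link {text url} {
    return "<A HREF=\"$url\">$text</A>"
}

# Assumed definition of H3, modeled on the H2 procedure.
proc H3 {string} {
    puts "<H3>$string</H3>"
}

# Build the link first, then wrap it in a level-three header.
set link [Link {Brent Welch} http://www.beedub.com/]
H3 $link
```

Because Link returns its HTML instead of printing it, the result nests cleanly inside the header that H3 prints.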
The guestbook page contains a link to newguest.html. This page contains a form that lets a user register their name, home page URL, and some additional HTML markup. The form has a submit button. When a user clicks that button in their browser, the information from the form is passed to the newguest.cgi script. This script updates the database and computes another page for the user that acknowledges their contribution.
The newguest.html Form
An HTML form is defined with tags that define data entry fields, buttons, checkboxes, and other elements that let the user specify values. For example, a one-line entry field that is used to enter the home page URL is defined like this:
<INPUT TYPE=text NAME=url>
The INPUT tag is used to define several kinds of input elements, and its TYPE parameter indicates what kind. In this case, TYPE=text creates a one-line text entry field. The submit button is defined with an INPUT tag that has TYPE=submit, and the VALUE parameter becomes the text that appears on the button:
<INPUT TYPE=submit NAME=submit VALUE=Register>
A general type-in window is defined with the TEXTAREA tag. This creates a multiline, scrolling text field that is useful for specifying lots of information, such as a free-form comment. In our case we will let guests type in HTML that will appear with their guestbook entry. The text between the open and close TEXTAREA tags is inserted into the type-in window when the page is first displayed.
<TEXTAREA NAME=markup ROWS=10 COLS=50>Hello.</TEXTAREA>
A common parameter to the form tags is NAME=something. This name identifies the data that will come back from the form. The tags also have parameters that affect their display, such as the label on the submit button and the size of the text area. Those details are not important for our example. The complete form is shown in Example 3-8:
<!Doctype HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<HTML>
<HEAD>
<TITLE>Register in my Guestbook</TITLE>
<!-- Author: bwelch -->
<META HTTP-Equiv=Editor Content="SunLabs WebTk 1.0beta 10/11/96">
</HEAD>
<BODY>
<FORM ACTION="newguest.cgi" METHOD="POST">
<H1>Register in my Guestbook</H1>
<UL>
<LI>Name <INPUT TYPE="text" NAME="name" SIZE="40">
<LI>URL <INPUT TYPE="text" NAME="url" SIZE="40">
<P>
If you don't have a home page, you can use an email URL like "mailto:welch@acm.org"
<LI>Additional HTML to include after your link:
<BR>
<TEXTAREA NAME="html" COLS="60" ROWS="15">
</TEXTAREA>
<LI><INPUT TYPE="submit" NAME="new" VALUE="Add me to your guestbook">
<LI><INPUT TYPE="submit" NAME="update" VALUE="Update my guestbook entry">
</UL>
</FORM>
</BODY>
</HTML>
<FORM ACTION=newguest.cgi METHOD=POST>
The CGI specification defines how the data from the form is passed to the program. The data is encoded and organized so that the program can figure out the values the user specified for each form element. The encoding is handled rather nicely with some regular expression tricks that are done in Cgi_Parse. Cgi_Parse saves the form data, and you use Cgi_Value to get a form value in your script. These procedures are described in Example 11-4 on page 129. Example 3-9 starts out by calling Cgi_Parse:
#!/bin/sh
# \
exec tclsh "$0" ${1+"$@"}
# source cgilib.tcl from the same directory as newguest.cgi
source [file join \
    [file dirname [info script]] cgilib.tcl]
set datafile [file join \
    [file dirname [info script]] guestbook.data]
Cgi_Parse
# Open the datafile in append mode
if [catch {open $datafile a} out] {
    Cgi_Header "Guestbook Registration Error" \
        {BGCOLOR=black TEXT=red}
    P
    puts "Cannot open the data file"
    P
    puts $out ;# the error message
    exit 0
}
# Append a Tcl set command that defines the guest's entry
puts $out ""
puts $out [list set Guestbook([Cgi_Value name]) \
    [list [Cgi_Value url] [Cgi_Value html]]]
close $out
# Return a page to the browser
Cgi_Header "Guestbook Registration Confirmed" \
    {BGCOLOR=white TEXT=black}
puts "
<DL>
<DT>Name
<DD>[Cgi_Value name]
<DT>URL
<DD>[Link [Cgi_Value url] [Cgi_Value url]]
</DL>
[Cgi_Value html]
"
Cgi_End
The main idea of the newguest.cgi script is that it saves the data to a file as a Tcl command that defines an element of the Guestbook array. This lets the guestbook.cgi script simply load the data by using the Tcl source command. This trick of storing data as a Tcl script saves us from the chore of defining a new file format and writing code to parse it. Instead, we can rely on the well-tuned Tcl implementation to do the hard work for us efficiently.
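The round trip from one script to the other can be tried without a Web server. The sketch below is an illustration under stated assumptions: the file name guestbook-demo.data and the guest "Demo Guest" are made up, the current directory is assumed to be writable, and fixed strings stand in for the Cgi_Value calls.

```tcl
# Hypothetical data file in the current directory; removed at the end.
set datafile [file join [pwd] guestbook-demo.data]
file delete $datafile

# The writer side, as in newguest.cgi: append one set command.
set out [open $datafile a]
puts $out [list set Guestbook(Demo\ Guest) \
    [list http://www.example.com/ {<B>Hello</B>}]]
close $out

# The reader side, as in guestbook.cgi: load the data by running it.
if {[catch {source $datafile} err]} {
    puts "Could not load $datafile: $err"
}
foreach name [lsort [array names Guestbook]] {
    puts "$name -> [lindex $Guestbook($name) 0]"
}

file delete $datafile
```

No parsing code appears anywhere; the Tcl interpreter reads the file the same way it reads any other script.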
The script opens the datafile in append mode so it can add a new record to the end. Opening files is described in detail on page 101. The script uses a catch command to guard against errors. If an error occurs, a page explaining the error is returned to the user. Working with files is one of the most common sources of errors (permission denied, disk full, file-not-found, and so on), so I always open the file inside a catch statement:
if [catch {open $datafile a} out] {
    # an error occurred
} else {
    # open was ok
}
In this command, the variable out gets the result of the open command, which is either a file descriptor or an error message. This style of using catch is described in detail in Example 6-14 on page 71.
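To see the error branch in action, you can point open at a file that cannot be created. This is a minimal sketch; the directory name no-such-directory is hypothetical and is assumed not to exist under the current directory.

```tcl
# Hypothetical path: the intermediate directory is assumed to be missing,
# so open cannot create the file and raises an error.
set datafile [file join [pwd] no-such-directory guestbook.data]

if {[catch {open $datafile a} out]} {
    # catch returned nonzero: out holds the error message
    set status "error: $out"
} else {
    # catch returned 0: out holds the open file handle
    set status "ok"
    close $out
}
puts $status
```

The same variable carries either the file handle or the error text, so the script decides how to use it only after checking the result of catch.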
puts $out [list set Guestbook([Cgi_Value name]) \
    [list [Cgi_Value url] [Cgi_Value html]]]
There are two lists. First the url and html are formatted into one list. This list will be the value of the array element. Then, the whole Tcl command is formed as a list. In simplified form, the command is generated from this:
list set variable value
Using the list command ensures that the result will always be a valid Tcl command that sets the variable to the given value. The list command is described in more detail on page 55.
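A short sketch shows why this matters when the values contain spaces. Fixed strings stand in for the Cgi_Value calls here, and the variable names are chosen only for the illustration.

```tcl
# Stand-ins for the values that Cgi_Value would return.
set name "Brent Welch"
set url  http://www.beedub.com/
set html {<img src=http://www.beedub.com/welch.gif>}

# Build the command the same way newguest.cgi does.
set cmd [list set Guestbook($name) [list $url $html]]

# Prints a set command with the braces added by list.
puts $cmd

# Evaluating the generated command recreates the array element.
eval $cmd
puts [lindex $Guestbook($name) 0]   ;# http://www.beedub.com/
puts [lindex $Guestbook($name) 1]   ;# <img src=http://www.beedub.com/welch.gif>
```

Building the same command by plain string concatenation would split the name and the markup at their spaces and produce a command with the wrong number of arguments.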
Next Steps
There are a number of details that could be added to this example. A user may want to update their entry, for example. They could do that now, but they would have to retype everything. They might also like a chance to check the results of their registration and make changes before committing them. This requires another page that displays their guest entry as it would appear on a page, and also has the fields that let them update the data.