set d [exec date]
The standard output of the program is returned as the value of the exec command. However, if the program writes to its standard error channel or exits with a non-zero status code, then exec raises an error. If you do not care about the exit status, or you use a program that insists on writing to standard error, then you can use catch to mask the errors:
catch {exec program arg arg} result
The exec command supports a full set of I/O redirection and pipeline syntax. Each process normally has three I/O channels associated with it: standard input, standard output, and standard error. With I/O redirection you can divert these I/O channels to files or to I/O channels you have opened with the Tcl open command. A pipeline is a chain of processes that have the standard output of one command hooked up to the standard input of the next command in the pipeline. Any number of programs can be linked together into a pipeline.
set n [exec sort < /etc/passwd | uniq | wc -l 2> /dev/null]
Example 9-1 uses exec to run three programs in a pipeline. The first program is sort, which takes its input from the file /etc/passwd. The output of sort is piped into uniq, which suppresses duplicate lines. The output of uniq is piped into wc, which counts the lines. The error output of the command is diverted to the null device to suppress any error messages.
Table 9-1 provides a summary of the syntax understood by the exec command.
Note that a trailing & causes the program to run in the background. In this case the process identifier is returned by the exec command. Otherwise, the exec command blocks during execution of the program and the standard output of the program is the return value of exec. The trailing newline in the output is trimmed off, unless you specify -keepnewline as the first argument to exec.
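As a rough illustration of the error behavior and the -keepnewline flag described above, the following sketch wraps exec in catch and reports whether the program succeeded. The procedure name TryExec is hypothetical, and the usage lines assume a UNIX-like system with an echo program on the search path.

```tcl
# TryExec is a hypothetical helper, not part of Tcl.
# It returns a two-element list: 1 or 0 for success or failure,
# followed by the program output or the error message.
proc TryExec {args} {
    # linsert builds the command "exec arg arg ..." as a proper list
    if {[catch {eval [linsert $args 0 exec]} result]} {
        return [list 0 $result]
    }
    return [list 1 $result]
}

# Assumes a UNIX-like system with an echo program on the PATH
set status [TryExec echo hello]
# -keepnewline preserves the trailing newline that exec normally trims
set raw [exec -keepnewline echo hello]
```

A caller can test the first element of the result instead of writing a catch around every exec.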
The auto_noexec Variable
The Tcl shell programs are set up during interactive use to attempt to execute unknown Tcl commands as programs. For example, you can get a directory listing by typing:
ls
instead of:
exec ls
This is handy if you are using the Tcl interpreter as a general shell. It can also cause unexpected behavior when you are just playing around. To turn this off, define the auto_noexec variable:
set auto_noexec anything
The good news is that Windows 95 and Windows NT clean up most of these problems. Windows NT 4.0 is the most robust.
Cross-Platform File Naming
Files are named differently on UNIX, Windows, and Macintosh. UNIX separates file name components with a forward slash (/), Macintosh separates components with a colon (:), and Windows separates components with a backslash (\). In addition, the way that absolute and relative names are distinguished is different. For example, these are absolute pathnames for the Tcl script library (i.e., $tcl_library) on Macintosh, Windows, and UNIX, respectively:
c:\Program Files\Tcl\lib\Tcl7.6
/usr/local/tcl/lib/tcl7.6
The good news is that Tcl provides operations that let you deal with file pathnames in a platform-independent manner. The file operations described in this chapter allow either native format or the UNIX naming convention. The backslash used in Windows pathnames is especially awkward because the backslash is special to Tcl. Happily, you can use forward slashes instead:
c:/Program Files/Tcl/lib/Tcl7.6
There are some ambiguous cases that can only be specified with native pathnames. On my Macintosh, Tcl and Tk are installed in a directory that has a slash in it. You can only name it with the native Macintosh name:
Disk:Applications:Tcl/Tk 4.2
Another construct to watch out for is a leading // in a file name. This is the Windows syntax for network names that reference files on other computers. You can avoid accidentally constructing a network name by using the file join command described next. Of course, you can use network names to access remote files.
If you have to communicate with external programs, you may need to construct a file name in the native syntax for the current platform. You can construct these names with file join described later. You can also convert a UNIX-like name to a native name with file nativename.
Several of the file operations operate on pathnames as opposed to returning information about the file itself. You can use the dirname, extension, join, pathtype, rootname, split, and tail operations on any string; there is no requirement that the pathnames refer to an existing file.
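To make the point concrete, here is a small sketch that applies several of these operations to a pathname that does not exist. The pathname itself is made up for illustration, and the values mentioned afterward assume a UNIX system.

```tcl
# None of these names needs to exist on disk
set path /no/such/dir/report.final.txt

set type  [file pathtype $path]   ;# absolute, relative, or volumerelative
set parts [file split $path]      ;# list of pathname components
set dir   [file dirname $path]    ;# everything but the last component
set tail  [file tail $path]       ;# just the last component
set ext   [file extension $path]  ;# suffix, including the period
set root  [file rootname $path]   ;# everything before the last period
```

On UNIX this leaves absolute in type, /no/such/dir in dir, report.final.txt in tail, and .txt in ext. Other platforms may classify the same string differently.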
set file $tcl_library/init.tcl
Use file join to construct file names.
The platform-independent way to construct file names is with file join. The following command returns the name of the init.tcl file in native format:
set file [file join $tcl_library init.tcl]
The file join operation can join any number of path name components. In addition, it has the feature that an absolute pathname overrides any previous components. For example (on UNIX), /b/c is an absolute pathname, so it overrides any paths that come before it in the arguments to file join:
file join a b/c d
=> a/b/c/d
file join a /b/c d
=> /b/c/d
On Macintosh, a relative pathname starts with a colon, and an absolute pathname does not. To specify an absolute path you put a trailing colon on the first component so it is interpreted as a volume specifier. These relative components are joined into a relative pathname:
file join a :b:c d
=> :a:b:c:d
In the next case, b:c is an absolute pathname with b: as the volume specifier. The absolute name overrides the previous relative name:
file join a b:c d
=> b:c:d
The file join operation converts UNIX-style pathnames to native format. For example, on Macintosh you get this:
file join /usr/local/lib
=> usr:local:lib
file split "/Disk/System Folder/Extensions"
=> Disk: {System Folder} Extensions
Common reasons to split up pathnames are to divide a pathname into the directory part and the file part. These special cases are handled directly by the dirname and tail operations. The dirname operation returns the parent directory of a pathname, while tail returns the trailing component of the pathname:
file dirname /a/b/c
=> /a/b
file tail /a/b/c
=> c
For a pathname with a single component, the dirname option returns "." on UNIX and Windows, or ":" on Macintosh. This is the name of the current directory.
The extension and rootname options are also complementary. The extension option returns everything from the last period in the name to the end (i.e., the file suffix, including the period). The rootname option returns everything up to, but not including, the last period in the pathname:
file rootname /a/b.c
=> /a/b
file extension /a/b.c
=> .c
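One common use of this pair is to derive a related file name by swapping the suffix. The following is a minimal sketch; the procedure name ChangeExt is hypothetical.

```tcl
# ChangeExt is a hypothetical helper: replace the suffix of a file name.
# newExt should include the leading period, for example ".o".
proc ChangeExt {name newExt} {
    return [file rootname $name]$newExt
}

set obj [ChangeExt /a/b.c .o]
```

Here obj is set to /a/b.o. If the name has no suffix, the new suffix is simply appended.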
File name patterns are not directly supported by the file operations. Instead, you can use the glob command described on page 106 to get a list of file names that match a pattern.
Copying Files
The file copy operation copies files and directories. The following example copies file1 to file2. If file2 already exists, the operation raises an error unless the -force option is specified:
file copy ?-force? file1 file2
Several files can be copied into a destination directory. The names of the source files are preserved. The -force option indicates that files under directory can be replaced:
file copy ?-force? file1 file2 ... directory
Directories can be recursively copied. The -force option indicates that files under dir2 can be replaced:
file copy ?-force? dir1 dir2
file mkdir dir dir ...
It is not an error if the directory already exists. Furthermore, intermediate directories are created if needed. This means you can always make sure a directory exists with a single mkdir operation. Suppose /tmp has no subdirectories at all. The following command creates /tmp/sub1 and /tmp/sub1/sub2:
file mkdir /tmp/sub1/sub2
The -force option is not understood by file mkdir, so the following command accidentally creates a folder named -force, as well as one named oops:
file mkdir -force oops
file delete ?-force? name name ...
To delete a file or directory named -force, you must specify a non-existent file before the -force to prevent it from being interpreted as a flag (-force -force won't work):
file delete xyzzy -force
file rename ?-force? old new
Using file rename is the best way to update an existing file. First generate the new version of the file in a temporary file. Then use file rename to replace the old version with the new version. This ensures that any other programs that access the file will not see the new version until it is complete.
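The update technique might be sketched as follows. The procedure name UpdateFile and the naming scheme for the temporary file are assumptions made for illustration. The temporary file is created next to the target so that the rename stays within one file system.

```tcl
# UpdateFile is a hypothetical helper, not a built-in command.
proc UpdateFile {name contents} {
    # Temporary name in the same directory, made unique with the process ID
    set tmp [file join [file dirname $name] \
            .[file tail $name].[pid].tmp]
    set out [open $tmp w]
    if {[catch {
        puts -nonewline $out $contents
        close $out
    } err]} {
        # Clean up the partial file and pass the error along
        catch {close $out}
        file delete $tmp
        error $err
    }
    # Replace the old version only after the new one is complete
    file rename -force $tmp $name
}
```

A call such as UpdateFile /tmp/settings.txt $newText leaves either the old contents or the new contents in place, never a partially written file.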
proc newer { file1 file2 } {
    if ![file exists $file2] {
        return 1
    } else {
        # Assume file1 exists
        expr {[file mtime $file1] > [file mtime $file2]}
    }
}
The stat and lstat operations return a collection of file attributes. They take a third argument that is the name of an array variable, and they initialize that array with elements that contain the file attributes. If the file is a symbolic link, then the lstat operation returns information about the link itself and the stat operation returns information about the target of the link. The array elements are listed in Table 9-3. All the element values are decimal strings, except for type, which can have the values returned by the type option. These are based on the UNIX stat system call. Use the file attributes command described later to get other platform-specific attributes.
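As a small sketch of how the array is used, the following procedure formats a few of the elements that file stat fills in. The procedure name FileSummary and the layout of the result are made up for illustration; type, size, and mtime are element names set by file stat.

```tcl
# FileSummary is a hypothetical helper that formats a few stat fields.
proc FileSummary {path} {
    # st is a local array; file stat creates and fills it
    file stat $path st
    return "$st(type), $st(size) bytes, modified [clock format $st(mtime)]"
}
```

The result is a single line of text that begins with the file type and the size in bytes.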
Example 9-3 uses the device (dev) and inode (ino) attributes of a file to determine if two pathnames reference the same file.
proc fileeq { path1 path2 } {
    file stat $path1 stat1
    file stat $path2 stat2
    expr {$stat1(ino) == $stat2(ino) && \
          $stat1(dev) == $stat2(dev)}
}
The file attributes operation was added in Tcl 8.0 to provide access to platform-specific attributes. The attributes operation lets you set and query attributes. The interface uses option-value pairs. With no options, all the current values are returned.
file attributes book.doc
=> -creator FRAM -hidden 0 -readonly 0 -type MAKR
These Macintosh attributes are explained in Table 9-4. The 4-character type codes used on Macintosh are illustrated on page 432. With a single option, just that value is returned:
file attributes book.doc -readonly
=> 0
The attributes are modified by specifying one or more option-value pairs. Setting attributes can raise an error if you do not have the right permissions:
file attributes book.doc -readonly 1 -hidden 0
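Because the option names differ from one platform to the next, a script that only wants to display the attributes can walk the option-value pairs without naming any of them. The following is a sketch under that assumption; the procedure name ShowAttributes is hypothetical.

```tcl
# ShowAttributes is a hypothetical helper. The set of option names
# differs by platform, so nothing here depends on a particular one.
proc ShowAttributes {path} {
    set result {}
    # With no options, file attributes returns option-value pairs
    foreach {option value} [file attributes $path] {
        lappend result "$option = $value"
    }
    return [join $result \n]
}
```

Calling ShowAttributes on a file returns one line per attribute, whatever attributes the platform happens to support.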
Opening Files for I/O
The open command sets up an I/O channel to either a file or a pipeline of processes. The return value of open is an identifier for the I/O channel. Store the result of open in a variable and use the variable as you used the stdout, stdin, and stderr identifiers in the examples so far. The basic syntax is:
open what ?access? ?permissions?
The what argument is either a file name or a pipeline specification similar to that used by the exec command. The access argument can take two forms, either a short character sequence that is compatible with the fopen library routine, or a list of POSIX access flags. Table 9-6 summarizes the first form, while Table 9-7 summarizes the POSIX flags. If access is not specified, it defaults to read. The permissions argument is a value used for the permission bits on a newly created file. The default permission bits are 0666, which grant read/write access to everybody. Example 9-4 specifies 0600 so that the file is only readable and writable by the owner. Remember to specify the leading zero to get an octal number as used in the chmod documentation. Consult the manual page on the UNIX chmod command for more details about permission bits.
set fileId [open /tmp/foo w 0600]
puts $fileId "Hello, foo!"
close $fileId
set fileId [open /tmp/bar {RDWR CREAT}]
Catch errors from open.
In general you should check for errors when opening files. The following example illustrates a catch phrase used to open files. Recall that catch returns 1 if it catches an error, otherwise it returns zero. It treats its second argument as the name of a variable. In the error case it puts the error message into the variable. In the normal case it puts the result of the command into the variable:
if [catch {open /tmp/data r} fileId] {
    puts stderr "Cannot open /tmp/data: $fileId"
} else {
    # Read and process the file, then...
    close $fileId
}
set input [open "|sort /etc/passwd" r]
set contents [split [read $input] \n]
close $input
You can open a pipeline for both read and write by specifying the r+ access mode. In this case you need to worry about buffering. After a puts, the data may still be in a buffer in the Tcl library. Use the flush command to force the data out to the spawned processes before you try to read any output from the pipeline. You can also use the fconfigure command described on page 182 to force line buffering. Remember that read-write pipes will not work at all with Windows 3.1 because pipes are simulated with files. On UNIX, the expect extension, which is described in Exploring Expect (Libes, O'Reilly & Associates, Inc., 1995), provides a much more powerful way to interact with other programs.
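The need for flush can be seen in a minimal sketch of a read-write pipeline. It assumes a UNIX-like system and uses the cat program as a stand-in for a real filter, because cat simply copies each line back.

```tcl
# Assumes a UNIX-like system where cat copies its input to its output
set pipe [open "|cat" r+]
puts $pipe "hello"
flush $pipe            ;# push the line out of Tcl's buffer to cat
set reply [gets $pipe] ;# now it is safe to wait for the echoed line
close $pipe
```

Without the flush, the gets could wait forever because cat would never see the line. Programs that buffer their own output when writing to a pipe can still cause delays that flush on the Tcl side cannot fix.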
Event-driven I/O is also very useful with pipes. It means you can do other processing while the pipeline executes, and just respond when the pipe generates data. This is described in Chapter 15.
Reading and Writing
The standard I/O channels are already open for you. There is a standard input channel, a standard output channel, and a standard error output channel. These channels are identified by stdin, stdout, and stderr, respectively. Other I/O channels are returned by the open command, and by the socket command described on page 186.
There may be cases when the standard I/O channels are not available. Windows has no standard error channel. Some UNIX window managers close the standard I/O channels when you start programs from window manager menus. You can also close the standard I/O channels with close.
The puts and gets Commands
The puts command writes a string and a newline to the output channel. There are a couple of details about the puts command that we have not yet used. It takes a -nonewline argument that suppresses the newline character normally appended to the output. This is used in the prompt example below. The second feature is that the channel identifier is optional, defaulting to stdout if not specified.
puts -nonewline "Enter value: "
flush stdout ;# Necessary in Tcl 7.5 and Tcl 7.6
set answer [gets stdin]
The gets command reads a line of input, and it has two forms. In the previous example, with just a single argument, gets returns the line read from the specified I/O channel. It discards the trailing newline from the return value. If end of file is reached, an empty string is returned. You must use the eof command to tell the difference between a blank line and end-of-file. eof returns 1 if there is end of file. Given a second varName argument, gets stores the line into a named variable and returns the number of bytes read. It discards the trailing newline, which is not counted. A -1 is returned if the channel has reached the end of file.
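The single-argument form needs the eof check described above. The following sketch shows one way to write that loop so that blank lines are kept and the end of file still stops it. The procedure name ReadLines is hypothetical.

```tcl
# ReadLines is a hypothetical helper that collects the lines of a channel.
proc ReadLines {channel} {
    set lines {}
    while {1} {
        set line [gets $channel]
        # An empty result at end of file means there was no more data;
        # an empty result otherwise is a genuine blank line.
        if {[eof $channel] && [string length $line] == 0} {
            break
        }
        lappend lines $line
    }
    return $lines
}
```

The two-argument form of gets avoids the extra test, which is why it is usually preferred for loops.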
while {[gets $channel line] >= 0} {
    # Process line
}
close $channel
foreach line [split [read $channel] \n] {
    # Process line
}
close $channel
For moderate-sized files it is about 10 percent faster to loop over the lines in a file using the read loop in the second example. In this case, read returns the whole file, and split chops the file into list elements, one for each line. For small files (less than 1K) it doesn't really matter. For large files (megabytes) you might induce paging with this approach.
During output, text lines are generated in the platform-native format. The automatic handling of line formats means that it is easy to convert a file to native format. You just need to read it in and write it out:
puts -nonewline $out [read $in]
To suppress conversions, use the fconfigure command, which is described in more detail on page 183.
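As a sketch of suppressing the conversions, the following procedure sets both channels to binary translation before copying, so the line endings in the source file are preserved exactly. The procedure name CopyExact is hypothetical.

```tcl
# CopyExact is a hypothetical helper that copies a file byte for byte.
proc CopyExact {src dest} {
    set in [open $src]
    set out [open $dest w]
    # Turn off end-of-line translation on both channels
    fconfigure $in -translation binary
    fconfigure $out -translation binary
    puts -nonewline $out [read $in]
    close $out
    close $in
}
```

This is the opposite of the native-format conversion shown above: a file with carriage return and linefeed line endings keeps them no matter which platform runs the script.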
Example 9-10 demonstrates a File_Copy procedure that translates files to native format. It is complicated because it handles directories:
proc File_Copy {src dest} {
    if [file isdirectory $src] {
        file mkdir $dest
        foreach f [glob -nocomplain [file join $src *]] {
            File_Copy $f [file join $dest [file tail $f]]
        }
        return
    }
    if [file isdirectory $dest] {
        set dest [file join $dest [file tail $src]]
    }
    set in [open $src]
    set out [open $dest w]
    puts -nonewline $out [read $in]
    close $out ; close $in
}
The close command can raise an error.
If the channel was a process pipeline and any of the processes wrote to their standard error channel, then Tcl treats this as an error. The error is raised when the channel to the pipeline is finally closed. Similarly, if any of the processes in the pipeline exit with a non-zero status, close raises an error.
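The following sketch shows the error surfacing at close rather than at read. It assumes a UNIX-like system with the sh program, and the child command is contrived so that it writes only to standard error.

```tcl
# Assumes a UNIX-like system with sh; the child writes to stderr
set pipe [open "|[list sh -c {echo oops 1>&2}]" r]
set output [read $pipe]      ;# reading succeeds; there is no output
if {[catch {close $pipe} err]} {
    # err holds what the child wrote to standard error
    set problem $err
} else {
    set problem ""
}
```

Wrapping close in catch lets a script decide for itself whether text on standard error really indicates a failure.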
Matching File Names with glob
The glob command expands a pattern into the set of matching file names. The general form of the glob command is:
glob ?flags? pattern ?pattern? ...
The pattern syntax is similar to the string match patterns.
The -- flag must be used if the pattern begins with a -.
Unlike the glob matching in csh, the Tcl glob command only matches the names of existing files. In csh, the {a,b} construct can match non-existent names. In addition, the results of glob are not sorted. Use the lsort command to sort the result if the order matters.
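A sorted listing might be sketched as follows. The procedure name SortedGlob is hypothetical. Note that any glob special characters in the directory name are also treated as pattern characters.

```tcl
# SortedGlob is a hypothetical helper that returns matching names in
# sorted order, or an empty list if nothing matches.
proc SortedGlob {dir pattern} {
    return [lsort [glob -nocomplain -- [file join $dir $pattern]]]
}
```

For example, SortedGlob /tmp *.txt returns the names of the .txt files in /tmp in ASCII order, which is the default for lsort.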
proc FindFile { startDir namePat } {
    set pwd [pwd]
    if [catch {cd $startDir} err] {
        puts stderr $err
        return
    }
    foreach match [glob -nocomplain -- $namePat] {
        puts stdout [file join $startDir $match]
    }
    foreach file [glob -nocomplain *] {
        if [file isdirectory $file] {
            FindFile [file join $startDir $file] $namePat
        }
    }
    cd $pwd
}
The FindFile procedure traverses the file system hierarchy using recursion. At each level it saves its current directory and then attempts to change to the next subdirectory. A catch guards against bogus names. The glob command matches file names.
The pid command returns the process ID of the current process. This can be useful as the seed for a random number generator because it changes each time you run your script. It is also common to embed the process ID in the name of temporary files.
You can also find out the process IDs associated with a process pipeline with pid:
set pipe [open "|command"]
set pids [pid $pipe]
There is no built-in mechanism to control processes in Tcl. On UNIX systems you can exec the kill program to terminate a process:
exec kill $pid
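Putting this together with the background form of exec, a script can start a program, remember its process ID, and terminate it later. This is only a sketch, and it assumes a UNIX-like system with the sleep and kill programs.

```tcl
# Assumes a UNIX-like system with sleep and kill programs
set pid [exec sleep 30 &]   ;# exec returns the process ID at once
exec kill $pid              ;# ask the background process to terminate
```

The variable named pid does not conflict with the pid command because Tcl keeps variable names and command names separate.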
proc printenv { args } {
    global env
    set maxl 0
    if {[llength $args] == 0} {
        set args [lsort [array names env]]
    }
    foreach x $args {
        if {[string length $x] > $maxl} {
            set maxl [string length $x]
        }
    }
    incr maxl 2
    foreach x $args {
        puts stdout [format "%*s = %s" $maxl $x $env($x)]
    }
}
printenv USER SHELL TERM
=>
   USER = welch
  SHELL = /bin/csh
   TERM = tx
Note: Environment variables can be initialized for Macintosh applications by editing a resource of type STR# whose name is "Tcl Environment Variables". This resource is part of the tclsh and wish applications. Follow the directions on page 26 for using ResEdit. The format of the resource values is NAME=VALUE.
welch@acm.org Copyright © 1997, Brent Welch. All rights reserved.