%==============================================================================%
%                              Start of Ch3.tex                                %
%==============================================================================%
%
% Copyright
% ---------
% Copyright (C) 1992 Ross N. Williams.
% This file contains a chapter of the FunnelWeb User's Manual.
% See the main TeX file for this manual for further information.
%
%==============================================================================%

\chapter{FunnelWeb Definition}
\label{chapdefinition}\xx{FunnelWeb}{definition}

\section{Introduction}

The purpose of this chapter is to provide a complete and consistent definition of the FunnelWeb input language and the behaviour of the FunnelWeb program. Usually, a chapter such as this is called a \dq{reference manual}, but this chapter is intended to go further by actually defining the language and program. This chapter takes precedence over all other chapters and all implementations of FunnelWeb. If an implementation contradicts this chapter, then the implementation is wrong.

This is the chapter that you should turn to if you find yourself asking a specific question about a specific aspect of FunnelWeb. In many cases it will be convenient to access this chapter through the index.

\section{Notation}
\x{notation}

A particular variant of EBNF\xx{EBNF}{syntax} (Extended Backus-Naur Form) will be used to describe the FunnelWeb syntax. In this variant, literal strings are delimited by double quotes (\eg{}\p{"string"}), optional constructs by square brackets (\eg{}\p{[optional]}), and constructs repeated zero or more times by braces (\eg{}\p{\{zeroormore\}}). Constructs to be repeated a fixed number of times are enclosed in braces followed by a decimal number indicating the number of times to be repeated (\eg{}\p{\{sixtimes\}6}). Constructs to be repeated one or more times are enclosed in braces and followed by a \p{+} (\eg{}\p{\{oneormore\}+}). The traditional BNF \dqp{::=} is replaced by the visually simpler \dqp{=}.
The traditional BNF angle brackets are abandoned. Although FunnelWeb allows the special character to be changed using the construct \dq{$<$special$>$=}, use of \dq{$<$special$>$} to refer to FunnelWeb's special character is cumbersome and abstract. To simplify the presentation, the default special character \dqp{@} is used throughout this chapter to represent the special character. \section{Terminology} \x{terminology} A specific terminology has arisen for dealing with FunnelWeb. Some particularly useful examples are: \narrowthing{Journal file:}{An output file containing a copy of the output sent to the user's console during an invocation of FunnelWeb. In other systems, this file is sometimes called a \dq{log file}.} \narrowthing{Product file:}{An output file, generated by the Tangle component of FunnelWeb, that contains the expansion of the macros in the input file.\footnote{Other names considered for this were: generated file, expanded file, result file, program file, and tangle file.}} A complete list of all the special FunnelWeb terminology appears in the glossary. Be sure to refer to it if any of the terms used are unclear. \section{An Architectural Overview} \xx{semantic}{architecture}\xx{FunnelWeb}{overview} An understanding of the internals of FunnelWeb assists with understanding its operation (\figphases{}).\xx{execution}{phases} During a single run, FunnelWeb reads and processes a single input file called the \newterm{input file} or the \newterm{FunnelWeb file}. The file is processed by passing it through a series of stages called \newterm{phases}. The result is that some \newterm{output files} are generated. A \newterm{journal file} is generated containing a copy of the messages that appear on the console during the FunnelWeb run. A \newterm{listing file} is created containing a summary of the run, including any error messages. 
A \newterm{documentation file} is generated containing typesetter commands that, when fed into a typesetter program, will result in printed documentation. Finally, one or more \newterm{product files} are generated containing the result of unscrambling the macro definitions of the input file. These files need not all be generated on any particular FunnelWeb run. Whether each output file appears is controlled by command line options.

\begin{figure}[htbp]
\begin{verbatim}
         .fw Input File (FunnelWeb file)
                        V
                   +---------+              \
                   | Scanner |               |
                   +---------+               |
                        V                    |
                    +--------+               |
                    | Parser |               |
                    +--------+               |
                        V                    |
                  +----------+                >-------+------------+
                  | Analyser |               |        |            |
                  +----------+               |        |            |
                        V                    |        |            |
            +-----------+------------+       |        V            V
            V                        V       |        |            |
        +--------+               +-------+   |        |            |
        | Tangle |               | Weave |   |        |            |
        +--------+               +-------+  /         |            |
            |                        |                |            |
            V                        V                V            V
      Product Files         Documentation File  Listing File Journal File
\end{verbatim}
\mylabel{\figphases{}: FunnelWeb's processing phases.}{%
%
FunnelWeb processes each input file in a sequence of phases. If an error occurs during a phase, no subsequent phases are executed.
%
}
\end{figure}

The phases are briefly described below.

\narrowthing{The Scanner}{reads\x{scanner} the input file, expands and reads in include files, scans the input stream, processes pragmas and typesetter directives, and parses all the FunnelWeb special sequences. The result is a list of tokens that is handed to the parser.}

\narrowthing{The Parser}{reads\x{parser} the scanner's token list and parses it, constructing a document list\xx{document}{list} and a macro table,\xx{macro}{table} which are passed to later phases.}

\narrowthing{The Analyser}{examines\x{analyser} the macro table generated by the parser and performs a number of checks of the macro structures that the parser could not make on its single pass. For example, the analyser detects and flags unused macros and recursive macros.
The analyser forms the final stage of FunnelWeb's front-end processing.} \narrowthing{Tangle}{expands certain macros in the macro table to generate\x{tangle} one or more product files.} \narrowthing{Weave}{uses the document list to generate\x{weave} a documentation file.} A single run through these phases constitutes a single invocation of \newterm{FunnelWeb proper}. Most invocations of the \newterm{FunnelWeb program} will consist only of a single execution of FunnelWeb proper. However, FunnelWeb also provides a command shell that provides many useful commands, including a command to invoke FunnelWeb proper. Discussion of the command shell is deferred until Section~\ref{commandshell}. \section{Diagnostics} \label{diagnostics}\x{diagnostics} During execution, FunnelWeb proceeds cautiously with each of its phases, only proceeding with the next phase if the previous phase has been successful. This means that, when debugging a FunnelWeb file, you may find that the number of errors \i{increases} after you fix some of them, as you will be exposing yourself to the next FunnelWeb phase. FunnelWeb employs five levels of diagnostics\xx{diagnostics}{levels of} at different levels of severity.\x{severity} Severity is defined in terms of the level of activity at which the diagnostic causes FunnelWeb to abort. \narrowthing{Warning:}{A warning\xx{warning}{severity} does not cause FunnelWeb to terminate or curtail its operation in any way, but serves merely to warn the user of particular conditions that might be symptomatic of deeper problems.} \narrowthing{Error:}{An\xx{error}{severity} error causes FunnelWeb to terminate processing of the current input file at the end of the current phase. 
For example, if an error occurs during scanning, FunnelWeb will continue scanning (and possibly generate further scanning diagnostics), but will not invoke the parser.} \narrowthing{Severe Error:}{A\xx{severe}{severity} severe error (or \dq{severe} for short) is the same as an error except that FunnelWeb terminates the current phase immediately.} \narrowthing{Fatal Error:}{A\xx{fatal}{severity} fatal error causes FunnelWeb not only to terminate the current phase and run immediately, but also to terminate total FunnelWeb processing immediately. A severe error will not cause a FunnelWeb script to terminate, but a fatal error will. A fatal error causes FunnelWeb to return control to the operating system.} \narrowthing{Assertion Error:}{An\xx{assertion}{severity} assertion error occurs if FunnelWeb detects an internal inconsistency, in which case FunnelWeb terminates immediately and ungracefully. Such an error can occur only if there are bugs in FunnelWeb. With luck, such errors will be extremely rare.} FunnelWeb indicates the level of severity of each diagnostic that it issues by starting each diagnostic either with the full name of the severity level or with just the first letter of the severity level followed by a colon. FunnelWeb conveys the presence or absence of diagnostics at the operating system level by returning \p{EXIT\_SUCCESS}\xx{return}{status} status if no diagnostics occurred during the run and \p{EXIT\_FAILURE} status if one or more diagnostics (including warnings) occurred during the run.\footnote{From the symbols of the ANSI standard C library \p{stdlib.h}. See \paper{Kernighan88}, p.252.} \section{Typesetter Independence} \xx{typesetter}{independence} One of the design goals of FunnelWeb was to provide a \i{target-language} independent literate programming system. This goal has been achieved simply by treating the text written to the product file as homogeneous and typesetting it in \p{tt font}. 
A secondary goal was to provide a \i{typesetter} independent literate programming system. By this is meant that it be possible to create FunnelWeb input files that do not contain typesetter-specific commands. To a lesser extent this goal has also been achieved. The difficulty with providing typesetter-independent typesetting is that each desired typesetting feature must be recreated in a typesetter-independent FunnelWeb typesetting construct that FunnelWeb can translate into whatever typesetting language is being targeted by Weave. Taken to the extreme, this would result in FunnelWeb providing the full syntactic and semantic power of \TeX{}, but with a more generic, FunnelWeb-specific syntax. This was unfeasible in the time available, and undesirable as well. The compromise struck in the FunnelWeb design is to provide a set of primitive typesetter-independent typesetting features that are implemented by FunnelWeb. These are the \newterm{typesetter directives}. If the user is prepared to use only these directives, then the user's FunnelWeb document will be both target-language and typesetter independent. However, if the user wishes to use the more sophisticated features of the target typesetting system, the user can specify the typesetter in a \dqp{typesetter} pragma and then place typesetter commands in the free text of the FunnelWeb document where they will be passed verbatim to the documentation file. The choice of the trade-off between typesetter independence and typesetting power is left to the user. This said, experience with FunnelWeb V1 over a three-year period indicates that the typesetting facilities provided by FunnelWeb are sufficient for most documentation. \section{Command Line Interface} \xx{command line}{interface} \subsection{Invoking FunnelWeb} \xx{FunnelWeb}{invoking}\xx{FunnelWeb}{running} When a user invokes FunnelWeb at the operating system command level, the user must provide a command line instructing FunnelWeb what to do.
Typically an operating system command line consists of a \i{verb} indicating that a particular program should be run, followed by a list of options. For example: \begin{verbatim} $ rename file1 file2 \end{verbatim} In this case, the verb is \p{rename} and the command line options are \p{file1 file2}. The entire command line begins with the \p{\$} and ends with the \p{2}. Operating systems differ greatly in the depth with which they process their command\xx{command line}{processing} lines, ranging from systems that simply pass the entire command line string to the invoked program (\eg{}MSDOS) through to systems that perform complete command line parsing (\eg{}VMS). Syntax conventions vary considerably. So as to achieve maximum portability and consistency of invocation across different platforms, FunnelWeb reads its command line as a raw string and performs all its own parsing.\xx{command line}{parsing} This is portable because, at the very least, all operating systems allow invoked programs access to the raw command line. The command verb used to invoke FunnelWeb should be \dqp{fw}.\xx{fw}{command verb} \begin{verbatim} FunnelWeb_verb = "fw" \end{verbatim} If this verb is not available, some alternatives are \dqp{funweb}, \dqp{fun}, and \dqp{funnelweb}. The verbs \p{web} or \p{fweb} should be avoided as they are the names of other literate programming systems. \subsection{Command Line Arguments} \xx{command line}{argument}\xx{syntax}{command line} Following the verb is the body of the command line which FunnelWeb parses into zero or more \newterm{arguments} separated by runs of one or more blanks. 
\begin{verbatim} FunnelWeb_command_line = FunnelWeb_verb {{" "}+ argument} \end{verbatim} Because some operating systems convert their command line to upper case before handing it to the invoked program, FunnelWeb has been constructed so as to be \i{insensitive} to the case of its command line arguments.\xx{case}{dependence} However, when dealing internally with arguments, FunnelWeb \i{preserves} the case of its command line arguments so that it will be able to operate with operating systems (such as Unix\x{Unix}) whose file names are case dependent. A valid FunnelWeb argument consists of a \newterm{sign}, an identifying \newterm{letter}, and an optional \newterm{string} with no spaces separating them.\xx{command line options}{syntax}\xx{options}{syntax} \begin{verbatim} argument = sign id_letter [non_blank_string] sign = "+" | "-" | "=" id_letter = "B" | "C" | "D" | "F" | "H" |"I" | "J" | "K" | "L" | "O" | "Q" | "S" | "T" | "W" | "X" \end{verbatim} In addition there is a special form of argument that does not begin with a sign. \begin{verbatim} argument = non_blank_string_not_beginning_with_+_=_or_- \end{verbatim} This form is exactly equivalent to the same string with \dqp{+F} prepended to it. The semantic effect of these arguments is defined in terms of \newterm{options} which are the internal parameters of FunnelWeb and which correspond closely with the set of legal command line arguments. FunnelWeb has a predefined set of options each identified by an identifying letter having two attributes: a \i{string}, and a \i{boolean}. The boolean determines whether an option is turned \i{on} or \i{off}. The string contains additional information depending on the option. When FunnelWeb starts up, its options have predefined default values. FunnelWeb then parses its command line sequentially from left to right executing the effect of each argument on the argument's corresponding option. The sign and the string components of the argument are processed \i{independently}. 
A sign of \p{+} turns the option on. A sign of \p{-} turns the option off. A sign of \p{=} leaves the option's boolean attribute unchanged. The argument string replaces the string of the corresponding option, unless the argument string is empty, in which case the option string is not changed. Because FunnelWeb processes its command line arguments from left to right, a later argument can cancel the effect of an earlier one. For example \p{fw +t -t} will result in the \p{t} option ending up \i{off}. This allows users to set up their own default arguments by defining a symbol in their operating system's command language. For example, a Unix user who wants FunnelWeb to delete all identical output files and create a documentation file on each run with a default \p{.typ} extension could simply place the following definition in their \dqp{.login} file. \begin{verbatim} alias fw fw +d +t.typ \end{verbatim} These default options can then later be easily overridden on the command line. \subsection{Options} \label{commandlineoptions}\x{options}\xx{list}{options} FunnelWeb's options are internal parameters which can be modified by corresponding arguments on FunnelWeb's command line. A description of each argument and option follows. \narrowthing{B1$\ldots$B6: Tracedumps:}{These\xx{B}{option} six\xx{tracedump}{options}\xx{dump}{option} options have been provided to assist in the debugging and testing of FunnelWeb. They determine which of six possible trace dumps are to be written to the listing file. Only the boolean attributes of these options are ever used. The six dumps are identified by the digits \p{1..6} as follows:\xx{dump}{mapped file} \xx{dump}{global line list}\xx{dump}{token list} \xx{dump}{macro table}\xx{dump}{document list}\xx{dump}{times}} \begin{enumerate} \item Dump a hexdump of each mapped input and include file. \item Dump the global line list created by the scanner. \item Dump the token list created by the scanner. 
\item Dump the macro table created by the parser. \item Dump the document list created by the parser. \item Dump a table summarizing CPU and real time usage. \end{enumerate} \narrowtext{Because these options are so closely related, a hack has been pulled to enable them all to be controlled by the \p{B} argument. The string argument to the \p{B} argument determines which of the six options are to be affected by the sign. Examples: \p{+B134} turns on options \p{B1}, \p{B3}, and \p{B4}. \p{-B1} turns off option \p{B1}. \p{Default:~-B123456}.} \narrowthing{B7: Determinism:}{If the \p{B7} option is turned on, FunnelWeb suppresses the output of anything non-deterministic\x{non-determinism} or machine dependent. This assists in regression testing. Only the boolean attribute is used in this option. This option is controlled by the \p{B7} argument which falls under the same argument syntax as the other \p{B} options. Examples: \p{+B7}, \p{-B7}. \p{Default:~-B7}.} \narrowthing{C: Listing File Context:}{The\xx{C}{option} \p{C}\xx{listing file}{context} option is always turned on and cannot be turned off. Its only attribute is a number which determines the number of lines of context that the lister will place around lines flagged with diagnostics in the listing file (if a listing file is written). A value of 100 indicates infinite context\xx{infinite}{context} which means that the entire listing file will be written out if a single diagnostic occurs. The value of this number can be set by supplying it as a string of decimal digits to the \p{+C} argument. Examples: \p{+C100}, \p{+C10}. \p{Default:~+C2}.} \narrowthing{D: Delete Identical Output Files:}{Only the boolean attribute of this option is used.\xx{D}{option}\xx{delete output}{option} When turned on, the option causes the suppression (deletion) of product files and documentation files (but not listing or journal files) that are identical to the currently existing files of the same name.
For example, if FunnelWeb is instructed to generate \p{stack.h} as a product file, and the text to be written to \p{stack.h} is identical to the currently existing \p{stack.h}, then FunnelWeb will simply not write any product file, leaving the currently existing \p{stack.h} as it is (and in particular leaving the file's date attribute the same). This prevents unnecessary \p{make} propagations. For example, in a C program, if \p{stack.fw} is a FunnelWeb input file that generates \p{stack.h} and \p{stack.c}, a modification to \p{stack.fw} that affects \p{stack.c} but does not affect \p{stack.h} will not provoke the recompilation of modules that \p{\#include stack.h}, so long as the intervening FunnelWeb run has \p{+D} set. Examples: \p{-D}, \p{+D}. \p{Default:~-D}.} \narrowthing{F: FunnelWeb Input File:}{If\xx{F}{option} this option is turned on, FunnelWeb\xx{input file}{option} processes the input file whose name is specified by the option string. Examples: \p{+Fsloth.fw}, \p{+Fwalrus}, \p{-F}. \p{Default:~-F}.} \narrowthing{H: Display Help Message:}{If\xx{H}{option} this option\xx{help}{option} is turned on, FunnelWeb displays the message specified by the argument string. Each message has a name. The main help message is called \dqp{menu} and contains a list of the other help messages. Examples: \p{+Hregistration}, \p{+Hoptions}. \p{Default:~-Hmenu}.} \narrowthing{I: Include default file specification:}{This\xx{I}{option} option\xx{include file}{option}\xx{include}{file} is always turned on and cannot be turned off. Its string attribute is used as the default file specification for include files. Usually this option is used to specify a directory from which include files should be obtained. Examples: \p{=I/usr/dave/includes/}. \p{Default:~+I}.} \narrowthing{J: Journal File:}{If\xx{J}{option} this option is turned on, FunnelWeb generates a\xx{journal file}{option}\xx{journal}{file} journal file.
A journal file contains a log of all the console input and output to FunnelWeb during a single invocation of the FunnelWeb program (note that the \p{Q} option does not affect this). The journal file is particularly useful for examining what happened during a FunnelWeb shell run. The string attribute is the name of the journal file. Examples: \p{+Jjournfile}, \p{-J}. \p{Default:~-J}.} \narrowthing{K: Keyboard:}{If\xx{K}{option} this option\xx{keyboard}{option}\xx{interactive}{option} is turned on, FunnelWeb enters an interactive mode in which the user can enter FunnelWeb shell commands interactively. The string attribute is unused. Examples: \p{+K}, \p{-K}. \p{Default:~-K}.} \narrowthing{L: Listing File:}{If\xx{L}{option} this option is turned on, FunnelWeb generates a\xx{listing file}{option}\xx{listing}{file} listing file containing a summary of a run of FunnelWeb proper. The string argument is the name of the listing file to be created. Examples: \p{+L}, \p{-L}, \p{+Llisting.lis}. \p{Default:~-L}.} \narrowthing{O: Product Files:}{If this option is turned on, FunnelWeb generates a product file for each macro in the input file that is bound to an output file. The string attribute contributes to the name of the product files. (This option is controlled by the \p{O} argument because product files used to be called \dq{\b{O}utput files}.) Examples: \p{-O}, \p{+O/usr/dave/product/}. \p{Default:~+O}.} \narrowthing{Q: Quiet:}{If\xx{Q}{option} this option is turned on, FunnelWeb suppresses all\xx{quiet}{option}\xx{suppress}{console output} output to the screen (standard output) unless one or more errors occur, in which case a single line summarizing the errors is sent to standard output at the end of the run. If this option is turned off, FunnelWeb writes to the console in its normal garrulous way. The string attribute is unused in this option. Examples: \p{-Q}, \p{+Q}.
\p{Default:~-Q}.} \narrowthing{S: Screen:}{If\xx{S}{option} this option is turned on, FunnelWeb writes all\xx{screen}{option} diagnostics to the screen (standard output) as well as to the listing file. By default, they are sent only to the listing file. This option has a single numerical attribute that can be specified as a decimal string in the string component of the \p{S} argument. The number is the number of lines of context that should surround each diagnostic sent to the screen.\x{context} Examples: \p{-S}, \p{+S6}, \p{+S0}. \p{Default:~-S}.} \narrowthing{T: Documentation file:}{If\xx{T}{option} this option\footnote{This option is controlled by the \p{T} command line argument because documentation files used to be called typesetter files.} is turned on, FunnelWeb\xx{typeset}{option}\xx{typeset}{file} generates a documentation file in \TeX{} format. The string argument contributes to the name of the documentation file to be created. By default this option is turned off, as experience has shown that most FunnelWeb runs are made during program development; documentation runs occur far more rarely. Examples: \p{-T}, \p{+Tsloth.tex}. \p{Default:~-T}.} \narrowthing{W: Width of Product Files:}{If\xx{W}{option} this option\xx{width}{option}\xx{product file}{width} is turned on, a limit is placed on the length of lines in product files generated during the run. Lines that breach the limit are flagged with error messages. This option has a single numerical attribute that can be specified as a decimal string in the string component of the \p{W} argument. The number is the specified maximum width. This option is one of two limits that are placed on the width of product files. The other limit is an attribute of the input file that defaults to 80 characters, but can be raised or lowered using an output line length pragma. The width that is enforced is the lower of this value and the value of the \p{W} option (if turned on). Examples: \p{-W}, \p{+W100}. 
\p{Default:~-W80}.} \narrowthing{X: Execute:}{If\xx{X}{option} this option\xx{execute script}{option} is turned on, FunnelWeb executes the FunnelWeb shell script file specified by the string attribute. Examples: \p{+Xmaster}, \p{-X}. \p{Default:~-X}.}

\section{File Name Inheritance}
\xx{filename}{inheritance}

During a single run of FunnelWeb, FunnelWeb can produce many different output files. As it would be very tedious to have to specify the name of each of these files explicitly each time FunnelWeb is run, FunnelWeb provides a system of defaults that allows the user to specify the minimum required to successfully complete the run. To do this FunnelWeb allows file specifications to inherit fields from one another. FunnelWeb structures filenames into three fields\xx{filename}{fields} which are inherited independently. The fields are: \newterm{directory}, \newterm{name}, and \newterm{extension}. On systems having other fields (\eg{}\i{network node}, \i{device name}), the extra fields are considered to be part of the directory field. Version numbers are ignored. A field can inherit a value if its current value is the empty string. The following table gives the full inheritance scheme used in FunnelWeb.

\begin{center}
\begin{tabular}{|l|l|l|l|l|l|l|}
\hline
Script     & Input     & Include    & Journal    & List       & Document   & Product \\
\hline
           &           & \p{@i}     &            &            &            & \p{@o}  \\
\p{+x}     & \p{+f}    & \p{+i}     & \p{+j}     & \p{+l}     & \p{+t}     & \p{+o}  \\
\dqp{.fws} & \dqp{.fw} & \dqp{.fwi} & \dqp{.jrn} & \dqp{.lis} & \dqp{.tex} &         \\
           &           & \p{+f}     & \p{+f}     & \p{+f}     & \p{+f}     &         \\
Defdir     & Defdir    & Defdir     & Defdir     & Defdir     & Defdir     & Defdir  \\
\hline
\end{tabular}
\end{center}

The table is arranged with items of highest priority at the top. The \dqp{+} cells refer to the file specification supplied in the given command line argument. \dqp{+F} is the name of the input file. \dq{Defdir} refers to the default directory specification provided by the operating system. Empty cells do not contribute.
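
The inheritance rule can be illustrated with a short Python sketch. It is not taken from the FunnelWeb sources: the names \p{FileSpec}, \p{split\_spec}, and \p{inherit} are hypothetical, Unix-style file specifications are assumed, and a dot in the last path component is assumed to start the extension field. A column of the table is represented as a list of file specifications in priority order, and each field that is still empty takes its value from the first entry that supplies one.

```python
from typing import Iterable, NamedTuple


class FileSpec(NamedTuple):
    """The three independently inherited fields of a file specification."""
    directory: str   # includes the trailing "/", e.g. "/home/dave/src/"
    name: str        # e.g. "stack"
    extension: str   # includes the leading ".", e.g. ".fw"


def split_spec(spec: str) -> FileSpec:
    """Split a Unix-style specification into directory, name and extension.

    Assumption: everything up to the last "/" is the directory field, and
    the last "." in the remainder starts the extension field.
    """
    slash = spec.rfind("/")
    directory, rest = spec[:slash + 1], spec[slash + 1:]
    dot = rest.rfind(".")
    if dot == -1:
        return FileSpec(directory, rest, "")
    return FileSpec(directory, rest[:dot], rest[dot:])


def inherit(column: Iterable[str]) -> str:
    """Work down one column of the table, highest priority first.

    A field inherits a value only while its current value is empty.
    Empty cells are passed as "" and therefore contribute nothing.
    """
    directory = name = extension = ""
    for spec in column:
        d, n, e = split_spec(spec)
        directory = directory or d
        name = name or n
        extension = extension or e
    return directory + name + extension


if __name__ == "__main__":
    # Journal column for a hypothetical run in which +j was given with an
    # empty string, the input file is /home/dave/src/stack.fw, and the
    # default directory is /home/dave/.
    journal_column = ["", ".jrn", "/home/dave/src/stack.fw", "/home/dave/"]
    print(inherit(journal_column))  # prints /home/dave/src/stack.jrn
```

Because the three fields are tracked separately, an entry lower in the column can still supply a directory after higher entries have fixed the name and extension, which is the behaviour the table describes.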
The following example shows how the table is used. Suppose that the user invoked FunnelWeb as follows:\xx{filename inheritance}{example} \begin{verbatim} fw /usr/ross/work/sloth.fw +twalrus \end{verbatim} To work out what the documentation file should be called, FunnelWeb starts with the empty string and then works down the Document column of the table. The top entry is empty so we ignore it and proceed to the second entry which consists of \dqp{+T}. The user specified the string \dqp{walrus} as the value of this option, and as our current (empty) string does not have a name field, we insert the string \dqp{walrus} into the name field, resulting in the string \dqp{walrus}. Moving down to the next row, we encounter the constant string \dqp{.tex}. This string consists of an empty directory and name field, but a \dqp{.tex} file extension. As our current string, \dqp{walrus}, does not already have a file extension (\ie{}the file extension field of our current string is empty), we add in \dqp{.tex}, resulting in the string \dqp{walrus.tex}. Next we encounter the \dqp{+F} field which is the input filename \dqp{/usr/ross/work/sloth.fw} consisting of a directory field \dqp{/usr/ross/work/}, a name field \dqp{sloth}, and a file extension field \dqp{.fw}. Our \dqp{walrus.tex} string already has name and file extension fields, but its directory field is empty, and so we add in the directory field from the input file specification, resulting in the string \dqp{/usr/ross/work/walrus.tex}. Finally, we hit the default directory specification, which is (say) \dqp{/usr/ross/play/}. However, as the directory field of our walrus string is already full, it has no effect. In general, there is no need to remember the exact details of FunnelWeb's filename inheritance. The important thing is to know that it exists, and to use it. \section{FunnelWeb Startup} \xx{FunnelWeb}{startup}\xx{FunnelWeb}{initialization} FunnelWeb's command line options can be divided into two groups.
\newterm{Action options} instruct FunnelWeb to perform some sort of independent action such as processing a file. \newterm{Ordinary options} merely modify the way in which FunnelWeb executes the actions. The four action options are: \p{+F}, \p{+K}, \p{+X}, and \p{+H}. For FunnelWeb to be successfully invoked, at least one action option must be specified. If no action options are specified, FunnelWeb terminates with failure status. If more than one action option is specified, FunnelWeb performs the specified actions in a predefined order. Assuming that the user has specified at least one action, the order in which actions are executed is as follows:\xx{action execution}{order} \narrowthing{Initialization script:}{FunnelWeb\xx{initialization}{script} starts by looking in the current directory for a file called \dqp{fwinit.fws}.\x{fwinit.fws} If it doesn't find one, it doesn't raise any error. If it does find one, it executes it as a FunnelWeb shellscript. Initialization scripts are useful for setting up FunnelWeb options (\eg{}using the \dqp{set} command) without having to type them each time.} \narrowthing{Execute argument script:}{If a shellscript has been specified using the \dqp{+X} option, FunnelWeb executes it.} \narrowthing{Process input file:}{If the user has specified an input file using the \dqp{+F} option, then this is processed next (by FunnelWeb proper).} \narrowthing{Display help message:}{If the user requested, using the \dqp{+H} option, that a help message be displayed, the message is displayed at this time.} \narrowthing{Interactive mode:}{If the user specified the \dqp{+K} option, FunnelWeb enters interactive (keyboard) mode.} FunnelWeb processes these actions in the above order regardless of the order in which they appear on the command line. It may be hard to see how some of these actions might be combined. Nevertheless, FunnelWeb allows this.
For example, a user might wish to process a batch of files as specified in a script (\dqp{+Xscript.fws}), be reminded of the interactive commands available (\dqp{+Hcommand}), and then enter interactive mode so as to be able to reprocess files for which FunnelWeb reported errors (after correcting the errors in a different workstation window). \section{Scanner} \x{scanner} The scanner reads in the input file and produces a list of tokens which it hands onto the parser. In addition, some input constructs may cause the scanner to modify some of FunnelWeb's options. \subsection{Basic Input File Processing} In order to read in an input file or include file, the scanner calls a submodule called the \newterm{mapper} that reads a file in and creates a contiguous copy of it in memory. The scanner then performs three checks on the file, the first (file termination) of which is performed before scanning commences, and the other two of which take place during scanning before each line is scanned. \narrowthing{File Termination:}{The\xx{file}{termination} first check the scanner makes\xx{line}{termination} is whether the file is terminated properly. A file is considered to be properly terminated if it either contains no lines, or if the last line in the file is terminated by an end-of-line marker. If the scanner detects that an input file is not properly terminated, it adds an end-of-line marker itself (to the copy in memory only).} \narrowthing{Unprintable Characters:}{The second check the scanner makes is for\xx{unprintable}{characters} unprintable characters (ASCII 0--31 and 127--255 (except for EOL(10))) which it flags as errors and replaces by question marks.} \narrowthing{Line Lengths:}{The third check the scanner makes is input line length.\xx{line}{length} When FunnelWeb starts up, a default maximum input line length of 80 is set. This can be changed dynamically during scanning using a \p{@p maximum\_input\_line\_length} pragma. 
If the number of characters on a line (not including the end of line marker) exceeds this limit, FunnelWeb generates an error.\checked{}} \subsection{Special Sequences} \xx{special}{sequences} The scanner scans the input file from top to bottom, left to right, treating the input as ordinary text (to be handed directly to the parser as a text token) unless it encounters the \newterm{special character}\footnote{This sort of character is often referred to as the \dq{escape character} or the \dq{control character} in other systems. However, as there is great potential to confuse these names with the \dq{escape} character (ASCII 27) and ASCII \dq{control} characters, the term \dq{special} has been chosen instead. This results in the terms \i{special character} and \i{special sequence}.} which introduces a \newterm{special sequence}. Thus, the scanner partitions the input file into ordinary text and special sequences. \begin{verbatim} input_file = {ordinary_text | special_sequence} \end{verbatim} Upon startup, the special character\xx{default}{special character} is \p{@}, but it can be changed using the $<$special$>$\p{=}$<$new\_special$>$ special sequence. Rather than using $<$special$>$ whenever the special character appears, this document uses the default special character \dqp{@} to represent the current special character. More importantly, FunnelWeb's error messages all use the default special character in their error messages even if the special character has been changed. An occurrence of the special character in the input file introduces a special sequence. The kind of special sequence is determined by the character following the special character. Only printable characters can follow the special character. The following list gives all the possible characters that can follow the special character, and the legality of each sequence. The first column gives the ASCII number of each ASCII character. The second column gives the special sequence for that character. 
The next column contains one of three characters: \dqp{-} means that the sequence is illegal. \dqp{S} indicates that the sequence is a \newterm{simple sequence} (with no attributes or side effects) that appears exactly as shown and is converted directly into a token and fed to the parser. Finally, \dqp{C} indicates that the special sequence is complex, possibly having a following syntax or producing funny side effects.

\begin{verbatim}
ASC SEQ   COMMENT
-----------------
000 \
016  | Unprintable characters and hence illegal specials.
031 /
032 @  - Illegal (space).
033 @! C Comment.
034 @" S Parameter delimiter.
035 @# C Short name sequence.
036 @$ S Start of macro definition.
037 @% - Illegal.
038 @& - Illegal.
039 @' - Illegal.
040 @( S Open parameter list.
041 @) S Close parameter list.
042 @* - Illegal.
043 @+ C Insert newline.
044 @, S Parameter separator.
045 @- C Suppress end of line marker.
046 @. - Illegal.
047 @/ S Open or close emphasised text.
048 @0 - Illegal.
049 @1 S Formal parameter 1.
050 @2 S Formal parameter 2.
051 @3 S Formal parameter 3.
052 @4 S Formal parameter 4.
053 @5 S Formal parameter 5.
054 @6 S Formal parameter 6.
055 @7 S Formal parameter 7.
056 @8 S Formal parameter 8.
057 @9 S Formal parameter 9.
058 @: - Illegal.
059 @; - Illegal.
060 @< S Open macro name.
061 @= C Set special character.
062 @> S Close macro name.
063 @? - Illegal. Reserved for future use.
064 @@ C Insert special character into text.
065 @A S New section (level 1).
066 @B S New section (level 2).
067 @C S New section (level 3).
068 @D S New section (level 4).
069 @E S New section (level 5).
070 @F - Illegal.
071 @G - Illegal.
072 @H - Illegal.
073 @I C Include file.
074 @J - Illegal.
075 @K - Illegal.
076 @L - Illegal.
077 @M S Tags macro as being allowed to be called many times.
078 @N - Illegal.
079 @O S New macro attached to product file. Has to be at start of line.
080 @P C Pragma.
081 @Q - Illegal.
082 @R - Illegal.
083 @S - Illegal.
084 @T C Typesetter directive.
085 @U - Illegal.
086 @V - Illegal.
087 @W - Illegal.
088 @X - Illegal.
089 @Y - Illegal.
090 @Z S Tags macro as being allowed to be called zero times.
091 @[ - Illegal. Reserved for future use.
092 @\ - Illegal.
093 @] - Illegal. Reserved for future use.
094 @^ C Insert control character into text.
095 @_ - Illegal.
096 @` - Illegal.
097 @a \
109 @m  | Identical to @A..@Z.
122 @z /
123 @{ S Open macro body/Open literal directive.
124 @| - Illegal.
125 @} S Close macro body/Close literal directive.
126 @~ - Illegal.
127 to 255 are not standard printable ASCII characters and are illegal.
\end{verbatim}

The most important thing to remember about the scanner is that \i{nothing happens unless the special character is seen.} There are no funny sequences that will cause strange things to happen. The best way to view a FunnelWeb document at the scanner level is as a body of text punctuated by special sequences that serve to structure the text at a higher level.

The remaining description of the scanner consists of a detailed description of the effect of each complex special sequence.

\subsection{Setting the Special Character}
\xx{setting}{special character}

The special character can be set using the sequence $<$special$>$\p{=}$<$newspecialchar$>$. For example, \p{@=\#} would change the special character to a hash (\p{\#}) character. The special character may be set to any printable ASCII character except the blank character (\ie{}any character in the ASCII range $[33,126]$).

In normal use, it should not be necessary to change the special character of FunnelWeb, and it is probably best to avoid changing the special character so as not to confuse FunnelWeb readers conditioned to the \p{@} character. However, the feature is very useful where the text being prepared contains many \p{@} characters (\eg{}a list of internet electronic mail addresses).
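
As a hedged illustration of the mail address case, the following sketch makes the hash the special character for the duration of a short list of addresses and then restores the default. The addresses are invented for this example, and the fragment is assumed to appear in free text.
\begin{verbatim}
@! Mail addresses contain "@", so make "#" special for a while.
@=#
alice@example.org
bob@example.org
#! The hash is now special, so this comment starts with "#!".
#=@
@! The default special character is in force again.
\end{verbatim}
While the hash is the special character, each \p{@} in the addresses is ordinary text and does not need to be doubled. Any \p{\#} in that region would, however, introduce a special sequence.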
\subsection{Inserting the Special Character into the Text} \xx{special character}{inserting into text} The special sequence $<$special$>$\p{@} inserts the special character into the text as if it were not special at all. The \p{@} of this sequence has nothing to do with the current special character. If the current special character is \p{P} then the sequence \p{P@} will insert a \p{P} into the text. Example: \p{@@\#@=\#@\#@\#=@@@} translates to \p{@\#@\#@}. \subsection{Inserting Arbitrary Characters into the Text} \xx{arbitrary characters}{inserting into text} \xx{control characters}{inserting into text}\x{@circumflex} While FunnelWeb does not tolerate unprintable characters in the input file (except for the end of line character), it does allow the user to specify that unprintable characters appear in the product file. The \p{@\circumflex{}} sequence inserts a single character of the user's choosing into the text. The character can be specified by giving its ASCII number in one of four bases: binary, octal, decimal, and hexadecimal. Here is the syntax: \begin{verbatim} control_sequence = "@^" char_spec char_spec = binary | octal | decimal | hexadecimal binary = ("b" | "B") "(" {binary_digit}8 ")" octal = ("o" | "O" | "q" | "Q") "(" {octal_digit}3 ")" decimal = ("d" | "D") "(" {decimal_digit}3 ")" hexadecimal = ("h" | "H" | "x" | "X") "(" {hex_digit}2 ")" binary_digit = "0" | "1" octal_digit = binary_digit | "2" | "3" | "4" | "5" | "6" | "7" decimal_digit = octal_digit | "8" | "9" hex_digit = decimal_digit | "A" | "B" | "C" | "D" | "E" | "F" | "a" | "b" | "c" | "d" | "e" | "f" \end{verbatim} Example: \begin{verbatim} @! Unix Make requires that productions commence with tab characters. @^D(009)prog.o <- prog.c \end{verbatim} Note that the decimal \dqp{9} is expressed with leading zeros as \dqp{009}. FunnelWeb requires a fixed number of digits for each base. 
The required counts are eight digits for base two, three digits for base eight, three digits for base ten, and two digits for base sixteen.

FunnelWeb treats the character resulting from a \p{@\circumflex{}} sequence as ordinary text in every sense. If your input file contains many instances of a particular control character, you can package it up in a macro like any other text. In particular, quick names can be used to great effect:
\begin{verbatim}
@! Unix "Make" requires that productions commence with tab characters.
@! So we define a macro with a quick name as a tab character.
@$@#T@{@^D(009)@}
@! And use it in our productions.
@#Tprog.o <- prog.c
@#Ta.out <- prog.o
\end{verbatim}

Warning: If you insert a Unix\x{Unix newline} newline character (decimal 10) into the text, FunnelWeb will treat this as an end of line sequence regardless of what the character sequence for end of line is on the machine upon which it is running. Unix EOL is FunnelWeb's internal representation for end of line. Thus, in the current version of FunnelWeb, inserting character 10 into the text is impossible unless this also happens to be the character used by the operating system to mark the end of line.

\subsection{Comments}
\xx{comments}{FunnelWeb}

When FunnelWeb encounters the \p{@!}\x{@!} sequence during its left-to-right scan of the line, it throws away the rest of the line (including the EOL) without analysing it further. Comments can appear in any line except \dqp{@i}, \dqp{@t}, and \dqp{@p} lines. FunnelWeb comments can be used to insert comments into your input file that will neither appear in the product files nor in the documentation file, but will be solely for the benefit of those reading and editing the input file directly. Example:
\begin{verbatim}
@! I have used a quick macro for this definition as it will be used often.
@$@#C@{--@}
\end{verbatim}

Because comments are defined to include the end-of-line marker, care must be taken when they are being added or removed within the text of macro bodies.
For example the text fragment \begin{verbatim} for (i=0;i}). A quick name sequence consists of \p{@\#}$x$\x{@hash} where $x$, the name of the macro, can be any printable character except space.
\begin{verbatim}
quick_name = "@#" non_space_printable
\end{verbatim}
The result is identical to the equivalent ordinary name syntax, but is shorter. For example, \p{@\#X} is equivalent to \p{@<X@>}. This shorter way of writing one-character macro names is more convenient where a macro must be used very often. For example, the macro calls in the following fragment of an Ada program are a little clumsy.
\begin{verbatim}
@! Define @ as "" to turn on debug code and "--" to turn it off.
@$@@{--@}
@assert(b>3);
@if x>7 then write("error") end if
\end{verbatim}
The calls can be shortened using the alternative syntax.
\begin{verbatim}
@! Define @#| as "" to turn on debug code and "--" to turn it off.
@$@#|@{--@}
@#|assert(b>3);
@#|if x>7 then write("error") end if
\end{verbatim}

\subsection{Inserting End of Line Markers}
\xx{EOL markers}{inserting}

An end of line marker/character can be inserted into the text using the \p{@+}\x{@+} sequence. This is exactly equivalent to a real end of line in the text at the point where it occurs. While this feature may sound rather useless, it is very useful for laying out the input file. For example, the following input data for a database program
\begin{verbatim}
Animal = Kangaroo
Size = Medium
Speed = Fast

Animal = Sloth
Size = Medium
Speed = Slow

Animal = Walrus
Size = Big
Speed = Medium

\end{verbatim}
can be converted into
\begin{verbatim}
Animal = Kangaroo @+Size = Medium @+Speed = Fast   @+
Animal = Sloth    @+Size = Medium @+Speed = Slow   @+
Animal = Walrus   @+Size = Big    @+Speed = Medium @+
\end{verbatim}
which is easier to read, and more easily allows comparisons between records.

\subsection{Suppressing End of Line Markers}
\xx{EOL markers}{suppressing}

End of line markers can be suppressed by the \p{@-}\x{@-} sequence.
A single occurrence of a \p{@-} sequence serves to suppress only the end of line marker following it and must appear \i{exactly} before the end of line marker to be suppressed. No trailing spaces, \p{@!} comments, or any other characters are permitted between a \p{@-} sequence and the end of line that it is supposed to suppress. The \p{@-} sequence is useful for constructing long output lines without them having to appear in the input. It can also be used in the same way as the \p{@+} was used in the previous section to assist in exposing the structure of output text without affecting the output text itself. Finally, it is invaluable for suppressing the EOL after the opening macro text \p{@\{} construct. For example:
\begin{verbatim}
@$@@{@-
I am the walrus!@}
\end{verbatim}
is equivalent to
\begin{verbatim}
@$@@{I am the walrus!@}
\end{verbatim}

The comment construct (\p{@!}) can also be used to suppress end of lines. However, the \p{@-} construct should be preferred for this purpose as it makes explicit the programmer's intent to suppress the end of line.

\subsection{Include Files}
\xx{include}{files}

FunnelWeb provides an include file facility with a maximum depth of 10. When FunnelWeb sees a line of the form \p{@i }$<$filename$>$,\x{@i} it replaces the entire line (including the EOL) with the contents of the specified include file. FunnelWeb's include file facility is intended to operate at the line level. If the last line of the include file is not terminated by an EOL, FunnelWeb issues a warning and inserts one (in the copy in memory).

The \p{@i} construct is illegal if it appears anywhere except at the start of a line. The construct must be followed by a single blank. The file name is defined to be everything between the blank and the end of the line (no comments (\p{@!}) please!).

Example: If the input file is
\begin{verbatim}
"Uh Oh, It's the Fuzz. We're busted!" said Baby Bear.
@i mr_plod.txt
"Quick! Flush the stash down the dunny and let's split." said Father Bear.
\end{verbatim}
and there is a file called \p{mr\_plod.txt} containing
\begin{verbatim}
"'Ello, 'Ello, 'Ello! What's all this 'ere then?" Mr Plod exclaimed.
\end{verbatim}
then the scanner translates the input file into
\begin{verbatim}
"Uh Oh, It's the Fuzz. We're busted!" said Baby Bear.
"'Ello, 'Ello, 'Ello! What's all this 'ere then?" Mr Plod exclaimed.
"Quick! Flush the stash down the dunny and let's split." said Father Bear.
\end{verbatim}

As a point of terminology, FunnelWeb calls the original input file the \newterm{input file} and calls include files and their included files \newterm{include files}.

The include file construct operates at a very low level. An include line can appear anywhere in the input file regardless of the context of the surrounding lines.

FunnelWeb sets the special character to the default (\p{@}) at the start of each include file and restores it to its previous value at the end of the include file. This allows macro libraries to be constructed and included that are independent of the prevailing special character at the point of inclusion. The same goes for the input line length limit which is reset to the default value at the start of each include file and restored to its previous value afterwards.

\subsection{Pragmas}
\x{pragmas}\xx{pragmas}{visible}\xx{pragmas}{invisible}

Most tools have to support some essential, but rather inelegant features. In FunnelWeb these messy bits have all been stuffed into the scanner's \newterm{pragma} (for \i{pragma}tic) construct.

A pragma consists of a single line of input (including the EOL) commencing with \p{@p}. This must be followed by a single space, and then the pragma verb. This must be followed by a sequence of zero or more arguments separated by one or more spaces. Four pragmas are available:
\begin{verbatim}
pragma = pragma_ident | pragma_mill | pragma_moll | pragma_typesetter
\end{verbatim}
The following syntax definitions assist in defining the pragmas.
\begin{verbatim}
s        = {" "}+
ps       = ("@p" | "@P") " "
number   = { decimal_digit }+
numorinf = number | "infinity"
\end{verbatim}

The arguments to pragmas are case-sensitive and must be specified in lower case.

Pragmas are processed and consumed entirely by the scanner. The parser never sees them and so they can play no part in the parser level syntax. As a result, pragma lines can appear anywhere in the entire input file regardless of the surrounding context (\eg{}even in the middle of a macro definition). The sole effect of a pragma is to modify some internal parameter of FunnelWeb.

The following sections describe the four FunnelWeb pragmas.

\subsubsection{Indentation}
\label{indentationpragma}\xx{indentation}{macro expansion}

When FunnelWeb expands a macro, it can do so in two ways. First it can treat the text it is processing as a one-dimensional stream of text, and merely insert the body of the macro in place of the macro call. Second, it can treat the text of the macro as a two dimensional object and indent each line of the macro body by the amount that the macro call itself was indented. Consider the following macros.
\begin{verbatim}
@$@@{@-
i=1;
while (i<=N)
   @
endwhile
@}

@$@@{@-
a[i]:=0;
i:=i+1;@}
\end{verbatim}
Under the regime of \newterm{no indentation}\xx{indentation}{none} the loop structure macro expands to:
\begin{verbatim}
i=1;
while (i<=N)
   a[i]:=0;
i:=i+1;
endwhile
\end{verbatim}
Under the regime of \newterm{blank indentation}\xx{indentation}{blank} the loop structure macro expands to:
\begin{verbatim}
i=1;
while (i<=N)
   a[i]:=0;
   i:=i+1;
endwhile
\end{verbatim}
The \p{indentation} pragma determines which of these two regimes will be used to expand the macros when constructing the product files.
The syntax of the pragma is:
\begin{verbatim}
pragma_ident = ps "indentation" s "=" s ("blank" | "none")
\end{verbatim}
Its two forms look like this:
\begin{verbatim}
@p indentation = blank
@p indentation = none
\end{verbatim}

In the current version of FunnelWeb, the indentation regime is an attribute that is attached to an entire run of Tangle; it is not possible to bind it to particular product files or to particular macros. As a result, it doesn't matter where indentation pragmas occur in the input file or how many there are so long as they are all the same.

By default FunnelWeb uses blank indentation.

\subsubsection{Maximum Input Line Length}
\label{millpragma}\xx{input}{line length}\xx{maximum}{input line length}

FunnelWeb generates an error for each input line that exceeds a certain maximum number of characters. At the start of the processing of each input file and each include file, this maximum is set to a default value of 80. However, the maximum can be changed using a maximum input line length pragma.\xx{pragma}{input line length}
\begin{verbatim}
pragma_mill = ps "maximum_input_line_length" s "=" s numorinf
\end{verbatim}

The maximum input line length can be varied \i{dynamically} throughout the input file. Each maximum input line length pragma's scope covers the line following the pragma through to and including the next maximum input line length pragma, but not covering any intervening include files. At the start of an include file, FunnelWeb resets the maximum input line length to the default value. It restores it to its previous value at the end of the include file.

This pragma is useful for detecting text that has strayed off the right side of the screen when editing. If you use FunnelWeb, and set the maximum input line length to be the width of your editing window, you will never be caught by, for example, off-screen opening comment symbols. You can also be sure that your source text can be printed raw, if necessary, without lines wrapping around.
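
The scoping rule for this pragma can be illustrated with a short sketch. The limits used here (132 and \p{infinity}) are arbitrary values chosen for illustration only.
\begin{verbatim}
@! Lines up to and including the next pragma line are checked
@! against the default limit of 80 characters.
@p maximum_input_line_length = 132
@! From here on, lines of up to 132 characters are accepted.
@p maximum_input_line_length = infinity
@! From here on, no input line is considered too long.
@p maximum_input_line_length = 80
@! The default limit applies again.
\end{verbatim}
Note that each pragma line is itself still checked against the limit that was in force before it, and that an include file appearing anywhere in this fragment would start with the default limit of 80 regardless of the surrounding pragmas.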
\subsubsection{Maximum Output File Line Length}
\label{mollpragma}%
\xx{maximum}{product file line length}%
\xx{pragma}{maximum product file line length}%
\xx{maximum}{output file line length}%
\xx{pragma}{maximum output file line length}

As well as keeping an eye on input line lengths, FunnelWeb also keeps an eye on the line lengths of product files and flags all lines longer than a certain limit with error messages. Unlike the maximum input line length, which can vary dynamically throughout the input file, the maximum product file line length remains fixed throughout the generation of all the product files. The maximum product file line length pragma allows this value to be set. If there is more than one such pragma in an input file, the pragmas must all specify the same value.
\begin{verbatim}
pragma_moll = ps "maximum_output_line_length" s "=" s numorinf
\end{verbatim}
The default value is 80 characters.

This pragma is only one of two constraints on the length of the lines of the product files. The \p{+W} command line option also contributes. The actual value that FunnelWeb uses is the minimum of the limits specified in the command line and pragmas.

FunnelWeb does not monitor the length of the lines of its other output files (journal file, listing file, documentation file).

\subsubsection{Typesetter}
\xx{typesetter}{pragma}\xx{typesetter}{independence}

The \p{typesetter} pragma allows the user to specify whether the input file is supposed to be typesetter-independent, or whether it contains commands in a particular typesetter language. The pragma has the following syntax.
\begin{verbatim}
pragma_typesetter = ps "typesetter" s "=" s ("none" | "tex")
\end{verbatim}
The two forms of the pragma look like this.
\begin{verbatim}
@p typesetter = none
@p typesetter = tex
\end{verbatim}
A source file can contain more than one typesetter pragma, but they must all specify the same value. The default is \p{none}.
The typesetter setting affects two things:

\narrowthing{Handling of free text:}{If the typesetter is not \p{none}, Weave writes the free text \i{directly} to the documentation file without changing it whatsoever. This means that if (say) \p{\char`\\ centerline} appears in the input file, it will be copied directly to the documentation file. If the typesetter is \p{none}, Weave intercepts any characters or sequences that might have a special meaning to the target typesetter and replaces them with typesetter commands to typeset the sequences so that they will appear as they do in the input. For example, if the typesetter is \p{none} and the target typesetter is \TeX{}, then if \p{\$} (the \TeX{} \dq{mathematics mode} character) appears in the input file, it will be written to the documentation file as \p{\char`\\ \$}.}

\narrowthing{Restrictions on the target typesetter:}{At a later date, different weave modules might be incorporated into FunnelWeb to cater for a variety of different typesetters. If this happens, it will be important to ensure that typesetter-specific source files (\ie{}\p{typesetter} $\ne$ \p{none}) are not processed with different target typesetters. For example, a user might innocently attempt to generate a \p{troff} documentation file from a FunnelWeb source file containing a \p{typesetter = tex} pragma (and by implication \TeX{} control sequences). The pragma could also be useful for catching typesetter clashes in source and include files. The setting \p{none} is special because it is guaranteed to work with any future target typesetter.}

The aim of all this is to ensure that any typesetter dependency is correctly proclaimed. Because \p{none} is the default typesetter, a user who creates a source file without a \p{typesetter = x} pragma will soon find that the control sequences they are inserting into the source document are appearing verbatim in the printed documentation!
In order to activate these sequences, they will be forced to add a \p{typesetter} pragma, thus making the dependency explicit.

It may seem strange to place the \p{typesetter} setting facility within a pragma (\p{@p}) when there is a separate typesetting construct (\p{@t}). This has been done to sustain the rule of thumb that says that pragmas do not participate in the parser-level syntax, but typesetter directives do.

\subsection{Freestanding Typesetter Directives}
\xx{typesetter}{directives}

FunnelWeb provides two kinds of typesetter directive to assist the user to produce documentation. These are \newterm{inline} and \newterm{freestanding}. Unlike pragmas, each of these categories of directive participates in the parser-level syntax and can appear only in certain contexts (see the parser section). Inline directives are designed to be used within paragraphs to alter the look of the enclosed text. Freestanding typesetter directives are designed to appear on lines of their own and have a bigger typographical impact.

The syntax of freestanding typesetter directives is almost identical to that of pragmas. All the same syntax rules apply (except that the actual keywords are different). The following subsections describe the four typesetter directives available.
\begin{verbatim}
ftd = ftd_newpage | ftd_toc | ftd_vskip | ftd_title
ts  = "@t "
\end{verbatim}

\subsubsection{New Page}
\x{new page}\xx{pragma}{new page}

The new page pragma is a typesetting pragma with the following syntax.
\begin{verbatim}
ftd_newpage = ts "new_page"
\end{verbatim}
Its only form looks like this.
\begin{verbatim}
@t new_page
\end{verbatim}
Its sole effect is to cause a \dq{skip to a new page} command to be inserted into the documentation file. The new page command is such that if the typesetter is already at the top of a page, it will skip to the top of the next page.
\subsubsection{Table of Contents}
\xx{table of}{contents}\xx{pragma}{table of contents}

The table of contents pragma is a typesetting pragma with the following syntax.
\begin{verbatim}
ftd_toc = ts "table_of_contents"
\end{verbatim}
Its only form looks like this.
\begin{verbatim}
@t table_of_contents
\end{verbatim}
Its sole effect is to instruct Weave to insert a table of contents at this point in the printed documentation. This pragma does not skip to the top of a new page first.

\subsubsection{Vertical Skip}
\xx{vertical}{skip}\xx{pragma}{vskip}

The vertical skip pragma is a typesetting pragma that instructs Weave to insert a specified amount of vertical space into the documentation. The pragma has the following syntax.
\begin{verbatim}
ftd_vskip = ts "vskip" s number s "mm"
\end{verbatim}
For example:
\begin{verbatim}
@t vskip 26 mm
\end{verbatim}

\subsubsection{Title}
\x{title}\xx{pragma}{title}

The title pragma is a typesetting pragma with the following syntax.
\begin{verbatim}
ftd_title = ts "title" s font s alignment text
font      = "normalfont" | "titlefont" | "smalltitlefont"
alignment = "left" | "centre" | "right"
text      = """" {printable_char} """"
\end{verbatim}
Its effect is to instruct Weave to insert a single line into the printed documentation containing the specified text set in the specified font and aligned in the specified manner. The double quotes delimiting the text are for show only; if you want to put a double quote in the string, you don't need to double it.

Here is an example of the pragma.
\begin{verbatim}
@t title smalltitlefont centre "How to Flip a Bit"
\end{verbatim}

\subsection{Scanner/Parser Interface}

If the scanner terminates without any errors, control is passed to the parser. The parser parses the token list generated by the scanner. The token list consists of text scraps, freestanding typesetter directives, and special sequence tokens.
The user should bear in mind that \i{the scanner finishes running before the parser starts running.} This means that the scanner cannot be influenced in any way by higher order structures such as the parser might parse. For example, it is impossible to write a FunnelWeb macro to include a file, or insert a \p{vskip} pragma into the input text. \section{Parser} \x{parser} By the time the parser starts, the scanner has completely terminated. At this point, it is not possible for any more files to be included, and special characters are no longer present to confuse things. All that remains is a list of \newterm{text tokens}, \newterm{special tokens}, and \newterm{typesetter directive tokens}. Text tokens consist entirely of sequences of printable characters and end of line markers. Special tokens represent the special sequences that the scanner found in the input file. Typesetter directive tokens represent the freestanding typesetter directives that the scanner encountered. The parser consumes the token list and builds a macro table that is later used to generate product files. It also constructs a document list that is used to generate the documentation file. The syntax rules appearing in the following sections refer to the token list. \subsection{High Level Structure} \xx{syntax}{high level} At the highest level, the FunnelWeb parser parses the input file (token list) into a sequence of text scraps, macro definitions, and typesetter directives. \begin{verbatim} input_file = {text | macro | directive} \end{verbatim} All three of these kinds of components contribute to the documentation file, but only macro definitions contribute to the product files. If all the free text and directives were removed from a FunnelWeb input file, the product files would not be affected. \subsection{Free Text} \xx{free}{text} \newterm{Free text} is any text that is not part of a macro definition or a directive. 
A scrap of free text consists of a sequence of items drawn from the following list: non-special printable characters, insert-eol special sequences, insert special character special sequences, and insert arbitrary character special sequences.
\begin{verbatim}
free_text     = ordinary_text
ordinary_text = {ordinary_char | eol | text_special}+
text_special  = "@+" | "@@" | "@^" char_spec
ordinary_char = " ".."~"-special
\end{verbatim}
An example of some rather messy free text is as follows:
\begin{verbatim}
This@@ is a very@+ messy @^D(009)chunk of text indeed.
But FunnelWeb still views it as a single chunk of text.
\end{verbatim}
FunnelWeb never sees two text chunks next to each other in the input; they are always merged into a single text token.

The free text in an input file does not affect the product files. However, by default, it appears in the printed documentation exactly as it is given in the input file, except that it is filled and justified into paragraphs. Any printable character or particular sequence of characters may appear in the free text of a document. FunnelWeb ensures that they will appear exactly as given in the input file, even if they happen to be escape characters or commands in the target typesetter. However, FunnelWeb also provides a special mode that allows this censoring to be overridden. See the description of the \p{typesetter} pragma for more information.

\subsection{Typesetter Directives}
\x{directives}

FunnelWeb provides a variety of typesetter directives to assist the user to typeset the document in a typesetter-independent way. These are divided into \newterm{freestanding typesetter directives} (ftd) and \newterm{inline typesetter directives} (itd). The internal syntax of the freestanding typesetter directives has already been discussed in the scanner section. The following syntax rule defines the context in which these constructs can appear.
\begin{verbatim}
directive = ftd | itd
itd       = section | literal | emphasis
\end{verbatim}
The remainder of this section describes the inline typesetter directives.

\subsubsection{Section}
\xx{section}{constructs}

The section directive provides a way for the user to structure the program and documentation into a hierarchical tree structure,\xx{tree}{structure} just as in most large documents. A section construct consists of a case-insensitive identifying letter, which determines the absolute level of the section in the document, and an optional section name, which has exactly the same syntax as a macro name.
\begin{verbatim}
section   = "@" levelchar [name]
levelchar = "A" | "B" | "C" | "D" | "E" |
            "a" | "b" | "c" | "d" | "e"
\end{verbatim}
The section construct is not quite \dq{inline} as it must appear only at the start of a line. However, unlike the \dqp{@i}, \dqp{@p}, and \dqp{@t} constructs, it does not consume the remainder of the line (although it would be silly to place anything on the same line anyway).

FunnelWeb provides five levels of sections, ranging from the highest level of \p{A} (like a \LaTeX{}\x{LaTeX} chapter) to the lowest level of \p{E} (like a \LaTeX{} subsubsubsection). FunnelWeb input files need not contain any sections at all, but if they do, the first section must be at level \p{A}, and following sections must not skip hierarchical levels (\eg{}an \p{@D} cannot follow an \p{@B}). FunnelWeb generates an error if a level is skipped.

All sections \i{must} have names associated with them, but for convenience, the section name is optional if the section contains one or more macro definitions (\ie{}at least one macro definition appears between the section construct in question and the next section construct in the input file). In this case, the section \i{inherits} the name of the first macro defined in the section. This feature streamlines the input file, avoiding duplicate name inconsistencies.
Any sequence of printable characters can be used in the section name,\xx{section}{name} even the target typesetter's escape sequence (\eg{}in \TeX{}, \dqp{\bs{}}).

The following example demonstrates the section construct.

\begin{verbatim}
@A@

This is the main simulation module for planet earth, simulated down to the
molecular level. This is a REALLY big program. I mean really big. I mean, if
you thought the X-Windows source code was big, you're in for a shock...

@B We start by looking at the code for six legged stick insects as they form a
good example of a typical object-oriented animal implementation.

@$@@{@-
slsi.creep; slsi.crawl; slsi.creep;@}
\end{verbatim}

In the above example, the name for the level A section is provided explicitly, while the name for the level B section will be inherited from the macro name.

\subsubsection{Literal Directive}
\xx{literal}{directive}

Experience has shown that one of the most common typesetting requirements is that of being able to typeset small program fragments in the middle of the documenting free text. Typically there is a frequent need to refer to program identifiers, and it assists the reader to have such identifiers typeset in the same manner as the program text in the macro definition.

FunnelWeb~V1 defined a \TeX{} macro for this (called \p{p}) that simply typeset its argument in \p{tt font}. This proved so useful that the facility has been made typesetter-independent in FunnelWeb~V3. To specify that some text be typeset in \p{tt font}, enclose the text in curly brace special sequences as follows.

\begin{verbatim}
literal = "@{" ordinary_text "@}"
\end{verbatim}

As in macro names, section names, and macro bodies, the text contained within the literal construct is protected by FunnelWeb from any non-literal interpretation by the typesetter and the user is free to enclose \i{any} text covered by the definition \p{ordinary\_text}.
FunnelWeb guarantees that, no matter what the text is, it will be typeset in \p{tt font} exactly as it appears. However, the text will be filled and justified into a paragraph as usual.

Here is an example of the use of the construct:

\begin{verbatim}
@C The @{WOMBAT@} (Waste Of Money, Brains, And Time) function calls the
@{kangaroo@} input function which has been known to cause keybounce. This
keybounce can be dampened using the @{wet_sloth@} subsystem.
\end{verbatim}

\subsubsection{Emphasis Directive}
\xx{emphasis}{directive}

The emphasis directive is very similar to the literal directive except that it causes its argument to be typeset in an emphasised manner (\eg{}italics). Like the literal directive, the emphasis directive protects its text argument.

\begin{verbatim}
emphasis = "@/" ordinary_text "@/"
\end{verbatim}

Example:

\begin{verbatim}
@C What you @/really@/ need, of course, is a @/great@/, @/big@/, network
with packets just flying @/everywhere@/. This section implements an
interface to such a @/humungeous@/ network.
\end{verbatim}

\subsection{Macros}
\xx{macro}{definition}

The third category of construct appearing at the highest syntactic level in a FunnelWeb input file is the macro definition. A macro definition binds a unique \newterm{macro name} to a \newterm{macro body} containing an \newterm{expression} consisting of text, calls to other macros, and formal parameters. The syntax for a macro definition is as follows:

\begin{verbatim}
macro = ("@O" | "@$") name [formal_parameter_list] ["@Z"] ["@M"]
        ["==" | "+="] "@{" expression "@}"
\end{verbatim}

The complexity of the macro definition syntax is mostly to enable the user to attach various attributes to the macro.\xx{macro}{attributes} If the user chooses \p{@O}, then the macro cannot be called, but is instead attached to a product file. If the user chooses \p{@\$}, then the macro is an ordinary macro definition that is not attached to a file.
By default, a non-file macro must be invoked exactly once by one other macro. Macros that violate this rule are flagged with errors by the FunnelWeb analyser. However, if the user uses the \p{@Z}\x{@Z} sequence in the macro definition, the macro is then permitted to be invoked zero times, as well as once. Similarly, if the user uses the \p{@M}\x{@M} sequence in the macro definition, the macro is permitted to be called many times as well as once. If both \p{@Z} and \p{@M} are present then the macro is permitted to be invoked zero, one, or many times.

The purpose of enforcing the default \dq{exactly one call} rule is to flag pieces of code that the user may have defined in a macro but not hooked into the rest of the program. Experience shows that this is a common error. Similarly, it can be dangerous to multiply invoke a macro intended to be invoked only once. For example, it may be dangerous to invoke a scrap of non-idempotent initialization code in two different parts of the main function of a program! However, FunnelWeb will not generate an error if a macro without \p{@M} is called by another macro that is called more than once.

If the text string \p{==}\x{==} (or nothing) follows the macro name, the expression that follows is the entire text of the macro body. If the text string \p{+=}\x{+=} follows the macro name, then more than one such definition is allowed (but not required) in the document and the body of the macro consists of the concatenation of all such expressions in the order in which they occur in the input file. Such a macro is said to be additive and is \newterm{additively defined}. Thus a macro body can either be defined in one place using one definition (using \p{==}) or it can be \i{distributed} throughout the input file in a sequence of one or more macro definitions (using \p{+=}). If neither \p{==} nor \p{+=} is present, FunnelWeb assumes a default of \p{==}. Macros attached to product files cannot be additively defined.
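
As an informal illustration of the call-count rule and of additive definition, the following Python sketch may help. It does not reflect how the FunnelWeb analyser is implemented. The names \p{call\_count\_ok} and \p{additive\_body} are hypothetical, and \p{calls} is assumed to count the places at which the macro is called in the input file, not the number of times its text is eventually expanded.

```python
def call_count_ok(calls, zero_ok=False, many_ok=False):
    """Check a non-file macro's number of calls against its attributes.

    Hypothetical helper: zero_ok stands for the @Z attribute and many_ok
    for the @M attribute. calls is the number of call sites in the input.
    """
    if calls == 0:
        return zero_ok   # only @Z permits a macro that is never called
    if calls == 1:
        return True      # exactly one call is always acceptable
    return many_ok       # only @M permits more than one call


def additive_body(parts):
    """Concatenate the bodies of an additive (+=) macro in input order."""
    return "".join(parts)
```

For example, a macro defined with \p{@Z} alone passes with zero calls or one call, but not with two. An additive macro whose partial bodies are \p{a;} and \p{b;}, in that order, has the body \p{a;b;}.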
Additively defined macros can have parameter lists and \p{@Z} and \p{@M} attributes, but these must be specified only in the first definition of the macro. However, \p{+=} must appear in each definition. \subsubsection{Names} \x{names}\xx{macro}{names}\xx{section}{names} Names are used to identify macros and sections. A name consists of a sequence of from zero to 80 printable characters, including the blank character. End of line characters are not permitted in names. Names are case sensitive; two different macros are permitted to have names that differ in case only. Like free text, names are typeset by FunnelWeb and are safe from misinterpretation by the target typesetter. For example, it is quite acceptable to use the macro name \p{@<\bs{}medskip@>} even if the target typesetter is \TeX{}. \begin{verbatim} name = "@<" name_text "@>" name_text = {ordinary_char | text_special} \end{verbatim} \subsubsection{Formal Parameter Lists} \xx{parameter lists}{formal} FunnelWeb allows macros to have up to nine macro parameters, named \p{@1},\x{@1...} \p{@2}, $\ldots$, \p{@9}. If a macro does not have a formal parameter list, it is defined to have no parameters, and an actual parameter list must not appear at the point of call. If a macro has a formal parameter list, it is defined to have one or more parameters, and a corresponding actual parameter must be supplied for each formal parameter, at the point of call. Because FunnelWeb parameters have predictable names, the only information that a formal parameter list need convey is \i{how many} parameters a macro has. For this reason a formal parameter list takes the form of the highest numbered formal parameter desired, enclosed in parentheses sequences. \begin{verbatim} formal_parameter_list = "@(" formal_parameter "@)". 
formal_parameter = "@1" | "@2" | "@3" | "@4" | "@5" | "@6" | "@7" | "@8" | "@9" \end{verbatim} \subsection{Expressions} \xx{macro}{expressions} Expressions are FunnelWeb's most powerful form of expressing a text string. Macro bodies are defined as expressions. Actual parameters consist of expressions. An expression consists of a sequence of zero or more expression elements. An expression element can be ordinary text, a macro call, or a formal parameter of the macro \i{definition} in which the formal parameter occurs. \begin{verbatim} expression = {ordinary_text | macro_call | formal_parameter} \end{verbatim} \subsection{Macro Calls} \xx{macro}{calls} A macro call consists of a name optionally followed by an actual parameter list. The number of parameters in the actual parameter list must be the same as the number of formal parameters specified in the definition of the macro. If the macro has no formal parameter list, its call must have no actual parameter list. \begin{verbatim} macro_call = name [actual_parameter_list] actual_parameter_list = "@(" actpar { "@," actpar } "@)" actpar = expression | ( whitespace "@""" expression "@""" whitespace ) whitespace = {" " | eol} \end{verbatim} FunnelWeb allows parameters to be passed directly, or delimited by special double quotes.\xx{macro parameter}{delimiting} Each form is useful under different circumstances. Direct specification is useful where the parameters are short and can be all placed on one line. Double quoted parameters allow whitespace on either side (that is not considered part of the parameter) and are useful for laying out rather messy parameters. Here are examples of the two forms. \begin{verbatim} @@( @"x:=1;@" @, @"x<=10;@" @, @"print "x=%u, x^2=%u",x,x*x; x:=x+1;@+@" @) @@(red@,green@,blue@,yellow@) \end{verbatim} As shown, the two forms may be mixed within the same parameter list. Experience has shown that the vast majority of macros have no parameters. 
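
The difference between the two forms of actual parameter can be sketched in Python as follows. This is only an illustration of the delimiting rule. The function name \p{actual\_parameter\_text} is hypothetical, the parameter is treated as a plain string rather than as a parsed expression, and the sketch assumes that whitespace in a directly specified parameter is part of the parameter.

```python
def actual_parameter_text(raw):
    """Return the text of one actual parameter.

    Hypothetical helper: raw is the text found between the "@(", "@," and
    "@)" delimiters of a call. If the parameter is wrapped in the special
    double quotes, the surrounding blanks and end-of-line characters are
    discarded along with the quotes; otherwise the text is kept unchanged.
    """
    trimmed = raw.strip(" \n")
    if len(trimmed) >= 4 and trimmed.startswith('@"') and trimmed.endswith('@"'):
        return trimmed[2:-2]
    return raw
```

With this model, the quoted parameter \p{ @"x:=1;@" } yields \p{x:=1;}, while the direct parameter \p{red} is passed through unchanged.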
\subsection{Formal Parameters}
\xx{formal}{parameters}

Formal parameters can appear in the expressions forming macro bodies in accordance with the syntax rules defined above. A formal parameter expands to the text of the expansion of its corresponding actual parameter.

There is nothing preventing a formal parameter being provided as part of an expression that forms an actual parameter. If that happens, the formal parameter is bound to the actual parameter of the calling macro, not the called macro. After the following definitions,

\begin{verbatim}
@$@@(@1@)==@{A walrus in @1 is a walrus in vain.@}
@$@@(@1@)==@{@@(S@1n@)@}
\end{verbatim}

the call

\begin{verbatim}
@@(pai@)
\end{verbatim}

will result in the expansion

\begin{verbatim}
A walrus in Spain is a walrus in vain.
\end{verbatim}

\subsection{Macros are Static}
\xx{macro}{definition}\xx{macro}{expansion}\xx{macros}{static}

In FunnelWeb, the actions of \i{macro definition} and \i{macro expansion} occur during two separate phases (parser and tangle) and cannot be interleaved. As a result, the FunnelWeb macro facility is completely static. It is not possible for one macro to define another while the first macro is being expanded; each must be defined statically. It is not possible to define a macro to even assist in the definition of other macros. Because the scanner, parser, analyser, and tangler phases are all invoked sequentially, there is no room for feedback of definitions between different levels (\eg{}the user cannot define a macro for the \p{vskip} pragma).

This lack of power is fully intentional. By totally excluding the more incomprehensible ways in which a general purpose macro preprocessor can be used, FunnelWeb provides definite guarantees to the reader of its input files:

\begin{itemize}

\item FunnelWeb guarantees that a piece of text does not contain a macro call unless it contains the special character followed by \p{$<$} or \p{\#}.
\item FunnelWeb allows calls to be made to macros that are defined later in the input file. \end{itemize} \section{Analyser} \x{analyser}\xx{checks}{macro} The effect of the parser is to construct a macro table containing a representation of all the macros defined within the document, and a document list which contains a complete representation of the entire document. If there are no error diagnostics (or worse) at the end of the parser run, FunnelWeb invokes the analyser which tests for the following conditions and flags them with errors if they arise. \begin{itemize} \item No macros defined in the input file. \item No macros connected to output files. \item Call of an undefined macro. \item Call having the wrong number of parameters. \item Call of a macro that is connected to an output file. \item No calls made to a macro without the \p{@Z} option. \item More than one call made to a macro without the \p{@M} option. \item Directly or indirectly recursively defined macros. \item Unnamed sections that contain no macro definitions. \end{itemize} FunnelWeb performs a static analysis\xx{static}{analysis} to detect recursion.\xx{macro}{recursion} Unfortunately, the recursion detection algorithm flags all macros that have an infinite expansion rather than just all macros with a recursive definition. If A calls B, and B calls C, and C calls B, then FunnelWeb will flag A as well as B and C. It is hoped that this problem will be fixed in a later version. Because FunnelWeb does not provide any kind of conditional feature, the prevention of recursion does not represent a curtailment of expressive power. Macros may be invoked recursively, but may not be recursive. Thus: \begin{verbatim} @@(@@(Walrus@)@) @! LEGAL recursive invocation. @$@==@{@@} @! ILLEGAL recursive definition. 
\end{verbatim} \section{Tangle} \x{tangle} If the scanner, parser, and analyser have successfully (\ie{}with no errors, severe errors, or fatal errors) completed, and the Tangle option (\p{+O}) is turned on (it is by default), then the Tangle component of FunnelWeb is invoked to generate the product files specified in the \p{@O} macros of the input file. The operation of Tangle is very simple. Each \p{@O} macro is expanded and written to a file of the same name. As there are a finite number of macros, and the analyser guarantees that the macro structure is non-recursive, Tangle is guaranteed to terminate. Three remaining points are worth discussing. \begin{enumerate} \item Tangle expands macros using blank indentation unless the user has specified otherwise in an indentation pragma in the input file (see Section~\ref{indentationpragma}). \item Tangle keeps track of the length of the lines that it is writing and issues an error if any line of any product file that it generates is longer than the maximum. The maximum is the minimum of a value defaulted or specified in the input file (Section~\ref{mollpragma}), and the value (if any) provided by the \p{+w} command line argument (Section~\ref{commandlineoptions}). \item It is worth the user obtaining some understanding of the resources that FunnelWeb requires to perform its task. \end{enumerate} When FunnelWeb's scanner executes, it reads each file into memory where it is kept for the duration of the run. Thus, there must be room in memory for the entire input file, including all include files. While this approach may seem expensive in memory, it is almost necessary in order to support forward references. To merely scan the input file, recording the macro names, but leaving the text on disk, would require many random access disk seeks. In contrast, FunnelWeb never builds an internal representation of the product file. Instead, each piece of output is written immediately to the product file. 
This means that as long as the input file fits in memory, the product file can be arbitrarily large. It also means that users need not fear to define or call macros that they know will expand to megabytes of text. Nor need they fear placing a call to such a macro as part of an actual parameter. FunnelWeb does not ever expand actual parameters internally. In fact, it does not expand them until it hits the corresponding formal parameter during its expansion of the called macro. At that point, it looks up the \i{expression} (not the expansion of the expression) for the corresponding actual parameter, and starts expanding it. \section{Weave} \x{weave}\x{typesetting} If the scanner, parser, and analyser have successfully (\ie{}with no errors, severe errors, or fatal errors) completed, and the Weave option (\p{+T}) is turned on (it is \i{off} by default), then the Weave component of FunnelWeb is invoked to generate a text file in the format of a particular typesetter. The result, when fed through the particular typesetter and printed, is a fully typeset representation of the entire input file complete with cross referencing information. \subsection{Target Typesetter} \xx{target}{typesetter} Currently, FunnelWeb produces documentation files in the format of only one typesetter --- \TeX{}. However, the Weave package of FunnelWeb is fairly small, and it is hoped that it can be rewritten so as to provide a collection of typesetter modules from which the user will be able to choose using a command line argument. \subsection{Cross Reference Numbering} \xx{cross}{referencing}\xx{cross reference}{numbering} \xx{section}{numbering} When FunnelWeb produces its typeset documentation, it \i{numbers} each section and each macro definition and cross references the macro definitions. The exact scheme used has been carefully thought out. However, as it can be a little confusing to the beginner, it is explained here in full. 
The most important thing is that there is \i{no relation} between the macro numbering and the section numbering. In Knuth's Web there are only section numbers. In FunnelWeb, the numbering of sections and macros is separated. In FunnelWeb, \i{sections} are numbered hierarchically in ascending order. For example, the second level-C section of the third level-B section of the first level-A section is numbered \dq{1.3.2}. In contrast, \i{macro definitions} are numbered sequentially in ascending order. For example, the first macro definition is number 1, the second is number 2, and so on. Note that it is \i{macro definitions} that are numbered, not \i{macros}. This distinction is necessary because additive macros (\ie{}the ones with \p{+=}) can be defined by a collection of partial definitions scattered throughout the input file. A single additive macro may be defined in definitions 5, 67, 128, and 153. \section{FunnelWeb Shell} \label{commandshell}% \xx{FunnelWeb}{command shell}\xx{commands}{FunnelWeb}\xx{FunnelWeb}{shell} \subsection{Introduction} One of the goals of FunnelWeb is that it must be extremely portable. Huge efforts, desperate actions, and great sacrifices were made in the name of portability. For example, FunnelWeb is written in~C. An equally important goal was that of correctness and reliability. To this end, it was determined that a large automated suite of test programs be prepared to assist in regression testing. Preparing the test suite was tedious, but achievable. Automating it portably was more difficult. The difficulty faced was that if FunnelWeb was implemented in the form of a utility that could be invoked from the operating system command language, the only way to set up regression testing was in the command language of the operating system of the target machine (shellscripts for UNIX, DCL for VMS, batch files for MSDOS, and \i{nothing} on the Macintosh). 
The huge variation in these command languages led to the conclusion that either the automation of regression testing would have to be rewritten on each target machine, or a small command language would have to be created within FunnelWeb. In the end, the twin goals of portability and regression testing were considered so important that a small command shell was constructed inside FunnelWeb. This is called the \newterm{FunnelWeb command shell}, or just \dq{the shell} for short.

By default, when FunnelWeb is invoked, it does not enter its shell. If just given the name of an input file, it will simply process the input file in the normal manner and then terminate. To instruct FunnelWeb to invoke its shell, the \p{+K} or \p{+X} command line option must be specified when FunnelWeb is invoked from the operating system. It is also invoked upon startup if the file \p{fwinit.fws} exists.

Most FunnelWeb users will never need to use the shell and need not even know about it. There are four main uses of the shell:\xx{uses}{shell}

\begin{enumerate}

\item As a tool to support automated regression testing.

\item As a development tool on machines that do not have a built in shell (\eg{}the Macintosh). The shell can be used to process whole groups of files automatically.

\item As a convenience. A user working on a multi-tasking, multi-window workstation\x{workstation} may wish to keep an interactive session of FunnelWeb going in one window rather than having to run up the utility each time it is required.

\item As a convenient vehicle for enclosing utilities. The FunnelWeb shell contains useful general purpose commands such as the differences command \p{diff}.

\end{enumerate}

\subsection{Return Statuses}
\xx{errors}{shell}\xx{status}{success}%
\xx{status}{warning}\xx{status}{error}\xx{status}{severe}\xx{status}{fatal}%
\xx{status}{assertion}

The hierarchy of diagnostics described in Section~\ref{diagnostics} is also used in the shell commands.
Each shell command returns a status which can affect further processing. \narrowthing{Success}{status is the normal command return status.} \narrowthing{Warning}{status is returned if some minor problem arose with the execution of the command.} \narrowthing{Error}{status is returned if a significant problem arises during the execution of the command. However, unlike a severe error, it does \i{not} cause termination of the enclosing shellscript.} \narrowthing{Severe error}{status is returned if a problem arises during the execution of the command that prevents the command from delivering on its \dq{promise}. A severe error causes FunnelWeb to abort the script (and any stacked scripts) to the interactive level. (However, the \p{tolerate} command allows this to be temporarily overridden).} \narrowthing{Fatal error}{status is returned if a problem arises that is so serious that execution of FunnelWeb cannot continue. A fatal error causes FunnelWeb to abort to the operating system level.} \narrowthing{Assertion error}{status is never returned. If an assertion error occurs, FunnelWeb bombs out ungracefully to the operating system. Assertion errors should never happen. If they do, then there is a bug in FunnelWeb.} To be precise, the status returned by each command is a vector of numbers being the number of each of the different kinds of diagnostic generated by the command. Usually only one kind of diagnostic is generated. However, the \p{fw} command and a few of the other commands can generate more than one kind of diagnostic. These status vectors are summed internally where they may later be accessed using the \p{status} command. However, the current diagnostic state evaporates as soon as the next command is encountered. \subsection{Command Line Length} \xx{command}{length} The maximum length of a shell command line is guaranteed to be at least 300 characters. 
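
The status vectors described under Return Statuses can be pictured with the following Python sketch. It is an assumption-laden model and not the FunnelWeb implementation: the dictionary representation and the names \p{KINDS}, \p{add\_status}, and \p{worst\_kind} are all hypothetical. Assertion errors are left out because that status is never returned.

```python
KINDS = ("warning", "error", "severe", "fatal")  # diagnostic kinds, mildest first


def add_status(total, status):
    """Return the element-wise sum of two status vectors.

    Hypothetical representation: each vector is a dict mapping a diagnostic
    kind to the number of diagnostics of that kind; missing kinds count as 0.
    """
    return {kind: total.get(kind, 0) + status.get(kind, 0) for kind in KINDS}


def worst_kind(status):
    """Return the most serious kind with a non-zero count, or "success"."""
    for kind in reversed(KINDS):
        if status.get(kind, 0) > 0:
            return kind
    return "success"
```

In this model, summing a command that produced two warnings with one that produced one warning and one error gives a total of three warnings and one error, and the most serious kind present is \dq{error}.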
\subsection{String Substitution} \label{stringsubstitution}\xx{string}{substitution} Most command shells provide some form of string substitution so as to provide some degree of parameterization. The FunnelWeb shell provides 36 different string variables named \p{\$0..\$9} and \p{\$A..\$Z} (case insensitive). Each variable can hold a string containing any sequence of printable characters and can be as long as a command line. The \p{define} command\xx{define}{command} allows the user to assign a value to these variables. The \p{define} command takes two arguments. The first is the digit or letter of the variable to be defined. The second is a double quote delimited string being the string value to be assigned to the variable. If you want to include a double quote character within the string, you don't need to double it. Examples: \begin{verbatim} define 3 "/root/usr/usrs/users/users5/thisuser/workdir/fwdir/testdir" define M "/user/local/rubbish/bin/fw" define Q "You don't need to double" double quotes" \end{verbatim} Only the identifying character of the variable being assigned is used in the definition. This syntax is a simple way of preventing the variable from being substituted before it has a chance to be defined! The following points clean up the remaining semantic details: \begin{itemize} \item There is only one set of variables and they are global to all shellscripts. There are \i{no local variables}. \item When a shellscript is invoked using the \p{execute} command, the substitution variables \p{0} through \p{9} are affected. See Section~\ref{executecommand} for more details. \item If you want to include a dollar sign character in a command use \dqp{\$\$}. \item FunnelWeb also defines \dqp{\$/} which translates to the character that separates directory and file name fields in file names on the host machine. For example: Sun=\dqp{/}, Vax=\dqp{]}, Mac=\dqp{:}, PC=\dqp{\bs{}}. \item Substitution is not performed recursively. 
\end{itemize} \subsection{How a Command Line is Processed} \xx{command line}{processing} When FunnelWeb reads in a command line (from the console or a script file), it processes it in the following sequence: \begin{enumerate} \item The command line is checked for non-printable characters. If there are any, they are flagged with a severe error. \item All dollar string substitution variables in the command line are replaced by their corresponding string. The command line is processed from left to right. Substitutions are performed non recursively. \item At this point, if the line is empty, or consists entirely of blanks, it is ignored and the interpreter moves to the next line. \item A severe error is generated if the line at this stage begins with a blank. \item If the first character of the line is \dqp{!}, the line is a comment line and is ignored. \item The run of non-blanks commencing at the start of the line is compared case-insensitively to each of the legal command verbs. If the command is illegal, a severe error is generated, otherwise the command is processed. \end{enumerate} \subsection{Options} \xx{command}{options}\xx{default}{options} The FunnelWeb shell maintains three sets of command line options. \begin{enumerate} \item The set of options resulting from applying the operating system level command line arguments to the default option settings. \item A set of shell options that prevail during the shell invocation. \item The set of option values active during a particular invocation of FunnelWeb proper. \end{enumerate} When FunnelWeb is invoked from the operating system with just \p{+F}, only the first of these three sets comes into existence. If the user invokes the FunnelWeb shell, the shell options come into existence and are initialized with the value of the first set. These shell options are used as the default for all subsequent \p{fw} commands. However, they can be altered using the script command \p{set}. 
If a \p{fw} command executed in a shell contains additional command line options, these override the shell options for that run, but do not change the shell options. An example follows: \begin{verbatim} $ fw +k +t ! Original invocation of FunnelWeb from OS. ! Shell options are now default with "+t". FunnelWeb>fw sloth ! Equivalent to fw sloth +t. FunnelWeb>set -l ! Change the l shell option. FunnelWeb>fw sloth +q ! Equivalent to fw sloth +t -l +q. FunnelWeb>fw sloth ! Equivalent to fw sloth +t -l. \end{verbatim} The existence of the shell option set means that the user can set up a set of defaults to be applied to all \p{fw} commands issued within the shell. \subsection{Shell Commands} \xx{shell commands}{list}\xx{commands}{shell} This section describes each of the FunnelWeb shell commands. The syntax is: \begin{verbatim} shell_command = absent | codify | compare | define | diff | diffsummary | diffzero | eneo | execute | exists | fixeols | help | here | quit | set | show | skipto | status | tolerate | trace | write | writeu s = {" "}+ \end{verbatim} As a rule, FunnelWeb shell commands return severe status if their arguments are syntactically incorrect or if they are unable to successfully operate on argument files. \subsubsection{Absent} \xx{command}{absent} The \p{absent} command performs no action except to return a status. If the file specified in its argument doesn't exist it returns success status, otherwise it returns severe status. \begin{verbatim} Syntax : absent = "absent" s filename Example: absent result.out \end{verbatim} This command is useful in regression testing for making sure that FunnelWeb \i{hasn't} produced a particular output file. \subsubsection{Codify} \xx{command}{codify} The \p{codify} command takes two arguments: an input file and an output file. It reads each line of the input file and writes a corresponding line to the output file. The corresponding line consists of a C macro call containing a string containing the input line. 
The command converts all backslashes in input lines to double backslashes so as to avoid unwanted interpretations by the C compiler. It also converts double quotes in the line to backslashed double quotes. \begin{verbatim} Syntax : codify = "codify" s filename s filename Example: codify header.tex header.c \end{verbatim} The following example demonstrates the transformation. \begin{verbatim} Input Line: \def\par{\leavevmode\endgraf}% A "jolly good hack". Output Line: WX("\\def\\par{\\leavevmode\\endgraf}% A \"jolly good hack\"."); \end{verbatim} The \p{codify} command was introduced to assist in the development of FunnelWeb. It is used to convert longish text files into C code to write them out. The C code is then included within the FunnelWeb C program. For example, the set of \TeX{} definitions that appears at the top of every documentation file was \p{codif}ied and inserted into the FunnelWeb code so that FunnelWeb would not have to look for a file containing the definitions at run time. \subsubsection{Compare} \xx{command}{compare} The \p{compare} command takes two filename arguments and performs a binary comparison of the two files. If the files are identical, success status is returned. If they are different, severe status is returned. No information about the manner in which the files differ is conveyed. \begin{verbatim} Syntax : compare = "compare" s filename s filename Example: compare result.txt answer.txt \end{verbatim} The \p{compare} command was created as the main checking mechanism for regression testing. However, its binary output was soon found to be unworkable and the more sophisticated \p{diff} command was added so that the actual differences between the files could be examined. \subsubsection{Define} \xx{command}{define}\xx{string}{substitution} The \p{define} command assigns a value to a shell string substitution variable. The \p{define} command takes two arguments. The first is the digit or letter of the variable to be defined. 
The second is a double quote delimited string being the string value to be assigned to the variable. If you want to include a double quote character within the string, you don't need to double it. \begin{verbatim} Syntax : define = "define" s letter s """" text """" Examples: define 3 "/usr/usrs/thisuser/workdir/fwdir/testdir" define M "/user/local/rubbish/bin/fw" define Q "You don't need to double" double quotes" \end{verbatim} The command interpreter expands the command line before it executes the \p{define} command. This means that you can define string substitution variables in terms of each other with static binding. The \p{define} command was introduced to allow the parameterization of the directories involved in regression testing. See Section~\ref{stringsubstitution} for more details. \subsubsection{Diff} \xx{command}{diff}\xx{file}{differences} The \p{diff} command reads in two text files and \i{appends} a report to a log file containing a list of the differences between the two input files. If the log file does not already exist, an empty one is created first. \begin{verbatim} Syntax : diff = "diff" s filename s filename s filename s ["ABORT"] Examples: diff result.tex answer.tex diff.log diff $Otest23.out $Atest23.out $Ldiff.log ABORT \end{verbatim} The \p{diff} command performs a full line-based differences operation. It will identify different sections in a file, even if they are of differing length. The implementation of the \p{diff} command is quite complicated. To be sure that it is at least getting its same/different proclamation right, the \p{diff} command performs a binary comparison as an extra check. The following points describe the rules for determining the result status. \begin{enumerate} \item \p{diff} aborts with a severe error if the log file cannot be opened or created for appending. \item An ordinary error is generated if either or both of the input files cannot be opened. 

\item If, at the end of the run, the two input files have not been proven to be identical, and the \p{ABORT} keyword is present, \p{diff} returns severe status.

\item \p{diff} returns success status if none of the above conditions (or similar conditions) occur, even if the two files are different.

\end{enumerate}

The \p{diff} command \i{appends} its differences report rather than merely writing it. This allows a regression test script to perform a series of regression tests and produce a report for the user.

The \p{diff} command was added to the shell after it had become apparent that the simpler \p{compare} command was not yielding enough information. Whereas early on, regression testing was treated mainly as a tool to ensure that FunnelWeb was being ported to other machines correctly, it began to play an increasing role during development in identifying the effects of changes made to the code. The \p{diff} command supports this application of regression testing by pinpointing the differences between nearly-identical text files.

\subsubsection{Diffsummary}
\xx{command}{diffsummary}

The \p{diffsummary} command writes a short report to the console giving the number of difference operations that have taken place and how many of the pairs of files compared were identical. Counting starts at the most recent execution of a \p{diffzero} command, or if there has been none, when FunnelWeb started up.

\begin{verbatim}
   Syntax  : diffsummary = "diffsummary"
   Examples: diffsummary
\end{verbatim}

The \p{diffsummary} command was added so as to allow regression testing scripts to display a summary of the results of the test. If the summary indicates that no pair of files differed, then there is no need to look in the \p{diff} log file.

\subsubsection{Diffzero}
\xx{command}{diffzero}

The \p{diffzero} command zeros the differences summary counters used by the \p{diff} and \p{diffsummary} commands.

\begin{verbatim}
   Syntax  : diffzero = "diffzero"
   Examples: diffzero
\end{verbatim}

The \p{diffzero} command was added so as to allow regression testing shellscripts to zero their differences counters at the start of a run. This allows testers to invoke the same regression testing script twice in one interactive session without receiving an inflated differences summary.

\subsubsection{Eneo}
\xx{command}{eneo}

The \p{eneo} command takes one filename argument. If the file does not exist, no action is taken. If the file does exist, it is deleted. In both cases success status is returned. However, if the file exists and cannot be deleted, \p{eneo} returns severe status.

\begin{verbatim}
   Syntax  : eneo = "eneo" s filename
   Examples: eneo result.out
\end{verbatim}

The \p{eneo} command was added so as to allow regression testing scripts to ensure that existing output files were not present before proceeding with a test run. If FunnelWeb were to fail to generate an output file, it would be extremely undesirable for the old version to be used.

ENEO stands for \b{E}stablish the \b{N}on \b{E}xistence \b{O}f. Most operating systems provide a command to delete files. Typically these commands are verbs such as \dq{delete}, \dq{remove}, and \dq{kill}. As a consequence, the designers of delete commands usually consider the command to have failed if it fails to find the file to be deleted. However, in my experience, the most common use for the delete command is to \i{establish the non-existence of} one or more files. Typically, a script is starting up and needs to clear the air before getting started. If the files are there, they should be deleted; if they are not, then that's OK too.\footnote{As far as I know, the \p{eneo} command is original.}

\subsubsection{Execute}
\label{executecommand}\xx{command}{execute}

The \p{execute} command causes a specified text file to be executed as a FunnelWeb shellscript. The first argument is the name of the script file.
The remaining arguments are assigned to the substitution variables \p{\$1}, \p{\$2}, $\ldots$, \p{\$9}. Substitution variables in the range \p{\$1} to \p{\$9} that do not correspond to an argument are set to the empty string \p{""}. \p{\$0} is set to the empty string regardless. The execute command can be used recursively, allowing shell scripts to invoke each other. A file extension default of \dqp{.fws} (FunnelWeb Script) applies to script files.

\begin{verbatim}
   Syntax  : execute = "execute" s filename {argument_string}
   Examples: execute megatest.fws /usr/users/ross/fwtest !
             execute sloth
\end{verbatim}

The first example above will result in the following substitution variable assignments.

\begin{verbatim}
   $0 = ""
   $1 = "/usr/users/ross/fwtest"
   $2 = "!"
   $3 = ""
   ...
   $9 = ""
\end{verbatim}

It should be stressed that there are no local variables in the FunnelWeb command language; the variables above are globally modified.

The \p{execute} command was added to allow the creation of sub-scripts to test FunnelWeb in particular ways.

\subsubsection{Exists}
\xx{command}{exists}

The \p{exists} command performs no action except to return a status. If the file specified in its argument exists it returns success status, otherwise it returns severe status.

\begin{verbatim}
   Syntax : exists = "exists" s filename
   Example: exists test6.fw
\end{verbatim}

This command is useful in regression testing for ensuring that FunnelWeb has produced a particular output file.

\subsubsection{Fixeols}
\xx{command}{fixeols}

The \p{fixeols} command takes two filename arguments: an input file and an output file. It reads in the input file and writes it to the output file changing all the end of line control character sequences to the local format. It can also take one filename argument, in which case it replaces the target file with its transformation.

\begin{verbatim}
   Syntax  : fixeols = "fixeols" s filename [s filename]
   Examples: fixeols imported.hak result.kln
             fixeols sloth.dat
\end{verbatim}

The \p{fixeols} command works by parsing the input file into alternating runs of printable characters (ASCII 32 to ASCII 126) and runs of non-printable characters (all the others). It then\xx{non-printable}{characters} parses each run of non-printable characters from left to right into subruns of non-printables not containing the same character twice. It then replaces each subrun with a native EOL.\footnote{Note: A native EOL can be inserted into a text file in a portable manner simply by writing \dqp{\bs{}n} to the text output stream.} For example, if a native EOL is \p{X}, and \p{ABCD} are non-printable characters, and the file to be converted is

\begin{verbatim}
   thisABisABCDanABABexampleABCCCof the conversion.
\end{verbatim}

then \p{fixeols} would produce

\begin{verbatim}
   thisXisXanXXexampleXXXof the conversion.
\end{verbatim}

The \p{fixeols} command was devised to solve the problem that sometimes arises when text files are moved from one machine to another (\eg{}with the kermit program) using a binary transfer mode rather than a text transfer mode. If such a transfer is made, and the text file line termination conventions differ on the two machines, one can wind up with a set of text files with improperly terminated lines. This can cause problems on a number of fronts, but in particular affects regression testing which relies heavily on exact comparisons between files. The \p{fixeols} command provides a solution to this problem by providing a portable way to \dq{purify} text files whose end of lines have become incorrect. The regression testing scripts all apply \p{fixeols} to their input and output files before each test.

\subsubsection{Fw}
\xx{command}{fw}

The \p{fw} command allows FunnelWeb proper to be invoked from a shell script.
The syntax is almost identical to the syntax with which FunnelWeb is invoked from the operating system.

\begin{verbatim}
   Syntax  : fw = "fw" s ordinary_funnelweb_command_line
   Examples: fw sloth +t +d
             fw -l walrus
\end{verbatim}

Some important points about this \p{fw} command are:

\begin{itemize}

\item Options are inherited from the default shell options.

\item The \p{F} (input file option) must be turned on.

\item The \p{K}, \p{H}, and \p{X} options must be turned off.

\item The \p{J} option must be turned off.

\item The options specified in a \p{fw} command do not affect the default shell options.

\item This command performs no action in the VAX VMS version of FunnelWeb.

\end{itemize}

\subsubsection{Help}
\xx{command}{help}

The \p{help} command provides online help from within the FunnelWeb shell. It provides access to all of the same messages that the \p{+H} command line option does.

\begin{verbatim}
   Syntax  : help = "help" [s help_message_name]
   Examples: help
             help commands
\end{verbatim}

If no message name is given, the default message is displayed. It contains a list of the other help messages and their names. The actual messages themselves are not listed here.

\subsubsection{Here}
\xx{command}{here}

The \p{here} command acts as a target for the \p{skipto} command. When the shell interpreter encounters a \p{skipto} command, it ignores all the following commands until it encounters a \p{here} command.

\begin{verbatim}
   Syntax : here = "here"
   Example: here
\end{verbatim}

The \p{skipto}/\p{here} mechanism was created to allow groups of regression tests to be skipped during debugging without having to comment them out. For more information, see Section~\ref{skiptocommand}.

\subsubsection{Quit}
\xx{command}{quit}

The \p{quit} command terminates FunnelWeb immediately and returns control to the operating system. This applies regardless of the depth of the script being executed.

\begin{verbatim}
   Syntax : quit = "quit"
   Example: quit
\end{verbatim}

\subsubsection{Set}
\xx{command}{set}

The \p{set} command modifies the default shell options. For example, \p{set +t} sets the \p{+t} option for all subsequent FunnelWeb runs within the shell until another set command sets \p{-t}.

\begin{verbatim}
   Syntax  : set = "set" s ordinary_funnelweb_command_line
   Examples: set sloth +t +d
             set -lwalrus
\end{verbatim}

The restrictions on the \p{set} command are identical to those on the \p{fw} command except that, in addition, the \p{+F} option cannot be turned on in the \p{set} command.

The set command is useful for setting option defaults before a long run of regression tests. It could also be useful to set default options in a FunnelWeb shell kept by a user in a workstation window.

\subsubsection{Show}
\xx{command}{show}

The \p{show} command displays the current default shell options. These options are the options that subsequent \p{fw} commands will inherit.

\begin{verbatim}
   Syntax : show = "show"
   Example: show
\end{verbatim}

\subsubsection{Skipto}
\label{skiptocommand}\xx{command}{skipto}

The \p{skipto} command causes the shell to ignore all subsequent commands until a \p{here} command is encountered.

\begin{verbatim}
   Syntax  : skipto = "skipto"
   Examples: skipto
\end{verbatim}

The \p{skipto}/\p{here} mechanism was created to allow groups of regression tests to be skipped during debugging without having to comment them out. It is like a cut-price \p{goto}.

For example, suppose that there were eight tests and that you had debugged the first five. You might want to skip the first five tests so that you can concentrate on the next three. The following code shows how this can be done.

\begin{verbatim}
skipto
execute test infile1
execute test infile2
execute test infile3
execute test infile4
execute test infile5
here
execute test infile6
execute test infile7
execute test infile8
\end{verbatim}

It should be stressed that FunnelWeb performs full command line processing including the dollar substitutions before testing the line to see if it is \p{here}. This can lead to non-obvious problems. For example:

\begin{verbatim}
skipto
! Test the Parser
! ---------------
define X "execute parsertest.fws"
$X infile1
$X infile2
$X infile3
$X infile4
$X infile5
here
\end{verbatim}

The above looks correct, but, because the \p{define} command isn't executed (and so \p{\$X} is not defined), the subsequent \p{\$X} lines result in a leading blanks error. The problem can be corrected by defining \p{\$X} before the \p{skipto} command.

\subsubsection{Status}
\xx{command}{status}

The \p{status} command takes two forms. In its first form, in which no arguments are given, it writes out the number of warnings, errors and severe errors that 1) were generated by the previous command and 2) have been generated during the entire shell invocation. In its second form it takes from one to three arguments, each of which specifies a diagnostic severity and a number. The \p{status} command compares each of these numbers with the number of diagnostics of that severity generated by the previous command and generates a severe error if they differ.

\begin{verbatim}
   Syntax  : status = "status" {s ("w"|"e"|"s") num}0..3
   Examples: status
             status w1 e5 s1
             status w4
             status s1 e2
\end{verbatim}

The \p{status} command was introduced to test the status results of commands during their debugging. It is also useful for checking to see that the right number of diagnostics have been generated at particular points in test scripts.
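
As an illustration of the second form, a test script might check that a run known to be faulty produces exactly the expected diagnostics. The following fragment is a sketch only: the input file name \p{badinput} and the expected count of two errors are hypothetical, and it is assumed that the \p{tolerate} command (described in the next section) is needed so that the errors generated by \p{fw} do not abort the script before \p{status} is reached.

\begin{verbatim}
! Hypothetical check: badinput is assumed to yield exactly two errors.
tolerate
fw badinput
status e2
\end{verbatim}

If the \p{fw} run generates some number of errors other than two, the \p{status} command generates a severe error, which should in turn bring the mismatch to the tester's attention.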

\subsubsection{Tolerate}
\xx{command}{tolerate}

The \p{tolerate} command instructs the shell not to abort processing of the script if the next command generates one or more warnings, errors, or severe errors. For the purposes of this command, a blank line counts as a command, so be sure to place the \p{tolerate} command immediately above the command about which you wish to be tolerant.

\begin{verbatim}
   Syntax : tolerate = "tolerate"
   Example: tolerate
\end{verbatim}

The tolerate command was introduced to allow FunnelWeb (\ie{}the \p{fw} command) to be tested in a script under conditions which would normally cause it to abort the script.

\subsubsection{Trace}
\xx{command}{trace}

The \p{trace} command turns on or off command tracing during script execution. By default, tracing is turned off.

\begin{verbatim}
   Syntax  : trace = "trace" [s ("on" | "off")]
   Examples: trace on
             trace off
\end{verbatim}

The \p{trace} command was introduced to assist in the debugging of regression test scripts.

\subsubsection{Write}
\xx{command}{write}

The \p{write} command accepts a double-quoted argument and writes it followed by an EOL to the console (standard output). There is no need to double any double quotes occurring within the string.

\begin{verbatim}
   Syntax  : write = "write" s string
   Examples: write "Now about to start the next test."
             write "You don't need to " double enclosed double quotes."
\end{verbatim}

The \p{write} command was added so as to allow regression testing scripts to inform the user of their progress.

\subsubsection{Writeu}
\xx{command}{writeu}

The \p{writeu} command is identical to the \p{write} command except that it underlines the text on an additional following output line.

\begin{verbatim}
   Syntax  : writeu = "writeu" s string
   Examples: writeu "Test 6"
\end{verbatim}

\section{Concluding Remarks}

This chapter defines the semantics of the FunnelWeb program. As stated at the start of this chapter, this document takes precedence over the FunnelWeb program.
While the definition of FunnelWeb in this chapter is reasonably solid, it is far from watertight, and it is hoped that it can be tightened further in future versions. All constructive criticism will be gratefully received by the author Ross Williams (\p{ross@spam.adelaide.edu.au}).

%==============================================================================%
%                             End of Ch3.tex                                   %
%==============================================================================%