%
% syntart.tex
%
% Article about syntax.sty
%
% (c) 1996 Mark Wooding, FWIW
%

% --- First, some evil hacking ---
%
% This lot sees if I'm in mid-document; if so it checks that necessary
% packages are loaded and moans at the editor if necessary.  Otherwise,
% it loads an emulation of the layout, which might help me identify bad
% line breaks.  (I know I shouldn't be using \next like this, although
% it's inside a group so I don't care.)

\begingroup
\makeatletter
\edef\next#1#2{\ifx\documentclass\@notprerr#1\else#2\fi}
\expandafter\endgroup\next
{
  \ifx\syntdiag\xxundefined
    \edef\ehelp{\errhelp{I can't seem to find Mr Wooding's excellent^^J%
                         `syntax' package. \space Please get it from CTAN if^^J%
                         necessary, and say `\string\usepackage{syntax}' in the^^J%
                         document preamble.}}
    \ehelp
    \errmessage{Package `syntax' not found}
  \fi
  \let\mdwendfile\endinput
}{
  \documentclass{baskart}
  \usepackage[rounded]{syntax}
  \shortverb\|
  \begin{document}
  \def\mdwendfile{\end{document}}
}

% --- Some other definitions ---

% These three typeset common LaTeX-related things.  I have no idea how these
% should be formatted, although I guess this lot will be OK.

\providecommand{\pkg}[1]{\textsf{#1}}
\providecommand{\env}[1]{\textsf{\def\*{\ensuremath{*}}#1}}
\providecommand{\cmd}[1]{\texttt{\string#1}}

% I like tables centred horizontally.  While I can't assume my nice table
% handling is present, I can at least try and make LaTeX's table handling
% as pleasant as possible.
%
% There's a slight problem here: there's a good chance that the `|' character
% is active (for verbatim things), and this will make LaTeX's standard
% tabular environment /very/ upset.  I'll fiddle the catcode, read the
% argument, restore the catcode back to whatever it was before, and pass
% the argument (with catcodes now firmly carved in concrete) to tabular.
%
% I also set up some abbreviations (based on the syntax of mdwtab's \hlx
% command) for inserting little bits of vertical space around horizontal
% rules.
% I'll describe the macros later.
%
% Oh, before I go on further, this environment expects an extra argument
% which is the number of columns in the table.

\newenvironment{mdwtbl}[1]{%
  \begin{center}%
  \small%
  \newcommand{\hv}{\hline\shortrow{#1}}
  \newcommand{\vhv}{\shortrow{#1}\hline\shortrow{#1}}
  \newcommand{\vh}{\shortrow{#1}\hline}
  \begingroup%
  \catcode`\|=12\relax%
  \startmdwtbl%
}{%
  \end{tabular}\end{center}%
}
\newcommand{\startmdwtbl}[1]{\endgroup\begin{tabular}{#1}}%

% --- Short rows of vertical rules in tables ---
%
% The following non-@-sign-requiring code inserts a number of columns of
% height 2pt containing vertical rules at their extremities, as suggested
% in Knuth's TeXbook, chapter 22.  This is basically a primitive version
% of the \vgap command from the `mdwtab' package.
%
% First, I've got to take into account whether the standard `tabular' or
% the enhanced array.sty version is in use.  The array package defines a
% new parameter called \extrarowheight which I can check for.  The
% important difference is that the standard version puts negative space on
% each side of a vertical rule to make it appear to have zero width.  I
% consider this to be a bug in the original version, since it makes the
% left and right hand table rules look terribly uneven.

\ifx\extrarowheight\xxundefined
  \newcommand{\shortvrule}{%
    \kern -.5\arrayrulewidth%
    \vrule height 2pt width \arrayrulewidth%
    \kern -.5\arrayrulewidth%
  }
\else
  \newcommand{\shortvrule}{%
    \vrule height 2pt width \arrayrulewidth%
  }
\fi

% I'll do this in a rather odd way, to avoid playing with global registers.
% I'll build a token sequence which will do the job in the token register
% \toks 0, and then expand it.
\newcommand{\shortrow}[1]{%
  \crcr%
  \omit%
  \iffalse{\fi\ifnum0=`}\fi%
  \toks 0={\shortvrule\hfil\shortvrule}%
  \count 0=#1\relax%
  \loop%
    \advance\count 0 by -1\relax%
  \ifnum\count 0>0\relax%
    \toks 0=\expandafter{\the\toks 0&\omit\hfil\shortvrule}%
  \repeat%
  \ifnum0=`{}\fi%
  \the\toks 0%
  \relax% <------ VITAL!
  \cr%
}

% --- Grammar typesetting things ---

\grammarindent=.5in

% --- Whew.  Now I can actually start ---

%\section{Syntax diagrams and other fun}

\title{Syntax diagrams and other fun}
\author[Mark Wooding]{Mark Wooding\\Email: \texttt{mdw@straylight.co.uk}}
\begin{Article}

%\begin{multicols}{2}

% Oh, how I hate these waffly introductions

\section{Introduction}

Among other things, I write manuals for computer programs, and
descriptions of syntax diagrams tend inevitably to creep in.

Formal BNF~grammars are relatively easy to typeset using some simple
definitions and a list-based environment.  However, they can be rather
daunting for less technical readers, containing as they do all manner
of funny metasymbols.\footnote
  {Even the word `metasymbol' is a little scary.}
Books attempting to cater for such readers (and even some technical
manuals about C~programming) tend to present syntax using diagrams:
the idea, if you haven't come across them already, is that you follow
the lines on the diagram around writing any of the items you come
across on your journey, until you reach the end.

For example, the diagram below attempts to present the various diverse
elements comprising the arsenal of the Spanish Inquisition, according
to the now legendary Monty Python sketch.

\begin{syntdiag}
  \begin{rep} \begin{stack}
    `fear and surprise' \\
    `surprise and fear' \\
    `ruthless efficiency' \\
    `fanatical devotion to the pope' \\
    `nice red uniforms'
  \end{stack} \\ `and' \end{rep}
\end{syntdiag}

The diagram permits any sequence of one or more `weapons', separated by
the word \lit{and}.
Compare this with the BNF equivalent:
\begin{grammar}
<weapons> ::= <weapon> | <weapons> `and' <weapon>

<weapon> ::= `fear and surprise'
  \alt `surprise and fear'
  \alt `ruthless efficiency'
  \alt `fanatical devotion to the pope'
  \alt `nice red uniforms'
\end{grammar}
Which do you think is easier to read?

The \pkg{syntax} package provides some commands for typesetting syntax
diagrams like the one above (and for typesetting BNF~grammars).

\section{Building syntax diagrams}

Syntax diagrams are typeset using the \env{syntdiag} environment.  This
puts \LaTeX\ into a special `syntax diagram' mode.  Only syntax diagram
commands should be used while in this mode: anything else will upset
the typesetting and produce output which looks truly awful.

To help you lay out your source nicely, spaces and newlines (including
blank lines) are totally ignored within syntax diagrams.  (They will be
enabled again when spaces and newlines are actually useful, so there's
no need to worry about this.)

One important point about syntax diagrams must be made here: like the
\env{verbatim} environment, the \env{syntdiag} environment cannot be
used inside the argument of a command.  The other point is that you can
only use syntax diagrams when you're in paragraph mode -- there's a
\env{syntdiag\*} environment which is designed for use in LR mode,
although that's more oriented towards presenting fragments of diagrams.

If you just write an empty \env{syntdiag} environment, you get a line
across the current text column with a double headed arrow on each end,
like this:
\begin{syntdiag} \end{syntdiag}
This is clearly not much good, and we need to learn how to make it more
interesting.

\section{Simple bits of syntax diagrams}

Syntax diagrams are built up from a small number of simple blocks.  You
just need to learn how to put these blocks together to build quite
complicated looking diagrams.

The simplest things you can put in syntax diagrams are \emph{syntax
objects}.
There are three types of syntax objects built in, and they have special
abbreviations because they get used so much.  The three types are:
\begin{description}

\item [Nonterminals] stand for some (possibly fairly complicated)
      syntactic entity.  They look like \synt{this}\footnote
        {Well, you can change the style so that they look like anything
         you want, although this is how they look by default, and I'd
         recommend that you don't change the style too radically,
         because you'll confuse your readers utterly.  The styles of
         the other syntax objects can also be changed.}
      in syntax diagrams.  You type nonterminals by typing the text
      within `\lit*{<}\dots\lit*{>}', like \verb|<this>|.

\item [Terminals] describe text which should be typed in exactly as
      shown.  They look like \lit{this} (or possibly like \lit*{this})
      in the diagram.  You can type a quoted terminal by just typing
      single quotes around the text, like \verb|`this'|; unquoted
      terminals like the second example are obtained by using double
      quotes, like \verb|"this"|.

\item [Composites] can contain terminals and nonterminals, together
      with normal text typed in \LaTeX's LR~mode.  You type a composite
      object by using the \cmd{\tok} command -- the argument contains
      the text of the object.  Within this argument, you can type
      nonterminals and terminals using the abbreviations described
      above.  Also, the `\verb"|"' character will produce a $\mid$
      symbol, which is used to indicate alternatives.

\end{description}

% This figure wants to go at the top of the next page.  Putting it later
% puts it on a float page, which is not what I want at all.  So I'll put
% it here, because this appears to work.

\begin{figure*}[t]
\begin{syntdiag}
\begin{stack}
  \begin{rep} <digit> \end{rep}
  \begin{stack} \\
    `.' \begin{stack} \\ \begin{rep} <digit> \end{rep} \end{stack}
  \end{stack} \\
  `.'
  \begin{rep} <digit> \end{rep}
\end{stack}
\begin{stack} \\
  \begin{stack} `E' \\ `e' \end{stack}
  \begin{stack} \\ `+' \\ `-' \end{stack}
  \begin{rep} <digit> \end{rep}
\end{stack}
\end{syntdiag}
\caption{Syntax diagram for a floating point number}
\label{fig:synt.float}
\end{figure*}

\begin{figure*}[t]
\vspace*{-10pt} % An unpleasant hack.  It seems that if multicol has to
                % fill an exact page it loops infinitely and spews rubbish
                % about overfull vboxes in \output to the terminal.  The
                % negative skip here stops the page from looking full to
                % multicol, so it will balance the text it has and be
                % very happy.  (Bletch bletch.)
\csname sd@roundfalse\endcsname
\begin{syntdiag}
\begin{stack}
  \begin{rep} <digit> \end{rep}
  \begin{stack} \\
    `.' \begin{stack} \\ \begin{rep} <digit> \end{rep} \end{stack}
  \end{stack} \\
  `.' \begin{rep} <digit> \end{rep}
\end{stack}
\begin{stack} \\
  \begin{stack} `E' \\ `e' \end{stack}
  \begin{stack} \\ `+' \\ `-' \end{stack}
  \begin{rep} <digit> \end{rep}
\end{stack}
\end{syntdiag}
\caption{The same diagram in the `square' style}
\label{fig:synt.square}
\end{figure*}

% We will now resume our usual programming.

The text within a nonterminal or terminal is read in a way similar to
\LaTeX's \cmd{\verb} command.  You type the text more or less exactly
as you'd like it to look.  Problems occur if you want to produce
something like \lit{doesn't} -- if you type \verb|`doesn't'|, then
\TeX\ will think that the text ends at the \emph{first} \lit{'}
character.  You can avoid this problem by preceding the offending
character by a backslash, making \verb|`doesn\'t'|, which is
unfortunately less readable, but at least it actually works.  Similar
considerations apply to the \lit{\char'042} and \lit{>} characters.
There's no need to start escaping characters like this unless there's
actually an ambiguity.  You can also type `\verb|\\|' to obtain a
backslash character, and `\verb*|\ |' to get the `\verb*| |' symbol.

Right: we can now put something into our empty syntax diagram.
By saying
\begin{list}{}{\leftmargin=\parindent \parskip=0pt \topsep=0pt}
\footnotesize
\begin{verbatim}
\begin{syntdiag}
  `angry gnat' "gnu going \"moo!\""
\end{syntdiag}
\end{verbatim}
\end{list}
we get something which looks not dissimilar to this:
\begin{syntdiag}
  `angry gnat' "gnu going \"moo!\""
\end{syntdiag}

\section{Building larger structures}

Once you can write simple syntax objects, you need to know how to
combine them in interesting ways.  You've already seen that just
writing objects next to each other inserts them in series (I managed to
sneak that in above).  Other constructions are described by
environments nested within the main \env{syntdiag} environment.

A choice of one item from a list is represented by stacking the
available options one above the other.  Such a construction is typeset
using the \env{stack} environment; the individual rows of the stack are
separated by \verb|\\| commands.

An item which may be repeated any number of times is shown with a loop
above it.  Such a structure can be typeset with the \env{rep}
environment; the text to be repeated forms the body of the environment.
If the repeated texts are to be separated from each other, you can
describe this separating text by preceding it with the \verb|\\|
command.

These two environments can be nested within each other as required.
Arbitrarily complicated structures can be built in this way.

We now have enough equipment to start drawing real diagrams.  For
example, a floating point number might be represented like this:
\begin{list}{}{\leftmargin=\parindent \parskip=0pt \topsep=0pt}
\footnotesize
\begin{verbatim}
\begin{syntdiag}
\begin{stack}
  \begin{rep} <digit> \end{rep}
  \begin{stack} \\
    `.' \begin{stack} \\
      \begin{rep} <digit> \end{rep}
    \end{stack}
  \end{stack} \\
  `.'
  \begin{rep} <digit> \end{rep}
\end{stack}
\begin{stack} \\
  \begin{stack} `E' \\ `e' \end{stack}
  \begin{stack} \\ `+' \\ `-' \end{stack}
  \begin{rep} <digit> \end{rep}
\end{stack}
\end{syntdiag}
\end{verbatim}
\end{list}
(The output is shown in figure~\ref{fig:synt.float}.)

You can probably think of all sorts of syntax diagrams which can't be
drawn by this system.  Such diagrams are rather unlikely to appear in
practice, they tend to be harder to understand, and all of them can be
transformed into something which \emph{can} be represented, without
altering their meanings.  If you find that you can't draw the diagram
you want, you should first see if it can't be transformed fairly simply
into something which you can draw.  If that doesn't work, try
complaining at the author of the package.

\section{Other things}

Syntax diagrams have a habit of getting rather large.  When possible,
this package will try to break long syntax diagrams over several lines.
Usually, a diagram begins with a double arrow (like
`\begin{syntdiag*} [\left{>>-} \right{...}] \end{syntdiag*}'), and
ends with a funny pair of inward pointing arrows (like
`\begin{syntdiag*} [\left{...} \right{-><}] \end{syntdiag*}').  When a
diagram is broken, single arrows are put on the broken ends as a visual
clue that the lines aren't complete.

Most of the time, \TeX\ will find fairly reasonable line breaks by
itself.  If you want to take matters into your own hands, you can use
the \verb|\\| command to force a break.

Finally, there are two different styles of syntax diagrams which can be
drawn by the package -- there's the `rounded' style I used in
figure~\ref{fig:synt.float}, and the `square' style shown in
figure~\ref{fig:synt.square}.  You can choose which style to use in
your document by passing the option \lit{square} or \lit{rounded} when
you load the package.  The default is to draw `square' style diagrams.
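
As a small sketch of these last two points, something like the
following ought to do the job.  The nonterminal names are invented
purely for illustration, the place chosen for the break is arbitrary,
and I'm assuming that the break is requested at the top level of the
diagram, outside any \env{stack} or \env{rep} (where \verb|\\| already
means something else).
\begin{list}{}{\leftmargin=\parindent \parskip=0pt \topsep=0pt}
\footnotesize
\begin{verbatim}
% In the preamble: ask for rounded corners
% (say `square' for the other style).
\usepackage[rounded]{syntax}

% In the document body: two objects in series,
% a forced break, and then a third object.
% The names <subject> and <object> are made up.
\begin{syntdiag}
  <subject> `loves' \\
  <object>
\end{syntdiag}
\end{verbatim}
\end{list}
The broken ends should then carry the single arrows described above,
so the reader can see that the diagram continues on the next line.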
%\medskip\hrule\small\kern\smallskipamount
%\hfill Mark Wooding \\ \hspace*{\fill} \texttt{mdw@straylight.co.uk}

%\end{multicols}
\end{Article}

%\mdwendfile