\documentstyle[11pt]{article}
\title {Literate Programming on a Team Project\thanks{This research
has been sponsored in part by the USAF, Rome Air Development Center, under
contract number F30602--86--C--0071.
The first author gratefully acknowledges the support of the Fannie and
John Hertz Foundation.
This paper is reprinted from {\em Software---Practice \& Experience}, 21(7):677--683, July 1991.}}
\author{Norman Ramsey\thanks{Current address: Department of Computer
Science, Princeton University, Princeton, New Jersey 08544.}
   \hskip3pt and Carla Marceau\\\mbox{}\\Odyssey Research Associates\\
301A Harris B. Dates Drive\\Ithaca, New York 14850}
\date{February 4, 1991}
\setcounter{secnumdepth}{0}
\setcounter{tocdepth}{6}
\renewcommand{\baselinestretch}{1.33}

\makeatletter
\def\refno#1{\nocite{#1}\@ifundefined
       {b@#1}{{\bf ?}\@warning
       {Reference number `#1' on page \thepage \space undefined}}%
{\hbox{\csname b@#1\endcsname}}}
\makeatother

\def\remark#1{\marginpar{\raggedright\hbadness=10000\footnotesize\it #1}}
% \def\remark#1{\relax}
\let\logo=\sc
\font\logo=logo10
\def\metafont{{\logo METAFONT}}


\begin{document} % 1
\clubpenalty=10000
\widowpenalty=10000
\maketitle

\begin{abstract} % 2
% % We used {\tt WEB} on a group project to write a large program, over
% % 33,000~lines. 
% % The program was not intended to be published.
% % We encountered many problems using {\tt WEB}, but we believe we are
% % better off for having used it.
% % We describe both the problems and why we believe {\tt WEB} was
% % beneficial despite them.
% We used literate programming, with {\tt WEB},  on a group project to
% write a program of 
% over 33,000~lines.
% The program, Penelope, was not intended to be published, but the
% literate source has served as good internal documentation through
% development and maintenance. 
% Using {\tt WEB} without help from its developer uncovered a number of
% problems, which are described below.
% Despite the problems, {\tt WEB} has been effective in helping us
% document the design and implementation of Penelope.
We used literate programming on a team project to write a 33,000-line
program for the Synthesizer Generator.  The program, Penelope, was
written using {\tt WEB}, a tool designed for writing literate
programs.  Unlike other {\tt WEB} programs, many of which have been
written by {\tt WEB}'s developer or by individuals, Penelope was not
intended to be published. We used {\tt WEB} in the hope that
both our team and its final product would benefit from the advantages
often attributed to literate programming.  The {\tt
WEB} source served as good internal documentation throughout
development and maintenance, and it continues to document Penelope's
design and implementation.  
Our experience also uncovered a number of
problems with {\tt WEB}.
\end{abstract} % 2


\section{Introduction}

Donald Knuth coined the term ``literate programming'' when describing
{\tt WEB}, the tool he used to build \TeX~\cite{knuth:literate}.
He believes that ``the time is ripe for significantly better
documentation of programs, and that we can best achieve this by
considering programs to be {\em works of literature}.''
Knuth and others have presented examples of such
programs~\cite{bentley:lp1,bentley:lp2,cvw:loom,gries:pearls}. 


Literate programming is usually discussed in the context of
publishing programs or of publishing books or articles about programs.
\TeX\ and {\metafont}, the original applications of {\tt WEB},
have  been published as books~\cite{knuth:tex,knuth:metafont}.
Other published literate programs seem to be primarily for teaching.
Programs to find random sequences and count common words
were written to illustrate the power of literate
programming~\cite{bentley:lp1,bentley:lp2}. 
Another program to count common words
is a tutorial on how to
develop and tune a small program~\cite{cvw:loom}.
Another illustrates formal methods of program
development and the use of abstract data types~\cite{gries:pearls}.


A valedictory assessment points out three common aspects of the published
literate programs:  cosmetics, polish, and verisimilitude, of which
verisimilitude---the property of using one input to produce both
compilable program and published document--is deemed
essential~\cite{cvw:assessment}.  
Additional expectations of a literate
programming tool include flexible order of elaboration,
ability to develop program and documentation concurrently in one
place, cross-references, and indexing~\cite{thimbleby:review}.

{\tt WEB} is the principal tool used for literate programming; a
number of implementations are
available~\cite{knuth:literate,thimbleby:cweb,guntermann:cweb,levy:cweb,sewell:mangle,ramsey:building}.
{\tt WEB}  programmers
 interleave source code and descriptive text in a single document.
When using {\tt WEB}, a programmer divides the source code into
small textual units called {\em modules}, and each module carries
associated documentation.
In the {\tt WEB} source, different modules may 
be written in any order.
 The programmer is encouraged to choose an
order that helps explain the program.
The modules are like macro definitions; they have
the form
\begin{quote} % 2
\tt@<{\em module name\/}@>={\em body}
\end{quote} % 2
where the body contains both code and references to other modules.
{\tt WEB} programmers can also define and use macros similar to
C~preprocessor macros.
 One module is unnamed; its body represents the underlying program.

Two filters process {\tt WEB} source.
 {\tt TANGLE} reads the {\tt WEB} source and expands the
unnamed module according to its definition.
References to named modules are expanded where they occur, and the end
result is that {\tt TANGLE} extracts the underlying program from the
{\tt WEB} source.
This program is formed from the code fragments that the programmer put
into the module definitions, assembled in the order required by the compiler.
A second filter, {\tt WEAVE}, reads the {\tt WEB} source and converts
it to \TeX\ input, with which {\TeX} can produce
high-quality typeset documentation of the program.
Examples of {\tt WEB} source and typeset documentation can be found in
Reference~\refno{knuth:literate}.


\section{Using Literate Programming}

 Penelope is a language-based editor intended to help
programmers develop formally verified Ada
programs~\cite{ramsey:developing,marceau:interactive,guaspari:formal}. 
It parses annotated Ada programs,
performs static semantic checking and overload resolution,
computes weakest preconditions by predicate transformation,
simplifies the resulting preconditions,
and 
helps users construct proofs of those preconditions.
Its source code is
an editor specification for
the Synthesizer Generator~\cite{reps:synthesizer:89}. 
The Synthesizer Generator builds an editor of attributed syntax
trees.
When a  user defines a tree, the editor attributes it.
When the user changes the tree,
the editor recomputes only those attribute values that have
changed~\cite{reps:generating}.

Penelope has two written specifications: the Ada language
reference manual and a denotational-style definition of Ada predicate
transformers~\cite{adalrm,polak:predicate}.
After three years of work, the {\tt WEB} source for the editor
is over 33,000 lines.
Over 13,000 of those lines are interleaved documentation.
The editor has been used to verify a software repeater for an
asynchronous
communication line
and security properties of a part of the
ASOS operating system
kernel~\cite{marceau:verified,weber:beyond,anderson:asos}. 
Seven programmers have written {\tt WEB} source, 
but no more than four have worked
on it at any one time.
At this writing, the editor is being extended
to support Ada libraries. % programs that use data abstraction.

We decided to write the editor as a literate program because 
we expected implementing predicate
transformation to be error-prone.
To avoid errors, we used {\tt WEB} to juxtapose
the specification and implementation of
each predicate transformer.
We used {\TeX}'s math mode to write the specifications in the notation
of denotational semantics.~\cite{polak:predicate,polak:program}. 

The right model for a literate program that is being maintained and
extended is not the novel but the car repair manual.
We began writing new code as an
explanation or tutorial for our colleagues, but
as the text grew we treated it more like a reference work.
As with any document, we  revised repeatedly to clarify meaning.
We often used
{\tt TANGLE}'s reordering mechanism to make major revisions in the 
{\tt WEB} source without changing the underlying program.
% Parnas and Clements say this; see {parnas:rational}


We divided Penelope into parts according to functionality:
we wrote chapters on 
abstract syntax,
concrete syntax, 
predicate transformation,
static semantic checking, 
proof construction,
and simplification of terms.
Each of the first four chapters follows the structure of the Ada
Language Reference Manual, since each is closely related to
Ada.
Only knowledge of the Ada manual is required to navigate this code;
a reader who knows where to find the
{\tt exit} statement in the Ada manual can find the predicate
transformation of the {\tt exit} statement in the corresponding
section of the predicate transformation chapter.
We gave the final chapters structures related to the problems of
simplification and proof construction.


Penelope has grown so large that each chapter
is itself structured like a reference work.
 Sections within chapters have a different structure. 
They are tutorials that use the traditional approach
to documentation: begin with a problem statement or specification,
discuss possible solutions, explain the design of the preferred
solution, then develop the implementation and its description. 
A short introductory chapter describes the major parts of the program
the organization used to form the text.
A ``how to find it'' section helps readers find source code.

We have used the Penelope source
not only for reference but also to introduce new programmers to the
project.  % ???
% For example, as a prelude to extending the 
% Ada subset handled by Penelope, a new project member might read Chapter
% 6 of the Red Dragon Book and also the chapter of Penelope on static
% semantic checking.


We used some of the cosmetic features of {\tt WEB};
we found
tables of variables, functions, and files all useful to  describe the
organization of the code. 
When working on Ada static semantic checking, we found it useful to
include in the {\tt WEB} source some fragments of relevant documents.
The fragments included visibility and overloading rules from the Ada reference
manual~\cite{adalrm} and a presentation of the Ada type system derived
from Reference~\refno{ganzinger:operator}.


\section{Evaluating {\tt WEB}}

 Using {\tt WEB} without help from its developer uncovered a number of
problems. 
Some problems relate to the criteria in References
\refno{cvw:assessment}~and~\refno{thimbleby:review}, but others do not.

We had no trouble with
{\tt WEB}'s fundamental mechanisms, the section and
the named module.
They provide verisimilitude and flexible order of elaboration.
The small size of modules makes it feasible to develop code and
documentation in one place using an ordinary text editor.
The order of fragments is the same in the {\tt WEB} source and in the
published document, which simplifies polishing.

\medskip
We found some of {\tt WEB}'s cosmetic features inadequate, others
superfluous. 
Cosmetics should include appropriate media for presenting programs: 
not just math and tables but also diagrams and
figures~\cite{bentley:more-pearls:graphics,bentley:lp2,sedgewick:algorithms}. 
 Describing data structures was hampered by {\TeX}'s lack of support for
diagrams and pictures.
{\TeX} does make it easy to use mathematical notation, which
 helped considerably in describing 
 predicate transformation,
 the simplifier, and
 the proof system used in the proof constructor.

Programmers complained more about prettyprinting than about all other {\tt WEB}
problems combined. 
{\tt WEAVE} ignores the programmer's choice of indentation
and line breaks; it breaks and indents lines on the basis of
the  syntactic categories of tokens.
(Programmers can force line breaks by inserting special control sequences
like {\tt@/} into their program text, but that's about all.)
We spent too much time tinkering
with prettyprinting, trying to make {\tt WEAVE}'s output
acceptable to everyone.
We would have been better off with no prettyprinter, or with a
prettyprinter that changed 
only the typographic treatment of program fragments without changing
the placement of tokens on the page.

We used the index of identifiers 
 during reviews only, usually to find function definitions
and to locate code that had been misplaced.

\medskip
{\tt WEB} formats interleaved documentation using {\TeX}, which some
programmers  already use for writing documents.
{\tt WEB} and {\TeX} are integrated poorly.
One cannot lift pieces of {\tt WEAVE}'s output and put them in other
documents without adjusting the \TeX\ code.
{\tt WEB} works even less well with \LaTeX; \LaTeX\ constructs cannot
be used in {\tt WEB} source, and getting {\tt WEAVE} output
to work in \LaTeX\ documents requires tedious adjustments by hand.
The other direction is easier;
we wrote project documents using \LaTeX, and, with some adjustments,
we could include excerpts from these documents in our code.
%%%%%%%% Carla wants no more details
% \remark{I can give more details.  For example, the {\tt webmac} macros
% should make it easy to pull out the {\tt WEAVE} output routine and use
% a plain \TeX\ or \LaTeX output routine---but they don't.
% This is a sore point with me.}
%%%%%%%%%
 We were accustomed to writing \LaTeX\ documents,
so it was annoying to be forced to switch to plain
\TeX\ to write programs.

% \paragraph{Editing {\tt WEB} source}
\medskip

\TeX's input often looks very different from \TeX's output, especially
when mathematics is used.
Similarly, {\tt WEB} source looks very different from the typeset
document produced by {\tt WEAVE} and {\TeX}.
When people use \TeX\ to write papers, 
it may be acceptable for the \TeX\ input to be full of
confusing hieroglyphics, because the {\TeX} input is almost never
read---only the printed version is read.
{\tt WEB} source is read frequently because programmers must edit it.
Both the difficulty of reading the source and the marked difference between
source and listing complicate editing.

\medskip % \paragraph{Structuring the document}

{\tt WEAVE}'s standard table of contents mechanism is  a list of
section names.
 (A section is a named group of modules together with their
interleaved documentation.)
This flat structure is inadequate for
describing even a moderate-sized program.
Extra structure in the form of ``parts'' has been added to
the book version of {\TeX}; a part contains several
sections~\cite{knuth:tex}. 


We changed {\tt WEAVE}'s table of contents mechanism to make
hierarchical organization possible.
 The new mechanism supports
chapters, sections, and two levels of subsections.
We did so without changing {\tt  WEAVE} itself; instead we changed the
\TeX\ macros that support {\tt WEAVE}.
The new macros recognize special symbols at the beginnings of section
names; these symbols indicate which sections are really chapters,
subsections, and so on.
Using this hierarchy made the  table of contents
an important guide to the code;
the only practical way to find a particular part of the editor code
was to begin with the table of contents.


The {\tt WEAVE} listing of Penelope is over 800~pages;
its table of contents is about 8~pages.
We needed to extract and print parts of this document,
but {\tt WEAVE}  processes only complete documents.
We extracted parts in two ways.
When we wanted just a few, small, closely related parts, we created a
special {\tt WEB} file that held just those parts, and printed it.
For something more general, we used a shell script that
removed parts of {\tt WEAVE}'s output before passing the rest to {\TeX}.
This script recognized the special symbols in the section names so that, for
example, we could include or exclude whole chapters by name,
without having to
enumerate their contents.
It would have been more expensive, and no less awkward, to use
standard mechanisms for extracting pages from {\TeX}'s output. 

Special symbols in section names are an awkward way of indicating
structure.
 A more natural way of indicating structure, like the
{\LaTeX} mechanism, would have been welcome.
Tools should make it easy to use the 
 structure to help readers extract excerpts from literate
programs.

\medskip % \paragraph{Fit with configuration management tools}

 {\tt WEB} works poorly with {\tt make}~\cite{feldman:make}.
 {\tt TANGLE} is designed to read and write a complete program.
Some {\tt TANGLE}s can write multiple files,
which can then be compiled separately, but
those files all get rewritten
 every time the {\tt WEB} source changes,
and {\tt make} therefore recompiles them all.
This problem is familiar;
other preprocessors, like {\tt yacc}~\cite{johnson:yacc}, can also
cause excessive recompilation.
The workaround described on page 265 of
Reference~\refno{kernighan:unix} works for {\tt TANGLE}.
%%%%%%%% Carla wants no more details
% \remark{Workaround doesn't help too much when the output
% from {\tt TANGLE} includes information about source line numbers?}
%%%%%%%%%

{\tt WEB} users may be tempted
to break their {\tt WEB} source into many files and run them through
{\tt TANGLE} separately. 
Doing so defeats the purpose of writing a literate
program; separate compilation does not necessarily imply separate
explanation. 
For example, one would prefer to place a
unit's specification and implementation in the same {\tt WEB} 
source, even when
they should be compiled separately.

\medskip % \paragraph{Fit with compilers and debuggers}

Our  {\tt TANGLE}~\cite{ramsey:spiderwebman} emits
the {\tt\#line} directive of the C preprocessor, making
messages from compilers and debuggers refer to line numbers in
the {\tt WEB} source instead of to those in {\tt TANGLE}'s output.
Not all compilers  or all {\tt TANGLE}s support such mechanisms.
Renumbering is essential for large programs.


\section{Discussion}

Most literate programming papers refer to programs that are 
polished, publishable ``works of art.''
Our primary goal in writing Penelope
was not to write a publishable program, nor to
evaluate literate programming as a software engineering technique, 
but to build a prototype editor embodying the results
of research in formal verification. 
Our experience has given us some subjective
impressions of the benefits of applying literate programming
to team development, as well as  more specific conclusions about
the difficulties of applying literate programming.

There are few published techniques for writing literate
programs, especially for writing large ones.
Most of our programmers complained about the awkward tools and about
the lack of guidance in their use.
We were unable to develop precise methods for writing literate
programs or even clear criteria about what constituted a
literate program.
Instead, we relied on peer review of programs,  rewriting them
until the project members understood them.
These programs were rarely polished to the point necessary for
publication;  less polished presentations were adequate.

One review of a literate program emphasizes the role of
juxtaposed code and documentation~\cite{thimbleby:review}.
It cites several benefits of this juxtaposition, including an
incentive to explain and hence to understand what one is doing.
During peer reviews of Penelope,
we insisted that explanations of programs include explanations of design.
% this insistence mitigated against the
% tendency of programmers to begin implementing without thinking about
% design. 
% We found that
% the juxtaposition reduced the overhead
% involved in maintenance; 
Juxtaposing design documentation and code reduced the overhead of
maintenance because maintainers reading the code did not have to look
elsewhere for design documents.

We cannot say to what extent literate programming can replace
standard software development methodology.  However, putting a
clear description of design in our source code helped make it possible
for a changing team of programmers to develop it over a span of three
years.  We were able to apply standard guidelines, like those in
Reference~\refno{parnas:rational}, to describing Penelope's design. 

We believe
literate programming helped us substantially.
This belief is based not on measurements but on our subjective
comparisons of experience on this project to other projects.
A programmer who has used standard software development systems at an
international computer manufacturing company reports that a key
difference in Penelope was that the documentation was {\em used}, precisely
because of its proximity to the source code.
The project manager, when at a large software house,
learned to expect technical staff first to criticize
implementations
for drifting away from the original intent, then to
call for complete rewrites.
There have been no complaints about the quality of Penelope's
implementation. 
The programmers have been surprised at how easily they have extended
and modified one another's work.
For example, an editor for constructing proofs was implemented  by a
programmer who then left the project. 
The programmer who took over the job of maintaining the proof
constructor read the program in two hours and found herself well
prepared to change the code.

We do not ascribe Penelope's success purely to literate programming;
other factors contributed.
We began implementation with a clearly defined
goal and worked from detailed, sometimes formal, specifications.
We had time to design the system carefully and used a declarative
programming language. 
We did not have to develop a user interface but used the one provided
by the Synthesizer Generator~\cite{reps:synthesizer:89}.


We will continue to use literate programming
for Penelope and for future projects.
{\tt WEB}'s problems are such that we believe it will be
cost-effective for future projects to develop literate programming
tools that address some of the criticisms presented here.


\section{Acknowledgments}
Silvio Levy provided his {\tt CWEB} implementation as a base for the
{\tt WEB} used to implement the Penelope editor.
The staff of the Ada Verification project at Odyssey Research made
possible the experience with Penelope on which this paper is based.
This paper has been much improved by comments from David Hanson and
from the anonymous referees.
Stuart Feldman commented on this paper and discussed ideas for better
tools. 

\bibliographystyle{unsrt}
\bibliography{web,ada,cs,ramsey}

%\newpage
%\tableofcontents

\end{document} % 1