Preston Briggs
preston@tera.com
HTML scrap generator by John D. Ramsdell
ramsdell@mitre.org
scrap formatting and continuing maintenance by Marc W. Mengel
mengel@fnal.gov
In 1984, Knuth introduced the idea of literate programming and
described a pair of tools to support the practice [2].
His approach was to combine Pascal code with TEX documentation to
produce a new language, WEB, that offered programmers a superior
approach to programming. He wrote several programs in WEB,
including weave and tangle, the programs used to support
literate programming.
The idea was that a programmer wrote one document, the web file, that
combined documentation (written in TEX [3]) with code
(written in Pascal).
Running tangle on the web file would produce a complete
Pascal program, ready for compilation by an ordinary Pascal compiler.
The primary function of tangle is to allow the programmer to
present elements of the program in any desired order, regardless of
the restrictions imposed by the programming language. Thus, the
programmer is free to present his program in a top-down fashion,
bottom-up fashion, or whatever seems best in terms of promoting
understanding and maintenance.
Running weave on the web file would produce a TEX file, ready
to be processed by TEX. The resulting document included a variety of
automatically generated indices and cross-references that made it much
easier to navigate the code. Additionally, all of the code sections
were automatically prettyprinted, resulting in a quite impressive
document.
Knuth also wrote the programs for TEX and METAFONT
entirely in WEB, eventually publishing them in book
form [4,5]. These are probably the
largest programs ever published in a readable form.
Inspired by Knuth's example, many people have experimented with
WEB. Some people have even built web-like tools for their
own favorite combinations of programming language and typesetting
language. For example, CWEB, Knuth's current system of choice,
works with a combination of C (or C++) and TEX [7].
Another system, FunnelWeb, is independent of any programming language
and only mildly dependent on TEX [9]. Inspired by the
versatility of FunnelWeb and by the daunting size of its
documentation, I decided to write my own, very simple, tool for
literate programming.
Nuweb works with any programming language and LATEX [6]. I wanted to use LATEX because it supports a multi-level sectioning scheme and has facilities for drawing figures. I wanted to be able to work with arbitrary programming languages because my friends and I write programs in many languages (and sometimes combinations of several languages), e.g., C, Fortran, C++, yacc, lex, Scheme, assembly, PostScript, and so forth. The need to support arbitrary programming languages has many consequences:
WEB and CWEB are able to
prettyprint the code sections of their documents because they
understand the language well enough to parse it. Since we want to use
any language, we've got to abandon this feature.
However, we do allow particular individual formulas or fragments
of LATEX code to be formatted and still be parts of output files.
Also, keywords in scraps can be surrounded by @_ to
have them be bold in the output.
Because WEB knows about Pascal,
it is able to construct an index of all the identifiers occurring in
the code sections (filtering out keywords and the standard type
identifiers). Unfortunately, this isn't as easy in our case. We don't
know what an identifier looks like in each language and we certainly
don't know all the keywords. (On the other hand, see the end of
Section 1.2.2.)
Most of the commands in WEB are
concerned with control of the automatic prettyprinting. Since we
don't prettyprint, many commands are eliminated. A further set of
commands is subsumed by LATEX and may also be eliminated. As a
result, our set of commands is reduced to only four members (explained
in the next section). This simplicity is also reflected in
the size of this tool, which is quite a bit smaller than the tools
used with other approaches.
Nuweb combines the functions of tangle and weave into
a single program that performs both functions at once.
A further reduction in compilation time is achieved by first
writing each output file to a temporary location, then comparing the
temporary file with the old version of the file. If there is no
difference, the temporary file can be deleted. If the files differ,
the old version is deleted and the temporary file renamed. This
approach works well in combination with make (or similar tools),
since make will avoid recompiling untouched output files.
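The compare-before-replace step described above can be sketched in Python. This is a minimal illustration, not nuweb's own code: the function name write_if_changed is hypothetical, and os.replace is used to delete the old version and rename the temporary file in one call.

```python
import filecmp
import os
import tempfile


def write_if_changed(path, contents):
    """Write contents to path only if they differ from the existing file.

    Returns True if the file at path was created or replaced.
    (Hypothetical helper; nuweb's real implementation may differ.)
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Put the temporary file in the target directory so the rename
    # never has to cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", newline="") as tmp:
        tmp.write(contents)

    if os.path.exists(path) and filecmp.cmp(tmp_path, path, shallow=False):
        # No difference: delete the temporary file and leave the old
        # file (and its timestamp) untouched.
        os.remove(tmp_path)
        return False

    # The files differ, or there is no old version: the temporary
    # file takes the old file's place.
    os.replace(tmp_path, path)
    return True
```

Because an unchanged output file keeps its old modification time, make sees nothing newer than the object file and skips the recompilation.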
In addition to producing LATEX source, nuweb can be used to generate HyperText Markup Language (HTML), the markup language used by the World Wide Web. HTML provides hypertext links. When an HTML document is viewed online, a user can navigate within the document by activating the links. The tools which generate HTML automatically produce hypertext links from a nuweb source.
The bulk of a nuweb file will be ordinary LATEX. In fact, any
LATEX file can serve as input to nuweb and will be simply copied
through, unchanged, to the documentation file--unless a nuweb command
is discovered. All nuweb commands begin with an ``at-sign''
(@). Therefore, a file without at-signs will be copied
unchanged. Nuweb commands are used to specify output files,
define macros, and delimit scraps. These are the basic
features of interest to the nuweb tool--all else is simply text to be
copied to the documentation file.
Files and macros are defined with the following commands:
Scraps have specific begin markers and end markers to allow precise control over the contents and layout. Note that any amount of whitespace (including carriage returns) may appear between a name and the beginning of a scrap.
@d Check for terminating at-sequence and return name if found
Therefore, we provide a mechanism (stolen from Knuth) of indicating
abbreviated names.
@d Check for terminating...
Basically, the programmer need only type enough characters to
identify the macro name uniquely, followed by three periods. An abbreviation
may even occur before the full version; nuweb simply preserves the
longest version of a macro name. Note also that blanks and tabs are
insignificant within a macro name; each string of them is replaced by a
single blank.
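The abbreviation rules can be sketched in Python as follows. The class MacroNames and its methods are hypothetical names chosen for illustration, and the sketch assumes every abbreviation is unique, as the text requires; it does not try to detect ambiguous prefixes.

```python
import re


def normalize(name):
    """Replace each run of blanks and tabs with a single blank."""
    return re.sub(r"[ \t]+", " ", name.strip(" \t"))


class MacroNames:
    """Hypothetical name table; each entry is [text, is_full_name]."""

    def __init__(self):
        self.entries = []

    def lookup(self, raw):
        """Return the index of the macro raw refers to, adding it if new."""
        text = normalize(raw)
        is_full = not text.endswith("...")
        if not is_full:
            # Keep only the prefix typed before the three periods.
            text = text[:-3].rstrip()

        for index, (known, known_is_full) in enumerate(self.entries):
            if is_full and known_is_full:
                matches = known == text
            elif is_full:
                matches = text.startswith(known)  # earlier abbreviation
            elif known_is_full:
                matches = known.startswith(text)  # abbreviates a full name
            else:
                matches = known.startswith(text) or text.startswith(known)
            if matches:
                if len(text) > len(known) or (is_full and not known_is_full):
                    # Preserve the longest version seen so far.
                    self.entries[index] = [text, is_full]
                return index

        self.entries.append([text, is_full])
        return len(self.entries) - 1

    def name(self, index):
        return self.entries[index][0]
```

With this table, looking up "Check for terminating..." before the full name and "Check for term..." after it both resolve to the same entry, which ends up holding the full name.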
Sometimes, for instance during program testing, it is convenient to comment
out a few lines of code. In C or Fortran placing /* ... */ around the relevant
code is not a robust solution, as the code itself may contain
comments. Nuweb provides the command
@%
only to be used inside scraps. It behaves exactly the same
as % in the normal LATEX text body.
When scraps are written to a program file or a documentation file, tabs are expanded into spaces by default. Currently, I assume tab stops are set every eight characters. Furthermore, when a macro is expanded in a scrap, the body of the macro is indented to match the indentation of the macro invocation. Therefore, care must be taken with languages (e.g., Fortran) that are sensitive to indentation. These default behaviors may be changed for each output file (see below).
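The two default behaviors can be illustrated with a short Python sketch. It assumes that "the indentation of the macro invocation" means the column at which the invocation begins; the function expand_macro and its parameters are hypothetical and only model the effect, not nuweb's implementation.

```python
TAB_STOP = 8  # the stated default: tab stops every eight characters


def expand_macro(text_before_call, body_lines):
    """Return the output lines for a macro invoked after text_before_call.

    The first body line continues the invoking line; every later line
    is indented with spaces to the column where the invocation began.
    """
    prefix = text_before_call.expandtabs(TAB_STOP)
    indent = " " * len(prefix)
    lines = [prefix + body_lines[0]]
    lines += [indent + line for line in body_lines[1:]]
    # Expand any tabs that remain inside the macro body itself.
    return [line.expandtabs(TAB_STOP) for line in lines]
```

For an invocation preceded by a tab and "if (x) ", the body starts at column 15, so every continuation line of the macro receives 15 leading spaces. In a column-sensitive language such as Fortran this shift is exactly what can break the output.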
When defining an output file, the programmer has the option of using flags to control output of a particular file. The flags are intended to make life a little easier for programmers using certain languages. They introduce slight language dependences; however, they do so only for a particular file. Thus it is still easy to mix languages within a single document. There are three ``per-file'' flags:
#line directives in the
output file. These are useful with C (and sometimes C++ and Fortran) on
many Unix systems since they cause the compiler's error messages to
refer to the web file rather than to the output file. Similarly, they
allow source debugging in terms of the web file.
make files.
We have two very low-level utility commands that may appear anywhere in the web file.
Identifiers must be explicitly specified for inclusion in the
@u index. By convention, each identifier is marked at the
point of its definition; all references to each identifier (inside
scraps) will be discovered automatically. To ``mark'' an identifier
for inclusion in the index, we must mention it at the end of a scrap.
For example,
@d a scrap @{
Let's pretend we're declaring the variables FOO and BAR
inside this scrap.
@| FOO BAR @}
I've used alphabetic identifiers in this example, but any string of
characters (not including whitespace or @ characters) will do.
Therefore, it's possible to add index entries for things like
<<= if desired. An identifier may be declared in more than one
scrap.
In the generated index, each identifier appears with a list of all the scraps using and defining it, where the defining scraps are distinguished by underlining. Note that the identifier doesn't actually have to appear in the defining scrap; it just has to be in the list of definitions at the end of a scrap.
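The way the index is assembled can be sketched in Python. The function names and the matching rule are assumptions made for illustration: an identifier that begins or ends with a letter, digit, or underscore is matched only as a whole word, while other strings such as <<= are matched anywhere. Nuweb's actual matching may differ.

```python
import re


def split_scrap(body):
    """Split a scrap body at @| into (code, list of defined identifiers)."""
    code, _, definitions = body.partition("@|")
    return code, definitions.split()


def occurs(identifier, code):
    """Report whether identifier appears in code (assumed matching rule)."""
    pattern = re.escape(identifier)
    if re.match(r"\w", identifier[0]):
        pattern = r"(?<!\w)" + pattern
    if re.match(r"\w", identifier[-1]):
        pattern = pattern + r"(?!\w)"
    return re.search(pattern, code) is not None


def build_index(scraps):
    """Map each marked identifier to its defining and using scrap numbers.

    scraps maps a scrap number to the text between @{ and @}.
    """
    parsed = {number: split_scrap(body) for number, body in scraps.items()}
    index = {}
    for number, (_, defined) in parsed.items():
        for identifier in defined:
            entry = index.setdefault(identifier, {"defined": [], "used": []})
            entry["defined"].append(number)
    for identifier, entry in index.items():
        for number, (code, _) in parsed.items():
            if number not in entry["defined"] and occurs(identifier, code):
                entry["used"].append(number)
    return index
```

Note that a defining scrap is recorded from the list after @| alone, so the identifier need not occur in that scrap's code, matching the behavior described above.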
Nuweb is invoked using the following command:
nuweb flags file-name...
One or more files may be processed at a time. If a file name has no extension,
.w will be appended. LATEX suitable for
translation into HTML by LATEX2HTML will be produced from
files whose name ends with .hw; otherwise, ordinary LATEX will be
produced. While a file name may specify a file in another directory,
the resulting documentation file will always be created in the current
directory. For example,
nuweb /foo/bar/quux
will take as input the file
/foo/bar/quux.w and will create the
file quux.tex in the current directory.
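The naming rule can be written down as a small Python sketch. The function nuweb_file_names is a hypothetical name, and the sketch assumes that a .hw input also yields a .tex documentation file, since the text says such files produce LATEX.

```python
import os


def nuweb_file_names(argument):
    """Return (input path, documentation file name) for one argument."""
    root, extension = os.path.splitext(argument)
    # A file name without an extension gets .w appended.
    source = argument if extension else argument + ".w"
    # The documentation file is always created in the current
    # directory, so only the base name of the input is kept.
    documentation = os.path.basename(root) + ".tex"
    return source, documentation
```

Applied to /foo/bar/quux, this yields the input /foo/bar/quux.w and the documentation file quux.tex, as in the example above.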
By default, nuweb performs both tangling and weaving at the same time. Normally, this is not a bottleneck in the compilation process; however, it's possible to achieve slightly faster throughput by avoiding one or another of the default functions using command-line flags. There are currently three possible flags:
nuweb -to /foo/bar/quux
would simply scan the input and produce no output at all.
There are several additional command-line flags:
stderr.
Nikos Drakos' LATEX2HTML Version 0.5.3 [1] can be used
to translate LATEX with embedded HTML scraps into HTML. Be sure
to include the document-style option html so that LATEX will
understand the hypertext commands. When translating into HTML, do not
allow a document to be split by specifying ``-split 0''.
You need not generate navigation links, so also specify
``-no_navigation''.
While preparing a web, you may want to view the program's scraps without
taking the time to run LATEX2HTML. Simply rename the generated
LATEX source so that its file name ends with .html, and view
that file. The documentation sections will be jumbled, but the
scraps will be clear.
Because nuweb is intended to be a simple tool, I've established a few restrictions. Over time, some of these may be eliminated; others seem fundamental.
@
signs.
@O or @D (instead of @o and @d).
This doesn't work very well as a default, since far too many short
scraps will be broken across pages; however, as a user-controlled
option, it seems very useful. No distinction is made between the
upper case and lower case forms of these commands when generating
HTML.
Several people have contributed their time, ideas, and debugging skills. In particular, I'd like to acknowledge the contributions of Osman Buyukisik, Manuel Carriba, Adrian Clarke, Tim Harvey, Michael Lewis, Walter Ravenek, Rob Shillingsburg, Kayvan Sylvan, Dominique de Waleffe, and Scott Warren. Of course, most of these people would never have heard of nuweb (or many other tools) without the efforts of George Greenwade.
Since maintenance has been taken over by Marc Mengel, online contributions have been made by:
<wb@fnal.gov>
<n.d.vanforeest@math.utwente.nl>
<jgoizueta@jazzfree.com>
<karp@hp.com>
This document was generated using the LaTeX2HTML translator Version 2K.1beta (1.47)
The command line arguments were:
latex2html -split 0 nuwebdoc.tex
The translation was initiated by Marc Mengel on 2002-02-25