Let me first define what LaTeX is and what its primary goals are. LaTeX is a huge add-on macro package for the TeX typesetting system developed by Prof. Donald E. Knuth. If we are not overly picky, we mean ``TeX plus all LaTeX macros'' when we say ``LaTeX system'' or just ``LaTeX''. LaTeX itself was written by Leslie Lamport, who found TeX to be very powerful, but too difficult for everyday use. Therefore he modeled LaTeX after the Scribe system. Scribe puts its emphasis on the logical structure of a document instead of the physical markup. (For readers proficient in HTML: tag <em> is an example of logical markup, and tag <i> is the corresponding physical markup.)
LaTeX -- like plain TeX -- allows a normal computer user to typeset documents with production-ready quality. The intention is that a LaTeX author prepares articles or even books on her local computer, then walks over to the print shop with a diskette to have the document printed on a high-resolution phototypesetter, and finally has it bound as a book (... ships the book off to all bookstores in the alpha-quadrant, makes millions from it, and two years later wins the Intergalactic Pulitzer Prize. -- OK, this is a bit of a stretch).
In the next sections I will introduce LaTeX very briefly, but I would like to recommend the Not So Short Introduction to LaTeX to everyone who wants to learn LaTeX. The 95-page document is available for free on the Net. Please see ``Further Reading'' for details.
LaTeX gets installed by most current Linux distributions. You can check whether it is available on your machine by asking
latex --version
at the command line. My system responds with
TeX (Web2C 7.3.1) 3.14159
kpathsea version 3.3.1
Copyright (C) 1999 D.E. Knuth.
Kpathsea is copyright (C) 1999 Free Software Foundation, Inc.
There is NO warranty. Redistribution of this software is covered by
the terms of both the TeX copyright and the GNU General Public License.
For more information about these matters, see the files named COPYING
and the TeX source.
Primary author of TeX: D.E. Knuth.
Kpathsea written by Karl Berry and others.
Here is an example of a very short, yet complete LaTeX document:
\documentclass{article}   % preamble
\pagestyle{empty}
\begin{document}          % body
Here comes the text.
\end{document}
Every LaTeX document consists of a preamble and a body. The preamble reaches from the definition of the document's class, \documentclass[options]{class}, up to, but excluding, \begin{document}. The body is everything from \begin{document} to \end{document}.
The preamble in the example features only one command, \pagestyle{empty}, which instructs LaTeX to omit all page decorations such as running heads or page numbers. The percent signs introduce comments that extend to the ends of the respective lines.
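To illustrate the \documentclass[options]{class} form, here is a small sketch of the same document with two class options. The options 11pt and a4paper are my choice for this example; both are standard options of class article, but any other valid options would do.

```latex
\documentclass[11pt,a4paper]{article} % preamble starts here; options are illustrative
\pagestyle{empty}                      % no running heads, no page numbers
\begin{document}                       % preamble ends, body starts
Here comes the text.
\end{document}
```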
If we apply these simple rules to the three different versions of two paragraphs that follow, we conclude that they all will be typeset the same. I have added line numbers at the beginning of each line to point out empty lines, which separate the paragraphs. The numbers are not part of the text.
1  I am a short sentence in the first paragraph.
2
3  I'm the only sentence in the second paragraph.

1  I am a short sentence
2  in the first paragraph.
3
4  I'm the
5  only sentence
6  in the second
7  paragraph.

1  I am a short sentence in the first paragraph.
2
3
4  I'm the only sentence
5  in the
6  second paragraph.
I have collected the most important special characters along with ways to insert them literally into a text.
\dots
'' or ``\/
''.
Note that ``\\'' does not insert a single backslash character into the text as many C-programmers might assume right now. The control sequence ``\\'' inserts a line break, whereas a literal backslash is produced by ``$\backslash$''. To maximize the confusion, ``\ ''--this is a backslash followed by a blank space--is a command, too! It inserts a so-called control space, a space (more precisely: exactly one space) that is never eaten up like ordinary spaces as explained in section ``Paragraphs''.
You get literal curly braces by quoting them with a backslash like this ``\{'' and ``\}''.
Comments extend up to and include the newline character at the end of a line. Thus LaTeX comments differ from one-line comments in most general-purpose programming languages, as those exclude the newline character. For the user this means that she can mask a newline by ending a line with a comment.
Hessenberg-%
Triangular %   <- note space directly in front of the %-sign
Reduction
is equivalent to
Hessenberg-Triangular Reduction
To typeset a literal percent sign, use ``\%''.
The sequence ``$math$'' is typeset inline in mathematical typesetting mode. To get a literal dollar sign, use ``\$''.
The following table summarizes all ASCII characters that are treated specially by LaTeX. The rightmost column of the table suggests one or more possible equivalent sequences to get the plain ASCII character into the text. As can be guessed from the entries for caret and twiddle, \char code_number inserts the ASCII character with the decimal index code_number into a document.
ASCII characters that are special for LaTeX. The right column denotes the strings (in LaTeX) which produce the ASCII characters in the middle column.
Name | ASCII | LaTeX |
---|---|---|
sharp | # | \# |
dollar | $ | \$ |
percent | % | \% |
ampersand | & | \& |
multiplication sign | * | * or $*$ |
minus sign | - | $-$ |
less-than sign | < | $<$ |
greater-than sign | > | $>$ |
backslash | \ | $\backslash$ |
caret | ^ | \char94 |
underscore | _ | \_ |
curly braces | { , } | \{ , \} |
vertical bar | | | $|$ |
twiddle | ~ | \char126 |
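The sequences from the table can be tried out in one go. The following is only a sketch: the sentences are made up for the demonstration, and the file name my_file is hypothetical.

```latex
\documentclass{article}
\begin{document}
% Each special character is written with its sequence from the table.
Item \#7 costs \$5, that is 10\,\% less than at Smith \& Sons.

The set \{a, b\} is stored in \texttt{my\_file}.

In math mode: $a < b$, $b > c$, $|x|$, $a - b$.

A backslash $\backslash$, a caret \char94, and a twiddle \char126.
\end{document}
```

If any of the backslashes in front of #, $, %, & or _ is dropped, LaTeX either stops with an error or, in the case of %, silently treats the rest of the line as a comment.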
LaTeX commands start with a backslash ``\'' and either extend from the backslash to the next non-letter character (kind 1) or consist of the backslash and exactly one non-letter character (kind 2). So ``\raggedleft'' and ``\makebox'' are commands of kind 1 whereas ``\\'' and ``\"'' are commands of kind 2. Arguments are passed to commands within curly braces ``{'', ``}''. Empty arguments can be omitted.
Examples:
\raggedleft{} % no argument
\raggedleft % same as above
\makebox{Text inside of a box.} % single argument
\parbox{160pt}{This text is typeset inside of a box.} % two arguments
The number of arguments passed to a command is fixed. However, some commands accept optional parameters. These are passed within square brackets (``['', ``]'') and usually precede the arguments just as the options precede the arguments in most UN*X utility programs.
Example:
\parbox[t]{10cm}{I am a top-aligned paragraph.} % one option, two arguments
Here t is the optional parameter.
Spaces that follow a kind 1 command name without arguments (like the second ``\raggedleft'' above) are ``eaten''; they are not passed on to the output.
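The effect of eaten spaces is easiest to see with a command that prints something. The sketch below uses the standard logo command \LaTeX, which does not appear elsewhere in this article, and shows the two usual ways to keep the space.

```latex
\documentclass{article}
\begin{document}
% The space after \LaTeX is eaten: the logo runs into the next word.
\LaTeX is fun.

% An empty group ends the command name, so the space survives.
\LaTeX{} is fun.

% A control space (backslash followed by a blank) is never eaten either.
\LaTeX\ is fun.
\end{document}
```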
\begin{environment}
Text within the environment.
\end{environment}
An environment changes the appearance of the text within it. Environments control the alignment, the width of the margins and many other things. Some predefined environments are: center, description, enumerate, flushleft, flushright, itemize, list, minipage, quotation, quote, tabbing, table, tabular, verbatim, and verse.
Environments do nest. For example, to get a quotation typeset flush against the right margin, use the flushright environment and the quotation environment.
\begin{flushright}
\begin{quotation}
Letters are things, \\
not pictures of things. \\
-- Eric Gill
\end{quotation}
\end{flushright}
An environment only affects text inside of it; it encapsulates all changes, like a different indentation occurring within the environment. (Well -- unless you happen to change a global variable, but I won't tell you how to do that, so you're safe.)
LaTeX knows three or four heading levels depending on the document class. Class article has three section levels, whereas classes book and report feature chapters as a fourth and topmost heading level.
\chapter{heading}        % only for classes book and report
\section{heading}
\subsection{heading}
\subsubsection{heading}
Note that as in POD, discussed in Part I, sectioning commands act as separators. They do not group together text with a start marker and an end marker, but their mere appearance groups the text. This will be different in DocBook, as I shall show in next month's article.
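A short sketch shows the separator behavior. The headings and texts are invented for the illustration; the point is that no command closes a section, the next heading does.

```latex
\documentclass{article}
\begin{document}
\section{Introduction}
Text of the introduction.   % belongs to section 1 until the next heading

\subsection{Motivation}
Text of subsection 1.1.

\section{Summary}           % implicitly ends section 1 and subsection 1.1
Text of the summary.
\end{document}
```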
LaTeX ships with three kinds of list-generating environments: itemize, enumerate, and description.
They correspond to unnumbered lists, numbered lists, and definition lists in HTML, or =item *, =item 1, =item term lists in POD.
The items themselves are introduced with ``\item''. An item can consist of more than one paragraph.
For description lists the optional parameter given to ``\item'' as in ``\item[term]'' specifies the term. The text following ``\item[term]'' is term's definition.
Examples:
What emacs can do for you:
\begin{itemize}
\item Cut and paste blocks of text
\item Fill or justify paragraphs
\item Spell check documents
\end{itemize}

Starting emacs for the first time
\begin{enumerate}
\item Start emacs from the command line:

      \texttt{\$ emacs}

      emacs will show you its startup screen and soon switch to
      a buffer called \texttt{*scratch*}.

\item Hold down the Control~key and press~H. You see a prompt at
      the bottom of the screen (or emacs window).

      \texttt{C-h (Type ? for further options)-}

\item Press~T to start the emacs tutorial.
\end{enumerate}

Some emacs commands:
\begin{description}
\item[C-x C-c] Quit emacs.
\item[C-x C-f] Open a file.
\item[C-x r k] Kill rectangle defined by mark and point,
               that is, by the active region.
\end{description}
All cross references need two parts: a pointer (the link) and a pointee (the anchor). Anchors in LaTeX are inserted with \label{anchor-name}. Every anchor is located in a particular section and on a particular page. These two pieces of information are retrieved with \ref{anchor-name} and \pageref{anchor-name} at any place in the document.
Example use of \ref:
\section{Setup}\label{section:setup}
...

\section{Summary}\label{section:summary}
As has been pointed out in section~\ref{section:setup} `Setup', ...
Example use of \pageref:
\section{Setup}\label{section:setup}
The steel used in the sample chamber is alloyed with Ti (0.5\%), Cr (0.1\%), and Mn (0.1\%).\label{definition:chamber-alloy}

\section{Experiments}\label{section:experiments}
The sample chamber is made of stainless steel (see page~\pageref{definition:chamber-alloy} for the exact metallurgical composition), ...
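Put together into a complete document, the two kinds of references look like the following sketch. The label definition:chamber and the sentences are shortened stand-ins for the examples above.

```latex
\documentclass{article}
\begin{document}
\section{Setup}\label{section:setup}
The chamber is described here.\label{definition:chamber}

\section{Summary}\label{section:summary}
See section~\ref{section:setup} on page~\pageref{definition:chamber}.
\end{document}
```

LaTeX collects the labels in an auxiliary file during one run and resolves them in the next, so the file has to be processed twice. After the first run the references show up as ``??'' and LaTeX warns about undefined references.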
One of the major advantages of the LaTeX typesetting system is that it allows the user to define her own commands and environments. Say you want to mark up all replaceable parameters in the description of a UN*X utility, as in
cd directory
to be rendered as, for example,
cd directory
Here, cd is the utility's name, and directory is the replaceable parameter.
Often utility names are typeset in bold face, and replaceable parameters in italics. Thus, a good solution would be to write
\utilityname{cd} \replaceable{directory}
where \utilityname and \replaceable switch fonts to bold face and italics respectively. With the help of \utilityname and \replaceable we can consistently mark up further command lines:
\utilityname{pushd} \replaceable{directory}
\utilityname{ls} \replaceable{filename}
To define a new LaTeX command, use
\newcommand{command-name}[number-of-arguments]{command-sequence}
where command-name is the new command's name, number-of-arguments is the number of arguments the new command takes (it defaults to 0 if omitted), and command-sequence are the LaTeX commands to execute when command-name is called.
For our example, define \utilityname and \replaceable as:
\newcommand{\utilityname}[1]{\textbf{#1}}
\newcommand{\replaceable}[1]{\textit{#1}}
The predefined commands \textbf and \textit switch fonts to text bold face (as opposed to math bold face) and text italic. Arguments are referred to by #digit, where digit takes on values from 1 to 9.
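A command with more than one argument works the same way. In the sketch below, \commandline is a hypothetical helper that I introduce only for illustration; it takes the utility as #1 and its replaceable parameter as #2.

```latex
\documentclass{article}
\newcommand{\utilityname}[1]{\textbf{#1}}
\newcommand{\replaceable}[1]{\textit{#1}}
% \commandline is a hypothetical helper: #1 is the utility, #2 its parameter.
\newcommand{\commandline}[2]{\utilityname{#1} \replaceable{#2}}
\begin{document}
\commandline{cd}{directory}

\commandline{ls}{filename}
\end{document}
```

Because \commandline is built on top of \utilityname and \replaceable, a later change to either of them carries over automatically.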
To give you an impression of the usefulness of our newly defined commands, suppose we would like to generate an index entry for each utility that is mentioned in the text. Command \index{term} puts term in the index. We only need to modify the definition of \utilityname to
\newcommand{\utilityname}[1]{\textbf{#1}\index{#1}}
and are done. (For the curious: index levels are separated with exclamation marks. So, we probably would prefer \index{utility!#1} as it neatly groups all utilities together. See the documentation of makeindex for details.)
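For an index to actually appear in the printed document, a little more scaffolding is needed. The following sketch assumes the standard makeidx package and groups the entries under ``utility'' using makeindex's ``!'' level separator; the sentence in the body is invented.

```latex
\documentclass{article}
\usepackage{makeidx}   % provides \printindex
\makeindex             % write index entries to the .idx file
\newcommand{\utilityname}[1]{\textbf{#1}\index{utility!#1}}
\begin{document}
Change directories with \utilityname{cd} and list files
with \utilityname{ls}.

\printindex            % typeset the index prepared by makeindex
\end{document}
```

The usual cycle is to run latex, then the makeindex program on the resulting .idx file, and then latex once more so that \printindex picks up the sorted index.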
New environments are defined with
\newenvironment{environment-name}[number-of-arguments]{starting-sequence}{ending-sequence}
the only difference being that \newenvironment requires two command sequences: one to open the environment, starting-sequence, and one to close it, ending-sequence. Continuing the example of a quotation typeset flush right against the page's margin, we define our own quotation environment:
\newenvironment{myquotation}%   Note: "%" masks newline
    {\begin{flushright}\begin{quotation}}%
    {\end{quotation}\end{flushright}}
which is then used like this:
\begin{myquotation}
Letters are things, \\
not pictures of things. \\
-- Eric Gill
\end{myquotation}
Neither commands nor environments can be defined multiple times with \newcommand or \newenvironment. These commands only serve for first-time definitions. Redefinitions are done with \renewcommand and \renewenvironment, which take the same arguments as their first-time cousins.
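As a minimal sketch of a redefinition, assume we decide that utility names should be typeset in typewriter font instead of bold face. The choice of \texttt is mine and only serves the example.

```latex
\documentclass{article}
\newcommand{\utilityname}[1]{\textbf{#1}}      % first-time definition
% A second \newcommand{\utilityname} here would stop with an error.
\renewcommand{\utilityname}[1]{\texttt{#1}}    % redefinition is allowed
\begin{document}
Run \utilityname{ls} to list files.            % typeset in typewriter font
\end{document}
```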
LaTeX offers an extremely rich set of inline markup. I restrict the discussion to the same inline markup changes which I discussed for Perl's plain old documentation format: emphasis, italics, bold face, and typewriter (code) font.
\textit{argument} -- Typeset argument in text italics.
\emph{argument} -- Emphasize argument. The default configuration switches to and from italics depending on the current font setting. If the current font is upright, \emph uses italics; if the current font is italics, it uses an upright font. This way the emphasized parts of text always stand out.
Why have \textit and at the same time \emph? The commands express different requests. \textit unconditionally demands the argument to be typeset using an italics font. Period. \emph on the other hand asks for emphasizing its argument, whatever the emphasis may look like. The default uses an italics font as explained above, but \emph can be redefined to use a bold font, underlining, or anything else the writer imagines for her preferred method of emphasizing. The command name emph always captures the concept of emphasis and hides the implementation.
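The difference can be tried out with a few lines. The redefinition at the end is only a sketch under the assumption that the writer prefers bold emphasis; note that such a simple redefinition no longer toggles inside text that is already bold.

```latex
\documentclass{article}
\begin{document}
An upright sentence with an \emph{emphasized} word.   % word in italics

\textit{An italic sentence with an \emph{emphasized} word.} % word upright

% Assumption for this sketch: the writer prefers bold emphasis.
\renewcommand{\emph}[1]{\textbf{#1}}
Now the \emph{emphasized} word is bold.
\end{document}
```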
\textbf{argument} -- Typeset argument in text bold face.
Based on \textbf, we can define our own logical markup commands, like for example
\newcommand{\important}[1]{\textbf{#1}}
\texttt{argument} -- Typeset argument in text typewriter font.
As with \textbf, \texttt can be wrapped into user-defined commands:
\newcommand{\sourcecode}[1]{\texttt{#1}}
LaTeX files usually carry the extension tex. LaTeX translates these tex files into so-called device independent (dvi) files. dvi files are a binary representation of the typeset source. They can be previewed with dvisvga on the console (given the terminal supports high-resolution graphics), or, for example, with xdvi under the X11 windowing system. Often dvi files are converted to PostScript with the dvips tool. If Portable Document Format is desired, pdflatex transforms tex files into pdf files in a single step.
So far so good. LaTeX makes wonderful-looking PostScript documents, and its pdf sibling does the same, but outputs Portable Document Format files. Didn't we say we want HTML, too? Sure, we did! But LaTeX cannot help us here; we need another tool: latex2html. This tool transforms a LaTeX source file into a set of html files that are properly linked together according to the source file's structure.
latex2html has a home page at http://www.latex2html.org where it is available for download. It can also be obtained from http://www.ctan.org or, better, from one of its many mirrors. To see whether it is installed on your Linux system, try
latex2html --version
and you should get an answer like
This is LaTeX2HTML Version 2K.1beta (1.57) by Nikos Drakos, Computer Based Learning Unit, University of Leeds.
What do I have to change to make my LaTeX document translatable with latex2html? -- Good news: almost nothing! Just ensure that the packages html and makeidx are referenced in the document's preamble, that is, at least add
\usepackage{html,makeidx}
to it. Now file my_document.tex can be translated to HTML with the call
latex2html my_document.tex
latex2html takes care of almost all issues that arise when a LaTeX file is translated into a set of html files. However, references to other parts in the document or other documents are conceptually different in printed documentation and HTML. Consider the LaTeX snippet
In the following, we will summarize the findings using a cylindrical coordinate system. See page~\pageref{definition:coordinate-system} for the definition of the coordinate system.
where LaTeX dutifully replaces \pageref{definition:coordinate-system} with the page number on which \label{definition:coordinate-system}, the anchor of the page reference, occurs. Where is the problem? First, a set of html pages does not have a rigid notion of a ``page number''. Second, latex2html does replace \pageref{definition:coordinate-system} with a hyperlink to the spot where \label{definition:coordinate-system} is rendered. The link is a dark square for graphical browsers or the marker ``[*]'' for text browsers. But the whole construct looks awkward, almost distracting, and this is not latex2html's fault:
In the following, we will summarize the findings using a cylindrical coordinate system. See page [*] for the definition of the coordinate system.
latex2html needs our help! The paragraph that contains the reference ought to be rephrased for the on-screen version, for example to:
In the following, we will summarize the findings using a <a>cylindrical coordinate system</a>.
where I have indicated the hyperlink with HTML anchor tags. To allow for two different versions depending on the output format, latex2html defines the \hyperref command.
\hyperref[reference-type]{text for html version}{pre-reference text for LaTeX version}{post-reference text for LaTeX version}{label the reference refers to}
The optional parameter reference-type selects the counter the reference refers to:
``ref'' -- refer to the label as \ref does. The reference text is the section number (``4'', ``1.5.2'', ``3.4.2.1'', etc.).
``page'' or ``pageref'' -- refer to the label as \pageref does. The reference text is a page number (``25'', ``xxiii'', etc.).
Rewritten with \hyperref our example looks like this
In the following, we will summarize the findings using a
\hyperref[pageref]%
    {cylindrical coordinate system}%                 for HTML
    {cylindrical coordinate system. See page~}%      for LaTeX
    { for the definition of the coordinate system}%  trailing text for LaTeX
    {definition:coordinate-system}.%                 label the reference refers to
LaTeX renders it to
In the following, we will summarize the findings using a cylindrical coordinate system. See page 97 for the definition of the coordinate system.
and latex2html produces
In the following, we will summarize the findings using a cylindrical coordinate system.
from it.
A problem related to the one we have just encountered with references arises when hyperlinks come into play. In the HTML version of the document hyperlinks are essential; in the printed version, they are of little use: compare ``Click here'' with ``Press your pencil against this letter''. Sometimes, however, the author really wants to include the target of the hyperlink, a uniform resource locator (URL), in the printed text. latex2html defines two commands that cater to exactly these needs.
\htmladdnormallink{link text}{uniform resource locator}
\htmladdnormallinkfoot{link text}{uniform resource locator}
Both commands generate the hyperlink <a href="uniform resource locator">link text</a> in the HTML version. The first only renders link text in the LaTeX version, suppressing uniform resource locator completely. The second adds a footnote containing uniform resource locator. The typical usage of these commands is
The text of this article can be downloaded from our \htmladdnormallink{web site}{http://www.linux-gazette.org}.
and
The text of this article can be downloaded from our \htmladdnormallinkfoot{web site}{http://www.linux-gazette.org}.
where the LaTeX result of the first looks like this
The text of this article can be downloaded from our web site.
For the second, ``web site'' gets a footnote marker, and a footnote with the URL is placed at the bottom of the page. The HTML output will show up both times as
The text of this article can be downloaded from our web site.
As a last resort several commands and environments enable the writer to divert her text between LaTeX and HTML versions of the document:
\latex{short text for LaTeX only}
\html{short text for HTML only}
\latexhtml{short text for LaTeX only}{short text for HTML only}
\begin{latexonly} text for LaTeX only \end{latexonly}
\begin{htmlonly} text for HTML only \end{htmlonly}
I recommend using diversion of output only if no more specialized latex2html command or environment can produce the desired markup, because splitting always requires keeping both branches in sync.
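For completeness, here is a sketch of how the diversion commands might be used. It assumes that package html is loaded in the preamble; the label table:results and all sentences are hypothetical.

```latex
% Assumes \usepackage{html} in the preamble.
The results are listed in
\latexhtml{the table on page~\pageref{table:results}}%
          {the results table}.

\begin{latexonly}
Turn to the appendix for the full data set.
\end{latexonly}
\begin{htmlonly}
Follow the links in the navigation bar for the full data set.
\end{htmlonly}
```

Both branches say the same thing in a form that suits their medium, and both have to be edited whenever the statement changes.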
lshort
on your local Linux system, or use the search facilities at www.ctan.org to find a mirror close to you.
For a beginner the hypertext pages can replace neither the Short Introduction nor Lamport's book. For the intermediate LaTeX user, however, they are a valuable help in case printed documentation is out of reach.
Next month: DocBook