Fri, 29th Jul 2011 (HowDidIDoThat :: LaTeX)
The internet Class
This is a class designed for those who want to use LaTeX to publish material on the internet. As it becomes more common to publish material via a content-management system, it becomes rarer to generate (X)HTML documents directly, and content for the web is usually written in some limited markup language. This class is designed to make it possible to author material for such systems using the facilities of LaTeX.
It is important at the outset to know what this class is designed for and the easiest way to do that is to explain what it is not for. It is not intended as a way to take an already authored LaTeX document and convert it into a format suitable for putting on the internet. It is also not intended as a "plugin" for a content-management system so that authors can use LaTeX as the input format for their blog, forum, wiki, or whatever.
The first of these is both impossible and undesirable. It is impossible for the following simple reason. Because TeX controls the whole process of creating its output, it can do some amazing things. A web document, on the other hand, is extremely malleable and the reader can transform it considerably; therefore, not everything that TeX can do can be (easily) done on the web. It is possible, via a considerable amount of trickery, to get quite close, but only at the expense of this malleability. And this malleability is a good thing, as it allows readers to make the document as easy for them to access as possible. This is why it is undesirable to support absolutely everything that LaTeX can do.
The second, the plugin, is also undesirable. There is a good reason that the current input formats were chosen: they are simple. They are simple for a human to understand: when writing on a wiki, it would be extremely inconvenient to have to learn the current page's set of macros, so there is an advantage to having them consistent across the whole system. They are also simple for a computer to understand: parsing a file in, say, markdown is extremely quick and can be done in real time by scripting languages such as Perl and PHP. Parsing TeX is much harder due to its complexity, and so the scripting language either has to limit the syntax or has to pass the input to an external program, both of which can put severe limitations on the system.
So this package is designed for someone who wants to write a web page, not necessarily directly, using their LaTeX skills. They want the full flexibility of LaTeX together with its familiarity (presumably they write other documents in LaTeX already) but know at the outset that the document will end up on the web.
Usage of this package is extremely simple. It is designed as a class and so should be loaded with a simple:
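As a minimal sketch of that loading line, assuming the class file is installed as internet.cls (as the title suggests) and passing no options, since the option names are not reproduced here:

```latex
% Assumption: the class is installed as internet.cls.
% No class options are passed; their names are not given in this text.
\documentclass{internet}

\begin{document}
Some text destined for the web.
\end{document}
```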
There are various options available for the class (these are processed internally by the
The formats are as follows.
There is also the matter of putting mathematics on the web. Some systems load a particular method by default, but this can be overridden using the
One File, Many Outputs
Although the intention is that a document written using this system be written knowing the eventual output, it is certainly not unreasonable to use a single file for several outputs, or to use a single fragment in documents on different systems. Whilst the aim is for the same input to work for all outputs, there will be times when one output type requires slightly different input from another. For that situation, the
Let us take
The second use,
(There is not yet support for specifying multi-modes.)
pdftotext -enc ASCII7 -layout -nopgbrk texfile.pdf
Since line-breaks and indentation are often significant in text formats, the text modes define an extremely wide page in the hope that no paragraph will be quite that long. Whilst
Too many to mention!
The major limitation is to do with external packages. Most external packages will not work directly with this class (at least, in a text mode). This harks back to the problem of taking an already-written document and converting it. LaTeX packages were written with the understanding that the final outcome would be a static document, not some text that will be further processed.
That is not to say that no packages will ever be supported. Packages could be supported by writing an alternative which translates the commands to a sensible output. However, this will need to be done on a case-by-case basis as the need arises.
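To give a flavour of what such an alternative might look like, here is a purely illustrative sketch; it is not code from this class. It stands in for the \href command of the hyperref package and writes the link out as literal Markdown-style text. The choice of Markdown as the target and the restriction to URLs without TeX special characters are both assumptions made for the example.

```latex
\documentclass{article}
% Illustrative only: a stand-in for hyperref's \href that emits the link
% as literal text, [text](url), instead of a PDF hyperlink.
% Assumption: the URL contains no TeX special characters (#, %, _, &, ~).
\newcommand{\href}[2]{[#2](#1)}

\begin{document}
See \href{http://example.org/}{an example site} for details.
% Typesets as: See [an example site](http://example.org/) for details.
\end{document}
```

A real replacement would also have to cope with special characters in the URL and with the other commands the package provides, which is why this has to be done case by case.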
Thus, for now, it is best to put all the
Obtaining the Code
This package is still in its early days and so is not yet on CTAN. It can be downloaded here as a zip file or a tar.gz file. The download also contains the source of this document, which was written using this class.