This document is intended for project members, for developers who wish to use or modify the source code, and for interested users.
The project is managed with Maven 1. It is written in Java 1.5 and depends on the following packages:
To build and install the project as a Maven plugin, just execute the Maven-goal plugin:install.
Because the command-line version is run from a start script, there are two Maven-goals: dist.app.windows builds a command-line version with a Windows batch script, and dist.app.unix builds a version with a Unix shell script.
There is currently no Maven-goal to create an Ant-task. Building it manually is straightforward, though: build the jar with the Maven-goal jar:jar, and use it as described in the usage instructions.
The documentation can simply be built by executing the Maven-goal site.
The conversion is divided into two steps: first the LaTeX document is converted into an object model, which in a second step is passed to a specific renderer class that creates the final output. This ensures that new renderers (new output formats) can be added more easily and are independent of TeX syntax. Another benefit of the intermediate object model is that different parsers could be created for different input formats.
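The two-step design described above can be sketched roughly as follows. This is a minimal illustration only: every name in it (DocumentModel, SourceReader, OutputRenderer and the toy implementations) is hypothetical and not taken from the TeXConverter sources, and the "reader" merely splits plain text into paragraphs instead of parsing LaTeX.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names come from the TeXConverter sources. */
final class TwoStepConversionSketch {

    /** Stand-in for the intermediate object model: an ordered list of paragraphs. */
    static final class DocumentModel {
        final List<String> paragraphs = new ArrayList<>();
    }

    /** Step 1: a reader turns one input format into the object model. */
    interface SourceReader {
        DocumentModel read(String source);
    }

    /** Step 2: a renderer turns the object model into one output format. */
    interface OutputRenderer {
        String render(DocumentModel model);
    }

    /** Toy reader: blocks separated by blank lines become paragraphs. */
    static final class BlankLineReader implements SourceReader {
        @Override
        public DocumentModel read(String source) {
            DocumentModel model = new DocumentModel();
            for (String block : source.split("\\n\\s*\\n")) {
                String text = block.trim();
                if (!text.isEmpty()) {
                    model.paragraphs.add(text);
                }
            }
            return model;
        }
    }

    /** Toy HTML renderer (no escaping of special characters). */
    static final class HtmlRenderer implements OutputRenderer {
        @Override
        public String render(DocumentModel model) {
            StringBuilder out = new StringBuilder();
            for (String paragraph : model.paragraphs) {
                out.append("<p>").append(paragraph).append("</p>\n");
            }
            return out.toString();
        }
    }

    /** Toy plain-text renderer: paragraphs separated by one blank line. */
    static final class PlainTextRenderer implements OutputRenderer {
        @Override
        public String render(DocumentModel model) {
            return String.join("\n\n", model.paragraphs);
        }
    }

    /** The conversion itself only chains the two steps. */
    static String convert(String source, SourceReader reader, OutputRenderer renderer) {
        return renderer.render(reader.read(source));
    }

    public static void main(String[] args) {
        String source = "First paragraph.\n\nSecond paragraph.";
        System.out.print(convert(source, new BlankLineReader(), new HtmlRenderer()));
        System.out.println(convert(source, new BlankLineReader(), new PlainTextRenderer()));
    }
}
```

Because the renderers in this sketch only see the model, adding an output format means adding one more OutputRenderer, and adding an input format means adding one more SourceReader.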
The object model is designed to represent the content and logical structure of the document, not to store layout information. The layout of the output is the responsibility of the renderers, and thus can be tailored to match the specific output formats. For example, there could be two kinds of HTML renderers: one that creates a single long webpage, and another that spreads the document over several pages (for example, one page per section) and links them together via a table of contents page.
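To illustrate how layout can be left entirely to the renderers, here is a small sketch of the two HTML variants mentioned above, both working on the same content. The classes Section, SinglePageRenderer and PagePerSectionRenderer, as well as the file names, are assumptions made for this example and do not exist in the project under these names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; names and file layout are assumptions for illustration. */
final class HtmlLayoutSketch {

    /** Content only: a section knows its title and text, nothing about pages. */
    static final class Section {
        final String title;
        final String body;

        Section(String title, String body) {
            this.title = title;
            this.body = body;
        }
    }

    /** A renderer returns the generated files as a map from file name to content. */
    interface LayoutRenderer {
        Map<String, String> render(List<Section> sections);
    }

    private static String sectionHtml(Section section) {
        return "<h1>" + section.title + "</h1><p>" + section.body + "</p>";
    }

    /** Variant 1: everything on one long page. */
    static final class SinglePageRenderer implements LayoutRenderer {
        @Override
        public Map<String, String> render(List<Section> sections) {
            StringBuilder page = new StringBuilder();
            for (Section section : sections) {
                page.append(sectionHtml(section));
            }
            Map<String, String> files = new LinkedHashMap<>();
            files.put("index.html", page.toString());
            return files;
        }
    }

    /** Variant 2: one page per section, linked from a table of contents page. */
    static final class PagePerSectionRenderer implements LayoutRenderer {
        @Override
        public Map<String, String> render(List<Section> sections) {
            Map<String, String> files = new LinkedHashMap<>();
            StringBuilder toc = new StringBuilder("<ul>");
            for (int i = 0; i < sections.size(); i++) {
                Section section = sections.get(i);
                String fileName = "section" + (i + 1) + ".html";
                files.put(fileName, sectionHtml(section));
                toc.append("<li><a href=\"").append(fileName).append("\">")
                   .append(section.title).append("</a></li>");
            }
            toc.append("</ul>");
            files.put("index.html", toc.toString());
            return files;
        }
    }

    public static void main(String[] args) {
        List<Section> sections = new ArrayList<>();
        sections.add(new Section("Introduction", "Some text."));
        sections.add(new Section("Usage", "More text."));
        System.out.println(new SinglePageRenderer().render(sections).keySet());
        System.out.println(new PagePerSectionRenderer().render(sections).keySet());
    }
}
```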
Below is an example of how a simple LaTeX document would be represented in the object model. The connecting lines do not represent inheritance; they show parent-child relations, since many object model classes can have child nodes (similar to the W3C DOM).
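The parent-child relation can also be sketched in code. The following is only an approximation of the idea: the generic Node class and the labels used here are hypothetical, whereas the real object model in org.texconverter.dom uses its own dedicated classes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a tree with child nodes; not the real org.texconverter.dom classes. */
final class ObjectTreeSketch {

    /** A node holds a label and an ordered list of children (composition, not inheritance). */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label) {
            this.label = label;
        }

        /** Appends a child and returns this node so that calls can be chained. */
        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Tree for a document like: \section{Intro} Hello \textbf{world} */
    static Node example() {
        return new Node("Document")
            .add(new Node("Section: Intro")
                .add(new Node("Paragraph")
                    .add(new Node("Text: Hello"))
                    .add(new Node("Bold")
                        .add(new Node("Text: world")))));
    }

    /** Renders the tree as an indented outline, two spaces per level. */
    static String outline(Node root) {
        StringBuilder out = new StringBuilder();
        appendOutline(root, 0, out);
        return out.toString();
    }

    private static void appendOutline(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.label).append('\n');
        for (Node child : node.children) {
            appendOutline(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        System.out.print(outline(example()));
    }
}
```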
There are two main modules, the reader (LaTeX document to object tree; package org.texconverter.reader.tex) and the renderer (object tree to final output; org.texconverter.renderer). The classes of the object model reside in the package org.texconverter.dom.
The reader module itself can be grouped into four important parts:
The builder (org.texconverter.reader.tex.builder), which is responsible for constructing the object tree and storing context information.
The tokenizer (org.texconverter.reader.tex.parser.Tokenizer), which segments the LaTeX input into distinct tokens for easier processing.
The parser (org.texconverter.reader.tex.parser), which processes the TeX tokens. Necessary changes to the context or the object tree are communicated to the builder module.
The command handlers (org.texconverter.reader.handler), which process LaTeX or TeX commands. Necessary changes to the context or the object tree are communicated to the builder module.
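As an illustration of the tokenizing step, here is a strongly simplified sketch that segments LaTeX input into command, group and text tokens. It is not the project's Tokenizer class: the token kinds and method names are assumptions, and real TeX tokenizing (comments, category codes, math mode and so on) is considerably more involved.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified tokenizer; not the real org.texconverter Tokenizer. */
final class TokenizerSketch {

    enum Kind { COMMAND, GROUP_BEGIN, GROUP_END, TEXT }

    static final class Token {
        final Kind kind;
        final String value;

        Token(Kind kind, String value) {
            this.kind = kind;
            this.value = value;
        }

        @Override
        public String toString() {
            return kind + "(" + value + ")";
        }
    }

    /** Splits the input into commands (\name), braces and runs of plain text. */
    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '\\') {
                int start = ++i; // skip the backslash
                while (i < input.length() && Character.isLetter(input.charAt(i))) {
                    i++;
                }
                if (i == start && i < input.length()) {
                    i++; // single-character command such as \% or \\
                }
                tokens.add(new Token(Kind.COMMAND, input.substring(start, i)));
            } else if (c == '{') {
                tokens.add(new Token(Kind.GROUP_BEGIN, "{"));
                i++;
            } else if (c == '}') {
                tokens.add(new Token(Kind.GROUP_END, "}"));
                i++;
            } else {
                int start = i;
                while (i < input.length() && "\\{}".indexOf(input.charAt(i)) < 0) {
                    i++;
                }
                tokens.add(new Token(Kind.TEXT, input.substring(start, i)));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("\\textbf{hi} there"));
    }
}
```

In a design like the one described above, a parser would consume such a token list, hand each COMMAND token to the responsible command handler, and report the resulting changes to the builder.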
Also of importance is the file cmdDefs.xml, which resides in the folder plugin-resources. It specifies all handled (or ignored) commands: the kind and number of their arguments, the command handler class responsible for handling them, and so on.
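The role of such a command table can be sketched as follows. Note the assumptions: the actual structure of cmdDefs.xml is not shown in this document, so the table below is hard-coded in Java instead of being loaded from XML, and all class names, the chosen commands and the recorded "events" are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of a command table; the real cmdDefs.xml format is not reproduced here. */
final class CommandTableSketch {

    /** Stand-in for the builder: it only records the changes handlers ask for. */
    static final class Builder {
        final List<String> events = new ArrayList<>();

        void add(String event) {
            events.add(event);
        }
    }

    /** A handler turns one command and its arguments into builder calls. */
    interface CommandHandler {
        void handle(String name, List<String> args, Builder builder);
    }

    /** One table entry: expected argument count and handler (null means "known but ignored"). */
    static final class CommandDef {
        final int argCount;
        final CommandHandler handler;

        CommandDef(int argCount, CommandHandler handler) {
            this.argCount = argCount;
            this.handler = handler;
        }
    }

    static final Map<String, CommandDef> DEFS = new HashMap<>();

    static {
        DEFS.put("section", new CommandDef(1, (name, args, b) -> b.add("section:" + args.get(0))));
        DEFS.put("textbf", new CommandDef(1, (name, args, b) -> b.add("bold:" + args.get(0))));
        DEFS.put("noindent", new CommandDef(0, null)); // pure layout command, ignored
    }

    /** Looks up the command, checks the argument count and calls the handler; returns false if ignored. */
    static boolean dispatch(String name, List<String> args, Builder builder) {
        CommandDef def = DEFS.get(name);
        if (def == null) {
            throw new IllegalArgumentException("Unknown command: \\" + name);
        }
        if (args.size() != def.argCount) {
            throw new IllegalArgumentException(
                "\\" + name + " expects " + def.argCount + " argument(s), got " + args.size());
        }
        if (def.handler == null) {
            return false;
        }
        def.handler.handle(name, args, builder);
        return true;
    }

    public static void main(String[] args) {
        Builder builder = new Builder();
        List<String> title = new ArrayList<>();
        title.add("Intro");
        dispatch("section", title, builder);
        dispatch("noindent", new ArrayList<String>(), builder);
        System.out.println(builder.events);
    }
}
```

Keeping the table separate from the handlers means that, in a setup like this one, supporting or ignoring another command is mostly a matter of adding one entry.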
There is currently only one renderer (well, one and a half), which is versatile enough to cover all currently supported output formats. It uses Jakarta Velocity to render the object model objects. The different Velocity template sets can be found in the folder plugin-resources/vm/<output format name>.
The documentation you are currently reading is kept in several LaTeX files in the folder "texdocs". When the Maven-goal site is executed, these files are converted (by the TeXConverter JUnit tests) to XDoc and finally (by the Maven xdoc plugin) to HTML.
A few documentation files that use XDoc-specific features (like navigation.xml) can be found in the folder site/xdocs/.