HTMLslicer is a Java application that slices a long HTML source document into either or both:
- a set of smaller hyperlinked HTML documents (your resulting HTML pages set)
- a help document compatible with the JavaHelp system.
You can create a long technical document with a word processor that can generate an HTML-formatted file, such as Microsoft Word (I am using Word 97), and benefit from all of its advantages (ex. WYSIWYG editing, spelling and grammar checkers, etc.). Once processed by HTMLslicer, all the resulting HTML pages share a common style from your own templates. If you prefer, you can also use the simple templates provided with this software.
HTMLslicer has been created and is owned by Marcel St-Amant. You may, however, enjoy the use of a copy of this application for free (See: License).
Here is a typical screenshot:

Application Main Window
Features
The HTMLslicer:
- Can be used on many OS platforms
- Is fully documented
- Has an integrated help system (Note: a Sun Microsystems JavaHelp runtime engine is included with this package)
- Has Drag-Drop support (not available on Linux)
- Can save your setup
- Is fast and extremely easy to use
- Can generate a Table Of Contents (TOC) page
- Can generate hyperlinks between resulting HTML pages
- Can generate the metadata files for the creation of a JavaHelp compatible help document
- Can use templates
- Generates detailed reports to help debug potential problems
Note 1: This application is not robust; it may crash (not your system) if the source file contains HTML tag errors or is not HTML compliant. If this happens, simply stop the application, correct your source document and restart HTMLslicer.
Note 2: It does not handle heading tags that contain attributes (ex. <h2 class="subtitle">), only simple heading tags (ex. <H2>). A future version may be able to handle them.
Requirements
- Installed Java virtual machine (tested on version 1.3.1 and 1.4.0)
- Basic (elementary level) knowledge of HTML
Installation
Simply unzip the htmlslicer.zip file under a sub-directory (or folder) of your choice.
This section introduces the specialized terminology and explains the slicing process.
HTMLslicer (the application) takes, as inputs:
- Your long HTML-formatted source document (or source document)
- The application settings
- Various templates: a body template and, if requested, a header template, a footer template, a transition template and a TOC template
It searches your source document for the headings at a level equal to or higher than the heading level you set for segmentation. For example, if you set it to slice at <H2> heading tags, your document will be segmented at each point where a <H1> or a <H2> tag is found (its cleavage points). A segment is the part of the document between two cleavage points. There are two special segments that may be used as a potential body template source:
- Top part: Your document segment between the beginning and the first cleavage point (usually at the first <H1> tag).
- Bottom part: Your document segment that contains the strings "</BODY></HTML>".
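The cleavage-point rule can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not HTMLslicer's actual code: the class and method names are hypothetical, the sketch is written for a current Java release, and it matches only simple heading tags without attributes (see Note 2).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: locate cleavage points for a given slice level. */
final class CleavageFinder {

    /**
     * Returns the offsets of every simple heading tag (<H1> up to <Hn>,
     * no attributes) whose level is equal to or higher than the slice level.
     */
    static List<Integer> findCleavagePoints(String html, int sliceLevel) {
        // The character class [1-n] accepts heading levels 1 through sliceLevel.
        Pattern heading = Pattern.compile("<H[1-" + sliceLevel + "]>",
                Pattern.CASE_INSENSITIVE);
        List<Integer> points = new ArrayList<>();
        Matcher m = heading.matcher(html);
        while (m.find()) {
            points.add(m.start());
        }
        return points;
    }

    /** Returns the top part followed by one segment per cleavage point. */
    static List<String> slice(String html, int sliceLevel) {
        List<String> parts = new ArrayList<>();
        int previous = 0;
        for (int point : findCleavagePoints(html, sliceLevel)) {
            parts.add(html.substring(previous, point));
            previous = point;
        }
        parts.add(html.substring(previous)); // last segment, holds </BODY></HTML>
        return parts;
    }

    public static void main(String[] args) {
        String html = "<HTML><BODY><H1>A</H1>x<H2>B</H2>y<H3>C</H3>z</BODY></HTML>";
        for (String part : slice(html, 2)) {
            System.out.println(part);
        }
    }
}
```

With the slice level set to 2, the <H3> heading in the sample stays inside the segment that starts at the <H2> heading.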
It then:
- Slices the source document at its cleavage points to create its segments,
- Finds a proper file name for each segment based on the segment title (the top heading of the segment),
- Adjusts the hyperlinks found in the segments so that they can still work with the resulting HTML pages
- For each segment, applies the specified templates (more about this later) to get the resulting HTML page content, then saves this as a resulting HTML page file.
- By using a TOC template and the titles of segments, it generates a TOC file in HTML format (if requested)
- Using a set of JavaHelp metadata templates, it generates the proper JavaHelp metadata files for your resulting HTML pages.
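The hyperlink adjustment step can be pictured as follows. This is a minimal sketch, assuming that an internal link such as HREF="#bookmark" is rewritten to point at the page that now holds the bookmark; the class name, the method name and the bookmark-to-file map are hypothetical and do not describe the tool's real internals.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: retarget internal links after slicing. */
final class LinkAdjuster {

    // Matches HREF="#name" and captures the bookmark name.
    private static final Pattern INTERNAL_LINK =
            Pattern.compile("HREF=\"#([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    /** Rewrites HREF="#name" to HREF="file#name" for every known bookmark. */
    static String adjust(String segmentHtml, Map<String, String> bookmarkToFile) {
        Matcher m = INTERNAL_LINK.matcher(segmentHtml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String bookmark = m.group(1);
            String file = bookmarkToFile.get(bookmark);
            // Links to unknown bookmarks are copied unchanged.
            String replacement = (file == null)
                    ? m.group()
                    : "HREF=\"" + file + "#" + bookmark + "\"";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> bookmarks = new HashMap<>();
        bookmarks.put("setup", "Slice_Setup.htm");
        System.out.println(adjust("<A HREF=\"#setup\">Setup</A>", bookmarks));
    }
}
```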
The figure below illustrates how these templates (except the transition one) are used for generating your resulting HTML pages set:

Input documents used for generating the resulting HTML pages set.
As shown in the figure above, one resulting HTML page (ex. the page related to segment N of your source document) contains the "Top part" of a body template (your source document may be used for that), a header (if requested), the segment content, a footer (if requested) and the "Bottom part" of the body template. If your source document has been used as the body template, the "Bottom part" contains only this text: "</BODY></HTML>".
The impact of the transition template is not illustrated above. It generates a sub-table of contents with a hyperlink for each sub-section HTML page that follows. This resulting part (the transition part) is placed between the segment part and the footer part of such pages. In this document, the Introduction, User Interface, Templates Reference and Appendix sections contain such a transition part.
This section introduces the use of the application.
- Using your preferred word processor that supports HTML templates (ex. Microsoft Word 97), create your document and save it as an HTML-formatted file (this is your source document file). Use H1, H2 and H3 headings; these mark the cleavage points of your document. Insert your images as linked pictures (to create a <IMG SRC="yourPicture.gif">) and create your internal hyperlinks using named bookmarks (<A NAME="yourBookmark">) and links (<A HREF="#yourTargetBookmark">). Put all your pictures under a common sub-directory. The source of this document is provided to you as an example (Example.zip file); try it.
- Start HTMLslicer (double-click on its system icon); you should see a splash screen, then the application window at the upper-left corner of your screen. Do not modify the setup panels (keep them as they were when you installed the application; if you have modified them, redo the setup exactly as shown in the screenshots; see Tabbed Panels).
- Drag your own source document HTML file and drop it into the Drop-In panel of HTMLslicer. Note: Under Linux, you need to use the File > Open HTML menu command instead.
- Choose the menu command File > Slice Html.
- HTMLslicer will create your resulting HTML page files set (and a TOC file). These files are the segmented version of your source document and they are located in the same directory. These resulting HTML files use the same image files as your source file. Start your browser from the TOC.htm file and enjoy the results. Simple!
The HTMLslicer user interface has the following components:
There are other components that can be viewed as textual user input interfaces; they are in the form of templates (See Templates Reference).
There are two menus:
File Menu
- Open Html:
It displays a File Open dialog. Use this dialog to select the source HTML file to be sliced. The Ctrl-O key combination has the same effect. Note: A more convenient method is to use the drag-drop method (See Drop-In Panel).
- Slice Html:
This menu item is enabled once a source file has been selected. This command will segment the source file into a more manageable size as specified through the Slice Setup panel. It could generate a TOC file and a JavaHelp metadata file set as specified through the TOC Setup panel. It will use the templates as specified through the Style Setup panel. The Ctrl-H key combination has the same effect.
- Save Setup:
It saves the current setup as the Config.ser file in the application directory. The same file will be loaded the next time you start HTMLslicer. The Ctrl-S key combination has the same effect.
- Exit:
It terminates this application. The Ctrl-X key combination has the same effect.
Help Menu
- Contents:
This menu item is enabled if the JavaHelp system has been installed and is accessible. It displays this document in a Help window.
- About:
It displays a summary description (Also used as a splash window) of this application.
There are four panels:

Drop-In panel display
This panel contains a large text area. It plays two roles:
- As a file drop target. Dropping an HTML file into this area has the same effect as using the File>Open Html menu command and selecting your source file through the displayed dialog. Dropping the file is the more convenient method. Important Note: The drop target feature is not available while running under Linux.
- As a viewer area that contains the report of the last slicing operation. Once you do File>Slice Html, you will see the text area being filled with a step-by-step report as the slicing and TOC operations progress. You can view the text to check whether any problem occurred. The same text and some extra information (See Generated Report File) are saved as a text file (JHSplitterLog.txt) in the same directory as your source file.

Slice Setup panel display
This panel contains two panes:
- Slice at pane: It contains three radio buttons; each one corresponds to a heading level. The selected one specifies that the source HTML file will be segmented at its headings of the specified level and higher (ex. if <H2> Heading is selected, the cleavage points are at <H1> and <H2> headings).
- Naming pane: Only one radio button is enabled (Heading Title with Number).
- Lowercase Names checkbox: If checked, all HTML files created by HTMLslicer will have names that contain lowercase letters only.
- Page Extension text field: Contains the name extension for all HTML files created by HTMLslicer. If no name extension is given or if the extension does not start with a dot, then the ".htm" extension is used.
Note: The name of each segment is based on its heading title; to avoid name conflicts with other segments, a unique number may be appended. If you have a sub-heading called "Introduction" under many main headings, you will get "Introduction.htm" for the first file, "Introduction1.htm" for the second file, "Introduction2.htm" for the third file, etc. To generate the file name, the underscore character replaces every space, punctuation mark and other character of the heading title that is neither a letter nor a digit. For example, a segment whose main heading title is "Top & Bottom" will be found in a resulting HTML file named "Top___Bottom.htm" (with three consecutive underscores).
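The naming rules described in this note can be sketched as a small Java class. This is an illustration only: the class and method names are hypothetical, and details the text does not specify (for instance how name clashes that differ only by letter case are handled) are left out.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical sketch of the segment file naming rules. */
final class SegmentNamer {
    private final Set<String> used = new HashSet<>();
    private final boolean lowercase;
    private final String extension;

    SegmentNamer(boolean lowercase, String extension) {
        this.lowercase = lowercase;
        // Fall back to ".htm" when the extension is missing or has no leading dot.
        this.extension = (extension == null || !extension.startsWith("."))
                ? ".htm" : extension;
    }

    /** Builds a unique file name from a heading title. */
    String nameFor(String headingTitle) {
        StringBuilder base = new StringBuilder();
        for (char c : headingTitle.toCharArray()) {
            // Anything that is not a letter or a digit becomes an underscore.
            base.append(Character.isLetterOrDigit(c) ? c : '_');
        }
        String stem = lowercase
                ? base.toString().toLowerCase(Locale.ROOT) : base.toString();
        String candidate = stem + extension;
        int suffix = 0;
        // Append 1, 2, 3... until the name has not been handed out before.
        while (!used.add(candidate)) {
            suffix++;
            candidate = stem + suffix + extension;
        }
        return candidate;
    }

    public static void main(String[] args) {
        SegmentNamer namer = new SegmentNamer(false, "");
        System.out.println(namer.nameFor("Top & Bottom"));
        System.out.println(namer.nameFor("Introduction"));
        System.out.println(namer.nameFor("Introduction"));
    }
}
```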

TOC Setup panel display
This panel contains three items:
- JavaHelp metadata checkbox: If checked, the application will create the necessary files required for integrating the resulting HTML files into a JavaHelp system of your Java application. Very convenient for a Java programmer who wants to integrate his or her application with its documentation as a help system.
- Create TOC.htm checkbox: If checked, the application will create a TOC.htm file that contains a list of all your generated files, each file being identified by its main heading title. Each item of the list hyperlinks to the related file.
- Template = text field: This text field contains the file name of the TOC template.

Style Setup panel display
This panel contains four combined items (check box with its related text field):
- Body Template: It determines which template to use as the unchanging HTML form (the body template) that will surround the segment part of each resulting HTML page. Your own source file is used as a template if the checkbox is unchecked (See Concepts). If checked but with an empty text field, a common template is used (it is located in the application directory: the BodyTemplate.htm file). If the field contains a text, the text corresponds to the name of a file located in your source document directory that will be used as a body template.
- Header Template: It determines which template to use for generating a header part; it will be placed just above the segment part of your resulting HTML page. Such a template contains variables that will provide for "Next", "Previous", "TOC" and "Home" hyperlinks. If unchecked, no header is generated. If checked but with an empty text field, a common template is used (it is located in the application directory: the HeaderTemplate.txt file). If the field contains a text, the text corresponds to the name of a file located in your source document directory that will be used as a header template.
- Footer Template: It determines which template to use for generating a footer part; it will be placed just below the segment part of your resulting HTML page (or below the sub-TOC part, if requested). Such a template contains variables that will provide for "Next", "Previous", "TOC" and "Home" hyperlinks. If unchecked, no footer is generated. If checked but with an empty text field, a common template is used (it is located in the application directory: the FooterTemplate.txt file). If the field contains a text, the text corresponds to the name of a file located in your source document directory that will be used as a footer template.
- Transition Template: It determines which template to use for generating a sub-TOC part; it will be placed just below the segment part of your resulting HTML pages if at least one following page corresponds to a lower-level heading. Such a page then contains a table of contents of those pages. Such a template contains variables that will provide hyperlinks to the following pages with a lower-level heading. If unchecked, no sub-TOC is generated. If checked but with an empty text field, a common template is used (it is located in the application directory: the TransitionTemplate.txt file). If the field contains a text, the text corresponds to the name of a file located in your source document directory that will be used as a transition template.
This application contains an extensive feedback system; it has four components:
- The status bar:
It displays the completion report of the last command. Usually it contains the name of the last command but it may contain an error message that explains why the last command was not successful (ex. if you open a binary file instead of your source HTML document).
- The title bar:
It contains the application name "HTMLslicer" and the file path of the selected source document file.
- The Drop Area and Activity Report text area:
It contains a list of the processing steps that were performed while generating the resulting HTML pages and the TOC file. Each step uses one line. Each line starts with an alert-level tag followed by a detailed message. (See Drop Area and Activity Report below).
- The generated report file:
A file named "JHSplitterLog.txt" is generated and is located in the same directory as your source HTML document. It contains the same information displayed in the text area plus extra information. This file will help resolve problems that you might encounter while processing your source HTML document. (See Generated Report File below)
The messages generated by this application contain two parts: An alert-level tag as a prefix and a body part that follows. There are four possible alert-level tags:
- INFO:
This represents the lowest alert level; it just confirms that the step or operation completed successfully.
- WARN:
This represents a low alert level; it means that the related step or operation completed fully but with some compromises.
- ERR:
This represents an error message; it means that the step or operation has not been accomplished for the reason explained in the message.
- FAIL:
This represents an application-level error message; the application cannot continue properly under such a circumstance.
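Because every message starts with its alert-level tag, the report is easy to filter with a small program. The sketch below is not part of HTMLslicer; it assumes each line has the form "TAG: message" (as in the "INFO: ====" separator of the report file), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: filter report lines by alert level. */
final class ReportScanner {

    /** Alert levels in increasing order of severity. */
    enum AlertLevel { INFO, WARN, ERR, FAIL }

    /** Returns the alert level of a report line, or null if it has no known tag. */
    static AlertLevel levelOf(String line) {
        int colon = line.indexOf(':');
        if (colon < 0) {
            return null;
        }
        try {
            return AlertLevel.valueOf(line.substring(0, colon).trim());
        } catch (IllegalArgumentException e) {
            return null; // the text before the colon is not an alert-level tag
        }
    }

    /** Keeps only the lines whose level is at or above the given minimum. */
    static List<String> filter(List<String> lines, AlertLevel minimum) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            AlertLevel level = levelOf(line);
            if (level != null && level.compareTo(minimum) >= 0) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // The message bodies below are invented for the example.
        List<String> report = Arrays.asList(
                "INFO: segment saved",
                "WARN: source file renamed",
                "ERR: template not found");
        System.out.println(filter(report, AlertLevel.WARN));
    }
}
```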
This text area contains a long list of messages; each corresponds to a processing step of your source document. The steps are in the following general sequence:
- Opening of all the required files
- Slicing the source HTML document (there is one line report for each segment or slice)
- Naming of the segments
- Adjusting the hyperlinks contained in the segments
- Saving of all segments using the specified templates (there is one line report for each segment). Important Note: If a segment name is identical to your source document file name, the latter will be renamed with a "0-" prefix. You get a full set of your resulting HTML pages without overwriting your precious source file. A warning message will be generated. ATTENTION: If a segment name is identical to the name of any other file already located in the directory, that file will be overwritten; pay special attention to the names of your templates located there.
- Creation of the TOC (Table of Contents) HTML file (if requested by the setup)
- Creation of the required JavaHelp metadata files (if requested by the setup)
A report file named JHSplitterLog.txt is created (replacing the previous one, if any) when a source document file has been selected. Each time the File > Slice HTML command is given (on the same file), a report of the activity is appended to the file. Each report is separated by the following message:
"INFO: ========================================="
Each report contains three sections, each one is separated by a series of dashes (i.e. --------) in this order:
- The section of collected information. It is a list of main headings of the segments, followed by a list of heading titles with their bookmark IDs, if any (i.e. the NAME attribute of the HTML <A> tag).
- The content of item type identifiers (See TOC Template) from the selected TOC template (an item type identifier describes the format to be used for generating an item of the table of contents).
- A copy of the Drop Area and Activity Report text area display
Any error found will also be reported in the report file.
The HTMLslicer documentation set is presented in two formats:
- A simple set of HTML files that you can view with your Internet browser. Start it at "TOC.htm".
- A help document that is integrated into the HTMLslicer application as a JavaHelp system. There is only one limitation; you might not be able to navigate through the help document when the File Open modal dialog is displayed.
The HTMLslicer application uses five types of templates while processing your source HTML document:
- Body template, used for the creation of the static part of the resulting HTML page file generated for each segment
- Header/Footer templates, used for the creation of sequential page navigation for the resulting HTML page file generated for each segment (if requested)
- Transition template, used for the creation of local table of contents (or sub-TOC) of sub-sections under the page (if requested)
- TOC template, used for the creation of a TOC page of your resulting HTML pages set (if requested)
- JavaHelp metadata templates, used for the creation of your JavaHelp compatible help document (if requested)
Each template contains a static part and variables. All variables share a common format:
- It starts with the @ character
- Then the variable name using capital letters only (no spaces within the name or between the name and the @ characters)
- And ends with a second @ character
An example: @PREV@
Each template has its own recognized set of variables, as explained in the following sections.
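The variable format lends itself to a one-line regular expression. The following sketch, with a hypothetical class and method name, lists the variables found in a template; it only illustrates the format and is not taken from the application.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: find @NAME@ variables in a template. */
final class TemplateVariables {

    // "@", one or more capital letters, "@"; no spaces are allowed inside.
    private static final Pattern VARIABLE = Pattern.compile("@([A-Z]+)@");

    /** Returns the distinct variable names in order of first appearance. */
    static Set<String> namesIn(String template) {
        Set<String> names = new LinkedHashSet<>();
        Matcher m = VARIABLE.matcher(template);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(
                namesIn("<A HREF=\"@PREV@\">Prev</A> <A HREF=\"@TOC@\">TOC</A>"));
    }
}
```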
The body template is a simple HTML file that may contain one variable: @DOC@. If the variable is absent, HTMLslicer assumes that an implicit @DOC@ variable has been placed just before the last HTML tags: "</BODY></HTML>". Each resulting HTML page file contains (See also Concepts):
- the "top part" of the template file that precedes the @DOC@ variable,
- then, optionally, the header part (if requested, see Style Setup Panel),
- then the related segment from your source document,
- then, optionally, the footer part (if requested, see Style Setup Panel),
- then, finally, the "bottom" part of the template file that follows the @DOC@ variable.
In effect, @DOC@ represents the current HTML segment from your source document (with, optionally, its header and footer parts). A body template might come from three possible sources:
- Your source document: it does not contain any variable. The "top" part of such a template is the part contained between the beginning of your document file, and up to the first header, where the segmentation starts. The bottom part is simply the string: "</BODY></HTML>".
- The file named BodyTemplate.htm that is located in the application directory (the "default template"). The simplest one to use is the one I am providing with the installation package. Examine it, because it is a very good introduction. Later, you can replace it with a default template of your own design, but it must keep the same file name, "BodyTemplate.htm".
- The name of the file that you specified in the Style Setup panel and that should be found in your source document directory.
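The way a body template is split around @DOC@ and then wrapped around a segment can be sketched as follows. This is a minimal illustration with hypothetical names; in particular, locating the implicit @DOC@ with a case-insensitive search for "</BODY></HTML>" is an assumption about how the closing tags are recognized.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: split a body template and assemble a page. */
final class BodyTemplate {
    private static final String DOC = "@DOC@";
    private static final Pattern CLOSING =
            Pattern.compile("</BODY>\\s*</HTML>", Pattern.CASE_INSENSITIVE);

    final String top;
    final String bottom;

    BodyTemplate(String template) {
        int at = template.indexOf(DOC);
        if (at >= 0) {
            top = template.substring(0, at);
            bottom = template.substring(at + DOC.length());
        } else {
            // Implicit @DOC@: split just before the last </BODY></HTML>.
            int close = template.length();
            Matcher m = CLOSING.matcher(template);
            while (m.find()) {
                close = m.start();
            }
            top = template.substring(0, close);
            bottom = template.substring(close);
        }
    }

    /** Pass "" for any optional part (header, transition, footer) not requested. */
    String page(String header, String segment, String transition, String footer) {
        return top + header + segment + transition + footer + bottom;
    }

    public static void main(String[] args) {
        BodyTemplate t = new BodyTemplate("<HTML><BODY>@DOC@</BODY></HTML>");
        System.out.println(t.page("<P>header</P>", "<H1>Title</H1>", "", "<P>footer</P>"));
    }
}
```

The transition part is placed between the segment and the footer, as described in Concepts.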
The header and footer parts are generated with the help of related templates. Both templates recognize the following variables:
- @PREV@: It will be transformed into the relative file-path to the file that contains the previous segment from your source document.
- @NEXT@: It will be transformed into the relative file-path to the file that contains the next segment from your source document.
- @TOC@: It will be transformed into the relative file-path to the TOC.htm file.
- @HOME@: It will be transformed into the relative file-path to the first segment from your source document.
An example will illustrate how such a template works. Here is a simple footer template:
<HR><A HREF="@PREV@">Prev</A> <A HREF="@NEXT@">Next</A>
If your document had the segments "Chapter1", "Chapter2", etc., the resulting files would be Chapter1.htm, Chapter2.htm, etc. For your third segment, the footer part will become:
<HR><A HREF="Chapter2.htm">Prev</A> <A HREF="Chapter4.htm">Next</A>
Probably, you get the idea.
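For readers who prefer code, the same substitution can be sketched in Java. The names are hypothetical, and one behavior is a pure assumption because the text does not describe it: here the first page's @PREV@ and the last page's @NEXT@ simply point to the page itself.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of header/footer variable substitution. */
final class NavigationFiller {

    /**
     * Fills @PREV@, @NEXT@, @TOC@ and @HOME@ for the page at the given index.
     * Assumption: at either end of the page list, @PREV@ or @NEXT@ falls back
     * to the page itself; the real tool may behave differently.
     */
    static String fill(String template, List<String> pages, int index, String tocFile) {
        String prev = pages.get(Math.max(index - 1, 0));
        String next = pages.get(Math.min(index + 1, pages.size() - 1));
        return template
                .replace("@PREV@", prev)
                .replace("@NEXT@", next)
                .replace("@TOC@", tocFile)
                .replace("@HOME@", pages.get(0));
    }

    public static void main(String[] args) {
        List<String> pages = Arrays.asList(
                "Chapter1.htm", "Chapter2.htm", "Chapter3.htm", "Chapter4.htm");
        String footer = "<HR><A HREF=\"@PREV@\">Prev</A> <A HREF=\"@NEXT@\">Next</A>";
        // Index 2 is the third segment.
        System.out.println(fill(footer, pages, 2, "TOC.htm"));
    }
}
```

Run as is, the program prints the footer shown above for the third segment.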
A header and/or footer template might come from two possible sources:
- The files named HeaderTemplate.txt and FooterTemplate.txt that are located in the application directory; they are the default header and footer templates. You can replace these files with your own version as long as they have the same names.
- The name of the files that you specified in the Style Setup panel and that should be found in your source document directory.
The transition template contains variables and a special <X1> and </X1> tag pair that delimits the part that is repeated for each item of the sub-table of contents. The top and bottom parts of the template are not repeated and contain no variable. The part that is delimited by the <X1> and </X1> tags contains the following variables:
- @FILEPATH@: This variable will be replaced by the relative file-path of the corresponding HTML file that will be referenced by the related item of the sub-table of contents.
- @TITLE@: This variable will be replaced by the title of the corresponding HTML file that will be referenced by the related item of the sub-table of contents.
Here is a simple transition template, the one used for this documentation:
<HR><H4>Table of Contents</H4><UL>
<X2> <LI><A HREF="@FILEPATH@">@TITLE@</A></LI>
</X2></UL>
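Expanding such a template amounts to copying the top part, repeating the delimited part once per sub-section, and copying the bottom part. The sketch below uses hypothetical names and takes the delimiting tag as a parameter, since the description mentions <X1> while the template shown uses <X2>; it does not claim to reproduce the tool's own logic.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: expand a transition template into a sub-TOC. */
final class TransitionExpander {

    /**
     * Repeats the part between <tag> and </tag> once per item.
     * Each item is a pair: {relative file path, title}.
     */
    static String expand(String template, String tag, List<String[]> items) {
        String open = "<" + tag + ">";
        String close = "</" + tag + ">";
        int start = template.indexOf(open);
        int end = (start < 0) ? -1 : template.indexOf(close, start);
        if (end < 0) {
            throw new IllegalArgumentException("Missing " + open + " ... " + close);
        }
        String top = template.substring(0, start);
        String repeated = template.substring(start + open.length(), end);
        String bottom = template.substring(end + close.length());

        StringBuilder out = new StringBuilder(top);
        for (String[] item : items) {
            out.append(repeated
                    .replace("@FILEPATH@", item[0])
                    .replace("@TITLE@", item[1]));
        }
        return out.append(bottom).toString();
    }

    public static void main(String[] args) {
        String template = "<HR><H4>Table of Contents</H4><UL>\n"
                + "<X2> <LI><A HREF=\"@FILEPATH@\">@TITLE@</A></LI>\n"
                + "</X2></UL>";
        List<String[]> items = new ArrayList<>();
        items.add(new String[] {"Concepts.htm", "Concepts"});
        items.add(new String[] {"Quick_Start.htm", "Quick Start"});
        System.out.println(expand(template, "X2", items));
    }
}
```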
The TOC template contains variables and special item type identifier tags, which I call the "X" tags. As explained earlier, an item type identifier contains formatting information for generating an item that identifies a resulting HTML page that corresponds to a specific level of its related heading. A TOC page has the following parts:
- A top part that contains non-list contents such as a title and an introduction that precedes the list of sections and sub-sections of your document set. Only one variable is recognized: @NAME@. It will be replaced by the file name of your source document (without its ".htm" extension).
- Parts that are enclosed within complementary pairs of X tags; they describe the format of each item of the table of contents (more on this below).
- The end part that is not enclosed between the X tags; it should end with the usual </BODY></HTML>, but may contain text, images and other hyperlinks.
There are seven recognized X tag pairs: three item tag pairs and four transitional tag pairs. The item tag pairs are:
- <X1> and </X1>: List item for level 1 sections of your documents, those that reference your document segment from <H1> headings.
- <X2> and </X2>: List item for level 2 sections of your documents, those that reference your document segment from <H2> headings.
- <X3> and </X3>: List item for level 3 sections of your documents, those that reference your document segment from <H3> headings.
The template part contained within these X tag pairs can have the following variables:
- @FILEPATH@: This variable will be replaced by the relative file-path of the corresponding HTML file that comes from the segment of the same level (i.e. <H2> level if the part is enclosed between <X2> tag pairs).
- @TITLE@: This variable will be replaced by the title of the corresponding HTML file that comes from the segment of the same level.
A typical simple second level item part will be:
<X2> <LI><A HREF="@FILEPATH@">@TITLE@</A></LI>
</X2>
The following X tags (recognized by the fact that they contain 2 digits) are "transitional" tags. They do not contain any variable.
- <X12> and </X12>: Part that represents a transition from level 1 to level 2.
- <X23> and </X23>: Part that represents a transition from level 2 to level 3.
- <X32> and </X32>: Part that represents a transition from level 3 to level 2.
- <X21> and </X21>: Part that represents a transition from level 2 to level 1.
A typical simple 1st to 2nd level transition is:
<X12><UL>
</X12>
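One plausible way to combine the item parts and the transitional parts is sketched below. It rests on several assumptions that the text does not confirm: the list starts at level 1, a transitional part is emitted each time the level changes by one step (so a jump from level 3 to level 1 emits the 3-to-2 part and then the 2-to-1 part), and any nested lists still open after the last item are closed the same way. All class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: render TOC items from X-tag template parts. */
final class TocRenderer {

    /** One TOC entry: heading level (1 to 3), relative file path, title. */
    static final class Entry {
        final int level;
        final String file;
        final String title;

        Entry(int level, String file, String title) {
            this.level = level;
            this.file = file;
            this.title = title;
        }
    }

    /** Returns the text between <tag> and </tag>, or "" if the pair is absent. */
    static String part(String template, String tag) {
        String open = "<" + tag + ">";
        String close = "</" + tag + ">";
        int start = template.indexOf(open);
        int end = (start < 0) ? -1 : template.indexOf(close, start);
        return (end < 0) ? "" : template.substring(start + open.length(), end);
    }

    /** Renders the item list, emitting transitional parts between levels. */
    static String renderItems(String template, List<Entry> entries) {
        StringBuilder out = new StringBuilder();
        int current = 1; // assumption: the list starts at level 1
        for (Entry entry : entries) {
            current = moveTo(template, out, current, entry.level);
            out.append(part(template, "X" + entry.level)
                    .replace("@FILEPATH@", entry.file)
                    .replace("@TITLE@", entry.title));
        }
        moveTo(template, out, current, 1); // assumption: close open sub-lists
        return out.toString();
    }

    /** Emits one transitional part per level step, ex. X32 then X21. */
    private static int moveTo(String template, StringBuilder out, int from, int to) {
        while (from != to) {
            int next = (from < to) ? from + 1 : from - 1;
            out.append(part(template, "X" + from + next));
            from = next;
        }
        return to;
    }

    public static void main(String[] args) {
        String template = "<X1><LI>@TITLE@</LI></X1>"
                + "<X2><LI><A HREF=\"@FILEPATH@\">@TITLE@</A></LI></X2>"
                + "<X12><UL></X12><X21></UL></X21>";
        System.out.println(renderItems(template, Arrays.asList(
                new Entry(1, "a.htm", "A"),
                new Entry(2, "b.htm", "B"),
                new Entry(1, "c.htm", "C"))));
    }
}
```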
The TOC.htm file of this document has been created with the help of the TOCTemplate.htm file that is located in the application directory. The TOC template might come from two possible sources:
- The file named TOCTemplate.htm that is located in the application directory (the "default TOC template"). You can replace this file with your own designed TOC template and it should have the same name.
- The name of the file that you specified in the TOC Setup panel and that should be found in your source document directory.
Four templates are required to create a corresponding set of JavaHelp metadata files; their related file names are:
- BasicTOC.xml: The Table of contents template that is compatible with the JavaHelp system.
- BasicIndex.xml: The Index template that is compatible with the JavaHelp system. For the moment, this file is empty and no index is generated with this version (1.0) of the application.
- BasicMap.jhm: The Map template that is compatible with the JavaHelp system. It lists the correspondence between a topic ID and the HTML file URL that contains the topic.
- Basic.hs: The help set template that is compatible with the JavaHelp system. It describes the fact that the help document contains a table of contents and a search engine.
Unless you are a skilled hacker or Java programmer, I recommend that you do not modify these files. With these templates the following files will be created in your source document directory:
- NAMETOC.xml: The resulting TOC metadata file for your Java-based help document.
- Map.jhm: The resulting map file for your Java-based help document.
- NAME.hs: The resulting help set document for your Java-based help document.
The "NAME" part of the above file names is replaced by the name of your source document file. A Java programmer will be able to integrate your HTML file set and the metadata file set into the related Java application. This integration part is documented by Sun Microsystems and is outside the scope of this document.
This appendix contains three sections:
- The end user license of this application: HTMLslicer
- Information about my small software project page
- Information about me
HTMLslicer v. 1.0 freeware.
Copyright (C) 2002 by Marcel St-Amant
By using this software you accept all the terms and conditions of this License Agreement
This software is the property of Marcel St-Amant and is protected by copyright law. There is no license fee, and registration is not required for HTMLslicer v.1.0 freeware. However, this Agreement does not grant you any rights to enhancements or updates, or support or maintenance of the product. Future versions of the product may not be freeware.
This software is provided 'as-is', without any express or implied warranty. In no event will the author be held liable for any damages arising from the use of this software.
Permission is granted to anyone to use this software for document generation purposes, including commercial applications, and to create and publish derivative works from it (the resulting HTML page file set, TOC page file and JavaHelp compliant help document).
You may not modify this software in any way, nor reverse-engineer it. If you find a bug or have a suggestion, please contact the author by e-mail; however, this does not imply that the requested correction or enhancement will be applied.
You may distribute this software to anyone with the following restrictions:
- The origin of this software must not be misrepresented; you must not claim that you wrote the original software.
- You must provide an EXACT copy of the original package; no modification is allowed.
- If you provide the software on your Internet site, the author of this software must be clearly identified and a link to his site be provided. Please inform the author by E-mail.
Author: Marcel St-Amant
Email: bigfeet@videotron.ca
Big Feet Software
Montreal, Quebec, CANADA.
About Big Feet Software
Big Feet Software provides Java-based applications in the following domains:
- Internet-related
- Documentation-related utilities
About Me
I am Marcel St-Amant, a physicist and geophysicist by training (a BSc and MSc in these fields), and I worked as such for the first part of my career. I gradually became more involved in computers, then in software development. I am a Sun-Certified Programmer for the Java 2 platform.
As hobbies, I have done many electronics and optics projects (IR vision system, UV laser, various detectors, remote control and amateur scientific instrumentation, etc.). I like to write software for controlling and monitoring various electronic gadgets. I also like to create Java applications. I draw comics (European style).