The Library

Overview

The library offers several levels of support for reading and writing .rb files. If you only want to read a .rb file, you need to know only the RbFile class. If you want to create a new .rb file, you need to know only the RbMake class and the RbFetch routines.

Keep in mind, amid all the talk of "objects" and "classes", that the code is not C++ (though that would have made its implementation easier) but simply object-oriented C. I decided to use C rather than C++ simply because C is still much more portable than C++.
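For readers who have not met the style before, "object-oriented C" usually means a struct that holds the object's state plus a family of functions that take a pointer to that struct as their first argument. The sketch below illustrates the pattern with a made-up DemoCounter type; none of these names come from the library, and the library's own naming and memory-handling conventions may differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical "class": the struct holds the object's state. */
typedef struct DemoCounter {
    int count;
    int step;
} DemoCounter;

/* "Constructor": allocate and initialize; returns NULL on failure. */
DemoCounter *DemoCounter_new(int step)
{
    DemoCounter *me = malloc(sizeof *me);
    if (me == NULL)
        return NULL;
    me->count = 0;
    me->step = step;
    return me;
}

/* "Method": the object is always passed explicitly as the first argument. */
int DemoCounter_bump(DemoCounter *me)
{
    assert(me != NULL);
    me->count += me->step;
    return me->count;
}

/* "Destructor": release the object. */
void DemoCounter_free(DemoCounter *me)
{
    free(me);
}
```

A caller creates the object with DemoCounter_new(), passes the returned pointer to each "method", and ends with DemoCounter_free().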

RbFile

The lowest level is the RbFile class, which lets you read and write the .rb file format.

For reading, the class opens the file, gives you a list of the sections that comprise a book (I call these sections "pages"), and then lets you choose which of the pages you want to read (uncompressing the data as needed).

To create a new file, the class writes the necessary header data, lets you append pages, and then finalizes the file when you close it. The pages you append must already be properly formatted and associated with one another; for example, you must supply the hidden .hidx (HTML index) data yourself.

RbMake

If you want to create a new .rb file, using the RbFile class would be a royal pain since it does nothing to help you format the data that needs to go into the file. This is the function of the RbMake class -- it takes ordinary data and transforms it into the low-level data that can be included in a .rb file. The RbMake object contains its own RbFile object for doing the actual writing of the file, and expects you to add RbPage objects. The easiest way to do this is to use the RbFetch routines.

RbFetch

The RbFetch routines (with no associated object) are available to perform the task of scheduling all the files that need to be processed, fetching them (or reading them locally), and then calling the appropriate functions to transform the data into RbPage objects. There are also a couple of routines that take an existing .rb file and use its contents as data for inclusion in the new .rb file. You need to have an RbMake object already created to use these functions. These routines may be logically considered to be part of the RbMake class, and are only separated out in order to allow some future power-user to supply alternate fetch functionality.

RbPage

The RbPage class, combined with the RbHtml routines, is used to turn an input file into all the data needed to create one or more low-level page sections in the actual .rb file. For instance, an RbPage object generated from an HTML file will write itself out as the (filtered) .html page, the associated .hidx page, and (optionally) the .hkey page. Also, the writing of the first .html RbPage object will trigger the writing of the .info page (which contains book data such as the author, title, etc.). All RbPage objects have a pointer back to the RbMake object for which they were created.

RbHtml

The RbHtml routines are a few functions (with no associated object) that allow you to push HTML or text data into an RbPage object. These routines may be logically considered to be part of the RbPage set of routines, and are only separated out in order to allow some future power-user to supply alternate HTML-parsing functions.

RbImage

The RbImage class is a set of image-processing functions that transform an image into an internal representation (the RbImage object), manipulate it (e.g. transform it into a B&W format), and then render it as a PNG image for inclusion in the .rb file. You can choose to supply your own version of these routines if you want to do the image conversion yourself.
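As a rough idea of what "transform it into a B&W format" can involve, here is a minimal sketch that reduces 8-bit grayscale pixels to pure black or white with a fixed threshold. This is a stand-in technique chosen for brevity; the library may well use something different (dithering, for instance), and the function name and pixel layout are my assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: convert 8-bit grayscale pixels to black (0) or
 * white (255) in place.  Pixels at or above the threshold become white. */
void demo_threshold(unsigned char *pixels, size_t count,
                    unsigned char threshold)
{
    size_t i;

    assert(pixels != NULL || count == 0);
    for (i = 0; i < count; i++)
        pixels[i] = pixels[i] >= threshold ? 255 : 0;
}
```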

RbInfoHash

The RbInfoHash class maintains a hashed list of NAME -> VALUE pairs that were derived from the .info page of a .rb file. These routines also let you easily build up a set of values (when creating a new book) and then convert the data into the right format for writing out. Note that the RbMake object has an RbInfoHash object embedded within it.
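To make the idea concrete, here is a small sketch of a NAME -> VALUE store that can be filled in, queried, and then formatted for output. Everything here is hypothetical: the DemoInfo names are mine, a linear array stands in for a real hash to keep the code short, and the NAME=VALUE line format used by DemoInfo_format() is an assumption rather than a statement about the actual .info layout.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_PAIRS 16
#define DEMO_MAX_LEN   64

/* Hypothetical store; a linear array stands in for the real hash. */
typedef struct DemoInfo {
    char names[DEMO_MAX_PAIRS][DEMO_MAX_LEN];
    char values[DEMO_MAX_PAIRS][DEMO_MAX_LEN];
    int count;
} DemoInfo;

void DemoInfo_init(DemoInfo *me)
{
    me->count = 0;
}

/* Look up a name; returns its value, or NULL if absent. */
const char *DemoInfo_fetch(const DemoInfo *me, const char *name)
{
    int i;

    for (i = 0; i < me->count; i++) {
        if (strcmp(me->names[i], name) == 0)
            return me->values[i];
    }
    return NULL;
}

/* Add or overwrite a pair; returns 0 on success, -1 if it does not fit. */
int DemoInfo_store(DemoInfo *me, const char *name, const char *value)
{
    int i;

    assert(me != NULL && name != NULL && value != NULL);
    if (strlen(name) >= DEMO_MAX_LEN || strlen(value) >= DEMO_MAX_LEN)
        return -1;
    for (i = 0; i < me->count; i++) {
        if (strcmp(me->names[i], name) == 0)
            break;
    }
    if (i == me->count) {
        if (me->count == DEMO_MAX_PAIRS)
            return -1;
        strcpy(me->names[i], name);
        me->count++;
    }
    strcpy(me->values[i], value);
    return 0;
}

/* Write all pairs as NAME=VALUE lines (an assumed format) into buf.
 * Returns the number of bytes written, or -1 if buf is too small. */
int DemoInfo_format(const DemoInfo *me, char *buf, size_t size)
{
    size_t used = 0;
    int i;

    if (size == 0)
        return -1;
    buf[0] = '\0';
    for (i = 0; i < me->count; i++) {
        int n = snprintf(buf + used, size - used, "%s=%s\n",
                         me->names[i], me->values[i]);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

Storing a name a second time overwrites its value, so a caller can set defaults first and override them later before formatting.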

GrabUrl

There are some GrabUrl routines that make it easy to fetch web pages. They also provide a means of specifying authorization information for password-protected web pages.

Error Reporting

The RbError error-reporting functions have support for a user-supplied hook that will let you output the error in the manner of your choosing, as well as doing whatever fatal-error cleanup you might need to do.
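The general shape of such a hook can be sketched as a function pointer that the reporting routine calls instead of printing the message itself. The names below (demo_set_error_hook, demo_report, and so on) are invented for illustration and are not the RbError interface; in particular, this sketch leaves any exit-on-fatal behavior up to the hook.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical hook type: receives the formatted message and a fatal flag. */
typedef void (*DemoErrorHook)(const char *msg, int fatal);

static DemoErrorHook demo_hook = NULL;

/* Install a hook (NULL restores the default); returns the previous one. */
DemoErrorHook demo_set_error_hook(DemoErrorHook hook)
{
    DemoErrorHook old = demo_hook;
    demo_hook = hook;
    return old;
}

/* Format a message and hand it to the hook, or to stderr by default. */
void demo_report(int fatal, const char *fmt, ...)
{
    char msg[256];
    va_list ap;

    assert(fmt != NULL);
    va_start(ap, fmt);
    vsnprintf(msg, sizeof msg, fmt, ap);
    va_end(ap);

    if (demo_hook != NULL)
        demo_hook(msg, fatal);
    else
        fprintf(stderr, "%s%s\n", fatal ? "fatal: " : "", msg);
}

/* Example hook: remember the last message instead of printing it. */
char demo_last_msg[256];
int demo_last_fatal;

void demo_capture_hook(const char *msg, int fatal)
{
    strncpy(demo_last_msg, msg, sizeof demo_last_msg - 1);
    demo_last_msg[sizeof demo_last_msg - 1] = '\0';
    demo_last_fatal = fatal;
}
```

A GUI program might install a hook that pops up a dialog, while a command-line tool could leave the default in place and let messages go to stderr.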

Matching and Replacement

There are two ways to match things. Either use the wildcard matching routines, or use Perl-Compatible Regular Expressions, as supplied by the PCRE library.
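For a feel of what wildcard matching involves, here is a minimal sketch of a shell-style matcher. It assumes that '*' matches any run of characters and '?' matches exactly one; the library's own wildcard syntax may support more (or different) constructs, and demo_wild_match is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

/* Return 1 if text matches pattern, else 0.
 * '*' matches any run of characters (including none), '?' matches one. */
int demo_wild_match(const char *pattern, const char *text)
{
    const char *star = NULL;   /* position of the most recent '*' */
    const char *retry = NULL;  /* where in text that '*' currently ends */

    assert(pattern != NULL && text != NULL);
    while (*text != '\0') {
        if (*pattern == '*') {
            /* Remember the star and first try matching it against nothing. */
            star = pattern++;
            retry = text;
        } else if (*pattern == '?' || *pattern == *text) {
            pattern++;
            text++;
        } else if (star != NULL) {
            /* Mismatch: let the last star absorb one more character. */
            pattern = star + 1;
            text = ++retry;
        } else {
            return 0;
        }
    }
    /* Only trailing stars may remain once the text is used up. */
    while (*pattern == '*')
        pattern++;
    return *pattern == '\0';
}
```

Wildcards like this are handy for simple include/exclude lists; anything that needs alternation or character classes is a job for the regular-expression route.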

Beyond mere matching are the Substitution routines. This small set of functions allows you to parse perl-like substitution commands and then execute them to modify the pages before they get their HTML parsed.
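The parse-then-execute split can be sketched as follows. This is a deliberately simplified stand-in: it accepts commands of the form s/find/repl/ with an optional g flag and any delimiter character, but it matches literal text rather than regular expressions and does not handle backslash-escaped delimiters. The DemoSubst names are hypothetical and say nothing about the real Substitution functions.

```c
#include <assert.h>
#include <string.h>

#define DEMO_PART_MAX 64

/* Hypothetical parsed form of a command such as "s/old/new/g". */
typedef struct DemoSubst {
    char find[DEMO_PART_MAX];
    char repl[DEMO_PART_MAX];
    int global;               /* nonzero if the 'g' flag was given */
} DemoSubst;

/* Copy characters up to the delimiter into out; returns a pointer just past
 * the delimiter, or NULL if it is missing or the part is too long. */
static const char *demo_copy_part(const char *src, char delim, char *out)
{
    size_t n = 0;

    while (*src != '\0' && *src != delim) {
        if (n + 1 >= DEMO_PART_MAX)
            return NULL;
        out[n++] = *src++;
    }
    out[n] = '\0';
    return *src == delim ? src + 1 : NULL;
}

/* Parse "s<delim>find<delim>repl<delim>[g]"; returns 0 on success. */
int demo_subst_parse(DemoSubst *me, const char *cmd)
{
    char delim;
    const char *p;

    assert(me != NULL && cmd != NULL);
    if (cmd[0] != 's' || cmd[1] == '\0')
        return -1;
    delim = cmd[1];
    p = demo_copy_part(cmd + 2, delim, me->find);
    if (p == NULL || me->find[0] == '\0')
        return -1;
    p = demo_copy_part(p, delim, me->repl);
    if (p == NULL)
        return -1;
    me->global = 0;
    for (; *p != '\0'; p++) {
        if (*p != 'g')
            return -1;        /* only the 'g' flag is understood here */
        me->global = 1;
    }
    return 0;
}

/* Apply the substitution to in, writing to out (literal text match only).
 * Returns 0 on success, -1 if out is too small. */
int demo_subst_apply(const DemoSubst *me, const char *in,
                     char *out, size_t size)
{
    size_t flen = strlen(me->find), rlen = strlen(me->repl), used = 0;
    int done = 0;             /* set after the first hit unless global */

    assert(flen > 0);
    if (size == 0)
        return -1;
    while (*in != '\0') {
        if (!done && strncmp(in, me->find, flen) == 0) {
            if (used + rlen >= size)
                return -1;
            memcpy(out + used, me->repl, rlen);
            used += rlen;
            in += flen;
            done = !me->global;
        } else {
            if (used + 1 >= size)
                return -1;
            out[used++] = *in++;
        }
    }
    out[used] = '\0';
    return 0;
}
```

Parsing once and applying many times is the point of the split: the same parsed command can be run over every page without re-reading the command string.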

Utility Routines

In addition to the ever-present MBuf class, there is a smattering of utility routines that help you do things like interpret URLs relative to the current page and interpret file suffixes.
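As an illustration of the suffix side of this, the sketch below classifies a file name by the text after its last dot. The DemoType values, the suffix table, and the function names are all assumptions made for the example; the library's own routines and the set of suffixes they recognize may differ.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical content categories for this example only. */
typedef enum { DEMO_UNKNOWN, DEMO_HTML, DEMO_TEXT, DEMO_IMAGE } DemoType;

/* Case-insensitive string equality (portable stand-in for strcasecmp). */
static int demo_same(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/* Classify a file name by the text after its last '.'. */
DemoType demo_type_from_suffix(const char *name)
{
    static const struct { const char *suffix; DemoType type; } table[] = {
        { "html", DEMO_HTML }, { "htm", DEMO_HTML },
        { "txt", DEMO_TEXT },
        { "png", DEMO_IMAGE }, { "jpg", DEMO_IMAGE }, { "gif", DEMO_IMAGE },
    };
    const char *slash, *dot;
    size_t i;

    assert(name != NULL);
    /* Ignore dots in directory names by starting after the last '/'. */
    slash = strrchr(name, '/');
    dot = strrchr(slash ? slash + 1 : name, '.');
    if (dot == NULL)
        return DEMO_UNKNOWN;
    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (demo_same(dot + 1, table[i].suffix))
            return table[i].type;
    }
    return DEMO_UNKNOWN;
}
```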

Tell Me More

To find out more, please check out the command-line tools or the rbmake home page.

All this was created by Wayne Davison.