The Rbmake Commandline Tools
The tools consist of the following four programs:
rbmake, rbburst,
rbinfo, and rbdump.
What's New?
Here's a summary of some of the things that got added recently:
- [Feb 3] The library (and thus rbburst and rbinfo) can now
read in the REB1100's annotation file -- the single XML file that ends
in ".an". If this file is found, the library does not attempt to open
the old, 3-file markup that the Rocket eBook generates.
- [Feb 3] The dictionary-building code was significantly
improved. You can now add extra words for the current definition
(after the initial one in the BIG-B tags) by using SMALL-B tag sets.
The code also allows you to specify just the suffix of an additional
word (if you've specified the root word with syllable breaks and you
start the additional word with a hyphen). The paragraph-bookification
code was also enhanced to not mangle dictionary-definition paragraphs,
which allows an author to easily add an indexed glossary to a book
while still being able to use the -b (book-style paragraphs) option.
- [Feb 3] You can now choose which types of punctuation
enhancement you want to turn on (when using an option file). This
allows you to selectively have rbmake enhance double quotes, single
quotes, emdashes, and/or ellipses in any combination.
- [Feb 3] Rbburst now restores paragraphs that it turned into
"book form" back into their original P-tag form. A new -R (raw)
option allows you to extract the raw HTML from a book, if you need to
see the book's HTML without any of rbburst's tag restoration.
- [Nov 12] It is now possible to specify different
edge-enhancement settings for different images (based on their names).
- [Nov 12] You can now store Auth-Info in your option file (if
you want to).
- [Nov 12] The new -P option for rbmake can be used to generate a
base64-encoded string of a username & password (slightly obscuring
it for use with Auth-Info).
- [Nov 06] Rbburst can now extract markup for a joined book at
the same time that it is unjoining the book's pages. This allows you to
see at a glance what section of the book the markup is in by just reading
through the markup summary page.
- [Oct 28] The -f option now takes an argument that tells rbmake
how many links deep it should travel from the supplied starting page(s).
To get the old behavior, specify "-fa" (think "any depth") or "-fy" (think
"follow: yes").
- [Oct 28] Password-protected pages will now cause rbmake to
prompt you for a username and password rather than to fail. For
non-interactive use, you can prevent this with the new -n option.
- [Oct 28] The new -N option tells rbmake the maximum number of
HTML/text pages that it should fetch.
- [Oct 28] Several fixes were made in the handling of whitespace
in names, and the grabbing of CGI pages that include a '&' was also
fixed.
- [Oct 28] Fixed the creation of a reference (dictionary) file
(using rbmake's -k option).
- [Oct 20] The library now handles the conversion of images into
B&W on its own so that it can do a much better job than ImageMagick
was doing. This means that the code only (currently) supports GIF, JPEG,
and PNG images, but that's the vast majority of the images you'll
encounter.
- [Oct 20] The rbmake utility now has a -E option that allows you
to set how much edge-enhancement you'd like to have applied to the images.
- [Oct 15] You can now ask rbinfo to summarize the markup of a book.
Use the same options as you would with rbburst.
- [Oct 15] Added the -M option to rbburst to have it merge in markup
that exists in the same directory as the book.
- [Oct 14] If image-inclusion is off, [Image: ALT] tags are placed in
the text.
- [Oct 13] You can now have rbburst merge the markup from your
Rocket eBook into the book that it extracts. It merges tags into the
pages where the markup is positioned, and also creates a summary page of
all the markup in the entire book. Keep in mind that you'll need to have
already fetched the markup from your reader via the RocketLibrarian, and
that you'll need to specify where to find the markup files (see the
Extracting Your Markup topic in the rbburst
section).
- [Oct 12] MS Windows library DLLs released.
- [Oct 11] Rbmake understands the <HR NEW-PAGE> tag as an
unambiguous way to specify a page break (since some web sites use a
0-sized HR as a regular line).
- [Oct 4] First release of tool-binaries for MS Windows.
- [Sep 30] You can now specify option-rewriting rules in an rbmake
option file. This is handy if you use user-supplied args in the option
file (like the Webscription-option file does -- see
ws.opt).
- [Sep 29] Rbmake supports page-rewriting rules that let you customize or
otherwise improve the HTML and/or text content that you are importing.
- [Sep 29] Rbmake supports reading and writing option files, in addition
to its commandline options.
- [Sep 22] First binary release for Linux i386.
- [Sep 16] HTML documentation added.
- [Sep 9] First public release of the rbmake source.
Rbmake Highlights
Here's a rundown of some of the highlights of the book-building tool:
- You can specify rewriting rules that let you improve web pages
or books as the pages are fetched. For instance, I have some
sample option files that are set up to fetch news from USA Today and
Yahoo, to greatly improve the Rocket edition of the book "Thinking in
Java", and to rewrite Baen's Webscription HTML ebooks.
- Rbmake supports both local and http-fetched config files that let you
easily use someone else's contributed web-fetching and rewriting rules or
make your config files available for others to use.
- You can manipulate the ReB's "Go To" menu more easily now by using
the -g option, using the "Menu-Item:" option-file setting, or putting META
tags into source file(s). The syntax for the META tag is
<META NAME="rocket-menu" CONTENT="Description=URL">.
- You can specify one or more paths that either restrict or expand
the fetching of web pages. This makes it easier to set up a repeating
web-fetch or to grab something that has a start page in one place and the
rest of the content someplace else.
- You can choose to join all the web pages into a single document,
making it easier to read on the ReB (without breaking any hyperlinks).
- You can choose either web-style paragraphs or book-style paragraphs.
- You can choose to turn normal ASCII quotes and dashes into "curly"
quotes and em-dashes.
- Rbmake supports the ISO numeric entities (unlike the RocketWriter).
For instance, the HTML ebook "Thinking in Java" imports with visible
punctuation in the text.
- Rbmake doesn't have the RocketWriter's annoying bugs of dropping
characters from the ends of paragraphs and the middle of words that
contain apostrophes, and of dropping spaces after bold/italic words (which
it loves to do when it sees a document that contains enhanced punctuation).
- The TABLE-translation algorithm makes the text nicely readable (within
the limits of the ReB's constricted HTML).
Some Book-Making Examples
The following example command will download just the technology news
from Yahoo's plain-text web site. Since the start page is somewhere
completely different from the news-article pages, I use the -M option
(with some wildcards) to restrict the followed links to just the desired
tech-news pages. I also specify a large number of options (see the usage
message for rbmake below for what they mean). Here's the command (note
that this would really be all one line):
rbmake -jpebziofa yahoo
-M 'http://dailynews.yahoo.com/htx/nm/200\d\d\d\d\d/tc/*'
'http://dailynews.yahoo.com/htx/tc/nm/?u'
The resulting book is named yahoo.rb. It has the news all in one
page, the punctuation is enhanced, the paragraphs are book-style,
images are included, the erroneous page-breaks are removed, and I get
prompted to set the author & title information.
Another way to handle this is to create an option file. You can dump
the default options to a file like this:
rbmake -D >default.opt
You could also just add the -D option to the previous yahoo command and
it would dump an option file that represents all those options. With an
option file, it is easy to reuse settings and share them with others. For
instance, I have two option files named usatoday.opt and yahoo.opt. You
can use these directly from the web, like this:
rbmake -l http://rbmake.sourceforge.net/samples/usatoday.opt
rbmake -l http://rbmake.sourceforge.net/samples/yahoo.opt
A more interesting option file is named tij2.opt. It was used to create
the Thinking in Java edition 2 ebook I submitted to the Rocket Library.
If you were to download the TiJ HTML ebook from Bruce Eckel's web site
and unzip it into a local directory, you could use this command to
create an ebook just like the one I created:
rbmake -l http://rbmake.sourceforge.net/samples/tij2.opt
Also, if you're a fan of Baen's Webscription ebooks (which are some
great science fiction and fantasy, IMO), you can use my ws.opt file to
transform the HTML version of one of their ebooks into an even better
version than the one that they provide: it has a full-sized cover image,
book-style paragraphs, enhanced punctuation, chapters that start at the
top of a new page, and a "Go To" menu with links to things like the maps
and the table of contents. The ws.opt file also demonstrates how to pass
arguments into the option file. Here's how you use it:
rbmake -L 0671578545 -l ws.opt
The -L option (which must come before the option file) specifies that
the "0671578545" parameter is to be provided to ws.opt as its $1 parameter
(you could specify more -L options for more parameters). The option file
uses this value to determine which book to convert. In this case, rbmake
will create the book Ashes_of_Victory.rb.
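Because the book ID is passed in with -L, the same option file can be
reused for several books in a row. The loop below is a minimal sketch of
that idea, not something shipped with rbmake: the second ID is a made-up
placeholder, and the RUN variable is a hypothetical dry-run switch that
prints each command instead of running it.

```shell
#!/bin/sh
# Sketch: convert several Webscription books with one option file.
# The first ID is the one used in the example above; the second is a
# made-up placeholder. Substitute the IDs of books you actually have.
book_ids="0671578545 0000000000"

# RUN is a hypothetical dry-run switch: it defaults to "echo", so each
# command is printed rather than executed. Set RUN to the empty string
# to really invoke rbmake.
RUN=${RUN-echo}

count=0
for id in $book_ids; do
    # -L must come before -l so that ws.opt receives the ID as its
    # first parameter.
    $RUN rbmake -L "$id" -l ws.opt
    count=$((count + 1))
done
echo "processed $count book(s)"
```

Saved under a name of your choosing (say convert-all.sh, again
hypothetical), "RUN= sh convert-all.sh" would execute the commands rather
than print them.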
The ws.opt file also shows you two different ways to populate the "Go
To" menu: using the "Menu-Item" setting (which is also accessible via
the -g option), and inserting META tags into the body of the web page
(which is done by using some text-substitution rules).
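For the META-tag route, a source page might look like the one written by
the sketch below. The file name, menu descriptions, and target URLs are
invented for illustration, and placing the tags in the HEAD is an
assumption based on ordinary HTML practice (ws.opt inserts its tags with
substitution rules instead). Only the NAME="rocket-menu"
CONTENT="Description=URL" form comes from the syntax given earlier.

```shell
#!/bin/sh
# Sketch: write a small source page whose META tags request two
# "Go To" menu entries. All names and URLs here are hypothetical.
page="${TMPDIR:-/tmp}/rocket-menu-demo.html"

cat >"$page" <<'EOF'
<HTML>
<HEAD>
<TITLE>Menu Demo</TITLE>
<META NAME="rocket-menu" CONTENT="Table of Contents=toc.html">
<META NAME="rocket-menu" CONTENT="Maps=maps.html">
</HEAD>
<BODY>
<P>Book text goes here.</P>
</BODY>
</HTML>
EOF

# Count the menu tags as a quick check that the page was written.
menu_items=$(grep -c 'NAME="rocket-menu"' "$page")
echo "$page requests $menu_items menu item(s)"
```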
Extracting Your Markup
You can now ask rbburst to merge in the markup from your ebook by using
the -m or -M option. The -M option expects the markup files to be in the
same directory as the .rb file and to share its base name. For instance,
typing "rbburst -M path/foo.rb" would look for the markup files
"path/foo.ra", "path/foo.rh", and "path/foo.rn". Only use this option if
you've copied your markup files out of the Librarian's directory
structure.
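The naming rule is simple enough to check by hand before running
rbburst -M. This sketch (using the hypothetical book path from the
example above) strips the .rb suffix and reports which of the three
markup files are present. It makes no claim about whether rbburst needs
all three of them.

```shell
#!/bin/sh
# Sketch: list the markup files that "rbburst -M" would look for.
book="path/foo.rb"        # hypothetical path, taken from the example above
base="${book%.rb}"        # drop the .rb suffix, giving "path/foo"

found=0
missing=0
for ext in ra rh rn; do
    if [ -f "$base.$ext" ]; then
        echo "found:   $base.$ext"
        found=$((found + 1))
    else
        echo "missing: $base.$ext"
        missing=$((missing + 1))
    fi
done
echo "$found markup file(s) found next to $book"
```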
The other markup-merging option, -m, needs to know where to find your
Library books and what rocket-ID the files are stored under (since the
markup is unique to each reader, you can have different markup files for
each reader you own). The easiest way to do this is probably to set the
RB_LIB_DIR and RB_ID environment variables.
The RB_LIB_DIR environment variable specifies the ".../Library/books"
directory (where the "[none]" dir and your rocket-ID dir exist). MS
Windows users can normally ignore this setting unless the RocketLibrarian
was not installed in the default place on the C: drive (the default
setting is
"C:\Program Files\NuvoMedia\RocketLibrarian\Library\books").
In addition, rbburst has the -d option that can be used to override any
other directory location.
The RB_ID environment variable specifies the rocket-ID of the reader
where your markup files are stored (a rocket-ID looks like this:
somebody1234). You can also specify the ID via the -r option,
which is especially useful if you have multiple ReB readers.
How you set an environment variable depends on your operating
system. For DOS, you could put a
"set RB_LIB_DIR=C:\Some\Path" into the autoexec.bat file.
For Unix-like OSes (including Linux), put something like
"setenv RB_LIB_DIR /some/path" or
"export RB_LIB_DIR=/some/path" into the appropriate .profile
or .login file (see your shell's manpage for the details).
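For a Bourne-style shell, the two settings together might look like the
following sketch. The library path is a made-up example and the ID is the
sample one from above. The directory check assumes that the rocket-ID
directory is a direct child of RB_LIB_DIR, as the description of that
variable suggests.

```shell
# Sketch for sh/bash/ksh startup files (csh users would use setenv).
RB_LIB_DIR="$HOME/RocketLibrarian/Library/books"   # hypothetical location
RB_ID="somebody1234"                               # sample ID from the text
export RB_LIB_DIR RB_ID

# Optional sanity check: warn if the per-reader directory is not there.
# Assumes the rocket-ID directory sits directly under RB_LIB_DIR.
if [ -d "$RB_LIB_DIR/$RB_ID" ]; then
    lib_status="ok"
else
    lib_status="missing"
    echo "warning: no directory $RB_LIB_DIR/$RB_ID" >&2
fi
```

With both variables exported, "rbburst -m SomeBook.rb" should no longer
need the -r option on the command line.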
Note that the -m option is not confused by a .rb file that was exported
from the library using a different name. The software knows the right
name to use when it looks up the files in your Library dir.
Here's an example. The following command sets a rocket-ID and uses the
default library directory location:
rbburst -r somebody1234 -m SomeBook.rb
Tell Me More
To find out more, see the various tool manpages:
rbmake, rbburst,
rbinfo, and rbdump
Or check out the library interface or the
rbmake home page.
All this was created by Wayne Davison.