RbHtml Routines


The RbHtml routines are used by the RbFetch routines (or, in rare instances, by you directly) to flesh out the contents of an RbPage object. You may logically consider them part of the RbPage class, since their only purpose is to update the HTML-related values in an RbPage object. They are separated out into their own file only to allow some future power-user to supply alternate HTML-parsing routines. You may safely ignore these functions in normal circumstances.

To use these routines, you need to have already created an RbPage object. Normally you'll just call the routines in the RbPage class to get your HTML or text data properly parsed, since they call the RbHtml routines as appropriate for the page's type. If you find this is not sufficient for you, read on.

For an HTML file, you can push data into the "content" buffer by calling the RbHtml_parsedPushFunc() function.

For a text file, you can push the data into the "content" buffer by calling the RbHtml_parsedTextPushFunc() function (which converts text into HTML using a variety of user-selectable options -- see the RbMake class for details on how to specify such optional settings).

After pushing the last of the data for this page, you must call RbHtml_flushParsedPush() before you use RbPage_write() to turn the page data into its final format.
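The push-then-flush order described above can be sketched as follows. This is only an illustration under stated assumptions: SketchPage, sketch_push(), sketch_flush(), and sketch_feed() are hypothetical stand-ins, not the library's types or functions, and the real RbHtml_parsedPushFunc() and RbHtml_flushParsedPush() signatures should be taken from the library's own header.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an RbPage: only a fixed-size "content" buffer. */
typedef struct {
    char content[256];
    size_t len;     /* bytes currently held in content */
    int flushed;    /* set once the final flush has run */
} SketchPage;

/* Stand-in for RbHtml_parsedPushFunc(): append one chunk of data.
 * Returns 0 on success, -1 if the page was already flushed or the
 * chunk would not fit (one byte is reserved for the terminator). */
static int sketch_push(SketchPage *pg, const char *buf, size_t n)
{
    assert(pg != NULL && buf != NULL);
    if (pg->flushed || n >= sizeof pg->content - pg->len)
        return -1;
    memcpy(pg->content + pg->len, buf, n);
    pg->len += n;
    pg->content[pg->len] = '\0';
    return 0;
}

/* Stand-in for RbHtml_flushParsedPush(): mark the page as complete. */
static void sketch_flush(SketchPage *pg)
{
    pg->flushed = 1;
}

/* Push every chunk in order, then flush exactly once at the end. */
static int sketch_feed(SketchPage *pg, const char *const *chunks, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (sketch_push(pg, chunks[i], strlen(chunks[i])) != 0)
            return -1;
    sketch_flush(pg);
    return 0;
}
```

The point of the sketch is the ordering: data may arrive in any number of chunks, but the flush happens once, after the last chunk and before the page is written out.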

The RbPage data that these routines populate includes: "content" (for the filtered HTML data), the "tagTreeRoot" object (which contains a tree of the paragraph-affecting tags that were used in the HTML), the "paras" MArray (which notes the paragraph positions of all the paragraph-affecting HTML tags), and the "names" HashTable (which contains all the NAME/ID positions and whether they got referenced or not).

If you want to replace these routines, you'll probably also need to supply your own RbHtml_init() and RbHtml_cleanup() routines, which get called by RbMake_init() and RbMake_cleanup(), respectively.


The two callback routines that this code calls are RbMakeAllowUrlFunc and RbMakeScheduleUrlFunc. The pointers to these functions will have already been specified when you called RbMake_new().

The code calls the RbMakeAllowUrlFunc when it wants to know whether you want to include a given link, image, or audio file in the .rb file. The function should return true or false.
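As a rough illustration of the kind of decision such a callback makes, here is a minimal sketch that accepts only URLs under one prefix. The function name sketch_allow_url() and its two-argument form are assumptions made for this example; the actual RbMakeAllowUrlFunc signature is defined by the library and may take different arguments.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical allow-URL policy: accept a URL only if it starts with
 * allowed_prefix. A real callback would be adapted to whatever
 * arguments RbMakeAllowUrlFunc actually receives. */
static bool sketch_allow_url(const char *url, const char *allowed_prefix)
{
    if (url == NULL || allowed_prefix == NULL)
        return false;
    return strncmp(url, allowed_prefix, strlen(allowed_prefix)) == 0;
}
```

For example, with the prefix "http://example.com/", a link to "http://example.com/a.html" would be included and an image at "http://other.example/x.png" would be left out.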

The code calls the RbMakeScheduleUrlFunc when it wants you to schedule a URL to be fetched. For many people, this function will simply pass the URL and the page-type on to rbFetchURL().
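The "simply pass it on" pattern might look like the sketch below. Everything in it is hypothetical: SketchQueue and sketch_fetch_url() stand in for the fetch layer (the real rbFetchURL() arguments are not shown in this document), and the three-argument form of sketch_schedule_url() is an assumption, not the actual RbMakeScheduleUrlFunc signature.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_JOBS 8

/* Hypothetical record of one scheduled fetch. */
typedef struct {
    char url[128];
    int page_type;
} SketchJob;

/* Hypothetical fetch queue standing in for the fetch layer. */
typedef struct {
    SketchJob jobs[SKETCH_MAX_JOBS];
    size_t count;
} SketchQueue;

/* Stand-in for rbFetchURL(): record the request instead of fetching.
 * Returns 0 on success, -1 if the queue is full or the URL is too long. */
static int sketch_fetch_url(SketchQueue *q, const char *url, int page_type)
{
    if (q->count == SKETCH_MAX_JOBS || strlen(url) >= sizeof q->jobs[0].url)
        return -1;
    strcpy(q->jobs[q->count].url, url);
    q->jobs[q->count].page_type = page_type;
    q->count++;
    return 0;
}

/* Hypothetical schedule callback: forward the URL and page type unchanged. */
static int sketch_schedule_url(void *user_ptr, const char *url, int page_type)
{
    return sketch_fetch_url((SketchQueue *)user_ptr, url, page_type);
}
```

Keeping the callback this thin leaves all policy (whether to include a URL) in the allow callback and all transport details in the fetch layer.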