Did you ever need to index an xml doc

and preserve the XML information in the index? May I present “the XML Indexer”.

My brother, whose very popular AJAX Bible app has been getting attention, needed an XML index of the KJV Bible and asked if I could help him build it. We would be parsing the KJV in XML format, and I needed to pull out the reference information for every occurrence of every word. I thought an XML indexer might be useful in more than one capacity, and there wasn’t much on the net or CPAN that could do it. It needed to be light and fast because it was going to parse the entire Bible, so a DOM parser, which builds the whole document tree in memory, was out of the question. So I wrote my own.

xml_indexer.pm is a module that indexes the words in an XML document and preserves the XML information about each occurrence of each word. It’s a little rough around the edges right now, but it works. It uses the expat parser, so it’s light and fast. Look at the bible_index.pl script for an example of how it works. I’ll do a tutorial on it later.
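To give a rough feel for the idea, here is a minimal sketch of a streaming word indexer. It is not the code from xml_indexer.pm (which is Perl); it is a Python illustration that uses the standard library’s expat binding, and the function name build_index, the sample tag names, and the shape of the index entries are all assumptions made up for this example.

```python
import re
import xml.parsers.expat
from collections import defaultdict


def build_index(xml_text):
    """Map each lowercased word to the XML context of every occurrence.

    Each occurrence is stored as (path, context), where path is the
    slash-joined chain of open element names and context is a list of
    (element name, attributes) pairs from the root down.
    """
    index = defaultdict(list)
    stack = []  # currently open elements as (name, attrs) pairs

    def start_element(name, attrs):
        stack.append((name, dict(attrs)))

    def end_element(name):
        stack.pop()

    def char_data(data):
        path = "/".join(name for name, _ in stack)
        context = list(stack)  # snapshot of the enclosing elements
        for word in re.findall(r"[A-Za-z']+", data):
            index[word.lower()].append((path, context))

    parser = xml.parsers.expat.ParserCreate()
    # Ask expat to deliver contiguous text in one piece where it can,
    # so words are less likely to be split across callbacks.
    parser.buffer_text = True
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.CharacterDataHandler = char_data
    parser.Parse(xml_text, True)
    return dict(index)


# Hypothetical sample document; the tag and attribute names are invented.
sample = """<bible>
  <book name="Genesis">
    <chapter number="1">
      <verse number="1">In the beginning God created the heaven and the earth.</verse>
      <verse number="2">And the earth was without form, and void.</verse>
    </chapter>
  </book>
</bible>"""

index = build_index(sample)
for path, context in index["earth"]:
    print(path, context)
```

Because the parser only keeps the stack of open elements and the index itself, nothing like a full document tree is ever built, which is the property that matters for a file the size of the whole Bible. For a really large file you would probably feed the parser from a file handle with ParseFile instead of a string.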

Update:
This baby has been confirmed to parse the entire Bible in Zefania XML format in under 3 minutes. That is a 16 MB file, and it spits out a 23 MB index in that space of time. Quite honestly, it surprised me.

2 Responses to “Did you ever need to index an xml doc”

  1. robbie mccorkle Says:

    Yes! Awesome. Precisely what I was looking for. I saw a couple of other implementations out there in AJAX, but none as developed or with the speed of this Bible tool. AND you let me download it? Well then, you’re definitely going on my kudos list.

    Btw, HI! I’m Robbie- my wife (Aissa) and I have been inspired to start an online bible study. So we did. Just like that. Took three weeks to pick up, but after an easy posting at myspace.com/biblestudyonline we became rather popular quickly. Our idea is spreading fast and we hope to find as many useful tools as possible in order to zip up a copy and let the world download it and use it.

    You can find our current Bible study tool at doodleprints.com/biblestudyonline, where we currently hold our own Bible studies. We’ve just implemented a new design, which will change again to support resizing per screen resolution, as well as to add Bible features and instructions.

    We’re looking for any help and suggestions on God’s idea to spread His Word - and I believe your tool would be a HUGE benefit to the lost - to help them quickly and easily understand the Bible.

    Keep in touch, perhaps you can help me with some cool interface features as this project grows.

    Let go, and let GOD!

    Robert & Aissa McCorkle
    New Web Ministries

  2. robbie Says:

    Are you thinking of implementing an option to show the books? That would be helpful for a default page view, or as a nice link option, whether it be “new/old testament” links or a “books” link.
