Tuesday 29 May 2007

Indexing

I made a start on the indexer a while ago. It can do a single pass, but I haven't yet looked at how to make it a background task. It's similar to subscriptions, which are run by Windows Task Scheduler, but the Unix indexer script takes great care to restart the indexer if it crashes. I need to investigate how Task Scheduler works to see if it can do this.
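
For illustration, here is a rough sketch of a supervisor that watches a child process and reacts when it crashes. It is written in Python rather than Perl purely for brevity, and it is not how the Unix script actually works: the function name `supervise`, the restart limit, and the delay are all assumptions made up for this example.

```python
import subprocess
import sys
import time


def supervise(command, max_restarts=3, delay_seconds=0.0):
    """Run `command` and restart it whenever it exits with a non-zero status.

    Returns the number of restarts that were needed before a clean exit.
    Raises RuntimeError once `max_restarts` (a hypothetical limit) is exceeded.
    """
    restarts = 0
    while True:
        # Block until the child process exits, then inspect its exit status.
        result = subprocess.run(command)
        if result.returncode == 0:
            return restarts
        if restarts >= max_restarts:
            raise RuntimeError(
                "command failed %d times, giving up" % (restarts + 1)
            )
        restarts += 1
        print("child exited with status %d, restarting" % result.returncode,
              file=sys.stderr)
        time.sleep(delay_seconds)
```

A scheduled task could start a wrapper like this instead of starting the indexer directly, so that the restart logic does not depend on what the scheduler itself offers.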

On top of this is the problem of indexing full text. Most of the formats the Unix version understands are converted using Unix-specific tools; these need to be replaced with pure Perl, Windows-specific, or portable tools. I've already found a module which should do the trick for HTML, and Ghostscript will probably work for PS and PDF.
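
To give an idea of what a portable converter for the HTML case involves, here is a minimal sketch using only Python's standard library. It is not the Perl module mentioned above; the names `TextExtractor` and `html_to_text` are invented for this example, and a real converter would also need to deal with character encodings and malformed markup.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of script and style."""

    def __init__(self):
        super().__init__()
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only keep text that is outside script/style elements.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Collapse all runs of whitespace into single spaces.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    """Return the plain text of an HTML string, suitable for indexing."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

Because it relies on nothing outside the standard library, a converter along these lines behaves the same on Windows and Unix, which is the property the replacement tools need.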
