Re: [Q]: Intranet tools



"Peter L. Peres" <plp@actcom.co.il> writes:

> 
> How about freeWAIS? It already comes with an HTML interface for searching,
> and all you have to do is add a processor for inserted records that
> triggers a rebuild. It is free and known to work well (at least
> one version of Linux used it to allow full-text indexing of the entire
> system and manpages in HTML).
> 

Hello Peter. Thank you for the kind reply.

Actually, you've raised another question: how would one implement a
global Intranet search facility? There are many issues here, and
the most important is that the documents don't share a common format. You
may have manpages (roff), mailing list archives (mailfolder), web
pages (HTML), raw text documents, PostScript, PDF, DVI, TeX, etc. Add
the fact that any of these documents may also be compressed (let's assume
gzip compression only) and you get the picture.
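
To make the format problem concrete, here is a minimal sketch of how the format of a document might be guessed from its first bytes, with gzip handled transparently. The function names (read_head, guess_format) and the format labels are made up for illustration, and the checks are rough heuristics, not a reliable classifier.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def read_head(path, size=1024):
    """Return the first `size` bytes of a file, gunzipping it if needed."""
    with open(path, "rb") as f:
        magic = f.read(2)
    opener = gzip.open if magic == GZIP_MAGIC else open
    with opener(path, "rb") as f:
        return f.read(size)


def guess_format(head):
    """Guess a document format from its leading bytes (heuristic only)."""
    lowered = head.lower()
    if head.startswith(b"%PDF-"):
        return "pdf"
    if head.startswith(b"%!"):
        return "postscript"
    if head.startswith(b"\xf7\x02"):  # DVI preamble opcode 247, format id 2
        return "dvi"
    if head.startswith(b"From "):  # mbox-style separator line
        return "mailfolder"
    if b"<html" in lowered or b"<!doctype html" in lowered:
        return "html"
    # roff sources usually contain requests such as .TH, .SH or .\" comments
    if any(line.startswith((b".TH", b".SH", b'.\\"'))
           for line in head.splitlines()):
        return "roff"
    if b"\\documentclass" in head or b"\\documentstyle" in head:
        return "tex"
    return "text"  # fallback: treat anything else as raw text
```

An indexer could call guess_format(read_head(path)) on every file it visits and pick a format-specific handler from the result.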

One should be able to:

1. Index all of the documents above in a reasonable way (for example,
you wouldn't want a PostScript operator such as 'newpath' indexed as a
word of your PostScript documents)

2. Once a user finds an appropriate entry using a search facility over
an index and clicks on the link, the document has to be converted to
HTML and displayed in the browser.
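
The two requirements above could be sketched roughly as follows: one text extractor per format feeds the indexer, and the same extractor is reused to produce HTML on demand. Everything here (EXTRACTORS, index_words, to_html) is a hypothetical illustration covering only three formats; in particular the PostScript extractor is deliberately crude and only keeps the contents of simple (...) string literals.

```python
import html
import re
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the text of an HTML document and drops the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def html_text(source):
    collector = _TextCollector()
    collector.feed(source)
    collector.close()
    return " ".join(collector.chunks)


def postscript_text(source):
    # Keep only the contents of (...) string literals, so operators such
    # as 'newpath' never reach the index. Nested parentheses and escape
    # sequences are ignored in this sketch.
    return " ".join(re.findall(r"\(([^()\\]*)\)", source))


# Hypothetical registry: format label -> function returning indexable text
EXTRACTORS = {
    "html": html_text,
    "postscript": postscript_text,
    "text": lambda source: source,
}


def index_words(fmt, source):
    """Step 1: return the set of lowercased words worth indexing."""
    text = EXTRACTORS[fmt](source)
    return {word.lower() for word in re.findall(r"[A-Za-z0-9]+", text)}


def to_html(fmt, source):
    """Step 2: convert a document to HTML when the user clicks a hit."""
    if fmt == "html":
        return source  # already displayable
    text = EXTRACTORS[fmt](source)
    return "<html><body><pre>" + html.escape(text) + "</pre></body></html>"
```

A real system would need a proper converter for each format (roff, DVI, PDF and so on); the point of the sketch is only that indexing and display can share one per-format dispatch table.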

This is kinda difficult to achieve. Of course, one could just keep a
store of HTMLized copies of every document in the system, but that
solution is kind of ugly too...

-- 
Alexander L. Belikoff
Berger Financial Research Ltd.
abel@bfr.co.il