OCR and Document Management for Linux



Hi,

I'm looking for a simple document management system for Linux. All
I need is a system that can index the words of each scanned document
(using OCR), let the user add keywords of their own to the document,
and later retrieve the document's code (so you can find the physical
copy in your office) or a digital image of it. Hebrew support
(indexing Hebrew letters) is important, although I believe it is not
too hard to teach an OCR Hebrew letters. A web interface would be an
advantage. Conversion of the image to text is not needed, so the
requirements are quite minimal. Also, the input format of the
document is NOT important, since there are tools for converting from
any format to any other (I plan to index incoming faxes too,
automatically, because they currently arrive at a computer rather
than a fax machine).
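
A minimal sketch of the kind of index described above may make the
requirement concrete. It assumes the OCR step has already produced the
recognized words as a plain string. The class name DocumentIndex, its
methods, and the document codes are hypothetical and do not come from
any existing tool.

```python
import re
from collections import defaultdict


class DocumentIndex:
    """Toy inverted index: word -> set of document codes (hypothetical design)."""

    def __init__(self):
        self._index = defaultdict(set)

    @staticmethod
    def _tokenize(text):
        # In Python 3, \w matches Unicode letters, so Hebrew words are kept.
        return [word.lower() for word in re.findall(r"\w+", text)]

    def add(self, doc_code, ocr_text, keywords=()):
        """Index the OCR'd words and any user-supplied keywords under doc_code."""
        for word in self._tokenize(ocr_text):
            self._index[word].add(doc_code)
        for keyword in keywords:
            for word in self._tokenize(keyword):
                self._index[word].add(doc_code)

    def search(self, query):
        """Return the codes of documents that contain every word of the query."""
        words = self._tokenize(query)
        if not words:
            return set()
        result = set(self._index.get(words[0], set()))
        for word in words[1:]:
            result &= self._index.get(word, set())
        return result
```

Calling add() once per scanned page or incoming fax, and search() with
one or more words, returns the codes needed to locate the physical copy
or the stored image. A real system would also need persistent storage.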

There are many such systems for Windows; is there anything for Linux?

I searched Freshmeat but couldn't find anything. I found some OCR
programs (which could be modified, with some effort, to do the job
described above), but I have no idea which is the best: GOCR, OCRE,
XOCR, etc. Moreover, there is an Open Source OCR, "Illuminator", that
at first glance seems far superior but is not mentioned on Freshmeat
at all. There are other Open Source OCRs that are not mentioned
either, such as LOCR, SOCR, etc.

Can anybody point me to a solution for Linux, or at least help me
choose the right OCR?

Thanks in advance,
-- 
Eli Marmor

=================================================================
To unsubscribe, send mail to linux-il-request@linux.org.il with
the word "unsubscribe" in the message body, e.g., run the command
echo unsubscribe | mail linux-il-request@linux.org.il