
Re: hebrew support for wordtrans



A couple of other notes:

On Sun, 18 Nov 2001, Tzafrir Cohen wrote:

> On Sun, 18 Nov 2001, guy keren wrote:
>
> >
> > On Sun, 18 Nov 2001, Ricardo Villalba wrote:
> >
> > > At first I was using the "ISO 8859-8" text codec in Qt 2 to
> > > convert Hebrew text to Unicode, but this codec doesn't seem to
> > > work in Qt 3 (I get only one Hebrew letter for each Hebrew text).
> > >
> > > So with Qt 3 I used the "ISO 8859-8-I" text codec, which seems to work,
> > > but the results differ from those in Qt 2.
> > >
> > > I'm attaching two pictures, one from qwordtrans compiled with Qt 2 (and
> > > using internally "ISO 8859-8") and another one from qwordtrans compiled
> > > with Qt 3 (which uses "ISO 8859-8-I"). Which text is correct?

If you want to avoid most of the font issues, you can use the ISO10646-1
encoding. The Hebrew translation for KDE uses this encoding, and it
generally makes life easier when working in a mostly KDE/Qt environment
(for Qt >= 2). This may help you avoid things like the question marks in
the window title.

XFree comes with the font misc-fixed-*-iso10646-1, which includes Hebrew
glyphs.
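
As a rough illustration of the encoding side (a sketch in Python rather
than Qt, since it needs nothing beyond the standard library): the bytes of
an ISO 8859-8 string map one-to-one onto Unicode (ISO 10646) code points in
the Hebrew block. The sample word and the variable names are my own, not
taken from wordtrans. As far as I know, "ISO 8859-8" and "ISO 8859-8-I"
share the same byte mapping and differ only in whether the text is assumed
to be stored in visual or logical order.

```python
# Hebrew "shalom" in logical order: shin, lamed, vav, final mem.
# In ISO 8859-8 the Hebrew letters occupy bytes 0xE0-0xFA.
raw = bytes([0xF9, 0xEC, 0xE5, 0xED])

# Decode the 8-bit text into a Unicode string.
text = raw.decode("iso8859_8")

# Show the resulting code points (all in the U+05D0-U+05EA Hebrew range).
code_points = [f"U+{ord(ch):04X}" for ch in text]
print(code_points)

# Re-encode as UTF-8, e.g. for passing to a Unicode-aware toolkit.
utf8 = text.encode("utf-8")
print(len(raw), "bytes in ISO 8859-8,", len(utf8), "bytes in UTF-8")
```

Once the text is in Unicode, picking a font with an iso10646-1 encoding is
mostly a matter of the toolkit's font configuration.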

> >
> > actually, none of them looks fully correct, but the qt2 version seems
> > better - it puts all words in the right order, but reverses english-letter
> > words inside the line (see the 'XINU' part). the qt3 version, on the other
> > hand, gets the word ordering itself wrong.

It has another problem, one typical of working with "visual" Hebrew: the
"reversing" of the Hebrew is done before the line breaking. Since after
the "reversing" the beginning of the line contains the end of the text,
the line breaking puts the beginning of the text on the next line.
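
To make the ordering problem concrete, here is a small sketch (Python,
with uppercase Latin letters standing in for Hebrew, as is common in bidi
examples). The greedy wrapper and the function names are hypothetical
helpers of mine; this is not how Qt breaks lines, only a model of
"reverse, then break" versus "break, then reverse".

```python
def break_lines(text, width):
    """Greedy word wrap: pack whole words into lines of at most `width` chars."""
    lines, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        # Always accept the first word of a line, even if it is too long.
        if len(candidate) <= width or not current:
            current = candidate
        else:
            lines.append(current)
            current = word
    if current:
        lines.append(current)
    return lines


def to_visual(line):
    """Naive 'visual Hebrew': reverse a purely right-to-left line."""
    return line[::-1]


logical = "ONE TWO THREE FOUR"  # uppercase stands in for Hebrew letters
width = 10

# Reverse the whole text first, then break: the first display line now
# holds the END of the text, and the beginning lands on the second line.
reversed_first = break_lines(to_visual(logical), width)

# Break the logical text first, then reverse each line on its own.
broken_first = [to_visual(line) for line in break_lines(logical, width)]

print(reversed_first)
print(broken_first)
```

In the first result the reader, going top to bottom, meets FOUR and THREE
before ONE and TWO; in the second the lines come in reading order and only
the characters within each line are reversed.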

Note that you have no control over where the line breaks (this is
something for Qt to decide; you don't want to mess with the display code
too much and handle all the special cases yourself).

>
> It is not that bad. Besides the problem with the English text ("UNIX"
> shown in reverse), the only problem is that the bidi base direction is LTR
> (Left-To-Right, e.g. English). If you have to give a base direction as a
> parameter, you may be safer giving "Neutral".
>
> In case you want to handle hebrew/arabic code slightly differently:
>
> The bidi renderer (QT3, in this case) has to know something about the
> context of the text that it renders. It can be:
>
> * Part of a bigger chunk of LTR text. In this case any LTR character
> breaks a sequence of RTL (Right-To-Left, e.g. Hebrew) characters into
> subsequences that are "reversed" separately.
>
> * Part of a bigger chunk of RTL text. Similar, but here RTL is the
> dominant direction.
>
> * Neutral: either RTL or LTR, determined by the first character that has
> a "direction" (e.g. an English letter or a Hebrew letter).
>
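
The "Neutral" case in the list above can be sketched as follows. This is
only an approximation of the first-strong-character rule, written in
Python with the standard unicodedata module; the function name and the
"LTR" fallback are my own choices, and the real Qt 3 renderer may well do
this differently.

```python
import unicodedata


def base_direction(text, default="LTR"):
    """Guess the base direction from the first strongly directional character.

    Bidi class "L" is strong left-to-right; "R" (e.g. Hebrew) and "AL"
    (Arabic letters) are strong right-to-left. Digits, spaces and
    punctuation have no strong direction and are skipped.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "LTR"
        if bidi_class in ("R", "AL"):
            return "RTL"
    return default  # no strong character found (assumed fallback)


shalom = "\u05e9\u05dc\u05d5\u05dd"  # Hebrew "shalom"

print(base_direction("UNIX " + shalom))   # starts with a Latin letter
print(base_direction(shalom + " UNIX"))   # starts with a Hebrew letter
print(base_direction("123 " + shalom))    # digits and space are skipped
```

With a fixed LTR base direction the second string would be laid out as if
it were English text containing a Hebrew word, which is the situation
described above.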

BTW: if any of you want to test this with different base directions, and
assuming that no bidi rendering is done in the Qt 2 program, you can use
biditext (current temporary location:
http://linuxclub.il.eu.org/R2L/tzafrir/r2l-tarball/ . Hmm... I should do
something to finish packaging it).

After installing it, you can run:

biditext qwordtrans

And then control the base direction using one of the 'r2l' programs.
You'll need to refresh the window to have this change take effect: either
figure out how to use 'refreshd' from that package, or hide/unhide the
window to redraw it.

-- 
Tzafrir Cohen                        /"\
mailto:tzafrir@technion.ac.il        \ /  ASCII Ribbon Campaign
Taub 229, 04-829-3942                 X   Against  HTML  Mail
http://www.technion.ac.il/~tzafrir   / \

