Re: [proposal] The CRCT Standard for bidirectionality support -Version 0.00




> Departing, in a rather violent way, from Unicode specifications, I am

In fact, I don't see much of a departure from the Unicode specs... Also,
I don't see why one should depart from them instead of implementing
them.

> being saved by text entry software, entered text shall be brought into
> a canonical form, as far as directionality markers are concerned.  

Unicode defines this to be "logical order", with possible directionality
markers.
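
To illustrate what "logical order" means in practice, here is a small
Python sketch (my own example, not part of either spec). The sample
string is an assumption; it only shows that characters are stored in
reading order and that each one carries a bidirectional class.

```python
import unicodedata

# Logical order: characters are stored in the order they are read,
# whatever direction they are displayed in.
text = "abc \u05e9\u05dc\u05d5\u05dd 123"  # Latin, Hebrew "shalom", digits

# Bidirectional class of each stored character, in storage order.
classes = [unicodedata.bidirectional(ch) for ch in text]

for ch, cls in zip(text, classes):
    print(f"U+{ord(ch):04X} {cls}")
```

The display order is computed from these classes at rendering time;
the stored text itself is never reversed.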

> This canonical form shall allow the user to sort records according to
> a bidirectional text key, in unambiguous way - without having to

What do you mean by "bidirectional text key"?

> ignore the directionality markers added to the text.



> Directionality markers:
> -----------------------
> 1. There shall be four markers:

Unicode defines the following control chars (page 6-72):
200E	left-to-right mark (LRM)
200F	right-to-left mark (RLM)
202A	LTR embedding (LRE)
202B	RTL embedding (RLE)
202C	pop directional formatting (PDF)
202D	LTR override (LRO)
202E	RTL override (RLO)
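
One way to check these code points and their abbreviations is to ask
Python's unicodedata module. This is just a sketch; the helper name
describe() is my own.

```python
import unicodedata

# The seven directional control characters discussed here.
CODE_POINTS = [0x200E, 0x200F, 0x202A, 0x202B, 0x202C, 0x202D, 0x202E]

def describe(cp):
    """Return (Unicode name, bidirectional class) for a code point."""
    ch = chr(cp)
    return unicodedata.name(ch), unicodedata.bidirectional(ch)

for cp in CODE_POINTS:
    name, cls = describe(cp)
    print(f"{cp:04X}\t{name}\t{cls}")
```

For the embedding, override and pop characters the bidirectional
class is the same as the usual abbreviation (LRE, RLE, PDF, LRO,
RLO); the two marks simply get the strong classes L and R.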

> 2. In non-HTML documents, those markers shall be assigned the
> following ASCII codes: [TBD]

I'm afraid they won't fit into ASCII... One may define any hack one
wishes, but the question is whether it would be accepted. ASCII wasn't
designed to handle bi-di text.
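
A quick sketch of the problem, in Python: the Unicode markers have no
ASCII representation at all, so any ASCII assignment would have to
reuse an existing code. The helper fits_ascii() is my own name, and
UTF-8 is shown only as one encoding that can carry the marker.

```python
RLM = "\u200f"  # right-to-left mark

def fits_ascii(text):
    """Return True if every character of text can be encoded as ASCII."""
    try:
        text.encode("ascii")
        return True
    except UnicodeEncodeError:
        return False

print(fits_ascii("shalom"))        # plain Latin text
print(fits_ascii("shalom" + RLM))  # same text with a marker appended
print(RLM.encode("utf-8"))         # the marker as UTF-8 bytes
```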

> 3. In HTML (and related, such as XML) documents, those markers shall
> have representation similar to the following: 

See http://www.w3.org/International/O-HTML-bidi.html

HTML4 has the <SPAN> tag and the DIR attribute to define directional
blocks.
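
As a rough sketch of how software could emit such a block, assuming
it already knows the direction of the run (the function name
span_with_dir() is hypothetical):

```python
from html import escape

def span_with_dir(text, direction):
    """Wrap text in a SPAN carrying the HTML4 DIR attribute."""
    if direction not in ("ltr", "rtl"):
        raise ValueError("direction must be 'ltr' or 'rtl'")
    # Escape the text so that markup characters in it stay literal.
    return f'<SPAN DIR="{direction}">{escape(text)}</SPAN>'

print(span_with_dir("\u05e9\u05dc\u05d5\u05dd", "rtl"))
```

The text inside the SPAN stays in logical order; only the attribute
tells the browser which base direction to use.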

See also:
http://babel.alis.com:8080/web_ml/html/rfc-i18n/rfc-i18n-4.en.html


> 3. The proposal is geared toward the needs of Hebrew users.  Other RTL
> languages may have needs which I overlooked.  Again, let me know what
> is needed!

I don't even want to think about Arabic yet; judging from the Unicode
book, it is much more of a problem than Hebrew... And I am completely
ignorant of it, too :( Though the standard certainly should include it,
and Unicode does.

And one last thing: please configure your mail agent so that it doesn't
produce lines longer than 72 characters. They are very inconvenient to
read.

-- 
frodo@sharat.co.il	\/  There shall be counsels taken
Stanislav Malyshev	/\  Stronger than Morgul-spells
phone +972-2-6245112	/\  		JRRT LotR.
http://sharat.co.il/frodo/	whois:!SM8333