
[proposal] The CRCT Standard for bidirectionality support - Version 0.00



Departing, in a rather violent way, from Unicode specifications, I am proposing the
following standard for bidirectionality support in the world of text processing software (such
as word processors, browsers, Motif edit boxes, etc.).

(The CRCT name, which I am proposing for the standard, has its historical roots in the
choice of the CTRL-R and CTRL-T control characters as a way to switch the directionality
of text in VIC-20/Commodore-64 based telecommunication software used by deaf persons
for phone communications about 10-15 years ago.)

Principles:
-----------
1. There shall be complete separation between the world of text entry and editing (as in edit
boxes and word processors), and the world of text rendition (as in browsers).
2. Text renderers shall employ no implicit methods for inferring text direction from the character
codes being used.  They shall rely only upon directionality markers embedded in the text, whether
entered by software or by the user.
3. Text entry software (such as word processors) shall provide the user with a way to have
full control of the directionality of any entered text.
4. Before being saved by text entry software, entered text shall be brought into a canonical
form, as far as directionality markers are concerned.  This canonical form shall allow the user
to sort records according to a bidirectional text key in an unambiguous way, without having
to ignore the directionality markers added to the text.

Text entry/editing:
-------------------
The following levels of control shall be provided:
1. Full user control - to switch the direction of text being entered (even if the user switched from
Hebrew letters to Latin letters or digits), the user has to enter the appropriate directionality
marker where he wants to switch direction.
2. Stupid automatic control - Hebrew letters shall switch to RTL direction, Latin letters and
digits shall switch to LTR direction.  Any other characters shall leave the direction as it is.
3. Intelligent automatic control - level of performance similar to that seen in MS-Word.

In levels 2 and above, it shall be possible to override the automatically-determined directionality
of text by inserting the appropriate directionality marker.
Levels 1 and 2 are mandatory.  Level 3 is optional.
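
As an illustration of level 2 only, here is a minimal Python sketch of the rule "Hebrew letters
switch to RTL, Latin letters and digits switch to LTR, anything else keeps the current direction".
The function names, the choice of the Unicode range U+05D0..U+05EA for Hebrew letters, and the
initial LTR direction are my assumptions for the example; the proposal itself does not fix a
character encoding.

```python
from typing import List, Tuple

LTR = "LTR"
RTL = "RTL"


def is_hebrew_letter(ch: str) -> bool:
    # Assumption: Hebrew letters alef..tav as encoded in Unicode (U+05D0..U+05EA).
    return "\u05d0" <= ch <= "\u05ea"


def is_latin_letter_or_digit(ch: str) -> bool:
    # ASCII letters and digits only; everything else is treated as neutral.
    return ch.isascii() and ch.isalnum()


def level2_runs(text: str, initial: str = LTR) -> List[Tuple[str, str]]:
    """Split text (in logical order) into (direction, run) pairs using the level 2 rule."""
    runs: List[Tuple[str, str]] = []
    direction = initial
    current = ""
    for ch in text:
        if is_hebrew_letter(ch):
            new_dir = RTL
        elif is_latin_letter_or_digit(ch):
            new_dir = LTR
        else:
            new_dir = direction  # neutral character: leave the direction as it is
        if new_dir != direction and current:
            runs.append((direction, current))
            current = ""
        direction = new_dir
        current += ch
    if current:
        runs.append((direction, current))
    return runs
```

An editor could then wrap each RTL run in the RTL-start/RTL-end markers when saving.  A marker
typed by the user would simply take precedence over the direction computed here.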

Directionality markers:
-----------------------
1. There shall be four markers:
RTL-start
RTL-end
LTR-start
LTR-end
2. In non-HTML documents, those markers shall be assigned the following ASCII codes:
[TBD]
3. In HTML (and related, such as XML) documents, those markers shall have representation
similar to the following:
RTL-start = <rtl>
RTL-end = </rtl>
LTR-start = <ltr>
LTR-end = </ltr>
4. The start and end markers must be paired with each other.  The effect of a start marker shall
be to push the directionality of the preceding text (in logical order) onto a stack and enter the
desired directionality mode.  The effect of an end marker shall be to pop the stack and restore the
previous directionality.
5. Every document shall be considered as enclosed in an implicit LTR-start to LTR-end sequence.
In other words, it shall be rendered in LTR direction unless there is an explicit RTL
override.
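
To make the push/pop behaviour of points 4 and 5 concrete, here is a rough Python sketch that
walks through text containing the HTML-style markers and reports which direction applies to each
run.  The function name and the decision to raise an error on unpaired markers are assumptions of
mine; the proposal only says that markers must be paired.

```python
import re
from typing import List, Tuple

# Matches <rtl>, </rtl>, <ltr> and </ltr>.
MARKER = re.compile(r"<(/?)(rtl|ltr)>")


def directional_runs(marked: str) -> List[Tuple[str, str]]:
    """Return (direction, text) pairs in logical order for marker-annotated text."""
    stack = ["ltr"]  # point 5: the whole document sits inside an implicit LTR pair
    runs: List[Tuple[str, str]] = []
    pos = 0
    for m in MARKER.finditer(marked):
        if m.start() > pos:
            runs.append((stack[-1], marked[pos:m.start()]))
        is_end = m.group(1) == "/"
        name = m.group(2)
        if not is_end:
            stack.append(name)  # start marker: push and enter the new direction
        else:
            # End marker: must match the innermost open start marker.
            if len(stack) == 1 or stack[-1] != name:
                raise ValueError(f"unpaired end marker at offset {m.start()}")
            stack.pop()  # restore the previous directionality
        pos = m.end()
    if pos < len(marked):
        runs.append((stack[-1], marked[pos:]))
    if len(stack) != 1:
        raise ValueError("start marker without matching end marker")
    return runs
```

For example, directional_runs("abc <rtl>XYZ <ltr>123</ltr> W</rtl> def") yields an LTR run,
an RTL run, a nested LTR run, the rest of the RTL run, and a final LTR run.  A renderer would
lay out each run according to its direction and use no other information.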

Notes:
------
1. This is only a first draft.  If there is agreement with the principles presented in this proposal, but
disagreement with the details, then let me know and I'll revise accordingly.
2. I am sure that I overlooked all kinds of subtle points while trying to simplify the
proposed standard.  I would appreciate having those points brought to my attention.
3. The proposal is geared toward the needs of Hebrew users.  Other RTL languages may
have needs which I overlooked.  Again, let me know what is needed!
4. By addition of more directionality markers, it may be possible to extend the proposal to
accommodate languages which employ both horizontal and vertical directions.
                                                                                 --- Omer
E-mail:  omerz@actcom.co.il
