Kenichi Handa | 18 Aug 04:06

Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

I think it's about time to decide how to display these
formatting characters: LRE, RLE, LRO, RLO, PDF, LRM, RLM.

Eli wrote:
> Anyway, characters such as LRM should be automatically
> composed with the character that follows them, and then
> they will be invisible.

But if we do that, users lose control over exactly which
part of the text they select or delete, and exactly where
they insert text, because they can't put the cursor between
LRM, etc. and the following character.

I can think of these modes:

(1) invisible-mode (perhaps the default)

Hide them, for instance, by
  (aset standard-display-table #x202e [])

Then, you have to type C-f or C-b twice to pass over those
characters.  That means users can still put the cursor
anywhere if they move it carefully.

(2) light-visible-mode

Show them as a 1-pixel-wide space.

(3) heavy-visible-mode


Eli Zaretskii | 18 Aug 05:11

Re: Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

> From: Kenichi Handa <handa <at> m17n.org>
> Date: Wed, 18 Aug 2010 11:06:46 +0900
> 
> I think it's about time to decide how to display these
> formatting characters: LRE, RLE, LRO, RLO, PDF, LRM, RLM.

I agree.

> Eli wrote:
> > Anyway, characters such as LRM should be automatically
> > composed with the character that follows them, and then
> > they will be invisible.
> 
> But if we do that, users lose control over exactly which
> part of the text they select or delete, and exactly where
> they insert text, because they can't put the cursor between
> LRM, etc. and the following character.

Is that necessarily bad?  I'm not sure this will constitute a problem
in the usual cases.

> I can think of these modes:

I agree these modes are useful, but I'm not sure they should be the
default.  Yair wrote some time back that he intends to extend
whitespace.el to cover this functionality, so these ideas (all of them
good, I think) should be used there.

We could let the users use all those, plus the auto-composition with
the following character, and see what they like more.

Kenichi Handa | 18 Aug 06:31

Re: Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

In article <83pqxgr4xa.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:

> > But if we do that, users lose control over exactly which
> > part of the text they select or delete, and exactly where
> > they insert text, because they can't put the cursor between
> > LRM, etc. and the following character.

> Is that necessarily bad?  I'm not sure this will constitute a problem
> in the usual cases.

Just for viewing or for cutting and pasting the whole
paragraph, the hide-mode I wrote about in the previous mail
won't constitute a problem.  I think that is a usual case.

A problem happens when one wants to put point at (before or
after) those formatting characters or wants to insert/delete
those characters, which doesn't seem like a usual case.

For instance, just after one inserts RLE, it can be deleted
by DEL.  But, once one moves point (and thus RLE is composed
with the following char), he must use C-d to delete that
RLE.

In addition, we have a technical problem with composition.

For instance, if you have this text:
  r2l contents is embedded here [RLE] R2L CONTENTS [PDF].
the current bidi code generates glyphs in this visual order:
   ... here [PDF] STNETNOC L2R [RLE].
and that means we can't compose [PDF] with the following ".".

Eli Zaretskii | 18 Aug 07:56

Re: Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

> From: Kenichi Handa <handa <at> m17n.org>
> Cc: emacs-bidi <at> gnu.org
> Date: Wed, 18 Aug 2010 13:31:18 +0900
> 
> A problem happens when one wants to put point at (before or
> after) those formatting characters or wants to insert/delete
> those characters, which doesn't seem like a usual case.

I agree.  For that use-case, the user will want to reveal these
characters first.

> For instance, just after one inserts RLE, it can be deleted
> by DEL.  But, once one moves point (and thus RLE is composed
> with the following char), he must use C-d to delete that
> RLE.
> 
> In addition, we have some technical problem in composing.
> 
> For instance, if you have this text:
>   r2l contents is embedded here [RLE] R2L CONTENTS [PDF].
> the current bidi code generates glyphs in this visual order:
>    ... here [PDF] STNETNOC L2R [RLE].

Yes, this is intentional: when these formatting characters are
revealed, they should enclose the text they affect.

> and that means we can't compose [PDF] with the following ".".

Hmm.. I thought composition works on the buffer text level, i.e. it
examines characters in logical order (even if it sometimes does it

Kenichi Handa | 18 Aug 09:53

Re: Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

In article <E1Olbdk-0005bf-9j <at> fencepost.gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:

> > For instance, if you have this text:
> >   r2l contents is embedded here [RLE] R2L CONTENTS [PDF].
> > the current bidi code generates glyphs in this visual order:
> >    ... here [PDF] STNETNOC L2R [RLE].

> Yes, this is intentional: when these formatting characters are
> revealed, they should enclose the text they affect.

There is another way to enclose the text:

    ... here [RLE] STNETNOC L2R [PDF].

I don't know which is good.  Does the current code conform
to "5.2 Retaining Format Codes" of UAX#9?

> > and that means we can't compose [PDF] with the following ".".

> Hmm.. I thought composition works on the buffer text level, i.e. it
> examines characters in logical order (even if it sometimes does it
> backwards).  So you should be able to compose PDF with the period that
> follows it, no?

The current code never composes characters across a bidi
boundary.

In the above example, when bidi_move_to_visually_next sets
IT's position at [PDF], scan direction is changed, and thus
composition_compute_stop_pos is called to find the next

Eli Zaretskii | 18 Aug 12:37

Re: Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

> From: Kenichi Handa <handa <at> m17n.org>
> Cc: emacs-bidi <at> gnu.org
> Date: Wed, 18 Aug 2010 16:53:27 +0900
> 
> > > For instance, if you have this text:
> > >   r2l contents is embedded here [RLE] R2L CONTENTS [PDF].
> > > the current bidi code generates glyphs in this visual order:
> > >    ... here [PDF] STNETNOC L2R [RLE].
> 
> > Yes, this is intentional: when these formatting characters are
> > revealed, they should enclose the text they affect.
> 
> There is another way to enclose the text:
> 
>     ... here [RLE] STNETNOC L2R [PDF].

We could do that if it would help.  But since we have now decided not
to use auto-compositions for this, I guess this is a moot point.

For the record, I used the former method because I consider the
formatting characters part of the embedding.

> I don't know which is good.  Does the current code conform
> to "5.2 Retaining Format Codes" of UAX#9?

UAX#9 is totally silent regarding the _display_ of the formatting
characters.  If you just go by the book, you will reorder the above
text as

    ... here STNETNOC L2R [RLE] [PDF].

Kenichi Handa | 19 Aug 03:12

Re: Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

In article <E1Olg1Q-0002gL-7P <at> fencepost.gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:

> UAX#9 is totally silent regarding the _display_ of the formatting
> characters.  If you just go by the book, you will reorder the above
> text as

>     ... here STNETNOC L2R [RLE] [PDF].

> because RLE has the same level as the embedding, while PDF has the
> level of the outer text.  That is why bidi.c artificially enlarges the
> level of PDF to make it part of the embedding.  You will see a comment
> about this near the end of bidi.c:bidi_level_of_next_char.

I see.  Thank you for the explanation.  I agree that the
current way is better, at least for editing.

---
Kenichi Handa
handa <at> m17n.org

Yair F | 18 Aug 16:33

Re: Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

On Wed, Aug 18, 2010 at 5:06 AM, Kenichi Handa <handa <at> m17n.org> wrote:
> I think it's about time to decide how to display these
> formatting characters: LRE, RLE, LRO, RLO, PDF, LRM, RLM.
>

This is not a bidi-specific issue.  It is possible that these characters
and other Unicode control characters need to be treated as non-visible
characters. The additional characters are CGJ, IAA, IAS, ZWNBSP (BOM),
IAT, LSEP, PSEP, WJ, Invisible Operators, and all zero-width characters:
ZWSP, ZWNJ, ZWJ.

All of these characters modify the environment around them but do not
display glyphs. In some way they are like the TAB character.

I am thinking of extending whitespace.el to make them visible if the
user wishes, but IMO, they should not be visible by default. See the
impact on the HELLO file.

> (1) invisible-mode (perhaps the default)
>
> Hide them, for instance, by
>  (aset standard-display-table #x202e [])
>
> Then, you have to type C-f or C-b twice to pass over those
> characters.  That means users can still put the cursor
> anywhere if they move it carefully.
Yes, but currently the cursor is "gone" if it is on an invisible character.

>
> (2) light-visible-mode

Kenichi Handa | 19 Aug 04:49

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

In article <AANLkTinrGPzGquxPmfigvZLzGbids39yU942sfwKGMYk <at> mail.gmail.com>, Yair F
<yair.f.lists <at> gmail.com> writes:

> On Wed, Aug 18, 2010 at 5:06 AM, Kenichi Handa <handa <at> m17n.org> wrote:
> > I think it's about time to decide how to display these
> > formatting characters: LRE, RLE, LRO, RLO, PDF, LRM, RLM.

> This is not a bidi-specific issue.

Yes.  I included emacs-devel <at> gnu.org in CC:.

> It is possible that these characters
> and other Unicode control characters need to be treated as non-visible
> characters. The additional characters are CGJ, IAA, IAS, ZWNBSP (BOM),
> IAT, LSEP, PSEP, WJ, Invisible Operators, and all zero-width characters:
> ZWSP, ZWNJ, ZWJ.

> All of these characters modify the environment around them but do not
> display glyphs. In some way they are like the TAB character.

> I am thinking of extending whitespace.el to make them visible if the
> user wishes, but IMO, they should not be visible by default. See the
> impact on the HELLO file.

Extending whitespace.el will be good.  But we must hide
those characters by default anyway, so we need some
char-table to specify that.  Currently standard-display-table
is not created by default.  I think the first step is to
create it by default, and specify [] for all of those
characters.
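
A minimal sketch of that first step, assuming the seven bidi
formatting characters are the only ones to hide and that creating
the table on demand is acceptable.  The code points are the standard
Unicode assignments; none of this is code from the Emacs tree.

```elisp
;; Create standard-display-table if it does not exist yet; it is nil
;; until something makes it.
(unless standard-display-table
  (setq standard-display-table (make-display-table)))

;; LRM, RLM, LRE, RLE, PDF, LRO, RLO.  An empty glyph vector means
;; "display nothing" for that character.
(dolist (ch '(#x200e #x200f #x202a #x202b #x202c #x202d #x202e))
  (aset standard-display-table ch []))
```

Setting an entry back to nil restores the default display of that
character.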


Kenichi Handa | 1 Nov 09:15

Re: Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

In article <tl7vd77z591.fsf <at> m17n.org>, Kenichi Handa <handa <at> m17n.org> writes:

> I'd like to extend the elements of a display table.
> Currently only a glyph vector or nil is allowed.  It seems
> good to extend it so that it can completely control the
> displaying of a character (like by face and display text
> properties).

> At least, for (2), I want to specify a space width (relative
> or absolute), and for (4) I want to specify a special form
> (list?) containing a mnemonic label.

I've just committed a basic infrastructure for displaying
non-graphic and no-font characters.  For that, instead of
extending the current display table, I implemented a new
char table glyphless-char-display.

------------------------------------------------------------
Char-table to control displaying of glyphless characters.
Each element, if non-nil, is an ASCII acronym string (displayed in a box)
or one of these symbols:
  hexa-code: display with hexadecimal character code in a box
  empty-box: display with an empty box
  thin-space: display with 1-pixel width space
  zero-width: don't display

It has one extra slot to control the display of a character for which
no font is found.  The value of the slot is `hexa-code' or `empty-box'.
The default is `empty-box'.
------------------------------------------------------------
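
As an illustration of the description above, entries in the
char-table could be set as follows.  This is only a sketch: the
choice of characters and methods is an assumption made for the
example, and it sticks to the element values listed in the docstring.

```elisp
;; Show LRM and RLM as boxed acronyms.
(aset glyphless-char-display #x200e "LRM")
(aset glyphless-char-display #x200f "RLM")

;; Show the embedding and override controls (LRE..RLO,
;; U+202A..U+202E) as a 1-pixel-wide space.
(set-char-table-range glyphless-char-display
                      '(#x202a . #x202e) 'thin-space)

;; Do not display ZWNJ at all.
(aset glyphless-char-display #x200c 'zero-width)

;; Read the extra slot that holds the method used for characters
;; for which no font is found.
(char-table-extra-slot glyphless-char-display 0)
```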

Eli Zaretskii | 1 Nov 10:57

Re: Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

> From: Kenichi Handa <handa <at> m17n.org>
> Date: Mon, 01 Nov 2010 17:15:13 +0900
> Cc: emacs-bidi <at> gnu.org, emacs-devel <at> gnu.org
> 
> I've just committed a basic infrastructure for displaying
> non-graphic and no-font characters.

Thanks!

> Char-table to control displaying of glyphless characters.
> Each element, if non-nil, is an ASCII acronym string (displayed in a box)
> or one of these symbols:
>   hexa-code: display with hexadecimal character code in a box
    ^^^^^^^^^
Suggest to name this "hex-code" instead.

>   empty-box: display with an empty box
>   thin-space: display with 1-pixel width space
>   zero-width: don't display
> 
> It has one extra slot to control the display of a character for which
> no font is found.  The value of the slot is `hexa-code' or `empty-box'.
> The default is `empty-box'.

What will happen on a TTY?

> glyphless-char-control is a variable defined in `characters.el'.
  ^^^^^^^^^^^^^^^^^^^^^^
Suggest to name this "glyphless-char-display-control".

Kenichi Handa | 1 Nov 12:16

Re: Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

In article <E1PCr8f-0003bZ-UC <at> fencepost.gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:

> > Char-table to control displaying of glyphless characters.
> > Each element, if non-nil, is an ASCII acronym string (displayed in a box)
> > or one of these symbols:
> >   hexa-code: display with hexadecimal character code in a box
>     ^^^^^^^^^
> Suggest to name this "hex-code" instead.

Since a Google search finds many more pages for "hex code"
than for "hexa code", OK, I'll change the name.

> >   empty-box: display with an empty box
> >   thin-space: display with 1-pixel width space
> >   zero-width: don't display
> > 
> > It has one extra slot to control the display of a character for which
> > no font is found.  The value of the slot is `hexa-code' or `empty-box'.
> > The default is `empty-box'.

> What will happen on a TTY?

Ah, I forgot to mention that.  First, empty-box,
hexa-code, and acronym are displayed using a new face,
glyphless-char, which is defined like this:

(defface glyphless-char
  '((((type tty)) :inherit underline)
    (t :height 0.6))
   ...)

Eli Zaretskii | 13 Nov 14:51

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

> From: Kenichi Handa <handa <at> m17n.org>
> Cc: emacs-bidi <at> gnu.org, emacs-devel <at> gnu.org
> Date: Mon, 01 Nov 2010 20:16:57 +0900
> 
> And, for tty, as it's impossible to do the same thing as
> graphic terminal, the current code does this:
> 
> thin-space: same as empty-box
> hexa-code: display "U+XX", "U+XXXX", "U+XXXXXX" ,
> 	"E+XXXXXX" depends on the character code (the last
>         one is for a character of code >= #x110000).
> acronym: surround an acronym by "[" and "]" as this:
> 	"[ZWNJ]", "[LRE]"
> 
> At the moment, that is hardcoded in the function
> produce_glyphless_glyph of term.c.
> 
> And, for tty, `no-font' means a character not encodable by
> the terminal coding system.

There are a few issues that perhaps need to be fixed:

  . If the default value of terminal-coding-system is nil, glyphless
    character display does not take effect: all the non-ASCII
    characters are displayed as question marks.  I think this is
    because safe_terminal_coding claims it can safely encode any
    character.  This looks inconsistent and confusing, so I think we
    should fix that.

  . Composite characters are displayed as question marks regardless of

Kenichi Handa | 17 Nov 06:58

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

In article <83aalde3i0.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:

> There are a few issues that perhaps need to be fixed:

>   . If the default value of terminal-coding-system is nil, glyphless
>     character display does not take effect: all the non-ASCII
>     characters are displayed as question marks.  I think this is
>     because safe_terminal_coding claims it can safely encode any
>     character.  This looks inconsistent and confusing, so I think we
>     should fix that.

There was a bug in setting term->charset_list.  I've just
installed a fix.

>   . Composite characters are displayed as question marks regardless of
>     the setting of glyphless-char-display-control.  I think this is
>     because term.c:produce_composite_glyph does not consider the new
>     glyphless-char display feature.  I think users will expect that
>     composite characters behave like un-encodable characters on a TTY.

I think composite characters should be sent to a tty as is
(i.e. just sending encoded characters), then terminal may
correctly compose them.

By the way, with the latest trunk code, on tty terminal,
Emacs positions cursor incorrectly (at column 1) on empty
lines except for end-of-buffer.  I don't know which code is
wrong but, at least, it didn't happen when I committed the
big changes for glyphless-char-display.


Kenichi Handa | 17 Nov 08:14

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

In article <tl78w0sqyo8.fsf <at> m17n.org>, Kenichi Handa <handa <at> m17n.org> writes:
> >   . Composite characters are displayed as question marks regardless of
> >     the setting of glyphless-char-display-control.  I think this is
> >     because term.c:produce_composite_glyph does not consider the new
> >     glyphless-char display feature.  I think users will expect that
> >     composite characters behave like un-encodable characters on a TTY.

> I think composite characters should be sent to a tty as is
> (i.e. just sending encoded characters), then terminal may
> correctly compose them.

Ah, no, I misunderstood what you meant.  If the tty doesn't
support those characters, we should consult
glyphless-char-display.  I'll fix it soon.

---
Kenichi Handa
handa <at> m17n.org

Eli Zaretskii | 17 Nov 13:21

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

> From: Kenichi Handa <handa <at> m17n.org>
> Cc: emacs-devel <at> gnu.org
> Date: Wed, 17 Nov 2010 14:58:15 +0900
> 
> By the way, with the latest trunk code, on tty terminal,
> Emacs positions cursor incorrectly (at column 1) on empty
> lines except for end-of-buffer.  I don't know which code is
> wrong but, at least, it didn't happen when I committed the
> big changes for glyphless-char-display.

This is probably bug#7417.  It was introduced when I fixed the problem
with cursor positioning when displaying glyphless characters using the
zero-width method.

I will try to fix this as soon as I can.

Eli Zaretskii | 17 Nov 20:20

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

> From: Eli Zaretskii <eliz <at> gnu.org>
> Date: Wed, 17 Nov 2010 07:21:50 -0500
> Cc: emacs-devel <at> gnu.org
> 
> > From: Kenichi Handa <handa <at> m17n.org>
> > Cc: emacs-devel <at> gnu.org
> > Date: Wed, 17 Nov 2010 14:58:15 +0900
> > 
> > By the way, with the latest trunk code, on tty terminal,
> > Emacs positions cursor incorrectly (at column 1) on empty
> > lines except for end-of-buffer.  I don't know which code is
> > wrong but, at least, it didn't happen when I committed the
> > big changes for glyphless-char-display.
> 
> This is probably bug#7417.  It was introduced when I fixed the problem
> with cursor positioning when displaying glyphless characters using the
> zero-width method.
> 
> I will try to fix this as soon as I can.

Done.

Eli Zaretskii | 13 Nov 14:44

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

> From: Kenichi Handa <handa <at> m17n.org>
> Date: Mon, 01 Nov 2010 17:15:13 +0900
> Cc: emacs-bidi <at> gnu.org, emacs-devel <at> gnu.org
> 
> Actually, glyphless-char-control should be a customizable
> variable with proper set: function.  But, I don't know how
> to write the correct 'defcustom' form.  Could someone DTRT?

I didn't do this; volunteers are welcome.

> (2) For Windows port:
> 
> I wrote x_draw_glyphless_glyph_string_foreground (in
> w32term.c) by referring to x_draw_glyph_string_foreground in
> the same file.  But, it doesn't draw acronym nor hexa-code.
> I couldn't figure out what is wrong.  Please someone who
> knows Windows code fix it.

I fixed this.  The problem was that w32_draw_rectangle wiped out what
was drawn inside it; I switched the order, so the box is drawn before
the glyphs inside it.  Jason, could you please take a look and see if
I did everything right?

Assuming that on X the order doesn't matter (does it?), perhaps we
should switch the order there as well, for uniformity.

> (3) For mac port:
> 
> As I have no idea how to write a code for mac, the 'case
> GLYPHLESS_GLYPH:' part in ns_draw_glyph_string (in nsterm.m)

Eli Zaretskii | 13 Nov 15:07

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

> Date: Sat, 13 Nov 2010 15:44:55 +0200
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: jasonr <at> gnu.org, emacs-devel <at> gnu.org
> 
> In addition, I did this:
> 
>  . renamed glyphless-char-control to glyphless-char-display-control
> 
>  . renamed hexa-code to hex-code.
> 
>  . changed TTY display to enclose U+nnnn and "empty box" displays in
>    "[]", to simulate a box and make the display easier to read.
> 
>  . documented this feature in NEWS and in the manual.

Oh, and one other thing: there was a bug with positioning the cursor
on characters for which the zero-width method was specified in
glyphless-char-display-control.  (The cursor would jump to the end of
the line.)  I fixed that as well.

Kenichi Handa | 17 Nov 04:57

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

In article <83bp5te3s8.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:

> > (2) For Windows port:
> > 
> > I wrote x_draw_glyphless_glyph_string_foreground (in
> > w32term.c) by referring to x_draw_glyph_string_foreground in
> > the same file.  But, it doesn't draw acronym nor hexa-code.
> > I couldn't figure out what is wrong.  Please someone who
> > knows Windows code fix it.

> I fixed this.  The problem was that w32_draw_rectangle wiped out what
> was drawn inside it; I switched the order, so the box is drawn before
> the glyphs inside it.

Ah, I see.  Thank you for fixing it.

> Assuming that on X the order doesn't matter (does it?), perhaps we
> should switch the order there as well, for uniformity.

Perhaps.

> In addition, I did this:

>  . renamed glyphless-char-control to glyphless-char-display-control

>  . renamed hexa-code to hex-code.

Thank you.

>  . changed TTY display to enclose U+nnnn and "empty box" displays in

Eli Zaretskii | 17 Nov 13:26

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

> From: Kenichi Handa <handa <at> m17n.org>
> Cc: emacs-devel <at> gnu.org, jasonr <at> gnu.org
> Date: Wed, 17 Nov 2010 12:57:27 +0900
> 
> >  . changed TTY display to enclose U+nnnn and "empty box" displays in
> >    "[]", to simulate a box and make the display easier to read.
> 
> For U+NNNN, I decided not to use "[]" because it takes too
> many columns.  I thought underline or some background color
> (customizable through a face) was enough.  Don't you think 8
> columns (instead of 6 columns) is annoying?

It is annoying all right, but [U+1234][U+5678] is more readable than
U+1234U+5678.  However, if others disagree, I can change that back.

> >  . documented this feature in NEWS and in the manual.
> 
> I dared not write that because I have not yet made up my mind
> which is better: the current glyphless-char-display or extending
> the normal display-table.

By the way, one other issue is that display tables take precedence
over glyphless-char-display, in the sense that characters for which
there are non-trivial entries in the current display table are
displayed using the display table, disregarding any
glyphless-char-display-control settings.  If this is what we want, we
should probably document that.
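
A sketch of the precedence described above, assuming U+200E (LRM) as
the example character and "^" as an arbitrary stand-in glyph.  The
behavior stated in the comments is taken from the description in this
message, not from a separate test.

```elisp
;; Ask for LRM to be shown as a boxed acronym...
(aset glyphless-char-display #x200e "LRM")

;; ...and also give the same character a display-table entry.
(unless standard-display-table
  (setq standard-display-table (make-display-table)))
(aset standard-display-table #x200e (vector ?^))

;; Per the description above, LRM is now displayed as "^"; the
;; acronym setting is ignored until the display-table entry is
;; reset to nil:
;; (aset standard-display-table #x200e nil)
```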

Andreas Schwab | 17 Nov 13:55

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: Kenichi Handa <handa <at> m17n.org>
>> Cc: emacs-devel <at> gnu.org, jasonr <at> gnu.org
>> Date: Wed, 17 Nov 2010 12:57:27 +0900
>> 
>> >  . changed TTY display to enclose U+nnnn and "empty box" displays in
>> >    "[]", to simulate a box and make the display easier to read.
>> 
>> For U+NNNN, I decided not to use "[]" because it takes too
>> many columns.  I thought underline or some background color
>> (customizable through a face) was enough.  Don't you think 8
>> columns (instead of 6 columns) is annoying?
>
> It is annoying all right, but [U+1234][U+5678] is more readable than
> U+1234U+5678.  However, if others disagree, I can change that back.

\u1234\u5678 is even better.

Andreas.

-- 
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

Eli Zaretskii | 17 Nov 18:54

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

> From: Andreas Schwab <schwab <at> linux-m68k.org>
> Cc: Kenichi Handa <handa <at> m17n.org>,  emacs-devel <at> gnu.org
> Date: Wed, 17 Nov 2010 13:55:07 +0100
> 
> > It is annoying all right, but [U+1234][U+5678] is more readable than
> > U+1234U+5678.  However, if others disagree, I can change that back.
> 
> \u1234\u5678 is even better.

I'm fine with that, too.  If no one objects, I will move to \uNNNN in
a couple of days.

Btw, how to display code points smaller than 256? \uNN or \u00NN?

Stefan Monnier | 18 Nov 00:59

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

> Btw, how to display code points smaller than 256? \uNN or \u00NN?

I think \u00NN is the clear winner.

        Stefan

Eli Zaretskii | 18 Nov 21:04

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

> From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
> Cc: Andreas Schwab <schwab <at> linux-m68k.org>, emacs-devel <at> gnu.org,
>         handa <at> m17n.org
> Date: Wed, 17 Nov 2010 18:59:15 -0500
> 
> > Btw, how to display code points smaller than 256? \uNN or \u00NN?
> 
> I think \u00NN is the clear winner.

It's according to RFC 5137, so yes.

But what about code points above MAX_UNICODE_CHAR?  Currently we
display them as [E+NNNNNN].  Use \eNNNNNN?

Stefan Monnier | 18 Nov 23:15

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

>> > Btw, how to display code points smaller than 256? \uNN or \u00NN?
>> I think \u00NN is the clear winner.

> It's according to RFC 5137, so yes.

> But what about code points above MAX_UNICODE_CHAR?  Currently we
> display them as [E+NNNNNN].  Use \eNNNNNN?

I think \xNNNNNN would be preferable, since that's what we
use elsewhere.

        Stefan

Eli Zaretskii | 19 Nov 12:31

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

> From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
> Cc: schwab <at> linux-m68k.org, emacs-devel <at> gnu.org, handa <at> m17n.org
> Date: Thu, 18 Nov 2010 17:15:44 -0500
> 
> >> > Btw, how to display code points smaller than 256? \uNN or \u00NN?
> >> I think \u00NN is the clear winner.
> 
> > It's according to RFC 5137, so yes.
> 
> > But what about code points above MAX_UNICODE_CHAR?  Currently we
> > display them as [E+NNNNNN].  Use \eNNNNNN?
> 
> I think \xNNNNNN would be preferable, since that's what we
> use elsewhere.

Elsewhere where?

Eli Zaretskii | 20 Nov 16:06

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

> From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
> Cc: schwab <at> linux-m68k.org, emacs-devel <at> gnu.org, handa <at> m17n.org
> Date: Thu, 18 Nov 2010 17:15:44 -0500
> 
> >> > Btw, how to display code points smaller than 256? \uNN or \u00NN?
> >> I think \u00NN is the clear winner.
> 
> > It's according to RFC 5137, so yes.
> 
> > But what about code points above MAX_UNICODE_CHAR?  Currently we
> > display them as [E+NNNNNN].  Use \eNNNNNN?
> 
> I think \xNNNNNN would be preferable, since that's what we
> use elsewhere.

Done.
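
The TTY labels agreed on in this subthread can be sketched in Lisp as
below.  The real formatting is done in C in term.c; the helper name
is hypothetical.  The thread settles \u00NN and \uNNNN for small code
points and \xNNNNNN above MAX_UNICODE_CHAR (#x10FFFF); the form used
here for U+10000..U+10FFFF is an assumption, since the messages shown
do not discuss that range.

```elisp
(defun my-glyphless-tty-label (char)
  "Return a TTY hex label for CHAR (hypothetical helper)."
  (cond
   ;; At most four hex digits, zero-padded: \u00NN or \uNNNN.
   ((< char #x10000) (format "\\u%04X" char))
   ;; Assumption: other Unicode code points use \U and six digits.
   ((<= char #x10FFFF) (format "\\U%06X" char))
   ;; Above MAX_UNICODE_CHAR, i.e. not a Unicode code point.
   (t (format "\\x%06X" char))))

;; (my-glyphless-tty-label #x85)      gives the text \u0085
;; (my-glyphless-tty-label #x200e)    gives the text \u200E
;; (my-glyphless-tty-label #x3fff80)  gives the text \x3FFF80
```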

Andreas Schwab | 19 Nov 10:53

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

Eli Zaretskii <eliz <at> gnu.org> writes:

> But what about code points above MAX_UNICODE_CHAR?  Currently we
> display them as [E+NNNNNN].  Use \eNNNNNN?

\U00NNNNNN

Andreas.

-- 
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

Eli Zaretskii | 19 Nov 12:31

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

> From: Andreas Schwab <schwab <at> linux-m68k.org>
> Cc: Stefan Monnier <monnier <at> IRO.UMontreal.CA>,  emacs-devel <at> gnu.org,  handa <at> m17n.org
> Date: Fri, 19 Nov 2010 10:53:57 +0100
> 
> Eli Zaretskii <eliz <at> gnu.org> writes:
> 
> > But what about code points above MAX_UNICODE_CHAR?  Currently we
> > display them as [E+NNNNNN].  Use \eNNNNNN?
> 
> \U00NNNNNN

That'd be a lie, wouldn't it?  These are not Unicode code points.

Andreas Schwab | 19 Nov 12:47

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

Eli Zaretskii <eliz <at> gnu.org> writes:

>> From: Andreas Schwab <schwab <at> linux-m68k.org>
>> Cc: Stefan Monnier <monnier <at> IRO.UMontreal.CA>,  emacs-devel <at> gnu.org,  handa <at> m17n.org
>> Date: Fri, 19 Nov 2010 10:53:57 +0100
>> 
>> Eli Zaretskii <eliz <at> gnu.org> writes:
>> 
>> > But what about code points above MAX_UNICODE_CHAR?  Currently we
>> > display them as [E+NNNNNN].  Use \eNNNNNN?
>> 
>> \U00NNNNNN
>
> That'd be a lie, wouldn't it?  These are not Unicode code points.

Does it matter?

Andreas.

-- 
Andreas Schwab, schwab <at> linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."

Stefan Monnier | 17 Nov 14:24

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

>> It is annoying all right, but [U+1234][U+5678] is more readable than
>> U+1234U+5678.  However, if others disagree, I can change that back.

Even more so for chars followed by digits where the display could be
fairly confusing: U+123456 rather than [U+1234]56.

> \u1234\u5678 is even better.

I largely agree, since it's more in keeping with the Emacs tradition,
but again if digits follow, this is ambiguous: \u123456 (tho this could
also be displayed as \u1234\ 56).

        Stefan

Eli Zaretskii | 18 Nov 21:07

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

> From: Stefan Monnier <monnier <at> IRO.UMontreal.CA>
> Cc: Eli Zaretskii <eliz <at> gnu.org>, emacs-devel <at> gnu.org,
>         Kenichi Handa <handa <at> m17n.org>
> Date: Wed, 17 Nov 2010 08:24:57 -0500
> 
> I largely agree, since it's more in keeping with the Emacs tradition,
> but again if digits follow, this is ambiguous: \u123456 (tho this could
> also be displayed as \u1234\ 56).

I think it's extremely unlikely to have \u1234 followed by 56, and we
already have that problem with octal escapes anyway.  Also, don't
forget that \u1234 is displayed in a special face, which makes it
stand out when followed by normal characters.

Eli Zaretskii | 17 Nov 20:39

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

> From: Eli Zaretskii <eliz <at> gnu.org>
> Date: Wed, 17 Nov 2010 07:26:15 -0500
> Cc: emacs-devel <at> gnu.org
> 
> By the way, one other issue is that display tables take precedence
> over glyphless-char-display, in the sense that characters for which
> there are non-trivial entries in the current display table are
> displayed using the display table, disregarding any
> glyphless-char-display-control settings.  If this is what we want, we
> should probably document that.

And another issue: the character group c0-control includes the
newline.  So if the display of this group is set to anything at all,
the newlines are not displayed as such.  I doubt that users would
expect or want that when they customize the display of c0-control.  So
maybe we should exempt newline (and perhaps also TAB) from this group;
users who really want that can always set glyphless-char-display
directly.

WDYT?

Eli Zaretskii | 26 Nov 13:20

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

Ping!

> Date: Wed, 17 Nov 2010 21:39:49 +0200
> From: Eli Zaretskii <eliz <at> gnu.org>
> Cc: 
> 
> And another issue: the character group c0-control includes the
> newline.  So if the display of this group is set to anything at all,
> the newlines are not displayed as such.  I doubt that users would
> expect or want that when they customize the display of c0-control.  So
> maybe we should exempt newline (and perhaps also TAB) from this group;
> users who really want that can always set glyphless-char-display
> directly.
> 
> WDYT?

Kenichi Handa | 26 Nov 13:29

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

Sorry for the late response on this matter.

In article <834obfd9iy.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:

> > From: Eli Zaretskii <eliz <at> gnu.org>
> > Date: Wed, 17 Nov 2010 07:26:15 -0500
> > Cc: emacs-devel <at> gnu.org
> > 
> > By the way, one other issue is that display tables take precedence
> > over glyphless-char-display, in the sense that characters for which
> > there are non-trivial entries in the current display table are
> > displayed using the display table, disregarding any
> > glyphless-char-display-control settings.  If this is what we want, we
> > should probably document that.

If that is what we want, we surely should document it.
But, as I wrote before, I'm still hesitating over which is
better: keeping glyphless-char-display separate from the display
table, or integrating that functionality into the display table.

> And another issue: the character group c0-control includes the
> newline.  So if the display of this group is set to anything at all,
> the newlines are not displayed as such.  I doubt that users would
> expect or want that when they customize the display of c0-control.  So
> maybe we should exempt newline (and perhaps also TAB) from this group;
> users who really want that can always set glyphless-char-display
> directly.

> WDYT?

Eli Zaretskii | 27 Nov 09:42

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

> From: Kenichi Handa <handa <at> m17n.org>
> Cc: emacs-devel <at> gnu.org
> Date: Fri, 26 Nov 2010 21:29:43 +0900
> 
> But, as I wrote before, I'm still hesitating over which is
> better: keeping glyphless-char-display separate from the display
> table, or integrating that functionality into the display table.

What are the pros and cons that make you hesitate?

> > And another issue: the character group c0-control includes the
> > newline.  So if the display of this group is set to anything at all,
> > the newlines are not displayed as such.  I doubt that users would
> > expect or want that when they customize the display of c0-control.  So
> > maybe we should exempt newline (and perhaps also TAB) from this group;
> > users who really want that can always set glyphless-char-display
> > directly.
> 
> > WDYT?
> 
> I agree on exempting TAB and NL from c0-control group.

Done.
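
A minimal sketch of what this looks like from the user's side, assuming
the group and method names found in released Emacs versions
(c0-control, format-control, hex-code, thin-space); the particular
choices are only examples.  The second form follows the suggestion
above of setting glyphless-char-display directly for a character that
the group no longer covers.

```elisp
;; Customize how whole groups are shown.  The variable is a defcustom
;; with a setter, so set it through the Customize machinery rather
;; than with plain setq.
(customize-set-variable
 'glyphless-char-display-control
 '((c0-control . hex-code)          ; no longer includes TAB and newline
   (format-control . thin-space)))  ; LRM, RLM, and friends

;; A user who really wants TAB treated the same way can still set its
;; entry in the char-table directly (assumption: done after the
;; customization above, which does not touch TAB).
(set-char-table-range glyphless-char-display ?\t 'hex-code)
```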

Kenichi Handa | 29 Nov 07:35

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

In article <83wrnz5fac.fsf <at> gnu.org>, Eli Zaretskii <eliz <at> gnu.org> writes:

> > But, as I wrote before, I'm still hesitating over which is
> > better: keeping glyphless-char-display separate from the display
> > table, or integrating that functionality into the display table.

> What are the pros and cons that make you hesitate?

Both glyphless-char-display and the display table control how
each character is displayed.  I think such control should
be done by a single mechanism.  At least it will benefit
users in the long run.

But the biggest concern is backward compatibility.
Currently, a display table element is nil or a vector of
characters.  If there is code that assumes that a non-nil
element is a vector of characters, it will break.  Next, a
display table is not inherited.  So, if buffer-display-table
or a window-specific display table is set,
standard-display-table is not looked up.  Lastly, there are
several standard-display-XXX functions
(e.g. standard-display-8bit).  At the moment, I don't know
how to make the functionality of
glyphless-char-display-control work with them.

> > I agree on exempting TAB and NL from c0-control group.

> Done.

Thank you.
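
A minimal sketch of the non-inheritance point above, reusing the
display-table idiom from the start of this thread.  The buffer name
*bidi-test* is made up for the illustration.

```elisp
;; Hide U+202E (RLO) through the standard display table.
;; standard-display-table may be nil, so create it on demand.
(unless standard-display-table
  (setq standard-display-table (make-display-table)))
(aset standard-display-table #x202e [])

;; Give one buffer its own, otherwise empty, display table.
(with-current-buffer (get-buffer-create "*bidi-test*")
  (setq buffer-display-table (make-display-table))
  (insert "abc" (string #x202e) "def"))
```

Because display tables are not inherited, the entry made in
standard-display-table is not consulted in *bidi-test*; any setting
meant to apply there has to be repeated in the buffer's own table.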

Stefan Monnier | 29 Nov 19:06

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

> Both glyphless-char-display and the display table control how
> each character is displayed.  I think such control should
> be done by a single mechanism.  At least it will benefit
> users in the long run.

Agreed.

> Currently, a display table element is nil or a vector of characters.
> If there is code that assumes that a non-nil element is a vector of
> characters, it will break.

Let's not worry about that case for now.  Such code is probably rare and
easy to fix.

> Next, a display table is not inherited.  So, if
> buffer-display-table or a window-specific display table is set,
> standard-display-table is not looked up.

IIRC that's a bug in itself that should be fixed.

> Lastly, there are several standard-display-XXX functions
> (e.g. standard-display-8bit).  At the moment, I don't know how to make
> the functionality of glyphless-char-display-control work with them.

Most of those functions are largely historical baggage whose precise
meaning has evolved over time, so I think it's OK to change it yet a bit
further (and to deprecate them more if needed).

        Stefan

Eli Zaretskii | 20 Nov 15:38

Re: [emacs-bidi] Treatment of LRE,RLE,LRO,RLO,PDF,LRM,RLM

I now made glyphless-char-display-control a proper defcustom.

