The escape sequences

Implicit vs. explicit

Implicit mode: CSI 8 h
Explicit mode: CSI 8 l

ECMA 48 defines BDSM (Bi-Directional Support Mode): SM 8 for implicit mode and RM 8 for explicit mode. Our default would be the high state; therefore a generic “reset” operation would “set” this flag. (There are already some other “upside down” modes, so this is not a problem.)

I recommend that we stick to the BDSM escape sequence defined in ECMA 48, except that the default would be the high (implicit) state.

The ECMA standards don’t specify how these sequences apply to existing or subsequently arriving contents. See below for my recommendation.

Apps that wish to use implicit mode shouldn’t tamper with this setting (or they can force implicit mode if they really wish to). Apps that wish to use explicit mode should switch to explicit mode on startup (plus set the direction, see below), and switch to implicit mode on exit.

LTR vs. RTL

LTR: CSI 1 SPACE k
RTL: CSI 2 SPACE k
Default: CSI 0 SPACE k (or CSI SPACE k)

ECMA 48 defines two similar escape sequences: SPD (Select Presentation Directions) and SCP (Select Character Path). Based on the specs and TR/53’s examples, they both set the same concept of “character path” (LTR or RTL); it’s not that they are stacked in a particular order, nor that one specifies a sub-mode within the other. Whichever was last received of these two kinds specifies the “character path”, that is, the direction.

There are two differences between them. The first difference is that SPD also specifies the line orientation (e.g. horizontal) and line progression (e.g. top-to-bottom), something that this specification doesn’t deal with. The second difference is the scope where they take effect: SPD is immediately applied to the entire contents (I’m unsure about the scrollback buffer; ECMA is unaware of this concept) while SCP is applied to the current and subsequent lines only. (It’s unclear to me whether “subsequent” refers to the lines below or the ones printed later on; I assume the ones printed later on. No details are given about what exactly this means.) The second difference alone wouldn’t justify introducing separate sequences: since their second parameter is about how they apply anyway, more possible values could simply have been added there. So I believe the difference is more intrinsic. Line orientation and line progression can’t be defined per-line or per-paragraph; they must be global by their nature. Only the character path can be local (i.e. per-line or per-paragraph). So SPD was designed to specify global properties (applicable to the entire terminal) while SCP sets the property at the per-line or per-paragraph level. Given this conceptual difference, SCP is the one that matches our design.

ECMA 48 mandates that SCP takes two parameters that have no defaults. However, the very similar SPD defaults them to 0, so I assume this is an oversight in the specification.

ECMA TR/53’s “Worked example 4” suggests that this escape sequence is only meant to work in explicit mode. I don’t see a good reason for this limitation.

I don’t want to break the symmetry, yet I find it crucial to have a means of saying “return to the terminal emulator’s default direction”. Some terminals might only implement LTR as the default, some might take it from the locale or user settings.

I recommend that we choose the SCP escape sequence with the following changes and clarifications:

  • The first parameter receives a new possible value: 0 means “the terminal’s default”.
  • The first parameter defaults to 0.
  • The second parameter defaults to 0. Exactly how the new value is applied is no longer implementation-dependent, but specified later in this document. As opposed to values 1 and 2, no cursor movement is involved.
  • Values 1 and 2 for the second parameter don’t need to be supported. (The value of 2 should definitely not be supported as we never update the model from the view.)
  • The sequence affects the direction of both implicit and explicit paragraphs (including of course the fallback direction of implicit paragraphs with autodetection).
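
To illustrate the parameter handling recommended in the list above, here is a minimal Python sketch of how a terminal might interpret an SCP sequence. The function name parse_scp and the returned strings are made up for this example; only the parameter values and their defaults come from the recommendation. Rejecting the second parameter’s values 1 and 2 is one possible choice, since the text only says they don’t need to be supported.

```python
from typing import Optional

# Hypothetical labels; only the numeric values come from the recommendation.
DIRECTIONS = {0: "default", 1: "ltr", 2: "rtl"}


def parse_scp(sequence: str) -> Optional[str]:
    """Return the direction selected by an SCP sequence, or None if the
    sequence is not an SCP form that this sketch accepts.

    Expected form: ESC [ Ps1 ; Ps2 SPACE k, where both parameters are optional.
    """
    prefix, suffix = "\x1b[", " k"
    if not (sequence.startswith(prefix) and sequence.endswith(suffix)):
        return None
    body = sequence[len(prefix):len(sequence) - len(suffix)]

    values = []
    for param in body.split(";"):
        if param == "":
            values.append(0)              # an omitted parameter defaults to 0
        elif param.isascii() and param.isdigit():
            values.append(int(param))
        else:
            return None
    if len(values) > 2:
        return None
    while len(values) < 2:
        values.append(0)                  # missing trailing parameters default to 0

    path, effect = values
    if effect != 0:
        return None                       # assumption: values 1 and 2 are ignored
    return DIRECTIONS.get(path)           # None for unknown first parameters
```

With this sketch, both CSI SPACE k and CSI 0 SPACE k select the terminal’s default direction.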

Apps mustn’t rely on the default being LTR, as it’s not necessarily the case. If an app wishes to have LTR or RTL mode, it should emit SCP 1 or SCP 2 on startup, and SCP 0 to restore the default on exit. In particular, if an app switches to explicit mode, it should definitely specify LTR or RTL too.
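
The startup and exit behavior described above could look roughly like this on the application side. This is a sketch only: the helper name explicit_bidi is hypothetical, and the order in which the two sequences are emitted on exit is my assumption, as the text doesn’t prescribe one.

```python
import contextlib
import io
import sys

CSI = "\x1b["


@contextlib.contextmanager
def explicit_bidi(direction, out=sys.stdout):
    """Switch to explicit mode with a fixed direction; restore defaults on exit.

    direction must be "ltr" or "rtl" (hypothetical labels for SCP 1 and SCP 2).
    """
    scp = {"ltr": "1", "rtl": "2"}[direction]
    out.write(CSI + "8l")          # BDSM low: explicit mode
    out.write(CSI + scp + " k")    # SCP 1 or SCP 2: don't rely on the default
    out.flush()
    try:
        yield
    finally:
        out.write(CSI + "0 k")     # SCP 0: back to the terminal's default direction
        out.write(CSI + "8h")      # BDSM high: implicit mode, the proposed default
        out.flush()


# Usage: capture what a hypothetical RTL app would emit around its output.
demo = io.StringIO()
with explicit_bidi("rtl", demo):
    demo.write("app output")
```

An app that stays in implicit mode but needs a particular direction would emit only the SCP part.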

I’m a bit uncertain what to do with SPD.

Apps that implement BiDi according to this specification should not emit it. Still, terminals might support it to some extent, in case a legacy app emits it. For example, SPD 0 (CSI 0 SPACE S) and SPD 3 (CSI 3 SPACE S) could be interpreted as aliases to SCP 1 and SCP 2, respectively.

For fullscreen applications it would be really handy to have an escape sequence which sets the direction for their entire canvas (normal or alternate screen, excluding the scrollback; splitting in two a paragraph crossing the boundary between the normal screen and its scrollback if necessary). It’s tempting to use SPD for this. The problem is that all the other modes (BDSM, autodetection etc.) would also have to have such counterparts, but they don’t. It would be ugly for SPD to spread the current BDSM, autodetection etc. modes too across all the lines. For consistent spreading of all these modes across all the lines, apps have to find other means (e.g. clear the screen). So SPD doesn’t really help here.

See also “Other writing directions” below for SPD.

Other output modes

Box mirroring: CSI ? 2 5 0 0 h (temporary)
No box mirroring: CSI ? 2 5 0 0 l (temporary)

Autodetection: CSI ? 2 5 0 1 h (temporary)
No autodetection: CSI ? 2 5 0 1 l (temporary)

For all these new output modes not defined by ECMA, I recommend picking new numbers within the DECSET/DECRST space. They have the advantage that apart from setting to high or low, there’s also a slot for saving and restoring the value (alas not implemented by all terminal emulators). (I’m also tempted to introduce the trailing letter d for restoring the default.) Another advantage over custom new sequences is that all emulators recognize the DECSET/DECRST format, and at least silently ignore the unrecognized values without potentially emitting garbage or executing some other operation. A disadvantage is that all the current DEC private modes are global for the terminal emulator; this might be a reason to choose something else.

For autodetection I have a few weak arguments for picking the disabled state as the default. This is the only mode known by ECMA; they don’t mention autodetection. This mode is the more backwards compatible choice with keyboard arrow swapping. This is the mode that results in fewer surprises, such as right-aligned text when the data happens to be RTL, for users that don’t care about RTL at all. Autodetection significantly improves the behavior in many use cases, but doesn’t fully fix them. E.g. in RTL mode, the command cat /etc/services produces a quite unexpected result without autodetection; with autodetection it’s mostly as expected but lone # characters still appear on the wrong side. I don’t think advocating such an “almost fixed” solution here is a good idea over forcing users to find a proper one (e.g. in this case switch to LTR mode for that command). Autodetection is subject to further experiments and improvements, such as looking at the entire output of a command instead of each paragraph separately. That being said, if strong arguments arise, we might change the default. We might also say that the default is unspecified, and that terminal emulators may pick their favorite one or offer a setting to their users. Utilities that require one particular mode or the other should save+set the mode at startup and restore upon exit.
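
The save+set and restore steps can be sketched as follows. I’m assuming xterm’s conventional forms for saving and restoring DEC private modes (CSI ? Pm s and CSI ? Pm r), which not every emulator implements. The function names are hypothetical, and 2501 is the temporary number listed above.

```python
CSI = "\x1b["
AUTODETECTION = 2501  # temporary mode number, not final


def save_and_set(mode: int, enable: bool) -> str:
    """Save the current value of a DEC private mode, then set or reset it."""
    final = "h" if enable else "l"
    return f"{CSI}?{mode}s{CSI}?{mode}{final}"


def restore(mode: int) -> str:
    """Restore the previously saved value of a DEC private mode."""
    return f"{CSI}?{mode}r"


# A utility that requires autodetection to be off would emit this at startup...
startup = save_and_set(AUTODETECTION, enable=False)
# ...and this upon exit.
cleanup = restore(AUTODETECTION)
```

On emulators that lack the save/restore slot, the restore sequence has no saved value to fall back on, so the utility cannot reliably return to the user’s previous setting.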

At this moment, VTE’s work-in-progress patch uses DECSET/DECRST 2500 for mirroring the box drawing characters, and DECSET/DECRST 2501 for toggling autodetection. The choice of these decimal numbers is a total misuse of the hexadecimal U+2500, the beginning of box drawing characters in Unicode. These numbers are not meant to be final; suggestions are welcome.

Keyboard arrow swapping

Arrow keys swapping: CSI ? 1 2 4 3 h (perhaps temporary)
No arrow keys swapping: CSI ? 1 2 4 3 l (perhaps temporary)

Unlike the output modes, the keyboard mode is global. Here all the advantages for using a DEC private mode still apply, while the disadvantage does not. All the current modes that affect the keyboard are DEC private modes (e.g. 1, 66, 1050–1053, 1060–1061). For these reasons I recommend that we introduce a new DEC private mode, not necessarily using a number near to the existing keyboard related ones, or the newly introduced output BiDi modes.

For keyboard arrow swapping, to be consistent with the BDSM escape sequence that switches between implicit and explicit mode, I recommend that we go for a “default high” property. The high value enables keyboard swapping (provided that the cursor is inside an RTL paragraph at the time the key is pressed); the low value disables it.

At this moment, VTE’s work-in-progress patch uses DECSET/DECRST 1243 for swapping the arrow keys. The number represents the permutation that is applied to the four arrow keys (in the alphabetical order of their generated escape sequences: up, down, right, left). The choice of this number is not necessarily final.
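
To spell out how the number 1243 encodes the swap, here is a small Python sketch. The function names are hypothetical; the key order and the condition on the RTL paragraph follow the text above.

```python
# Arrow keys in the alphabetical order of their escape sequences:
# CSI A (up), CSI B (down), CSI C (right), CSI D (left).
ARROWS = ["up", "down", "right", "left"]


def apply_permutation(digits, keys=ARROWS):
    """Read each digit as a 1-based index into keys."""
    return [keys[int(d) - 1] for d in digits]


def effective_key(key, swapping_enabled, cursor_in_rtl_paragraph):
    """Return the arrow key the application should actually receive."""
    if not (swapping_enabled and cursor_in_rtl_paragraph):
        return key
    swapped = dict(zip(ARROWS, apply_permutation("1243")))
    return swapped[key]
```

Applying 1243 to (up, down, right, left) yields (up, down, left, right): only the third and fourth keys trade places, which is exactly the left/right swap.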

Levels

There should be no escape sequence for the conformance level (e.g. whether BiDi control chars are remembered). Emulators should always use the highest level they implement.

How these escape sequences apply

[This is in draft state, subject to discussions and improvements.]

The escape sequence for arrow key swapping sets a global property and takes effect immediately. This subsection is about the escape sequences that modify per-paragraph display properties.

The BiDi mode is tracked for each paragraph. Furthermore, there is always a current mode of the terminal, defined by the latest relevant escape sequences seen. (This current mode is not directly visible to the user, similarly to how e.g. an \e[31m on its own doesn’t make anything appear in red.)

When receiving a BiDi-related escape sequence, the emulator’s current mode is updated accordingly.

When receiving a BiDi-related escape sequence, if the cursor stands at the first logical position of a paragraph then the received BiDi property value is immediately applied on that paragraph (even if the escape sequence reinforces a previously set current value, which might be different from the paragraph’s). The other BiDi-related properties of the given paragraph remain intact. If the cursor is not at the first logical position of a paragraph (including when it’s at the first logical column of a paragraph’s second or subsequent lines) then the new mode is not applied on the paragraph. (See in “Rationale and dropped ideas” for explanation of this design.)

Whenever a newline (linefeed) character is received and handled as appropriate, if the cursor moves into a new paragraph, the emulator’s current state is applied to that paragraph at the cursor’s new location. (If the cursor remains inside the same paragraph, no action is taken. Note: receiving a newline character does not convert the preceding line to a hard terminated one in most of the emulators I’ve tested.)

Whenever two adjacent paragraphs are joined (a letter is printed, causing a wrap to the next line), the first old paragraph’s BiDi properties live on for the joined paragraph, the second old one’s properties are forgotten.
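
The rules so far (updating the current mode, applying it at the start of a paragraph, on newline, and on joining) can be modeled with a small Python sketch. All class and method names are hypothetical, a paragraph is represented by its BiDi properties alone, and only two of the properties are modeled for brevity.

```python
from dataclasses import dataclass, field, replace


@dataclass
class BidiProps:
    implicit: bool = True         # BDSM; high by default in this proposal
    direction: str = "default"    # SCP: "default", "ltr" or "rtl"


@dataclass
class Terminal:
    current: BidiProps = field(default_factory=BidiProps)

    def on_bidi_sequence(self, name, value, paragraph, at_paragraph_start):
        """Handle one BiDi-related escape sequence that sets property `name`."""
        setattr(self.current, name, value)     # the current mode is always updated
        if at_paragraph_start:
            # Only the received property is applied; the others remain intact.
            setattr(paragraph, name, value)

    def on_newline_into(self, paragraph):
        """The cursor moved into a different paragraph: apply the current state."""
        paragraph.implicit = self.current.implicit
        paragraph.direction = self.current.direction

    @staticmethod
    def join(first, second):
        """Two adjacent paragraphs merge: the first one's properties live on."""
        return replace(first)


# Usage: an SCP 2 received mid-paragraph changes only the current mode.
term = Terminal()
para = BidiProps()
term.on_bidi_sequence("direction", "rtl", para, at_paragraph_start=False)
next_para = BidiProps()
term.on_newline_into(next_para)    # the next paragraph picks up RTL
```

In this usage example, para keeps its original direction while next_para becomes RTL, mirroring how \e[31m affects only what is printed afterwards.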

The sequence ED (Erase in Display) is the one used by ncurses and probably many other fullscreen apps to initially clear the display. It is crucial that this sequence spreads the current BiDi modes into the entire cleared area. I propose that its behavior be refined in the following way. It’s enforced that the following lines:

  • ED 0: all the lines below the cursor, excluding the cursor’s line,
  • ED 1: all the lines above the cursor, excluding the cursor’s line and the scrollback,
  • ED 2: all the lines, excluding the scrollback

are one-line paragraphs on their own (that is, both their preceding line as well as the line in question end in a hard newline). (Note that the last scrolled out line also needs to change its ending to a hard newline in modes 1 and 2. If ED 1 is received while the cursor is in the topmost row, no lines are erased but the boundary between the scrollback and the writable buffer is still converted to a hard newline.) These lines (paragraphs) listed in the bullet points all receive the current BiDi parameters.
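
As an illustration of which rows are affected, here is a minimal Python sketch. Rows are numbered from 0 at the top of the screen, the scrollback is not part of the numbering, and the function names are hypothetical.

```python
def ed_paragraph_rows(mode: int, cursor_row: int, screen_rows: int) -> range:
    """Screen rows that become one-line paragraphs and receive the
    terminal's current BiDi parameters when ED <mode> is handled."""
    if mode == 0:
        return range(cursor_row + 1, screen_rows)   # below the cursor's line
    if mode == 1:
        return range(0, cursor_row)                 # above the cursor's line
    if mode == 2:
        return range(0, screen_rows)                # every line of the screen
    raise ValueError(f"ED mode not covered by this sketch: {mode}")


def scrollback_boundary_becomes_hard(mode: int) -> bool:
    """Whether the last scrolled-out line must end in a hard newline."""
    return mode in (1, 2)
```

For ED 1 with the cursor in the topmost row, ed_paragraph_rows returns an empty range while scrollback_boundary_becomes_hard is still true, matching the note above.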

There are a couple of other ways a paragraph can split into two, including sequences that operate within a single line (e.g. DCH, EL) as well as ones that operate across lines (e.g. SD, SU, possibly within a scroll region). My recommendation:

  • For SD and SU, wherever lines are removed or inserted, we ensure that the boundaries are all hard newlines.
  • Newly added empty lines (by SD, SU) are all one-line paragraphs (similarly to ED) and receive the terminal’s current BiDi parameters.
  • With SD and SU, for every content that moves (or remains) on the screen, the BiDi properties move along (or remain) with that content.
  • For DCH, EL and possibly other similar sequences, if a paragraph is split in two, both new paragraphs receive the old one’s BiDi properties.

Whenever a new line appears at the bottom due to scrolling (e.g. a linefeed is printed), and that new line is a paragraph on its own (that is, the scrolling did not occur due to wrapping), it receives the terminal’s current BiDi properties.