ECMA TR/53 Review

I have many problems with ECMA TR/53. Still, people have put a lot of work into it, so it deserves to be evaluated and its few good parts kept. Pointing out the problems also explains why I didn’t adopt the rest.

Some parts have been adopted by ECMA-48. In order to fully understand TR/53 (e.g. the exact behavior of the control functions), both documents have to be read simultaneously.

They are dated 1990–1992, that is, 26–28 years ago as of writing this document. The computer world has changed a lot since then.

Actually, TR/53’s first edition is dated Dec 1990, ECMA-48’s current latest (fifth) edition is dated June 1991, and TR/53’s second edition is dated June 1992. That is, ECMA-48 must have incorporated the first edition of TR/53. I studied the second edition because the first one is hard to read and unsearchable (scanned pages). For the same reason, I don’t know what the differences are.

Predates the Unicode BiDi algorithm (first versions: 1998–1999) by many years. Our solution has to build around the well-established BiDi algorithm.

No known implementation. This suggests that the standard is either unclear/incomprehensible, or is bad or at least impractical, or is too old for today’s requirements, or that there’s no demand for what it aims to address. Partial ad-hoc BiDi support in certain emulators, and open feature requests in the bug trackers of many others, rule out the last possibility.

What it comes up with is complicated. Even if it were easily and clearly understandable, I doubt any terminal emulator or terminal-based application would implement it.

Makes many changes to the data component, that is, amends the definition of many escape sequences, making them more complex (and in turn, harder to test, more likely to contain implementation bugs).

Gives only a minimal amount of explanation of its design decisions and rationale.

No notion of paragraphs, so probably (although it’s not specified) the implicit mode applies to lines.

Was not designed with rewrap-on-resize, an I-beam-shaped cursor, or other newer features in mind.

At the beginning of 6.1.2 and 6.1.3 it gives definitions:

“The data component is used to store the information received from the input component and to make it available to a presentation process […]”

and

“The presentation component is used for receiving the information from the data component through the presentation process and for producing the graphic image output.”

This is depicted in figure 2.2. Later in 7.1 it defines the device component select mode (DCSM):

“there must be a means of specifying whether the control functions apply to the presentation component or to the data component.”

The presentation component mode contradicts the earlier definitions and the figure. My best understanding is that incoming printable characters always operate on the data layer, but control functions (escape sequences) might operate on either of them. This still contradicts the earlier definitions and the figure, though.

It is absolutely unclear to me, and no rationale is given, why “there must be a means […]”. I have no idea how a mode where input characters operate on the data component but control sequences operate on the presentation component was expected to be useful, or what demand it was expected to satisfy.

It is unclear how the data component is supposed to be updated when a control function other than moving the cursor (that is, one that changes the text contents) operates on the presentation component. E.g. in 7.3, the description of delete character (DCH) says

“Whether characters are deleted in the data component or in the presentation component depends now on the setting of the [DCSM].”

So on one hand they are deleted from the presentation component while the data component is left unchanged, on the other hand the presentation component is still generated from the data component by the presentation process, thus reverting the effect of that control function? Or is the data component also updated using the inverse of the presentation process?
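The two readings can be made concrete with a small Python sketch. It is only an illustration under loud assumptions: the presentation process is replaced by a stand-in that reverses the whole line (as if the line were purely right-to-left), and all function names are hypothetical, not taken from TR/53 or ECMA-48.

```python
def present(data):
    """Stand-in presentation process (assumption): a purely RTL line,
    so the visual order is simply the reverse of the logical order."""
    return data[::-1]


def dch_presentation_only(data, column):
    """Delete one character at a visual column without touching the data."""
    shown = present(data)
    del shown[column]
    # The next run of the presentation process starts from the unchanged
    # data component, so the deleted character reappears.
    return shown, present(data)


def dch_through_inverse(data, column):
    """Map the visual column back to a logical index, then delete in the data."""
    logical_index = len(data) - 1 - column  # inverse of the stand-in above
    remaining = data[:logical_index] + data[logical_index + 1:]
    return remaining, present(remaining)


data = list("ABCDE")
print(dch_presentation_only(data, 0))  # deletion is lost after regeneration
print(dch_through_inverse(data, 0))    # deletion survives regeneration
```

With a real BiDi layout the inverse mapping would not be a simple index flip, which is part of why the question above stays open.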

7.1 “The BI-DIRECTIONAL SUPPORT MODE (BDSM)” is the best part of this proposal: it grasps the need for two significantly distinct modes (implicit and explicit) and what kind of applications they’d be used for.

The fragment “because the data stream does not contain the control functions” makes it clear that TR/53’s implicit mode corresponds to my implicit mode level 1.

Its implicit mode doesn’t support my level 2 (the concept of BiDi control characters probably didn’t exist back then).

Neither TR/53 nor ECMA-48 specifies the scope where this mode applies (unlike for SPD and SCP, where this is specified). Does it affect the entire screen, and if so, what happens to earlier onscreen contents upon a mode change? Or is it more fine-grained?

7.1 “The DEVICE COMPONENT SELECT MODE (DCSM)”: I’ve mentioned it before, but let’s emphasize it once again: operating on the presentation component (and keeping the data component underneath in sync? or letting it drift away?) is a terrible idea. Making it the default mode is even worse.

End of 7.1 defines explicit mode as the default (reset state). We’ve seen that the default needs to be the implicit mode.

In explicit mode, it allows the cursor to walk backwards in the data layer as characters are received, see 6.3.1 “If the direction of the implicit movement is opposite to that of the character progression” and see the SIMD control function in 7.2. It allows explicitly marking segments of text to be reversed for display purposes, see SRS and SDS in 7.2. These all extend (complicate) the emulation behavior much more than I would like.

They modify the emulation behavior in incompatible ways, hence if an app incorrectly believes that the emulator supports them, the output becomes broken.

If a BiDi-aware application needs to work on emulators not supporting these features, then it needs to be able to emit visual order, making these new control functions absolutely unnecessary.

They split the responsibility: BiDi is done partially on the application side and partially in the terminal emulator, rather than having the entire responsibility clearly in one place.

They aren’t flexible enough to support all the use cases required by BiDi, e.g. when two independent strings of foreign directionality happen to be placed next to each other.

An application having convenient access to a BiDi library can much more easily just emit the visual order rather than figure out how to use these sequences so that the logical order results in the desired layout (which is not even always possible).

They offer 3 substantially different ways to achieve a desired visual order: to emit visual order, to make the cursor walk backwards, and to use embedding levels. Only one of these is demonstrated in the examples. There is not a word about the upsides and downsides of each approach, or about how applications should pick one.

It’s overkill. I don’t see why ECMA introduced these at all, or what need they were supposed to satisfy as opposed to emitting the visual order.

It took me a really long time to figure out how SRS and SDS are remembered on the data layer (there’s not a word about this in the document), and how they are updated on subsequent changes. This was because I kept thinking of them as analogous to the Unicode BiDi control characters, so that the positions where these control functions themselves were received would be remembered. Studying the examples made me rule out this possibility. It seems they work like escape sequences that change the color or similar attributes: their parameters apply to all the subsequently received characters. Each onscreen character carries its SRS and SDS values, just as it carries its color and other attributes. The document could’ve been explicit about this.
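A minimal Python sketch of this reading follows. It is my interpretation, not something TR/53 spells out. The class and field names are hypothetical, and the single integer `string_level` is a simplification standing in for whatever state SRS and SDS actually establish.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Attributes:
    foreground: int = 7    # ordinary attribute, e.g. a color index
    string_level: int = 0  # hypothetical: state established by SRS/SDS


@dataclass(frozen=True)
class Cell:
    char: str
    attrs: Attributes


class Screen:
    def __init__(self):
        self.pen = Attributes()  # attributes applied to upcoming characters
        self.cells = []

    def set_attributes(self, **changes):
        """Model of a control function: it only changes the current pen."""
        self.pen = replace(self.pen, **changes)

    def write(self, text):
        """Each received character is stored together with the current pen."""
        for ch in text:
            self.cells.append(Cell(ch, self.pen))


screen = Screen()
screen.write("ab")
screen.set_attributes(string_level=1)  # stands in for "start of string"
screen.write("CD")
screen.set_attributes(string_level=0)  # stands in for "end of string"
screen.write("e")
print([(c.char, c.attrs.string_level) for c in screen.cells])
```

Note that nothing records where the control functions themselves arrived; only the per-cell values survive, exactly as with colors.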

Therefore SRS and SDS do not really correspond to Unicode BiDi control characters, they correspond to the resolved embedding level in the middle of the BiDi algorithm. Apparently the overall design of TR/53’s explicit mode is that the application runs the first half of the BiDi algorithm (which didn’t yet exist back then), and instead of the final visual layout, it transmits the logical data along with the resolved embedding levels, from which point the terminal emulator completes the algorithm. Whenever the application needs to crop the data to fit, it needs to run the entire BiDi algorithm, crop, then undo its second phase (which cannot always be done correctly) and the terminal emulator will redo it. It’s a worse design than passing the visual order.
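Under this reading, the part left to the terminal emulator is essentially the reordering step, rule L2 of the Unicode BiDi algorithm (UAX #9): from the highest level down to the lowest odd level, reverse every contiguous run of characters at that level or higher. Here is a Python sketch of that rule alone. The function name is mine, uppercase Latin letters stand in for right-to-left characters, and the levels are assigned by hand rather than computed.

```python
def reorder_by_levels(chars, levels):
    """Return the characters in visual order, given one resolved
    embedding level per character (rule L2 of UAX #9 only)."""
    if len(chars) != len(levels):
        raise ValueError("need exactly one level per character")
    visual = list(chars)
    lv = list(levels)
    odd_levels = [level for level in lv if level % 2 == 1]
    if not odd_levels:
        return visual  # purely left-to-right: nothing to reverse
    for level in range(max(lv), min(odd_levels) - 1, -1):
        i = 0
        while i < len(lv):
            if lv[i] < level:
                i += 1
                continue
            j = i
            while j < len(lv) and lv[j] >= level:
                j += 1
            # Reverse the run [i, j); levels travel with their characters.
            visual[i:j] = visual[i:j][::-1]
            lv[i:j] = lv[i:j][::-1]
            i = j
    return visual


print("".join(reorder_by_levels("abc DEF ghi", [0] * 4 + [1] * 3 + [0] * 4)))
print("".join(reorder_by_levels("ABC 12 DEF", [1, 1, 1, 1, 2, 2, 1, 1, 1, 1])))
```

The first call shows an RTL word inside LTR text. The second shows a number (level 2) inside an RTL line, which keeps its left-to-right digit order after the double reversal.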

As seen in the section “BiDi control characters”, transferring the resolved embedding levels has the problem that it cannot separate two adjacent pieces of text having the same resolved embedding level from each other, and thus cannot reverse them independently. The conceptual fix to this problem, which isolates (or embeddings and marks) provide, cannot be backported to TR/53’s design.
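The limitation can be shown with a tiny Python sketch. It assumes only levels 0 and 1 and uses uppercase Latin letters as stand-ins for right-to-left text. The helper name is hypothetical.

```python
from itertools import groupby


def display_from_levels(cells):
    """cells is a list of (char, level) pairs with level 0 or 1.
    Every maximal run of level-1 cells is reversed as one unit."""
    out = []
    for level, run in groupby(cells, key=lambda cell: cell[1]):
        chars = [ch for ch, _ in run]
        out.extend(reversed(chars) if level % 2 else chars)
    return "".join(out)


# Two independent RTL strings, "ABC" and "DEF", placed directly next to each
# other. The per-character levels cannot tell them apart, so they are
# reversed together and end up swapping places.
adjacent = [(ch, 1) for ch in "ABCDEF"]
print(display_from_levels(adjacent))  # one merged run

# Only a level-0 cell in between keeps the two strings apart.
separated = [(ch, 1) for ch in "ABC"] + [("|", 0)] + [(ch, 1) for ch in "DEF"]
print(display_from_levels(separated))
```

The independently reversed layout "CBAFED" cannot be expressed by per-character levels alone, because the boundary between the two strings is not recorded anywhere.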

Worked example 4 (implicit mode) doesn’t mirror parentheses.
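To illustrate what mirroring means here, the following Python sketch displays a purely right-to-left string with and without mirroring. The sample string is made up (it is not the text of worked example 4), uppercase Latin letters stand in for right-to-left characters, and the small table covers only a few of the characters that Unicode marks as mirrored.

```python
# A few mirrored pairs; Unicode defines many more (Bidi_Mirrored property).
MIRROR = {"(": ")", ")": "(", "[": "]", "]": "[", "<": ">", ">": "<"}


def display_rtl(text, mirror=True):
    """Visual order for a purely RTL string: reverse the characters and,
    optionally, swap each bracket for its mirrored counterpart."""
    visual = text[::-1]
    if mirror:
        visual = "".join(MIRROR.get(ch, ch) for ch in visual)
    return visual


print(display_rtl("ABC (DEF)", mirror=False))  # brackets face the wrong way
print(display_rtl("ABC (DEF)"))                # brackets enclose the word
```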

According to the end of “Worked example 4”, implicit mode ignores control characters, even SCP. I don’t see a reason for ignoring it and thus not being able to specify the paragraph direction.

Worked example 4’s “Presentation component (left-to-right presentation direction)” example rendering is faulty: it contains spaces that aren’t present in the data component, and too few underscores. (It is correct in TR/53’s first edition.)

Does not talk about the input component, and does not consider keyboard arrow swapping.

All in all, unfortunately, TR/53 couldn’t contribute anything to my proposed design. The need for the two vastly different modes was obvious to me well before I became familiar with this document. Without TR/53, the naming “implicit” and “explicit” would most likely be different, the escape sequences would definitely be different, but I’m certain that the rest of this recommendation would look the same.