BiDi is display-only
Terminal emulators keep an in-memory representation of their contents. ECMA TR/53 calls this the “data layer”; I tend to call it the “model”. All incoming data (characters, escape sequences) operates on this layer. My BiDi proposal makes only a few necessary additions and tiny corrections to this layer, and leaves the emulation behavior essentially unchanged and backwards compatible.
Then every once in a while (at a rate suited to the human eye’s perception) terminal emulators update their display. (Updating the display after every change in the model would be unbearably expensive, and BiDi would probably make it significantly worse.)
Traditionally, without BiDi support, the in-memory cells are simply displayed from left to right. This is what we’re about to change. With BiDi support, the cells can be shuffled in various ways for display purposes, and some might also get mirrored. TR/53 calls this operation the “presentation process”; I tend to call it “transformation”, “mapping” or “shuffling” (the last one being loose wording: shuffling is the key component, but there’s more to the story, such as mirroring certain glyphs).
TR/53 calls the result of this transformation the “presentation layer”; I’ll often call it the “view”, although a better definition follows shortly.
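To make the model/view split concrete, here is a minimal sketch in Python. It is not the Unicode BiDi algorithm: the permutation and the set of mirrored columns are simply handed in, and the names `present_line`, `MIRROR`, `visual_to_logical` and `mirrored` are made up for this illustration.

```python
# Hypothetical, tiny mirroring table for illustration only; a real
# implementation would rely on Unicode's mirroring data instead.
MIRROR = {"(": ")", ")": "(", "[": "]", "]": "[", "<": ">", ">": "<"}


def present_line(model_cells, visual_to_logical, mirrored):
    """Return the cells of one line in visual (on-screen) order.

    model_cells:       characters in logical (in-memory) order
    visual_to_logical: visual_to_logical[v] is the model column shown
                       at visual column v (assumed to be a permutation)
    mirrored:          set of model columns whose glyph gets mirrored
    """
    view = []
    for logical in visual_to_logical:
        ch = model_cells[logical]
        if logical in mirrored:
            ch = MIRROR.get(ch, ch)
        view.append(ch)
    return view


# Toy example: the whole line is reversed and both parentheses are mirrored.
model = list("ab(c)")
visual_to_logical = list(reversed(range(len(model))))
view = present_line(model, visual_to_logical, mirrored={2, 4})
print("".join(view))
```

The model list is never modified; only a separate list is produced for display, which is the sense in which the transformation is display-only.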
Note that I’m not making any distinction between the “character view”, consisting of characters (Unicode codepoints) along with attributes (such as the mirroring property) after the BiDi algorithm has laid out the model’s cells in their visual order, and the “pixel view”, the rendered canvas with beautiful glyphs visible to the user. Apparently neither does TR/53 for its “presentation layer”. I’m sure it won’t cause any confusion. It’s up to terminal emulators to come up with internal terminology for these two if they wish to.
The result of the transformation is sufficient for displaying the contents to the user. Some meta operations, however, such as selecting with the mouse (for copy-pasting), might need to look back at the model and the transformation: in our forthcoming implicit mode they operate on the model (logical order) rather than on the view (visual order). As such, it’s probably unfortunate to think of the view as just the result of the model undergoing some transformation. It’s probably better to redefine the view as the model plus the transformation, that is, as automatically having access not just to the result, but also to everything contained in the model and everything done by the transformation.
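A view defined as “model plus transformation” could be sketched as follows. This is only an illustration under assumptions: the class `LineView`, its fields, and the particular copy behavior (collecting the selected cells and returning them in logical order) are hypothetical and are not prescribed by this text.

```python
from dataclasses import dataclass


@dataclass
class LineView:
    """One line of the view: the model cells plus the mapping applied to them."""
    model: list              # cells in logical (in-memory) order
    visual_to_logical: list  # visual_to_logical[v] = model column shown at v

    def displayed(self):
        # The plain result of the transformation, enough for drawing.
        return [self.model[col] for col in self.visual_to_logical]

    def copy_text(self, visual_start, visual_end):
        # Assumed behavior: take the cells shown in visual columns
        # [visual_start, visual_end) and return them in logical order.
        columns = sorted(self.visual_to_logical[visual_start:visual_end])
        return "".join(self.model[col] for col in columns)


# Toy mapping: the last three cells are displayed in reverse order.
line = LineView(model=list("abcDEF"), visual_to_logical=[0, 1, 2, 5, 4, 3])
print("".join(line.displayed()))
print(line.copy_text(2, 5))
```

Because the object keeps both the model and the mapping, the copy operation can answer in logical order even though the user pointed at visual positions.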
Various pieces of existing documentation about terminals often use the words “left” or “right”. To be pedantic, from now on they should be interpreted as “towards preceding columns” and “towards subsequent columns” since the in-memory model doesn’t have intrinsic left or right directions; or if you prefer to still think of in-memory order being represented from left to right, it should be emphasized that these “left” and “right” words of the specifications from now on operate on the model, and actually may end up being displayed differently.
On conformance level 1, my recommendation essentially leaves the model, i.e. the emulation logic unmodified. The only tiny changes are that the concept of paragraph needs to be introduced (in the unlikely case that it isn’t already), in some new cases a paragraph needs to be broken into two, and a few new bits need to be tracked and remembered for each paragraph.
On to-be-designed conformance level 2, some new emulation requirements will be introduced to track BiDi control characters (similarly to combining accents) or other additional information, but this still leaves the emulation behavior backward compatible.
The bulk of the BiDi feature concerns how the model is transformed to the view. The model’s cells are no longer necessarily displayed from left to right; instead, we define several possible modes.
Each paragraph is in one of these modes. As mentioned in the generic BiDi introduction, the cells are always shuffled within a single line (and some might even get mirrored), although how they are shuffled (and mirrored) might depend on the contents of the entire paragraph.
In all the modes, whenever a mouse event is reported back to an app, it undergoes the inverse transformation (i.e. the reported column is the model’s column corresponding to the mouse position).
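The inverse mapping for mouse reports can be sketched like this, assuming the transformation of a line is stored as a list `logical_to_visual` (a hypothetical name, as are the two functions) that is a permutation of the column indices.

```python
def invert(logical_to_visual):
    """Build the visual-to-logical mapping from the logical-to-visual one."""
    visual_to_logical = [0] * len(logical_to_visual)
    for logical, visual in enumerate(logical_to_visual):
        visual_to_logical[visual] = logical
    return visual_to_logical


def mouse_column_to_report(logical_to_visual, clicked_visual_column):
    """Return the model column to report for a click at a visual column."""
    return invert(logical_to_visual)[clicked_visual_column]


# Toy mapping: model column 0 is shown at visual column 2,
# column 1 at visual column 0, column 2 at 1, column 3 stays at 3.
logical_to_visual = [2, 0, 1, 3]
print(invert(logical_to_visual))
print(mouse_column_to_report(logical_to_visual, 0))
```

A real emulator would presumably cache the inverted mapping per line rather than recompute it on every event; the recomputation here just keeps the sketch short.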