Better poly than sorry!

Hacking Erlang shell to support Alt+Left/Right cursor movement

My console setup

I spend a lot of time in the console. It's still a very productive environment, even though the need to stay compatible with the character-oriented TeleTYpe[1] of the '60s prevents any radical improvements. It even works well out of the box until, at some point, it doesn't. This is a story of one such case.

I use URxvt[2] (terminal emulator), TMUX[3] (Terminal MUltipleXer), and ZSH[4] (shell) as my console. I switched to URxvt for its Unicode support, which wasn't common at the time. Having relatively few display and input issues with commonly used applications was a nice bonus.

The only problem I had with it is that pressing Alt+Left/Right inserts C or D characters instead of jumping over a word. Apparently, URxvt sends a sequence of key codes which is not, by default, mapped to anything in most programs. There's a very simple fix for it, at least for programs that use the Readline[5] library. Readline is configurable, and you can define your own bindings in the ~/.inputrc file (see the docs); this is what I use, for example:

  "\e[1;3D": backward-word ### Alt+Left
  "\e[1;3C": forward-word  ### Alt+Right

With just two lines, line editing becomes mostly a solved (for me) problem. Mostly. Though few and far between, some programs don't use Readline or emulate it half-heartedly (thereby ignoring the config file). One such program is the Erlang (and, by extension, Elixir) shell.

BTW: Yes, I do know about M-b and M-f. I use Emacs. I just really like my arrow keys, ok?

The journey begins

One Saturday evening, after a decade of putting up with it, I decided to try fixing the problem. I'm honestly not sure what I was thinking, I must've been bored out of my mind. I mean, honestly, who'd want to dig through entirely undocumented, complex internals of a multi-process beast with job control and remote capabilities...? (No, please don't answer, thanks.)

Once I resolved to go down the rabbit hole, I searched the net for docs or posts explaining how to customize anything more than the prompt in the shell. It took quite a few tries to get anything at all, but I finally found two (just two) relevant posts. One is a post from 2013 (by the author of "Learn You Some Erlang"[6], about the shell's architecture) and the other a Stack Overflow post answering the question of how to add Readline features to Elixir's IO.gets and friends.

The TL;DR from these two is basically "go look at edlin.erl". On the diagram of the processes (and modules) that make up the shell, borrowed from the first post, it's the part marked in red:

edlin.erl is part of the stdlib application and can be found in ./lib/stdlib/src/edlin.erl, relative to the root of the Erlang build directory. You can see the whole file here. I made a fresh clone of the Erlang/OTP sources to avoid breaking my system-wide installation in case something went wrong. Compilation took some time, but it finished without problems.

Inside edlin.erl there's a state machine[7] used for parsing the incoming key codes into atoms denoting actions. It's a common Erlang pattern, where Tail Call Elimination is exploited to encode state transitions as normal function calls. It looks roughly like this:

edit([C|Cs], P, {Bef,Aft}, Prefix, Rs0) ->
    case key_map(C, Prefix) of
        meta ->
            edit(Cs, P, {Bef,Aft}, meta, Rs0);
        meta_o ->
            edit(Cs, P, {Bef,Aft}, meta_o, Rs0);
        meta_csi ->
            edit(Cs, P, {Bef,Aft}, meta_csi, Rs0);
        meta_meta ->
            edit(Cs, P, {Bef,Aft}, meta_meta, Rs0);
        {csi, _} = Csi ->
            edit(Cs, P, {Bef,Aft}, Csi, Rs0);
        meta_left_sq_bracket ->
            % ... body elided ...
        % ... more cases ...
        {undefined,C} ->
            % ... body elided ...
        Op ->
            case do_op(Op, Bef, Aft, Rs0) of
                {blink,N,Line,Rs} ->
                    edit(Cs, P, Line, {blink,N}, Rs);
                {Line, Rs, Mode} -> % allow custom modes from do_op
                    edit(Cs, P, Line, Mode, Rs);
                {Line,Rs} ->
                    edit(Cs, P, Line, none, Rs)
            end
    end;

With the key_map function defined like this (out of order excerpts):

key_map($\^A, none) -> beginning_of_line;
key_map($\^B, none) -> backward_char;
key_map($\^D, none) -> forward_delete_char;
% ... more clauses ...
key_map($B, meta) -> backward_word;
key_map($D, meta) -> kill_word;
key_map($F, meta) -> forward_word;
key_map($T, meta) -> transpose_word;
% ... even more clauses ...
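
To make the pattern easier to see in isolation, here is a minimal, self-contained sketch of the same idea. It is not edlin's code: the module name keys_sketch, the ops/1 function and the catch-all clauses are made up for illustration, and only a handful of key_map clauses are included.

```erlang
-module(keys_sketch).
-export([ops/1]).

%% ops(KeyCodes) -> [Action]
%% Walks the list of key codes, carrying the parser state (Prefix) as an
%% argument. Every state transition is a tail call, so the loop runs in
%% constant stack space.
ops(Cs) -> ops(Cs, none, []).

ops([], _Prefix, Acc) ->
    lists:reverse(Acc);
ops([C|Cs], Prefix, Acc) ->
    case key_map(C, Prefix) of
        meta -> ops(Cs, meta, Acc);        % prefix seen, keep reading
        Op   -> ops(Cs, none, [Op|Acc])    % action found, reset the state
    end.

%% A few clauses in the style of the excerpts above.
key_map($\e, none) -> meta;
key_map($\^A, none) -> beginning_of_line;
key_map($\^B, none) -> backward_char;
key_map($B, meta) -> backward_word;
key_map($F, meta) -> forward_word;
%% The two clauses below are assumptions of this sketch.
key_map(C, none) when C >= $\s -> {insert, C};
key_map(C, _) -> {undefined, C}.
```

Calling keys_sketch:ops("\eB") walks through none -> meta -> backward_word and returns a one-element list with that action. The real edit/5 does the same kind of walk, but also threads the line contents and the driver requests through every call.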

To be perfectly honest, the fact that it's a state machine wasn't obvious to me at first. My Erlang is a bit rusty, and one-letter identifiers don't make it the most readable code ever. It's also not trivial to see what sequence of codes will actually be sent to the edit function. I had to dig deeper.

Debugging the shell

First of all, this may be obvious, but in this case tried-and-true debugging with prints doesn't work. From within edlin you get no direct access to the terminal, which makes sense, given that it is itself a part of the terminal handling code. This tripped me up a bit in the beginning.

Fortunately, Erlang has an excellent graphical debugger, which you can attach to any process. To actually make use of it, you first need to reload the module you want to debug with its instrumented, interpreted version. This is done with the int (or :int in Elixir) module[8]. Unfortunately, when I tried, it didn't work:

  -▶ ./bin/erl
  Erlang/OTP 24 [RELEASE CANDIDATE 1] [erts-11.2] [source-444144870c] [64-bit]

  Eshell V11.2  (abort with ^G)
  1> int:ni(edlin).
  =ERROR REPORT==== 29-Mar-2021::20:37:21.516194 ===
  Can't load module 'edlin' that resides in sticky dir

  ** exception error: no match of right hand side value {error,sticky_directory}
       in function  int:'-load/2-fun-0-'/3 (int.erl, line 531)
       in call from int:load/2 (int.erl, line 527)

Apparently, the Erlang code server has a list of "sticky dirs" - modules living in them are not to be reloaded. Makes sense, most of the time. There has to be a way of disabling it though, right? Yes, there is - you can disable it globally with the -nostick flag, or per directory or module, like this:

  2> code:unstick_mod(edlin).

Unfortunately, that's still not enough. Apparently, to be interpretable, a module has to be compiled in a special way to include some debug data. If it isn't, you will get the following error:

  1> int:ni(edlin).
  ** Invalid beam file or no abstract code: edlin

You can do this from the shell:

  2> compile:file("./lib/stdlib/src/edlin.erl", [debug_info]).

But then you have to remember to put the compiled file in the correct ebin directory yourself. Alternatively, you can pass +debug_info to the erlc invocation (as you can see in its help message, +term "passes the term unchanged to the compiler"):

  -▶ ./bin/erlc +debug_info -o ./lib/stdlib/ebin/ ./lib/stdlib/src/edlin.erl
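
To confirm that a rebuilt .beam really carries the debug data before retrying, one option is to ask beam_lib for the abstract_code chunk. This is only a sketch: the module and function names are made up for illustration, and it checks this one requirement, not everything int might need.

```erlang
-module(debug_check).
-export([has_abstract_code/1]).

%% true if the .beam file the code server would load for Mod contains
%% abstract code (i.e. it was compiled with debug_info), false otherwise.
has_abstract_code(Mod) when is_atom(Mod) ->
    case code:which(Mod) of
        Path when is_list(Path) ->
            case beam_lib:chunks(Path, [abstract_code]) of
                {ok, {Mod, [{abstract_code, {_Version, _Forms}}]}} ->
                    true;
                _ ->
                    false   % no_abstract_code, or the file is unreadable
            end;
        _ ->
            false           % preloaded, cover_compiled or non_existing
    end.
```

With the module compiled, debug_check:has_abstract_code(edlin) should return true once the freshly built edlin.beam is the one on the code path.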

Now you should be able to unstick and instrument the module, and then start the debugger in one go:

  3> code:unstick_mod(edlin), int:ni(edlin), debugger:start().

Working with the debugger

In the newly opened window, open the Module menu and select edlin -> View. Then scroll down to the line you want to break on and double-click it (anywhere on the line). It looks like this on my computer:

Now, when you switch back to the terminal and press a key you're interested in... nothing will happen! Instead, the process will reach the breakpoint and will stop. This is indicated by the break value showing up in the status column in the main window:

To actually start debugging, you need to use the Process -> Attach menu item with the edlin process selected. It will open a window with the code, a list of local variables with their values, and buttons for stepping over and into calls. Just remember that for the debugger to work, the module you want to debug has to be instrumented. If you try to step outside of the edlin module you won't see anything.

This is what the debugger looks like in action:

Getting back to the code

After stepping through the execution of the edit/5 function a couple of times I was able to guess a few things. Here's the function head again:

edit([C|Cs], P, {Bef,Aft}, Prefix, Rs0) ->
  • The first argument is a list of keycodes (as integers, which also happens to be how Erlang encodes strings, which helps with the preview of values). C is the current code, while Cs contains the rest of the codes to be parsed. This argument is the first part of a state machine and represents state transitions.
  • The second argument is a prompt, as a string. It's not used much and can be ignored in this case.
  • The third argument is a pair of strings. They are the result of splitting the current line at cursor position: Bef keeps the part on the left of the cursor, and Aft the part from the cursor to the end of line. These change when inserting or deleting characters, but in this case they stay constant, so the argument can be ignored.
  • The fourth argument, Prefix, is an atom (or a tuple of an atom and a string, as we'll see in a moment) which says what state the parser is currently in. This may be none - the starting state; meta - after a modifier key was pressed; meta_meta - if we found two escape characters in a row - and quite a few other values. This is the second part of the state machine.
  • The last argument is, I think, a list of low-level commands (called "requests") for the TTY driver to add or remove characters, move the cursor, blink, and so on. Since I don't need to add any new functionality here, it is also safe to ignore for now.
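
The {Bef,Aft} pair from the third argument can be sketched as a tiny standalone module. The names below are made up for illustration. The sketch keeps Bef in reverse order so that every edit touches only list heads; I believe edlin stores it the same way, but treat that as an assumption.

```erlang
-module(line_sketch).
-export([new/1, insert/2, backward_char/1, forward_char/1, to_string/1]).

%% A line is {Bef, Aft}: Bef holds the characters left of the cursor in
%% REVERSE order (nearest character first), Aft the characters from the
%% cursor to the end of the line.

%% Start with the cursor at the end of the given text.
new(Text) -> {lists:reverse(Text), []}.

%% Insert a character at the cursor position.
insert(C, {Bef, Aft}) -> {[C|Bef], Aft}.

%% Move the cursor one character to the left.
backward_char({[C|Bef], Aft}) -> {Bef, [C|Aft]};
backward_char({[], _} = Line) -> Line.     % already at the start

%% Move the cursor one character to the right.
forward_char({Bef, [C|Aft]}) -> {[C|Bef], Aft};
forward_char({_, []} = Line) -> Line.      % already at the end

%% Glue both halves back into the full text of the line.
to_string({Bef, Aft}) -> lists:reverse(Bef, Aft).
```

For example, starting from line_sketch:new("helo"), one backward_char followed by insert($l, Line) gives a line whose to_string/1 is "hello". Cursor movement never copies the whole line, which is the point of splitting it at the cursor.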

The key_map function takes the next key code and the current state. It then returns the next state. The edit function interprets the new state and either loops to parse the rest of the codes list, or returns a list of commands for the driver to execute.

Recall my .inputrc: the terminal should send the following sequence of key codes when Alt+Left is pressed (BTW: you can use Control+V[9] in the shell to quickly check what key codes are sent):

  \e[1;3D

Looking at the values of the C and Cs variables in the debugger proves that, indeed, this is what edlin receives. For the record: the numeric value of \e is 27, which you can see in the screenshot. Here are the key_map clauses (in other words, the states of the parser) that handle sequences of this shape:

key_map($\e, none) -> meta;
key_map($[, meta) -> meta_left_sq_bracket;
key_map($1, meta_left_sq_bracket) -> {csi, "1"};
key_map($;, {csi, "1"}) -> {csi, "1;"};
key_map($5, {csi, "1;"}) -> {csi, "1;5"};
key_map($C, {csi, "1;5"}) -> forward_word;
key_map($D, {csi, "1;5"}) -> backward_word;

The first character - \e - puts the parser in the meta state, the next - [ - in (aptly named) meta_left_sq_bracket. The csi atom most likely stands for Control Sequence Introducer, the standard name for the \e[ prefix; the csi tuple collects the key codes that follow \e[. Finally, if all the codes in between match, we get to the forward_word and backward_word atoms, which are passed to the do_op function in the last case of edit.

Once there are no more codes to parse, edit returns a list of rendering commands, tagged with either blink (self-explanatory), more_chars (Enter was not pressed), or done (along with the full text of the line).

The problem and the fix

As you can see from the above code, edlin recognizes \e[1;5C and \e[1;5D as valid sequences of key codes, while my terminal sends codes with 3 instead of 5.

To fix this, the only thing needed is to add three new clauses to the key_map function, like this:

   key_map($5, {csi, "1;"}) -> {csi, "1;5"};
+  key_map($3, {csi, "1;"}) -> {csi, "1;3"};
   key_map($~, {csi, "3"}) -> forward_delete_char;
   key_map($C, {csi, "5"}) -> forward_word;
   key_map($C, {csi, "1;5"}) -> forward_word;
+  key_map($C, {csi, "1;3"}) -> forward_word;
   key_map($D, {csi, "5"})  -> backward_word;
   key_map($D, {csi, "1;5"}) -> backward_word;
+  key_map($D, {csi, "1;3"}) -> backward_word;
   key_map($;, {csi, "1"}) -> {csi, "1;"};

First, we make encountering $3 while in the {csi, "1;"} state valid, and make it transition to a csi tuple (with 3 appended) as the next state. After that, we only need to handle the $C and $D characters when in the {csi, "1;3"} state by returning the forward_word and backward_word atoms. That's it!
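
The path through the states can be replayed outside of the shell with a small tracer. This is a sketch, not edlin itself: the csi_trace module, the trace/1 function and the catch-all clause are made up for illustration, and only the key_map clauses discussed in this post are included.

```erlang
-module(csi_trace).
-export([trace/1]).

%% trace(KeyCodes) -> [StateOrAction]
%% Feeds the codes to key_map/2 one by one and records every value it
%% returns, so the whole path through the state machine is visible.
%% Stops at the first code that has no matching clause.
trace(Cs) -> trace(Cs, none, []).

trace([], _State, Acc) ->
    lists:reverse(Acc);
trace([C|Cs], State, Acc) ->
    case key_map(C, State) of
        {undefined, _} = U -> lists:reverse([U|Acc]);
        Next -> trace(Cs, Next, [Next|Acc])
    end.

key_map($\e, none) -> meta;
key_map($[, meta) -> meta_left_sq_bracket;
key_map($1, meta_left_sq_bracket) -> {csi, "1"};
key_map($;, {csi, "1"}) -> {csi, "1;"};
key_map($5, {csi, "1;"}) -> {csi, "1;5"};
key_map($3, {csi, "1;"}) -> {csi, "1;3"};     % new
key_map($C, {csi, "1;5"}) -> forward_word;
key_map($C, {csi, "1;3"}) -> forward_word;    % new
key_map($D, {csi, "1;5"}) -> backward_word;
key_map($D, {csi, "1;3"}) -> backward_word;   % new
%% Catch-all clause: an assumption of this sketch.
key_map(C, _) -> {undefined, C}.
```

With the three new clauses in place, csi_trace:trace("\e[1;3D") ends in backward_word. If you comment them out, the same call stops at {undefined,$3} right after {csi,"1;"}, which, in this sketch at least, mirrors the point where my terminal's sequence used to fall off the map.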

Turns out that yes, the three lines above are all it took to scratch my decade-old itch. As usual on such occasions, the solution is much less interesting than the path to it... Well, being able to use arrow keys is still nice, at least.


Despite some early setbacks, the fix turned out to be quite simple. Once I realized that there's a finite state machine in the code, the rest was relatively straightforward. Erlang's encoding of a state machine using just function calls and pattern-matching on states (codes so far) and transitions (current key code) is elegant and lightweight. It's quite similar to how one would write a state machine in Prolog, especially given the Prolog-based syntax of Erlang.

I used the word "parse" a few times in the post. This isn't an accident: parsers can often be represented by a state machine. The edit function is a stateful parser as well as a state machine. It parses a grammar consisting of (among others) \e, [, 1, ;, 3 terminal tokens, with state atoms being the production rules.

It's correct to say that the code illustrated here either "interprets" (if you focus on the state machine being run), or "parses" (if you focus on the allowed sequences of states) the input codes list.

That's it, I hope you enjoyed the post as much as I enjoy my arrow keys working properly in the Erlang shell :-)