Hacking Erlang shell to support Alt+Left/Right cursor movement
My console setup
I spend a lot of time in the console. It's still a very productive environment, even though the need to preserve the compatibility with a character-oriented TeleTYpe[1] of the '60s prevents any radical improvements. It even works well out of the box until, at one point, it doesn't. This is a story of one such case.
I use URxvt[2] (terminal emulator), TMUX[3] (Terminal MUltipleXer), and ZSH[4] (shell) as my console. I switched to URxvt for the Unicode support, which wasn't common at the time. A relatively small number of display and input issues with commonly used applications was a nice bonus.
The only problem I had with it is that pressing Alt+Left/Right inserts C or D characters instead of jumping over a word. Apparently, URxvt sends a sequence of key codes which is not, by default, mapped to anything in most programs. There's a very simple fix for it, at least for programs that use the Readline[5] library. Readline is configurable, and you can define your own bindings in the ~/.inputrc file (see the docs); this is what I use, for example:
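A minimal sketch of such bindings, assuming the terminal sends \e[1;3D for Alt+Left and \e[1;3C for Alt+Right (the sequences discussed later in this post); it is worth checking what your own terminal sends before copying it:

```text
# ~/.inputrc
# Assumed sequences: Alt+Left = ESC [ 1 ; 3 D, Alt+Right = ESC [ 1 ; 3 C
"\e[1;3D": backward-word
"\e[1;3C": forward-word
```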
With just two lines, line editing becomes mostly a solved (for me) problem. Mostly. Though few and far between, some programs don't use Readline or emulate it half-heartedly (thereby ignoring the config file). One such program is the Erlang (and, by extension, Elixir) shell.
BTW: Yes, I do know about M-b and M-f. I use Emacs. I just really like my arrow keys, ok?
The journey begins
One Saturday evening, after a decade of putting up with it, I decided to try fixing the problem. I'm honestly not sure what I was thinking, I must've been bored out of my mind. I mean, honestly, who'd want to dig through entirely undocumented, complex internals of a multi-process beast with job control and remote capabilities...? (No, please don't answer, thanks.)
When I resolved to go down the rabbit hole, I searched the net for docs or posts explaining how to customize anything more than the prompt in the shell. It took quite a few tries to get anything at all, but I finally found two (just two) relevant posts. One is a post from 2013 (by the author of "Learn You Some Erlang"[6], about the shell's architecture) and the other is a StackOverflow post answering the question of how to add Readline features to Elixir's IO.gets and friends.
The TL;DR from these two is basically "go look at edlin.erl". On a diagram (borrowed from the first post) of the processes (and modules) the shell consists of, it's the part marked in red:
edlin.erl is part of the stdlib application and can be found in ./lib/stdlib/src/edlin.erl, starting from the root of the Erlang build directory. You can see the whole file here.
I made a fresh clone of the Erlang/OTP sources to avoid breaking my
system-wide installation in case something went wrong. Compilation took some
time, but it finished without problems.
Inside edlin.erl there's a state machine[7] used for parsing the incoming key codes into atoms denoting actions. It's a common Erlang pattern, where Tail Call Elimination is exploited to encode state transitions as normal function calls. It looks roughly like this:
With the key_map function defined like this (out of order excerpts):
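The pattern can be sketched in a self-contained form as follows. This is a simplified illustration rather than the actual OTP source: the module name edit_sketch, the parse/1 entry point, the two-argument edit/2, and the returned tuples are all made up for the example, and only a handful of transitions are included.

```erlang
-module(edit_sketch).
-export([parse/1]).

%% Hypothetical entry point: start parsing in the initial state.
parse(Codes) ->
    edit(Codes, none).

%% The parser state is just an argument; every transition is a tail call.
edit([C | Cs], Prefix) ->
    case key_map(C, Prefix) of
        meta                 -> edit(Cs, meta);
        meta_left_sq_bracket -> edit(Cs, meta_left_sq_bracket);
        {csi, _} = Csi       -> edit(Cs, Csi);
        {undefined, C}       -> {undefined, Prefix, C, Cs};
        Op                   -> {op, Op, Cs}
    end;
edit([], Prefix) ->
    %% Ran out of codes in the middle of a sequence.
    {more_chars, Prefix}.

%% key_map(Code, CurrentState) -> NextState | Action
key_map($\e, none)                -> meta;
key_map($[, meta)                 -> meta_left_sq_bracket;
key_map($1, meta_left_sq_bracket) -> {csi, "1"};
key_map($;, {csi, "1"})           -> {csi, "1;"};
key_map($5, {csi, "1;"})          -> {csi, "1;5"};
key_map($C, {csi, "1;5"})         -> forward_word;
key_map($D, {csi, "1;5"})         -> backward_word;
key_map(C, _)                     -> {undefined, C}.
```

Calling edit_sketch:parse("\e[1;5C") walks through the states one code at a time and ends with the forward_word action, while a sequence with no matching clause falls through to the catch-all and is reported as undefined.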
To be perfectly honest, the fact that it's a state machine wasn't obvious to me at first. My Erlang is a bit rusty, and one-letter identifiers don't make it the most readable code ever. It's also not trivial to see what sequence of codes will actually be sent to the edit function. I had to dig deeper.
Debugging the shell
First of all, this may be obvious, but in this case tried-and-true debugging with prints doesn't work. From within edlin you get no direct access to the terminal, which makes sense, given that it is itself a part of the terminal handling code. This tripped me up a bit in the beginning.
Fortunately, Erlang has an excellent graphical debugger, which you can attach to any process. To actually make use of it, you first need to reload the module you want to debug with its instrumented, interpreted version. This is done with the int (or :int in Elixir) module[8]. Unfortunately, when I tried, it didn't work:
Apparently, the Erlang code server has a list of "sticky dirs" - modules living in them are not to be reloaded. Makes sense, most of the time. There has to be a way of disabling it though, right? Yes, there is - you can disable it globally with the -nostick flag, or per directory or module, like this:
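As a sketch of the per-module and per-directory variants, assuming they are typed into a running Erlang shell and that edlin is the module in question (code:unstick_mod/1, code:unstick_dir/1 and code:which/1 are standard functions of the code module):

```erlang
%% Allow reloading of a single module that lives in a sticky directory.
code:unstick_mod(edlin).

%% Or lift the protection from the whole directory edlin was loaded from.
code:unstick_dir(filename:dirname(code:which(edlin))).
```

The per-module form is the narrower of the two, so it is probably the safer choice when only one module needs to be replaced.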
Unfortunately, that's still not enough. Apparently, to be interpretable, a module has to be compiled in a special way to include some debug data. If it isn't, you will get the following error:
You can do this from the shell:
But then you have to remember to put the compiled file in the correct ebin directory yourself. Alternatively, you can pass +debug_info to the erlc invocation (as you can see in its help message, +term "passes the term unchanged to the compiler"):
Now you should be able to unstick and instrument the module, and then start the debugger in one go:
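One way this could look when typed into an Erlang shell, assuming edlin has already been recompiled with debug_info and its source can be found next to the compiled file (a sketch, not necessarily the exact command used for this post):

```erlang
%% Lift the sticky-directory protection, load the interpreted
%% (instrumented) version of the module, then open the debugger window.
code:unstick_mod(edlin),
int:i(edlin),
debugger:start().
```

The debugger is a graphical application, so this needs an Erlang installation built with wx support and a display to run on.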
Working with the debugger
In the newly opened window, click on the Module menu and select edlin -> View. Then scroll down to the line you want to break on and double-click it (anywhere on the line). It looks like this on my computer:
Now, when you switch back to the terminal and press a key you're interested in... nothing will happen! Instead, the process will reach the breakpoint and stop. This is indicated by the break value showing up in the status column in the main window:
To actually start debugging, you need to use the Process -> Attach menu item with the edlin process selected. It will open a window with the code, a list of local variables with their values, and buttons for stepping over and into calls. Just remember that for the debugger to work, the module you want to debug has to be instrumented. If you try to step outside of the edlin module you won't see anything.
This is what the debugger looks like in action:
Getting back to the code
After stepping through the execution of the edit/5 function a couple of times I was able to guess a few things. Here's the function head again:
- The first argument is a list of keycodes (as integers, which also happens to be how Erlang encodes strings, which helps with the preview of values). C is the current code, while Cs contains the rest of the codes to be parsed. This argument is the first part of a state machine and represents state transitions.
- The second argument is a prompt, as a string. It's not used much and can be ignored in this case.
- The third argument is a pair of strings. They are the result of splitting the current line at the cursor position: Bef keeps the part on the left of the cursor, and Aft the part from the cursor to the end of the line. These change when inserting or deleting characters, but in this case they stay constant, so the argument can be ignored.
- The fourth argument, Prefix, is an atom (or a tuple of an atom and a string, as we'll see in a moment) which says what state the parser is currently in. This may be none - a starting state; meta - after a modifier key was pressed; meta_meta - if we found two escape characters in a row - and quite a few other values. This is the second part of the state machine.
- The last argument is, I think, a list of low-level commands (called "requests") for the TTY driver to add or remove characters, move the cursor, blink, and so on. Since I don't need to add any new functionality here, it is also safe to ignore for now.
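To make the shapes of these five arguments concrete, here is a small sketch that destructures them the same way and labels each part. The module edit_args_sketch, the function describe/5 and the map keys are invented for this illustration; only the variable names follow the description above.

```erlang
-module(edit_args_sketch).
-export([describe/5]).

%% Pattern-match the arguments as described in the list above
%% and return them under descriptive (made-up) names.
describe([C | Cs], Prompt, {Bef, Aft}, Prefix, Rs) ->
    #{current_code     => C,       % integer key code being processed
      remaining_codes  => Cs,      % codes still waiting to be parsed
      prompt           => Prompt,  % prompt string
      left_of_cursor   => Bef,     % line contents before the cursor
      from_cursor      => Aft,     % line contents from the cursor onwards
      parser_state     => Prefix,  % none, meta, {csi, "1;"}, ...
      pending_requests => Rs}.     % requests collected for the driver
```

For example, edit_args_sketch:describe("\e[1;3D", "1> ", {"foo ", "bar"}, none, []) returns a map in which current_code is 27 and remaining_codes is "[1;3D".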
The key_map function takes the next key code and the current state. It then returns the next state. The edit function interprets the new state and either loops to parse the rest of the codes list, or returns a list of commands for the driver to execute.
Recall my .inputrc: the terminal should send the following sequence of key codes when Alt+Left is pressed (BTW: you can use Control+V[9] in the shell to quickly check what keycodes are sent):
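Assuming the sequence in question is \e[1;3D (the variant with 3 discussed in the rest of this post), the Erlang shell can show what it looks like as a list of integers, since a string literal is just a list of character codes:

```erlang
%% Both comparisons evaluate to true.
"\e[1;3D" =:= [27, $[, $1, $;, $3, $D].
"\e[1;3D" =:= [27, 91, 49, 59, 51, 68].
```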
Looking at the values of the C and Cs variables in the debugger proves that, indeed, this is what edlin receives. For the record: the numeric value of \e is 27, which you can see in the screenshot. Here is the sequence of key_map function calls (in other words, states of the parser) when given this sequence of codes:
The first character - \e - puts the parser in the meta state, the next - [ - in the (aptly named) meta_left_sq_bracket state. csi stands for "Control Sequence Introducer", a special code which causes the terminal to interpret the following codes instead of passing them to a program; the csi state collects the key codes starting after \e[. Then finally, if all the codes in between match, we get to the forward_word and backward_word states, which are passed to the do_op function in the last case of edit.
Once there are no more codes to parse, edit returns a list of rendering commands, tagged with either blink (self explanatory), more_chars (Enter was not pressed), or done (along with the full text of the line).
The problem and the fix
As you can see from the above code, edlin recognizes \e[1;5C and \e[1;5D as valid sequences of key codes, while my terminal sends codes with 3 instead of 5.
To fix this, the only thing needed is to add three new clauses to the key_map function, like this:
First, we make encountering $3 while in the {csi, "1;"} state valid, and make it transition to a csi tuple (with 3 appended) as the next state. After that, we only need to handle the $C and $D characters when in the {csi, "1;3"} state by returning the forward_word and backward_word atoms. That's it!
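The three clauses just described can be tried out in isolation with the sketch below. The module keymap_fix_sketch and the trace/1 helper are made up for the example, and the pre-existing transitions are a simplified subset written from the description in this post, not a copy of the OTP source.

```erlang
-module(keymap_fix_sketch).
-export([trace/1]).

%% Hypothetical helper: feed the codes through key_map/2 one by one
%% and return every state visited, in order.
trace(Codes) ->
    {_Last, Visited} =
        lists:foldl(fun(C, {State, Acc}) ->
                            Next = key_map(C, State),
                            {Next, [Next | Acc]}
                    end, {none, []}, Codes),
    lists:reverse(Visited).

%% Transitions assumed to be there already (simplified subset).
key_map($\e, none)                -> meta;
key_map($[, meta)                 -> meta_left_sq_bracket;
key_map($1, meta_left_sq_bracket) -> {csi, "1"};
key_map($;, {csi, "1"})           -> {csi, "1;"};
key_map($5, {csi, "1;"})          -> {csi, "1;5"};
key_map($C, {csi, "1;5"})         -> forward_word;
key_map($D, {csi, "1;5"})         -> backward_word;
%% The three added clauses.
key_map($3, {csi, "1;"})          -> {csi, "1;3"};
key_map($C, {csi, "1;3"})         -> forward_word;
key_map($D, {csi, "1;3"})         -> backward_word;
%% Anything else is not a recognized transition.
key_map(C, _)                     -> {undefined, C}.
```

With these clauses in place, keymap_fix_sketch:trace("\e[1;3D") passes through meta, meta_left_sq_bracket and the three csi states before ending in backward_word, and the sequences with 5 keep working as before.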
Turns out that yes, the three lines above are all that it took to scratch my decade-old itch. As usual on such occasions, the solution is much less interesting than the path to it... Well, being able to use arrow keys is still nice, at least.
Afterword
Despite some early setbacks, the fix turned out quite simple. When I realized that there's a finite state machine in the code, the rest was relatively straightforward. Erlang encoding of a state machine using just function calls and pattern-matching on states (codes so far) and transitions (current key code) is elegant and lightweight. It's quite similar to how one would write a state machine in Prolog, especially given the Prolog-based syntax of Erlang.
I used the word "parse" a few times in the post. This isn't an accident: parsers can often be represented by a state machine. The edit function is a stateful parser as well as a state machine. It parses a grammar consisting of (among others) \e, [, 1, ;, 3 terminal tokens, with state atoms being the production rules.
It's correct to say that the code illustrated here either "interprets" (if you focus on the state machine being run), or "parses" (if you focus on the allowed sequences of states) the input codes list.
That's it, I hope you enjoyed the post as much as I enjoy my arrow keys working properly in the Erlang shell :-)
1. Yes, that's what the TTY means.
2. https://github.com/exg/rxvt-unicode
3. https://github.com/tmux/tmux
4. https://www.zsh.org/
5. https://en.wikipedia.org/wiki/GNU_Readline
6. https://learnyousomeerlang.com/
7. Also called a Finite State Automaton.
8. http://erlang.org/doc/man/int.html
9. It invokes the quoted-insert function in Readline.