POLYGLOT PROGRAMMING

Better poly than sorry!

Weekend with Elixir ΒΆ

So I was playing with Elixir in the last couple of days. I didn't write anything significant, but I scribbled enough code, I think, to get a feel of the language. I know Erlang rather well, so I suppose most "pain points" of Elixir were non-issues for me. I already mastered things like async message passing, OTP behaviors, as well as pattern matching, recursion as iteration and the like. Thanks to this I was able to concentrate on Elixir itself, instead of learning many unfamiliar concepts.

The syntax

One difference from Erlang which is highly visible is syntax. I don't hate Erlang's syntax; it's small and clean, probably owing to its Prolog roots. However, it's not like it couldn't be improved. Elixir has completely different syntax, closer to Ruby than to anything else I know of, but it turns out it works well.

For Erlangers, some interesting points are:

Dedicated alist syntax. With strict arity enforcement it's common for functions to accept a list of two-tuples (pairs) as an options list. First element of the tuple is a key, and the second is its value. It's natural to recurse and pattern-match over such a structure, so it's used a lot. Elixir comes with a special syntax for creating these, as long as the key is an atom (which it almost always is). It looks like this:

[key: val,
key2: val2,
...]

It's just a bit of syntactic sugar, you could equivalently write:

[{:key, val},
{:key2, val},
...]

and it would look similarly in Erlang. The sugared version is easier to read thanks to reduced number of parentheses, which clutter a non-sugared version.

Elixir syntax is also very consistent, with only a couple of pitfalls you need to look out for (mainly, a comma before do in statements in alist style). Every block and every instruction (macro call or special form) follow the same rules. For example, you can use both abbreviated and normal block syntax everywhere there is a block required. The syntaxe look like this:

def fun() do
    ...
end

or you can write it like this:

def fun(), do: (...)

You may get suspicious, as I did, seeing the second form. As it turns out, it's exactly as you think it is: it's the same syntax sugar that alists use. So, you can equivalently write:

def fun(), [{:do, val}]

It's nice, because it works uniformly, for every call, as long as the alist is the last parameter of the call. No matter if it was a macro or function call. Ruby has this in the form of hashes sugar.

Other syntactic features are much less interesting. For example, in Erlang all the variables has to begin with upper-case letter, while Elixir reverses this and makes variables begin with only lower-case letters. This means that atoms need a : prefix. What's fun is that capitalized identifiers are also special in Elixir and they are read as atoms (with Elixir. prefix). Things like this are very easy to get used to. Also, in most situations it's optional to enclose function arguments in parentheses. You can call zero-arity functions by just mentioning their name. To get a reference to a function you need to "capture" it with a &[mod.]func_name/arity construct. It's also easy to get used to, and not that different from Erlang fun [mod:]func_name/arity.

The module system

Erlang includes a module system for code organization. In Erlang, a module is always a single file, and the file name must match the module name. Also, the namespace for modules is flat, which means you can have conflicts if you're not careful. Elixir does away with the first restriction entirely and works around the second with a clever hack. You can define many modules inside a single file and you can nest the modules. It looks like this:

defmodule A do
    defmodule B do
        ...
    end
end

Inside the A module you can refer to B simply, but on the outside you have to use A.B (which is still just a single atom!). Moreover, you can unnest the modules, too (order of modules doesn't matter):

defmodule A do
end

defmodule A.B do
end

So, the modules are still identified by a simple atom in a flat namespace, but they give an illusion of being namespaced with a convenient convention. By the way, it makes calling Elixir code from Erlang not too hard:

'Elixir.A.B':func()

It's not that pretty, but it works well.

The other part of the module system is an ability to export and import identifiers to the module scope. In Erlang you can do both, but it's limited, because there is no import ... as ... equivalent, like in Python. Elixir, on the other hand, provides both an import and alias macros. Import works by injecting other module function names directly to the current lexical scope (this is another difference from Erlang, where you can only import at module level) and alias lets you rename a module you'd like to use. For example:

alias HTTPoison.Response, as: Resp

makes it easy to refer to the nested module without the need to write the long name every time. It also works in the current lexical environment. There's also require form, meant for importing macros (similar to how -include is used in Erlang).

These two features make modules in Elixir cheap. You're likely to create heaps more of modules in Elixir than you would in Erlang. That's a good thing, as it makes organizing the code-base easier.

There is more to the module system, like compile-time constants and a use directive, which make it even more powerful (and sometimes a bit too magical).

The tooling

Elixir comes with Mix, a task runner similar to Grunt or Gulp in the JS/Node.js-land. Every library can extend the list of available tasks by just defining appropriate module. One of such task providers is Hex, a package manager, also built-in.

The two work together to make pulling the dependencies (locally, like npm or virtualenv do) easier than most other solutions I've seen. You can install packages from central Hex repository or directly from GitHub. No manual fiddling involved. In Erlang this is also possible, but it's not as streamlined or convenient. Starting a new project, installing dependencies and running a REPL is ridiculously easy:

$ mix new dir
...edit dir/mix.exs, deps section...
$ iex -S mix

You can do mix run instead to run the project itself without a REPL. Every time your deps change, the change is automatically picked up and your deps get synced. Also, mix automatically compiles all your source code if it changes; in Erlang I was more than once wondering why the heck my changes aren't visible in the REPL, only to realize I didn't compile and load it. Of course, in the REPL you have both compile and load commands available, along with reload, which compiles and loads a module in one go. Mix is also used for running tests, and unit-test support is built into the language with ExUnit library, by the way.

The REPL itself is... colorful. That's the first difference I noticed compared to Erlang REPL. It supports tab-completion for modules and functions, which is the same for Erlang, but it also supports a help command, which displays docstrings for various things. This is useful, especially because the Elixir devs seem to like annotating modules and functions with docstrings. Everything else works just as in Erlang, I think.

The standard library

The library is not that big, and it doesn't have to, because there's Erlang underneath it. Instead it focuses on delivering a set of convenient wrappers which follow consistent conventions. There's and Enum module, which groups most collection-related operations, String module for working with binaries (strings in Elixir are binaries by default) and so on.

I said they follow consistent conventions. It's linked to the use of a threading macro (known as thread-first or -> in Clojure and |> in F#) - wherever possible, functions take the subject of their operations as a first argument, with other arguments following. This makes it easy to chain calls with the macro, which lets you unnest calls:

response
    |> String.split
    |> Enum.take(10)
    |> Enum.to_list
    |> Enum.join("\n")

Nothing fancy, and it has it's limitations , but it does make a difference in practice. It's a macro, so it's not as elegant as it would be in some ML, but it still works, so who cares? Of course, the problem is that not every function follows the convention, which makes you break the chain of calls for it. It also doesn't work with lambdas, which is a pain in the ass, because it doesn't let you (easily) do something like flip(&Mod.func/2). You could do it with macros, but that's the direction I have yet to explore.

Overall, Elixir standard library is rather compact, but provides most of the tools you'd expect, along with convenient OTP wrappers. And if you find that something is missing, calling Erlang functions is really easy (tab-completion in the shell works here, too):

> :erlang.now()
{1448, 491309, 304608}

The macros

Elixir sports a macro system based on quasiquote and unquote, known from Lisps. They are hygienic by default, work at compile time and can expand to any (valid) code you want. A lot of Elixir itself is implemented with macros.

You can treat macros simply as functions which take unevaluated code as arguments and return the final piece of code that's going to be run. I didn't investigate much, but one macro I found useful was a tap macro: https://github.com/mgwidmann/elixir-pattern_tap You can read its code, it's less than 40 lines of code, with quite a bunch of whitespace thrown in. This is also the reason why I believe that flip-like macro should be possible and easy to make.

I will probably expand on macros and other meta-programming features of Elixir after I play with it some more.

Structures and protocols

In Erlang you have records, but they are just a bit of (not that well looking, by the way) syntactic sugar on top of tuples. From what I understand, records (called structures) in Elixir are a syntactic sugar over Maps (the ones added in Erlang/OTP 17), however they act as a building block for protocols, which provide easy polymorphism to the language. Actually, it's still the same as in Erlang, where you can pass a module name along with some value, to have a function from that module called on the value. The difference is that in Erlang you need to do this explicitly, while protocols in Elixir hide all this from the programmer.

First, you need to define a protocol itself. It has to have a name and a couple of functions. Then you implement the functions for a structure of your choosing (built-in types, like Lists or even Functions, are included too). Then you can call the protocol functions (treating protocol name as module name) on all the structures which implement required functions. I think the protocols are nominative and not structural, so it's not enough to implement the functions in the same module as the structure, you need to explicitly declare them as an implementation for some protocol.

Protocols are used for a Dict module, for example, which allows you to treat maps, hashes, alists and other things like simple dictionaries, letting you manipulate them without worrying about the concrete type. However, the dispatch happens at run-time, so it's better to use implementation-specific functions if there's no need for polymorphism.

That's it

I mean, I only played with Elixir over a single weekend; I still have quite a few corners to explore. This is why I refrain from putting my perceived "cons" in this post - I still need to get used to the Elixir way of doing things some more.

I already decided, however, to use Elixir for programming Erlang in my next project. At a glance Elixir looks like it improves on Erlang in some areas, while not making it much worse at other places. I have a high expectations for macros which are actually usable (you have parse transforms in Erlang, but...).

Comments