Better poly than sorry!

Clojure and Racket: Origins

Most programming languages are created with some goals in mind, which the new language tries to meet. Over time, the goals may (but don't have to) shift as the language evolves. Either way, knowing the current goals and history of a language should make it easier to understand its good and bad sides. So, before getting to the main issue in the following posts, I prepared a short introduction to both Clojure and Racket.

Racket: in the beginning, there was (PLT) Scheme

Racket started its life in 1994, as an implementation of the Scheme language[1]. Scheme is a rather minimal Lisp-1[2] dialect popular with Programming Languages researchers, Computer Science professors, and teachers because of its simplicity and elegance.

Matthias Felleisen initially created a research group with the goal of providing better teaching material for novice programmers. The group was called PLT, and after deciding to write a new language as part of their efforts, they based it on Scheme, creating the first versions of PLT Scheme. The initial goal was consistently pursued for many years, resulting in a couple of books and many papers, but there were other groups of Scheme users - the language proved to be a good platform for implementing novel PL concepts, so researchers took a liking to it. From that point, the design and implementation of what would become Racket tried to meet two goals: to be a beginner-friendly pedagogical environment and to be a powerful, easily extensible general-purpose language.

On the one hand, delimited continuations[3], more powerful macro systems[4], soft typing and eventually a statically typed dialect[5] were implemented in Racket, improving both the language and the state of research. On the other hand, the need for a more beginner-friendly language, focused on making teaching and learning at different levels easier, resulted in a graphical IDE and support for restricted and contracted[6] subsets of the primary language. Between the two extremes was the often overlooked rest: a native-looking, cross-platform GUI toolkit (used to write Racket's IDE), a lot of utility libraries distributed via a central repository[7], and infrastructure like the JIT compiler, garbage collector and so on.

PLT Scheme changed its name to Racket sometime around 2010, when it became apparent that it evolved way past the Scheme specification and showed no signs of stopping its further development and evolution. Current Racket is a product of seven more years of cutting-edge language research, teaching material preparation[8], adding requested features and abstractions and implementing even more libraries.

If it sounds like a kitchen sink, that's because it is. Some parts of Racket are explicitly designed, but some others evolved on their own. When meeting all the challenges during the years Racket not only added tools for dealing with them but also perfected the meta-tools to create the tools, which made it into a perfect kitchen sink: one where implementing new linguistic features is not only possible but easy by design.

Clojure: quite a different story

Rich Hickey wanted a simple, powerful and successful Lisp for a long time. During the 00's such a Lisp arguably didn't exist. Common Lisp was in a decline which started a decade earlier[9]. Scheme was impractical (because of problems with portability between implementations, also Scheme's minimalist approach to stdlib didn't help). Other Lisps - other than the embedded ones, like Emacs Lisp or a dialect used in AutoCAD - hardly had any following at all.

Before writing Clojure, Rich was familiar with Common Lisp and tried integrating CL with Java on a couple of occasions, but ultimately decided to create an entirely new Lisp, using the JVM as its runtime environment. His goals were: a symbiotic relationship with the hosting platform, Functional Programming, and concurrency. FP and concurrency implied immutability: almost all native data structures in Clojure are immutable and persistent[10], and the mutable ones only allow mutation in certain, safeguarded contexts. The desire for expressive power resulted in an extended syntax, used for data structure literals, pattern matching[11] and dereferencing shortcuts. The standard library is not too big, because Clojure can use all of the Java stdlib directly; the largest part of the library is functions for dealing with collections, an abstraction similar to iterables in Python. The four basic data structures - string, vector, set and map - are all collections and may be operated on with the same set of functions.
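The analogy with Python iterables can be made concrete. This is a Python illustration of the idea (not Clojure code); the count_all helper is hypothetical, just to show one function working uniformly over different collection types:

```python
# One set of generic functions works over all collection types -- the
# Python analogue of Clojure's collection functions working uniformly
# on strings, vectors, sets and maps.

def count_all(colls):
    """len() is polymorphic over every collection, much like Clojure's count."""
    return [len(c) for c in colls]

# A string, a "vector", a set and a map -- all usable through one interface:
print(count_all(["abc", [1, 2, 3], {1, 2}, {"a": 1}]))  # -> [3, 3, 2, 1]
```

In Clojure the equivalent uniformity comes from `count`, `map`, `filter` and friends accepting any collection.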

In the first couple of years after launch, Clojure marketing was focused on Java interop and concurrency. Clojure indeed offered a lot of options for safe concurrency and safe parallelism (leveraging JVM threads and other constructs, but hiding them behind novel APIs) and was definitely more productive than Java (though comparable to Groovy or Jython, I think). It was quickly recognized as a language well suited for server-side web development. Later, when ClojureScript appeared[12], Clojure steered even more towards web development, now promising performant servers and almost effort-free (i.e., reusing the server-side code) frontend development. It didn't work that well at first: ClojureScript lacked features and Clojure-to-ClojureScript interop wasn't easy to set up. Things improved over the years, and in 2015, according to Wikipedia, 66% of surveyed Clojure users also used ClojureScript.

Over the years, the language became one driven by the community, with Rich serving as its BDFL, analogous to Python's Guido van Rossum. The community produces new versions of the language in 1-2 year spans; the new features are generally few in number, but powerful, with a broad impact on how the code is supposed to be written[13]. The tooling and library ecosystem, initially dependent on Java, also improved and matured over the years, resulting - among other things - in a good build tool and package manager[14]. The core language remained relatively small and focused, but its expressivity allowed for many features to be added as libraries.

Cross-pollination and differences

As both languages were rapidly evolving around the same time (and are still being developed), some features from one language were implemented in the other. For example, Clojure's core.typed is based on Typed Racket, while Racket's sequences idea is suspiciously similar to Clojure's collections (especially when using a helper library, because the stdlib support for sequences is very basic).

That doesn't really make the languages similar, though - they are based on completely different philosophies and were developed very differently. Honestly, I'm tempted to say that Lispiness-1 of the dialects is the only thing they have in common (I'm exaggerating a bit): they offer similar functionality but deliver it in different ways, and the general feel of using them is very different.

Of course, how the language feels when used is entirely subjective, but language designers and developers often optimize languages to "feel natural" or "feel right," and language users often evaluate them based on this vague notion. That is to say: even if two languages are technically equivalent, if they "feel" different, they attract different users, cover different use-cases and specialize in different things.

The differences between Racket and Clojure are just enough to make it worth learning both of them, I think... But, well, as a programming languages nerd I may be a bit biased. Still, it's worth seeing at least some of the capabilities and features they provide - I plan to present a few of them in the third post in the series. But before that, in the next post, I'll tell you how and why my work on a Clojure-based project was a nightmare.


Upgrading Fedora 20 to 25

Apparently, I can't directly upgrade to whichever version I choose, I need to go one by one, doing fedora-upgrade and rebooting in a loop as needed.

I realize that my case is somewhat extreme, but watching the same set of packages getting downloaded five times (~640MB each time) feels wrong somehow.

The good news is that, although the upgrade took its time, in the end it finished successfully. Quite a feat for that large a difference in versions: the default package manager and system upgrade tools changed in the meantime, but still managed to produce a working system with minimal fuss.

Damn, I'm getting back to Linux as soon as I find a decent laptop; working on Mac OS - although, with effort, made vaguely OK - is tiring, as I need to cope with all the "oh, it doesn't work that way here" moments...


Better blog generator - with RSS feed!

You can now subscribe to the RSS feed for this blog!

RSS feed

As you can probably guess with a quick glance, this blog is self-hosted by me and is a result of "static page generation": there's a bunch of files and a few scripts which output all the needed HTML/CSS/JS to run the site; you just upload it to the server and you're done. This approach has been very popular lately and many bloggers do the same.

One - slightly - unusual bit is that I wrote all of the generators and scripts myself, from scratch (using mainly Python). This means, among other things, that I have no pile of already written plugins or modules I could resort to in case I found my generator lacks an important feature...

For a long time I was pretty much the only user of both the generator and the blog, so I didn't mind. Recently, though, quite a few people who said they'd like to read my next post (my prospective readers! Finally, after four years, I have a chance to get my very own readers! ...err, sorry, got a bit too happy there for a moment...) suggested adding an RSS feed to the blog.

I had known nothing about RSS and it took a couple of hours, but I finally added RSS feed generation to my scripts. Now every time I publish a post, a new /statics/feed.xml gets generated and, hopefully, every subscriber will be notified![1] Even better, you will even be notified about post updates (probably)!
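My actual generator code isn't shown here, but a minimal sketch of what RSS 2.0 generation looks like in Python might help; the render_feed helper and the (title, url, timestamp) post structure are hypothetical:

```python
from email.utils import formatdate
from xml.sax.saxutils import escape

def render_feed(site_url, title, posts):
    """Render a minimal RSS 2.0 document.

    `posts` is a list of (title, relative_url, unix_timestamp) tuples --
    a hypothetical structure, not the blog's real data model.
    """
    items = "\n".join(
        "<item><title>{t}</title><link>{u}</link><pubDate>{d}</pubDate></item>".format(
            t=escape(p_title),
            u=escape(site_url + p_url),
            d=formatdate(p_ts),  # RFC 822 date format, as RSS requires
        )
        for p_title, p_url, p_ts in posts
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<rss version="2.0"><channel>'
        "<title>{}</title><link>{}</link>{}"
        "</channel></rss>".format(escape(title), escape(site_url), items)
    )
```

The output of such a function would then be written to something like /statics/feed.xml by the site generator.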

Thank you! (Or: Rkt vs. Clj is coming!)

Ok, so I know simply adding RSS feed is hardly important enough to write a post about it, so here is the other part: thank you very much! All of you who upvoted my comment on Hacker News or even sent me an email - only to encourage me to write the promised post, thank you. It's thanks to you that I got motivated enough to finally start writing that post about Clojure and Racket comparison.

Please stay tuned and be patient: it's a lot of work, especially because I want to stay as fair as possible and give as many detailed examples as possible (without copyright-related problems).

So, once again, thank you for expressing your interest in my future post and please stay patient while waiting for me to finish; I'll try my best, on my part, to write it all well and on time!

  • Let me know if it doesn't work for you or your reader! (I don't know much about RSS and could have messed up.)

How to quickly get over the parens problem of Clojure syntax

NOTE: When first starting to use Clojure, get Nightcode. The experience is going to be much better.

It's normal, I think, to get overwhelmed by the sheer amount of parens you need to manage when learning a Lisp. When taken on its own, the syntax really is simple, but introducing another strange thing on top of already unfamiliar concepts doesn't help learning.

Some people will tell you that you will "get used to it" soon enough and that it's just a matter of practice. While there is some truth to these claims, I'm a pragmatist and so I prefer another approach: simply set up a decent environment for working with Lisp before you begin. There are many tools which make reading and writing Lisp code much easier: you need to either configure your editor to enable them or change the editor. There are some Clojure-specific editors out there, you can simply use one of them (my thoughts on them below).

Main things/features useful when working with Lisps:

  • syntax highlighting for your Lisp (obviously!)
  • automatic indentation (and re-indent) for your Lisp
  • ability to highlight matching paren
  • ability to jump to matching paren
  • configurable syntax coloring and/or rainbow-style coloring of parens
  • ability to wrap a text selection with parens
  • automatically inserting closing paren
  • ability to delete parens in pairs automatically
  • par-edit or par-infer style structural editing
  • auto-completion for module/function names[1]
  • quick access to the docs for a name under cursor
  • "Go to definition..." is good to have, but you can usually make do with grep or "Find in files..." editor command

See the screenshots and a video for visual demonstration of what I'm talking about (click on the image to get a bigger version):

Also see here for a very good general introduction to editing Lisp along with explanation of what Parinfer is.

Clojure specific editors

After writing the above I realized that it would be good to give a couple of examples of beginner-friendly editors, which implement the features mentioned above. To my surprise there are some editors which target Clojure, some of them even maintained. Here's a short summary of what I found out:

Nightcode - I only played with it a little, but I'm impressed by what it can do. I tried using Parinfer some time back and it is much friendlier and easier to use than Paredit[2]. In short: you never need to worry about parens when using Nightcode. Place the cursor where you need it and start typing: the parens will magically appear. It worked out of the box for me on Mac; however, it refused to run on Fedora.

Clooj - inactive since 2012, has nearly none of the useful features, looks ugly and is slow. Forget it.

LightTable - supports many things mentioned above as plugins, but they tend to be disabled by default and enabling them is not as easy as it should be. The plugin-based architecture makes it interesting for polyglot projects, but it needs some configuration to get started and Nightcode needs none.

Cursive - a Clojure plugin for IntelliJ IDEA from JetBrains. I don't have any of their IDEs installed and so I didn't try it. Its feature list looks decent, though, so if you like IntelliJ this may be the best option for you.

Emacs and CIDER - that's what I use. I'd recommend trying this route only if you already know some Emacs; otherwise it's going to be frustrating for a good while, until you internalize all the strange names and such. CIDER itself is great, though: it integrates with Leiningen, offers inline code evaluation, auto-completion, connecting to a running REPL and so on.

In short: give Nightcode a try if you can, otherwise use Lisp/Clojure plugins for your current editor, like Sublime, IntelliJ or Eclipse. Come over to the Emacs side once you get bored with those.

  • A full "IntelliSense" - context aware auto-completion - is always nice to have, but you can live without it. For a while.
  • I stick with Paredit because I'm already used to it and efficient enough with it; I'd go for Parinfer if I were starting to learn Lisps now.

Scripting Nginx with Lua - slides

I gave a talk at work yesterday about OpenRESTY, which I think is the easiest way to start scripting Nginx. Here are the slides:


One thing I'd like to add is that I don't trust OpenRESTY as an app platform yet. The project is being actively developed and the author writes tons of good code, but he is but a single man. This means there is a severe lack of tools and libraries in the ecosystem: they simply don't exist yet.

On the other hand, OpenRESTY is a painless way to script Nginx with Lua, and that's something you can do with just the modules provided by OpenRESTY. Being able to communicate asynchronously with anything (literally: anything that wants to talk over sockets), including databases, external services and so on, is a very nice thing indeed.

There's one caveat, though: you can't use any blocking (non-async) constructs, which includes normal Lua sockets. This is not a problem in your own code, but if you happen to need a Lua library which comes as a C extension, and the functions in that library block, you will probably need to fix the C code. At this point it's purely theoretical, because I haven't encountered any such library yet.


My adventure in X screen locking - slock re-implementation in Nim, part 1

NOTE: This part covers project motivation and setup
NOTE: The full source code is on GitHub

TL;DR or What is this post about?

In short: it's about me exploring - or at least getting into contact with - a couple of interesting things:

  • X Window System APIs via Xlib bindings
  • low-level Linux APIs for elevating privileges and checking passwords
  • GCC options and some C
  • and of course the Nim language


At my work we strongly discourage leaving logged-in accounts and/or unlocked screens of computers. I happen to agree that locking your computer is a good habit to have, so I've had no problems with this rule... Up to the point when I switched from KDE to StumpWM (I wrote about it some time ago, in this, this and this posts) and my trusty Ctrl + Alt + L stopped working.

The only idea that came to my mind was to use the venerable xscreensaver, but: a) I didn't really need any of the 200+ animations (I just wanted a blank screen) and b) I didn't like how the unlock dialog looked[1].

XScreenSaver and its dialog, straight from the '80s.

I needed something more lightweight (xscreensaver is ~30k loc of C[2]), simpler and either better looking or without any UI altogether.

slock to the rescue

There's a site called suckless.org where you can find some impressively clean and simple tools, implemented mostly in C. You can read more about their philosophy here, which I recommend as an eye-opening experience. Anyway, among the tools developed there, there is also slock - a very simple and basic screen locker for X. It's 310 lines of code long and it's all pretty straightforward C.

The program suited my needs very well: minimal, fast, and good looking. Well, the last part it achieved by cheating: slock simply has no interface at all - but this means there's no ugly unlock dialog, so it's all good.

Why Nim?

As I used slock I read through its code a couple of times. It seemed simple and learnable, despite the fact that I knew nothing about X and hadn't used C seriously in 15 years. Fast forward to last week: I finally found some free time and decided to learn some Xlib stuff. Re-implementing slock in Nim looked like a good way of doing it: this may sound a bit extreme, but it allowed me to gain exp points in two stats simultaneously[3] - Xlib usage and Nim knowledge!

Nim is a very interesting language. I'd call it "unorthodox" - not yet radical, like Lisp or Smalltalk, but also not exactly aligned with Java and the like. For one tiny example: the return statement in functions is optional, and if omitted, a function returns its last expression's value. That's a rather normal way to go about it; however, in Nim you can also use a special variable named result and assign to it to have the assigned value returned. It looks like this:

  proc fun1() : int =
    return 1

  proc fun2() : int =
    1                   # the last expression is the return value

  proc fun3() : int =
    result = 1

  assert fun1() == fun2() and fun2() == fun3()

At a first glance this may look strange, but consider this in Python:

  def fun():
      ret = []
      for x in something:
          ret.append(x)  # accumulate results
      return ret

It's practically an idiom, a very common construct. Nim takes this idiom, adds some sugar, adapts it to the world of static typing and includes it in the language itself. We can translate the above Python to Nim with very minor changes[4]:

  proc fun4() : seq[int] =
    result = @[]
    for x in something:
      result.add(x)  # accumulate results

As I said, it's not a groundbreaking feature, but it is nice to have, and it shows that Nim doesn't hesitate much when choosing language features to include. In effect, Nim has many such conveniences, which may be seen both as a strength and as a weakness. While they make common tasks very easy, they also make Nim a larger language than some others. That's not bad in itself; rather, depending on how well the features fit together, it's harder or easier to remember and use them all. Nim manages well in this area, and it most definitely is not like C++ with its backwards-compatibility problems, so I think even Pythonistas with a taste for minimalism will be able to work with Nim.

Being Nimble - the project setup

Many modern languages include some kind of task runner and package manager, either as part of the standard distribution or as downloadable packages. Nim has Nimble, which takes care of installing, creating and publishing Nim packages. Assuming that you have Nim already installed[5], you can install Nimble with:

  $ git clone https://github.com/nim-lang/nimble.git
  $ cd nimble
  $ git clone -b v0.13.0 --depth 1 https://github.com/nim-lang/nim vendor/nim
  $ nim c -r src/nimble install
  $ export PATH="$PATH:~/.nimble/bin/"

Note the addition to the PATH variable: ~/.nimble/bin/ is where Nimble installed itself and where it will place other binaries it installs. Make sure to have this directory in your PATH before working with nimble.

Creating a project

Creating a project is easy, similar to npm init:

  $ mkdir slock
  $ cd slock
  $ nimble init
  In order to initialise a new Nimble package, I will need to ask you
  some questions. Default values are shown in square brackets, press
  enter to use them.
  Enter package name [slock]:
  Enter intial version of package [0.1.0]:
  Enter your name [Piotr Klibert]:
  Enter package description: Simplest possible screen locker for X
  Enter package license [MIT]:
  Enter lowest supported Nim version [0.13.0]:

  $ ls
  total 4.0K
  -rw-rw-r--. 1 cji cji 188 13/02/2016 01:02 slock.nimble
  $ cat slock.nimble
  # Package

  version       = "0.1.0"
  author        = "Piotr Klibert"
  description   = "Simplest possible screen locker for X"
  license       = "MIT"

  # Dependencies

  requires "nim >= 0.13.0"

  $ mkdir src
  $ touch src/slock.nim

The generated slock.nimble is a configuration file for Nimble; it's written in NimScript, which looks like a very recent development in Nim and replaces the previous INI-style config files. This means that many examples and tutorials on the Internet won't work with this format. The most important difference is the format of the dependencies list for your app: it now has to be a seq. For example, to add the x11 library to the project:

  requires @["nim >= 0.13.0", "x11 >= 0.1"]

The dependencies should be downloaded by Nimble automatically, but you can also download them manually, as in the example below. You'll notice that I also install the c2nim package - I will say more about it later.

  $ nimble install c2nim x11
  Searching in "official" package list...
  Downloading https://github.com/nim-lang/c2nim into /tmp/nimble_18934/githubcom_nimlangc2nim using git...
  Cloning into '/tmp/nimble_18934/githubcom_nimlangc2nim'...
  ...etc, etc, etc...

Workflow and tooling

Nim is a compiled language, but working with it proved to be comparable to working with a dynamic language, mainly thanks to type inference and a blazingly fast compiler. You can omit type declarations where they are obvious from the context, and Nim will deduce the correct types for you. It's not whole-program type inference like in OCaml, but rather local type inference as in Scala or modern C++ or Java. Even with this limitation it's immensely useful and reduces code verbosity by a lot.

Compiler speed is important because it encourages frequent testing. If compilation takes a long time, you tend to "batch" changes in your code together and only compile and run once in a while instead of after every single change. This, in turn, makes it harder to find and fix regressions when they appear. Dynamic languages work around this issue by rejecting the compilation step completely, at the cost of run-time safety and performance. Nim - which is similar to Go in this respect - makes compilation effortless and nearly invisible. The command for compile-and-run looks like this:

  $ nim c -r module.nim

The problem with this command is that it doesn't know about Nimble and dependencies, so in reality I used slightly different commands:

  $ nimble build && sudo ./slock

  # or, if you need to pass some additional parameters to Nim compiler:
  $ nimble c --dynlibOverride:crypt --opt:size --passL:"-lcrypt" -d:release src/slock.nim  -o:./slock && sudo ./slock

While Nim's compiler and Nimble are very good tools, they're not enough to work comfortably on more complex codebases. Nim acknowledges this and provides a couple of additional tools for working with code, like nimsuggest. However, nimsuggest is rather new and it SEGFAULTed on me a couple of times[6]. I used it via - of course - Emacs with nim-mode, and again I encountered a couple of irritating bugs, like frequent "Mismatched parens" errors when trying to jump a word or expression forward. However, when nimsuggest works, it does a good job; in particular, its "Go to definition" feature works and is very helpful.

Nim's definitive source of documentation is The Index, which works surprisingly well as a reference. Ctrl + f works just as well as the search boxes other documentation sites provide. Nim docs are also nice in that they link to the source: you get a link to a function's implementation beside its documentation. I like this trend and I'm happy that more documentation is like this - a quick peek at the source can sometimes save hours of unneeded work.

In this project I had to work with Xlib, and it turns out it's documented in man pages, which were already installed on my system. I don't remember installing them, so maybe they come with X by default on Fedora. Anyway, as an Xlib tutorial I used the Xlib Programming Manual, and for reference I simply used the man pages: just type, for example, man XFreePixmap and you get the XCreatePixmap (3) man page. The same is true for Linux/POSIX functions - try, for example, man getpwuid.

That's it, for now

In the next part I'm going to show some Nim code, translated - both manually and automatically - from C. I'll focus on Nim's language features that C doesn't have and show how they can be used to write shorter, safer and more readable code. In the last part I'm going to write about Nim's interop with C and the various ways of connecting the two, using Xlib as an example.

  • And it seems that JWZ doesn't like people modifying the looks of this dialog, so I didn't even try.
  • Very well written C.
  • Every power gamer will understand how nice this is!
  • But note the lack of unnecessary return statement in Nim.
  • I recommend installing from GitHub - it's always good to have the sources for your compiler and stdlib around, and you won't get them if you install just the binaries. And the master branch is stable; check out the devel branch for the in-development code.
  • This might be because of my own incompetence, of course.

Python interoperability options

Recently I was writing a little set of scripts for downloading and analyzing manga meta-data, for use with my Machine Learning pet project. True to polyglot style, the scripts were written in two languages: Python and Elixir. Elixir is a language that runs on the Erlang VM (see my previous blog post on it, Weekend with Elixir), and Python needs no introduction, I think. I used Elixir for the network-related stuff, and Python with pandas for the analysis.

The biggest problem with such a setup (and with the polyglot approach in general) is passing data around between the languages. At first I just used a couple of temporary JSON files, but then I remembered a project I once saw, called ErlPort, which is a Python implementation of Erlang's external term format. In short, it enables seamless integration between Python and Erlang and - by extension - Elixir. ErlPort not only lets you serialize data, it also lets you call functions across language boundaries. It supports Ruby besides Python, too.

In general, using ErlPort was a success in my case, but it's limited in that it can only connect three languages. I'd normally say it's good enough and leave it at that, but a couple of days later, in a "best Python libs of 2015" thread on HN, I discovered another project called python-bond, which provides similar interoperability features between Python and three more languages: PHP, Perl and JavaScript. The two libraries - ErlPort and python-bond - make it embarrassingly easy to integrate a couple of different languages in a Python project. Along with Cython, which lets you easily call C-level functions, this makes Python a very good base language for polyglot projects.
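For comparison, the temporary-JSON-files approach I started with can be sketched in a few lines of Python. The exchange_via_json helper and the commented-out elixir invocation are purely illustrative, not my actual scripts:

```python
import json
import os
import tempfile

def exchange_via_json(records):
    """Hand data to another language's process through a temporary JSON file.

    This is the simple file-based interop used before switching to ErlPort:
    write JSON, let the other runtime read/transform it, read it back.
    """
    fd, path = tempfile.mkstemp(suffix=".json")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(records, f)
        # The other language would read `path`, do its work and write
        # results back to the same file, e.g. (hypothetical script name):
        # subprocess.run(["elixir", "analyze.exs", path], check=True)
        with open(path) as f:
            return json.load(f)
    finally:
        os.remove(path)
```

It works, but every exchange pays serialization and filesystem costs, and you're limited to JSON-representable data - which is exactly what ErlPort's term protocol improves on.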

The hardest part of the polyglot style, and polyglot architecture in particular, is deciding which language should be the base one - the one which runs and integrates the rest. In my case I ended up with most of the control flow inside Elixir, because of the very neat supervision tree feature it provides (thanks to Erlang/OTP). It remains to be seen if Elixir is capable of filling the role of the main implementation language in a polyglot project. I'll make sure to write a post about it in the future.


Jumping to next/previous occurrence of something at point via iedit

For a very long time I used normal search to get to the next occurrence of something. However, it wasn't very comfortable. For this to work, I need to:

  • move point (cursor) to the beginning of a word I'd like to jump to
  • press C-s
  • press C-w, more than once if necessary, to copy the current word into the search box
  • keep pressing C-s to search for occurrences

I found a better way. iedit is an Emacs mode for editing all the occurrences of something at once. I alternate between iedit and multiple-cursors mode when I need to do something simple to a word in many places in the code. However, iedit also provides iedit-next-occurrence, which by default is bound to TAB.

Using iedit I only need to:

  • move point to anywhere inside a word
  • press C-; for iedit
  • press TAB to jump to the next and S-TAB (shift + tab) to jump to the previous occurrence

One more feature of iedit I sometimes find useful is the toggle-unmatched-lines-visible command. It duplicates occur mode functionality a bit, but it hides unmatched lines in the current buffer. This makes it easy to quickly switch between a global occurrence list and a local occurrence context.


Weekend with Elixir

So I was playing with Elixir over the last couple of days. I didn't write anything significant, but I scribbled enough code, I think, to get a feel for the language. I know Erlang rather well, so I suppose most "pain points" of Elixir were non-issues for me. I had already mastered things like async message passing, OTP behaviors, pattern matching, recursion as iteration and the like. Thanks to this I was able to concentrate on Elixir itself, instead of learning many unfamiliar concepts.

The syntax

One highly visible difference from Erlang is the syntax. I don't hate Erlang's syntax; it's small and clean, probably owing to its Prolog roots. However, it's not like it couldn't be improved. Elixir has a completely different syntax, closer to Ruby than to anything else I know of, and it turns out it works well.

For Erlangers, some interesting points are:

Dedicated alist syntax. With strict arity enforcement it's common for functions to accept a list of two-tuples (pairs) as an options list. The first element of the tuple is a key, and the second is its value. It's natural to recurse and pattern-match over such a structure, so it's used a lot. Elixir comes with special syntax for creating these, as long as the key is an atom (which it almost always is). It looks like this:

[key: val,
 key2: val2]

It's just a bit of syntactic sugar, you could equivalently write:

[{:key, val},
 {:key2, val2}]

and it would look the same in Erlang. The sugared version is easier to read thanks to the reduced number of braces, which clutter the non-sugared version.

Elixir syntax is also very consistent, with only a couple of pitfalls you need to look out for (mainly, the comma before do: in the abbreviated style). Every block and every instruction (macro call or special form) follows the same rules. For example, you can use both the abbreviated and the normal block syntax wherever a block is required. The normal syntax looks like this:

def fun() do
  ...
end
or you can write it like this:

def fun(), do: (...)

You may get suspicious, as I did, on seeing the second form. As it turns out, it's exactly what you think it is: the same syntactic sugar that alists use. So you can equivalently write:

def fun(), [{:do, val}]

It's nice, because it works uniformly for every call, as long as the alist is the last argument of the call, no matter whether it's a macro or a function call. Ruby has the same thing in the form of hash sugar.

Other syntactic features are much less interesting. For example, in Erlang all variables have to begin with an upper-case letter, while Elixir reverses this and makes variables begin with lower-case letters only. This means that atoms need a : prefix. What's fun is that capitalized identifiers are also special in Elixir: they are read as atoms (with an Elixir. prefix). Things like this are very easy to get used to. Also, in most situations it's optional to enclose function arguments in parentheses; you can call zero-arity functions by just mentioning their name. To get a reference to a function you need to "capture" it with a &[mod.]func_name/arity construct. It's also easy to get used to, and not that different from Erlang's fun [mod:]func_name/arity.

The module system

Erlang includes a module system for code organization. In Erlang, a module is always a single file, and the file name must match the module name. Also, the namespace for modules is flat, which means you can have conflicts if you're not careful. Elixir does away with the first restriction entirely and works around the second with a clever hack. You can define many modules inside a single file and you can nest the modules. It looks like this:

defmodule A do
    defmodule B do
        ...
    end
end

Inside the A module you can refer to B simply, but on the outside you have to use A.B (which is still just a single atom!). Moreover, you can unnest the modules, too (order of modules doesn't matter):

defmodule A do
    ...
end

defmodule A.B do
    ...
end

So, the modules are still identified by a simple atom in a flat namespace, but they give an illusion of being namespaced with a convenient convention. By the way, it makes calling Elixir code from Erlang not too hard - you just spell out the full atom (the function name here is made up):

'Elixir.A.B':some_function().

It's not that pretty, but it works well.

The other part of the module system is the ability to export and import identifiers into a module scope. In Erlang you can do both, but it's limited, because there is no import ... as ... equivalent like in Python. Elixir, on the other hand, provides both import and alias macros. import works by injecting other modules' function names directly into the current lexical scope (this is another difference from Erlang, where you can only import at the module level), and alias lets you rename a module you'd like to use. For example:

alias HTTPoison.Response, as: Resp

makes it easy to refer to the nested module without writing the long name every time. It also works in the current lexical environment. There's also a require form, meant for importing macros (similar to how -include is used in Erlang).

These two features make modules in Elixir cheap. You're likely to create far more modules in Elixir than you would in Erlang. That's a good thing, as it makes organizing the code-base easier.

There is more to the module system, like compile-time constants and a use directive, which make it even more powerful (and sometimes a bit too magical).

The tooling

Elixir comes with Mix, a task runner similar to Grunt or Gulp in the JS/Node.js land. Every library can extend the list of available tasks by just defining an appropriate module. One such task provider is Hex, a package manager, which is also built in.

The two work together to make pulling in dependencies (locally, like npm or virtualenv do) easier than most other solutions I've seen. You can install packages from the central Hex repository or directly from GitHub. No manual fiddling involved. In Erlang this is also possible, but it's not as streamlined or convenient. Starting a new project, installing dependencies and running a REPL is ridiculously easy:

$ mix new dir
...edit dir/mix.exs, deps section...
$ iex -S mix

You can do mix run instead to run the project itself without a REPL. Every time your deps change, the change is automatically picked up and your deps get synced. Also, mix automatically compiles all your source code if it changes; in Erlang I more than once wondered why the heck my changes weren't visible in the REPL, only to realize I hadn't compiled and loaded them. Of course, in the REPL you have both compile and load commands available, along with reload, which compiles and loads a module in one go. Mix is also used for running tests; unit-test support is built into the language with the ExUnit library, by the way.

The REPL itself is... colorful. That's the first difference I noticed compared to the Erlang REPL. It supports tab-completion for modules and functions, just like Erlang's, but it also supports a help command, which displays docstrings for various things. This is useful, especially because the Elixir devs seem to like annotating modules and functions with docstrings. Everything else works just as in Erlang, I think.

The standard library

The library is not that big, and it doesn't have to be, because there's Erlang underneath it. Instead it focuses on delivering a set of convenient wrappers which follow consistent conventions. There's an Enum module, which groups most collection-related operations, a String module for working with binaries (strings in Elixir are binaries by default) and so on.

I said they follow consistent conventions. It's linked to the use of a threading macro (known as thread-first or -> in Clojure and |> in F#): wherever possible, functions take the subject of their operation as the first argument, with other arguments following. This makes it easy to chain calls with the macro, which lets you unnest calls:

    text                 # any string, for illustration
    |> String.split
    |> Enum.take(10)
    |> Enum.to_list
    |> Enum.join("\n")

Nothing fancy, and it has its limitations, but it does make a difference in practice. It's a macro, so it's not as elegant as it would be in some ML, but it still works, so who cares? Of course, the problem is that not every function follows the convention, which makes you break the chain of calls for it. It also doesn't work with lambdas, which is a pain in the ass, because it doesn't let you (easily) do something like flip(&Mod.func/2). You could do it with macros, but that's a direction I have yet to explore.
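The first-argument convention is easy to imitate outside Elixir; here is a rough Python sketch of the same idea (a toy helper, not Elixir's |> macro, which rewrites the call at compile time):

```python
from functools import reduce

def pipe(value, *funcs):
    """Thread value through funcs left to right, like Elixir's |>."""
    return reduce(lambda acc, f: f(acc), funcs, value)

# Roughly mirrors the chain from the text:
result = pipe(
    "a b c d e f",
    str.split,                 # String.split
    lambda words: words[:3],   # Enum.take(3)
    "\n".join,                 # Enum.join("\n")
)
print(result)
```

The subject flows through each step without intermediate variables, which is exactly what the macro buys you in Elixir.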

Overall, the Elixir standard library is rather compact, but provides most of the tools you'd expect, along with convenient OTP wrappers. And if you find that something is missing, calling Erlang functions is really easy (tab-completion in the shell works here, too):

> :erlang.now()
{1448, 491309, 304608}

The macros

Elixir sports a macro system based on quasiquote and unquote, known from Lisps. They are hygienic by default, work at compile time and can expand to any (valid) code you want. A lot of Elixir itself is implemented with macros.

You can treat macros simply as functions which take unevaluated code as arguments and return the final piece of code that's going to be run. I didn't investigate much, but one macro I found useful was the pattern tap macro: https://github.com/mgwidmann/elixir-pattern_tap You can read its code; it's less than 40 lines long, with quite a bit of whitespace thrown in. This is also the reason why I believe a flip-like macro should be possible and easy to make.
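The "function over unevaluated code" view can be demonstrated in Python with the ast module (a loose analogy for illustration only, not how Elixir macros are implemented):

```python
import ast

def fold_add(source):
    """A toy 'macro': take unevaluated code, rewrite it, then run the result.

    The rewrite here is constant-folding of additions of literals, done
    before any evaluation happens - analogous to macro-expansion time.
    """
    class Fold(ast.NodeTransformer):
        def visit_BinOp(self, node):
            self.generic_visit(node)  # rewrite children first
            if (isinstance(node.op, ast.Add)
                    and isinstance(node.left, ast.Constant)
                    and isinstance(node.right, ast.Constant)):
                return ast.Constant(node.left.value + node.right.value)
            return node

    tree = ast.fix_missing_locations(Fold().visit(ast.parse(source, mode="eval")))
    return eval(compile(tree, "<macro>", "eval"))

print(fold_add("1 + 2 + 39"))
```

The transformer receives the code as data, returns new code as data, and only then is the result compiled and run - the same shape as a macro expansion.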

I will probably expand on macros and other meta-programming features of Elixir after I play with it some more.

Structures and protocols

In Erlang you have records, but they are just a bit of (not very pretty, by the way) syntactic sugar on top of tuples. From what I understand, records (called structures) in Elixir are syntactic sugar over maps (the ones added in Erlang/OTP 17); however, they act as a building block for protocols, which provide easy polymorphism to the language. Actually, it's still the same as in Erlang, where you can pass a module name along with some value to have a function from that module called on the value. The difference is that in Erlang you need to do this explicitly, while protocols in Elixir hide all this from the programmer.

First, you need to define the protocol itself: it has a name and a couple of functions. Then you implement the functions for the structures of your choosing (built-in types, like Lists or even Functions, are included too). Then you can call the protocol functions (treating the protocol name as a module name) on all the structures which implement the required functions. I think the protocols are nominative and not structural, so it's not enough to implement the functions in the same module as the structure; you need to explicitly declare them as an implementation of some protocol.

Protocols are used by the Dict module, for example, which allows you to treat maps, hashes, alists and other things like simple dictionaries, letting you manipulate them without worrying about the concrete type. However, the dispatch happens at run-time, so it's better to use implementation-specific functions if there's no need for polymorphism.
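Python's functools.singledispatch gives a rough feel for this kind of run-time, type-based dispatch (only a loose analogue: Elixir protocols additionally require an explicit implementation declaration):

```python
from functools import singledispatch

# A protocol-like generic function: the implementation is picked at run time
# based on the type of the first argument.
@singledispatch
def size(value):
    raise TypeError(f"size not implemented for {type(value).__name__}")

@size.register(list)
def _(value):
    return len(value)

@size.register(dict)
def _(value):
    return len(value)

@size.register(str)
def _(value):
    return len(value)

print(size([1, 2, 3]), size({"a": 1}), size("abcd"))
```

Callers never name the concrete implementation; unregistered types fail at call time, just as calling a protocol on a type with no implementation does.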

That's it

I mean, I only played with Elixir over a single weekend; I still have quite a few corners to explore. This is why I refrain from putting my perceived "cons" in this post - I still need to get used to the Elixir way of doing things some more.

I have already decided, however, to use Elixir for programming the Erlang platform in my next project. At a glance Elixir looks like it improves on Erlang in some areas, while not making things much worse in other places. I have high expectations for macros which are actually usable (you have parse transforms in Erlang, but...).


Hidden Emacs command that's a real gem

I have no idea how it happened, but somehow for the last 3 years I missed a very useful command, called finder-by-keyword.

The command is almost undocumented, there's only a short explanation of what the module (finder.el) does:

;; This mode uses the Keywords library header to provide code-finding
;; services by keyword.

By default it's bound to C-h p. It lets you browse built-in packages by keyword, like "abbrev", "convenience", "tools" and so on. It's great for discovering packages you didn't know existed!

The module seems to date back to 1992, though, which makes it ignore all the libraries installed via package.el. It shouldn't be that hard to make it search all the package directories too. Actually, that's my main problem with package.el - it only provides a simple, flat list view of the packages. This little tool is much better for browsing package lists, and it's even already written.


A simple Lens example in LiveScript

After reading Lenses in Pictures I felt rather dumb. I understood what the problem was, but couldn't understand why all the ceremony around the solution. I looked at the OCaml Lens library and became enlightened: it was because of Haskell! Simple.

Anyway, the code looks like this:

z = require "lodash"            # `_` is a syntactic construct in LS
{reverse, fold1, map} = require "prelude-ls"

pass-thru = (f, v) --> (f v); v
abstract-method = -> throw new Error("Abstract method")

# LensProto - a prototype of all lens objects.
LensProto =
    # essential methods which need to be implemented on all lens instances;
    # signatures: get: (obj) ->, set: (obj, val) ->
    get: abstract-method
    set: abstract-method
    update: (obj, update_func) ->
        (@get obj)
        |> update_func
        |> @set obj, _

    # convenience functions
    add: (obj, val) ->
        @update obj, (+ val)

# Lens constructors
make-lens = (name) ->
    LensProto with
        get: (.[name])
        set: (obj, val) ->
            (switch typeof! obj
                | \Object => ^^obj <<< obj
                | \Array  => obj.slice 0  )
            |> pass-thru (.[name] = val)

make-lenses = (...lenses) ->
    map make-lens, lenses

# Lenses composition
comp-lens = (L1, L2) ->
    LensProto with
        get: L2.get >> L1.get
        set: (obj, val) ->
            L2.update obj, (obj2) ->
                L1.set obj2, val

# Lensable is a base class (or a mix-in), which can be used with any object and
# which provides two methods for obtaining lenses for given names. The lens
# returned is bound to the object, which allows us to write:
#   obj.l("...").get()
#   obj.l("...").set new_val
# instead of
#   make-lens("...").get obj
#   make-lens("...").set obj, new_value
Lensable =
    # at - convenience function for creating and binding a lens from a string
    # path, with components separated by slash; for example: "a/f/z"
    at: (str) -> @l(...str.split("/"))

    l: (...names) ->
        # create lenses for the names and compose them all into a single lens
        lens = reverse names
            |> map make-lens
            |> fold1 comp-lens

        # bind the lens to *this* object
        lens with
            get:  ~> lens.get this
            set: (val) ~> lens.set this, val

to-lensable = (obj) -> Lensable with obj

Some tests for the code:

# Tests

o = Lensable with
        prop:
            bobr: "omigott!"
            dammit: 0

[prop, dammit] = make-lenses "prop", "dammit"
prop-dammit = comp-lens dammit,  prop

console.log z.is-equal (prop.get o),
    { bobr: 'omigott!', dammit: 0 }

console.log (prop-dammit.get o) == 0

console.log z.is-equal (prop-dammit.set o, 10),
    { prop: { bobr: 'omigott!', dammit: 10 } }

prop-dammit
    |> (.set o, "trite")
    |> (.l("prop", "bobr").set -10)
    |> z.is-equal { prop: { bobr: -10, dammit: 'trite' } }, _
    |> console.log

out = o
    .at("prop/bobr").set "12312"
    .at("prop/argh").set "scoobydoobydoooya"
    .at("prop/lst").set [\c \g]
    .at("prop/dammit").add -10
    .l("prop", "lst", 0).set \a
    .l("prop", "lst", 2).set \a

console.log z.is-equal out, {
    prop: {
        bobr: '12312', dammit: -10,
        argh: 'scoobydoobydoooya',
        lst: ["a", "g", "a"]}}

out = o
    .at("prop/bobr").set "12312"
    .at("prop/argh").set "scoobydoobydoooya"
    .at("prop/dammit").add -10

console.log z.is-equal out,
    { prop: { bobr: '12312', dammit: -10, argh: 'scoobydoobydoooya' } }

transform =
    (.at("prop/bobr").set "12312") >>
    (.at("prop/argh").set "scoobydoooya") >>
    (.at("prop/dammit").add -10)

console.log z.is-equal  (transform o),
    { prop: { bobr: '12312', dammit: -10, argh: 'scoobydoooya' } }

This showcases many of the ways you can use lenses in LiveScript. LS has many ways of creating functions and it has syntax for pipe'ing and composing functions, so it's a natural fit for everything FP-looking. LS does not do "curried by default" like Haskell or OCaml, but it gives you a partial application syntax and allows defining curried functions. It's much like Scala in this regard.

Anyway, this is what lenses are supposed to do - they should support functional updates of data structures. They should offer get, set and update methods, and there should also be a compose operator for them. And that's all - you can read the linked OCaml implementation to see that it's really that simple. Most of that implementation consists of convenience methods for creating lenses for various OCaml structures; the core code is really short.
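For comparison, the same get/set/update/compose core can be sketched in a few lines of Python (hypothetical names; dict unpacking plays the copy-on-write role that ^^obj <<< obj plays in the LiveScript version):

```python
class Lens:
    """A lens is just a pair of functions: get and a non-mutating set."""
    def __init__(self, get, set_):
        self.get, self.set = get, set_

    def update(self, obj, f):
        return self.set(obj, f(self.get(obj)))

def key(name):
    """Lens focusing on one key of a dict, copying on write."""
    return Lens(
        lambda obj: obj[name],
        lambda obj, val: {**obj, name: val},
    )

def compose(outer, inner):
    """Focus through outer, then inner."""
    return Lens(
        lambda obj: inner.get(outer.get(obj)),
        lambda obj, val: outer.update(obj, lambda o2: inner.set(o2, val)),
    )

o = {"prop": {"bobr": "omigott!", "dammit": 0}}
bobr = compose(key("prop"), key("bobr"))
print(bobr.get(o))
print(bobr.set(o, -10))  # returns a new structure; o itself is unchanged
```

That really is the whole trick: everything else in a lens library is conveniences for building lenses over concrete data types.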


Code Mesh '15 conference

The first day of the conference is nearing its end right now. I don't have much time, but I decided to at least enumerate the talks I attended. I'll probably try expanding on them later in the evening.

  1. Reducing the Slippery Surface of Failure with Dependent Types - a talk about dependent types in Scala, and their applications. Examples given were: command line parsing, string internationalization.
  2. Concurrency + Distribution = Scalability + Availability, a Journey architecting Erlang Systems - a very interesting talk about the higher-level things you need to think about when designing any kind of distributed system.
  3. The Pendulum - batch processing vs. interactive computing in the eyes of Bruce Tate. Some interesting anecdotes about the history of computing, Java and scalability.
  4. Function-Passing, A New Model for Typed, Asynchronous and Distributed Programming - another Scala-themed talk, this time about a "new" model of distributed programming. If I understand the concept correctly it's not exactly new, for example PicoLisp uses something rather similar. But, this being Scala, this approach guarantees static, compile-time checking for functions which are not suitable for passing around.

TXR mode for Emacs

NOTE: txr-mode.el on GitHub

Working with TXR without syntax highlighting was getting frustrating, so I finally decided to write a mode for Emacs. It's very simple and offers syntax highlighting only - i.e. it scratches my own itch. It may be useful for others anyway.

I will probably expand it as I work with TXR. TXR proves to be a really nice language - its matching rules have some quirks, but once you learn them you can parse nearly anything without much hassle.


Avail and Articulate Programming

NOTE: Avail - a new language focused on articulate programming

Avail is a very interesting language. I only started working on it this week, just to take a look. I read a couple of examples and I was generally pleasantly surprised by what I saw. So here are my first impressions.

There is a problem with Avail syntax highlighting. Well, not exactly a problem, it just doesn't exist; not on the project page, not in Emacs, not anywhere I looked.

This is a shame, because even simple highlighting would help beginners. On the other hand, the language is built up so that, when you get used to it, you don't need any syntax highlighting. This is - I think - the articulate thing Avail's page talks about.

As usual I started by trying to write some tiny piece of code. It's easier to show an example, so here goes:

Module "Hello World"
Uses "Avail"
Entries
    "Greet Router"
Body

Method "Greet Router" is [
    socket ::= a client socket;
    target ::= a socket address from <192, 168, 1, 1> and 80;
    http_request ::= "GET / HTTP/1.1\n\n"→code points;

    Connect socket to target;
    Write http_request to socket;
    resp_bytes ::= read at most 440 bytes from socket;

    Print: "Router says: " ++ resp_bytes→string ++ "\n";
];

The most important things you can see here are the relatively lightweight syntax and the weird method names. Avail gives you full control over how your program is going to be parsed. When you declare a method you're building a piece of Avail's parser - this means that you can make your function application look however you wish. For example, to have your function called like read at most..., you do this:

Method "read at most _ bytes from _" is [
    ...
];

At this point Avail looks like a cross between Inform 7 and Perl - cute!

But Avail doesn't end there. It supports multiple dispatch of methods, for example; also a sophisticated module system for organizing code, multiple inheritance...

Oh, and did I mention that Avail is actually a statically typed language? One with a very sophisticated type system, which is both powerful and non-imposing. As you can see, when you write something script-like, you may do so without being bothered by types.

Avail runs on the JVM and comes with a Workbench - a simple graphical code runner. Once inside, you can compile and load modules and evaluate arbitrary expressions using the Availuator (a simple REPL).

The Avail standard library is not big, and is mostly focused on Avail itself. From what I can see, Avail is mostly written in itself and many of its inner parts are normal modules you can load, import and manipulate (see the RPN calculator example).

That's all I know for now, I will report back later.


Pharo 64bit dependencies on Fedora

Pharo is a modern, clean Smalltalk-like language with a beautiful, programmable GUI environment. I think Pharo is the easiest way of developing multiplatform, desktop-style GUI apps. I have one such GUI app in mind, so I downloaded Pharo hoping to play with it for a bit.

The problem is that there is no 64-bit version of Pharo, and I'm (obviously) running a 64-bit OS. This means that I needed to download additional libraries for 32-bit programs, even though I already had the 64-bit versions of those libraries.

The real problem, however, was that I couldn't find any info on which libs exactly are needed. The only list, posted on a Pharo forum, was written with Ubuntu in mind, and Fedora often packages things differently than Ubuntu does.

After a bit of googling and experimenting, I found the deps, installed them and Pharo launched and worked. Here they are, for those in a situation similar to mine:

sudo dnf install -y \
    mesa-libGL-devel.i686 \
    mesa-libGLU-devel.i686 \
    libgcc.i686 freetype.i686

Using jq for parsing Chrome bookmarks

So I started looking at jq a few days ago, and it has proved to be a nice tool. It does have its problems and a few rough corners, but it works for the most part.

Every once in a while I feel like doing something with my collection of bookmarks. It's rather large, with many thousands of entries, almost completely unstructured and only exportable to an ancient, very-old-HTML-based format which chokes lxml.

Fortunately I found out that Chrome keeps bookmarks internally in a simple JSON file. Neat! You can find it in your Chrome profile directory as a file called Bookmarks.

First, I wanted to know how many bookmarks I have. Bookmarks are stored in a tree, with branches being "folder"s and leaves being "url"s. A bookmark (of both kinds) is represented as a simple JSON object:

    {
        "date_added": "13088592023600468",
        "id": "1939",
        "name": "explainshell.com - match command-line arguments to their help text",
        "sync_transaction_version": "186",
        "type": "url",
        "url": "http://explainshell.com/"
    }

Folders additionally have a "children" field, which is a list of folders and urls.

To know how many bookmarks there are, we need to visit every node in the tree, check if it's of the url type, and increment a counter if it is.
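Spelled out in Python, that traversal would look something like this (with a tiny inline stand-in for the real Bookmarks file):

```python
def count_urls(node):
    """Visit every node in the tree, counting entries of type "url"."""
    count = 1 if node.get("type") == "url" else 0
    for child in node.get("children", []):
        count += count_urls(child)
    return count

# A tiny stand-in for the "roots" object of a real Bookmarks file:
roots = {
    "bookmark_bar": {"type": "folder", "children": [
        {"type": "url", "name": "explainshell", "url": "http://explainshell.com/"},
        {"type": "folder", "children": [
            {"type": "url", "name": "sqlformat", "url": "http://sqlformat.org/"},
        ]},
    ]},
    "other": {"type": "folder", "children": []},
}
print(sum(count_urls(root) for root in roots.values()))
```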

With jq it looks a bit different:

    cat Bookmarks | jq '[ .roots[] | recurse(.children?[]?) ] | length'
    # 1932

Neat! The terseness is sometimes appreciated, especially for interactive work.

Converting all the bookmarks to a more convenient format is easy, too:

    cat Bookmarks | jq '.roots[] | recurse(.children?[]?) | {n:.name?, u:.url}'
    # {
    #   "n": "Futures of text | Whoops by Jonathan Libov",
    #   "u": "http://whoo.ps/2015/02/23/futures-of-text"
    # }
    # {
    #   "n": "SQLFormat - Online SQL Formatter",
    #   "u": "http://sqlformat.org/"
    # }
    # ...

It's easy to save intermediate results by just redirecting output to a file, i.e. cat fname | jq 'cmd' > out.json.

I have another source of bookmarks, which is OneTab. I parsed its export format with a bit of Python (laziness on my part - I could have done it in jq) and dumped the results to another file.

To join them:

    jq '[ .[] ]' flat_bookmarks.json onetab.json > all_flat_bm.json

    cat all_flat_bm.json | jq 'select(.n | contains("emacs"))'
    # {
    #   "u": "https://github.com/emacs-tw/awesome-emacs#interface-enhancement ",
    #   "n": " emacs-tw/awesome-emacs"
    # }
    # {
    #   "u": "http://oremacs.com/page2/ ",
    #   "n": " (or emacs ? irrelevant)"
    # }
    # ...

I think it works quite well for what it's designed to do. It actually has an interesting purely functional language inside, but it's not documented (or at least I couldn't find any docs). With "advanced features" like variables and function (actually, filter) definitions it should be possible to express even the more complex JSON queries and transforms.


Lovely Scheme for Python (and C#)

I found a curious specimen today, something called Calysto Scheme. It's a full Scheme compiler implementation, written in itself and compiled to Python. That is - a fully bootstrapped Scheme compiler (and supporting runtimes in Python and C#). It's apparently only "a few orders of magnitude slower than Python", but hey, it's a complete language (THE language, according to SICP). It's going to be great fun digging through the internals. Working with bootstrapped compilers is always a bit weird, but that's what makes it interesting.

How did I find it? Well, of course by reading Hy mailing list!

OK, so what's Hy?

Hy is a wildly popular Lisp-like language that compiles (no "trans" there, thank you) to the Python AST. It has a simple runtime and an almost 1-to-1 correspondence with Python. There is no interop to speak of - Hy objects are Python objects and nothing more.

At the moment Hy does little more than provide a couple of wrappers over built-ins and an s-exp based, macro-enabled syntax. Which is not bad at all: with macros it can easily grow into anything you'd like.


Ahhh... Fresh meat! (I mean, languages!)

NOTE: Thanks to kazinator - the TXR author - and his advice I rewrote the TXR example. It's much prettier now.

In the last couple of weeks I found (or started learning) two rather interesting languages. They are not exactly new, each of them has a few years of development behind it. These are:

TXR - text and data munging

This one I found about half a year ago, concluded that it looks nice and left it for later. Then, last weekend, I had to do some command line text transformations on config files and such, so I decided to ditch my trusty zsh and AWK and try TXR.

Unexpectedly, it worked. TXR is a templating engine which tries to work both ways, like Boomerang, but it's much easier to start with. Well, this is subjective: I'm in favour of Lisp in the eternal conflict between Lisp and ML, which makes me accept Lisp-looking things quickly. Someone else could say it's the opposite for them.

Anyway, TXR does look like a Lisp, because it has a Lisp built in. In general, TXR code is interpreted in one of three modes:

  • template matched against incoming data (destructuring input)
  • template for outgoing data (formatting output)
  • Lisp code

As you probably guessed already the third option can do everything the other two can and much more. Also, you can mix the modes relatively freely.

The only "problem" with TXR is that it uses the @ sign as an escape character. It's also partially whitespace-sensitive. It looks a bit weird, and it only has syntax highlighting for Vim (I'm working on an Emacs mode). In a word: it's not perfect, but it's a really solid tool for the job of text mangling. For example, I had to change my .gitmodules file:

[submodule "@mod"]
    path = @mod
    url = @submodule_url
@;execute a command and use its output in matching
@   (next `!cd @mod && git remote -v`)
@_  @remote_url (fetch)
@   (output)
[submodule "@mod"]
    path = @mod
    url = @remote_url
@   (end)

jq - structured data processing and transformations

While TXR handles both free-form and structured input, jq focuses on the latter kind. It's meant as a tool for querying and transforming large JSON documents, but of course it includes a full-blown programming language too.

I haven't used it yet, but looking at RosettaCode examples it seems somewhat similar to Tulip in its goals: to be a very convenient command-line tool which happens to be a powerful language. I can't say anything about how well this idea is actually implemented, unfortunately.

It also looks like it borrows a bit of semantics from concatenative languages, which makes it easy to use point-free style (like in Haskell or J).


Literate J - another stab at readability

I was browsing for some cool Emacs packages and I encountered Swiper. Its author has a blog, and I started reading his archive. There I found one post about J: Try J

I enjoyed reading the post, though not so much the pasted examples. One of the examples looked like this:

createTable =: [:|:,.&a:
union =: ([:{.[),:[:([:<[:;]#~~:&a:)"1([:{:[),.([:{:]){~([:{.])i.[:{.[
insert =: [union[:({.,:([:([:<[:(]$~[:1&,#)>)"0{:))[:|:[:>]
rows =: {.([:<,:)"1 1[:|:[:([:<"1[:,.>)"0{:
value =: [:(]`{.@.([:1&=#))[:,[:>[((([:<[)=[:{.])#[:{:])[:>]
pr =: [:,[:(([:>[)(([:<([:>[),.[:>])"0 0)"0 1[:>])/[:([:<[:rows>)"0]
join =: 1 : 0
    ((([:{.[:>{.),:[:([:<(>"0))"1[:{:[:1 2 0&|:[:>([:,u"0)#]) (pr y))
)

It's an absolutely perfect example if you want to scare someone to death, but it's impenetrable for anyone outside of the J community.

Most J code is written in this style, as it is faster to write for the initiated and it's somewhat readable for them too. But J can be written differently, too.

This is my try at making J code easy to read. It wasn't easy to write, though...

GA    =: noun define
Four score and seven years ago, our forefathers brought forth upon this
continent a new nation, dedicated to the proposition that all men were created
)

box         =: <
join_by     =: ;
all         =: "0
rows        =: "1
floor       =: <.
group_by    =: ]/. ~
running_sum =: +/\

mod =: dyad define
    floor x % y
)

splitstr =: dyad define
    <;.1 (x , y)                NB. v;.1 splits an array on its first elem as
                                NB. delimiter and calls v on each group
)

wrap =: dyad define
    len         =. x
    words       =: ' ' splitstr y
    lengths     =: # each words
    sums        =: box all (running_sum >lengths)
    row_numbers =: box all (>sums) mod len

    NB. big_table   =: words ,. lengths ,. sums ,.  row_numbers
    NB. echo ('word';'length';'sums';'row number') , 10 {. big_table

    val =: join_by rows (words group_by row_numbers)
)

80 wrap GA
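If the J is hard to follow, here is the same algorithm in Python: compute a running sum of word lengths, derive a row number for each word by integer division, then group words by row (a rough transliteration, not a faithful one):

```python
from itertools import accumulate

def wrap(width, text):
    words = text.split(" ")
    # running sum of word lengths (+1 for the trailing space), like +/\ in J
    sums = list(accumulate(len(w) + 1 for w in words))
    # row number for each word: floor(sum / width), like the J `mod` verb
    rows = {}
    for word, s in zip(words, sums):
        rows.setdefault(s // width, []).append(word)
    # join each group with spaces and the rows with newlines
    return "\n".join(" ".join(group) for group in rows.values())

print(wrap(20, "Four score and seven years ago our forefathers brought forth"))
```

Reading it this way makes the J version's pipeline of lengths → running_sum → row_numbers → group_by much easier to map back to the verbs.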

pprint.pprint versus betterprint.pprint

The built-in Python pprint module provides a way of dumping arbitrarily complex objects to output in a way that is readable for humans. It mostly works well, but I wanted something better - something that supports colors.

Turns out such a module already exists: betterprint. Here is how it looks compared to normal pprint:

The only problem with betterprint is that I couldn't make it work under Python 3. I didn't spend much time on it, though, so it may not be very complicated to fix.


Making the most out of man pages

Reading documentation is something that programmers do all the time, which makes it worthwhile to optimize the process of finding and reading the docs. There are various sources of documentation and many different formats docs come in. You can use docs put online as web pages, for example, or your editor may provide inline documentation for you. You can get even fancier with things like dash, for example. There is, however, one source of documentation which is very simple to use, almost always available and which covers many interesting things. Unfortunately, many people don't use it, because it looks ugly.

Well, that's somewhat true - manpages are meant to be read in the console, so their formatting is rather minimal. It's not that bad, actually, but it could be better. The good news is that they can be made a bit better looking without much hassle! So first, a side by side comparison:

I know, the difference isn't exactly staggering, but adding a few colors here and there really helps when you're scanning a loooong manpage looking for some specific information. So how do you make this happen?

First, install most. It's a pager, supposedly an improvement over less. It offers some interesting features, one of which is recognizing that a manpage is being displayed and coloring it. You can probably install most via your distro's package manager without any problems, for example on Fedora:

$ sudo yum install -y most

Then you need to set most as the PAGER which the man command will use. You could just use most as your default pager for everything, of course, but I opted for using it for man pages only (I don't remember why, though). It's enough to add the following to your .bashrc or .zshrc:

# Use `most` as a pager for man pages if installed, fall back to `less`
# otherwise
function man () {
    if hash most 2>/dev/null; then
        PAGER="most" /usr/bin/man "$@"
    else
        /usr/bin/man "$@"
    fi
}

And that's it - next time you invoke man you'll get your documentation colored. Note that most comes with built-in help: once inside most you can press the h key to get a list of all the commands available to you. Press space to show the next page of the list.

Also, it's worth reading man most - kind of meta - most is reasonably configurable and all the options are listed there.

I hope this tip will save you some time and/or frustration when dealing with manpages. It did the trick for me, at least.


Changes on the blog

As you can probably guess I'm using a static site generator for creating and publishing this blog. That's pretty normal, but I decided to write my own generator instead of using a pre-built one. Normally it's ok, as I tweak and (hopefully) improve it bit by bit.

The problem begins when the improvement I want to implement is bigger. Finding enough time to do this is hard and in effect many rather important features take ages to appear. Well, nothing unusual here, it's just the way personal pet projects work.

Ok, so here is why I'm talking about this right now: after a couple of weeks (maybe months) I finally finished implementing two features I wanted to have!

First: every post, in addition to being displayed on the front page, also has a separate page of its own. Right now you can get to this page either by clicking a link in the posts list to the right or by hovering over a post title and clicking the "¶" character that appears at the end of the title.

Thanks to this I can now have comments, kindly (and for free) powered by Disqus. I also added Disqus comment box to "Contact" page, where it's intended to serve as an easy and convenient (although not the fastest) way of getting my attention. For very urgent business using email is still the best option.

Coming up next: pagination on the main page, many CSS fixes focused on improving readability, a complete redesign of the "Languages" section and more. We'll see how long these things take - I wouldn't hold my breath - but I hope it will happen in September.

I won't stop writing posts in the meantime, so I hope you can bear with the less than optimal design until then.


LightTable is dead, long live...

If you read some of the posts on my blog you should probably know what I'd like to place instead of ellipsis. Indeed, I see no alternative for this editor right now, but that's not the point of this post. Rest assured, though, I will get back to this subject near the end of the post.

First, I'd like to say that I'm very sad about what happened to LightTable. I was hoping for it to grow into a real editor sooner or later and I really wanted to migrate to it at some point. It promised to retain all the important ideas from my current editor(s) while adding a modern look&feel and some innovative techniques. I would be really happy were these promises fulfilled.

However, I was a bit skeptical from the beginning. In my comments I warned that it's impossible to create an editor on par with my current one(s) in just a few years. It's impossible to create thousands of plugins with literally millions of lines of code in such a short time.

I was worried about APIs, too. And the tooling. While Chrome dev-tools are generally great I was afraid that they wouldn't fit well into the domain of text editors. Also, the docs - there are whole books about the editors I use, on top of many, many thousands of pages of online docs. It would take LT a long time to catch up to this.

Still, I wanted to be as optimistic as possible, I talked about LT with friends and tried out some of its versions. It wasn't that bad, really, and I believed for a long time that things are going in a good direction.

But every time I used LT I was becoming more and more worried. That is because my way of using editors is, first and foremost, to learn their customization and scripting capabilities. LT tried to be scriptable, but the APIs were all undocumented. And customization wasn't its greatest strength, either.

Still, I believed that it was all going to get fixed sooner or later. It's just a ton of work, I thought, and it may take a long time. But as long as there were contributors and a will to work on the project, I thought it was going to be fine.

I stopped following LT development for a while. Because of this, I only today learned that no, it won't ever work: the main author lost interest in LT somewhere along the road. He honestly admitted this in a post to the LT mailing list, in April.

It turns out Chris wasn't interested in creating a good, innovative editor for programmers. He wanted to transform the whole practice of programming, or so he says. Once it became clear that LT is not going to succeed any time soon, Chris turned his attention to other, even more ambitious projects. I understand his motivation.

I felt somewhat betrayed. I really wanted this new editor. I thought that Chris knew what is needed for an editor to become really good and was prepared for a long march, developing LightTable for many years. Turns out he wasn't.

In the mentioned post to the mailing list Chris enumerates a couple of reasons for which he decided to abandon the project. They're mostly technical and I'd love to go over them in a detailed post. Maybe some other day.

Anyway, here it is, the word missing from the title. Yup, it's Emacs - the only editor capable of implementing all the magnificent features LT promised to provide. There's also Atom, but again - it needs at least a couple of years of development before it reaches the extensibility of Emacs. Will the authors stick with Atom long enough for this to happen? And even when Atom reaches the point where every feature of Emacs could be implemented in it, the real fun begins: writing plugins. Lots and lots of them. Both Emacs and Vim (and NeoVIM by extension) have insane numbers of plugins. It would take another couple of years to get anywhere near this level.

Meanwhile I still wish for a better UI, an embedded (real) browser and many other features which are very hard to do in Emacs with its 30+ years of code in it. My hope for this was seriously shaken, though, and I'm not going to hold my breath anymore. It's a pity, but for now there's simply no alternative to Emacs for me.

I think some of my next posts will be about detailing features I'd like to have in an ideal editor.


StumpWM - warn about low battery

The work laptop I'm using right now is rather old (the new one broke, and I'm waiting for a replacement), and it can't last as long on battery as it should. A few posts back I posted my StumpWM mode-line configuration. You can see that there already is a battery capacity meter there.

The problem with this is that I don't look at it frequently. I can go on using my computer for hours without looking at it a single time. You can probably guess where this is going: the computer died a couple of times because the battery ran out, and I didn't realize it.

I thought that I should have some alert to remind me to plug in the power cord. I couldn't find any ready-made solution for StumpWM, so I decided to roll my own. It wasn't that hard, actually, only a couple of lines of code.

Here it is:

(defvar *current-battery-status* 100)

(defun low-battery-alert (&optional bat-percentage)
  ;; use the global value if the caller didn't provide one
  ;; (for ease of testing in the REPL)
  (unless bat-percentage (setf bat-percentage *current-battery-status*))
  (let ((*message-window-gravity* :center) ; make the alert appear at the center of the screen
        (*timeout-wait* 6))                ; wait 6 sec before making the alert disappear
    (message "Your battery is running low!~%Only ~s% remaining..." bat-percentage)))

(defun get-battery-status ()
  "Needs the upower utility to be installed in the system, i.e. on Fedora:

$ sudo dnf install -y upower

  Also, try running:

$ upower --enumerate

  to learn what your battery is called. "

  (let* ((command (concat
                   "upower -i /org/freedesktop/UPower/devices/battery_BAT1 "
                   "| grep perc "
                   "| awk '{print $2}'"))
         ;; the command above returns a string looking like this: "11%\n"
         ;; we need to get rid of a percent sign and a newline char then convert
         ;; it into a number, which READ-FROM-STRING does for us
         (new-battery-status (->> (run-shell-command command t)
                                  (string-trim (string #\newline))
                                  (string-trim "%")
                                  (read-from-string))))

    (when (and (<= *current-battery-status* 15)
               ;; show alert only when battery status changed since last check,
               ;; but don't do this if the new value is greater than the last
               ;; (this means we're probably plugged in and charging already)
               (> *current-battery-status* new-battery-status))
      (low-battery-alert new-battery-status))

    (setf *current-battery-status* new-battery-status)
    (format nil "~s%" new-battery-status)))
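
The alert condition above boils down to "still low and still draining"; spelled out in Python terms (just to make the logic explicit, this isn't part of the config):

```python
def should_alert(previous, new):
    """Alert only if the last reading was already low (<= 15%) and the
    new one is lower still - a rising value means we're charging."""
    return previous <= 15 and new < previous
```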

It works well enough for me. I'm putting it here in case someone has a similar problem. If you're interested in making this a StumpWM contrib module, please let me know, I'll try packaging it properly.

I'm not doing it right now because the code is simple, and I have no need for it to be accessible anywhere outside of my .stumpwm.d. I also don't know yet how to declare and use packages in Common Lisp, but that's just a detail...


The shortest implementation of circular Linked List (LiveScript)

The linked list is one of the basic data structures. It's very simple conceptually and easy to implement, so people sometimes overlook it. Linked lists, however, have a couple of interesting features, one of which I'd like to explore here.

Some time ago I had to implement a simple 3-state button. It's easy: you have an array of state names and an index that you increment on each click. You then use either a modulo operation or a simple if statement to make the first state come up as the "next" state after the third one.

So, how could a linked list help us here? It's easy: linked lists can be circular. All it takes is to link the last node to the first, and it's done. You can then bind your linked list's "next" method directly to some event. It will just work, even if you later decide that you need to add a couple of new states to your list.

Unfortunately, there is no linked list implementation in JavaScript standard library (if these several objects can be called that). And implementing a new data structure for a thing as simple as a button seems like overkill. But is it really?

Let's get to the main point then: no, it's not hard, and it's not a lot of code. I'm using LiveScript here, which makes the code more succinct, but you could implement a circular list in plain JS with only a couple more lines of code, I think. My implementation looks like this:

{first, last} = require "prelude-ls"

make-circular = (...values) ->
    objects = (for let val, i in values => {val: val, next: -> objects[i+1]})
    (last objects).next = -> first objects
    first objects

circular = make-circular 1, 2, 3

for _ to 10
    process.stdout.write circular.val + ", "
    circular = circular.next!

# prints:
# 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2,

What happens here? We take a number of values, then we wrap them with linked list nodes. We connect the last node to the first and return the first node.

There are some noteworthy details in this implementation. First is the use of the for let construct - it's fairly standard nowadays, but still important for the implementation, as the code wouldn't work at all with the let part removed.

The second thing is the fact that next is a method and not an attribute. In a lazy language we wouldn't need to do this, but LiveScript is evaluated eagerly. Unlike Haskell, LiveScript would try to evaluate the expression objects[i+1] immediately. But at that point the objects array doesn't exist yet (or rather, is not initialized yet), so evaluating the expression would cause an error. By making next a method, we defer evaluation of this expression to a later time, when the objects array already exists, thus avoiding the problem.
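
The same trick ports to any eager language. Here's a hypothetical Python version, where a zero-argument lambda plays the role of the next method (and a default argument pins down each i, much like for let does):

```python
def make_circular(*values):
    # each node's `next` is a thunk, so nodes[i + 1] is only looked up
    # when called - after the whole list exists; `i=i` freezes the index
    nodes = [{"val": v, "next": (lambda i=i: nodes[i + 1])}
             for i, v in enumerate(values)]
    nodes[-1]["next"] = lambda: nodes[0]  # link the last node back to the first
    return nodes[0]

node = make_circular(1, 2, 3)
out = []
for _ in range(7):
    out.append(node["val"])
    node = node["next"]()
print(out)  # [1, 2, 3, 1, 2, 3, 1]
```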

Back to the previous question, then: is it overkill to use a circular linked list in JS, even for a simple 3-state button? No! It's just 4 lines of code! At that line count you can just copy&paste this snippet between your projects without worrying too much, in my opinion.


A simple chat example in Racket

NOTE: All the code is available on GitHub

A few posts back I wrote a very minimal chat server in Racket. It went almost too well and I decided to do some more work on it. After a couple of evenings, it transformed into something bigger, although I'm not sure if it's better than before.

The thing is built with Racket on the backend and LiveScript, Ractive and Bacon.js on the frontend. It's interesting (at least to me) because it uses a feature of Racket that's not easy to find in other languages: continuations. Implementing a long-polling strategy was much easier thanks to them. Actually, to switch from polling to long-polling, I just had to move a single line of code: a single sleep function call. I'm not joking - for each connection (request) the client only needs to accept a url as a response and immediately query that url, in a loop. The rest, including a spin-lock for waiting on data, is implemented in the backend Racket code.
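
The client side of that scheme can be sketched like this (illustrative Python; the response shape, with a messages list and the next url to hit, is my assumption about the protocol - check the repo for the real one):

```python
def poll_loop(fetch, url, handle, rounds=3):
    """Drive the long-polling cycle: each response both carries the new
    messages and tells us which url to query next; the server decides
    how long every single request blocks."""
    for _ in range(rounds):
        data = fetch(url)          # blocks until the server has something to say
        handle(data["messages"])
        url = data["url"]          # immediately reconnect to the url we were given
```

In the real app fetch would be an XHR/HTTP call; here it's just any callable, which also makes the loop easy to test.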

The core of the implementation looks like this:

(define messages (box (list)))

(define (add-message! msg-data)
  (let ([msgs-list (unbox messages)]
        [new-msgs (cons msg-data msgs-list)])
    (match (box-cas! messages msgs-list new-msgs)
      [#f (add-message! msg-data)]
      [#t #t])))

(define (check-messages req)
  ;; The client just connected, so we need to send it all the messages to
  ;; initialize its state. (NOTE: this is thread-safe as it's just a read and
  ;; the data is immutable (and in a box))
  (send-resp! url
    (make-json-response (get-messages) url))

  ;; Then we loop endlessly, polling for changes in messages. When a change is
  ;; detected, we send the new list of messages to the client. We also make the
  ;; client reconnect once every 30 seconds, because of some weird "security"
  ;; features in Chrome.
  (let loop ([local-messages (get-messages)])

    (define new-messages null)
    (define time-elapsed 0)
    (define sleep-time 0.2)

    (let inner-loop ()
      ;; A simple spin-lock of a kind. This should be rewritten to use some
      ;; kind of event bus/queue.
      (sleep sleep-time)
      (set! time-elapsed (+ time-elapsed sleep-time)) ; not good! move to rec call

      (if (or (> time-elapsed 30)       ; there are some timeouts in FF and
                                        ; Chrome for XHR connections, we need to
                                        ; close ours before that happens
              (not (equal? local-messages (get-messages))))

          ;; then
          (begin
            (when (debug-enabled?) (pretty-display req)) ; should be a logger call instead
            (set! new-messages (get-messages)))

          ;; else - no changes and no timeout yet, keep spinning
          (inner-loop)))

    (send-resp! url (make-json-response new-messages url))
    (loop new-messages)))

(define (posts-list req)
  (match (request-method req)
    [#"GET" (check-messages req)]
    [#"POST" (begin
               (add-message! (parse-request req))
               (resp! #"OK"))]))
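
The add-message! above is a classic lock-free retry loop over box-cas!. A rough Python model of the same idea (the Box class is my stand-in for Racket's boxes, not a real library):

```python
import threading

class Box:
    """Toy model of a Racket box with a compare-and-set operation."""
    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def unbox(self):
        return self._value

    def cas(self, expected, new):
        # atomically swap only if nobody changed the box in the meantime
        with self._lock:
            if self._value is expected:
                self._value = new
                return True
            return False

messages = Box(())

def add_message(msg):
    while True:  # retry until our snapshot of the box is still current
        old = messages.unbox()
        if messages.cas(old, (msg,) + old):
            return
```

The point of the retry: between unbox and cas another thread may have prepended its own message, in which case cas fails and we simply take a fresh snapshot and try again.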

The front-end implementation to cope with this was rather hard, because I wanted it to be general enough to work with many possible polling strategies, and I also wanted to make it compatible with Bacon.js for use as an EventEmitter. The code for this is less than elegant and I had to use the dreaded type annotations (in comments, ofc: "Every sufficiently advanced Lisp implementation contains an ad-hoc, badly-specified, bug-ridden implementation of half of Haskell's type system." and all that). Otherwise I couldn't figure out what the hell was going on.

Please note that the retry implementation came later, after the main hurdle was overcome, which is why it's not annotated. It's also interesting in its own right, though. It looks like this:

### Bacon.js-compatible EventEmitters, or rather EventEmitter builders.

require! {q, qajax, baconjs}
qxJSON  = -> qajax(it).then(qajax.toJSON)

# config :: {url: <string>, ...}
# checker :: (data -> config -> config)
# simple_checker :: (data -> unit) -> checker
simple_checker = (callback) ->
    (data, config) ->
        callback data
        config.url = data.url
        config
retry = (url, base ? url) ->
    tries-count = 0
    res = q.defer()

    _inner = (url) ->
        qxJSON url
            .then res~resolve
            .fail (reason) ->
                tries-count := tries-count + 1
                if tries-count < 5
                    setTimeout (-> _inner(base)), 5000
                else
                    res.reject reason

    _inner url
    res.promise

# meta_checker :: string -> checker -> (config -> promise)
meta_checker = (callback) ->
    # _checker is going to be called many times with different urls, but it
    # should remember its first url as a "base" for use with retry
    original-url = null

    _checker = (config) ->
        {url} = config
        unless original-url
            original-url := url

        unless url
            return callback(new baconjs.Error("No URL provided!"))

        retry(url, original-url)
            .fail ->
                callback(new baconjs.Error("The server died"))
            .then (callback _, config)
            .then _checker      # connect again and wait for more data

    _checker

exports.meta_checker = meta_checker
exports.simple_checker = simple_checker

It wasn't all rainbows and unicorns, unfortunately. In fact, I encountered a problem which I haven't been able to solve yet. The problem is that the Racket backend uses too much CPU time. Despite the fact that there's nothing happening, sometimes CPU consumption spikes. The spikes become higher and higher, eventually reaching 100% and saturating one of the cores. I set up a very basic script which probed the CPU utilization of the Racket process each second over some time; the results look like this:

Looks a bit strange, right? Like it maybe has something to do with garbage collection, or maybe some other essential periodic background task. I don't know if I'll ever get the time to properly debug this problem. I'll try asking for help on a mailing list.

Still, the most important part is that I had a lot of fun coding this thing. The rest of the code is on GitHub; it's reasonably well-written (if I can say so myself), but not thoroughly commented. It could be of use for anyone who'd like to see how to integrate JavaScript (frontend code in general) with a continuation-based server. Such servers may not be very popular, but they do exist - for example Seaside in Smalltalk or Nagare in Python - so creating a strategy for dealing with them on the frontend may be valuable.

I'd like for the code to also be useful for learning one or both languages, but I'm not sure if it works that way. You'd need to read it for yourself.


Better (mode-line) for StumpWM

As I was using StumpWM for the past week or so, I realized that a desktop completely devoid of any "widgets" is not my kind of thing. I don't want to run six separate programs for monitoring different things like display backlight, network connection and so on. I want this information to be always available somewhere. Vanilla StumpWM comes with no GUI at all, save for a simple input box and echo area, but you can enable something called a "mode-line" (the concept is borrowed from Emacs).

Well, the mode-line by itself is just a simple gray bar at the top or bottom of the screen. You need to configure it to display anything useful.

So now I have, on the left side of the mode-line: a clock, groups (desktops) and the windows in the current desktop.

On the right side I crammed a couple of indicators: volume, backlight intensity, WiFi signal strength and battery status:

And here's the code for all this - you can paste it directly into your .stumpwmrc:

(load-module "cpu")
(setf *window-format* "%m%n%s%c")
(setf *mode-line-timeout* 0.7)

(defun get-volume ()
  (string-trim
   (string #\newline)
   (run-shell-command "amixer sget Master | awk '/^ +Front L/{print $5}'" t)))

(defun get-backlight ()
  (car (split-string (run-shell-command "xbacklight" t) ".")))

(defun get-signal-strength ()
  (string-trim
   (concat "." (string #\newline))
   (run-shell-command "awk 'NR==3{print $3}' /proc/net/wireless" t)))

(defun get-battery-status ()
  "Needs the upower utility to be installed in the system, i.e.

$ sudo dnf install -y upower"
  (string-trim
   (string #\newline)
   (run-shell-command (concat
                       "upower -i /org/freedesktop/UPower/devices/battery_BAT0 "
                       "| grep perc "
                       "| awk '{print $2}'") t)))

(setf stumpwm:*screen-mode-line-format*
      (list "^2%d^]    ^1[|>^] %g    ^1[|>^] %W "
            " ^> "                      ; remaining elements become right-aligned
            "^2%c %t^]  "               ; CPU usage indicators, from load-module
            ;; my own indicators:
            "   Vol:"   '(:eval (get-volume))
            "   Disp:[" '(:eval (get-backlight))
            "]  WiFi:[" '(:eval (get-signal-strength))
            "]  BAT:["  '(:eval (get-battery-status)) "]"))


Replacing KDE with a lightweight WM in Common Lisp

Many exciting things have been happening recently; I have another post on the Racket web server coming, and of course the Walkfiles Challenge. But I feel that this is the most unexpected thing that happened.

It was like this: in my Emacs I have the C-M-k keys bound to kill-sexp, which is rather important. A default configuration of KDE on my Fedora 22, however, decided that it wanted this particular key combination for itself. Pressing the keys would pop up a keyboard layout switcher. I tried unmapping and remapping this binding in KDE, but without much success. The best I could do was to disable it for one session - and then do it again after every restart.

I decided that it was enough and started looking for an alternative to KDE. On the Arch Wiki, I stumbled upon the StumpWM project, which apparently was written in Common Lisp. I had been contemplating learning some Common Lisp recently, too, so I thought I might as well try to use it.

The first five hours were excruciating. The damned thing comes without anything resembling a "normal" configuration. When you start it - I had to add a file to /usr/share/xsessions/ - it offers you an empty screen. Nothing more. Nothing at all. If you don't know that you must press "C-t h" to start learning the commands, you're screwed.

But, after some initial struggle I managed to connect SLIME to a running StumpWM instance. Nice! Now I have a REPL and can hack directly on my running WM! I can also search for docs, inspect values and more.

Being able to search doesn't help if you don't know what you're looking for, so it took me a couple of hours to find all the fundamental commands. A working "Go to definition" was a huge help.

So, after the initial struggle I started to think about how to replicate the functionality I wanted in StumpWM. I needed a working "Alt+Tab" (window switcher) and some virtual desktops for grouping windows. It turns out StumpWM doesn't have a fixed set of virtual desktops; you create them as needed. They are called "groups": you give them a name, and then you add windows to them. The only problem is that they are not ordered in any particular way - I liked the grid-like metaphor of virtual desktops in KDE.

Binding "Alt+Tab" was rather easy, too. What is not obvious to me right now is how to make StumpWM cycle through possible frame configurations. But I'm sure I'll get there eventually...

UPDATE: A few days later...

I'm mostly satisfied with the experience. It does deliver on its promise of a live, Lisp-based environment. It's completely hackable during runtime. The ability to connect SLIME to a running instance of your Window Manager is somewhat magical, especially if you know of slime-connect.

StumpWM did die on me a couple of times, but it was a direct result of me evaluating something stupid in the REPL. The lack of documentation is somewhat remedied by fabulous introspection tools and a decent amount of docstrings. The codebase is not tiny, but not yet big: don't expect to read it in one sitting, but you can do it over a weekend.

Other than that, it does what it's supposed to do: it manages windows. After writing some 200 sloc, I've got it to be mostly compatible with how I use the split-screen functionality in both Emacs and tmux.


Racket - still as beautiful as ever!

Some time ago I learned and used Racket for a bit. It was a very pleasant experience, even taking into account that it was my first Lisp ever. It has tons of very interesting features, which make it one of the most expressive and elegant languages and environments.

Last weekend I wanted to write something, anything, in Racket. In the previous week, I was working on a Chicken Scheme project, and I came to miss Racket's conveniences (although Chicken has many of them, too, as packages).

In the end, I wrote a "simple chat server". It's very basic stuff: we have a list, and each time a request comes in it is added to that global list. Additionally, we have an endpoint that returns the current contents of the list.

Writing something like this is trivial with Flask, for example. Doing this in Racket is not very hard either, and we get a couple of nice things like multicore parallelism and continuations.

The code is without comments, but it's cleaned up and should be readable by anyone with a bit of lisp familiarity. It looks like this:

#lang racket
(require json)
(require rackjure/threading)
(require web-server/servlet
         web-server/servlet-env)

(define *messages* (box '()))

(define (add-message! msg-data)
  (let ([msgs-list (unbox *messages*)]
        [new-msgs (cons msg-data msgs-list)])
    (match (box-cas! *messages* msgs-list new-msgs)
      [#f (add-message! msg-data)]
      [#t #t])))

(define (create-response/text content)
  (define headers null)
  (response/full 200 #"OK" (current-seconds)
                 #"text/plain" headers
                 (list content)))

(define (my-dispatch req)
  (match (request-method req)
    [#"GET" (~> (unbox *messages*)
                jsexpr->string
                string->bytes/utf-8
                create-response/text)]
    [#"POST" (begin
               (~> (request-post-data/raw req)
                   bytes->string/utf-8
                   string->jsexpr
                   add-message!)
               (create-response/text #"OK"))]))

(serve/servlet my-dispatch
  #:launch-browser? #f
  #:servlet-path "/post/" #:listen-ip "" #:port 8081
  #:extra-files-paths '("/home/cji/poligon/lanchat/frontend/")
  #:server-root-path "/home/cji/poligon/lanchat/backend/")

You run it normally, with Racket:

    $ racket server.rkt
    Your Web application is running at http://localhost:8081.
    Stop this program at any time to terminate the Web Server.

Power over intuition

NOTE: This post is about text editors. You were warned.
NOTE: I uploaded missing image on 2015-08-01

There is a reason why most programmers don't use Notepad for writing code. While Notepad is a perfectly fine text editor, it lacks many features programmers expect from an editor specialized in editing source code.

What is the difference between Notepad and, say, Emacs? It's relatively simple: Emacs is extensible, while Notepad is not. The same can be said for Sublime, Komodo, PyCharm - they all expose APIs for use with custom scripts, allowing their base functionality to be extended.

What is the difference between Other Editors and Emacs, then? I'd say the ease of extension and a huge set of already created plugins. While the second claim can be verified by visiting melpa.org, we still need some evidence for the first one.

A case study, then: I wanted to be able to quickly open a named buffer with a given extension. In other editors you usually get this feature via the File->New... menu, where you can choose a file type. I don't like menus, so I'm going to map it to some key combination, but you could create a menu for this inside Emacs, too.

I decided to use the C-n C-[something] key combination. [something] represents a type of file, so the mnemonic is: n(ew) p(ython), n(ew) t(ext), etc. for other types.

Without worrying about the details, the new bindings are defined like this:

(defvar my-new-buffer-map (make-sparse-keymap))
(global-set-key (kbd "C-n") my-new-buffer-map)

;;                   type            key       file ext     default path
(make-buffer-opener text         (kbd "C-n")    ".txt"     "~/todo/")
(make-buffer-opener org          (kbd "C-o")    ".org"     "~/todo/")
(make-buffer-opener python       (kbd "C-p")    ".py"      "~/poligon/python/")
(make-buffer-opener emacs-lisp   (kbd "C-l")    ".el"      "~/.emacs.d/")
(make-buffer-opener artist       (kbd "C-a")    ".art.txt" "~/poligon/")
(make-buffer-opener livescript   (kbd "C-j")    ".ls"      "~/poligon/lscript/")

The make-buffer-opener is a little macro I wrote, nothing interesting there. It works well, I can press C-n C-a to get a fresh artist buffer for example. I can also press C-n C-h to get a list of options:

    Global Bindings Starting With C-n:
    key             binding
    ---             -------

    C-n C-a         my-new-artist-buffer
    C-n C-j         my-new-livescript-buffer
    C-n C-l         my-new-emacs-lisp-buffer
    C-n C-n         my-new-text-buffer
    C-n C-o         my-new-org-buffer
    C-n C-p         my-new-python-buffer

But that's not ideal. We can do much better, and without much hassle. First, I had to change the macro slightly, so that it stores the openers in a list. This list will be used as the list of choices provided to helm. The code looks like this:

(setq my-new-buffer-helm-source
      `((name . "HELM at the Emacs")
        (candidates . ,my-openers)
        (action . (lambda (candidate) (funcall candidate)))))

(defun my-new-buffer-helm ()
  (interactive)
  (helm :sources '(my-new-buffer-helm-source)))

(global-set-key (kbd "M-n") 'my-new-buffer-helm)

And it looks like this in action:

So, what happened here? I just extended my editor to include a feature it didn't have earlier. I added a bit of a UI to the command so that selecting the proper buffer type is easier (with helm you can filter the list and navigate through it).

The most important part is how easy it was to do.


Walkfiles Challenge: OCaml!

But first, a bit of explanation of what's going on.

Writing the OCaml version was hard. I know the language much better than Nim. I'm also biased towards functional programming. This made me want to show the prettiest OCaml code ever. I wrote at least three different versions of the program: one optimized for speed, another trying to be as short as possible and the last, finally, focused on the "feel" of the language. It took much more time than writing the Nim version. With Nim, I was more or less satisfied - because I didn't know better, not because the code was that good! - with the first version of the code, so I just posted it after a bit of cleanup.

Because of all this I don't have the code commented as much as I'd like it to be. I will, I think, come back to it later, after I do some other solutions. Honestly, I've had enough of OCaml for a couple of days (maybe even a few weeks) :-)

Still, OCaml is a beautiful language. It's a pragmatic tool which tries to support you no matter the style you're using. It has imperative constructs and an object system. I didn't use these - I wanted the code to be easily comparable to the Nim version. I don't know if it worked; we'll see after a couple more implementations.

(* -*- mode: tuareg -*- *)
open Unix;;                     (* low-level OS and filesystem related stuff *)
open Printf;;                   (* easy to guess what it is, right? *)

(* like an option type, but also holds some info about what broke (useful for
debugging in this case) *)
type 'a either = Success of 'a | Fail of string;;

(* a simple record to hold directory children along with their count *)
type dset = { files : string list; count : int };;

(* this is safe because of immutability of dset type instances - there's no
way to modify this after its definition. *)
let empty_dset = {files = []; count = 0};;

let dset_add d path = {files = (path :: d.files); count = (succ d.count)};;
let dset_merge d1 d2 = {files = (d1.files @ d2.files);
                        count = (d1.count + d2.count)};;

let is_real_path = function
  | "." |  ".." -> false
  | _           -> true ;;

(* an option is easier to deal with than exceptions *)
let stream_pop stream =
  match Stream.peek stream with
  | Some value -> Stream.junk stream;
                  Some value
  | None       -> None ;;

(* we don't really *need* the stream, we're going to consume it whole anyway.
It's just that a Stream is a convenient abstraction for how open-, read- and
closedir work. *)
let stream_to_list s =
  let rec _loop output =
    match stream_pop s with
    | None -> output
    | Some v -> _loop (v :: output)
  in
  _loop [] ;;

let dirstream dirname =
  try
    let dir = opendir dirname in
    let generator _ =
      try Some (readdir dir)
      with End_of_file -> closedir dir;
                          None (* meaning there's no more elements *)
    in
    Success (Stream.from generator)
  with Unix_error(code, func_name, arg) ->
    Fail (sprintf "Unix_error: %s (arg: %s) in %s"
                  (error_message code) arg func_name) ;;

(* Actual solution *)

let ls path =
  match dirstream path with
  | Fail msg         -> empty_dset
  | Success children ->
     (stream_to_list children)
         |> List.filter is_real_path |> List.map (Filename.concat path)
         |> List.fold_left dset_add empty_dset;;

let descendants root =
  let rec _loop remaining output =
    match remaining with
    | [] -> {output with files=(List.sort String.compare output.files)}
    | current :: tail ->
       let current = ls current in
       _loop (tail @ current.files)
             (dset_merge output current)
  in
  _loop [root] empty_dset;;

let children = descendants (Sys.getcwd ()) in
    children |> (fun x -> x.files) |> List.iter print_endline;
    print_int children.count; print_newline ();;

Walkfiles Challenge: Nim!

NOTE: I realized that I'm just bad at writing prose, and decided to leave this post as just the prettiest Nim code, without any commentary.

import os                       # filesystem-related stuff
import algorithm                # sorting
import strutils                 # string formatting

const MAX_PATH_LEN = 40         # completely arbitrary number!

type
  DirSet = tuple[res: seq[string], count: int]

# straight from C lib
proc getcwd(cstring, int) : cstring {.importc: "getcwd", header: "<unistd.h>".}

proc isRealPath(fname : string) : bool = fname != "." and fname != ".."

# both get the first element and delete it from a collection. This works with
# anything that defines [] and del operations.
template pop(s : expr) : expr =
  let tmp = s[0]
  s.del(0)
  tmp

proc currentDirectory() : string =
  var char_buf: array[MAX_PATH_LEN, char] # a chunk of memory like in C,
                                          # allocated on stack and auto-freed
  discard getcwd(char_buf, MAX_PATH_LEN)  # getcwd writes into supplied buffer

  return $( char_buf ) # $ for type array[char] (alias type cstring) returns a
                       # memory-safe, freshly allocated string

# Main code:

proc listDir(path:string): DirSet =
  if not path.dirExists:        # if path is invalid (not existing or not a dir)
    return (res: @[], count: 0) # it can't have any children

  var
    count = 0
    output = newSeq[string]()

  for kind, subpath in walkDir(path): # walkDir(p) is an iterator from os
    if subpath.extractFilename().isRealPath:
      output.add(subpath)
      count += 1

  result = (res: output, count: count)

proc descendants(root: string) : DirSet =
  var
    count = 0
    output : seq[string] = @[]
    remaining = @[root]         # let's start with the dir we got from caller

  while len(remaining) > 0:
    var current = remaining.pop()
    let (kids, c) = listDir(current)
    # note: add with empty seq is a no-op
    output.add(kids)
    if c > 0:
      remaining.add(kids)       # save all the files for visiting later
      count += 1                # also count dirs while we're at it

  result = (res: output, count: count)

when isMainModule:              # only compiled if not to dll
  var (results, dircount) = currentDirectory().descendants
  algorithm.sort(results, cmp)    # cmp is from system module

  for fname in results: echo fname
  echo "We've counted $1 directories." % [$(dircount)]

Walkfiles - some introductory explanation

NOTE: All the code is already on github: GitHub repo

What are we going to build

The thing I want to build is a CLI program which gives the same results as a following command:

find $(pwd) | sort

You can see an example of results we want to get here:

Such a task and such requirements make it easy to automatically check whether all the programs are correct, just by running diff over each program's output and find's output for the same directory.
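A minimal sketch of such a checker in Python (the helper name and the use of subprocess are my own, not from the repo):

```python
import subprocess

def check_against_find(candidate_cmd, directory):
    # Reference output: `find <directory>`, with its lines sorted.
    find_out = subprocess.run(["find", directory], capture_output=True,
                              text=True, check=True).stdout
    expected = sorted(find_out.splitlines())
    # Candidate output, sorted the same way; equality means the program
    # found exactly the same set of paths as find did.
    actual_out = subprocess.run(candidate_cmd, capture_output=True,
                                text=True, check=True).stdout
    return sorted(actual_out.splitlines()) == expected
```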

In the implementations I don't want to use ready-made functions, like Python's os.walk(), because not all languages have them. I found that the only API that is universally supported is the opendir(), readdir() and closedir() set of functions from libc, so I tried to use those everywhere, either literally or via language-provided (thin) wrappers.
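As a reference sketch of that approach in Python - os.listdir() is a thin wrapper around the opendir()/readdir()/closedir() trio, and the function name here is mine:

```python
import os

def descendants(root):
    # Worklist walk, mirroring the OCaml/Nim versions: every entry becomes
    # a result; only directories yield further children.
    remaining = [root]
    found = []
    while remaining:
        current = remaining.pop()
        try:
            # os.listdir() wraps opendir()/readdir()/closedir() and already
            # skips the "." and ".." pseudo-entries for us
            names = os.listdir(current)
        except OSError:
            continue  # a plain file or an unreadable dir: no children
        children = [os.path.join(current, name) for name in names]
        found.extend(children)
        remaining.extend(children)
    return sorted(found)
```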

What happens when we're done

Each language has its strengths, but also weaknesses. I'd like to compare the languages and think about which of them is best suited for what tasks. It's not about which language is better in general: they are all cute and I love them all!

What are we going to compare, anyway?

Coolness comparison

By far the most important metric: how much fun developing a piece of code was. It's very subjective, of course, but fun is very important - I don't want to use something which feels dull. Spoiler: sorry, Haxe.

Expressiveness comparison

This is about how much code you have to write to solve a given problem or how much boilerplate code you need to write to implement some common idiom. It's of course highly subjective too, but here at least I can paste a graph:

Performance comparison

How long it takes a program to execute once, from cold start to finish.

This, at last, is objective... well, mostly. It's a terrible benchmark, particularly hard on JITed implementations, but it corresponds well with the way CLI apps are used. Another nice graph follows (run in a directory with 4.5k descendant elements):


Observed differences on these metrics between languages are not that huge, but they are visible. I tried coding as idiomatically as possible in all languages, except for C++11, where I just went with what felt good and it worked.

For many languages I implemented the app more than once, trying to optimize them a little, or - in one case - trying to make them slow down a little. But that doesn't mean that any of the implementations is actually representative of the language! Everything I'm saying is only about this particular case.


Pretty plotting

I wanted to generate some kind of a graph based on a few numbers. My first thought was to try Python and matplotlib, but I was unable to quickly - in less than 5 minutes, say - find a relevant example.

I remembered that Racket has this absolutely great plotting package built-in and that it was rather fun to work with, so I gave it a go; after another 10 minutes I had a working script doing what I wanted. I then spent another couple of minutes cleaning the script up a bit, here it is:

#lang racket
(require plot)

(define (pt-cmp x y)
  (< (vector-ref x 1) (vector-ref y 1)))

(define-syntax-rule (sorted-points . pts)
  (sort (list . pts) pt-cmp))

(define (plot-times data)
  (parameterize ([plot-x-tick-label-anchor 'top-right]
                 [plot-x-tick-label-angle 30])
    (plot (discrete-histogram data #:y-min 0 #:y-max 0.5)
          #:y-label "time (s)" #:x-label #f)))

(plot-times
 (sorted-points
  #(main-haxe-cpp 0.0764)
  #(main-dylan 0.35676)
  #(main-nim 0.04616)
  #(main-ocaml 0.05328)
  #(main-ocamlopt 0.01532)
  #(main-haxe-neko 0.32184)
  #(main-py 0.05476)))

I'm pasting it here as a preview of what's coming in the next post: an analysis of the interesting parts of the programs and how I'm going to compare them. As for performance - I'm puzzled by the Python score, but the rest is more or less as expected.


Walkfiles - different languages, different approaches, same effect (mostly)

NOTE: This is a work in progress, it will be updated with all the details later.
NOTE: this is going to be a series of posts
NOTE: the code is available on GitHub

I lost my Internet connection for three full days. Moreover, those were days off from work. I was quickly becoming bored and tried to think of something fun to do which could be done without the help of Google. Unexpectedly, it turned out that programming may be such a task... if you've got the docs downloaded earlier, of course. I had a bunch, for some reason, so I decided to - surprise! - learn some programming languages.

Learning about differences and similarities between languages is the most fun way of learning new languages, and it also helps you better utilize the tools your main language gives you, so I went that route.

Implementation languages

  • Python
  • Nim
  • OCaml
  • Dylan
  • Haxe
  • Io, Pharo, LiveScript

I wanted to experiment with strongly, statically typed languages, because of both performance and maintainability. I especially wanted to try Nim and Haxe, then remembered about OCaml, so I included it too. There are more languages on the list; they're there just for fun. Some introduction to the languages, with reasons why I chose them:


Python

Currently my main language, so it was an obvious choice for a reference implementation. There is one interesting thing about this implementation: it's unexpectedly fast, faster than some of the compiled solutions!

Nim

  • compiled to single binary
  • able to link libraries statically
  • static typing with inference
  • supports macros
  • support for run-time polymorphism (multimethods)
  • garbage collected, memory safe, but with access to unsafe operations
  • whitespace sensitive syntax
  • multiparadigm language with procedural programming at its core

Theoretically very interesting bag of features. In practice I think it needs some more polish to make all the features fit together better.

OCaml

  • compiled to single binary
  • static typing with full Hindley-Milner inference
  • supports structural subtyping via objects and polymorphic variants
  • garbage collected, memory safe
  • multiparadigm: functional as base and imperative and OO added to mix

It's an old and well-known language, but it is still evolving with features being constantly added and new exciting tools being created. For example there's now a package manager (integrated with something similar to Python's virtualenv), OPAM, which greatly simplifies building OCaml projects.

Dylan

  • compiled to a single binary
  • optional static typing
  • garbage collected, memory safe
  • multiparadigm: functional and OO
  • uses multiple inheritance and multimethods
  • supports macros
  • very dynamic nature of a language, with late binding of everything

Compiles to a native binary, but provides a huge amount of dynamic runtime flexibility. This comes at the cost of being somewhat slower than other compiled languages. Dylan is also essentially abandonware, with very few resources for development, so go help if you can.

Haxe

  • compiled to a single binary
  • a wide range of other compilation targets
  • static typing with full type inference
  • garbage collected, memory safe
  • multiparadigm: functional and OO
  • supports macros
  • great both compile and run-time flexibility

Its main focus is on developing for many platforms simultaneously. This makes the compiler able to compile to many different targets like C++ or PHP. The language itself is interesting, although its designer chose to stick with the most classical syntax possible.

Io, Pharo, LiveScript

Just for fun, and to have a comparison with the most dynamic and flexible environments on the planet. They are slower, but the Io code is the shortest. Pharo code is hard to present properly, so I will skip it.

That's it - next in the series: presenting one of the implementations.


Elements of Style

Just a quick quote - another one - about writing skills in programming.

The Elements of Style is not exactly a development or coding book but a book on writing. To be a great developer you need to communicate clearly, simply and directly. Strong writing skills are essential to success. The book is just 100 pages long and you can read it in one evening. Re-read it every couple of months for full effect.

What kind of redundancy in code is good?

This is a tricky question. Of course, we all know that we should follow DRY and not repeat ourselves. On the other hand, the way human cognitive processes work makes repetition the best way of learning and remembering; putting things where you look is one of the greatest time savers when reading code, especially if the "thing" you place there is some out-of-band context information, which would be hard to infer from current code alone.

A related question is that of terseness vs. verbosity. Many young programmers want to write the shortest code possible, probably because it's fun. It's also harmful to the projects they work on and it's actively hostile to other programmers working on the same projects.

The amount of literature on the topic is astounding - both redundancy and terseness in the code are important parts of maintainability of a software project, and software maintenance is one of the most researched topics in Software Engineering. That's because maintenance is the most expensive, and the longest, stage in software lifecycle: can't argue with that.

Let's start by repeating a very famous quote, taken from a legendary book on programming, which every programmer should read:

Programs should be written for people to read, and only incidentally for machines to execute.

If you think about it, it's obvious: once the code is written, it's going to be modified many, many times, probably by different people. And every maintenance programmer (this could be you, remember!) needs to understand the code he's going to work on, starting from zero knowledge of the system.

To understand a piece of code, you need to do the work of a compiler and linker: you need to resolve all the symbols used and assign some meaning to them. Unlike a compiler, however, your brain has a limited number of things it can track simultaneously. IDEs and other tools help reduce the burden on code readers by adding tooltips, class diagrams and similar things, but there's only so much information an algorithm - well, at least until something passes the Turing test - can extract from the code without help from a programmer.

In dynamic languages and/or in the absence of such tools, you're left with a huge pile of unfamiliar names, lacking most of the context and jumping frantically all over the project hunting for function definitions. Sooner or later you're going to either give up and start guessing, or get a sheet of paper and a pencil and start taking notes. I've compiled many a notebook with such notes for various projects: you wouldn't believe just how much information you need to track to understand even simple code.

It doesn't need to be like this. Code that is this hard to fully understand should be considered bad code, plain and simple, and it shouldn't even be committed in the first place; instead the code should include all the information needed for its understanding in itself, neatly organized and easily reachable. As the highest deity in the programming pantheon wrote:

Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.

Yeah, exactly, let's do this! But how? And how does it relate to the questions stated at the beginning?

It's easy to see that we should optimize our way of writing code for ease of understanding first. To understand in this case means to learn a bunch of things and to remember them long enough to be able to extract all the meaning you need from a piece of code. As it turns out, the processes of learning and remembering are not exactly unknown to science.

First, all the relevant data should be easily reachable. This is because the human brain - in addition to being limited in the number of things it can track at once - has a rather sophisticated process of storing things in memory. The most important feature of this process is that short-term memory acts like a cache and is flushed after 15 to 30 minutes, if I recall correctly. It's also, unfortunately, an LRU cache: if you add too many things at once, it's going to silently drop some of them. This means that by making a reader of your code waste his time searching for relevant information, you're hindering his ability to understand your code.

This is why you should provide relevant info close to the point where it's needed. But what if it's needed in a couple of places? Wouldn't repeating it in all of them violate DRY? No, it wouldn't, if done correctly. There are so many tools for providing information everywhere it's needed without repeating yourself that it's really easy to do so.

But it's still going to be redundant. A comment stating that this or that function is meant for use with this or that module indeed is redundant: you could find the same information with grep, right? Not quite: you won't EVER find the intention behind the code if it's not explicitly written somewhere (which is why you should ALWAYS explain what you think your code is doing); but you certainly could learn where the function is currently called from that way. Avoiding redundancy in this case, however, has a huge cost: it makes every maintainer waste his time on grepping, then looking through the hits and trying to understand what it's used for, then getting back to the original code only to realize he needs to start almost from scratch, because he has already forgotten some other important piece of data.

In short: redundancy makes editing code harder, but eases its understanding. It's a continuum and we should balance the need for readability with the ease of editing. Very redundant, repetitive code is hard to work with, but the code without any repetition is very, very hard to follow.

It's also very important to know which things should be repeated or restated, and which repetition is bad. It's kind of easy, actually: if you find yourself repeating something verbatim, you know there's something wrong and you need to refactor. On the other hand, if you repeat something stated already, but use a different way of communicating it, it has a high chance of being the good kind of repetition. Why? Cognitive science explains it: different people can have very different cognitive patterns, and trying to learn from a description which does not fit well with a person's way of thinking is crazy inefficient.

Just yesterday I was editing a django-tastypie resource and I saw the bad kind of redundancy - the kind which makes editing harder but doesn't make understanding easier. I decided to rewrite it in the "good redundancy" style to hopefully show the difference. Here is how it looked before:

def prepend_urls(self):
    return [
        url(r"^(?P<resource_name>%s)/login%s$" %
            (self._meta.resource_name, trailing_slash()),
            self.wrap_view('login'), name="api_login"),

        url(r'^(?P<resource_name>%s)/subscribe%s$' %
            (self._meta.resource_name, trailing_slash()),
            self.wrap_view('subscribe'), name="api_subscribe"),

        url(r'^(?P<resource_name>%s)/logout%s$' %
            (self._meta.resource_name, trailing_slash()),
            self.wrap_view('logout'), name="api_logout"),

        url(r'^(?P<resource_name>%s)/forgot_password%s$' %
            (self._meta.resource_name, trailing_slash()),
            self.wrap_view('forgot_password'), name="api_forgot_password"),

        url(r'^(?P<resource_name>%s)/activate%s$' %
            (self._meta.resource_name, trailing_slash()),
            self.wrap_view('activate'), name="api_activate"),

        url(r'^(?P<resource_name>%s)/remove%s$' %
            (self._meta.resource_name, trailing_slash()),
            self.wrap_view('remove'), name="api_remove"),

        url(r'^(?P<resource_name>%s)/deactivate%s$' %
            (self._meta.resource_name, trailing_slash()),
            self.wrap_view('deactivate'), name="api_deactivate"),
    ]
This code is repetitive, no two ways about it. It creates a list of url objects by calling the url() function. Notice that most of the arguments in almost all the calls are identical, yet they are repeated this many times!

How would you go about adding another url to this list? You'd probably copy the last call and paste it below, then you'd edit it, possibly forgetting to change one of the occurrences of the endpoint name (notice that not only the structure of a call is repeated, but there's also repetition in each call!) and committing broken code to the repo.

This is the bad kind of redundancy. Here is my rewrite:

def prepend_urls(self):
    # example generated url pattern: '^(?P<resource_name>user)/deactivate/$'
    # a view function associated with this pattern needs to accept a single
    # kw argument called "resource_name"
    make_resource_re = lambda view_name: r'^(?P<resource_name>{})/{}{}$'.format(
        self._meta.resource_name, view_name,
        trailing_slash())  # returns "" or "/" depending on Django settings

    # Most of the parameters to the url() function here are either the same
    # for all calls or possible to construct from other arguments
    # (specifically, the internal URL name is a view name prefixed with the
    # string "api_"), so we define a helper to avoid writing them all by hand
    # multiple times.
    make_url = lambda view_name, url_name=None: url(
        make_resource_re(view_name), self.wrap_view(view_name),
        name="api_{}".format(url_name or view_name))

    return [
        make_url("forgot_password", "api_logout"),

        # TODO: For some mysterious reason this endpoint doesn't follow the same
        # convention as the rest of them. Would be good to make this consistent.
        # -- [2015-04-23]
        url(r'^(?P<resource_name>%s)/subscribe%s$' %
            (self._meta.resource_name, trailing_slash()),
            self.wrap_view('subscribe'), name="api_subscribe"),
    ]
This version is highly redundant, too. For example, you certainly could find the definition (or docstring) of the trailing_slash() function, so the comment explaining how it works is redundant. The same goes for the example return value of make_resource_re(): you could just as well paste it into the REPL and get this information that way. You could also just execute this function in your mind, yielding the same effect. And of course, including the date when a comment was made is obviously redundant, because you could just git blame the file.

Still, this is the good redundancy. All the details explained in the comments are relevant to the task at hand: generating a list of routes from urls to view functions. When working on this code it's very probable that you'd have to check all these things anyway - especially if you're new to the codebase. Having those things stated - re-stated, even - in the same place they're being used saves a lot of time.

There are many shades of grey and many different situations. I'm not saying you should write all your code this way, especially if it's evolving rapidly. Also, please note that this uses probably the simplest possible technique for providing related information to the reader - adding comments - and that there are many, many more such techniques, some much more sophisticated than this.

To summarize: some repetition is good for learning and remembering. Providing alternate explanations is good for understanding. Making people search endlessly for any piece of info, on the other hand, is not good programming. It's just being a dick.

And what about terseness? Well, terseness is OK unless it obscures the program structure or meaning. Raganwald has a great post on the topic. In general, the shorter/terser you make your code - without changing algorithms or refactoring, so by just linguistic means - the more of a context you make a reader remember. Which has all the problems described above.

The problem here is that, after some time working with such terse code, we internalize all the needed data and see nothing wrong with it. It's a trade-off too, of course: sometimes terseness lets you be more productive, because it makes editing the code easier (for you; remember that it's not universal). But terseness in itself is not a good thing. I know it's hard to convince juniors of this, so hopefully an example will help.

There's this language called J, which is a descendant of APL. If you're a junior, I bet you've never heard of them: they're frequently used in scientific computing and in finance, among other fields. They're so-called "array (based) languages", which combine simple semantics with crazy mad parsing rules to create a language as terse as possible.

Now, even if you're a junior with only a couple languages known, you should agree that you'd have no problem understanding a FizzBuzzBazz solution written in another language. For example, if you know C, you'll probably easily understand the solution in Pascal or Java. The structure of those languages is readily visible without much effort; you don't need to know anything other than where to find reference manual for that language.

Ok, so here is a reference for J, and this is - idiomatic, high-level, completely normal in J! - solution to FizzBuzzBazz:

g1=:(,&'zz')"1>;:'Fi Bu Ba'
((0=fu i.101){"1(3 1$<''),.(,.<"1 g1)),<"0 i.101

There's a screenshot of the output J produces for this code (ok: of part of the output, but I promise it works all the way up to 100)

if you'd like to see for yourself that it works.

Let me know how long it took you to unravel this. Would you want to work on a codebase written like that? No? Then stop making others work with your "as terse as possible" code. For me - and any other J programmer - this solution is obvious, although not the prettiest (I do like creating an array of functions and applying it instead of the functions themselves, though. Care to guess which several characters I'm talking about?). For you it's completely undecipherable at first sight and would take a long time to understand even with manuals and such.

So, next time you think about unneeded verbosity, think harder, and think about who it is that doesn't need it. Is it just you? Then leave it, or even expand it, making the code more verbose and easier to follow. Go nuts if you work with a team of competent J programmers, though - it's all about the target audience, the aforementioned different cognitive patterns, information density and so on.

I'd like to finish with another famous quote; I'll leave it without a comment, as it doesn't really need one:

Any fool can write code that a computer can understand. Good programmers write code that humans can understand.

That's it for today. Have fun and write readable, understandable, maintainable code from now on!


How to learn

That's a question I was asked recently. It took me by surprise: I learned socket APIs as one of the first things in programming, probably nearly 20 years ago, and I can't really remember what I used then.

I had to go on a hunt for things I saw and read over the years and try to compile a short list of useful resources for beginners. I assume Python knowledge and some C knowledge. This is what I came up with:

  • "Unix Network Programming" by Richard Stevens and others is probably the most complete reference I ever read. The author is the same person who wrote "(Advanced) Programming in the Unix Environment", the other very important book for everything OS-related in Unices - I'd say it's a Unix version of "Programming Windows" by Charles Petzold. It uses C as its language, but the APIs are almost identical in C and Python.
  • There's a very interesting tool called Scapy, which allows you to construct, deconstruct, send and receive raw TCP packets. It's great for playing around in the REPL, with great introspection capabilities. I didn't find one, but I think a good tutorial for Scapy could serve as a very nice introduction to sockets - or to network programming in general.
  • Talking about network programming, there's a famous list of Fallacies of distributed computing, which can serve as a summary of important characteristics and gotchas in network programming. Reading up on those fallacies should give you some understanding of how communication between computers works in general; then you just need to read a bit on TCP and then the docs for the Python socket module. At that point you should be able to write a simple networking application without relying on StackOverflow and copy&paste magicks.

Well, this is a rather huge topic and I really don't see any shortcuts. The best you could do, in case you really don't have the time to learn this properly, would be to use some kind of higher-level abstraction and pray it works without problems.

For example, in Python you can quite easily convert sockets into file-like objects, using makefile() method. You can write() and read() from such object as you'd normally do: it hides most of low-level send/recv stuff and provides a bit of error-handling.
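A minimal self-contained sketch of this technique, echoing one line over the loopback interface (all the names here are my own):

```python
import socket
import threading

def echo_once(server_sock):
    # server side: accept one client and echo a single line back
    conn, _ = server_sock.accept()
    with conn, conn.makefile("rw", encoding="utf-8", newline="\n") as f:
        line = f.readline()   # the read() side of the file-like wrapper
        f.write(line)         # the write() side: echo the line verbatim
        f.flush()             # makefile() buffers, so flush explicitly

def roundtrip(message):
    server = socket.socket()
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    worker = threading.Thread(target=echo_once, args=(server,))
    worker.start()
    with socket.create_connection(server.getsockname()) as client, \
         client.makefile("rw", encoding="utf-8", newline="\n") as f:
        f.write(message + "\n")
        f.flush()
        reply = f.readline().rstrip("\n")
    worker.join()
    server.close()
    return reply
```

Notice that neither side ever calls send() or recv() directly: once the socket is wrapped, it's all plain file I/O.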

But I'd recommend learning about sockets in depth anyway, at some point. We're surrounded by leaky abstractions and it's only a matter of time before you'll need to debug HTTP problems at TCP level. In such a situation knowledge of lower-level networking details proves priceless.


Lenses - composable setters and getters

...for immutable data. In short, lenses abstract the process of taking an element out of some structure and replacing it with another. It works by creating a new copy of the original structure, just with the element modified.

Lenses are still being actively researched and new uses for them keep being discovered. They are a language-independent concept, but they require some support (either from a library or the language) to be useful.
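A toy sketch in Python may make the idea concrete; the names here (Lens, key_lens, compose) are illustrative, not from any particular library:

```python
from collections import namedtuple

# A lens is a pair of functions: get extracts the focused element, and set
# returns an updated *copy* of the whole structure, never mutating it.
Lens = namedtuple("Lens", ["get", "set"])

def key_lens(k):
    # focus on dictionary key k; set builds a new dict with k replaced
    return Lens(get=lambda d: d[k],
                set=lambda d, v: {**d, k: v})

def compose(outer, inner):
    # composition lets lenses reach arbitrarily deep into nested structures
    return Lens(get=lambda s: inner.get(outer.get(s)),
                set=lambda s, v: outer.set(s, inner.set(outer.get(s), v)))
```

For example, `compose(key_lens("user"), key_lens("city")).set(state, "Warsaw")` returns a new nested dict with the city replaced, while the original state is untouched.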

An introduction to Lenses using F# (good for translating to other langs)

Boomerang - language for two-way data format manipulation (via lenses)

Lenses in JavaScript


Make a Lisp - in any number of languages!

This is simply beautiful! I can't believe I've never seen it before - it's similar in spirit to my dict.pl clients, but so much more impressive!

Basically, it's a repository containing an implementation of a simple Lisp interpreter in a number of languages, some of which are rather surprising. I think the most impressive ones are the bash, Forth and make implementations - but they all are interesting to read.

By the way, I was vaguely aware that Makefiles are Turing-complete, but thought it was made so by accident. It really is Turing-complete, as shown here, but I couldn't find it on the list of accidentally Turing-complete languages. So maybe it was meant to be like that.

A cursory glance at the Forth implementation made my head spin - it looks like the author built a whole OO language on top of normal Forth. Maybe I'm wrong, but:

  1. I don't recall normal Forth having words such as def-protocol-method
  2. the definitions for extend and similar words (here) are part of the source

But that's not all there is to this repository. As you can see by following the second link, the author actually documented the process of creating a new Mal interpreter in a language-agnostic way. The whole process is split into steps, each easily testable - with test suites provided! - and with beautiful diagrams illustrating what's going on.

It's a work in progress, but it's already incredibly (I mean it!) impressive, and it's only going to get better. I immediately felt the urge to contribute an implementation in one of the languages not present there yet. I'd really love to, but I don't think I'll find the time for it for at least two months.

Well, it's such a massive pile of awesomeness that it definitely won't disappear until then, so the worst case is that I'll have to implement it in a language other than the one I originally had in mind. It's still going to be great fun!

Closing thoughts: I wonder who the author is. I tried searching for his homepage, but didn't find anything. I can't imagine how much work he must have put into this, and I'm extremely curious how he found the time for it. I'd suspect an academic exercise, but it doesn't look like one (too few proofs, no bibliography and such...). Anyway, it's simply, incredibly awesome - thanks for making it!


I'm getting old...

Well, it's nothing unexpected and it's not that bad either. But lately I've realized I have trouble remembering when exactly I used which technology and, in general, when I worked on what.

Of course, "years of experience" is not a very useful metric for any purpose other than showing off. I'm not that interested in it (although I can't say I don't like to boast - can anyone?), but I believe that when I do boast, I should at the very least get the facts straight.

Additionally, I realized that even if no one wants to hear about it, it's still a part of my story as a person. Over the years I've lost many artifacts I created, but I never cared - I knew I created them and that's enough, right? Nobody can steal what you know, or so they say.

Well, even if I somehow could recover it, I wouldn't want to dig up the source code I wrote years ago. I'm 100% sure it was hideous and wrong. But on the other hand, letting all that code vanish without any mention feels somehow wrong, like it would make my story incomplete. And I don't like stories with gaping holes in the plot one bit!

Anyway, I decided to try to create a rough timeline of what I've done to date, in terms of programming. This way, I think, I'll feel less bad about forgetting parts of it (and forgetting some things may be quite good for my psyche!). It looks like this:

I'm thinking about writing down some details, like which projects exactly I used these technologies for in which year, but that feels like quite a lot more work. It would be even more of an excuse to happily go on and forget about those, though...


Sociology of Programming Language Adoption

I recently watched a (rather boring) talk from Strange Loop: "The Sociology of Programming Languages" by Leo Meyerovich. It had one especially interesting slide, looking like this:

This actually makes a lot of sense, but then we hear that this is why language designers have to focus on lowering the barrier to entry as much as possible, possibly compromising on language features, because otherwise they won't have any users. Or at least they won't ever get near the mainstream.

I immediately thought there's something else we can do instead. Looking at this equation, it's obvious we need to focus on making the denominator smaller, but there's more than one way of doing that.

What I have in mind, of course, is making programmers better at adopting new languages. It's a skill - a meta-skill, if you will - that can be taught and gets better with practice. In short, it's similar to the other professional skills programmers need.

Why, then, is it so hard to find a programmer who has any idea how to go about learning and adopting a new language or technology? The problem is with perceived adoption pain: people who have never changed their tech stack will naturally perceive any major change as a huge pain in the ass. People who are on their third or later stack, and are ready to jump to another when the need arises, won't think much of it. Yet it looks like it's the former group that forms the majority of programmers.

Is it because programmers are mostly young people, without much real experience? Or are programmers just conservative and dislike change just because? Is coding COBOL for the rest of one's life really that interesting?

Honestly, I have no idea, but I now know what prevents programmers from taking the polyglot approach: it's the perceived pain of adoption multiplied by the number of languages, which may be a huge number indeed.

Language designers are doing all they can to make their languages as accessible as possible, and I'm not in a position to help them with that (yet). But I think I can try convincing people to practice this skill of learning and adopting languages, and I can hope it will make the job easier for PL designers - and, maybe, finally lead to truly innovative PLs going mainstream in years instead of decades.


A little lib for working with nested data

dQuery (I'd like DPath more, but it's already taken)

Storing hierarchical data is something programmers need to do surprisingly often. There are many notations for describing such data - from s-expressions to XML to JSON and more. They all work quite well for storing data (despite differences in syntax), but storing data is only half of what we normally do with it.

The other half is obviously retrieving the data we're interested in. Manually traversing many levels of a tree to find just one interesting leaf is quite an overhead, so automatic solutions were developed - XPath, for example.

However, XPath (or rather most implementations of it) is specific to XML, and nowadays web developers tend to work with JSON much more often. I thought that something like XPath, but for JSON, would be good to have.

Turns out I'm not the only one who thought this. Andrew Kesterson wrote a library for this, called dpath-python.

It's good, and it works, but I skimmed its source and was not entirely happy with what I saw. There's nothing really wrong with it, and considering the interesting features it implements, it's not that much code either. However, my personal preference is for a Functional Programming style and declarative code instead of an imperative style.

I remembered seeing some really pretty, functional-style code for selecting values from hierarchical data, and I thought I should try porting it to Python. It was part of sxpathlib, a library for Scheme, documented here.

Its beauty comes from simplicity and the elegant use of recursion and higher-order functions acting as combinators. I thought that if I could get those primitives working, I'd have no trouble making a parser for an XPath-like little language which would compile down to them.

Much to my surprise, it worked! My code is not a direct translation - I use some Python features not available in (vanilla) Scheme to make the code easier to read and understand - but it implements the same general idea, which is to create a bunch of selectors along with combinators for composing them.
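To give a flavor of the general idea, here's a minimal sketch (the names `child` and `path` are my own, for illustration - this is not dquery's actual API): a selector is a function from a list of subtrees to a list of matching subtrees, and a combinator chains selectors the way path segments chain in XPath.

```python
def child(name):
    """Selector: descend into the given key of every dict node,
    flattening lists so each element becomes a node of its own."""
    def sel(nodes):
        out = []
        for n in nodes:
            if isinstance(n, dict) and name in n:
                v = n[name]
                out.extend(v if isinstance(v, list) else [v])
        return out
    return sel

def path(*selectors):
    """Combinator: apply selectors left to right, like path segments."""
    def sel(nodes):
        for s in selectors:
            nodes = s(nodes)
        return nodes
    return sel

data = {"users": [{"name": "bob"}, {"name": "eve"}]}
users_names = path(child("users"), child("name"))
print(users_names([data]))  # → ['bob', 'eve']
```

Because every selector has the same shape (list in, list out), adding new combinators - alternatives, recursive descent, filters - is just a matter of writing more small functions that compose the same way.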

It's still in development (I hope to work on it some more at a Hackday in February) and lacks many comments, tests and features, but it implements the bare minimum of what I wanted, including a PyParsing-backed parser for the XPath-like notation.

On the whole, working on this little project was very enjoyable. I came to understand the topic of function combinators better and played with parsing and compiling (well, not really, but close enough) a very simple language.

As always, any help is greatly appreciated, especially with implementing functional update (à la update-in from Clojure) and developing more interesting functions for use inside selector expressions (there's only one right now, called text). I'd be very happy if anyone would like to do a code review, too.


How to disable internal keyboard

It's sometimes useful to block the internal keyboard of a laptop - I have at least two uses for this. One is cleaning the keyboard without having to switch the computer off, and the other is placing my external keyboard on top of the internal one (it's much more convenient to use).

When working on Windows, I wrote a simple script in Python using pyhook, which in turn uses the SetWindowsHookEx system call to register a global callback procedure invoked for specified events. The callback may decide to drop an event, so by writing a hook callback that drops all events coming from the keyboard, we can make it (appear to be) disabled.

It wasn't that hard, as I was working with WinAPI a lot back then, but it was certainly harder than I'd like it to be (just take a look at the docs). After I switched to Linux I never attempted the same, suspecting it would also take quite a bit of effort.

It turns out it's much easier to do on a Linux system: there's a command for that. I wrote a little shell function to automate the process, and now I can disable and enable the internal keyboard with a single, easy-to-use shell command (obviously, to enable it again I need to type the command on the external keyboard).

The code looks like this:

## lkinput [+|-]
## This command either disables (with no argument, or with - or 0 as the
## argument) or enables the built-in keyboard and mouse/trackpad of a laptop.
## Make sure to have an external keyboard available before invoking this
## function!
function lkinput() {
    if [[ -z $1 || "$1" = "-" || "$1" = "0" ]]; then
        local disOrEnable=0 msg="Disabling"
    else
        local disOrEnable=1 msg="Enabling"
    fi

    # on my system the output of xinput looks like this (showing only built-in
    # devices, omitting external ones):
    # Virtual core pointer                  id=2   [master pointer  (3)]
    #  ...more results...
    #   ↳ SynPS/2 Synaptics TouchPad        id=13  [slave  pointer  (2)]
    # Virtual core keyboard                 id=3   [master keyboard (2)]
    #  ...more results...
    #   ↳ AT Translated Set 2 keyboard      id=12  [slave  keyboard (3)]
    # We need to extract the ids associated with these. I do it with a small AWK
    # script, but sed or perl, or anything else would do just fine:
    local script='/PS/ || /AT T/ {
        match($0, /id=([0-9]+)/, arr);
        print arr[1]
    }'

    for id in $(xinput --list | awk "$script"); do
        echo "$msg input device: $id"
        sudo xinput set-int-prop "$id" "Device Enabled" 8 "$disOrEnable"
    done
}

My talk on FP on WarsawJS meetup

Normally I'm passionate about Functional Programming and like advocating for it, and in general I'm content just having a chance to do so. This time, however, the talk was seriously below my capabilities - I could have done much better but didn't. It's not exactly a disaster, but it's not a success either...

For one thing, my laptop decided to just shut down its display the moment I plugged the HDMI cable in. It remained black for the entire talk, which made me stand sideways to the audience most of the time so that I could see what exactly I was showing them. The fear of accidentally switching virtual desktops and showing some porn was certainly not amusing. Then again, I had no Internet connection, so access to porn was severely limited...

BTW, my laptop, generously given to me by 10clouds, has never done anything like this to me before. I have used projectors and other external screens many times, via both HDMI and VGA, and it just worked. I think this particular projector was optimized for working with Macs - I saw quite a lot of those around - and such appliances are notorious for rejecting anything cheaper than a new MacBook Pro...

Anyway, a much bigger problem was that I didn't get a chance to time my talk beforehand. Earlier that day I was busy preparing code examples, formatting them and so on: I thought they were essential. It turned out all that effort was wasted - by the time I was ready to show some code, I was 46 minutes into a talk scheduled to last 30 minutes.

So, no code examples, and half the talk thrown out - you can see how little of a success that was. I did put those examples up here, so maybe they will be of use to someone. The examples are formatted using Org Babel, which does a decent job, but the color scheme is awful and some examples are not finished yet, so I think I'll keep working on them for some time.

I wrapped the talk up by saying how much there still is to learn (via reading my examples...) and presented a list of books and resources I find useful for learning FP concepts.

I was even asked a question, which was, surprisingly, rather good: about which FP tools I'd say are ready for production. I talked with the guy afterwards but didn't even ask for his name, which I know I'll regret long into the future, because he seemed to know his way around FP, which is still so rare among programmers.

That's it, really - after the meetup some people went to drink beer, but I fled as soon as I could.

Maybe next time it'll turn out better...



I'm going to put a list of links to some of my projects and code here. I need to clean many things up before posting them, so it may take a while for this list to become longer.

  • Lang list - a list of languages I learned, am learning or want to learn.
  • dquery - a little library for working with deeply nested data in Python
  • transparent, caching proxy for HTTP service of FreeBase (turns out it's still running on production, so I can't publish the code)
  • Nginx script for recording req/resp data (COMING SOON: a full-blown project in the vein of https://mitmproxy.org/!)
  • A little Readline cheatsheet I put together for my colleagues
  • walkfiles - recursive directory tree traversal in Node and LiveScript
  • FFIP - Emacs plugin for opening files with fuzzy matching
  • Bezier Curves in Racket - displaying text along the curve, written in Racket GUI, for fun and to evaluate Racket features. There's a description of my experience with Racket in the README
  • DICTS: A few CLI clients for dict.pl in Racket, Io, Chicken Scheme and Python
  • A simple web chat implementation in Erlang and CoffeeScript - it's notable (?) because I didn't know OTP existed when I wrote it: quite a lot of reinvented wheels to see. Also, unfortunately, it's partially in Polish instead of English.
  • My Emacs config - it's a mess right now, but it serves me well.