ClojureScript for absolute beginners

This post takes you through the process of writing a simple ClojureScript program and running it in a web browser. It’s in three parts: Writing, Compiling and Running the program. It assumes no prior experience with Clojure, ClojureScript, Java, or much else apart from a working knowledge of JavaScript and the command line. You’ll need to have the Clojure language installed (instructions here), because the ClojureScript compiler is implemented in Clojure.

Finally, you’ll need a clean working directory in which to keep the files you’ll create.

Writing the program

We’re not going to go into the syntax and semantics of ClojureScript here — we’re just demonstrating the end-to-end journey from ClojureScript source to your browser executing it — so the following code should do for a first program.

(ns beginner.core)
(js/document.write “I have been compiled to JavaScript!”)

When we compile and run this, we expect it to print “I have been compiled to JavaScript!” using the trusty document.write() function from JavaScript.

You’ll notice it declares a namespace using ns. This is obligatory for all ClojureScript files, and it matters for two reasons. Firstly, it’s necessary to put the file at a path which exactly reflects that namespace, i.e. beginner/core.cljs. Secondly, that path needs to be somewhere the ClojureScript compiler can find it. By default, the compiler looks in a folder called src/. So relative to the clean working directory we start with, the file needs to live at src/beginner/core.cljs.

At this point, your directory ought to look something like this:

.
└── src
    └── beginner
        └── core.cljs

Compiling the program

The ClojureScript compiler works like this: you tell it the namespace of the ClojureScript program you want to compile, it gathers the namespace’s files from their predicatable location on disk (see above), and it produces some more files which are that program translated into JavaScript.

Because the ClojureScript compiler is a Clojure library, we need to call it from Clojure, and to do that, we need to let Clojure know we want to take a dependency on the compiler.

When you run Clojure using the built-in clj tool, it loads any dependencies defined in a deps.edn file (more about that) in the current working directory. So all that’s required to declare a dependency on the ClojureScript compiler is to create that file and insert the following:

{:deps {org.clojure/clojurescript {:mvn/version "1.10.520"}}}

Now your working directory looks like this:

.
├── deps.edn
└── src
    └── beginner
        └── core.cljs

And it’s possible to invoke the compiler from the command line:

clj --main cljs.main --compile beginner.core

NB it looks like we’re passing two options to clj: --main and --compile. Actually, we are passing --main to clj and --compile to cljs.main 🙃. In practice, everything after cljs.main is an option for ClojureScript, not for clj.  You can find a list of the available options here.

If all is well, this will print nothing and exit, and you should find a new folder in your working directory called out/. This holds our program. Now: how to run it?

Running the program

We want a web browser to execute the JavaScript we’ve just compiled, and we can achieve that by including the compiled file in a web page. Create a file called index.html in your working directory and insert the following markup:

<html>
    <script src="out/main.js"></script>
</html>

Now your working directory looks like this:

.
├── deps.edn
├── index.html
└── out
    └── main.js
    └── *some other stuff*
└── src
    └── beginner
        └── core.cljs

Open this file in your web browser, and you should see the words “I have been compiled to JavaScript!”

You have just compiled some ClojureScript. You created a ClojureScript file, placed it in a predictable location, and gave its namespace to the ClojureScript compiler. The compiler found your file using the namespace and turned it into JavaScript. You included that JavaScript in a web page, and it executed.

Inside /out

We blindly included out/main.js to get the browser to pick it up. Let’s have a look inside the out/ folder.

out/ contains many more files than main.js. After compiling the very simple program from the previous post, my out/ folder contains no fewer than 35 files in 18 directories: a total of 4.7 megabytes of code. For comparison, the program itself in plain JavaScript would be 54 bytes.

To experience vertigo, have a look inside main.js:

var CLOSURE_UNCOMPILED_DEFINES = {};
var CLOSURE_NO_DEPS = true;
if(typeof goog == "undefined") document.write('<script src="out/goog/base.js"></script>');
document.write('<script src="out/goog/deps.js"></script>');
document.write('<script src="out/cljs_deps.js"></script>');
document.write('<script>if (typeof goog == "undefined") console.warn("ClojureScript could not load :main, did you forget to specify :asset-path?");</script>');
document.write('<script>goog.require("process.env");</script>');
document.write('<script>goog.require("beginner.core");</script>’);

This doesn’t even contain our program code — just some jargon and a series of calls to document.write to load yet more scripts. The program is here somewhere, because it runs when we include main.js, but it’s not immediately obvious where.

What makes this file complicated is that our original program has a single enormous implicit dependency: the ClojureScript language itself. The JavaScript implementation of that depends in turn on another library called Google Closure (GCL). Therefore our out/ folder includes the entirety of the GCL as well as the JavaScript translation of the ClojureScript standard library, and main.js loads the lot via those document.write calls which are including other scripts.

All of them are interesting in different ways. A good place to start poking around in /out is cljs_deps.js. Its job is to declare all our dependencies: the GCL, whichever bits of ClojureScript we need in the form of .js files in the out/cljs folder, and finally our program:

goog.addDependency("base.js", ['goog'], []);
goog.addDependency("../cljs/core.js", ['cljs.core'], ['goog.string', 'goog.Uri', 'goog.object', 'goog.math.Integer', 'goog.string.StringBuffer', 'goog.array', 'goog.math.Long']);
goog.addDependency("../process/env.js", ['process.env'], ['cljs.core']);
goog.addDependency("../beginner/core.js", ['beginner.core'], ['cljs.core']);

It now seems pretty clear that we can look inside out/beginner/core.cljs and find our ClojureScript code as JavaScript, and here it is:

// Compiled by ClojureScript 1.10.520 {}
goog.provide('beginner.core');
goog.require('cljs.core');
document.write("I have been compiled to JavaScript");

//# sourceMappingURL=core.js.map

That’s the basic journey of a .cljs file into a .js file.

All about the Gale-Shapley algorithm

I mentioned to a friend that I was about to begin a contract working on a Department for Education service handling admissions to higher education. He suggested I might be interested in the “stable marriage problem” and an algorithm I’d never heard of called Gale-Shapley. So I went and read about it, and this is what I learned.

The problem of college admissions

The original paper by Gale and Shapley sets out to solve the following problem:

A college is considering a set of n applicants of which it can admit a quota of only q. Having evaluated all their qualifications, the admissions office must decide which ones to admit… it cannot be assumed that all who are offered admission will accept[, because] many applicants will apply to and be admitted by more than one college and hence will accept only their first choice. It may not be known (a) whether a given applicant has also applied elsewhere; if this is known it may not be known (b) how the applicant ranks the colleges to which he has applied; even if this is known it will not be known (c) which of the other colleges will offer to admit the given applicant.

Gale/Shapley, 1962

In the worst case scenario College X gives one of its precious places to a candidate who didn’t particularly want it. This is bad for the college, who might not see the place not taken up if the candidate does get an offer from her preferred college. It’s bad for the candidate herself, who now has to decide to string along College X in the hope of a better offer OR accept the offer she’s got. And finally, it’s bad for some other student who did really want to go to College X but now can’t because the place has been filled.

As it turns out, the authors of the paper don’t actually tackle this problem, but a similar one called the “Stable Marriage Problem”:

A certain community consists of n men and n women. Each person ranks those of the opposite sex in accordance with his or her preferences for a marriage partner. We seek a satisfactory way of marrying off all members of the community.

Gale/Shapley, 1962

This is significantly more intuitive, but it’s also slightly different as colleges have multiple places to fill and people proposing marriage — most of them, at least — only have one. No matter: we can come back to the college admissions problem with the solution to the marriage problem.

In any case, the way they solve it is beautifully simple. As they point out in a peculiar coda to their paper with the heading Addendum on the Nature of Mathematics,

The argument is carried out not in mathematical symbols but in ordinary English, there are no obscure or technical terms. Knowledge of calculus is not presupposed. In fact one hardly needs to know how to count.

Gale/Shapley, 1962

Before showing some code which implements the algorithm, I want to mention one thing to mention about the stable marriage problem, which is that it’s not at all practical for marriages: who wants to marry their second choice? The Gale-Shapley solution is technically fair, in that it always achieves a solution with the best possible overall outcome, but that doesn’t mean that there are no winners and losers. The “optimal” outcome might be for you you to get your third choice.

The technical reason for this fair-not-fair system is that Gale and Shapley prize a quality called “stability” which exists in the set of marriages as a whole:

we call a set of marriages unstable (and here the suitability of the term is quite clear) if under it there are a man and a woman who are not married to each other but prefer each other to their actual mates.

Gale/Shapley, 1962

i.e. Your happiness in marriage is not as important as keeping the ambient level of dissatisfaction to a minimum.

How to get and stay married

The algorithm goes like this: start with an equal number of boys and girls. Each boy has a strict order of preference for girls, and vice versa. You can follow along in the code via the magic of the Klipse plugin, which means these code samples are editable and the outputs will update accordingly.

(def boys {:andrew [:alice :barbara :clare]
           :boris  [:alice :clare :barbara]
           :claude [:barbara :alice :clare]})

(def girls {:alice   [:andrew :boris :claude]
            :barbara [:boris :andrew :claude]
            :clare   [:claude :boris :andrew]})

Now follow these steps:

  1. each boy proposes to his favourite girl
  2. if a girl has been proposed to by her favourite, she accepts him. If not, she strings along the boy who has just proposed to her.
  3. each boy who is not either accepted or “on a string”, as Gale and Shapley put it, proposes to his next-favourite girl
  4. the girls either keep stringing along the boy they are already stringing along or replace him with the latest proposal, according to their preference.
  5. Repeat 3) and 4) until every girl has received a proposal.
  6. Every boy who has not been accepted is accepted by the girl who is stringing him along

Here is the algorithm in code, to the best of my abilities:

(defn gale-shapley
  ([proposers choosers]
    ;; initialize with an empty map of chooser -> proposer matches
    (gale-shapley proposers choosers {}))
  ([proposers choosers matches]
    (if (= (set (keys choosers)) (set (keys matches))) ;; has every chooser been proposed to?
      matches
      (let [eligible-proposers (apply dissoc proposers (vals matches))
            proposals (map vector (keys eligible-proposers) (map first (vals eligible-proposers)))
            next-proposers (zipmap (keys proposers) (map rest (vals proposers)))]
        (recur
          next-proposers
          choosers
          (reduce (fn [new-matches [proposer chooser]]
                    (let [chooser-prefs (get choosers chooser)
                          chooser-current-match (get new-matches chooser)]
                      (cond (nil? chooser-current-match)
                            (assoc new-matches chooser proposer) ;; take on the match if no prior match
                            (> (.indexOf chooser-prefs chooser-current-match) (.indexOf chooser-prefs proposer))
                            (assoc new-matches chooser proposer) ;; take on the match if chooser prefers this proposer
                            :else new-matches))) matches proposals))))))

Run it against the boys and girls defined above, with boys proposing and girls choosing:

(gale-shapley boys girls)

You can see that there’s a benefit to being the proposer. When the boys propose, two of them (Andrew and Claude) get their first choice, and one (Boris) gets his second choice.

The girls come off less well. Alice is happy with her first choice of Andrew, but Clare has indifferent second choice, Boris, and unwilling Barbara her third choice, the disgustingly enthusiastic Claude.

If the roles are reversed, so is the character of the outcome:

(gale-shapley girls boys)

All the girls have their first choice, but Boris and Claude both end up with their least preferred choice. Even in this circumstance, where two out of six people did very badly, the important thing to bear in mind is that this is the least worst outcome overall.

But this isn’t actually useful for college admissions

Let’s try to apply Gale-Shapley to the original problem. Consider a situation in which there are two colleges, and each college has two places. There are four applicants.

The colleges list the applicants in order of preference:

(def colleges {:button-college [:alice :barbara :clare :dora]
               :twine-college  [:dora :clare :alice :barbara]})

The applicants do the same:

(def applicants {:alice [:button-college :twine-college]
                 :barbara [:button-college :twine-college]
                 :clare [:twine-college :button-college]
                 :dora [:button-college :twine-college]})

Let’s see what happens if we try the same thing:

(gale-shapley applicants colleges)

This is no good. What happened was:

  • Alice applied to Button College, which accepted her.
  • Barbara applied to Button College, which rejected her in favour of Alice.
  • Clare applied to Twine College, which accepted her.
  • The process ended because Button and Twine college each had a proposal, leaving Barbara unduly rejected and Dora without the opportunity even to apply.

The good news is that the Gale-Shapley algorithm can be tweaked so that it supports a college-admissions type scenario. In fact, a US scheme called the National Residency Matching Program (NRMP), which exists to assign medicine graduates to hospitals, is based on it. You can read more about the NRMP algorithm and its proofs here. The basic idea is similar: applicants try for the first hospital on their list, then the second, etc, while hospitals accept them in order of preference.

An interesting side effect of this process in real life is that hospitals do not need to compete on salary to attract the candidates they want. If you would like to read more about that you can find an article about it here.

If I have made mistakes, or you’ve enjoyed this, or both, or neither, please do leave a comment.