All about the Gale-Shapley algorithm

I mentioned to a friend that I was about to begin a contract working on a Department for Education service handling admissions to higher education. He suggested I might be interested in the “stable marriage problem” and an algorithm I’d never heard of called Gale-Shapley. So I went and read about it, and this is what I learned.

The problem of college admissions

The original paper by Gale and Shapley sets out to solve the following problem:

A college is considering a set of n applicants of which it can admit a quota of only q. Having evaluated all their qualifications, the admissions office must decide which ones to admit… it cannot be assumed that all who are offered admission will accept[, because] many applicants will apply to and be admitted by more than one college and hence will accept only their first choice. It may not be known (a) whether a given applicant has also applied elsewhere; if this is known it may not be known (b) how the applicant ranks the colleges to which he has applied; even if this is known it will not be known (c) which of the other colleges will offer to admit the given applicant.

Gale/Shapley, 1962

In the worst case scenario College X gives one of its precious places to a candidate who didn’t particularly want it. This is bad for the college, who might not see the place not taken up if the candidate does get an offer from her preferred college. It’s bad for the candidate herself, who now has to decide to string along College X in the hope of a better offer OR accept the offer she’s got. And finally, it’s bad for some other student who did really want to go to College X but now can’t because the place has been filled.

As it turns out, the authors of the paper don’t actually tackle this problem, but a similar one called the “Stable Marriage Problem”:

A certain community consists of n men and n women. Each person ranks those of the opposite sex in accordance with his or her preferences for a marriage partner. We seek a satisfactory way of marrying off all members of the community.

Gale/Shapley, 1962

This is significantly more intuitive, but it’s also slightly different as colleges have multiple places to fill and people proposing marriage — most of them, at least — only have one. No matter: we can come back to the college admissions problem with the solution to the marriage problem.

In any case, the way they solve it is beautifully simple. As they point out in a peculiar coda to their paper with the heading Addendum on the Nature of Mathematics,

The argument is carried out not in mathematical symbols but in ordinary English, there are no obscure or technical terms. Knowledge of calculus is not presupposed. In fact one hardly needs to know how to count.

Gale/Shapley, 1962

Before showing some code which implements the algorithm, I want to mention one thing to mention about the stable marriage problem, which is that it’s not at all practical for marriages: who wants to marry their second choice? The Gale-Shapley solution is technically fair, in that it always achieves a solution with the best possible overall outcome, but that doesn’t mean that there are no winners and losers. The “optimal” outcome might be for you you to get your third choice.

The technical reason for this fair-not-fair system is that Gale and Shapley prize a quality called “stability” which exists in the set of marriages as a whole:

we call a set of marriages unstable (and here the suitability of the term is quite clear) if under it there are a man and a woman who are not married to each other but prefer each other to their actual mates.

Gale/Shapley, 1962

i.e. Your happiness in marriage is not as important as keeping the ambient level of dissatisfaction to a minimum.

How to get and stay married

The algorithm goes like this: start with an equal number of boys and girls. Each boy has a strict order of preference for girls, and vice versa. You can follow along in the code via the magic of the Klipse plugin, which means these code samples are editable and the outputs will update accordingly.

(def boys {:andrew [:alice :barbara :clare]
           :boris  [:alice :clare :barbara]
           :claude [:barbara :alice :clare]})

(def girls {:alice   [:andrew :boris :claude]
            :barbara [:boris :andrew :claude]
            :clare   [:claude :boris :andrew]})

Now follow these steps:

  1. each boy proposes to his favourite girl
  2. if a girl has been proposed to by her favourite, she accepts him. If not, she strings along the boy who has just proposed to her.
  3. each boy who is not either accepted or “on a string”, as Gale and Shapley put it, proposes to his next-favourite girl
  4. the girls either keep stringing along the boy they are already stringing along or replace him with the latest proposal, according to their preference.
  5. Repeat 3) and 4) until every girl has received a proposal.
  6. Every boy who has not been accepted is accepted by the girl who is stringing him along

Here is the algorithm in code, to the best of my abilities:

(defn gale-shapley
  ([proposers choosers]
    ;; initialize with an empty map of chooser -> proposer matches
    (gale-shapley proposers choosers {}))
  ([proposers choosers matches]
    (if (= (set (keys choosers)) (set (keys matches))) ;; has every chooser been proposed to?
      matches
      (let [eligible-proposers (apply dissoc proposers (vals matches))
            proposals (map vector (keys eligible-proposers) (map first (vals eligible-proposers)))
            next-proposers (zipmap (keys proposers) (map rest (vals proposers)))]
        (recur
          next-proposers
          choosers
          (reduce (fn [new-matches [proposer chooser]]
                    (let [chooser-prefs (get choosers chooser)
                          chooser-current-match (get new-matches chooser)]
                      (cond (nil? chooser-current-match)
                            (assoc new-matches chooser proposer) ;; take on the match if no prior match
                            (> (.indexOf chooser-prefs chooser-current-match) (.indexOf chooser-prefs proposer))
                            (assoc new-matches chooser proposer) ;; take on the match if chooser prefers this proposer
                            :else new-matches))) matches proposals))))))

Run it against the boys and girls defined above, with boys proposing and girls choosing:

(gale-shapley boys girls)

You can see that there’s a benefit to being the proposer. When the boys propose, two of them (Andrew and Claude) get their first choice, and one (Boris) gets his second choice.

The girls come off less well. Alice is happy with her first choice of Andrew, but Clare has indifferent second choice, Boris, and unwilling Barbara her third choice, the disgustingly enthusiastic Claude.

If the roles are reversed, so is the character of the outcome:

(gale-shapley girls boys)

All the girls have their first choice, but Boris and Claude both end up with their least preferred choice. Even in this circumstance, where two out of six people did very badly, the important thing to bear in mind is that this is the least worst outcome overall.

But this isn’t actually useful for college admissions

Let’s try to apply Gale-Shapley to the original problem. Consider a situation in which there are two colleges, and each college has two places. There are four applicants.

The colleges list the applicants in order of preference:

(def colleges {:button-college [:alice :barbara :clare :dora]
               :twine-college  [:dora :clare :alice :barbara]})

The applicants do the same:

(def applicants {:alice [:button-college :twine-college]
                 :barbara [:button-college :twine-college]
                 :clare [:twine-college :button-college]
                 :dora [:button-college :twine-college]})

Let’s see what happens if we try the same thing:

(gale-shapley applicants colleges)

This is no good. What happened was:

  • Alice applied to Button College, which accepted her.
  • Barbara applied to Button College, which rejected her in favour of Alice.
  • Clare applied to Twine College, which accepted her.
  • The process ended because Button and Twine college each had a proposal, leaving Barbara unduly rejected and Dora without the opportunity even to apply.

The good news is that the Gale-Shapley algorithm can be tweaked so that it supports a college-admissions type scenario. In fact, a US scheme called the National Residency Matching Program (NRMP), which exists to assign medicine graduates to hospitals, is based on it. You can read more about the NRMP algorithm and its proofs here. The basic idea is similar: applicants try for the first hospital on their list, then the second, etc, while hospitals accept them in order of preference.

An interesting side effect of this process in real life is that hospitals do not need to compete on salary to attract the candidates they want. If you would like to read more about that you can find an article about it here.

If I have made mistakes, or you’ve enjoyed this, or both, or neither, please do leave a comment.

Leave a comment

Your email address will not be published. Required fields are marked *