It is that time of the year again, where current students are tasked to develop an algorithm to find the best flat combinations for the new promotion. It turns out, this is no easy task! Being responsible for matching together people we don't know for a duration of 3 years isn't something we took lightly, and this year's process involved a lot more than simple combinatorics. We explored many options such as K-means clustering, deep learning algorithms and genetic populations. Here's how we ended up doing it.

Please note that the criteria are obviously not objective, and favourising people of different backgrounds was a conscious decision made with regards of what seems to work well in the Bachelor. Also, note that the results obtained with the final algorithm were a useful indication but actually weren't final, and manual reshuffling took place where it made sense.

Step 1: the questionnaire

The questionnaire itself is as important as what's done with the data later on, since it dictates what the results are directly based on. We decided that a good questionnaire:

  1. Covers the most important points (sleep times, age, gender) to ensure there won't be major clashes during the 3 years to follow
  2. Doesn't infringe on anyone's privacy. While the questions are important, it is crucial not to dive too deep into everyone's habits to respect everyone's private life
  3. Includes a range of concerns, including transient matters such as holidays, to ensure students aren't left out on campus without support, in particular during the first year. Also, having roommates to enjoy holidays with is always funnier!
  4. Doesn't take too much time to fill out. Here, participation rate goes above the comprehensiveness of the data collected. If people don't understand the stakes, they might regret not filling out the questionnaire later on.

We ended up collecting 11 different signals, as follows:

  • Gender
  • Age
  • Nationality
  • Languages spoken
  • Approximate bed time during week
  • Approximate bed time during weekend
  • Leniency regarding those times
  • Strictness with cleaning
  • Strictness with lack of noise
  • Holidays spent on campus
  • Additional comments

Close to 92% of the confirmed students answered the poll, which was a success.

Step 2: the criteria

With sufficient knowledge for us to go forward with the task, it was now time to set criteria for people to be matched with each other. These weren't set in stone, and it was possible to vary them and relaunch the calculations if needed. These also obviously weren't perfect, but we felt they reflected quite well what was typically working in the bachelor community from experience. They were the following:

Avoid (invalidate combination if happens):

  • < 3 per flat
  • Age difference of >=3 years
  • 3 or more minors by flat
  • Opposite gender
  • > 50% common language (except English)
  • >= 75% common nationality
  • >= 3 levels of difference for music and cleaning criteria
  • Depending on leniency as following:
  • Not lenient: > 30 mins of sleep time difference
  • Lenient: > 1h30 mins of sleep time difference
  • Very lenient: > 3h of sleep time difference

Favourise (prefer combination if happens):

  • Unique nationalities
  • One to two minors max
  • 2 or 3 on campus for same holidays
  • 70+% leniency cross-correlation
  • Common additional comments keywords (sports, games)

At the end (manually):

We looked at individual comments (for those who provided some) to refine the end combinations.

The idea following from here was hence to make a program that could generate combinations of roommates, score/invalidate them leveraging the criteria above, keep the good ones, then repeat with the remaining people.

Step 3: the solution (quick answer)

You will find the complete commented Jupyter notebook at the end of this article, including a few failed attempts and the final algorithm. Notwithstanding the failed attempts, here's how the final solution was computed:

  1. Load up the quizz answers from a CSV file.
  2. Do some processing to clean up the data.
  3. Make a list of girls and another one of boys, along with their preferences.
  4. Sort everyone by leniency score (see how in next section).
  5. Take the first one in the list, match the person with every other person, take the 3 best pair scores (see how in next section) to form one flat.
  6. Remove those 4 people from the list, repeat from step 5 until no one is left.
  7. In the end, obtain a list of lists of 4 people representing flats, and satisfying all criteria set before.

The solution (in details)

Pair scores represent one student's "affinity" for another. Meaning, this roughly translates to answering the question "Are those two students likely to enjoy living together?" (obviously only in terms of the poll answers). To best set this up, we clamped this score to a straight 0 if some mandatory criteria weren't met, this is, if any of those is true:

  • Age difference of more than 3 years
  • Cleaning and noise requirements are too far apart (ended up with 3+)
  • Sleep times are too different (see by how much in part 2: criteria)

Then, if those are all false, we start with a match score of 1, then add on "bonuses" for those extra things:

  • +0.5 per same holidays spent on campus
  • +1 for different nationalities
  • +(2 - delta) for difference in sleep leniency
  • +(4 - delta) for difference in cleaning leniency
  • +(4-delta) for difference in noise leniency

We end up with a score between 0 and ~15, so we can plot everyone's affinity for everyone else in a nice matrix like this (girls on the left, boys on the right):

Now this is nice and all, but we're trying to match 4 people together, not 2, and it is clearly out of question to create a 4-axis plot (for obvious reasons).

We introduce a second score: the leniency score, which quantifies everyone's leniency, which roughly translates to: "How many different people is this person likely to get along with?" (once again, subjectively depending on poll answers). The most people, the higher leniency.

What this allows us to do is to take the "least lenient people" and put them in flats first to ensure they will have a good combination.

This is essentially the basis for our algorithm, which goes down the ranked list of leniencies and progressively matches people, most lenient last. Here are a few advantages and drawbacks of this method:


  • The most unique individuals won't find themselves left out with no matches.
  • Lenient individuals are equally guaranteed to find people with similar needs.
  • There's no need to generate all different combinations (hundreds of billions, impossible to do in less than 10 hours).
  • It is actually impossible for incompatible people to be matched with each other given the criteria.
  • Biases and limits can be adjusted on the fly to re-run the selection process (but too tight criteria will lead to a match failure).


  • The final combination might not be the absolute best (but this is a necessary tradeoff).
  • Lenient people might be at a slight disadvantage since they are processed at the end (but flat scores obviously remain above 0).
  • The results are obviously limited by the quizz questions, since the criteria remain very narrow.

Other explored methods

  • Deep learning: scrapped since criteria were really identified as crucial and a large part of combinations were eliminated with simple rules.
  • Genetic algorithms: very good strategy and we tried different approaches linked to this but ultimately eliminated since it is inherently very difficult and inappropriate to control lack of repeats in combinations and "mutations" lead to a lot of stupid results given criteria.
  • K-means clustering: good for identifying patterns in the data initially and helped devise criteria but doesn't make much sense to generate small groups like what was needed.

What could be improved for next year

  • More exhaustive quizz questions (though at the expense of privacy).
  • More advanced selection algorithms that guarantee optimal combinations.
  • Gradient descent or other optimisation algorithms to refine biases and criteria (currently adjusted manually).
  • More fine-tuned control over number of flats and amount of people in one flat to generate.
  • Maybe Natural Language Processing for additional comments?

Lessons learned

  1. Don't wait until after sending the quizz to write code, the quizz should be designed with the next parts in mind.
  2. Oftentimes, the simplest rule-based solutions work best, don't overlook them for more complicated algorithms.
  3. Make sure your quizz answer data is well formatted from the start to avoid correcting everything manually afterwards.
  4. Don't overlook the human factor when making decisions, and accept that manual reshuffling in the end will be the only way to cater to everyone's preferences and comments.
  5. Don't forget to have fun, this is an important task but also very interesting and enriching to put together!
Google Colaboratory
The notebook with code

Thanks to Elsa B. for helping us out with the data!

You’ve successfully subscribed to Horizon
Welcome back! You’ve successfully signed in.
Great! You’ve successfully signed up.
Your link has expired
Success! Check your email for magic link to sign-in.