Looking at Transport for London’s colour-coded bus trial in Barkingside, CityMetric editor Jonn Elledge found an interesting problem:
Even thinking about the maths does my head in – but it seems unlikely to me that every bus in London can be given a colour different from that of every bus it ever shares a stop with. At some bus stops, there’ll be two buses in violet.
So let’s have a go at this: given the current bus routes in London, how many colours would you need so that no bus stop has two buses using the same colour?
TfL have their bus stop locations and routes available as open data so we can quickly get a feel for how hard a problem this actually is. The way I processed the file I ended up with 20,028 bus stops and 729 routes (we can probably have arguments about exact numbers, but go with me). A lot of those bus stops serve exactly the same routes as others – so there are only 4,129 unique nodes in this problem. Which is still lots, but feels more manageable.
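As a rough illustration of that de-duplication step, here is a minimal sketch in Python. The stop names and routes are made-up toy data, and the function name is my own; it is not the script used for the real TfL file.

```python
def unique_route_sets(stop_routes):
    """Collapse stops that serve exactly the same set of routes into one node."""
    return {frozenset(routes) for routes in stop_routes.values() if routes}


# Hypothetical toy data: stop name -> routes calling there.
stop_routes = {
    "Stop A": ["25", "86"],
    "Stop B": ["86", "25"],          # same routes as Stop A, so the same node
    "Stop C": ["25", "86", "425"],
    "Stop D": ["128"],
}

nodes = unique_route_sets(stop_routes)
print(len(stop_routes), "stops collapse to", len(nodes), "unique nodes")
```

Using a frozenset means the order in which routes are listed at a stop does not matter.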
While there is probably an elegant mathematical approach, the boring brute force technique goes like this. Starting out with all bus routes sharing a single colour (let’s say blue), you get the computer to go through every bus stop and change routes to other colours to make sure that bus stop has only unique colours. You then repeat this until all bus stops have no duplicate colours.
You get different results for this depending on what order you tackle the bus stops in, so you run it with a few different orders to get a feel for roughly how many colours are needed. Running the program 1,000 times, the lowest number I get using this technique is 44. Lower numbers are probably possible through trying many more orders, but let’s say for the moment this is roughly right – you’d need 44 colours to apply this approach strictly to the entire London bus system. This is really too many colours to be able to usefully distinguish lines, so is probably a no-go.
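The brute force loop can be sketched roughly as follows. This is an illustration on toy data, not the original program. One assumption of mine: when a route has to change, it takes the lowest-numbered colour not used by any route it shares a stop with, which guarantees the loop finishes. Colours are plain integers, with 0 standing in for blue.

```python
import random
from collections import defaultdict


def colour_routes(stops, order):
    """Start every route on colour 0, then fix clashes stop by stop."""
    neighbours = defaultdict(set)
    for stop in stops:
        for route in stop:
            neighbours[route] |= set(stop) - {route}
    colour = {route: 0 for route in neighbours}
    changed = True
    while changed:                      # repeat until a full pass changes nothing
        changed = False
        for index in order:
            seen = set()
            for route in sorted(stops[index]):
                if colour[route] in seen:
                    # Assumption: pick the lowest colour no neighbouring route uses.
                    taken = {colour[n] for n in neighbours[route]}
                    colour[route] = next(
                        c for c in range(len(colour)) if c not in taken
                    )
                    changed = True
                seen.add(colour[route])
    return colour


def fewest_colours(stops, runs=1000, seed=0):
    """Try several random stop orders and keep the run using fewest colours."""
    rng = random.Random(seed)
    best = None
    for _ in range(runs):
        order = list(range(len(stops)))
        rng.shuffle(order)
        colour = colour_routes(stops, order)
        used = len(set(colour.values()))
        if best is None or used < best[0]:
            best = (used, colour)
    return best


# Hypothetical toy network: each stop is the set of routes calling there.
stops = [
    frozenset({"25", "86"}),
    frozenset({"25", "86", "425"}),
    frozenset({"128", "425"}),
    frozenset({"128"}),
]
used, colouring = fewest_colours(stops, runs=50)
print(used, "colours needed for the toy network")
```

On the real data the order matters much more than it does here, which is why it is worth trying many shuffles.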
We get more manageable numbers if we try a less strict version of the rule. If we let bus stops served by five or more routes have two routes of the same colour, the total number of colours required drops to 14. This is approaching a workable colour scheme in terms of actually being able to distinguish between all varieties.
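To make the relaxed rule concrete, here is a small sketch of the per-stop check. It assumes that "two routes of the same colour" means no colour may appear more than twice at a busy stop, which is my reading rather than something pinned down above. The function and parameter names are hypothetical.

```python
from collections import Counter


def stop_passes(route_colours, busy_threshold=5):
    """Check one stop: colours must be unique unless the stop is busy.

    Assumption: at stops with `busy_threshold` or more routes, any colour
    may appear at most twice; at quieter stops, at most once.
    """
    counts = Counter(route_colours)
    limit = 2 if len(route_colours) >= busy_threshold else 1
    return max(counts.values(), default=0) <= limit


print(stop_passes(["blue", "red", "green"]))
print(stop_passes(["blue", "blue", "red"]))
print(stop_passes(["blue", "blue", "red", "green", "pink"]))
```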
When your number is up
But let’s move on from colours and think about what TfL is actually trying to do here: it wants to make it easier for people who don’t currently use buses to use buses.
It’s worth thinking about where bus route numbers currently come from:
When we introduce a new route – or make alterations to an existing route by splitting it – the last digit or digits of the historic ‘parent’ route are used wherever possible, so that passengers might associate the incoming route with its predecessor. This was the case in 2003, for instance, when route 414 was chosen as the number for the new route between Maida Hill and Putney Bridge, which was intended to augment route historic route 14 south of Hyde Park Corner.
In other words, bus route numbers are path dependent on old naming decisions because of the desire to keep existing users happy. While this is probably a good idea, it can also end up in results that are very un-good for new users.
So if you ignored the past and the need to keep the millions of current users not confused, what could you do if you just scrapped all the current route numbers and started from scratch? Specifically let’s look at two problems:
If you don’t see why these things are problems, imagine yourself as a user for whom the concept of numbers is a bit fuzzier: for instance, dyslexic users for whom the numbers rearrange (where 365 and 635 might be similar), or those for whom the numbers are literally fuzzy because they’re less able to read the signs.
There are two key areas of ambiguity: digits that are visually similar to each other (66 and 68) and route numbers that are conceptually similar like 114 and 14.
For real world examples of conceptually confusing bus stops, there are 1,601 stops served by routes whose numbers wholly contain the number of another route at the same bus stop. While this is sometimes suggestive of similarity of route, in many instances it isn’t. If you’re at Church Lane the 71 and 671 share 88 per cent of stops in common – but if you’re at Southall Broadway, the 95 and 195 share just 0.1 per cent of their stops. Looking at all the stops with this problem, the average similarity is only 38 per cent.
As most journeys are short, differences at the far end of the route are probably not a problem for most users – but the point is, that vague, warm feeling that similar number routes at the same location should be similar is not backed up by the data.
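Both checks are easy to sketch. The first finds routes at a stop whose number wholly contains another route’s number. The second measures how much two routes overlap, and here I am assuming a simple "shared stops divided by all stops served by either route" measure; the exact definition behind the percentages above may differ. All names and data are illustrative.

```python
from itertools import permutations


def contained_pairs(routes_at_stop):
    """Pairs (short, long) where one route number appears inside another."""
    return sorted(
        (a, b) for a, b in permutations(routes_at_stop, 2) if a != b and a in b
    )


def shared_stop_share(stops_a, stops_b):
    """Assumed similarity measure: shared stops / stops served by either route."""
    a, b = set(stops_a), set(stops_b)
    return len(a & b) / len(a | b) if a | b else 0.0


print(contained_pairs(["71", "671", "H32"]))

# Hypothetical stop lists for two routes.
print(shared_stop_share({"s1", "s2", "s3"}, {"s2", "s3", "s4"}))
```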
There are also 205 stops that have routes which are anagrams of each other. The St Nicholas Center has the 407 and the 470, at Brooke Road you have the 76 and 67, and Lytton Grove has the 39 and the 93. This isn’t many in the grand scheme of things – but it’s not ideal.
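Spotting anagrams is a matter of sorting each route’s characters and grouping routes that end up with the same key. A minimal sketch, with a function name of my own choosing:

```python
from collections import defaultdict


def anagram_groups(routes_at_stop):
    """Group routes at one stop whose characters are rearrangements of each other."""
    groups = defaultdict(list)
    for route in routes_at_stop:
        groups["".join(sorted(route))].append(route)
    return [sorted(group) for group in groups.values() if len(group) > 1]


print(anagram_groups(["407", "470", "25"]))
print(anagram_groups(["76", "67"]))
```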
While we’re thinking about which numbers are nicer than others, let’s look at research into which numbers are easier to remember correctly than others. Milikowski & Elshout found that:
The order of memorability was
(1) Single digit numbers;
(2) Teen numbers (10-19);
(3) Doubled numbers (e.g. 44, 77, 22);
(4) Large tabled numbers (numbers which factor and therefore appear in the multiplication tables, such as 49, 36, 60, 84, 27); and
(5) Other numbers that do not fall into any of these categories.
While memorability for Single digit numbers was above 80 percent, that for Other numbers (no subcategory) was only around 40 percent.
This should inform our thinking about route numbers. The first thing our colour system lets you do is dump bus numbers above 100 and use colours as a replacement for the first digit. This immediately makes numbers easier to remember because we’re reducing the number of concepts you need to remember. Route 127 requires you to remember three things (one two seven) while Blue-27 requires you to remember two (Blue twenty-seven). This is more true with smaller numbers, but every little helps.
The next thing we need to do is jettison every number that is a reverse of another (we don’t want both 46 and 64). This gets rid of most numbers above fifty (while retaining doubles). The end result is each colour can now be followed by 62 numbers – which means 62 bus routes.
Ideally you’d also reduce ambiguous symbols such as (1 and 7) or (6 and 8) – but this really cuts down the number of usable numbers. Instead, what we’ll do is try to make sure ambiguous numbers like this do not appear at the same stop.
So here are our new constraints:
A colour can only have 62 routes;
There are 15 colours (up from 14, because the original solution required some colours to have more than 62 routes);
Bus stops with four or fewer buses can’t have multiple routes with the same colour, stops with more can have two;
One bus stop cannot have routes of different colours with the same number. You also can’t have both 21 and 27, or 46 and 48.
Is such an arrangement possible? It turns out it is.
To solve this one you randomise which routes get which numbers and score them according to how well they pass the above. Then you create random variations on the best performing plan, and so on, until it narrows in on a version that passes all the rules. This returned a viable arrangement of route colours and names after a few hours (and 161,663 attempts).
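A stripped-down sketch of that search is below. It is a simplification under stated assumptions, not the original program: it keeps a single best plan and changes one route at a time, runs on a made-up six-route network with four colours and the numbers 1 to 9, and treats two numbers as lookalikes if they match once 7 is read as 1 and 8 as 6. The cap on routes per colour is enforced indirectly, by drawing numbers from a fixed list and penalising duplicate labels.

```python
import random
from collections import Counter
from itertools import combinations

# Assumption: 7 can be misread as 1, and 8 as 6.
LOOKALIKES = str.maketrans("78", "16")


def violations(plan, stops):
    """Count rule breaks for a plan mapping route -> (colour, number)."""
    bad = len(plan) - len(set(plan.values()))      # two routes, same full label
    for stop in stops:
        labels = [plan[route] for route in stop]
        limit = 2 if len(stop) >= 5 else 1          # busy stops may repeat a colour
        colour_counts = Counter(colour for colour, _ in labels)
        bad += sum(max(0, n - limit) for n in colour_counts.values())
        for (_, n1), (_, n2) in combinations(labels, 2):
            # Same or lookalike numbers at one stop (e.g. 21 and 27).
            if str(n1).translate(LOOKALIKES) == str(n2).translate(LOOKALIKES):
                bad += 1
    return bad


def search(routes, stops, colours, numbers, attempts=20000, seed=0):
    """Random start, then keep any single-route change that is no worse."""
    rng = random.Random(seed)

    def random_label():
        return (rng.randrange(colours), rng.choice(numbers))

    best = {route: random_label() for route in routes}
    best_score = violations(best, stops)
    for _ in range(attempts):
        if best_score == 0:
            break
        candidate = dict(best)
        candidate[rng.choice(routes)] = random_label()
        score = violations(candidate, stops)
        if score <= best_score:
            best, best_score = candidate, score
    return best, best_score


# Hypothetical toy network.
routes = ["r1", "r2", "r3", "r4", "r5", "r6"]
stops = [("r1", "r2", "r3"), ("r3", "r4"), ("r4", "r5", "r6"), ("r1", "r6")]
plan, score = search(routes, stops, colours=4, numbers=list(range(1, 10)))
print(score, plan)
```

Accepting changes that score the same as the current best lets the search drift sideways rather than getting stuck, which matters more as the rules get tighter.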
Can something like this be done in reality? Confusing all current users seems a bad idea – but maybe this kind of approach should affect how new bus routes are named. Rather than blindly following the history of a route, select rules you want to be true of your naming scheme (they might be different from mine) and get a computer to suggest the minimally confusing approach. It turns out it doesn’t take long to get answers to quite fiddly problems.
But the real point here is I don’t want to wear my glasses to wait for a bus, and changing the naming convention for every single bus route in London is a proportionate response to this problem.
This article is from the CityMetric archive: some formatting and images may not be present.