The Reverend Thomas Bayes was born in Hemel Hempstead, Hertfordshire, in 1701. He grew up in London’s Southwark, and died in Tunbridge Wells, Kent in 1761. Had he lived 300 years later, a railway running from Hertfordshire to Kent via London Bridge would have been rather useful to him. And if the people who currently run that railway had paid more attention to him, everyone on the route would be a lot happier.

The Thameslink service links commuter towns to the north and south of London via the city centre. After a major timetable change in May 2018, the network descended into chaos. Instead of the intended massive increase in services, the service through London collapsed.

Things got so bad that Govia Thameslink Railway (GTR) had to hire extra security staff to defend train crew from angry passengers. GTR’s CEO announced his resignation, although he’ll stay in place until the company finds someone willing to take on the poisoned chalice.

So what happened? First, some background. In the 1980s, British Rail (BR) reopened a disused freight line across London. This allowed BR to shift commuter services away from terminal stations, and free up peak hour space at St Pancras and Blackfriars.

This scheme worked so well that the railway went for a second round. This programme was called Thameslink 2000, after the year it was supposed to be finished. It’s nearly finished now (that’s another story). The timetable change was supposed to benefit from the new infrastructure.


Instead it collapsed. London Reconnections has outlined the underlying issues. In short: new trains were delivered late, so drivers didn’t know how to drive them; the previous operator hadn’t been training new drivers when GTR took over the franchise in 2014, so GTR has been playing catch-up ever since; GTR’s training programme relies on drivers working overtime, which many of them don’t want to do; some new tunnels weren’t handed over until far too late; and GTR didn’t transfer drivers to new depots in time. As a result, many drivers weren’t qualified to drive the new trains along the new routes in time for the change.

Some people might have decided to cancel the timetable change at this point. But GTR had a cunning plan.

For a train to carry passengers, it needs to have a driver qualified to drive the route that it’s on, a driver qualified to drive the train, and a driver qualified to carry passengers. These don’t have to be the same person, so if you must, you can have three people in the cab, each holding one of the three qualifications. This isn’t ideal; but it’s safe, and it works.
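That rule boils down to a simple set check: pool the qualifications of everyone in the cab and see whether all three are covered. Here is a minimal Python sketch of the idea. The names `REQUIRED` and `cab_is_covered`, and the three qualification labels, are invented for illustration and are not GTR’s actual rostering terms.

```python
# Hypothetical labels for the three things a passenger train's cab must cover.
REQUIRED = {"route", "traction", "passengers"}


def cab_is_covered(crew):
    """Return True if the people in the cab jointly hold every qualification.

    `crew` is a list of sets, one set of qualification labels per person.
    """
    held = set().union(*crew)  # pool everyone's qualifications
    return REQUIRED <= held    # is every required label in the pool?


# One fully qualified driver works, and so do three partly qualified ones.
solo = cab_is_covered([{"route", "traction", "passengers"}])
trio = cab_is_covered([{"route"}, {"traction"}, {"passengers"}])
# Two people who between them lack the passenger qualification do not.
short = cab_is_covered([{"route"}, {"traction"}])
```

The check says nothing about whether those three people can physically be in the same cab at the same time, which is where the trouble starts.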

GTR worked out that – between the drivers it had who were trained on the new trains, the drivers it had who were trained on the new routes, and the not-passenger-qualified drivers who had tested the new trains before they entered passenger service – it had enough drivers to run the new timetable by doubling or tripling up in the cab.

But it didn’t. Which is where the Reverend Bayes comes in.

The Reverend Thomas Bayes. Image: Wikimedia Commons.

If you’re working out the number of drivers you need based on traditional probabilities (statisticians call this ‘frequentism’), you look at five factors: the total number of trains needed, the number of drivers qualified for each part of the route, the numbers qualified for the right trains, the number qualified to carry passengers, and sickness/absenteeism rates.

Then you can work out the number of trains to run, based on the number of people likely to be around and qualified. On the evidence we’ve seen so far, GTR appear to have done this, and found that they were, narrowly, capable of running the service.

But there’s a problem here: people don’t come in percentages. Either you have a whole train driver or no train driver at all. And if you don’t have a train driver qualified to drive the train to Finsbury Park when it arrives at London Bridge at 7:30am on a Monday, then your whole timetable is stuffed.
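To see how an average can mislead, here is a rough Python sketch with entirely made-up numbers: 10 drivers needed at one location, 11 qualified, and a 5 per cent chance that any one of them is absent. It assumes absences are independent of each other, which is a simplification. The function names are invented for this example and nothing here comes from GTR’s own figures.

```python
from math import comb


def expected_available(qualified, absence_rate):
    """Average number of drivers who turn up (the head-count view)."""
    return qualified * (1 - absence_rate)


def prob_enough(qualified, needed, absence_rate):
    """Chance that at least `needed` whole drivers turn up on a given day.

    Assumes each driver is absent independently with the same probability.
    """
    p = 1 - absence_rate
    return sum(
        comb(qualified, k) * p**k * (1 - p) ** (qualified - k)
        for k in range(needed, qualified + 1)
    )


average = expected_available(11, 0.05)  # 10.45, comfortably above 10
chance = prob_enough(11, 10, 0.05)      # roughly 0.90
```

On average there are 10.45 drivers for 10 trains, which looks fine. But you can’t roster 0.45 of a driver, and on roughly one day in ten fewer than 10 whole drivers turn up.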

Agent-based modelling is a more complicated way of looking at things than simple probability. But it has a huge advantage over simple statistical models, which is that it can deal with lumpy problems like train drivers. It requires a lot of hard maths, of the sort pioneered by the Reverend Bayes.

You use this maths to set up simulations of what will happen if you try to run the trains you have on the routes you have, using the drivers you have. So your computer becomes a gigantic nerdy train simulator game, running the entire train timetable thousands of times, and seeing what happens each time you try to run it.

The conditions are slightly different each time: on run 3, the driver who’s off sick is Alan from Luton who is qualified to drive to Brighton but not Maidstone; on run 15, it’s Barbara from Brighton, who is qualified to drive to London Bridge but not Cambridge. The closer you can match the simulated agents to your real roster, the more accurate the simulation is.
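A toy version of such a simulation fits in a few lines of Python. Everything in this sketch is invented for illustration: the four-person roster, the four-train timetable, the 5 per cent absence rate, and the simple greedy rule for assigning drivers. A real model would track times, rest breaks, and movements between depots, and nothing here reflects how GTR or anyone else actually builds one.

```python
import random

# Invented roster: name -> (where the driver is based, routes they know).
ROSTER = {
    "Alan": ("Luton", {"Brighton"}),
    "Barbara": ("Brighton", {"London Bridge"}),
    "Chris": ("Luton", {"Brighton", "Maidstone"}),
    "Dana": ("Brighton", {"London Bridge", "Cambridge"}),
}

# Invented timetable: (starting point, destination) for each train.
TRAINS = [
    ("Luton", "Brighton"),
    ("Luton", "Maidstone"),
    ("Brighton", "London Bridge"),
    ("Brighton", "Cambridge"),
]


def run_once(rng, absence_rate):
    """Simulate one day; return True if every train found a driver."""
    present = {name for name in ROSTER if rng.random() >= absence_rate}
    for start, dest in TRAINS:
        # Greedy rule (a simplification): among drivers who are in the right
        # place and know the route, use the least flexible one first.
        candidates = sorted(
            (n for n in present
             if ROSTER[n][0] == start and dest in ROSTER[n][1]),
            key=lambda n: (len(ROSTER[n][1]), n),
        )
        if not candidates:
            return False  # one uncovered train breaks the day
        present.remove(candidates[0])  # a driver can only take one train
    return True


def success_rate(runs=10_000, absence_rate=0.05, seed=0):
    """Fraction of simulated days on which the whole timetable ran."""
    rng = random.Random(seed)
    return sum(run_once(rng, absence_rate) for _ in range(runs)) / runs
```

On paper this roster has exactly enough qualified people for the timetable. Calling `success_rate()` shows how often that holds up in practice: with no slack, a single absence anywhere leaves a train uncovered, and a spare driver in Brighton is no help to a train waiting in Luton.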


Using this model, GTR would have found that having the right number of qualified crew is no use in itself: one person in the wrong place at the wrong time can make the whole thing fall over, even if there’s another qualified person on shift, because that qualified person is an hour’s cab ride away.

Because they didn’t do this kind of modelling, they took false reassurance from their data showing that they had enough crew. The first time their assumptions were put to the test was the first day of the real timetable – when it all fell to pieces.

If GTR had used agent-based modelling to test the new timetable, they would have had to ditch it at the last minute, which would have been horribly embarrassing. Maybe that’s why they didn’t do it. But looking back, it would have been much less embarrassing than what actually happened.
