June 16, 2019 · Projects departureboard.io API

Latest Project: departureboard.io REST API

Wow! It has been over 6 months since I last added a post to my blog. Definitely doesn't feel that long.

Since my last post I've been pretty busy, both at work and in my own time. Work aside, I've been moving more and more towards software development. I've always been largely infrastructure focused but there is only so much fun you can have without some actual software to build infrastructure around.

Earlier this year I launched departureboard.io, which is a site designed to provide live train departure information for commuters on the go. It focuses very specifically on providing simple and quick information for commuters who perform regular journeys.

The idea behind it is that you set up your regular journeys, and add them as an icon to your home screen on your iOS or Android device. When you're approaching the station on your regular commute, you simply hit the icon for your saved journey and have platform and train information available in seconds. It isn't designed to be a tool for planning your annual visit to the seaside, but it solves the "I'm at the train station with seconds until my train is due, what platform is it on?" problem.

I've been pleasantly surprised with the traffic volumes to the site, which still sees thousands of monthly users despite no active promotion.

departureboard.io pulls information directly from the National Rail Darwin API, which is a SOAP API that provides a wealth of information on the Rail Network. But there is an issue: SOAP isn't fun to develop with, not one bit, and that's what gave me the inspiration for my latest project.

Introducing the departureboard.io REST API

Over the past few months I have been working on developing a few other applications, which all have micro-services based architectures. These individual components interact with each other via REST API calls where appropriate.

No prizes for figuring out that it's much easier to develop with REST APIs than it is with SOAP APIs.

I've always been impressed with the wealth of information available from National Rail, so I decided to write a REST API in Golang that pulled information from National Rail.

I had three main aims from this project:

  1. Expand my knowledge and experience in Golang, by writing a high-performance API.
  2. Make it easier for developers to build cool applications with National Rail data, by making it easier to consume.
  3. Provide some additional useful information over and above the National Rail Darwin API.

After a couple of weeks of work, I'm pretty happy with the result, which I've launched at https://api.departureboard.io.

The departureboard.io API exposes a range of REST endpoints that allow developers to get information on services from across the United Kingdom. By making simple HTTP GET requests, developers receive JSON responses containing all of the same information that is available from the National Rail SOAP endpoints.

For example, to look up the next service from London Paddington to Ealing Broadway, you can run the following curl command (you'll need to generate an API Key for this to work for you):

curl -X GET "https://api.departureboard.io/api/v1.0/getNextDeparturesByCRS/PAD/?apiKey=REDACTED&filterList=EAL" -H "accept: */*"

A JSON response is returned showing the next service between Paddington and Ealing Broadway.
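
If you'd rather call the endpoint from Go than curl, here is a minimal sketch of building the same request URL. The path and the apiKey and filterList parameters come from the curl example above; the nextDeparturesURL helper and the YOUR_API_KEY placeholder are my own illustrative names, not part of any official client.

```go
package main

import (
	"fmt"
	"net/url"
)

// baseURL is taken from the curl example in the post.
const baseURL = "https://api.departureboard.io/api/v1.0"

// nextDeparturesURL builds a getNextDeparturesByCRS request URL.
// The helper name and signature are illustrative assumptions.
func nextDeparturesURL(fromCRS, toCRS, apiKey string) string {
	q := url.Values{}
	q.Set("apiKey", apiKey)
	q.Set("filterList", toCRS)
	// Encode() escapes the values and sorts the parameters by key.
	return fmt.Sprintf("%s/getNextDeparturesByCRS/%s/?%s",
		baseURL, url.PathEscape(fromCRS), q.Encode())
}

func main() {
	// YOUR_API_KEY is a placeholder; generate a real key before calling the API.
	fmt.Println(nextDeparturesURL("PAD", "EAL", "YOUR_API_KEY"))
}
```

The resulting string can be passed to http.Get from the standard library, and the JSON body decoded with encoding/json.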

In the background, the departureboard.io API receives the HTTP GET request, performs a range of validation (e.g. is the station code valid, are the required parameters present, does the request make sense, etc.) and crafts a SOAP XML payload. This XML payload is sent to National Rail, which sends back an XML response. The response is validated, parsed, transformed and then returned to the user as JSON. It is all incredibly efficient, typically adding less than 50ms to a request (including network time).
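
To give a feel for that validate, wrap, parse and transform flow, here is a rough sketch in Go. It is not the actual departureboard.io code: the XML element names, the JSON field names and the sample response are simplified placeholders I made up for illustration, not the real Darwin schema, and the network call to National Rail is left out.

```go
package main

import (
	"encoding/json"
	"encoding/xml"
	"errors"
	"fmt"
	"regexp"
)

// crsPattern matches three uppercase letters, e.g. PAD or EAL.
var crsPattern = regexp.MustCompile(`^[A-Z]{3}$`)

// buildSOAPRequest validates the station codes and wraps them in a SOAP envelope.
// The body element names are placeholders, not the real Darwin schema.
func buildSOAPRequest(fromCRS, toCRS string) (string, error) {
	if !crsPattern.MatchString(fromCRS) || !crsPattern.MatchString(toCRS) {
		return "", errors.New("station codes must be three uppercase letters")
	}
	const envelope = `<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">` +
		`<soap:Body><NextDeparturesRequest><crs>%s</crs><filterList>%s</filterList>` +
		`</NextDeparturesRequest></soap:Body></soap:Envelope>`
	return fmt.Sprintf(envelope, fromCRS, toCRS), nil
}

// soapResponse mirrors a made-up, simplified response shape.
type soapResponse struct {
	Service struct {
		Destination string `xml:"destination"`
		Scheduled   string `xml:"std"`
		Platform    string `xml:"platform"`
	} `xml:"Body>NextDeparturesResponse>service"`
}

// departure is the JSON shape handed back to the caller (illustrative field names).
type departure struct {
	Destination string `json:"destination"`
	Scheduled   string `json:"scheduledDeparture"`
	Platform    string `json:"platform"`
}

// soapToJSON parses the XML response and re-encodes the useful fields as JSON.
func soapToJSON(soapXML string) (string, error) {
	var resp soapResponse
	if err := xml.Unmarshal([]byte(soapXML), &resp); err != nil {
		return "", err
	}
	out, err := json.Marshal(departure{
		Destination: resp.Service.Destination,
		Scheduled:   resp.Service.Scheduled,
		Platform:    resp.Service.Platform,
	})
	return string(out), err
}

// sampleResponse stands in for what the upstream SOAP service would send back.
const sampleResponse = `<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">` +
	`<soap:Body><NextDeparturesResponse><service><destination>Ealing Broadway</destination>` +
	`<std>08:15</std><platform>12</platform></service></NextDeparturesResponse></soap:Body></soap:Envelope>`

func main() {
	req, err := buildSOAPRequest("PAD", "EAL")
	if err != nil {
		panic(err)
	}
	fmt.Println(req) // in a real service this payload would be POSTed upstream
	body, err := soapToJSON(sampleResponse)
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```

Validating before building the envelope means bad requests are rejected without ever reaching the upstream SOAP service.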

The departureboard.io API also adds some additional fields to the response, for example:

The departureboard.io API supports all of the functions that the official National Rail API supports, and is documented in detail over at https://api.departureboard.io.

It is hosted in Kubernetes, running in Google Cloud Platform across 3 Availability Zones in europe-west2 and auto-scales to demand.

Additional Functionality

With the above set of functionality completed, I'd achieved two of my three goals. I had significantly expanded my knowledge and experience in Golang, and made developing with the National Rail data a little easier. But I wanted to add some additional functionality to the departureboard.io API.

One of the pain points I had with departureboard.io was displaying friendly station names. The National Rail API operates using CRS Codes (CRS stands for Computer Reservation System). These are the three-letter codes that refer to stations, for example KGX for London Kings Cross, WAT for London Waterloo, and so on.

If you're making an application that uses National Rail data, the chances are you're going to want to give your users the ability to search for stations using their friendly names, which means you need some sort of way to convert to and fro between CRS Codes and Station Names.

The National Rail API doesn't offer this functionality, but there are two places from which the data can be sourced:

  1. Via the Station Codes CSV file provided on the National Rail Website
  2. Via the Stations Knowledgebase XML file

In my experience, the Station Codes CSV file is out of date and missing some of the newer stations, which ruled it out as a source of data.

This left me with the Knowledgebase XML file, but there was a problem. This doesn't just contain station codes and names, it is a 900,000 line XML document that contains a huge range of information on train stations. Information that would actually be useful to developers!

So I needed a way to parse it, store it, and then make it available via an API Endpoint.

The Solution

The solution started back in Visual Studio Code. I started work on writing a custom parser in Golang. I creatively called this the departureboard-io-api-station-parser. Writing this took forever as there are thousands of different XML tags in this document, and the National Rail developer documentation is quite outdated. I had to compare hundreds of stations to work out the full schema. Writing the structs in Golang took a day in itself, but eventually I got there.

The station-parser works like this:

And that's it, that's all the station-parser needs to do. The retrieval is dealt with by the departureboard.io API. However, before I moved on, I containerised the station-parser and scheduled it to run using a Kubernetes CronJob each evening at 00:30 United Kingdom time. This ensures that the station information is kept up to date.

This left me with two API endpoints to create:

getStationBasicInfo

As I mentioned above, this endpoint needed to be extremely high performance. All it needed to do was return a crsCode and a stationName based on a search parameter. For example, the following HTTP GET request:

curl -X GET "https://api.departureboard.io/api/v1.0/getStationBasicInfo/?station=London%20Waterloo" -H "accept: */*"

Returns:

[
  {
    "crsCode": "WAE",
    "stationName": "London Waterloo East"
  },
  {
    "crsCode": "WAT",
    "stationName": "London Waterloo"
  }
]

To make this endpoint as fast as possible, I decided to store the station codes and names in memory. When the departureboard.io API starts up, a map is initialised. A background goroutine then queries the Stations Collection in Google Cloud Firestore, and adds the crsCode and stationName for each UK Station to the map.

When the user queries the API, the map is looped over and any matching stations are added to a slice, which is then formatted as JSON and returned to the user. This is incredibly fast. While sitting in my house over 100 miles away from London, I get a 26ms response time. For comparison, a ping to google.com responds in 30ms. Mission achieved.
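
A minimal sketch of that in-memory lookup might look like the following. It is an illustration under a few assumptions rather than the real implementation: the stationStore type and its methods are names I've invented, the data is loaded from a hard-coded slice instead of Firestore, and the case-insensitive substring match and the sort by CRS code are guesses based on the example response above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
	"sync"
)

// station matches the JSON shape shown in the example response.
type station struct {
	CRSCode     string `json:"crsCode"`
	StationName string `json:"stationName"`
}

// stationStore keeps crsCode -> stationName in memory, guarded by a RWMutex
// so a background loader and request handlers can share it safely.
type stationStore struct {
	mu    sync.RWMutex
	names map[string]string
}

func newStationStore() *stationStore {
	return &stationStore{names: make(map[string]string)}
}

// load adds stations to the map. Here the caller supplies the data;
// the post describes reading it from Firestore instead.
func (s *stationStore) load(stations []station) {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, st := range stations {
		s.names[st.CRSCode] = st.StationName
	}
}

// search loops over the map and collects matching stations in a slice.
// Case-insensitive substring matching and sorting by CRS code are assumptions.
func (s *stationStore) search(query string) []station {
	q := strings.ToLower(strings.TrimSpace(query))
	matches := []station{}
	s.mu.RLock()
	for crs, name := range s.names {
		if strings.Contains(strings.ToLower(name), q) {
			matches = append(matches, station{CRSCode: crs, StationName: name})
		}
	}
	s.mu.RUnlock()
	// Map iteration order is random in Go, so sort for a stable response.
	sort.Slice(matches, func(i, j int) bool { return matches[i].CRSCode < matches[j].CRSCode })
	return matches
}

func main() {
	store := newStationStore()
	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // background load, standing in for the start-up goroutine
		defer wg.Done()
		store.load([]station{
			{"WAT", "London Waterloo"},
			{"WAE", "London Waterloo East"},
			{"KGX", "London Kings Cross"},
		})
	}()
	wg.Wait() // wait only so this demo prints a complete result
	out, _ := json.Marshal(store.search("London Waterloo"))
	fmt.Println(string(out))
}
```

A linear scan sounds wasteful, but with only a few thousand stations in the map it stays comfortably fast, and it avoids maintaining a separate search index.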

getStationDetailsByCRS

Finally, I had the getStationDetailsByCRS endpoint to create. This endpoint needed to be high performance, but not quite as high performance as the getStationBasicInfo endpoint. Plus it was waaaaaaaaay too much data to store in memory.

For this endpoint a query to Firestore each time was fine. Firestore is still very high performance and much quicker than the National Rail API, so each time the user queries the endpoint for a station the information is pulled from Firestore, and returned as JSON to the user.

And that's a wrap...

With all of the above complete, the first version of the departureboard.io API is done. The departureboard.io API is a high performance, highly available REST interface to National Rail data.

It is really easy to get started with, so head over to https://api.departureboard.io to have a play around and read the detailed docs. Let me know what you create in the comments below.

I have loads of additional data items and improvements on the roadmap. As I release them I will detail them on this blog. Now to re-develop departureboard.io to make use of the API...!
