Latest API improvements: Intelligent Caching

Over the past few weeks, I’ve spent considerable time making improvements and optimisations to the API. This is a never-ending journey, and I have a full roadmap of improvements and features that I want to implement over time.

But for this blog post, I want to focus on one improvement in particular: Intelligent Caching.

The Opportunity

As I have described in previous posts, the API provides a RESTful interface on top of the official National Rail SOAP API. Instead of having to make complex, legacy SOAP API calls, developers can use the API to access all of the same information via HTTP GET requests. The API then handles all of the complex and messy SOAP requests that are required in the background.

The API does not store any rail data itself. Each request that a developer makes is passed on to the National Rail Darwin API, and the response is transformed and returned to the requestor. The diagram below shows the flow of a request.

This method works well. For every request, the API fetches the latest data from National Rail, before transforming it and returning it to the user. Every user request is sent to National Rail, so if 1 million people use the API to look up the next departures from London Paddington, National Rail will also receive 1 million API calls.

But that gave me an idea. As the API becomes more and more popular, the likelihood that it will see requests for the same information at the same time (but from different users) starts to increase. For example, I often see in the logs multiple users requesting the next departures from London Waterloo within seconds of each other.

If multiple users are looking for the same data within seconds of each other, it didn’t seem to make sense for the API to be requesting effectively the same data time and time again from National Rail. The API is fast and efficient, but going to National Rail for the data does add time to the request. I started to wonder if there was any way I could do things more efficiently.

Intelligent Caching

It turns out there was.

I recently moved the API to sit behind Cloudflare. My primary motivation for this was to take advantage of their clever caching for some of the more static endpoints. For example, the getStationDetailsByCRS and getStationBasicInfo API endpoints only have their backend data updated once a night. These endpoints were perfect candidates for Cloudflare to cache, which reduces the traffic to my origin and also gives a big response time improvement.

However, sadly Cloudflare’s standard caching wasn’t suitable for solving my problem of making multiple requests to National Rail for the same data. Cloudflare performs its caching via the URL of the request. The URL is effectively used as the “Cache Key”.

The problem that I had with the API is that multiple users can request the same data, but the request URL will be unique each time because users need to specify their API Keys.

For example, to get the next 10 departures from London Paddington, I would make a GET request to the following URL:

Notice the ?apiKey in the URL? That is my (redacted) personal National Rail API Key. If someone else were to make the same request, the URL would be different as it would contain their API Key, despite the fact they are requesting the same data.

There are also other complications that can lead to different URLs for the same request, such as query parameters. I might specify some query parameters such as ?numServices=1&timeOffset=100 and someone else might specify query parameters such as ?timeOffset=100&numServices=1. Notice they are exactly the same parameters, just in a different order.

Cloudflare would see these as two different URLs, and consequently each request would result in a cache miss. The cached URLs are also case sensitive, which can lead to even more cache misses (and consequently queries going to National Rail).

This ruled out Cloudflare’s standard caching as a solution to the problem. It is perfect for typical websites, where users are following links, but it wasn’t suited to the API problem. Cloudflare’s minimum caching period is also 30 minutes unless you pay them lots and lots of money. Nobody wants 30-minute-old rail data!

Cloudflare Workers to the rescue

I considered a couple of solutions to this problem, including doing the caching within the API application code and storing the responses in Redis. But I settled on an easier and more performant solution: Cloudflare Workers and the Cache API.

Cloudflare Workers is a serverless offering from Cloudflare that allows you to write JavaScript code that is executed on each request a user makes. The code runs in the Cloudflare data centre closest to the user. It is incredibly powerful, and the possibilities are almost endless. The code is run on a Cloudflare server, not in the user’s browser, so it provides a secure environment to run code that the end user cannot tamper with.

The Cache API can be used with Workers to give finer-grained control over the Cloudflare Cache, including putting objects into the cache (with custom expiry periods), reading them from the cache and deleting them.

Putting Cloudflare Workers to work

Cloudflare Workers and the Cache API gave me everything that I needed to implement “Intelligent Caching”. My plan was as follows:

In order to intelligently cache responses, and not send every single request to National Rail each time, I needed to be able to easily identify if requests were the same, even if the URLs used to request them were different. To do this, I needed to remove the uniqueness in each of the URLs when storing them in the cache.

If we take a typical request URL such as:

In order to identify the request being made, and remove any uniqueness, the request URL has to go through the following transformations:

  1. Remove the apiKey query parameter
  2. Sort the query parameters in alphabetical order
  3. Convert the query parameters to lower case

In this example, the new URL after these transformations would be:

Now let’s run a different URL through that same process. I still want to get the next 5 departures from London Paddington, but I am now a different user, and have specified my query parameters in a different order. My request looks like this:

Notice how the apiKey parameter has changed, and how serviceDetails and numServices are now in a different order. The request is for exactly the same data as the previous request, but with a different API Key and with the parameters specified in a different order.

If we run this URL through the same 3 steps that we mentioned above, the end result is:

Oh, would you look at that! It is exactly the same as our other transformed URL.

These 3 simple transformation steps are all that is required to be able to identify requests that are the same. I had the process understood; all that I needed to do was implement this logic in Cloudflare Workers.
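As a rough illustration, the three steps could look something like this in JavaScript. This is a minimal sketch, not the deployed Worker code: the function name toCacheKey, the api.example.com host and the /departures/PAD path are made up for the example, and lower-casing the parameter values as well as the names is an assumption (it is only safe if the values are case-insensitive). Only the apiKey, numServices and serviceDetails parameter names come from this post.

```javascript
// Sketch: turn a request URL into a cache key so that equivalent requests
// share one key. toCacheKey and the example URLs below are hypothetical.
function toCacheKey(requestUrl) {
  const url = new URL(requestUrl);

  const params = [];
  for (const [name, value] of url.searchParams) {
    // Step 1: drop the per-user API key.
    if (name.toLowerCase() === "apikey") continue;
    // Step 3: lower-case names and values (assumption: values are
    // case-insensitive for this API).
    params.push([name.toLowerCase(), value.toLowerCase()]);
  }

  // Step 2: sort alphabetically by name, then by value.
  params.sort(([nameA, valueA], [nameB, valueB]) => {
    if (nameA !== nameB) return nameA < nameB ? -1 : 1;
    if (valueA !== valueB) return valueA < valueB ? -1 : 1;
    return 0;
  });

  // Rebuild the query string; the path is left untouched.
  url.search = new URLSearchParams(params).toString();
  return url.toString();
}

// Two users, different keys, different parameter order and casing.
const first = toCacheKey(
  "https://api.example.com/departures/PAD?apiKey=AAA&numServices=5&serviceDetails=False"
);
const second = toCacheKey(
  "https://api.example.com/departures/PAD?serviceDetails=false&apiKey=BBB&numServices=5"
);

console.log(first === second); // true
console.log(first); // https://api.example.com/departures/PAD?numservices=5&servicedetails=false
```

The sketch leaves the path alone, so /departures/PAD and /departures/pad would still produce different keys. Whether the path should be normalised too depends on how the endpoints treat case.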

Writing the code was relatively straightforward. I wrote it and deployed it in Cloudflare Workers. I set up a route so that my Cloudflare Worker script runs each time someone makes a request to the API.

The above diagram shows the logic that is applied on each request to the API.

In summary: When a user makes a request to an endpoint with Intelligent Caching enabled, the request URL is transformed. The cache is then checked to see if someone else has made the same request in the past 20 seconds. If they have, the cached response is served, removing the need to make the request to National Rail.

If the same request has not been made in the past 20 seconds, then the user’s original request (including the apiKey) is sent to the API origin. The response is immediately served to the user, after which it is saved in the cache for 20 seconds in case someone else makes the same request again.

The Result

After a significant testing period, I rolled this out to the API. The results were amazing.

On average, requests served from the Intelligent Cache are 5x faster than results that go to National Rail in the backend. The short 20 second expiry ensures that the data is still effectively real-time, while giving incredible performance improvements.

Intelligent Caching has already saved me hundreds of queries to National Rail, which reduces the chance of me hitting their query limit for my API Key. Other developers have also benefitted from the Intelligent Cache without having to make any code tweaks. It just works.

A combination of Intelligent Caching and standard caching has enabled me to completely remove rate limiting on the static endpoints, and increase the rate limits on the other endpoints by 10x. You can read more about Intelligent Caching in the official API docs.

This is all possible due to the extremely powerful Cloudflare Workers! I’m now using Workers to do many other cool things which I will blog about another time. Check them out and let me know what you build.

Finally, I will leave you with another diagram to illustrate just how Intelligent Caching works:

