As outlined recently by our executive director Stephen Foreshew-Cain in his vision for 2030, “platform thinking” is going to become a key part of government. We’re seeing this happen with the emergence of platforms such as GOV.UK Verify, GOV.UK Notify, Registers and the HMRC developer platform, all built on Application Programming Interfaces (APIs).
At GDS we’ve been thinking more about how we approach APIs. It’s important to clarify elements of our approach as we continue to build Government as a Platform and as other parts of government regularly request clearer advice in this area.
To help ensure services are widely adopted and speedily developed, we want government APIs to be consistent if not necessarily uniform. While developing this consistency, we may challenge assumptions and document our findings, eg we recently shared how we have used JSON web tokens as a method of authentication on the GOV.UK Notify platform.
What we mean by APIs
The term API has become shorthand for describing any type of computing system built on top of another. APIs allow software to be bound together.
Increasingly, the world is built on APIs, including many high-volume services that people use every day. Jeff Bezos, chief executive of Amazon, famously produced a company-wide memo on APIs in 2002, mandating their use. This focus on APIs has been a crucial part of turning Amazon from a bookseller into a company at the centre of an ecosystem – enabling other companies and organisations to thrive.
These days almost every service you use on the web or through a mobile app makes use of an API. This explosion has created a wealth of systems, standards, patterns and approaches to building, provisioning, documenting, publishing, managing, securing, consuming and monitoring APIs.
The APIs we are most interested in at GDS tend to be application-to-application APIs that are accessed over a public network. These are usually referred to by the industry as “web APIs” because they use web technologies and protocols and are often (but not always) used by online applications.
APIs are built on standards
Because we are talking about government APIs, building on standards means complying with the Open Standards Principles of open access, consensus-based open process and royalty-free licensing.
These open standards encourage software reuse, which in turn brings down procurement and development costs. Such standards also help ecosystems avoid the friction associated with patents and licensing fees.
We plan to consider API standards in a series of blog posts, and this initial post will start with the basics.
HTTP, HTTPS and HTTP/2
In the world of APIs, you can’t get more basic than the Hypertext Transfer Protocol (HTTP), designed in 1991 by Tim Berners-Lee as a way to shuttle documents back and forth over the internet.
Partly because of HTTP’s current ubiquity, it transmits data across the internet more easily than many other protocols and now plays a key role in the deployment of web APIs. Systems built on top of HTTP that make use of simple HTTP “verbs” such as GET and POST are often referred to as RESTful. We tend to favour RESTful APIs at GDS because of this simplicity.
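To make the idea of HTTP verbs concrete, here is a minimal sketch using only the Python standard library. It is not taken from any GDS service: the “notes” resource, its paths and the helper names are all invented for illustration. GET reads a resource and POST creates one.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory store; the "notes" resource is invented for illustration.
NOTES = {"1": {"text": "hello"}}

# Bypass any system proxy settings, since this example only talks to localhost.
OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


class NotesHandler(BaseHTTPRequestHandler):
    def _send_json(self, status, body):
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def do_GET(self):
        # GET /notes/<id> retrieves a resource without changing it.
        note_id = self.path.rstrip("/").split("/")[-1]
        if note_id in NOTES:
            self._send_json(200, NOTES[note_id])
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self):
        # POST /notes creates a new resource from the JSON request body.
        length = int(self.headers.get("Content-Length", 0))
        note = json.loads(self.rfile.read(length))
        note_id = str(len(NOTES) + 1)
        NOTES[note_id] = note
        self._send_json(201, {"id": note_id})

    def log_message(self, format, *args):
        pass  # keep the example quiet


def start_server():
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), NotesHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call(method, url, body=None):
    # Send one request and return (status code, decoded JSON body).
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(url, data=data, method=method)
    if data is not None:
        request.add_header("Content-Type", "application/json")
    with OPENER.open(request) as response:
        return response.status, json.loads(response.read())
```

Calling start_server() and then call("GET", ...) or call("POST", ...) against the returned port exercises both verbs. A real service would add validation, authentication and, as discussed next, HTTPS.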
Increasingly when we talk about HTTP, we actually mean HTTPS, which adds a layer of encryption to the HTTP protocol. Over the past few years there has been a movement to make secure HTTPS the default transport of the web - for instance the report from the W3C’s Technical Architecture Group on securing the web.
These days it’s increasingly rare to find a website not using HTTPS to secure its connection, and this goes for web APIs as well. If requests to web APIs are being made directly from a web application served over HTTPS, then the API must be served over HTTPS as well or you’ll trigger mixed content warnings. But even if the API is being called from a server-based application, it’s still imperative to use HTTPS to preserve user privacy and ensure data integrity.
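One way a client can hold itself to this is to refuse to call an API over plain HTTP and to verify certificates when it does connect. The sketch below is an illustration only: the function names are our own invention, and it relies on the default settings of Python’s standard ssl module.

```python
import ssl
from urllib.parse import urlsplit


def require_https(url):
    """Return the URL unchanged if it uses HTTPS, otherwise raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError("refusing non-HTTPS API URL: %r" % url)
    return url


def strict_tls_context():
    """Build a client TLS context that verifies certificates and host names."""
    context = ssl.create_default_context()
    # create_default_context() already enables both checks for clients;
    # the assertions simply document the intent.
    assert context.verify_mode == ssl.CERT_REQUIRED
    assert context.check_hostname is True
    return context
```

The context returned by strict_tls_context() can be passed to urllib.request.urlopen through its context argument.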
Over the last year it has become easier to get HTTPS up and running with the advent of free certificate providers such as Let’s Encrypt. CESG have developed some additional in-depth material on this topic. You have no excuse any more.
HTTP was recently revised for the first time in 15 years with the development of HTTP/2. Although HTTP/2 reinvents HTTP in many respects, notably in that it is a binary protocol, it has the same semantics as HTTP. HTTP/2 was built to optimise performance in the browser, but many of its characteristics will also boost the performance of web APIs. HTTP/2 is clearly the future but industry take-up remains fairly limited (especially amongst high-volume commercial websites) since the protocol is a relatively new development.
We haven’t begun using HTTP/2 yet at GDS for our web sites or APIs. As we explore its use, we’ll be posting about it here. You can read more about what the HTTP working group has to say about this topic in their FAQ.
Making use of URIs
Even more fundamental than HTTP is the use of Uniform Resource Identifiers (URIs). The W3C views URIs as the page one, paragraph one architecture of the web. Although URIs are technically a superset of Uniform Resource Locators (URLs), the terms URI and URL are often used interchangeably. URIs are most often thought of as a way to get something online, such as interacting with a website or addressing a web API. But URIs can also be used as unique identifiers, especially in the context of data.
APIs themselves are invoked over (HTTP) URIs. When these HTTP calls return data, that data should contain additional URIs as identifiers where appropriate. GDS strives to make use of URIs for identification wherever possible, eg the use of shorthand URIs (or CURIEs) in Registers.
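A CURIE is a compact “prefix:reference” form that expands to a full URI once the prefix is mapped to a base. The sketch below shows that expansion in general terms. The prefix map and the example.gov.uk base URI are invented for illustration and are not the real Registers configuration.

```python
# Hypothetical prefix map: both the prefix and the base URI are invented.
PREFIXES = {"country": "https://country.example.gov.uk/record/"}


def expand_curie(curie, prefixes=PREFIXES):
    """Expand a 'prefix:reference' CURIE into a full URI."""
    prefix, separator, reference = curie.partition(":")
    if not separator or prefix not in prefixes:
        raise ValueError("unknown or malformed CURIE: %r" % curie)
    return prefixes[prefix] + reference
```

With this map, expand_curie("country:GB") yields a full URI that identifies the record and can also be dereferenced over HTTP.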
XML and JSON
After the W3C developed eXtensible Markup Language (XML), a lightweight version of Standard Generalised Markup Language, in the 90s, it quickly became a lingua franca for machine-readable data and web APIs.
Unfortunately, XML had some issues. Amongst these is that XML, and the cascade of XML-based standards associated with it, are difficult to work with, error prone, brittle, poorly performing and unnecessarily verbose. The consensus of the web developer community has been to reject XML, especially when it comes to web APIs, in favour of simpler formats for shuttling data around the web. JavaScript Object Notation (JSON) is the format of choice for the developer community and it has in some sense become the bedrock for web APIs.
Although JSON was initially standardised by ECMA as part of the JavaScript language definition, the format has since outgrown its roots and spawned many interoperable open source implementations in multiple programming languages. JSON has also been published as a standard by the IETF, cementing its role as one of the fundamental underpinnings of the internet.
At GDS, our format of choice for all web APIs tends to be JSON. We’d only build something using XML in exceptional cases, eg if we needed to connect to a legacy system that only speaks XML, or when required to comply with a broadly adopted standard (eg, SAML) where there is clearly demonstrable benefit.
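As a rough illustration of the difference in verbosity, the sketch below serialises the same small record as JSON and as XML using the Python standard library. The record and the XML element names are made up for the example, and the XML layout is only one of many possible mappings.

```python
import json
import xml.etree.ElementTree as ET

# Illustrative record; the field names are assumptions, not a published schema.
RECORD = {"country": "GB", "name": "United Kingdom"}


def to_json(record):
    """Serialise a flat record as a JSON object with keys in sorted order."""
    return json.dumps(record, sort_keys=True)


def to_xml(record, root_tag="record"):
    """Serialise a flat record as XML, one child element per field."""
    root = ET.Element(root_tag)
    for key in sorted(record):
        ET.SubElement(root, key).text = str(record[key])
    return ET.tostring(root, encoding="unicode")
```

Even for two fields, the XML form repeats every field name in an opening and a closing tag, while the JSON form maps directly onto the dictionaries and lists most languages already have.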
The CSV data format
Comma separated values (CSV) is a granddaddy of data formats but it has often been neglected as developers prefer formats less prone to error.
The fact is though, there’s an enormous amount of data out there in CSV format and a lot of that data is on the web. You can’t write off CSV and this realisation led to W3C’s creation of the CSV on the Web Working Group. The work of this group and their primer provide a solid introduction and framework for publishing and dealing with CSV and other tabular web data.
GDS believe it’s important to make data available in CSV format when publishing bulk data. This ensures people can use a wide range of tools, including off-the-shelf software, to import and analyse this data.
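Much of CSV’s reputation for errors comes from hand-built strings that mishandle commas and quotes inside values. A minimal sketch of a bulk export that leaves the quoting to Python’s standard csv module might look like this. The function name and the sample rows are illustrative assumptions.

```python
import csv
import io


def records_to_csv(records, fieldnames):
    """Render a list of dictionaries as CSV text with a header row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    # The writer quotes any value that contains a comma, quote or newline.
    writer.writerows(records)
    return buffer.getvalue()


# Illustrative rows; the second name contains a comma on purpose.
SAMPLE_ROWS = [
    {"country": "GB", "name": "United Kingdom"},
    {"country": "XX", "name": "Example, Republic of"},
]
```

The output of records_to_csv(SAMPLE_ROWS, ["country", "name"]) opens cleanly in spreadsheet software and reads back unchanged with csv.DictReader.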
Putting our approach into practice
So there you have it: HTTP(S), URIs, JSON and CSV. But what does it all look like when you put it together?
You need look no further than our recently released Registers API. The Countries Register provides a list of countries recognised by the UK Foreign Office. The data is served by a simple, RESTful API over HTTP(S). It can be served in JSON and CSV, and is retrieved via URIs, with CURIEs embedded in its payload.
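To show how these pieces can fit together in general, here is a small sketch in which one record is addressed by a URI path and the file extension selects JSON or CSV. It is not the Registers implementation: the path layout, field names and function name are assumptions made for the example.

```python
import csv
import io
import json

# Illustrative data; the key and field names are assumptions.
STORE = {"GB": {"country": "GB", "name": "United Kingdom"}}


def represent(path, store=STORE):
    """Return (content_type, body) for a path such as /record/GB.json."""
    name = path.rsplit("/", 1)[-1]
    key, dot, extension = name.rpartition(".")
    if not dot or key not in store:
        raise KeyError(path)
    record = store[key]
    if extension == "json":
        return "application/json", json.dumps(record, sort_keys=True)
    if extension == "csv":
        buffer = io.StringIO(newline="")
        writer = csv.DictWriter(buffer, fieldnames=sorted(record))
        writer.writeheader()
        writer.writerow(record)
        return "text/csv", buffer.getvalue()
    raise ValueError("unsupported format: %r" % extension)
```

The same record is then available as represent("/record/GB.json") for applications and as represent("/record/GB.csv") for spreadsheet users, each at a stable URI.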
Of course, that’s all well and good for APIs like Registers which are open, mostly about data retrieval and don’t involve complex transactions. But what happens when your needs go beyond that? We plan to delve into this topic in subsequent blog posts, covering areas such as authentication, versioning and API management techniques. Stay tuned.
You can follow Dan on Twitter, sign up now for email updates from this blog or subscribe to the feed.