Over the last few months, I’ve been leading a small team building tools to automate creation and configuration of our servers.
Tools to automate provisioning
Currently, our production environment is hosted with Skyscape, and we manage it using VMware vCloud Director. Skyscape, and other providers using vCloud Director, expose a UI for managing VMs and networks.
However, a UI isn't suitable when you need to bring a lot of machines up in a repeatable, reliable way, for example when migrating the site to a new platform. In addition, if you can automate provisioning, the configuration files represent your infrastructure as code, providing documentation about your system.
Our suppliers currently all use vCloud Director
The UK Government’s policy is cloud first and we procure services through CloudStore (which uses the G-Cloud framework). As government, it’s vital that we provide secure, resilient services that citizens can trust, and we need various assurances from our suppliers to support that.
A number of Infrastructure as a Service (IaaS) platforms are able to provide these assurances, but when we procured our current suppliers last year, the vendors that met our requirements (both technical and non-functional, eg self-service provisioning, a pay-as-you-go cost model) used vCloud Director.
We very much want to encourage potential new suppliers, but given the availability in the market, we can be reasonably certain we’ll be using at least one VMware-based solution for at least the next 12-18 months, and it’s likely that many other transformation projects will also be drawing on the same pool of hosting providers. So it’s definitely worth investing time to make provisioning easy.
Preventing vendor lock-in
Having an easy way to provision environments will allow us to move more easily between vendors that use vCloud Director, which will help prevent supplier lock-in. But we also don’t want to lock ourselves in to VMware, so we are taking several steps to guard against that:
- We access the vCloud API via an Open Source library called fog which can communicate with multiple types of cloud providers. If we add different vendor solutions in the future, we can extend our existing tools.
- Within our suite of vCloud Tools, the interaction with fog and the vCloud API takes place in vCloud Core, which the other tools depend on. This means any changes to use a different provider would be localised and should not affect the work we’ve done in the other tools.
- We’ve aimed to restrict our usage of VMware to functionality that has equivalents on other platforms, for example, Load Balancers, Firewalls and NAT, so we don’t have an infrastructure design that can’t be moved.
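To illustrate the second point, here is a minimal sketch of the idea of keeping all provider interaction in one place. It is not the vCloud Tools code: fog is a Ruby library, Python is used here only for illustration, and every name in it (`VmSpec`, `ComputeProvider`, `FakeVcloudProvider`, `launch`) is hypothetical. The fake provider records requests instead of calling a real API, so the sketch runs on its own.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class VmSpec:
    """Provider-neutral description of a machine (hypothetical fields)."""
    name: str
    cpus: int
    memory_mb: int


class ComputeProvider(Protocol):
    """The only operations the higher-level tools are allowed to rely on."""

    def create_vm(self, spec: VmSpec) -> str: ...


class FakeVcloudProvider:
    """Stand-in for the single module that would talk to the provider API.

    A real implementation would call a cloud library here. This one only
    records what was requested, so no network access is needed.
    """

    def __init__(self) -> None:
        self.created: list[str] = []

    def create_vm(self, spec: VmSpec) -> str:
        self.created.append(spec.name)
        return f"vcloud-{spec.name}"


def launch(provider: ComputeProvider, specs: list[VmSpec]) -> list[str]:
    """A higher-level tool: it sees only the ComputeProvider interface."""
    return [provider.create_vm(spec) for spec in specs]


specs = [
    VmSpec("frontend-1", cpus=2, memory_mb=4096),
    VmSpec("backend-1", cpus=4, memory_mb=8192),
]
provider = FakeVcloudProvider()
vm_ids = launch(provider, specs)
```

In a layout like this, supporting a different provider would mean writing another class with the same `create_vm` method, while `launch` and anything else built on the interface stays unchanged.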
Previous iterations of provisioning tooling
Automation of provisioning is something we’ve been iterating on since we launched GOV.UK 18 months ago. Prior to vCloud Tools, we used a tool we had built called vCloud Provisioner. This was really useful when moving onto our first platform, but because it was built quickly it has a lot of hard-coded detail about the environment, so it doesn’t help us be flexible when moving between suppliers. In addition, these hard-coded details include sensitive information, so we can’t share the tool, meaning it cannot be useful to anyone outside of the GOV.UK team.
Several members of the infrastructure team worked on subsequent iterations of the provisioning tooling. These iterations included vCloud Box Spinner and vcloudtools. However, the people working on them were doing so alongside their other business-as-usual work keeping GOV.UK up and running, so when they ran into issues, it was difficult to find time to address them. With the migration to the new platform looming, we needed to prioritise this piece of work in order to be able to bring up a new environment quickly, reliably and easily.
We think vCloud Tools will be more robust
There are several things we have done to improve the chances of producing robust, more widely useful tools in this iteration.
We have committed the time and resources to this, forming a small team who focus entirely on vCloud Tools rather than firefighting or other operations work. We are also coding in the open, so we won’t fall into the trap of including sensitive information that is later too hard to factor out.
Not only are the GOV.UK team “eating our own dogfood” by using vCloud Tools to provision our new hosting environments, but two other GDS teams are also using the tools: the Identity Assurance Programme and the Performance Platform. Other exemplar projects have started using the tools, and we have already accepted pull requests from people we do not know, so there are the beginnings of a community around the tools. This keeps us from making the tools too GOV.UK-specific and means that we get a lot of extremely useful user feedback.
And in the meantime we are contributing back to Open Source - in one recent release of fog, a huge number of the contributions were from the GOV.UK vCloud Tools team (who were, at that time, me (Anna Shipman), Sneha Somwanshi, Mike Pountney and Dan Abel).
These factors mean that we feel confident that we are going to produce tools that will continue to be very useful to us and other teams for some time.
More will follow
There will be future posts about particular aspects of what we’ve done and how, including one soon on how we used vCloud Tools in our platform migration. If there’s anything you’d particularly like more detail on, please let us know in the comments below.
If work like this appeals to you, take a look at Working for GDS - we're usually in search of talented people to come and join the team.