Our colleagues in the GOV.UK Email team send out notification emails to around 750,000 GOV.UK users who have signed up for alerts about significant content changes.
Ashley Stephens, Government as a Platform Programme Director, announced yesterday, as part of our #GovPlatforms week, that the GOV.UK Email team is now using the GOV.UK Notify platform to deliver email notifications.
GOV.UK Notify is now processing an extra 500 million messages a year because of this migration. We’ve recently optimised GOV.UK Notify to make sure we can handle this significant increase in traffic. This post explains what we did to prepare.
How GOV.UK Notify works
GOV.UK Notify is hosted by GOV.UK PaaS and uses the Amazon Web Services Simple Email Service (SES). GOV.UK Notify processes notification requests for our service users in the following way.
- The GOV.UK Notify Application Programming Interface (API) puts emails that need to be sent into a queue.
- GOV.UK Notify uses delivery workers (which are applications that run on GOV.UK PaaS) to pick up emails from the queue and send them out via the GOV.UK Notify email service provider.
- The email delivery receipts are placed in another queue and processed by GOV.UK Notify’s receipt workers, which are also applications that run on GOV.UK PaaS.
- Service teams using GOV.UK Notify can then choose to use the API to check whether a notification was successful or not, or they can provide an endpoint for us to post the outcome to.
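As a rough illustration, the flow above can be sketched in Python. This is a simplified in-memory model, not GOV.UK Notify's actual code: the function names, the `statuses` dictionary standing in for the database, and the `fake_provider_send` function standing in for the email provider are all assumptions made for the example.

```python
import queue
import uuid

send_queue = queue.Queue()     # emails waiting to be sent
receipt_queue = queue.Queue()  # delivery receipts waiting to be processed
statuses = {}                  # notification id -> status (stand-in for a database)


def accept_request(email_address, body):
    """API step: record the notification and queue it for sending."""
    notification_id = str(uuid.uuid4())
    statuses[notification_id] = "created"
    send_queue.put((notification_id, email_address, body))
    return notification_id


def fake_provider_send(email_address, body):
    """Hypothetical stand-in for the email provider; always reports success."""
    return "delivered"


def run_delivery_worker():
    """Delivery worker: send each queued email and queue its receipt."""
    while not send_queue.empty():
        notification_id, email_address, body = send_queue.get()
        statuses[notification_id] = "sending"
        outcome = fake_provider_send(email_address, body)
        receipt_queue.put((notification_id, outcome))


def run_receipt_worker():
    """Receipt worker: record the outcome of each delivery."""
    while not receipt_queue.empty():
        notification_id, outcome = receipt_queue.get()
        statuses[notification_id] = outcome


def get_status(notification_id):
    """What a service team would see when checking a notification."""
    return statuses[notification_id]
```

In this sketch the workers are called one after the other. In a real deployment they would run continuously as separate processes, and receipts would arrive asynchronously from the provider.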
GOV.UK Notify sent between 5,000 and 42,000 emails per day in the last 60 days of 2017. On average, we sent 18,000 emails per day. Our biggest spike of requests during that period was around 10,000 requests received over one hour, as illustrated in Figure 1.
GOV.UK Email is currently GOV.UK Notify’s biggest user and our peak hourly traffic is now 1 million messages per hour.
Performance metrics and testing
GOV.UK Notify measures the percentage of messages sent within 10 seconds as a performance metric. In the last 60 days of 2017, we delivered 99.5% of messages within 10 seconds. We had to prepare GOV.UK Notify to send a peak of 1 million emails per hour, and an average of 1.5 million emails per day to meet GOV.UK Email’s delivery needs.
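A metric like "percentage of messages sent within 10 seconds" could be calculated along these lines. This is a minimal sketch: the function name is hypothetical, and treating exactly 10 seconds as "within" the threshold is an assumption.

```python
def percent_within(delivery_seconds, threshold=10.0):
    """Return the percentage of deliveries that took `threshold` seconds or less."""
    if not delivery_seconds:
        raise ValueError("need at least one measurement")
    # Assumption: a delivery taking exactly `threshold` seconds counts as on time.
    on_time = sum(1 for seconds in delivery_seconds if seconds <= threshold)
    return 100.0 * on_time / len(delivery_seconds)
```

For example, `percent_within([1.2, 3.4, 9.9, 10.0, 15.0])` counts 4 of the 5 deliveries as on time.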
We measured GOV.UK Notify against 3 performance metrics to check if we could meet expectations:
- API response time - the time between a service provider sending an HTTP POST request to GOV.UK Notify and the task being placed in the email sending queue, after the data had been validated and the database updates were complete
- throughput - the total number of notification requests per second GOV.UK Notify could process
- round trip time - the time in seconds it took to confirm an email had been sent to a user after GOV.UK Notify received the initial request
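To make the 3 definitions concrete, here is a small sketch that derives them from per-request timestamps. The `RequestTiming` record and its field names are assumptions made for the example, and throughput is approximated here as requests divided by the time from the first request received to the last request queued.

```python
from dataclasses import dataclass


@dataclass
class RequestTiming:
    received_at: float  # when the HTTP POST request arrived (seconds)
    queued_at: float    # when the task was placed in the sending queue
    sent_at: float      # when sending to the user was confirmed


def api_response_time(timing):
    """Seconds between receiving the request and queueing the task."""
    return timing.queued_at - timing.received_at


def round_trip_time(timing):
    """Seconds between receiving the request and confirming the send."""
    return timing.sent_at - timing.received_at


def throughput(timings):
    """Requests processed per second over the measured window (approximation)."""
    start = min(t.received_at for t in timings)
    end = max(t.queued_at for t in timings)
    if end <= start:
        raise ValueError("measurement window must be longer than zero seconds")
    return len(timings) / (end - start)
```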
We used Gatling, an open-source load and performance testing tool, to measure these metrics. We configured and ran 4 tests:
- load testing (baseline) - helped us understand GOV.UK Notify’s behaviour under expected concurrent user traffic load
- spike testing - showed us how GOV.UK Notify handled sudden increases and decreases in request load
- stress testing - helped us understand the upper limits of the current GOV.UK Notify system under extreme load, based on the predicted overall peak traffic
- soak testing - combined all the above tests for different periods of time
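The 4 test types differ mainly in the shape of the request load over time. The sketch below expresses each shape as a list of requests per second, one entry per second. It is written in plain Python for illustration and is not Gatling configuration (Gatling has its own way of describing load), and all function names and parameters are hypothetical.

```python
def load_profile(baseline_rps, duration_s):
    """Baseline load test: a steady rate for the whole run."""
    return [baseline_rps] * duration_s


def spike_profile(baseline_rps, spike_rps, duration_s, spike_start_s, spike_length_s):
    """Spike test: a steady rate with one sudden burst, then back to baseline."""
    profile = [baseline_rps] * duration_s
    for second in range(spike_start_s, min(spike_start_s + spike_length_s, duration_s)):
        profile[second] = spike_rps
    return profile


def stress_profile(start_rps, step_rps, step_length_s, steps):
    """Stress test: raise the rate step by step to look for the upper limit."""
    profile = []
    for step in range(steps):
        profile.extend([start_rps + step * step_rps] * step_length_s)
    return profile


def soak_profile(profiles, repeats):
    """Soak test: run the other profiles back to back, repeated over a long period."""
    combined = [rps for profile in profiles for rps in profile]
    return combined * repeats
```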
GOV.UK Notify uses continuous integration and 3 environments - preview, staging and production. We performed the tests in our staging environment by making a copy of our production system.
Test results and upgrades to GOV.UK Notify
Our throughput results showed that GOV.UK Notify could handle approximately 170 notification requests per second. We needed to offer GOV.UK Email at least 360 notification requests per second to match the 1 million per hour delivery peak previously provided by its third-party provider. Our infrastructure at that point would not have been able to cope with the predicted increase in traffic for a number of reasons:
- we could not scale the number of delivery receipt workers and database workers any further, which capped throughput at 170 notification requests per second
- our high-availability database, which runs on GOV.UK PaaS, only supported 500 concurrent connections and was maxing out its CPU and RAM in periods of high demand
- our throttling limits were set at 3,000 requests per minute for each individual service user
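Converting these figures to a common unit shows the size of the gap. The helper names below are made up for this example, and the numbers are taken from this post.

```python
def per_hour_to_per_second(count_per_hour):
    """Convert an hourly rate to a per-second rate."""
    return count_per_hour / 3600


def per_minute_to_per_second(count_per_minute):
    """Convert a per-minute rate to a per-second rate."""
    return count_per_minute / 60


def per_second_to_per_hour(count_per_second):
    """Convert a per-second rate to an hourly rate."""
    return count_per_second * 3600


# A peak of 1 million emails per hour, spread evenly, is about 277.8 per second.
peak_rps = per_hour_to_per_second(1_000_000)

# The measured 170 requests per second is 612,000 per hour, short of 1 million.
measured_per_hour = per_second_to_per_hour(170)

# The default throttling limit of 3,000 per minute is 50 requests per second.
default_limit_rps = per_minute_to_per_second(3000)
```

The 360 requests per second target is higher than the roughly 278 per second that an evenly spread 1 million per hour would need. The post does not say how the 360 figure was derived.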
To prepare our system to meet anticipated traffic needs, we:
- upgraded our high-availability database from the medium subscription package to large - we now have 5 times more storage space (512GB) and support 10 times the number of concurrent connections (5,000), with more CPU and RAM capacity
- changed the throttling limits of incoming traffic to match the specific needs of our service users - we provide GOV.UK Email with a limit of 24,000 requests per minute (compared to the default limit of 3,000 requests per minute), which works out at 400 requests per second and so covers the 360 notification requests per second they need
- increased GOV.UK Notify’s daily log size to store the additional logging information generated by the increased traffic
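Per-service throttling of this kind can be sketched as a fixed-window counter with a default limit and per-service overrides. This is only an illustration under stated assumptions, not how GOV.UK Notify implements throttling: the class, the fixed one-minute window and the service identifier `"govuk-email"` are all hypothetical.

```python
DEFAULT_LIMIT_PER_MINUTE = 3000


class PerServiceRateLimiter:
    """Fixed-window rate limiter with a default limit and per-service overrides."""

    def __init__(self, overrides=None, default_limit=DEFAULT_LIMIT_PER_MINUTE):
        self.overrides = dict(overrides or {})
        self.default_limit = default_limit
        # (service_id, minute window) -> requests accepted in that window.
        # Old windows are never cleaned up in this sketch.
        self.counts = {}

    def limit_for(self, service_id):
        """Return the requests-per-minute limit that applies to a service."""
        return self.overrides.get(service_id, self.default_limit)

    def allow(self, service_id, now_seconds):
        """Accept the request if the service is under its limit for this minute."""
        window = int(now_seconds // 60)
        key = (service_id, window)
        used = self.counts.get(key, 0)
        if used >= self.limit_for(service_id):
            return False
        self.counts[key] = used + 1
        return True
```

A limiter created with `PerServiceRateLimiter({"govuk-email": 24000})` would accept up to 24,000 requests per minute from that one service while every other service stays on the default of 3,000.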
After we made these changes, we paired with the GOV.UK Email team to do more load testing. The Email team tested their new features on our staging environment, and we got to test our system performance. A win-win situation!
GOV.UK Email went live with the GOV.UK Notify platform on 7 March and we are confident we can support an additional 500 million email deliveries per year and continue to scale.
We predict the number of daily messages sent through GOV.UK Notify will continue to increase month on month as more services across the wider public sector start using Notify to meet their messaging needs.
If this sounds like a good place to work, take a look at Working for GDS - we're usually in search of talented people to come and join the team.