The GOV.UK Verify Proxy Node is a component of the Government Digital Service (GDS) integration with the eIDAS framework. The Proxy Node is deployed on the GDS Supported Platform (GSP), a Kubernetes-based platform that provides standard systems and tooling for building, deploying and monitoring GDS applications.
Choosing an open source tool
We knew regulatory requirements and the sensitive nature of the data shared between GDS and the eIDAS framework meant that security was crucial. We had several sessions with the GDS Information Assurance team, the project team and colleagues from the National Cyber Security Centre.
We decided to use open source Istio to mitigate some of the risks. We chose Istio because it provides:
- layer 7 routing rules that work with the layer 4 network policy resources native to Kubernetes
- mutual TLS, both internal and external
- HTTP request end-to-end tracing across multiple microservices
- egress traffic control
There are also several other aspects of Istio that made it an attractive choice for the team, including:
- traffic shaping, to support canary-style deployments and A/B testing
- service to service authorisation rules
Securing the Proxy Node
The GSP is based on AWS Elastic Kubernetes Service (EKS). To provide security for the Proxy Node and its data we used an AWS CloudHSM, with strict controls over which components were able to connect to this external resource. There are 2 components, residing in different namespaces, that connect to the CloudHSM over a TCP connection on ports 2223-5.
We were using EKS version 1.12 and Istio version 1.1.4.
We installed Istio using a Helm chart. Helm is a tool that helps manage Kubernetes applications.
These are the relevant parts of the values.yaml:
global:
  k8sIngressSelector: ingressgateway
  mtls:
    enabled: true
  proxy:
    accessLogFile: "/dev/stdout"
istio_cni:
  enabled: true
gateways:
  istio-egressgateway:
    enabled: true
    ports:
    - port: 80
      name: http2
    - port: 443
      name: https
    # This is the port where sni routing happens
    - port: 15443
      targetPort: 15443
      name: tls
    - port: 2223
      name: tcp-cloudhsm-2223
sidecarInjectorWebhook:
  enableNamespacesByDefault: true
  rewriteAppHTTPProbe: true
We followed the example given in Istio’s documentation to disable arbitrary direct egress, forcing all egress traffic (out of the cluster) to route via something in the istio-system namespace:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-egress-to-istio-system-and-kube-dns
  namespace: proxy-node
spec:
  podSelector: {}
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          kube-system: "true"
    ports:
    - protocol: UDP
      port: 53
  - to:
    - namespaceSelector:
        matchLabels:
          istio: system
  - to:
    - namespaceSelector:
        matchLabels:
          namespace: proxy-node
If you’re doing something similar, you’ll need to add the kube-system and istio labels referenced here, as they are not present in a default install.
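For example, whether you add them with kubectl label or by patching the namespace metadata, the labels need to end up matching the NetworkPolicy selectors above. This is a sketch only, not configuration we are quoting from the project:

# Sketch: namespace labels matching the selectors in the NetworkPolicy above.
# These labels are not present in a default install.
apiVersion: v1
kind: Namespace
metadata:
  name: kube-system
  labels:
    kube-system: "true"
---
apiVersion: v1
kind: Namespace
metadata:
  name: istio-system
  labels:
    istio: system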
We also added a to block, which was not included in the Istio documentation example, to allow all egress within the namespace. Most people find it easier to reason about traffic rules on an ingress basis, so this block allows that by re-enabling all local egress.
At this point of the project, we’d:
- disabled arbitrary egress from all but the istio-system and kube-system namespaces - we plan on limiting arbitrary egress from these namespaces in the future
- enabled mutual TLS globally within the Istio service mesh (sketched after this list)
- added an injected Istio sidecar to every pod in the affected namespaces
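For reference, in Istio 1.1 setting global.mtls.enabled: true in the Helm values results in a mesh-wide authentication policy roughly equivalent to the following. This is a sketch of what the chart creates for you, not something we applied by hand:

# Sketch: the mesh-wide mutual TLS policy implied by global.mtls.enabled in Istio 1.1
apiVersion: authentication.istio.io/v1alpha1
kind: MeshPolicy
metadata:
  name: default
spec:
  peers:
  - mtls: {}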
The next task was to use Istio’s traffic management resources to allow the 2 components to communicate with the CloudHSM. We used a blog post on using external services to help with this.
However, that post and most of the examples in the Istio documentation only covered a single endpoint from a single application in a single namespace. Our need spanned several namespaces.
Figuring out which of the resources documented in the examples needed to be placed in which namespaces was the piece missing from the existing documentation.
As the CloudHSM is outside the service mesh, we needed to add a ServiceEntry. Some of the configuration goes into a “common” namespace, which in this case is istio-system, while other parts of the configuration are applied to the application-specific namespaces.
apiVersion: networking.istio.io/v1alpha3
kind: ServiceEntry
metadata:
  name: cloudhsm-2223
  namespace: istio-system
spec:
  hosts:
  - cloudhsm-2223.tcp.svc
  addresses:
  - 10.100.100.100/32
  ports:
  - name: tcp-2223
    number: 2223
    protocol: TCP
  location: MESH_EXTERNAL
  resolution: STATIC
  endpoints:
  - address: 10.100.100.100
We did not use the hosts entry as a fully qualified domain name in the DNS sense, as the connection to the CloudHSM is made by IPv4 address. However, we used it to tie together the various resources for this particular route.
You’ll see the “-2223” suffix in the code examples. This is because the CloudHSM connectivity spans ports 2223-5, so all of these resources need repeating for 2224 and 2225. The code examples here will only show 2223.
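As an illustration of that repetition, the extra istio-egressgateway port entries in the Helm values would follow the same pattern as the 2223 entry shown earlier. This is a sketch only:

# Sketch: additional egress gateway ports, mirroring tcp-cloudhsm-2223
- port: 2224
  name: tcp-cloudhsm-2224
- port: 2225
  name: tcp-cloudhsm-2225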
We needed to redirect the traffic leaving the pod to the istio-egressgateway in the istio-system namespace. That required a VirtualService, which we needed to add to each namespace that wants to make use of this connection.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: proxy-node-cloudhsm-2223-egress
  namespace: proxy-node
spec:
  hosts:
  - cloudhsm-2223.tcp.svc
  gateways:
  - mesh
  tcp:
  - match:
    - gateways:
      - mesh
      destinationSubnets:
      - 10.100.100.100/32
      port: 2223
      sourceLabels:
        talksToHsm: "true"
    route:
    - destination:
        host: istio-egressgateway.istio-system.svc.cluster.local
        subset: proxy-node-cloudhsm-2223-egress
        port:
          number: 443
  exportTo:
  - "."
The use of sourceLabels limits which pods are allowed to use this route. Similarly, exportTo limits the exposure of the route definition to the current namespace, to prevent it overlapping with similar routes defined in other namespaces.
We added a redirection in port from 2223 to 443 because the connection with the istio-egressgateway is secured with mutual TLS. Traffic from this pod is tagged as being part of an Istio subset, the behaviour of which is governed by a DestinationRule that handles connectivity with the istio-egressgateway.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: proxy-node-egressgateway-for-cloudhsm-2223
  namespace: proxy-node
spec:
  host: istio-egressgateway.istio-system.svc.cluster.local
  subsets:
  - name: proxy-node-cloudhsm-2223-egress
    trafficPolicy:
      portLevelSettings:
      - port:
          number: 443
        tls:
          mode: ISTIO_MUTUAL
          sni: cloudhsm-2223.tcp.svc
  exportTo:
  - "."
The istio-egressgateway needs to be configured to listen for the incoming connection. We did this with a Gateway resource.
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: cloudhsm-egress-2223
  namespace: istio-system
spec:
  selector:
    istio: egressgateway
  servers:
  - port:
      number: 443
      name: tls-cloudhsm-2223
      protocol: TLS
    hosts:
    - cloudhsm-2223.tcp.svc
    tls:
      mode: MUTUAL
      serverCertificate: /etc/certs/cert-chain.pem
      privateKey: /etc/certs/key.pem
      caCertificates: /etc/certs/root-cert.pem
The locations of the certificates and keys are important. These are provided by Istio (Citadel, specifically) and mounted automatically.
At this point, the istio-egressgateway needed a route definition for what to do with the CloudHSM-bound traffic that reaches it. We used a VirtualService for this.
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: cloudhsm-egress-2223
  namespace: istio-system
spec:
  hosts:
  - cloudhsm-2223.tcp.svc
  gateways:
  - cloudhsm-egress-2223
  tcp:
  - match:
    - gateways:
      - cloudhsm-egress-2223
      port: 443
    route:
    - destination:
        host: cloudhsm-2223.tcp.svc
        port:
          number: 2223
      weight: 100
The final piece of the puzzle is the DestinationRule that handles the last segment of the route.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: proxy-node-cloudhsm-2223-egress
  namespace: proxy-node
spec:
  host: cloudhsm-2223.tcp.svc
  exportTo:
  - "."
We did not need this DestinationRule when routing an HTTP, HTTPS or TLS service via the istio-egressgateway. It only appears to be necessary when routing TCP services. It’s also not clear why this resource cannot live in the common namespace.
At this point, a pod in the proxy-node namespace that carries the label talksToHsm: "true" will be able to establish a TCP connection to the CloudHSM, via mutual TLS with the istio-egressgateway.
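As a usage sketch, a workload only needs that label in its pod template to match the VirtualService’s sourceLabels. The deployment name and image here are hypothetical, not part of the Proxy Node itself:

# Sketch: a hypothetical Deployment whose pods carry the label matched by sourceLabels
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-hsm-client   # hypothetical name
  namespace: proxy-node
spec:
  replicas: 1
  selector:
    matchLabels:
      app: example-hsm-client
  template:
    metadata:
      labels:
        app: example-hsm-client
        talksToHsm: "true"   # the label checked by the egress VirtualService
    spec:
      containers:
      - name: app
        image: example/hsm-client:latest   # hypothetical image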
You can also use the examples above to cover traffic over HTTP and HTTPS (or via TLS using Server Name Indication). At this point, you can switch back to the Istio documentation for the rest of the instructions. What we’ve shown here is how and where to split the various resources across the namespaces to ensure connectivity.
If you’re doing something similar and have found our experience helpful, we’d like to hear from you. You can get in touch by leaving a comment below.