
https://technology.blog.gov.uk/2020/06/29/using-multi-stage-docker-builds-to-patch-vulnerable-containers/

Using multi-stage Docker builds to patch vulnerable containers

Categories: Open Source, Tools

 

At Government Digital Service (GDS) we use Concourse for continuous delivery and other automation tasks across many of our projects. Concourse has a lot of community-contributed content, mostly pre-built container images published as public images on Docker Hub.

To help us in the Cyber Security team provide simple health monitoring for our pipelines, we built a webhook in AWS Lambda. We used one of these community-built container images, in this case http-api-resource, to send data to the webhook.

Discovering vulnerabilities

However, we soon found that the http-api-resource image was on a published list of vulnerable containers. This list is an analysis of the 1,000 most popular images on Docker Hub by number of downloads.

The container had a number of vulnerabilities marked as critical. The Cyber Security team investigated the vulnerabilities and discovered there was nothing wrong with the container code itself. All the vulnerabilities were in the Python 3.6 base image.

We reviewed the vulnerabilities with GDS penetration testers and concluded that, although the vulnerabilities were critical, our exposure was limited given the role the container played in our pipeline.

To patch the vulnerabilities we used a multi-stage Docker build. This meant we could take the code from the latest version of the source image and then replace the base image with a less vulnerable Alpine base image. 

We tested the new version with trivy, which is the same tool used to produce the vulnerable containers report. Happily, the report gave us the all clear, and we had a clean container to work with. 
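One way to repeat this kind of check is to run Trivy inside a throwaway build stage, so that the build fails if anything critical is found. The sketch below is an illustration rather than the exact process we followed. It assumes that the `aquasec/trivy` image on Docker Hub ships the Trivy binary at `/usr/local/bin/trivy`, and that the build has network access so Trivy can download its vulnerability database.

```dockerfile
# Start from the image we want to check.
FROM gdscyber/http-api-resource:1.0a

# Assumption: the aquasec/trivy image keeps its binary at this path.
COPY --from=aquasec/trivy:latest /usr/local/bin/trivy /usr/local/bin/trivy

# Scan the image's root filesystem and fail the build (exit code 1)
# if any vulnerability rated CRITICAL is reported.
RUN trivy rootfs --exit-code 1 --no-progress --severity CRITICAL /
```

The image this produces is only a test artefact and is not meant to be pushed anywhere. The useful signal is whether the build succeeds.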

This means we now have our own version of the container running the same code, but with none of the security vulnerabilities.

Making the container public

The container is still pinned to a Python version so it’s not a perfect solution, since it will drift as the new base image gets older. This means as future versions of Python are released the container will stay on the older version. One approach might be to take the `latest` tag of Alpine and install Python ourselves rather than using a Python base image, although this has its own maintenance requirements. 
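As a rough sketch of that alternative, the Dockerfile below starts from `alpine:latest` and installs Python from Alpine's own packages. We have not built or published this version. It assumes that `requirements.txt` sits at the root of the source image, that the resource scripts find their interpreter through `PATH`, and that a virtual environment is an acceptable place for the dependencies (recent Alpine releases discourage installing packages into the system Python with pip).

```dockerfile
# Stage "source": the community image, used only as a source of files.
FROM aequitas/http-api-resource AS source

# Final stage: plain Alpine with Python installed from Alpine packages.
FROM alpine:latest
RUN apk add --no-cache python3 py3-pip

# Assumption: requirements.txt is at the root of the source image.
COPY --from=source /requirements.txt /tmp/requirements.txt

# Install the runtime dependencies into a virtual environment
# and put it first on PATH so the scripts pick up its interpreter.
RUN python3 -m venv /opt/venv \
    && /opt/venv/bin/pip install --no-cache-dir -r /tmp/requirements.txt
ENV PATH="/opt/venv/bin:$PATH"

# Copy the resource scripts and their tests, then run the tests
# to check that the resource still works on the new base.
COPY --from=source /opt/resource/ /opt/resource/
COPY --from=source /opt/resource-tests/ /opt/resource-tests/
RUN /opt/resource-tests/test.sh
```

The trade-off is that every rebuild can pull in a new Alpine and Python release, so the test step at the end matters more than it does with a pinned base image.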

Once we were happy with the new, clean container, we pushed the new image back to Docker Hub and then swapped our pipelines over.

A while later we noticed that the container had more than 100,000 downloads. We realised that what we’d done to fix the vulnerable container was probably useful for other people to know as well. The original container has over 200 million downloads on Docker Hub. The image is already public so we thought we’d write about it.

We’re now looking to take a similar approach to scanning other containers we build and use to see if we can patch vulnerable base images in a similar way. 

Because we’ve not changed anything in the original source image, you can use our version in exactly the same way. The only thing you need to change is the image repository path.

resource_types:
- name: http-api
  type: docker-image
  source:
    repository: gdscyber/http-api-resource
    tag: 1.0a

If you want to implement a similar technique yourself, here’s the Dockerfile as an example. The only thing we really needed to do was reinstall the packages listed in the Python requirements file, which restores the runtime dependencies on the new base image.

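# Stage 0: the original community image, used only as a source of files
FROM aequitas/http-api-resource
# Final stage: a newer Python base image on Alpine
FROM python:3.7-alpine
# Copy files, including requirements.txt, from stage 0
COPY --from=0 . .
# Reinstall the runtime dependencies against the new base image
RUN pip install --no-cache-dir -r requirements.txt
# Copy the resource scripts and their tests from stage 0
COPY --from=0 /opt/resource/ /opt/resource/
COPY --from=0 /opt/resource-tests/ /opt/resource-tests/
# Run the bundled tests to check the resource still works
RUN /opt/resource-tests/test.sh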

You can check out the image on Docker Hub. If you’ve used the image, please let us know your thoughts by leaving a comment below.

Sharing and comments

3 comments

  1. Comment by Tim Wisniewski posted on

    Great write-up, and clever approach! Do you find that you get adequate security scanning running on alpine in the end, though? I've heard CVE scanners don't work as well with alpine.

    • Replies to Tim Wisniewski

      Comment by Dan Jones posted on

      Thanks for your comments Tim.

      We haven't had any reason to suspect it doesn't work. Do you have reason to believe that it doesn't work or doesn't work well?

      https://www.infoq.com/news/2020/04/trivy-docker-harbor/

      "Trivy is able to detect vulnerabilities in a number of Linux operating systems including Alpine, RHEL, CentOS, Debian, Ubuntu, SUSE, and Amazon Linux. According to Aqua, Trivy has a high accuracy for detection of vulnerabilities especially with Alpine Linux and RHEL/CentOS. Teppei Fukuda, OSS engineer at Aqua Security, shared an analysis of vulnerabilities detected on a version of Alpine Linux by a number of vulnerability scanners in which Trivy was most successful."
