Using multi-stage Docker builds to patch vulnerable containers

At Government Digital Service (GDS) we use Concourse for continuous delivery and other automation tasks across many of our projects. Concourse has a lot of community-contributed content, mostly pre-built container images published as public images on Docker Hub.

To help us in the Cyber Security team provide simple health monitoring for our pipelines, we built a webhook in AWS Lambda. We used one of these community-built container images, in this case http-api-resource, to send data to the webhook.

Discovering vulnerabilities

However, we soon found that the http-api-resource image was on a published list of vulnerable containers. This list is an analysis of the 1,000 most popular images on Docker Hub by number of downloads.

The container had a number of vulnerabilities marked as critical. The Cyber Security team investigated them and found nothing wrong with the container code itself: all the vulnerabilities were in the Python 3.6 base image.

We reviewed the vulnerabilities with GDS penetration testers and concluded that, although they were rated critical, our exposure was limited given the role the container played in our pipeline.

To patch the vulnerabilities we used a multi-stage Docker build. This meant we could take the code from the latest version of the source image and then replace the base image with a less vulnerable Alpine base image.

We tested the new version with trivy, which is the same tool used to produce the vulnerable containers report. Happily, the report gave us the all clear, and we had a clean container to work with.
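A scan like this can also be folded into a Docker build. The following is a minimal sketch under stated assumptions, not a record of how the scan above was run: it borrows the Trivy binary from the public aquasec/trivy image and scans the root filesystem of the rebuilt image. The choice of the CRITICAL severity filter and the idea of a separate scan-only Dockerfile are illustrative.

```dockerfile
# Scan-only build: start from the image to be checked.
# The repository and tag are the ones used in the pipeline config in this post.
FROM gdscyber/http-api-resource:1.0a

# Borrow the Trivy binary from the public Trivy image instead of installing it.
COPY --from=aquasec/trivy:latest /usr/local/bin/trivy /usr/local/bin/trivy

# Scan the whole filesystem. A non-zero exit code fails the build
# if any vulnerability of the listed severity is found.
RUN trivy rootfs --exit-code 1 --no-progress --severity CRITICAL /
```

Building this file succeeds only when the scan finds nothing at the chosen severity, so it can act as a simple gate in a pipeline. The build needs network access so that Trivy can download its vulnerability database, and the resulting image is a throwaway that should not be pushed.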

This means we now have our own version of the container running the same code, but with none of the security vulnerabilities.

Making the container public

The container is still pinned to a Python version, so it’s not a perfect solution: as newer versions of Python are released, the container will stay on the older one and its base image will drift out of date. One approach might be to take the `latest` tag of Alpine and install Python ourselves rather than using a Python base image, although this has its own maintenance requirements.
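As a rough illustration of that alternative, the sketch below builds on plain Alpine and installs Python from Alpine's own packages. It is an untested outline rather than the Dockerfile behind the published image. It assumes that the source image keeps requirements.txt at its filesystem root and that the resource scripts run under Alpine's packaged Python; neither has been verified here.

```dockerfile
# Stage 0: the original community image, used only as a source of files.
FROM aequitas/http-api-resource

# Final stage: plain Alpine, with Python installed from Alpine's package
# repository rather than inherited from a python:* base image.
FROM alpine:latest
RUN apk add --no-cache python3 py3-pip

# Assumption: the source image keeps requirements.txt at its filesystem root.
COPY --from=0 /requirements.txt /requirements.txt

# Recent Alpine releases mark the system Python as externally managed, so pip
# needs --break-system-packages to install outside a virtual environment.
RUN python3 -m pip install --no-cache-dir --break-system-packages -r /requirements.txt

# Resource scripts and their tests, copied from the original image.
COPY --from=0 /opt/resource/ /opt/resource/
COPY --from=0 /opt/resource-tests/ /opt/resource-tests/

# Run the image's own tests so the build fails if the new base breaks them.
RUN /opt/resource-tests/test.sh
```

With this layout the Python version follows whatever Alpine ships at build time, so the image only stays current if it is rebuilt regularly, and a new Python release could break the resource without any change to its code. Alpine's package provides `python3` but not necessarily a `python` command, so scripts that call `python` would need adjusting.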

Once we were happy with the new, clean container, we pushed the new image back to Docker Hub and then swapped our pipelines over.

A while later we noticed that the container had more than 100,000 downloads. We realised that what we’d done to fix the vulnerable container was probably useful for other people to know as well. The original container has over 200 million downloads on Docker Hub. The image is already public so we thought we’d write about it.

We’re now looking to take the same approach with the other containers we build and use: scanning them to see whether we can patch vulnerable base images in a similar way.

Because we’ve not changed anything in the original source image, you can use our version in exactly the same way. The only thing you need to change is the image repository path.

    resource_types:
    - name: http-api
      type: docker-image
      source:
        repository: gdscyber/http-api-resource
        tag: 1.0a

If you want to implement a similar technique yourself, here’s the Dockerfile as an example. The only thing we really needed to do was rerun pip against the Python requirements file to reinstall the runtime dependencies.

    FROM aequitas/http-api-resource
    FROM python:3.7-alpine
    COPY --from=0 . .
    RUN pip install --no-cache-dir -r requirements.txt
    COPY --from=0 /opt/resource/ /opt/resource/
    COPY --from=0 /opt/resource-tests/ /opt/resource-tests/
    RUN /opt/resource-tests/test.sh

You can check out the image on Docker Hub. If you’ve used the image, please let us know your thoughts by leaving a comment below.

3 comments

Comment by Tim Wisniewski posted on 13 July 2020

Great write-up, and clever approach! Do you find that you get adequate security scanning running on alpine in the end, though? I've heard CVE scanners don't work as well with alpine.

- Replies to Tim Wisniewski>
  Comment by Dan Jones posted on 24 July 2020
  
  Thanks for your comments Tim.
  
  We haven't had any reason to suspect it doesn't work. Do you have reason to believe that it doesn't work or doesn't work well?
  
  https://www.infoq.com/news/2020/04/trivy-docker-harbor/
  
  "Trivy is able to detect vulnerabilities in a number of Linux operating systems including Alpine, RHEL, CentOS, Debian, Ubuntu, SUSE, and Amazon Linux. According to Aqua, Trivy has a high accuracy for detection of vulnerabilities especially with Alpine Linux and RHEL/CentOS. Teppei Fukuda, OSS engineer at Aqua Security, shared an analysis of vulnerabilities detected on a version of Alpine Linux by a number of vulnerability scanners in which Trivy was most successful."
  
  - Replies to Dan Jones>
    
    Comment by Tim Wisniewski posted on 25 July 2020
    
    Ah, I'd heard this from Bret Fisher's course and materials:
    
    https://www.youtube.com/watch?v=e2pAkcqYCG8
    
    It looks like trivy may be the exception, or that trivy has improved its alpine support since that video and related materials. His evergreen AMA on security suggests the same:
    
    https://github.com/BretFisher/ama/issues/17
    

This blog post was published under the 2015-2024 Conservative Administration
