How-to Debug a Running Docker Container from a Separate Container

Posted on February 13, 2017 • 4 minutes • 777 words

Alternate title: How-to debug freaking Go binary containers

Containers are great for shipping software, but sometimes you can go too far when stripping down your container to make it as small as possible. There’s a fine balance between a “no-frills” image and something impossible to debug (I’m looking at you, single-binary Go containers).

The normal way I see people debug a running container is to docker exec -it $CONTAINER sh and install their debug tools on demand in the container. But what if your container doesn’t have /bin/sh? What if it doesn’t have a package manager? You could copy utilities into the container with docker cp and then exec into the running container but that’s also a hassle.

So instead of trying to debug from within the container, a friend recently asked how you could debug from a different container. I’m not that smart, so I asked much smarter people online and got a good answer back. Thanks again, Justin Cormack, for the reply.

Let’s create a stripped down container with only caddy.

First download and extract the caddy binary:

$: curl https://getcaddy.com | bash -s personal && mv /usr/local/bin/caddy .

Then create a Dockerfile to copy the binary into a scratch container.

FROM scratch
ADD caddy /

Build the image.

$: docker build -t caddy .
<output trimmed>

Now run the container.

$: docker run -d --name caddy -p 2015:2015 caddy /caddy

Caddy is now running and publishing port 2015. It currently serves a 404 page because there’s no content, but that doesn’t matter. How do you debug the container? Not that you’d ever need to, caddy doesn’t have bugs. :) But for hypothetical reasons.

Many people suggested using --link, but that only connects the containers to each other on the same virtual network. It doesn’t put them in the same namespace, so the new container can ping caddy but can’t see its processes.

$: docker run -it --rm --link caddy:caddy alpine sh
/ # ping caddy -c 1
PING caddy (172.30.238.2): 56 data bytes
64 bytes from 172.30.238.2: seq=0 ttl=64 time=0.075 ms
/ # ps aux
PID   USER     TIME   COMMAND
    1 root       0:00 sh
    8 root       0:00 ps aux
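To make the "same network, different namespace" distinction concrete, here is a minimal sketch that compares the namespace links the kernel exposes under /proc/PID/ns. The helper name same_ns is hypothetical and not part of Docker. It assumes a Linux host, and reading another process's namespace links typically needs root.

```shell
# same_ns PID_A PID_B [TYPE]
# Hypothetical helper: succeeds when both processes are in the same
# namespace of the given TYPE (net, pid, mnt, ...; defaults to net).
same_ns() {
  ns_type=${3:-net}
  # Each link resolves to something like "net:[4026531992]";
  # identical targets mean the namespace is shared.
  a=$(readlink "/proc/$1/ns/${ns_type}") || return 1
  b=$(readlink "/proc/$2/ns/${ns_type}") || return 1
  [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example usage on the Docker host (other_pid is a placeholder):
#   caddy_pid=$(docker inspect -f '{{.State.Pid}}' caddy)
#   same_ns "$caddy_pid" "$other_pid" net && echo "shared network namespace"
```

A container started with --link would be expected to fail this check for both net and pid, since it gets namespaces of its own.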

Others suggested using --volumes-from but that doesn’t let you mount your tools into an existing running container unless that running container is exporting a volume and that volume is already in the $PATH.

Instead we’re going to build a separate container with all the tools we need (in this case strace) and run it in the same pid and network namespace as the original container.

First create a debug image with strace:

FROM alpine
RUN apk update && apk add strace
CMD ["strace", "-p", "1"]

Build the image:

$: docker build -t strace .
<output trimmed>

Now run your strace container in the same pid and network namespace.

$: docker run -t --pid=container:caddy \
  --net=container:caddy \
  --cap-add sys_admin \
  --cap-add sys_ptrace \
  strace
strace: Process 1 attached
futex(0xd72e90, FUTEX_WAIT, 0, NULL

This attaches strace to the caddy process and follows it as it executes.

That’s great, but we can also get to the root filesystem (not that there’s much of one) of the remote container. This time we’ll just use the alpine image and launch a shell, again in the same pid and network namespace.
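If you do this often, the flags can be wrapped in a small function. This is only a sketch: debug_run is a hypothetical name, and the --rm and -it flags are my additions so the debug container is interactive and cleans up after itself. Drop whichever capabilities your tools don't need.

```shell
# debug_run TARGET IMAGE [CMD...]
# Hypothetical helper: runs IMAGE in the pid and network namespaces
# of the running container TARGET.
debug_run() {
  target=$1
  image=$2
  shift 2
  docker run -it --rm \
    --pid="container:${target}" \
    --net="container:${target}" \
    --cap-add sys_admin \
    --cap-add sys_ptrace \
    "$image" "$@"
}

# Example usage:
#   debug_run caddy strace       # runs the image's default CMD (strace -p 1)
#   debug_run caddy alpine sh    # interactive shell next to caddy
```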

$: docker run -it --pid=container:caddy \
  --net=container:caddy \
  --cap-add sys_admin \
  alpine sh

We can now see caddy running:

/ # ps aux
PID   USER     TIME   COMMAND
    1 root       0:00 /caddy
   13 root       0:00 strace -p 1
   34 root       0:00 sh
   40 root       0:00 ps aux

The caddy container’s file system is available in /proc/1/root:

/ # ls -l /proc/1/root/caddy
-rwxr-xr-x    1 root     root      16099400 Jan 24 15:30 /proc/1/root/caddy
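Since the target's root is just a path, ordinary file tools work on it. As a rough sketch, a hypothetical helper (copy_from_target is my name, not a Docker command) could pull a file out of the target's filesystem from inside the debug shell:

```shell
# copy_from_target PID SRC DEST
# Hypothetical helper: copies SRC, a path as seen inside the target
# process's root filesystem, to DEST in the debug container.
copy_from_target() {
  pid=$1
  src=$2
  dest=$3
  # Make sure SRC is treated as absolute within the target's root.
  case $src in
    /*) ;;
    *) src="/$src" ;;
  esac
  cp "/proc/${pid}/root${src}" "$dest"
}

# Example usage from the debug shell:
#   copy_from_target 1 /caddy /tmp/caddy
```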

With this container attached to the original we can do more debugging. You can still debug the network, but make sure you use localhost because your new sh process is running in the same network namespace.

/ # apk update && apk add curl lsof
/ # curl localhost:2015
404 Not Found
/ # lsof -i TCP
COMMAND PID USER   FD   TYPE    DEVICE SIZE/OFF NODE NAME
caddy     1 root    4u  IPv6 330044347      0t0  TCP *:2015 (LISTEN)

All your standard debugging tools should work from this second container without tainting the original container. If you run into errors, make sure you check your kernel permissions (notice how strace needed --cap-add sys_ptrace but the sh container only needed sys_admin).

This can obviously be helpful for Go containers or any other container that you just need to bring some extra debugging tools into without changing the container itself.
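One way to check which capabilities a process actually has is to read the CapEff line of /proc/PID/status and test individual bits. The sketch below assumes the bit numbers from linux/capability.h (19 for CAP_SYS_PTRACE, 21 for CAP_SYS_ADMIN). The has_cap helper is hypothetical, and the mask in the usage comment is only a sample value.

```shell
# Bit numbers as defined in linux/capability.h
CAP_SYS_PTRACE=19
CAP_SYS_ADMIN=21

# has_cap MASK BIT
# Hypothetical helper: succeeds when bit BIT is set in the hex
# capability mask MASK (the value on the CapEff line, without "0x").
has_cap() {
  mask=$1
  bit=$2
  [ $(( (0x${mask} >> $bit) & 1 )) -eq 1 ]
}

# Example usage inside the debug container:
#   mask=$(awk '/^CapEff/ {print $2}' /proc/self/status)
#   has_cap "$mask" "$CAP_SYS_PTRACE" || echo "add --cap-add sys_ptrace"
#
# With a sample mask of 00000000a80425fb, bits 19 and 21 are both unset.
```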

Let me know if you’ve found any other helpful tricks for debugging containers on the fly.

Your best option to get a response or feedback is on twitter. https://twitter.com/rothgar