Update (April 2016): There’s now an updated and revised version of this article.
I recently joined Jimdo’s Werkzeugschmiede team. In brief, our goal is to build a development platform (PaaS) where all Jimdo engineers can deploy and run their applications, ideally as microservices. I’m impressed with what the team has built so far, and to be honest, also a bit surprised by just how much work goes into such a platform. I hope that my experience in infrastructure automation will help the project move forward in the right direction.
The platform, which we internally call Wonderland, utilizes Amazon ECS to run Docker containers on a managed cluster of EC2 instances. For the cluster instances, we decided to use CoreOS, a slim Linux distribution optimized for server deployments. Those are the basic building blocks of our microservices infrastructure. Everything else – from providing a deployment API to forwarding metrics and logs to third-party services – is our job.
As is usually the case with non-trivial software, there’s always something – be it ever so small – that doesn’t work as expected. The system is misbehaving in some way or another and you want to figure out what the heck is going on. In that situation, you instinctively fire up your favorite debugging tools, the ones you trust and are comfortable with, no matter the circumstances. The only problem: the tools you need so desperately might not be available in the environment you’re supposed to scrutinize. Now what?
You might be tempted to run apt-get install or yum install to get what you want, even on a production server. What could possibly go wrong? Depending on your infrastructure, the answer is either “not much” (you throw away the server instance after debugging) or “a lot” (you keep using the now tainted system – a unique snowflake among your servers). With the rise of container technology, and Docker in particular, there’s another appealing option centered around one simple idea: you bring your tools with you.
As I said, something always goes wrong; our Wonderland is no exception. Just recently, we had to debug one of our Go applications that routes all Docker container output from our cluster instances to Papertrail. For some mysterious reason, the application would stop sending any logs out of the blue.
To find out whether log data was being sent at all, I checked whether ngrep was available under CoreOS. The answer was no. Now I could have installed the missing tools inside the Docker container running the Go application (via docker exec). However, altering the system you want to observe is generally a bad idea. For the same reason, CoreOS prevents you from modifying the data in /usr, eliminating the ability to install programs outside of containers.
Of course, I wouldn’t write this post if there wasn’t some delightful solution: CoreOS comes with a helpful little script called toolbox, which will launch a container specifically for the purpose of bringing in your favorite command-line tools.
By default, the toolbox command will give you a container based on fedora:latest, but you may also use your own custom Docker image with everything preinstalled. The spawned container will have full system privileges, allowing you to inspect anything running on CoreOS with tcpdump and friends.
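To make the custom image part a bit more concrete, here is a minimal sketch of a ~/.toolboxrc file, which the toolbox script sources on startup if it exists. The image name example/debug-tools is purely hypothetical and stands in for whatever image you have built with your tools preinstalled; the values for tag and user simply repeat the defaults.

```shell
# ~/.toolboxrc - sourced by the toolbox script before it starts the container.
# NOTE: "example/debug-tools" is a made-up image name used for illustration;
# replace it with an image that actually contains your debugging tools.
TOOLBOX_DOCKER_IMAGE=example/debug-tools
# Image tag to pull (toolbox defaults to "latest").
TOOLBOX_DOCKER_TAG=latest
# User to run as inside the toolbox container (toolbox defaults to "root").
TOOLBOX_USER=root
```

With this file in place, running toolbox on the CoreOS host should drop you into a shell based on that image instead of Fedora. Without it, you get the default Fedora container and install what you need with Fedora's package manager, which only touches the toolbox container and leaves the host untouched.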
Toolbox itself might not be the most exciting engineering achievement out there. Still, it’s a good example of the bring-your-own-tools (BYOT) pattern. Now that containers are the smallest unit of deployment and many companies are starting to embrace new ways of running services in production, I think it makes sense to also adopt the same technology in other areas like debugging.
I, for my part, like the idea of having my tools at my disposal whenever I need them.