Decoding YAML in Go
I originally developed chef-runner as a fast alternative to the painfully slow vagrant provision. The tool has since evolved and can now be used to rapidly provision not only local Vagrant machines but also remote hosts like EC2 instances (in fact, any system reachable over SSH).
Because Test Kitchen, a CI tool for testing infrastructure code, is so popular in the Chef world, I also added support for it. Test Kitchen happens to store its configuration as YAML files. To provision an instance managed by Test Kitchen, chef-runner parses the respective YAML file on disk and extracts SSH connection settings from it.
This was the first time I had to decode YAML data in Go. Decoding itself is easy. There are, however, some more advanced techniques I’d like to write about today. I’m going to show different iterations of the code I wrote for chef-runner. To make complete sense out of the presented examples, though, it’s worth taking a short look at Test Kitchen first.
Test Kitchen
Test Kitchen comes as a Ruby gem. You can install it with gem install test-kitchen.
Once installed, use the kitchen command-line tool (kitchen init) to create a project skeleton with two Test Kitchen instances.
For our purposes here, it’s enough to boot up one of the instances, e.g., the one based on Ubuntu 12.04, using kitchen create followed by the instance name.
For each instance, Test Kitchen will store a YAML-encoded configuration file in .kitchen/. The file is a short list of key-value pairs, including hostname and port.
Bingo! I was pleased to find out that this file stores all information required to access the instance via SSH. Now I only had to process the data in Go for chef-runner to be able to provision Test Kitchen instances…
Decoding YAML in 3 steps
When it comes to working with YAML in Go, there is no better package than, well, the yaml package:
The yaml package enables Go programs to comfortably encode and decode YAML values. It […] is based on a pure Go port of the well-known libyaml C library to parse and generate YAML data quickly and reliably.
For the sake of this post, I’m only going to focus on decoding YAML (encoding really isn’t that different). With the yaml package, the whole decoding process typically boils down to three steps:
Step 1: Declare a struct type with fields mapping to YAML values
After looking at the contents of the Test Kitchen YAML file again, it’s easy to come up with a struct type that contains all the fields – with proper name and type – we need:
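Here’s a minimal sketch of what such a type could look like. Hostname, SSHKey, and Port come from this post; the Username field and the exact ssh_key tag are assumptions based on typical Test Kitchen state files. The yamlKey and yamlKeys helpers are hypothetical and only mimic the yaml package’s default naming rule (tag name if present, otherwise the lowercased field name), so the sketch runs without importing the package.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// instanceConfig holds the SSH settings of one Test Kitchen instance.
// Only exported fields are visible to the yaml package.
type instanceConfig struct {
	Hostname string
	Username string // assumed field
	SSHKey   string `yaml:"ssh_key"` // key differs from the field name (assumed key)
	Port     string // Test Kitchen quotes the port, so it decodes as a string
}

// yamlKey is a hypothetical helper: it returns the name given in the yaml
// tag if there is one, and the lowercased field name otherwise. Tag flags
// such as omitempty are ignored in this simplified version.
func yamlKey(f reflect.StructField) string {
	if tag, ok := f.Tag.Lookup("yaml"); ok {
		if name := strings.Split(tag, ",")[0]; name != "" {
			return name
		}
	}
	return strings.ToLower(f.Name)
}

// yamlKeys lists the keys for all exported fields of the struct value v.
func yamlKeys(v interface{}) []string {
	t := reflect.TypeOf(v)
	keys := make([]string, 0, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if f.PkgPath != "" { // non-empty PkgPath means the field is unexported
			continue
		}
		keys = append(keys, yamlKey(f))
	}
	return keys
}

func main() {
	fmt.Println(yamlKeys(instanceConfig{})) // prints [hostname username ssh_key port]
}
```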
Note that the yaml package will only decode exported struct fields. It will map a field name of, say, Hostname to a YAML key of hostname by default. Sometimes keys don’t map nicely to field names. In that case, you may define a different key via a field tag, as I did for SSHKey. Last but not least, if you don’t care about some YAML value, simply omit it from the struct type.
Step 2: Add a method for decoding
Now that we have our instanceConfig type, let’s add a Parse method to it. In its simplest form, this method is just a wrapper around yaml.Unmarshal, which decodes the YAML data within the passed byte slice into our struct.
Since Parse just wraps yaml.Unmarshal, you might be wondering why the method needs to exist at all. There are actually two reasons. First, the caller doesn’t have to know anything about YAML decoding (separation of concerns). Second, we can extend Parse without having to change its signature. For example, in chef-runner I also check that each struct field has a valid (non-zero) value and otherwise return an error:
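The exact check in chef-runner may differ. As a rough sketch, the validation could live in a hypothetical validate helper that Parse calls once yaml.Unmarshal has succeeded. The Username field, the ssh_key tag, and the error messages are assumptions, and the yaml call itself is left out so the example runs with the standard library alone.

```go
package main

import (
	"errors"
	"fmt"
)

type instanceConfig struct {
	Hostname string
	Username string // assumed field
	SSHKey   string `yaml:"ssh_key"` // assumed key
	Port     string
}

// validate is a hypothetical helper that returns an error for the first
// field still holding its zero value. In Parse, it would run right after
// the call to yaml.Unmarshal has returned without an error.
func (c *instanceConfig) validate() error {
	switch {
	case c.Hostname == "":
		return errors.New("kitchen config: missing hostname")
	case c.Username == "":
		return errors.New("kitchen config: missing username")
	case c.SSHKey == "":
		return errors.New("kitchen config: missing ssh_key")
	case c.Port == "":
		return errors.New("kitchen config: missing port")
	}
	return nil
}

func main() {
	// SSHKey is deliberately left empty to trigger the check.
	c := instanceConfig{Hostname: "127.0.0.1", Username: "vagrant", Port: "2222"}
	fmt.Println(c.validate()) // prints kitchen config: missing ssh_key
}
```

Because the check sits behind Parse, callers keep calling the same method and simply get an error back for incomplete files.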
Step 3: Put it together
Finally, put the pieces together in a small program that reads the YAML file, decodes the data, and prints the result. Its output shows that we successfully decoded the configuration file of the Test Kitchen instance.
With this information at hand, chef-runner is in a position to log in to the instance and do its magic via some third-party SSH library.
Auxiliary structs
While the code from the previous section already does the job, there is one thing that bothered me: instanceConfig stores the SSH port as a string. This is a direct result of Test Kitchen quoting the YAML value like this: port: '2222'. I wanted the port to be an integer since I had to pass it to an SSH library that way. Having to convert the string after decoding YAML appeared to be a bad solution.
Fortunately, I stumbled upon the slides of what must have been an excellent presentation by Francesc Campoy. In it, Francesc suggests using an auxiliary struct type to decode JSON that cannot be decoded the usual way. After applying this idea to the Parse method, the Port field finally ended up being an integer:
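Here’s a minimal sketch of the technique, shown with encoding/json (the setting of Francesc’s talk) so that it runs with the standard library alone. The field set is trimmed to two fields for brevity, and the sample input is made up. With the yaml package the shape stays the same: decode into the auxiliary struct with yaml.Unmarshal instead of json.Unmarshal, then convert.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// instanceConfig is what callers work with: Port is an int.
type instanceConfig struct {
	Hostname string
	Port     int
}

// Parse decodes data into an auxiliary struct whose Port is a string,
// converts the port to an integer, and copies the values into c.
func (c *instanceConfig) Parse(data []byte) error {
	var aux struct {
		Hostname string
		Port     string // matches the quoted value in the input
	}
	if err := json.Unmarshal(data, &aux); err != nil {
		return err
	}
	port, err := strconv.Atoi(aux.Port)
	if err != nil {
		return fmt.Errorf("invalid port %q: %w", aux.Port, err)
	}
	c.Hostname = aux.Hostname
	c.Port = port
	return nil
}

func main() {
	var c instanceConfig
	input := []byte(`{"hostname": "127.0.0.1", "port": "2222"}`)
	if err := c.Parse(input); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s:%d\n", c.Hostname, c.Port) // prints 127.0.0.1:2222
}
```

The conversion stays inside Parse, so the rest of the program never sees the string form of the port.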
The Unmarshaler interface
One more tip: it is also possible to decode YAML data directly into instanceConfig by implementing the yaml.Unmarshaler interface. For this, rename and modify the Parse method to look like this:
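A sketch of such a method follows, assuming the Unmarshaler interface of gopkg.in/yaml.v2, whose single method receives a callback that decodes the current YAML value into whatever pointer it is given. Since that signature uses only built-in types, the sketch doesn’t import the yaml package; fakeUnmarshal is a hypothetical stand-in for the callback the package would pass in, and rawConfig, the Username field, and the sample values are assumptions as well.

```go
package main

import (
	"fmt"
	"strconv"
)

type instanceConfig struct {
	Hostname string
	Username string // assumed field
	SSHKey   string
	Port     int
}

// rawConfig is the auxiliary type: same keys, but Port stays a string.
type rawConfig struct {
	Hostname string
	Username string
	SSHKey   string `yaml:"ssh_key"` // assumed key
	Port     string
}

// UnmarshalYAML has the signature yaml.v2 expects of a yaml.Unmarshaler.
// It decodes into the auxiliary type first and then converts the port.
func (c *instanceConfig) UnmarshalYAML(unmarshal func(interface{}) error) error {
	var aux rawConfig
	if err := unmarshal(&aux); err != nil {
		return err
	}
	port, err := strconv.Atoi(aux.Port)
	if err != nil {
		return fmt.Errorf("invalid port %q: %w", aux.Port, err)
	}
	*c = instanceConfig{
		Hostname: aux.Hostname,
		Username: aux.Username,
		SSHKey:   aux.SSHKey,
		Port:     port,
	}
	return nil
}

// fakeUnmarshal stands in for the callback supplied by the yaml package.
// It fills the target with made-up sample values instead of parsing YAML.
func fakeUnmarshal(v interface{}) error {
	raw, ok := v.(*rawConfig)
	if !ok {
		return fmt.Errorf("unexpected target type %T", v)
	}
	*raw = rawConfig{Hostname: "127.0.0.1", Username: "vagrant", SSHKey: "/path/to/key", Port: "2222"}
	return nil
}

func main() {
	var c instanceConfig
	if err := c.UnmarshalYAML(fakeUnmarshal); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", c)
}
```

Using a separate named type for the auxiliary struct matters here: rawConfig has no UnmarshalYAML method of its own, so the inner decode can’t recurse back into this method.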
With that in place, passing a pointer to an instanceConfig straight to yaml.Unmarshal works as expected.
If you’d like to learn more about the actual implementation in chef-runner, you can view the final code here.