Teaching Go programs to love JSON and YAML

Go allows you to work effectively with two of the most popular data serialization formats: JSON and YAML.

JSON is commonly used for communication between backend services and clients such as JavaScript applications. Go has built-in support for converting between Go values and JSON objects, making it straightforward to both expose and talk to HTTP/JSON APIs.

With its compact syntax, YAML is said to be more “human friendly” than JSON. The format’s line and whitespace delimiters make it a good fit for configuration files. Go’s standard library has no YAML support, but third-party packages such as go-yaml provide everything you need to parse and generate YAML data.

That being said, what format are you going to choose? Do you prefer JSON, which lends itself to being processed by machines (and powerful tools like jq)? Or do you like storing your data in YAML, which is easier to read and write for most human beings? Of course, the answer is that it depends. It depends on who – or what – is going to process the data.

However, I didn’t write this post to argue over which format is superior – both have their place. Instead, I want to show how you can support JSON and YAML in Go programs with little effort.

Learning from Kubernetes

I first saw this technique being used in Kubernetes. The Kubernetes API generally accepts and returns JSON. But kubectl, the command-line tool for interacting with the API, understands YAML as well. I was curious how they pulled this off without duplicating too much logic, mainly because I wanted to do the same thing in a deployment tool at work.

As it turns out, kubectl uses github.com/ghodss/yaml to convert YAML input to JSON before sending it to the Kubernetes API. Here’s the Go code in question:

// pkg/util/yaml/decoder.go

import (
	"bytes"
	"unicode"

	"github.com/ghodss/yaml"
)

// ToJSON converts a single YAML document into a JSON document
// or returns an error. If the document appears to be JSON the
// YAML decoding path is not used.
func ToJSON(data []byte) ([]byte, error) {
	if hasJSONPrefix(data) {
		return data, nil
	}
	return yaml.YAMLToJSON(data)
}

// ...

var jsonPrefix = []byte("{")

// hasJSONPrefix returns true if the provided buffer appears to start with
// a JSON open brace.
func hasJSONPrefix(buf []byte) bool {
	return hasPrefix(buf, jsonPrefix)
}

// hasPrefix returns true if the first non-whitespace bytes in buf are prefix.
func hasPrefix(buf []byte, prefix []byte) bool {
	trim := bytes.TrimLeftFunc(buf, unicode.IsSpace)
	return bytes.HasPrefix(trim, prefix)
}
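
To get a feel for how the prefix check behaves, here is a small standalone sketch that uses only the standard library. The function name looksLikeJSON and the sample inputs are my own and not part of Kubernetes; the logic simply mirrors the helpers quoted above.

```go
package main

import (
	"bytes"
	"fmt"
	"unicode"
)

// looksLikeJSON reports whether the first non-whitespace byte of buf is '{'.
// The name is hypothetical; the logic mirrors the quoted hasJSONPrefix helper.
func looksLikeJSON(buf []byte) bool {
	trimmed := bytes.TrimLeftFunc(buf, unicode.IsSpace)
	return bytes.HasPrefix(trimmed, []byte("{"))
}

func main() {
	// Sample inputs, made up for illustration.
	inputs := []string{
		"  \n{\"kind\": \"Pod\"}", // JSON object with leading whitespace
		"kind: Pod\n",             // plain YAML
		"[1, 2, 3]",               // JSON array: does not start with '{'
	}
	for _, in := range inputs {
		fmt.Printf("%q -> %v\n", in, looksLikeJSON([]byte(in)))
	}
}
```

Note that a top-level JSON array is not recognized as JSON by this check. With the ToJSON function above, such input would take the YAML path, which should still work as far as I can tell because YAML also accepts that flow syntax.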

You’re reading correctly – yaml.YAMLToJSON is all it takes to do the conversion. How is this possible, given that there are almost no YAML struct tags in the entire Kubernetes codebase? Here’s how the README of the github.com/ghodss/yaml package describes what happens under the hood:

“In short, this library first converts YAML to JSON using go-yaml and then uses json.Marshal and json.Unmarshal to convert to or from the struct. This means that it effectively reuses the JSON struct tags as well as the custom JSON methods MarshalJSON and UnmarshalJSON unlike go-yaml.”

Neat idea! By reusing JSON struct tags (which you still need to define yourself), the package enables your Go programs to consume JSON and YAML as input formats without having to write a lot of duplicate code.
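
The JSON round trip at the heart of this idea can be sketched with the standard library alone. In the sketch below, the Deployment type, the decodeViaJSON helper and the sample data are all hypothetical, and the generic map merely stands in for what a YAML parser might return. It is not the package's actual implementation, which also has to handle details this sketch ignores (YAML maps with non-string keys, for example).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Deployment is a hypothetical config type. It carries JSON struct tags only;
// there are no YAML tags anywhere.
type Deployment struct {
	Name     string   `json:"name"`
	Replicas int      `json:"replicas"`
	Images   []string `json:"images,omitempty"`
}

// decodeViaJSON re-encodes a generic value as JSON and then decodes that JSON
// into out. Because the last step is json.Unmarshal, the JSON struct tags
// (and any UnmarshalJSON methods) of out are honored.
func decodeViaJSON(generic interface{}, out interface{}) error {
	data, err := json.Marshal(generic)
	if err != nil {
		return err
	}
	return json.Unmarshal(data, out)
}

func main() {
	// Stand-in for the result of parsing this YAML document:
	//   name: web
	//   replicas: 3
	//   images: [nginx, redis]
	parsed := map[string]interface{}{
		"name":     "web",
		"replicas": 3,
		"images":   []interface{}{"nginx", "redis"},
	}

	var d Deployment
	if err := decodeViaJSON(parsed, &d); err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%+v\n", d)
}
```

The point of the detour is that the struct only ever sees JSON, so a single set of tags serves both input formats.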

Wrapping up

I ended up using github.com/ghodss/yaml at work and made some of our YAML-loving customers happy. I like giving users the ability to choose what works best for them, and the technique I presented just makes it so damn easy to give them a choice.

In a similar way, I recently submitted a pull request to add YAML as an output format to Vault. Check it out if you’re looking for a concise example.
