Go types: an error story
By Ewan Valentine on December 9, 2019
As we all know, bugs can be very, very infuriating. That said, often the most infuriating bugs can turn out to be the most interesting.
In this blog, I’m going to briefly discuss a Go type error that caused a lot of head-scratching amongst the Peak engineering team. However, we learnt a lot in the process, specifically about Go’s type system and how some of the most popular libraries in Go deal with a range of ambiguities.
What is Go?
Go is an open source programming language that makes it easy to build simple, reliable, and efficient software. Members of the Peak engineering team use it for a number of our services.
One of the features we work on is a web-based SQL IDE – and it’s pretty cool, actually. It hooks into a user’s Redshift data warehouse and allows us to securely write queries and view data in a nice, digestible format. It also allows for the outputting of data into various locations in a wide range of formats.
It’s a relatively new feature, so we’ve been doing lots of improvement work and debugging. When we were testing with some real data, we noticed that numeric columns were appearing as base64-encoded strings. Weird. Really weird!
So, we did some digging. We soon found that the base64-encoded data seemed to be coming from the Go service which ran the queries. We placed some logs here and there, and noticed that the query was returning `[]uint8{51, 48, 48}` instead of the number 300. Then, when we attempted to marshal the data into JSON to send back to the browser, that byte slice was being converted into a base64 string.
Now, I’ve been using Go for quite a while, but it’s fair to say I was pretty baffled by these two strange behaviors. That said, I wasn’t entirely surprised that the query was returning something strange. Because the queries could be anything, run against lots of different tables, we couldn’t pre-define the data types we expected back and scan the response into a struct. So, I had to use a lot of pointers and interfaces to be able to parse the response.
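To give a feel for what "lots of pointers and interfaces" means here, below is a minimal sketch of scanning a row whose columns aren't known ahead of time, using only `database/sql`. It isn't our actual service code: `newScanTargets` and `scanRow` are hypothetical names made up for illustration, and the `main` function simulates the driver rather than talking to a real database.

```go
package main

import (
	"database/sql"
	"fmt"
)

// newScanTargets (hypothetical helper) returns a slice of empty interface
// values plus a parallel slice of pointers to them. The pointers are what
// rows.Scan writes into when the column types aren't known up front.
func newScanTargets(n int) (vals []interface{}, ptrs []interface{}) {
	vals = make([]interface{}, n)
	ptrs = make([]interface{}, n)
	for i := range vals {
		ptrs[i] = &vals[i]
	}
	return vals, ptrs
}

// scanRow (hypothetical helper) reads the current row into a map keyed by
// column name, keeping whatever Go type the driver handed back.
func scanRow(rows *sql.Rows) (map[string]interface{}, error) {
	cols, err := rows.Columns()
	if err != nil {
		return nil, err
	}
	vals, ptrs := newScanTargets(len(cols))
	if err := rows.Scan(ptrs...); err != nil {
		return nil, err
	}
	row := make(map[string]interface{}, len(cols))
	for i, c := range cols {
		row[c] = vals[i]
	}
	return row, nil
}

func main() {
	// No database here: simulate a driver writing through the pointer.
	vals, ptrs := newScanTargets(2)
	*(ptrs[0].(*interface{})) = []uint8{51, 48, 48}
	fmt.Printf("%T %v\n", vals[0], vals[0]) // prints: []uint8 [51 48 48]
}
```

Because every value lands in an `interface{}`, nothing tells you at compile time that a numeric column came back as bytes. You only find out when you inspect the value at runtime.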
I did some further digging, and spoke to a few people clued up on Go via Slack. It turns out that the `lib/pq` PostgreSQL driver returns a byte slice because the author – rightly – didn’t want to assume what numeric type you were expecting, as converting between types could cause inaccuracies and imprecision. The driver returns the ASCII code for each character of the numeric value returned from Postgres. In my case, `51, 48, 48` are the characters `3`, `0`, `0`, which spell out `300`. So, all I had to do was check whether or not the value was a `[]uint8` and simply convert it to a string with `string(rawVal)`.
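Here's roughly what that check looks like, as a small sketch rather than our exact code. The function name `normalize` is made up for this example. Note that `[]byte` and `[]uint8` are the same type in Go, so a single type assertion covers both spellings.

```go
package main

import "fmt"

// normalize (hypothetical helper) turns byte slices into strings and
// leaves every other value untouched.
func normalize(v interface{}) interface{} {
	if b, ok := v.([]byte); ok {
		return string(b)
	}
	return v
}

func main() {
	rawVal := []uint8{51, 48, 48}
	fmt.Println(normalize(rawVal))    // prints: 300
	fmt.Println(normalize(int64(42))) // prints: 42
}
```

Keeping the value as a string sidesteps the precision question entirely. If you need an actual number, you could parse the string with the `strconv` package once you know which type you want.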
Hooray! That was the first mystery solved – but why was it returning a base64 string?
After yet more exploration with the team, we found that the `json.Marshal` function treats a byte slice as binary data rather than text. JSON has no native binary type, so Go’s `encoding/json` package encodes `[]byte` values as base64 strings, which lets arbitrary bytes travel safely inside a JSON string. This isn’t compression, by the way: base64 output is roughly a third larger than the raw bytes.
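You can reproduce the whole thing in a few lines. This is a small sketch using only the standard library. The `encode` helper and the `total` key are made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encode (hypothetical helper) marshals a single value under the key
// "total" and returns the resulting JSON as a string.
func encode(v interface{}) string {
	out, err := json.Marshal(map[string]interface{}{"total": v})
	if err != nil {
		return "error: " + err.Error()
	}
	return string(out)
}

func main() {
	rawVal := []uint8{51, 48, 48}

	// A byte slice is marshalled as a base64 string.
	fmt.Println(encode(rawVal)) // prints: {"total":"MzAw"}

	// Converting to a string first gives the value we actually wanted.
	fmt.Println(encode(string(rawVal))) // prints: {"total":"300"}
}
```

`MzAw` is simply the base64 encoding of the three characters `300`, which is exactly what was showing up in the browser.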
So there you have it – Go type error solved! This bizarre behavior actually makes a lot of sense once you get your head around it. It’s also largely down to us attempting to deal with loose types by using lots of pointers, interfaces, and reflection, which you should always use with caution in Go. It is, after all, a strongly typed language, so any attempts to ‘get around this’ can lead to trouble if you’re not careful!