11 minutes
A glimpse at parametric polymorphism in Go: designing a generic bidirectional map
TL;DR ¶
To familiarise myself with the updated design draft on Type Parameters in Go, I wrote a generic implementation of a bidirectional map. You can try it out in this playground.
Edit (2022-04-03): Now that Go 1.18 is out, I’ve spruced up this post a bit.
Generics are coming to Go ¶
Support for parametric polymorphism (also colloquially known as “generics”) in the Go language is in active development. Last month, the Go team published a blog post detailing their progress on this front.
Johnny Boursiquot and Jaana B. Dogan reacted enthusiastically to the announce, which spurred me to read the updated design draft and take Go generics for a spin myself!
Why generics matter ¶
Before you keep reading this post, if you don’t know what generics are or if you’re doubtful as to whether they’re a good fit for Go, I invite you to read Ian Lance Taylor’s 2019 post on the topic.
The latest version of Go at the time of writing this post (v1.14)
doesn’t support generics.
As a result, library authors striving for maximum flexibility often have
no other choice but to resort to using the empty interface
(interface{}
) all over the place (see sync.Map
, for instance).
Unfortunately, the ramifications of such a design choice can be acutely felt
in client code.
The empty interface being such an indiscriminate abstract type,
it provides no compile-time guarantee as to what behaviours the concrete type
underneath might exhibit.
Users of the library are then forced to pepper their code with type assertions
in order to manipulate the concrete types hiding underneath interface{}
.
Generics purport to solve this problem:
they hold the promise of more flexible code
without having to compromise on usability or type safety.
Some legitimate questions about the addition of generics to Go remain, though. Will it affect compilation speed detrimentally? More importantly, will it significantly compromise Go’s agenda of simplicity? Will it drastically complicate writing, reading, and consuming Go code? Finding answers to some of those questions was my primary motivation for writing this post.
Important changes to the generics design draft ¶
I won’t go into the gory details here; you can read the updated design draft for completeness. Here are the main elements I took away from it.
Contracts are gone ¶
Most importantly, contracts have been dropped from the design draft. Thanks to insight from parametric-polymorphism expert (and lambda-calculus superhero) Phil Wadler and his collaborators, the Go team was able to unify contracts under the existing concept of interfaces. I believe this to be an important step in the right direction, for two main reason:
- The concept of contracts was always confusingly similar to, yet distinct from, that of interfaces. I remember this blurred line caused a mental block for me: it probably explains why I didn’t delve into the generics draft design before the most recent update.
- The addition of a
contract
keyword, which would have required a break of compatibility with Go 1.0 at the source level, is no longer needed. The current draft proposal still requires changes to the language, but those changes are expected to be largely inconsequential for the majority of programs.
A new comparable
interface ¶
Another notable change is the addition of an interface type named comparable
as a predeclared name in the language.
You guessed it: comparable
denotes types that can be compared with operators
==
and !=
.
One use case for generics: a bidirectional map ¶
After perusing the updated design draft, I scoured through my personal notes for interesting use cases of generics in Go libraries and programs, and the idea for a generic bidirectional map popped up. According to Wikipedia, a bidirectional map is
an associative data structure in which the key-value pairs form a one-to-one correspondence. Thus the binary relation is functional in each direction: each value can also be mapped to a unique key.
In other words, a bidirectional map is an abstraction not unlike a regular map but that enforces the following invariant: each value is associated to one and only one key.
In one of my side projects, I recently came across a use case (a tag system if you must know) for a bidirectional map. A non-generic implementation (of strings to strings) did the trick, but it left me wondering:
- What would a generic implementation look like?
- How does the addition of generics to the language change API design of Go packages?
- How natural would it feel to use such a package?
Curiosity proved too strong to resist and I set out to design a Go API for a generic bidirectional map.
API design of a generic bidirectional map ¶
First things first: before diving into a possible implementation, let me outline what I’m trying to achieve.
The bimap
package and its exported Bimap
type ¶
The objective is to write a single package named bimap
, which
exports a (concrete) generic bidirectional-map type named Bimap
.
This type should take two comparable type parameters,
one for for keys and and one for values.
The zero value should be readily usable ¶
An important lesson in API design from Josh Bloch that marked my Java days and has stayed with me ever since is that
[…] not only should it be easy to use a good API, but it should be hard to misuse a good API.
Designing a type so that its zero value correspond to a valid state is an important principle in Go, as it helps making the type’s API hard to misuse.
Factory function and methods ¶
bimap
should provide a factory function for Bimap
; such a function
is named New
, by convention.
Here is a list of Bimap
’s methods:
func (bi *Bimap[K, V]) Store(key K, value V)
func (bi *Bimap[K, V]) LoadValue(k K) (V, bool)
func (bi *Bimap[K, V]) LoadKey(v V) (K, bool)
func (bi *Bimap[K, V]) DeleteByKey(k K)
func (bi *Bimap[K, V]) DeleteByValue(v V)
func (bi *Bimap[K, V]) Size() int
func (bi *Bimap[K, V]) String() string
I apologise for the verbosity of some of those method names, but I wanted them to reflect the symmetry of the data structure: the roles of keys and values could indeed be swapped without consequence.
Note that all those methods use a pointer receiver:
most of them require mutation of the Bimap
’s internal state,
and mixing pointer and value receivers within a single type is discouraged.
Store
stores a key-value pair in the bidirectional map.
Several semantics are possible, but one natural design choice is to silently
drop pre-existing pairs involving the supplied key or the supplied value.
LoadValue
returns the value stored in the bidirectional map for a key
and a boolean that indicates whether the key in question was found;
LoadKey
is LoadValue
’s symmetric operation.
DeleteByKey
and DeleteByValue
delete the key-value pair associated with
the given key or value, respectively.
Size
returns the number of key-value pairs in the bidirectional map.
Finally, Keys
returns a slice of the keys,
and Values
returns a slice of the values, in no deterministic order.
The API of my bimap
package is deliberately minimalistic.
In particular, Bimap
lacks a Range
method;
I leave this as an exercise for you.
For visualisation purposes, though, I’ll make Bimap
a Stringer
,
with a simple String
implementation.
One possible implementation ¶
A very common implementation of a bidirectional map involves maintaining two
maps, one from keys to values and another from values to keys;
not a bad starting point,
at least for a reference implementation of such a data structure.
Because Bimap
will hold at least two maps, the most natural choice for
its underlying type is a struct.
All those design decisions lead us to the following type declaration for Bimap
:
type Bimap[K, V comparable] struct {
forward map[K]V
inverse map[V]K
}
The type parameters K
and V
, which are constrained to be comparable,
are introduced in square brackets following the type name.
For encapsulation purposes, the struct is completely opaque
i.e., none of the fields are exported:
otherwise, users of the bimap
package would be able to mess with
Bimap
’s internals directly and break its invariant.
Writing the rest of the bimap
package is relatively straightforward:
func (bi *Bimap[K, V]) Store(key K, value V) {
k, exists := bi.inverse[value]
if exists { // value is already associated with k
delete(bi.forward, k)
}
v, exists := bi.forward[key]
if exists { // key is already associated with v
delete(bi.inverse, v)
}
if bi.forward == nil { // bi hasn't been initialised yet
bi.forward = make(map[K]V)
bi.inverse = make(map[V]K)
}
bi.forward[key] = value
bi.inverse[value] = key
}
func (bi *Bimap[K, V]) LoadValue(k K) (V, bool) {
v, ok := bi.forward[k]
return v, ok
}
func (bi *Bimap[K, V]) LoadKey(v V) (K, bool) {
k, ok := bi.inverse[v]
return k, ok
}
func (bi *Bimap[K, V]) DeleteByKey(k K) {
v := bi.forward[k]
delete(bi.forward, k)
delete(bi.inverse, v)
}
func (bi *Bimap[K, V]) DeleteByValue(v V) {
k := bi.inverse[v]
delete(bi.inverse, v)
delete(bi.forward, k)
}
func (bi *Bimap[K, V]) Size() int {
return len(bi.forward)
}
func (bi *Bimap[K, V]) Keys() []K {
var keys []K
for k := range bi.forward {
keys = append(keys, k)
}
return keys
}
func (bi *Bimap[K, V]) Values() []V {
var values []V
for v := range bi.inverse {
values = append(values, v)
}
return values
}
func (bi *Bimap[K, V]) String() string {
return fmt.Sprintf("Bi%v", bi.forward)
}
Edit (2020-07-23): Store
contains a bug; see the Addendum.
Note that, according to the current design draft,
a generic receiver must specify all the type parameters that
are required by the type,
even when some of those type parameters are not used within the method’s body.
For instance, declaring Size
as follows
func (bi *Bimap[_, _]) Size() int
is illegal. Being to able to use the blank identifier in such places would certainly be nice, but I don’t know whether it’s on the cards. I’m only speculating here, but perhaps implementing this feature in the compiler would complicate monomorphisation…
The full source code of my bimap
package, along with a test suite,
is available in my jub0bs/bimap repo on GitHub.
Consuming the bimap
package ¶
Once concrete types have been specified for a Bimap
(a process known in the design draft as instantiation),
the rest of the client code feels to me as natural as pre-generics Go code:
bi := bimap.New[int, string]()
bi.Store(0, "Planets&Stars")
bi.Store(1, "Chemistry")
bi.Store(0, "Astronomy")
bi.DeleteByValue("Chemistry")
What do you think?
Conclusion ¶
A bidirectional map is arguably a very simple use case of generics in Go, but it gives you some idea of what generic code will look like, should the design draft get accepted with little to no modifications.
What’s great about generics is that
this bimap
package only needs to be written once
and can be used with any combination of (comparable) concrete types for the
keys and values.
As a library author, I find this liberating.
Moreover, I doubt that the added cognitive load required from users of generic code will prove prohibitive. Generics may be rightfully dreaded in other languages, but the absence of inheritance makes Go generics comparitively simpler to apprehend. As far as I’m concerned: Go generics for the win!
Addendum (2020/07/23) ¶
Unfortunately, as Martin Möhrmann
pointed out to me on the Gophers Slack workspace,
NaN
(Not a Number) throws a spanner in the works
and reveals a subtle bug in my original implementation of Bimap
.
The Bimap
type uses two backing maps, and maps exhibit a rather weird behaviour
with keys for which equality isn’t reflexive
(i.e. k
such that k == k
evaluates to false
).
Equality isn’t reflexive for NaN
and
for values of any composite type or defined type that contain a NaN
.
Because their type is comparable,
such values can legally be added as keys to a map,
but they cannot be loaded or deleted;
check this for yourself in this playground.
As a result, the invariant of my original implementation of Bimap
can be broken after such values are stored in it;
see an example in this playground.
At first sight, this unforeseen difficulty may seem inescapable…
An easy way out, though,
short of changing the internal representation of Bimap
,
is to disallow storing keys and values for which equality isn’t reflexive.
I can define a generic predicate for checking whether equality is reflexive
for its argument,
func isEqualityReflexive[T comparable](t T) bool {
return t == t
}
and check that both the key and the value passed to the Store
method satisfy
this predicate before adding the key-value pair to the Bimap
.
Rather than silently denying disallowed values,
Store
can be modified
to return whether the operation was successful:
// Store creates a key-value pair and returns whether or not the
// operation was successful. Pre-existing key-value pairs (if any)
// that involve the given key and/or the given value are silently
// removed from the Bimap. Keys and values for which equality is
// not reflexive are disallowed.
func (bi *Bimap[K, V]) Store(key K, value V) bool {
if !isEqualityReflexive(key) || !isEqualityReflexive(value) {
return false
}
k, exists := bi.inverse[value]
if exists { // value is already associated with k
delete(bi.forward, k)
}
v, exists := bi.forward[key]
if exists { // key is already associated with v
delete(bi.inverse, v)
}
if bi.forward == nil { // bi hasn't been initialised yet
bi.forward = make(map[K]V)
bi.inverse = make(map[V]K)
}
bi.forward[key] = value
bi.inverse[value] = key
return true
}
Note that this boolean result is only useful when either the key type or the value type or both contain such disallowed values; with all other types, you can safely ignore that result.