zackoverflow

Go Generics and Static Dispatch

Interfaces rely on slow dynamic-dispatch, but generics could open the door to performance boosts with the ability to leverage the cache-friendliness of static dispatch


There is a lot of contention related to Go’s slated generics feature. One side welcomes the ensuing developer ergonomics, while the other side fears that the feature will taint Go’s elegant minimalism and simplicity (probably PTSD from C++‘s template meta-programming).

But one point I’ve recently been thinking of, and that I don’t see talked about enough, is that using generics means we will see big performance gains since we won’t have to incur the performance overhead of interfaces and dynamic dispatch.

Generic operations using polymorphism and interfacesλ

The primary way Go developers write generic functions and operations without generics is through interfaces. This is a nice solution because we all love the simplicity and elegance of interfaces:

type Quacker interface {
    Quack() string
}

// Polymorphism
func DoQuack(quacker Quacker) {
    fmt.Println(quacker.Quack())
}

Interfaces allow us to create a form of generic functions. For example the standard library’s sort package uses interfaces to implement generic sorting functions:

// Package sort
type Interface interface {
    Len() int
    Less(i, j int) bool
    Swap(i, j int)
}

func Sort (data Interface) {
    n := data.Len()
    quickSort(data, 0, n, maxDepth(n))
}

The Sort() function can be used on any type that implements the Interface interface. This is a common pattern in a Go, and you’ll see interfaces used as a means of achieving polymorphism and providing generic operations. Unfortunately, interfaces come at the performance cost of dynamic dispatch.

Dynamic dispatchλ

One very important thing to consider is that calling an interface’s method is done via dynamic dispatch. Put very simply, an interface itself is a data structure containing it’s underlying data, and also metadata information like it’s type. The metadata also contains an array of function pointers that are the implementations of the interface’s defined functions. When you call a method on an interface, you are dereferencing the function pointer and then calling the function. This means that finding and then invoking an interface is done at runtime.

You may think this shouldn’t have a large overhead, isn’t it just a single pointer indirection?

Explaining the performance overhead of dynamic dispatch is lengthy, but the gist is this: dynamic dispatch is not cache-friendly. Because the CPU doesn’t know about the function in advance, it can’t pre-fetch instructions/data, pre-execute code, etc. This has a huge performance overhead.

So even though indirecting a pointer itself is fast, you miss out on all the optimizations the CPU could make, and this is where the performance problem lies.

Genericsλ

The way most implementations of generics work, is that they are really just syntax for automatic code generation. Here is an example in Rust (note that Rust’s traits are analogous to Go’s interfaces):

trait Sortable {
    fn len(&self) -> usize;
    fn less(&self, i: usize, j: usize) -> bool;
    fn swap(&self, i: usize, j: usize);
}

struct MyList;

impl Sortable for MyList {
  // implement the functions here
}

fn sort<T: Sortable>(sortable: T) {
    let n = sortable.len();
    quick_sort(sortable, 0, n, max_depth(b));
}

fn main() {
    let my_list = MyList;
    sort(my_list);
}

At compile time this will generate a sort function specifically for MyList (Note that quick_sort() would also have to be a generic function, but it is left out for brevity):

fn sort_mylist(sortable: MyList) {
    let n = sortable.len();
    quick_sort(sortable, 0, n, max_depth(b));
}

Note that there is no source code being generated, this is just used internally by the compiler.

This is called monomorphization and it is a common technique for languages with generics.

The advantage here is now we do know where this function is at compile time, and the CPU is free to perform all of its cache optimization. Without generics we cannot do this in Go, because interfaces will always use dynamic dispatch.

Go is already quite a performant language so I am excited to see the impact of this performance boost. My prediction is that this will be a big win for use cases like the sort package I showed above, basically any functions that operate on a large N will definitely see big performance gains.

The current and less ergonomic solutionλ

As I mentioned before, generics are just syntax for automatic code generation. You can achieve the same effect as generics, though in a more manual and less ergonomic way, using Go’s go generate command. You simply write a Go program that generates the code you want and go generate facilitates managing the files that will be auto-generated and such.

Obviously, this is a “frictionful” experience, and is not as seamless as if the functionality were made available as language syntax. Additionally, this type of workflow is cumbersome if you employ it in a public API. For example, imagine the sort package was designed like this and made you generate each implementation of the Sort() function for every type variant of slices that you used.

That doesn’t mean code generation in general is terrible, it has its uses. The oto project is a good example, a “Go driven rpc code generation tool for right now.”. It generates Go “scaffolding” code for writing API services, and the corresponding JS client code for consuming them.

Further readingλ

This section of the Go internals book is a fun read and talks in great detail about: the internals of an interface, the overhead of dynamic dispatch

This is another fun read on the internals of interfaces.