Home - Topics - Papers - Theses - Blog - CV - Photos - Funny |
The Go language appears to be getting slightly closer to supporting generics, with the recent release of a new proposal for generics based on type parameters and contracts.
I generally like the direction this proposal is going, with one significant reservation. I feel that the current proposal both unnecessarily limits how “generic” Go’s generics will be, and risks painting the language into a corner in the long term. This is because the proposal single-mindedly assumes that the only compile-time generic parameters we will ever want are type parameters, which seems like an unnecessary and undesirable restriction. This blog post offers feedback to the current generics draft proposal, exploring this concern and potential solutions.
In the current generics proposal,
function definitions may be parameterized
with a set of optional type parameters,
separate from and preceding the function’s regular parameters,
as illustrated in the proposal’s simple generic Print
example:
func Print(type T)(s []T) {
for _, v := range s {
fmt.Println(v)
}
}
What is actually going on here is that
the generic function Print
takes one compile-time parameter T
(a type)
and one run-time parameter s
(a slice of type T
).
A user of this generic binds T
to a specific actual type at compile time,
which may happen either explicitly or implicity using type inference,
to produce a “version” of Print
specialized to type T
.
This specialization of Print
to type T
is in turn
a function taking a run-time parameter s
,
which the function’s caller will bind to a run-time actual parameter:
i.e., a specific slice with elements of type T
to be printed.
Of course, as the proposal discusses,
such a generic can be compiled multiple ways with different tradeoffs.
The compiler could produce a single block of Print
code
that internally uses polymorphism or reflection mechanisms
to operate dynamically on any type that might be passed to it.
Alternatively,
the compiler might actually instantiate a separate copy of the Print
code
specialized to each specific type that callers happen to bind T
to.
The compiler has both options: that’s part of the point.
The restrictions the generics proposal imposes on type parameter lists
ensure that the compiler can enumerate all possible bindings of T
if it needs to,
and thus has the option of choosing either implementation approach.
Run-time function parameters clearly do not in general,
and should not be expected to,
enable the compiler to determine at compile-time (for example)
all possible slices that might be passed in as the run-time s
parameter.
With run-time parameters,
compile-time enumeration of all parameter bindings
and compile-time specialization based on them is not, in general, an option.
The central crux of this post is simple:
what most fundamentally distinguishes the optional “generic” parameter list
of a generic function such as that above from its regular parameters
is not the fact that T
is a type parameter
while s
is a slice parameter,
but the fact that T
is a compile-time parameter
while s
is a run-time parameter.
I that think the current Go 2 Generics proposal unnecessarily “bakes in” an assumption that all compile-time parameters must be type parameters while all run-time parameters must be non-type parameters (e.g., values of specific types such as slices, integers, etc.). This baked-in assumption may appear expedient in keeping the current proposal “as simple as possible” — though any realistic generics proposal including this one is inevitably far from simple — but in doing so this decision may paint the language into a corner and preclude useful and powerful generalizations either now or in the future.
Suppose for a moment that Go 2 were to follow the C++/Java tradition
of using angle brackets instead of parentheses
to distinguish the compile-time parameters of generics.
I am not actually proposing this syntax,
as I agree with the proposal’s preference to avoid the ambiguities arising
from overloading either angle brackets or square brackets for this purpose.
But merely taking C++-style syntax as a comparison point for now,
the above Print
generic function might become:
func Print<T>(s []T) {
for _, v := range s {
fmt.Println(v)
}
}
But must all generic parameters be types and only types? It is often useful in practice to be able to write code that can be specialized at compile-time according to non-type parameters such as integers.
The Go language, in fact,
already has a built-in generic type that takes a non-type parameter.
An array type, specified as [n]t
, takes a constant-integer parameter n
along with its element-type parameter t
.
So Go already has a limited precedent for non-type parameters to generic types.
Go 2 should should be consistent and compatible with this precedent,
and ideally expand on and generalize it.
It would be nice if Go 2 generics allowed us, for example,
to write generic functions that could be specialized at compile-time
to fixed-length arrays of different lengths.
Suppose, for example, that we attached types to compile-time parameters,
just like we do with run-time parameters.
Again adopting C++-like syntax for now,
we could write a generic dot-product function over float32
arrays
of any length fixed at compile-time as follows:
func DotProduct<N int>(a, b [N]int) float32 {
var sum float32
for i := range a {
sum += a[i] * b[i]
}
return sum
}
Of course it would be nice to make such a function generic
over the element type as well instead of fixing it on float32
,
and that should be possible using the contracts in the current proposal:
either by listing all the built-in types supporting the *
operator,
or by defining a contract with an element-type parameter
supporting a method like Mul
.
But for the moment let’s ignore the contract issues
and pretend that all types magically supported the *
operator.
The above DotProduct
example might then become:
func DotProduct<N int, T type>(a, b [N]T) T {
var sum T
for i := range a {
sum += a[i] * b[i]
}
return sum
}
Notice that we now have both type and non-type compile-time parameters,
each declared in traditional Go “identifier type’ declaration order,
with the keyword type
doubling as a “meta-type” denoting
the type of a type parameter.
As a purely stylistic matter,
this approach feels more consistent with Go’s existing syntax
than the current generics proposal,
where a single run-time parameter is declared like (i int)
while a single type parameter is declared like (type t)
.
We could similarly consider whether the type
keyword in its “meta-type” role
should perhaps become usable in declarations
of ordinary run-time parameters and variables as well.
Go already has considerable precedent for treating types as runtime objects
via its reflection API.
In a runtime parameter list,
the built-in type
type might be either an alias of
or otherwise closely related to the existing reflect.Type
interface,
for example,
which already essentially serves this role in run-time usage of types.
But this would be a separate and orthogonal discussion to be had,
and I have not thought through the potentially nontrivial syntax implications
of the type
keyword becoming usable
anywhere a type name or literal may be used now.
But regardless of how permissive or restrictive Go’s generics prove to be in terms of permitting non-type compile-time parameters, in the interest of keeping future options open if nothing else, it would seem desirable to avoid baking into Go 2’s generics an unchangeable assumption that all compile-time parameters are type parameters and all run-time parameters are non-type parameters. Because both type and non-type parameters certainly make sense as both compile-time and run-time parameters, in all four combinations.
const
?Notice that Go’s built-in array generic restricts its length parameter to be an integer constant, and for good reason: an array’s length must be evaluable at compile time in order for arrays to be embeddable in fixed-length structures without indirection or allocation, and generally in order to serve the role they were designed to fill. The restriction of the array’s length to a compile-time constant is part of the array’s fundamental efficiency contract.
If Go 2 were to allow non-type compile-time parameters to generics,
then it would make sense to restrict the values bound to such parameters
to be constants, for exactly the same reason.
You could instantiate the above DotProduct
example
with any constant integer N
known at compile time,
but not with an integer variable or parameter known only at runtime.
Notice that Go already has a keyword whose purpose
is to bind a declared identifier to a constant evaluable at compile-time:
namely const
.
You can already use const
to assign the result of a complex expression,
as in:
const MeaningOfLife int = 2*3*7
But the compiler checks that it can actually evaluate the expression
at compile time and complains otherwise.
Constants being constants,
one would presumably be allowed to use the above MeaningOfLive
identifier
in instantiating a generic like DotProduct
above
taking a non-type int
parameter,
just like one can already use MeaningOfLive
in declaring a Go array,
while one could not use an ordinary (run-time) integer variable or parameter
for either purpose.
The const
keyword currently serves only this one rather restricted purpose,
but that doesn’t mean that this is the only purpose it can serve.
What if we were to use it also in declaring other things
whose values must be evaluable and bound to constants at compile time —
things like compile-time parameters to generics, for example?
If Go wishes to avoid the ambiguities of C++/Java-style generic syntax
for the reasons I understand and respect,
then I propose that a more flexible, “generic”, rational,
and linguistically-consistent
syntax would be to use const
instead of type
to introduce
compile-time parameter lists for generics.
For example, the basic Print
generic would become:
func Print(const T type)(s []T) {
for _, v := range s {
fmt.Println(v)
}
}
A generic function with two compile-time type parameters and two run-time parameters would look like this:
func Print2(const T1, T2 type)(s1 []T1, s2 []T2) { ... }
The example DotProduct
generic above with two compile-time parameters,
one a non-type parameter and the other a type parameter,
would now look like this:
func DotProduct(const N int, T type)(a, b [N]T) T { ... }
In general, the (const ...)
syntax would declare a parameter list
that must be bound to concrete actual parameters at compile-time
rather than at run-time,
but otherwise looks and works like standard Go parameter list syntax.
I believe this approach is compatible with the semantics and syntax
of contracts in the current Go 2 generics proposal.
The new contract
construct effectively declares additional meta-types
more restrictive than type
.
For example, after declaring the proposal’s equal
contract
defining types with an Equal
method,
the example generic Index
function becomes:
// Index returns the index of e in s, or -1.
func Index(const T equal)(s []T, e T) int { ... }
Notice that equal
is used here instead of type
as a more-restrictive meta-type denoting any particular type
evaluable at compile-time that has an Equal
method according to the contract.
The current proposal’s syntax, (type T equal)
,
is conceptually redundant in that both the type
keyword before T
and the equal
identifier after T
both effectively declare T
as a type parameter,
the latter by virtue of equal
having been declared via a contract.
The (const T equal)
syntax, in contrast, is non-redundant
in that the const
keyword declares this to be a
compile-time constant parameter list,
while the equal
alone declares T
to be a type parameter
(and not an int
or string
or…)
restricted to some particular type satisfying the equal
contract.
Assuming we declared a multype
contract,
either of built-in types supporting the *
operator
or of arbitrary types supporting a Mul
method,
we could then properly “fix” the above DotProduct
example as follows:
func DotProduct(const N int, T multype)(a, b [N]T) T { ... }
There are many ways in which generics over non-type compile-time constant values are likely to be useful. I will briefly cover two classes of such uses here: first, generics over arrays, and second, allowing compile-time configuration and specialization of packages and algorithms for which efficiency matters.
Without support for non-type parameters to generics, it will presumably be possible to define functions or types that are generic over (say) all arrays of a particular size over a parameterized type T, but not over all arrays, leaving a major gap in the language, both conceptually and pragmatically in terms of code efficiency.
There are many common uses of arrays for which their compile-time fixed length is important for efficiency, particularly for arrays of small fixed lengths. In image processing, for example, manipulation of RGB 3-tuples or RGBA 4-tuples (with an alpha channel) is extremely common, pervasive, and performance-critical. In 2D and 3D modeling, rendering, gaming, etc., manipulation of 2-, 3-, or 4-element vectors and corresponding 2×2, 3×3, and 4×4 transformation matrices are fundamental and performance-critical. Using arrays instead of slices in short-vector code like this allows the compiler to unroll loops and eliminate range checks effectively.
Currently, developers who care about getting these efficiency benefits must write manually-specialized versions of all the tuple, vector, or matrix manipulation functions they need, once for each relevant combination of tuple/vector/matrix dimensions and element types. The current Go generics proposal will presumably allow semi-generic functions to range over different element types but not different dimensions. It would be nice if vector/matrix operations could be generic over both element types and dimensions.
For example, an image processing library that wishes to process both RGB and RGBA tuples leveraging the maximum efficiency of arrays will have to provide two sets of generic pixel-manipulation functions, one for RGB 3-tuples and the other for RGBA 4-tuples, even though each of these can be semi-generic over different numeric types. With support for generics over non-type parameters, the image processing library could provide just one set of generic pixel-manipulation functions, which clients can use — and which the Go compiler can specialize efficiently for — both RGB 3-tuple arrays and RGBA 4-tuple arrays at compile time, depending on the needs of a particular client of the library.
Similarly, a 2D or 3D library (or even the Go standard library)
must currently take and process variable-length slices as arguments
in order to handle vectors and matrices of different sizes,
or else provide a separate set of code specialized
to each of the performance-critical common cases:
e.g., 2D vectors and 2×2 matrices,
3D vectors and 3×3 matrices, and
4D vectors and 4×4 perspective and orthographic projection matrices.
The current Go generics proposal will allow semi-generic functions
parameterized with a numeric type (e.g., float32
versus float64
)
but will still require manually-coded versions for each common-case dimension.
Suppporting non-type parameters to generics
(or even just constant int
parameters alone)
would fill this gap and allow libraries to include
generic functions specializable at compile time
to the particular vector/matrix dimension(s) needed by the application,
without either giving up the efficiency benefits of arrays
or duplicating large amounts of manually-specialized code.
Through appropriate use of the “either-or” type disjunction mechanism already in the Go generics proposal, it should even be possible to build generic libraries compatible with either slices or [pointers to] arrays, not just one or the other. For example, the Go standard library or other libraries could include generic vector operations taking either runtime-dimensioned slices or compile-time-dimensioned vectors:
contract Numeric(T) {
T float32, float64, int, int32, ...
}
contract Vector(const N int, E Numeric, V type) {
V []E, *[N]E // a slice of E or a pointer to an array of E
}
func (r V) Add(const N, E, V Vector)(a, b V) V {
for i := range r {
r[i] = a[i] + b[i]
}
return r
}
func DotProduct(const N, E, V Vector)(a, b V) E {
var sum E
for i := range r {
sum += a[i] * b[i]
}
return sum
}
This capability would get even more useful and powerful considering generics for matrix operations using either slices or multidimensional arrays to represent matrices. In the latter case, the compilar can not only specialize these generics to both the types and dimensions needed by applications, but also enforce appropriate dimension invariants on arrays at compile-time.
A generic function for matrix transposition could, when applied to matrices represented as 2-dimensional arrays, ensure at compile-time that the matrix is square:
contract Transposable(const n int, E Numeric, M type) {
M [][]E, *[n][n]E
}
func (r Mr) Transpose(const n, E, M Transposable)(a M) M {
for i := range r {
for j := range r {
r[i][j] = a[j][i]
}
}
return r
}
Similarly, a generic function for matrix multiplication could work with not only different element types but also with either slices or [pointers to] arrays, and in the latter case the compiler can check at compile time the correct dimension correspondences via type unification, and specialize the code to the specific types and dimensions an application actually uses:
contract Multable(const n, m, p int, E Numeric, Ma, Mb, Mr type) {
Ma [][]E, *[n][m]E
Mb [][]E, *[m][p]E
Mr [][]E, *[n][p]E
}
func (r Mr) MatMult(const n,m,p,E,Ma,Mb,Mr Multable)(a Ma, b Mb) Mr {
for i := range r {
for j := range r[i] {
var sum E
for k := range a[i] {
sum += a[i][k] * b[k][j]
}
r[i][j] = sum
}
}
return r
}
If we add run-time checks to the above code for proper dimension correspondence, then a specializing compiler will automatically compile those checks out to nothing whenever MatMult is used with the array representation, simply as a result of ordinary constant propagation.
These slice-and-array generics examples all have the subtlety
of depending on the Go compiler to do something sensible
with the const int parameters that aren’t needed
when the above generic functions are used with slices.
Type unification will not find anything to bind n
to
when Transpose
is called with a slice, for example.
This should be OK, however, if the compiler
an avoid complaining about unconstrained generic parameters
until/unless something in the generated code actually tries to use
an unconstrained generic parameter.
Beyond arrays and code that depends on them, supporting non-type compile-time parameters to generics would enable library modules to be made configurable at compile time in many ways to support both general and specialized use-cases efficiently.
Consider the Go standard library’s `big.Int’ facility for arbitrary-precision integer arithmetic, for example. While general and highly useful, a fairly common case it does not address is the need for integers slightly larger than the natively-supported 64-bit size but still small and fixed-length for efficiency. For example, there have been calls for 128-bit integers in Go, and the Ethereum virtual machine uses 256-bit integers pervasively.
A developer who needs 128- or 256-bit signed or unsigned integers in particular,
and cares about performance,
must currently choose between using bit.Int
and giving up the performance advantages of code specialized to
2-word or 4-word integers and signed or unsigned arithmetic specifically,
or writing a manually-specialized arithmetic package
that may not benefit from the architecture-specific underlying optimizations
that big.Int
does without a lot of additional effort and code maintenance.
With non-type compile-time parameters to generics,
the math/big
package could support a variation of big.Int
compile-time configurable at compile time and specializable to
a particular bit width and/or signedness configuration.
For example:
type GenericInt(const bits int, signed Signedness) struct { ... }
type Signedness int
const (
SignBit Signedness = iota, // default: separate sign bit
Unsigned, // no sign bit, unsigned values only
TwosComplement, // topmost integer bit is sign bit
)
Configuring GenericInt
with bits == 0
would allow its length to vary at runtime as in big.Int
,
and setting signed == SignBit
would similarly configure it to keep a sign bit
stored separately from the significant integer bits as in big.Int
.
Thus, the configuration GenericInt(0, SignBit)
would be equivalent to the current big.Int
.
The internal implementation of GenericInt
would of course have a variety of internal tests and dependencies
on the bits
and signed
,
but a specializing compiler would generally resolve and optimize away
these tests at compile time.
If Go 2 could allow generic and non-generic forms of the same identifier
without getting confused,
then GenericInt
could effectively replace big.Int
,
with the non-parameterized Int
just representing
the default configuration of the generic Int
:
type Int(const Bits int, signed Signedness) struct { ... }
type Int Int(0, SignBit)
Specialization might have storage and garbage collection efficiency benefits,
which may be significant.
With implementation care, for example,
configurations of GenericInt
for a fixed number of bits
and Unsigned
or TwosComplement
mode
could internally use a fixed-length array with no space overhead.
For example, an Int(128, Unsigned)
or an Int(128, TwosComplement)
would always take exactly 128 bits of storage and never cause allocations.
Compile-time configurable big integers
might also be an element of a clean solution to the problem of providing
constant-time big integers for cryptography.
Go 2 might, for example,
reasonably guarantee guarantee that operations on a big.Int
configured with all parameters fixed at compile-time
contain no control-flow that depends on data at runtime.
Alternatively, since expensive operations such as modular exponentiation
can often be done more efficiently when constant-time operation is not required
(e.g., for verifying rather than creating digital signatures),
constant-time operation might be a separate compile-time parameter to big.Int
.
To be clear, this is not a specific proposal at the moment
that big.Int
should be generalized in this particular (or any) way.
This is merely an example of how types like big.Int
could potentially
be made configurable and specializable at compile time
if Go 2 generics support non-type generic parameters.
The following examples should be taken in similar spirit.
The same principles apply naturally to big.Float
,
where other natural configuration options
such as mantissa precision and rounding mode
may naturally apply in addition to signedness.
While Go natively supports only 32-bit and 64-bit IEEE floating point,
compile-time configuration parameters to a generic big.Float
could provide reasonably-efficient, compile-time specialized implementations of
the other binary floating-point formats
standardized by IEEE 754-2008,
such as 16-bit, 128-bit, and 256-bit floating-point numbers.
Many applications typically use only one floating-point rounding mode,
and a great deal of internal control-flow complexity in big.Float
is dedicated to testing the rounding mode and handling all of the cases.
Another large part of the internal complexity
is dedicated to handling sign bits and zeros in all combinations.
An application that, for example,
needs arbitrary- or fixed-precision floating-point numbers
that are always positive and processed with only one rounding mode, for example,
could configure a generic big.Float
for positive-only operation
with only that rounding mode,
obtaining a much more efficient compile-time-specialized version
with a tiny fraction of the runtime conditionals and branches
that the general big.Float
contains.
While the hardware branch prediction units of modern processors
of course already mitigate some of the costs
of these numerous internal branches
by “learning” which direction a branch typically follows,
appropriately-specialized versions of code such as big.Float
would place much less pressure on hardware branch prediction —
leaving that branch prediction cache space for effective use
by the code calling big.Float
—
at a cost, of course, of code size
in applications that use multiple configurations of big.Float
.
From a software engineering perspective,
introducing compile-time configuration options to generics
in an incremental, backward-compatible way would be facilitated
if Go 2 were also enhanced to support const struct
s.
Go already supports struct
value literals,
which are effectively constant struct
s anyway;
the addition would merely to be to allow struct
literals
to to be used in const
declarations and generic parameter lists.
In this way, compile-time-configurable generics
could simply take an extensible Config
structure
as their one compile-time parameter.
Such a compile-time Config
struct for big.Float
might look something like this:
type FloatConfig struct {
MaxPrec int, // Mantissa precision in bits, 0 for variable
Nonzero bool, // Zero values disallowed
Unsigned bool, // Negative values disallowed
Mode RoundngMode, // Fixed rounding mode, Any for variable
_ struct{}, // Ensure extensibility with future parameters
}
type Float(const config FloatConfig) struct { ... }
type Float Float(FloatConfig{}) // default configuration
There are a wide variety of circumstances in which such compile-time configurability and specialization may be useful. More algorithmic choices and optimizations are often available when the data being processed is known to have certain properties; compile-time configuration parameters to generics may be used to signal to libraries in an extensible way that the data has such properties and that relevant optimizations may be activated.
For example, many graph algorithms and optimizations
work only when the graph is known to be undirected (two-way edges),
or when all edge weights are known to be positive,
or when the graph is known to be directed acyclic, or a tree, or fast mixing.
Generic graph algorithm libraries
could use declarative compile-time parameters via Config
structures
to specialize their general algorithms to special cases such as these.
Matrix and linear algebra operations can often be optimized considerably
if the matrix is known to be upper- or lower-triangular
or to have other properties the client might declare to enable specialization.
As mentioned above, cryptographic algorithms and arithmetic
can often be optimized in situations where the data being operated on
is known to be non-sensitive so that arithmetic need not be constant-time.
const
Parameters?No. C++ allows programmers to attach the const
keyword
to individual parameters
(and even to individual pointer-indirection levels within a parameter)
in order to declare the contents of a run-time parameter to be read-only.
Experience has shown that that use of const
leads to insane complexity,
and still never quite gets you what you actually want,
due to the complex interactions with pointer indirections and mutable types.
The use of the const
keyword proposed here applies to entire parameter lists,
not to individual parameters or components of parameters or types,
and it declares all the parameters to be completely evaluated
and bound to constants at compile time as part of generic instantiation.
We’re not using const
here to try and tweak run-time parameters
to be (unreliably) read-only, as in C++.
Instead we’re using const
to declare that the entire parameter list is to be evaluated
and bound to constant actuals at compile-time and,
at least conceptually if perhaps not always in implementation,
to disappear entirely at runtime through generic instantiation.
I respect and agree with the current Go 2 Generics proposal’s desire to avoid the complexity of the kinds of metaprogramming that C++ templates support, not to mention the risk of a source program accidentally or deliberately making the compiler run forever or diverge. It might be claimed that allowing non-type parameters to generics constitutes support for metaprogramming, compromising that philosophy.
To such a claim, however, I would argue that Go already includes some
(albeit extremely restricted) support for metaprogramming,
such as the const
declaration’s ability to evaluate an expression
(i.e., run a program!) at compile time.
And the current generics proposal already inherently adds the still-limited
“metaprogramming” functionality
of “calling” parameterized “functions” (generics) at compile time
to produce multiple specializations or instantiations of the same code.
The important question is not so much whether we’re doing metaprogramming:
we are anyway, inevitably.
The question is what the right model and set of constraints is.
Given the fact that the Go language already has considerable machinery for dealing with compile-time constants of a variety of types, I don’t see any reason to believe that supporting non-type compile-type parameters to generics will somehow launch Go on an inevitable slide toward the template insanity of C++. The rules for declaring and invoking generics in Go can and should ensure that the compiler can decide not just in finite time, but quickly, whether a proposed generic instantiation is legally evaluable at compile time. A lot of the machinery to do so (e.g., constants) is already there.
Allowing the compile-time parameter lists of generics to include
non-type constant parameters as well as type parameters
could substantially increase the power of the generics mechanism.
Non-type compile-time parameters to generics
could enable users to obtain a lot of the desirable combination
of code efficiency and conciseness that programmers often want from
macro (or hygenic macro) facilities, for example.
A clean syntactic distinction between compile-time (const ...)
and run-time (...)
parameter lists
will enable programmers to maintain a clear understanding
of what will be evaluated at compile-time versus run-time
and avoid performance surprises,
regardless of whether those parameters happen to be types or non-types.
Even if it is decided that Go 2 should initially support
only type parameters to generics, however,
considerations of “future-proofing” and maintaining maximum flexibility alone
still seem to favor the adoption of a syntax and semantics
that does not “bake in” forever an unchangeable assumption
that compile-time parameters are always types
while run-time parameters are always non-types.
Adopting a syntax like (const T1, T2 type)
for compile-time parameters
instead of (type T1, T2)
keeps the syntax focused on what’s fundamentally important,
namely that the parameters to generics are bound to constants at compile time.
This approach also
maintains greater linguistic consistency with existing declaration styles,
and avoids painting the language into an evolutionary corner it may not escape,
even if generics initially support only type parameters.
Updated 6-August-2019 with the section on example uses.
Topics: Programming Languages Syntax | Bryan Ford |