In this post I would like to introduce
VIPcode, a library that encodes and decodes structured data
specified by verifiable interface presentations or VIPs
in the Go language.
In brief, VIPs are presentations or mappings
of an abstract wire-format interface
into the concrete data types of a specific target language (in this case Go).
While embodying all information needed to marshal and unmarshal complex data,
VIPs may also be verified automatically for compliance
with a language-neutral interface specification,
such as a
.proto
file
or a
JSON schema.
While this initial VIPcode library is specific to Go,
the philosophy and development workflow it embodies are language-neutral
and could readily be ported to other languages, as I hope they will be.
VIPcode is inspired by the idea of specifying the interface and the presentation of interoperable structured data formats separately but synergistically. An interface definition is independent of any programming language, and defines only the essence of the contract required for wire-compatibility. A presentation definition, in contrast, is specific to a target programming language such as Go, and offers developers convenience and customizability in the way a wire-format interface is mapped to in-memory data structures. For example, while the interface might specify only that a data value is an integer or a list, the interface’s presentation might separately define which language-specific integer or container type the value maps to. VIPcode makes presentations verifiable against a language-independent interface, however, ensuring that a language-specific presentation indeed conforms to its intended abstract interface.
This idea of separately defining presentation and interface is one I proposed over two decades ago, in interface definition research done in the context of the now-obsolete Mach kernel and the equally obsolete CORBA interface definition language (IDL). The fundamental concept of separately specifying presentation and interface, however, is one that I feel is far from obsolete and worth updating for modern programming languages and IDLs. In particular, features of modern type-safe languages like Java, Go, and Swift, such as reflection and struct tags, now make it convenient to embed presentation definitions into the type definitions of the target language itself, while ensuring that these presentations may be verified against language-neutral interface definitions.
The VIPcode library offers functionality comparable to, and wire-compatible with, the marshaling and unmarshaling stubs generated by IDL compilers like protoc. VIPcode requires no IDL compiler, however: it marshals and unmarshals data structures as specified directly by verifiable interface presentation (VIP) definitions embedded in the target language (Go). VIPcode’s presentation/interface separation philosophy is nevertheless compatible with stub compilation for performance, as discussed below.
As an example,
the following Go struct
definition describes a data structure
that VIPcode can marshal or unmarshal
in either protobuf or
JSON format,
and other wire formats in the future:
type Example struct {
	b bool        // a boolean field
	i int         // an integer field
	f float32     // a 32-bit floating-point number
	s string      // a text string
	a []byte      // a variable-length byte array
	v [3]float32  // a fixed-length vector of three floats
	r []Example   // a recursive list of structs
}
Using the
vipcode/protobuf package,
you can marshal an instance of the Example
struct into
protobuf wire format as follows:
var e Example
// ... (fill in fields of e) ...
buf, err := protobuf.Encode(&e)
You can similarly unmarshal a serialized instance as follows:
err := protobuf.Decode(buf, &e)
Using the vipcode/json package, you can similarly marshal and unmarshal the same structure into or out of JSON format.
As you can see from the examples above,
the vipcode codec for protobuf format does not require a .proto file,
unlike the standard protobuf package for Go.
Instead,
more like the existing
reflective protobuf codec for Go,
vipcode reflectively uses the Example
struct
definition in Go
to infer the wire-format interface.
This way, you don’t need to guess or remember
which specific Go type or field naming style
(e.g., with_underscores or camelCase)
the IDL compiler will produce from a particular .proto
file definition,
or dig through the long and unreadable output of protoc
to figure that out.
You know the exact Go type and field name for use from your code
because you defined it.
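To make the idea of reflective inference concrete, here is a minimal sketch of how a Go struct definition can be walked with the standard reflect package to derive a protobuf-style description. The helper names wireType and describe, the type mapping, and the sequential field numbering are assumptions made for illustration only; they are not vipcode's actual rules or API.

```go
package main

import (
	"fmt"
	"reflect"
)

// Example mirrors the struct definition shown earlier in the post.
type Example struct {
	b bool
	i int
	f float32
	s string
	a []byte
	v [3]float32
	r []Example
}

// wireType returns a protobuf-style type name for a Go type.
// This mapping is an illustrative assumption, not vipcode's rule set.
func wireType(t reflect.Type) string {
	switch t.Kind() {
	case reflect.Bool:
		return "bool"
	case reflect.Int, reflect.Int32:
		return "int32"
	case reflect.Int64:
		return "int64"
	case reflect.Uint, reflect.Uint32:
		return "uint32"
	case reflect.Uint64:
		return "uint64"
	case reflect.Float32:
		return "float"
	case reflect.Float64:
		return "double"
	case reflect.String:
		return "string"
	case reflect.Struct:
		// Name only; nested fields are not expanded, so recursion terminates.
		return t.Name()
	case reflect.Slice, reflect.Array:
		if t.Kind() == reflect.Slice && t.Elem().Kind() == reflect.Uint8 {
			return "bytes"
		}
		return "repeated " + wireType(t.Elem())
	}
	return "unsupported"
}

// describe lists each struct field with a field number assigned
// in declaration order (a hypothetical numbering scheme).
func describe(v any) []string {
	t := reflect.TypeOf(v)
	if t.Kind() == reflect.Pointer {
		t = t.Elem()
	}
	var out []string
	for n := 0; n < t.NumField(); n++ {
		f := t.Field(n)
		out = append(out, fmt.Sprintf("%s %s = %d;", wireType(f.Type), f.Name, n+1))
	}
	return out
}

func main() {
	for _, line := range describe(&Example{}) {
		fmt.Println(line)
	}
}
```

Running this prints one line per field, such as "bool b = 1;" and "repeated float v = 6;". Reflection can read the names and types of unexported fields, as this sketch does, although setting their values during decoding would need further care.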
While convenient,
this approach might be dangerous if you are trying to be interoperable
with a language-independent definition in a .proto
file,
and vipcode unexpectedly infers a different wire-format type
from what you were expecting or from what the .proto
file specifies.
This is a real risk in using our prior
reflective protobuf codec,
however convenient.
There are excellent reasons to use language-independent interface definitions,
and vipcode does not pretend otherwise.
But the vipcode philosophy is that the proper use of interface definition files
is not to generate code from them,
but rather to verify language-specific presentations against them.
This philosophy is consistent with the way interface
types work
within the Go language itself, in fact.
You define an interface
type separately
from object types you intend to conform to that interface,
and the Go compiler verifies that those objects indeed conform
by implementing the necessary methods with the correct signatures.
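For readers less familiar with Go, the following small sketch shows that kind of separate definition and compiler-checked conformance. The Encoder interface and Point type are hypothetical names invented for this illustration.

```go
package main

import "fmt"

// Encoder is a hypothetical interface, defined on its own.
type Encoder interface {
	Encode() []byte
}

// Point is defined separately and never mentions Encoder.
type Point struct{ X, Y byte }

// Encode gives Point the method set that Encoder requires.
func (p Point) Encode() []byte { return []byte{p.X, p.Y} }

// Compile-time check: the build fails if Point ever stops
// satisfying Encoder (for example, if Encode changes signature).
var _ Encoder = Point{}

func main() {
	var e Encoder = Point{X: 1, Y: 2}
	fmt.Println(e.Encode()) // prints [1 2]
}
```

The blank-identifier assignment plays a role loosely analogous to the verification test described next: it generates no code of its own, but it breaks the build when the type and the interface drift apart.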
To demonstrate interface verification in vipcode,
suppose that you write (or were given)
a .proto
file called example.proto
specifying this language-independent “message” definition:
message Example {
  bool b = 1;
  int32 i = 2;
  float f = 3;
  string s = 4;
  bytes a = 5;
  repeated float v = 6;
  repeated Example r = 7;
}
In one of your _test.go
files
for the package containing the Go type Example struct
definition above,
you should include a test that looks like this:
import (
	"testing"
	"vipcode/protobuf"
)

func TestExampleProto(t *testing.T) {
	protobuf.TestProto(t, "example.proto", &Example{})
}
This test will verify that the Go type of the object passed to TestProto
matches the language-independent message definition of the same name
in the example.proto
file.
If you change the Example
definition in either the Go type definition
or the example.proto
file
to make their wire-formats incompatible,
then TestExampleProto
will fail.
For example, since protobuf treats signed and unsigned integers as distinct types
whose encoded values are interpreted differently,
the test will fail if the field i
is changed to an unsigned integer type
in one definition but not the other.
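The signed/unsigned distinction can be seen at the byte level with a short sketch that uses only Go's standard encoding/binary package. The helper names are hypothetical, and the sketch covers only bare field values, not complete protobuf messages. It relies on the fact that protobuf sign-extends an int32 value to 64 bits before writing it as a varint.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeInt32 encodes v the way protobuf encodes an int32 field value:
// sign-extended to 64 bits, then written as a base-128 varint.
func encodeInt32(v int32) []byte {
	return binary.AppendUvarint(nil, uint64(int64(v)))
}

// encodeUint32 encodes v as a protobuf uint32 field value.
func encodeUint32(v uint32) []byte {
	return binary.AppendUvarint(nil, uint64(v))
}

// decodeAsUint32 reads a varint and truncates it to 32 bits,
// as a reader expecting a uint32 field would.
func decodeAsUint32(b []byte) uint32 {
	u, _ := binary.Uvarint(b)
	return uint32(u)
}

func main() {
	// Small non-negative values encode identically under both types.
	fmt.Println(encodeInt32(5), encodeUint32(5)) // prints [5] [5]

	// A negative int32 occupies ten bytes on the wire.
	fmt.Println(len(encodeInt32(-1))) // prints 10

	// Read back as uint32, the same bytes yield a large positive number.
	fmt.Println(decodeAsUint32(encodeInt32(-1))) // prints 4294967295
}
```

A mismatch of this kind goes unnoticed for small positive values and only corrupts data once a negative number appears, which is the sort of latent incompatibility a verification test is meant to catch early.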
For full details on the types and verification rules vipcode supports, please see the package’s README and API documentation. This section briefly covers a few ways in which specifying the target-language presentation separately from the language-neutral interface can be useful when using vipcode.
Although the protobuf wire format uses
varints
that can in principle encode arbitrary-precision integers,
the protobuf language constrains
integer scalars
to be either 32-bit or 64-bit.
The Go language supports both fixed-width integer types like int32
and the “generic” machine-word integer type int, however,
and the latter is pervasively used throughout Go code
and often most convenient when we just need
“an integer we’re pretty sure is big enough for our purposes.”
In Relaxed
interface verification mode, therefore,
vipcode allows an int32
or int64
protobuf type to be
“presented” as an int
type in Go.
Go slices versus vectors
Message field IDs: automatic, and manual
integer types
…
Using reflection to interpret Go language types dynamically, as the VIPcode library currently does, is undoubtedly not the way to get the maximum possible performance in marshaling and unmarshaling. The presentation/interface separation and VIPcode philosophy is not in any way incompatible with or opposed to the idea of IDL stub compilation, however. We can easily envision, and I hope someone will create, a tool that compiles Go-language presentations into highly optimized, statically typed marshaling stubs and verifies those presentations against abstract interface definitions at the same time. The result should combine the convenience and customizability of target-language presentation definitions, the interoperability of language-neutral interface definitions, and the performance and efficiency of compiled marshaling stubs.
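To give a feel for what such a compiled stub might look like, here is a hand-written sketch for a tiny two-field message. No such tool exists in the post; the Pair type, its field numbers, and the appendPair function are all hypothetical. The sketch follows the standard protobuf wire format, in which each field is preceded by a tag byte holding the field number and wire type, and it assumes proto3-style omission of zero-valued fields.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Pair is a hypothetical presentation of the message:
//   message Pair { bool b = 1; string s = 4; }
type Pair struct {
	B bool
	S string
}

// appendPair is the kind of reflection-free stub a presentation
// compiler might emit: every field access is statically typed.
func appendPair(buf []byte, p *Pair) []byte {
	if p.B {
		// Tag for field 1, wire type 0 (varint), then the value true.
		buf = append(buf, 1<<3|0, 1)
	}
	if p.S != "" {
		// Tag for field 4, wire type 2 (length-delimited).
		buf = append(buf, 4<<3|2)
		buf = binary.AppendUvarint(buf, uint64(len(p.S)))
		buf = append(buf, p.S...)
	}
	return buf
}

func main() {
	out := appendPair(nil, &Pair{B: true, S: "hi"})
	fmt.Printf("% x\n", out) // prints 08 01 22 02 68 69
}
```

Because the stub would be generated from the Go presentation rather than from the interface definition, the developer would keep full control over type and field names, while a separate verification step would still check the presentation against the interface.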
Bryan Ford |