VIPcode: Encoding and Decoding Structured Data with Verifiable Interface Presentations (VIPs)

In this post I would like to introduce VIPcode, a library that encodes and decodes structured data specified by verifiable interface presentations, or VIPs, in the Go language. In brief, VIPs are presentations or mappings of an abstract wire-format interface into the concrete data types of a specific target language (in this case Go). While embodying all information needed to marshal and unmarshal complex data, VIPs may also be verified automatically for compliance with a language-neutral interface specification, such as a .proto file or a JSON schema. While this initial VIPcode library is specific to Go, the philosophy and development workflow it embodies are language-neutral and could readily be ported to other languages, as I hope they will be.

Defining interface and presentation separately

VIPcode is inspired by the idea of separately but synergistically specifying the interface and presentation of interoperable structured data formats. An interface definition is independent of any programming language, and defines only the essence of the contract required for wire-compatibility. A presentation definition, in contrast, is specific to a target programming language such as Go, and offers developers convenience and customizability in the way a wire-format interface is mapped to in-memory data structures. For example, while the interface might specify only that a data value is an integer or a list, the interface’s presentation might separately define which concrete integer or container type of the target language the value maps to. VIPcode makes presentations verifiable against a language-independent interface, however, to ensure that a language-specific presentation indeed conforms to its intended abstract interface.

This idea of separately defining presentation and interface is one I proposed over two decades ago, in interface definition research done in the context of the now-obsolete Mach kernel and the equally-obsolete CORBA interface definition language (IDL). The fundamental concept of separately specifying presentation and interface, however, is an idea that I feel is far from obsolete and worth updating to modern programming languages and IDLs. Features of modern typesafe languages like Java, Go, and Swift such as reflection and struct tags, in particular, now make it convenient to embed presentation definitions into the type definitions of the target language itself, while ensuring that these presentations may be verified against language-neutral interface definitions.

Specifying interface presentation via Go type definitions

The VIPcode library offers functionality comparable to, and wire-compatible with, the marshaling and unmarshaling stubs generated by IDL compilers like protoc. VIPcode requires no IDL compiler, however, but instead marshals and unmarshals data structures as specified directly by verifiable interface presentation (VIP) definitions embedded in the target language (Go). VIPcode’s presentation/interface separation philosophy is not incompatible with stub compilation for performance, however, as discussed below.

As an example, the following Go struct definition describes a data structure that VIPcode can marshal or unmarshal in either protobuf or JSON format, and other wire formats in the future:

	type Example struct {
		b bool		// a boolean field
		i int		// an integer field
		f float32	// a 32-bit floating-point number
		s string	// a text string
		a []byte	// a variable-length byte array
		v [3]float32	// a fixed-length vector of three floats
		r []Example	// a recursive list of structs
	}

Using the vipcode/protobuf package, you can marshal an instance of the Example struct into protobuf wire format as follows:

	var e Example
	// ... (fill in fields of e) ...
	buf, err := protobuf.Encode(&e)

You can similarly unmarshal a serialized instance as follows:

	err := protobuf.Decode(buf, &e)

Using the vipcode/json package, you can similarly marshal and unmarshal the same structure into or out of JSON format.

Verifying compliance with an interface definition or schema

As you can see from the examples above, the vipcode codec for protobuf format does not require a .proto file, as the standard protobuf package for Go does. Instead, more like the existing reflective protobuf codec for Go, vipcode reflectively uses the Example struct definition in Go to infer the wire-format interface. This way, you don’t need to guess or remember which specific Go type or field naming style (e.g., with_underscores or camelCase) the IDL compiler will produce from a particular .proto file definition, or dig through the long and unreadable output of protoc to figure that out. You know the exact Go type and field name to use in your code because you defined them.
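To make the idea of reflective inference concrete, here is a minimal sketch that uses only Go's standard reflect package to walk the Example struct and print a protobuf-style field list. It is not vipcode's implementation: the helpers wireType and inferMessage are hypothetical names, and both the Go-to-protobuf type mapping and the sequential field numbering are simplifying assumptions.

```go
package main

import (
	"fmt"
	"reflect"
)

// Example mirrors the struct defined earlier in this post.
type Example struct {
	b bool
	i int
	f float32
	s string
	a []byte
	v [3]float32
	r []Example
}

// wireType is a hypothetical helper that guesses a protobuf type name
// for a Go type. The mapping is an assumption made for illustration.
func wireType(t reflect.Type) string {
	switch t.Kind() {
	case reflect.Bool:
		return "bool"
	case reflect.Int, reflect.Int32:
		return "int32" // assumption: plain int is presented as int32
	case reflect.Int64:
		return "int64"
	case reflect.Float32:
		return "float"
	case reflect.Float64:
		return "double"
	case reflect.String:
		return "string"
	case reflect.Slice, reflect.Array:
		if t.Elem().Kind() == reflect.Uint8 {
			return "bytes"
		}
		return "repeated " + wireType(t.Elem())
	case reflect.Struct:
		return t.Name() // nested message; not expanded, so recursion is safe
	}
	return "unknown"
}

// inferMessage lists one protobuf-style field declaration per struct field,
// numbering fields sequentially in declaration order (an assumption).
func inferMessage(v interface{}) []string {
	t := reflect.TypeOf(v)
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	var fields []string
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		fields = append(fields, fmt.Sprintf("%s %s = %d;", wireType(f.Type), f.Name, i+1))
	}
	return fields
}

func main() {
	for _, line := range inferMessage(&Example{}) {
		fmt.Println(line)
	}
}
```

Reading field types through reflection works even for unexported fields; a real codec would additionally need a way to read and write the field values.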

While convenient, this approach might be dangerous if you are trying to be interoperable with a language-independent definition in a .proto file, and vipcode unexpectedly infers a wire-format type different from what you were expecting or from what the .proto file specifies. This is a real risk in using our prior reflective protobuf codec, however convenient. There are excellent reasons to use language-independent interface definitions, and vipcode does not pretend otherwise.

But the vipcode philosophy is that the proper use of interface definition files is not to generate code from them, but rather to verify language-specific presentations against them. This philosophy is consistent with the way interface types work within the Go language itself, in fact. You define an interface type separately from object types you intend to conform to that interface, and the Go compiler verifies that those objects indeed conform by implementing the necessary methods with the correct signatures.
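For readers less familiar with Go, the analogy can be shown in a few lines of standard Go. The Shape and Rect names below are made up for illustration; the point is only that the interface is declared on its own and the compiler checks conformance.

```go
package main

import "fmt"

// Shape is an interface type, defined independently of any implementation.
type Shape interface {
	Area() float64
}

// Rect is a concrete type that never mentions Shape explicitly.
type Rect struct{ W, H float64 }

// Area gives Rect the method set that Shape requires.
func (r Rect) Area() float64 { return r.W * r.H }

// Compile-time check: this line stops compiling if Rect ever
// stops satisfying Shape (for example, if Area changes signature).
var _ Shape = Rect{}

func main() {
	var s Shape = Rect{W: 2, H: 3}
	fmt.Println(s.Area()) // prints 6
}
```

The interface plays the role of the language-neutral definition, and the concrete type plays the role of the presentation that is verified against it.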

To demonstrate interface verification in vipcode, suppose that you write (or were given) a .proto file called example.proto specifying this language-independent “message” definition:

	message Example {
		bool b = 1;
		int32 i = 2;
		float f = 3;
		string s = 4;
		bytes a = 5;
		repeated float v = 6;
		repeated Example r = 7;
	}

In one of the _test.go files for the package containing the Go definition of the Example struct above, you should include a test that looks like this:

	import "vipcode/protobuf"

	func TestExampleProto(t *testing.T) {
		protobuf.TestProto(t, "example.proto", &Example{})
	}

This test will verify that the Go type of the object passed to TestProto matches the language-independent message definition of the same name in the example.proto file.

If you change the Example definition in either the Go type definition or the example.proto file so that their wire formats become incompatible, then TestExampleProto will fail. For example, since protobuf interprets signed and unsigned integer fields differently on the wire (a negative int32 is sign-extended into a ten-byte varint, which a uint32 field would read back as a large positive value), the test will fail if the field i is changed to an unsigned integer type in one definition but not the other.
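As a rough illustration of why a signed/unsigned mismatch matters on the wire, the sketch below uses only Go's standard encoding/binary package, not vipcode. The helpers encodeInt32 and encodeUint32 are hypothetical names; they follow the protobuf rule that an int32 value is sign-extended to 64 bits before being written as a base-128 varint.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// encodeInt32 writes v the way a protobuf int32 field value is written:
// sign-extended to 64 bits, then encoded as a base-128 varint.
func encodeInt32(v int32) []byte {
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, uint64(int64(v)))
	return buf[:n]
}

// encodeUint32 writes v the way a protobuf uint32 field value is written:
// zero-extended, then encoded as a base-128 varint.
func encodeUint32(v uint32) []byte {
	buf := make([]byte, binary.MaxVarintLen64)
	n := binary.PutUvarint(buf, uint64(v))
	return buf[:n]
}

func main() {
	// The same 32-bit pattern (all ones) yields different encodings.
	fmt.Println(len(encodeInt32(-1)))          // prints 10
	fmt.Println(len(encodeUint32(0xFFFFFFFF))) // prints 5
}
```

Small non-negative values encode identically under both rules, which is exactly why such a mismatch can go unnoticed until a negative value appears.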

Some ways separate presentation definition is useful

For full details on the types and verification rules vipcode supports, please see the package’s README and API documentation. However, this section briefly covers a few ways in which specifying the target-language presentation separately from the language-neutral interface can be useful when using vipcode.

Integers

Although the protobuf wire format uses varints that can in principle encode arbitrary-precision integers, the protobuf language constrains integer scalars to be either 32-bit or 64-bit. The Go language supports both fixed-width integer types like int32 and the “generic” machine-word integer type int, however, and the latter is pervasively used throughout Go code and often most convenient when we just need “an integer we’re pretty sure is big enough for our purposes.” In Relaxed interface verification mode, therefore, vipcode allows an int32 or int64 protobuf type to be “presented” as an int type in Go.
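One consequence of presenting a 32-bit protobuf integer as a Go int is that, on platforms where int is 64 bits wide, some values cannot be represented on the wire. The sketch below shows the kind of range check a codec would plausibly need in that case. It is an assumption for illustration, not vipcode's documented behavior, and toInt32 is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"math"
)

// toInt32 is a hypothetical helper: it narrows a Go int to the range of a
// protobuf int32 field, reporting an error instead of silently truncating.
func toInt32(v int) (int32, error) {
	if v < math.MinInt32 || v > math.MaxInt32 {
		return 0, fmt.Errorf("value %d does not fit in a protobuf int32", v)
	}
	return int32(v), nil
}

func main() {
	if n, err := toInt32(42); err == nil {
		fmt.Println("ok:", n)
	}
	// 1 << 40 assumes a platform where int is 64 bits wide.
	if _, err := toInt32(1 << 40); err != nil {
		fmt.Println("rejected:", err)
	}
}
```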

Floating-point types

Lists

Go slices versus vectors

Inferring and verifying against protobuf interface definitions

Message field IDs: automatic, and manual

Integer types

Inferring and verifying against JSON schemas

Performance considerations

Using reflection to interpret Go language types dynamically, as the VIPcode library currently does, is undoubtedly not the way to get the maximum possible performance in marshaling and unmarshaling. The presentation/interface separation and VIPcode philosophy is not in any way incompatible with or opposed to the idea of IDL stub compilation, however. We can easily envision, and I hope someone will create, a tool that compiles Go-language presentations into highly optimized, statically typed marshaling stubs and verifies those presentations against abstract interface definitions at the same time. The result should combine the convenience and customizability of target-language presentation definitions, the interoperability of language-neutral interface definitions, and the performance and efficiency of compiled marshaling stubs.
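To give a feel for what such a compiled stub might look like, here is a hand-written sketch for a hypothetical two-field message (bool b = 1; int32 i = 2). No such tool exists in vipcode today; the Pair type and marshalPair function are invented for illustration, and for simplicity the stub always writes both fields. It relies only on the protobuf rules that a field key is (field number << 3) | wire type and that both field types use wire type 0 (varint).

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Pair is a hypothetical message with fields: bool b = 1; int32 i = 2.
type Pair struct {
	b bool
	i int32
}

// marshalPair is the kind of statically typed stub a presentation
// compiler might emit: no reflection, just direct field access.
func marshalPair(p *Pair) []byte {
	buf := make([]byte, 0, 2+1+binary.MaxVarintLen64)

	// Field 1 (bool): key = 1<<3 | 0, then a one-byte varint 0 or 1.
	buf = append(buf, 1<<3|0)
	if p.b {
		buf = append(buf, 1)
	} else {
		buf = append(buf, 0)
	}

	// Field 2 (int32): key = 2<<3 | 0, then a sign-extended varint.
	buf = append(buf, 2<<3|0)
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(tmp[:], uint64(int64(p.i)))
	return append(buf, tmp[:n]...)
}

func main() {
	fmt.Printf("% x\n", marshalPair(&Pair{b: true, i: 150})) // prints 08 01 10 96 01
}
```

A generated stub like this trades the flexibility of reflection for straight-line code, while the presentation it was generated from could still be verified against the .proto definition.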



Bryan Ford