| Home - Topics - Papers - Talks - Theses - Blog - CV - Photos - Funny |
Goals:
An XSON stream starts with xson[imports],
where imports is a comma-separated list of syntax imports.
These can be simple identifiers for extensions defined in this document,
or may (how?) refer to external syntax extensions by URI/CRI.
If multiple syntax extensions “claim” the same shorthand notation and neither of them builds on the other to provide a disambiguation rule, then the text is considered a syntax error as if it had not matched any extension.
Syntax extensions may have and attempt to encode using shorthand forms, which are convenient for concise expressiveness. Extensions cannot know if their shorthand will be ambiguous with the shorthand syntax of other extensions that may be in use, however,
Therefore, whenever an extension produces an untagged shorthand form, the XSON encoder framework immediately re-decodes the encoded text in order to check for syntactic collisions. If no syntactic collision occurs, the framework uses the shorthand. If a collision does occur, the element is re-encoded in syntax-tagged form, which is guaranteed to be unambiguously decodable.
Only two primitives providing minimal structure: strings and arrays.
JSON actually corresponds to a particular XSON format
with a certain set of well-defined extensions on top of minXSON:
namely null, bool, number, and object.
By design, any data conforming to any XSON format can be converted to or from JSON or minXSON.
Perhaps minXSON is so minimal that nothing is representable?
An XSON stream is a CTS text string
containing hierarchical tree of elements.
Each element has a general syntax-tagged form of tag[content].
The tag names the syntactic extension invoked by this element,
while the content has a syntax entirely defined by that extension,
aside from obeying the global CTS rule that “matchers must match.”
nullnull[]nullbooleanboolean[true] or boolean[false]true or falseThe full syntax of a boolean value is either
bool[true] or bool[false],
which may be abbreviated to simply true or false
when there is no potential ambiguity that requires syntax tagging.
naturalinteger[digits] where digits is a sequence of decimal digitsintegerinteger[sign digits] where sign is an optional - sign and digits is a sequence of decimal digits.numberintegernumber[opt-minus digits e plus-minus expdigits]stringstring[abcdef]"abcdef"Supports the standard JSON-defined character escapes
\", \\, \/, \b backspace, \f form feed, \n line feed,
\r carriage return, \t tab.
Unicode character escapes have the form \u[hex]
where hex is the Unicode code point in hexadecimal.
Learning from Swift,
this notation avoids JSON’s need to use surrogate pairs
to express code points above 0xffff,
and the need to have two separate forms
like the 4-digit \u and 8-digit \U as in Go.
arrayarray[item1,item2,…]mapmap{k1:v2, k2:v2, ...}{k1:v1, k2:v2, ...}Maps may in principle have any values as keys, not just strings. However, JSON restricts map keys to be strings.
binarybinary[0110101001]0b0110101001octaloctal[1234567]0o1234567This syntax is defined for consistency with the octal notation
supported in many programming languages such as
Python,
Haskell,
OCaml,
Raku, and
Ruby,
Tcl.
It consciously avoids the C language tradition of using a single leading 0
to indicate octal,
which is easily confused with a decimal number with leading zeros.
hexadecimalhexadecimal[abcdef]0xabcdefThis definition is compatible with that proposed in JSON5.
The infinithy-nan extension builds on number to add support for
the special infinity and not-a-number (NaN) values
that the IEEE 754 floating-point standard defines but JSON omits.
This definition is compatible with that proposed in JSON5.
| Bryan Ford |