Home - Topics - Papers - Talks - Theses - Blog - CV - Photos - Funny |
Goals:
An XSON stream starts with xson[
imports]
,
where imports is a comma-separated list of syntax imports.
These can be simple identifiers for extensions defined in this document,
or may (how?) refer to external syntax extensions by URI/CRI.
If multiple syntax extensions “claim” the same shorthand notation and neither of them builds on the other to provide a disambiguation rule, then the text is considered a syntax error as if it had not matched any extension.
Syntax extensions may have and attempt to encode using shorthand forms, which are convenient for concise expressiveness. Extensions cannot know if their shorthand will be ambiguous with the shorthand syntax of other extensions that may be in use, however,
Therefore, whenever an extension produces an untagged shorthand form, the XSON encoder framework immediately re-decodes the encoded text in order to check for syntactic collisions. If no syntactic collision occurs, the framework uses the shorthand. If a collision does occur, the element is re-encoded in syntax-tagged form, which is guaranteed to be unambiguously decodable.
Only two primitives providing minimal structure: strings and arrays.
JSON actually corresponds to a particular XSON format
with a certain set of well-defined extensions on top of minXSON:
namely null
, bool
, number
, and object
.
By design, any data conforming to any XSON format can be converted to or from JSON or minXSON.
Perhaps minXSON is so minimal that nothing is representable?
An XSON stream is a CTS text string
containing hierarchical tree of elements.
Each element has a general syntax-tagged form of tag[
content]
.
The tag names the syntactic extension invoked by this element,
while the content has a syntax entirely defined by that extension,
aside from obeying the global CTS rule that “matchers must match.”
null
null[]
null
boolean
boolean[true]
or boolean[false]
true
or false
The full syntax of a boolean value is either
bool[true]
or bool[false]
,
which may be abbreviated to simply true
or false
when there is no potential ambiguity that requires syntax tagging.
natural
integer[
digits]
where digits is a sequence of decimal digitsinteger
integer[
sign digits]
where sign is an optional -
sign and digits is a sequence of decimal digits.number
integer
number[
opt-minus digits e
plus-minus expdigits]
string
string[abcdef]
"abcdef"
Supports the standard JSON-defined character escapes
\"
, \\
, \/
, \b
backspace, \f
form feed, \n
line feed,
\r
carriage return, \t
tab.
Unicode character escapes have the form \u[
hex]
where hex is the Unicode code point in hexadecimal.
Learning from Swift,
this notation avoids JSON’s need to use surrogate pairs
to express code points above 0xffff,
and the need to have two separate forms
like the 4-digit \u
and 8-digit \U
as in Go.
array
array[
item1,
item2,
…]
map
map{k1:v2, k2:v2, ...}
{k1:v1, k2:v2, ...}
Maps may in principle have any values as keys, not just strings. However, JSON restricts map keys to be strings.
binary
binary[0110101001]
0b0110101001
octal
octal[1234567]
0o1234567
This syntax is defined for consistency with the octal notation
supported in many programming languages such as
Python,
Haskell,
OCaml,
Raku, and
Ruby,
Tcl.
It consciously avoids the C language tradition of using a single leading 0
to indicate octal,
which is easily confused with a decimal number with leading zeros.
hexadecimal
hexadecimal[abcdef]
0xabcdef
This definition is compatible with that proposed in JSON5.
The infinithy-nan
extension builds on number
to add support for
the special infinity and not-a-number (NaN) values
that the IEEE 754 floating-point standard defines but JSON omits.
This definition is compatible with that proposed in JSON5.
Bryan Ford |