PBON (Portable Binary Object Notation) is a lightweight binary data interchange format inspired by JSON.

PBON is built on the same structures as JSON, namely:

- A collection of key/value pairs (realized as an *object* in some programming languages)
- An ordered list of values (realized as an *array* in some programming languages)

The primary difference between JSON and PBON is that PBON is a more compact format that allows binary data (including Unicode strings) to be represented directly in the encoding without escaping. The trade-off is that PBON loses some human-readability to achieve this compactness.

PBON prefers the simplicity of JSON over implementing the most compact binary representation possible, in order to maintain some human-readability and to make implementation straightforward.

PBON supports backward compatibility by enabling implementations to skip values with unrecognized keys, for example, key/value pairs added as part of a newer version of an object.

In PBON, an *object* is an unordered set of key/value pairs. An object begins with a
left brace { and ends with a right brace }.


A *key* is a positive integer encoded as a variable-length integer.

An *array* is an ordered collection of values. An array begins with a
left bracket [ and ends with a right bracket ].


A *value* can be binary, a string, an integer, a float, an object, an array,
true (t), false (f), or null (~).


A *binary* value is a length-prefixed sequence of bytes.

A *length-prefix* is a non-negative integer value, encoded as variable-length integer.

A *string* is a length-prefixed sequence of UTF-8 encoded characters.

An *integer* is a length-prefixed base-256 encoded value in big-endian
order. Negative integer values are stored using the bitwise complement of the negative value with the most significant bit
(the sign bit) set to 1.

A *float* is a length-prefixed IEEE 754 encoded
floating point value in big-endian order.
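As a rough illustration of the length-prefixed value types above, here's a minimal Python sketch. The function names are hypothetical and not part of PBON. It assumes that floats are written as 64-bit doubles, and it only handles non-negative integers, keeping the most significant bit clear so that it can serve as the sign bit. The length prefix follows the variable-length integer layout described in this document.

```python
import struct


def encode_length(n: int) -> bytes:
    """Encode a non-negative length as a variable-length integer."""
    if n < 0:
        raise ValueError("a length-prefix must be non-negative")
    groups = []
    while n >= 0x40:              # the first byte only has 6 value bits
        groups.append(n & 0x7F)   # trailing bytes carry 7 value bits each
        n >>= 7
    out = [n] + groups[::-1]      # most significant bits first
    for i in range(len(out) - 1):
        out[i] |= 0x80            # continuation bit on all but the last byte
    return bytes(out)


def encode_binary(data: bytes) -> bytes:
    """binary = length octets"""
    return encode_length(len(data)) + data


def encode_string(text: str) -> bytes:
    """string = length utf-8 (the length counts UTF-8 bytes, not characters)"""
    return encode_binary(text.encode("utf-8"))


def encode_uint(value: int) -> bytes:
    """integer = length base-256, big-endian. Assumption: non-negative only."""
    if value < 0:
        raise ValueError("negative integers are not covered by this sketch")
    size = value.bit_length() // 8 + 1   # leaves the top (sign) bit clear
    return encode_binary(value.to_bytes(size, "big"))


def encode_float(value: float) -> bytes:
    """float = length ieee-754, big-endian. Assumption: 64-bit double."""
    return encode_binary(struct.pack(">d", value))
```

With these helpers, `encode_string("Foo")` yields the bytes `03 46 6F 6F` and `encode_uint(100)` yields `01 64`.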

A *variable-length integer* is stored as an initial byte followed by zero or more trailing bytes.

The most-significant bit of each byte (C) is a continuation indicator, which is set to 1 for all but the last byte.

The second most-significant bit of the first byte (S) is reserved as the sign indicator, which is set to 1 for negative values.

The remaining bits (Vn) contain the value in big-endian byte order (most significant digits first).

Negative values are transformed using a bitwise complement operation before encoding. This ensures that small negative numbers will also occupy a small amount of encoded space.

For example, here's the value 1. It's a single byte so the continuation bit (C) is not set, and it's a positive value so the sign bit (S) is not set:

```
0000 0001
```

And here's the value 300:

    1000 0010 0010 1100

To calculate this value, start with the binary encoding of 300 and split it into 7-bit groups starting from the least-significant digit. The leading group holds at most 6 bits, because the first byte also carries the sign bit:

    300 → 10 0101100

Then set the sign bit (S) to 0 to indicate a positive value and the continuation bit of each byte (C) except the last to 1:

    → 10000010 00101100

Here's the value -300:

    1100 0010 0010 1011

To encode a negative number, start with the binary encoding of the value, take the bitwise complement, and split it into 7-bit groups starting from the least-significant digit:

    -300    → 1111 1110 1101 0100
    ~(-300) → 1 0010 1011
            → 10 0101011

And finally, set the sign bit (S) to 1 to indicate a negative value and the continuation bit of each byte (C) except the last to 1:

    → 11000010 00101011
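The steps above can be sketched in Python as follows. This is an illustrative sketch rather than a reference implementation, and the names `encode_varint` and `decode_varint` are hypothetical. It follows the layout described earlier: 6 value bits in the first byte, 7 in each trailing byte, with negative values complemented before encoding.

```python
def encode_varint(value: int) -> bytes:
    """Encode a signed integer as a PBON-style variable-length integer."""
    negative = value < 0
    if negative:
        value = ~value                # bitwise complement, now non-negative
    groups = []
    while value >= 0x40:              # more than 6 bits still to store
        groups.append(value & 0x7F)   # peel off the low 7 bits
        value >>= 7
    first = value | (0x40 if negative else 0)   # sign bit (S) in bit 6
    out = [first] + groups[::-1]      # most significant bits first
    for i in range(len(out) - 1):
        out[i] |= 0x80                # continuation bit (C) on all but the last
    return bytes(out)


def decode_varint(data: bytes, pos: int = 0):
    """Decode one varint starting at pos; return (value, next position)."""
    first = data[pos]
    negative = bool(first & 0x40)
    value = first & 0x3F
    more = first & 0x80
    pos += 1
    while more:
        byte = data[pos]
        value = (value << 7) | (byte & 0x7F)
        more = byte & 0x80
        pos += 1
    return (~value if negative else value), pos


print(encode_varint(1).hex())      # 01
print(encode_varint(300).hex())    # 822c
print(encode_varint(-300).hex())   # c22b
```

The printed bytes match the three worked examples: `01`, `82 2C`, and `C2 2B`.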

Let's start with the following simple message:

    class Message1 { string Name = "Foo"; }

This message would be encoded as the following bytes:

    7B 01 03 46 6F 6F 7D

Let's break this down:

The first and last bytes (7B ... 7D) are the braces { } that surround every object.

The second byte (01) is the first member key (1) encoded as a variable-length integer.

The third byte (03) is the length of the string member value to follow (3 bytes), also encoded as a variable-length integer.

And finally, bytes 4-6 are the UTF-8 encoded bytes of the string (46 6F 6F).

Now let's add a second member with an integer value:

    class Message2 { string Name = "Foo"; int Score = 100; }

This message could be encoded as the following bytes:

```
7B 01 03 46 6F 6F 02 01 64 7D
```

The 7th byte (02) is the new member key (2) encoded as a variable-length integer.

The 8th byte (01) is the length of the integer member value to follow (1 byte), also encoded as a variable-length integer.

And finally, the 9th byte (64) is the base-256 encoded value 100.
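To make the byte layout concrete, here's a small Python sketch that builds both messages so far. The helper names are hypothetical, the integer helper only covers non-negative values, and members are written in the order given (objects are unordered, so other orders would be equally valid).

```python
def encode_varint(n: int) -> bytes:
    """Non-negative varint, enough for keys and length prefixes."""
    groups = []
    while n >= 0x40:              # first byte holds 6 value bits
        groups.append(n & 0x7F)
        n >>= 7
    out = [n] + groups[::-1]
    for i in range(len(out) - 1):
        out[i] |= 0x80            # continuation bit on all but the last byte
    return bytes(out)


def encode_string(text: str) -> bytes:
    raw = text.encode("utf-8")
    return encode_varint(len(raw)) + raw


def encode_int(value: int) -> bytes:
    # Assumption: non-negative values only, top bit left clear for the sign.
    raw = value.to_bytes(value.bit_length() // 8 + 1, "big")
    return encode_varint(len(raw)) + raw


def encode_object(members) -> bytes:
    """members is a list of (key, already-encoded value) pairs."""
    body = b"".join(encode_varint(key) + value for key, value in members)
    return b"{" + body + b"}"


message1 = encode_object([(1, encode_string("Foo"))])
message2 = encode_object([(1, encode_string("Foo")), (2, encode_int(100))])

print(message1.hex(" ").upper())   # 7B 01 03 46 6F 6F 7D
print(message2.hex(" ").upper())   # 7B 01 03 46 6F 6F 02 01 64 7D
```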

Now let's say that we want an array of scores:

    class Message3 { string Name = "Foo"; int[] Scores = new int[] { 1, 2, 3 }; }

This message could be encoded as the following bytes:

```
7B 01 03 46 6F 6F 03 5B 01 01 01 02 01 03 5D 7D
```

The 7th byte (03) is the new member key (3) encoded as a variable-length integer.

*Note that we chose 3 as the member key so as not to conflict with the definition of
the previous Message2 class and possibly break backward compatibility.*

The 8th and 15th bytes (5B ... 5D) are the brackets [ ] that surround every array.

Bytes 9-14 are the values in the array, each of which is the length of the encoded value (encoded as a variable-length integer) followed by the integer value (encoded as base 256).
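Reading the message back could look roughly like the Python sketch below. It assumes the reader already knows the type of each member key (the `SCHEMA` table and all function names here are hypothetical), and it only handles non-negative integers. Unknown keys simply raise a `KeyError` in this sketch rather than being skipped.

```python
# Hypothetical schema: member key -> expected type.
SCHEMA = {1: "string", 2: "int", 3: "int[]"}


def read_varint(data: bytes, pos: int):
    """Read one variable-length integer; return (value, next position)."""
    first = data[pos]
    negative = bool(first & 0x40)
    value = first & 0x3F
    more = first & 0x80
    pos += 1
    while more:
        byte = data[pos]
        value = (value << 7) | (byte & 0x7F)
        more = byte & 0x80
        pos += 1
    return (~value if negative else value), pos


def read_chunk(data: bytes, pos: int):
    """Read a length-prefixed run of bytes."""
    length, pos = read_varint(data, pos)
    return data[pos:pos + length], pos + length


def read_int(data: bytes, pos: int):
    raw, pos = read_chunk(data, pos)
    return int.from_bytes(raw, "big"), pos   # non-negative values assumed


def decode_message(data: bytes) -> dict:
    if data[0] != ord("{"):
        raise ValueError("expected '{'")
    pos, result = 1, {}
    while data[pos] != ord("}"):
        key, pos = read_varint(data, pos)
        kind = SCHEMA[key]
        if kind == "string":
            raw, pos = read_chunk(data, pos)
            result[key] = raw.decode("utf-8")
        elif kind == "int":
            result[key], pos = read_int(data, pos)
        else:  # "int[]"
            if data[pos] != ord("["):
                raise ValueError("expected '['")
            pos += 1
            items = []
            while data[pos] != ord("]"):
                item, pos = read_int(data, pos)
                items.append(item)
            pos += 1                          # step over ']'
            result[key] = items
    return result


message3 = bytes.fromhex("7B 01 03 46 6F 6F 03 5B 01 01 01 02 01 03 5D 7D")
print(decode_message(message3))   # {1: 'Foo', 3: [1, 2, 3]}
```

Checking for `}` or `]` before reading a key or length works because both bytes have the sign bit set, which a positive key or a non-negative length never has in its first byte.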

    object   = "{" members "}" / "{}" / null
    members  = pair / pair members
    pair     = key value
    key      = varint                  ; > 0
    array    = "[" elements "]" / "[]" / null
    elements = value / value elements
    value    = binary / string / integer / float / object / array / true / false / null
    -----
    binary   = length octets
    length   = varint                  ; >= 0
    string   = length utf-8            ; length is utf-8 octet count
    integer  = length base-256         ; big endian
    float    = length ieee-754         ; big endian
    true     = "t"
    false    = "f"
    null     = "~"
    varint   = variable-length-integer ; Big-endian variable-length quantity, continuation bit in MSB of each octet, sign in bit 6 of first octet
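As an illustration of how a reader might step over a value it does not recognize, here's a Python sketch of a `skip_value` helper (a hypothetical name). It rests on an observation that is an assumption rather than something the grammar states: the marker bytes `{`, `[`, `t`, `f`, and `~` all have bit 6 set, whereas the first byte of a non-negative length never does, so one look at the leading byte is enough to tell a marker from a length-prefixed value.

```python
def read_varint(data: bytes, pos: int):
    """Read one variable-length integer; return (value, next position)."""
    first = data[pos]
    negative = bool(first & 0x40)
    value = first & 0x3F
    more = first & 0x80
    pos += 1
    while more:
        byte = data[pos]
        value = (value << 7) | (byte & 0x7F)
        more = byte & 0x80
        pos += 1
    return (~value if negative else value), pos


def skip_value(data: bytes, pos: int) -> int:
    """Return the position just past the value that starts at pos."""
    lead = data[pos:pos + 1]
    if lead in (b"t", b"f", b"~"):           # true / false / null
        return pos + 1
    if lead == b"{":                         # object: skip key/value pairs
        pos += 1
        while data[pos:pos + 1] != b"}":
            _key, pos = read_varint(data, pos)
            pos = skip_value(data, pos)
        return pos + 1
    if lead == b"[":                         # array: skip each element
        pos += 1
        while data[pos:pos + 1] != b"]":
            pos = skip_value(data, pos)
        return pos + 1
    # Otherwise: binary, string, integer, or float, all length-prefixed.
    length, pos = read_varint(data, pos)
    if length < 0:
        raise ValueError("unexpected byte where a value should start")
    return pos + length


message3 = bytes.fromhex("7B 01 03 46 6F 6F 03 5B 01 01 01 02 01 03 5D 7D")
print(skip_value(message3, 7))   # 15
```

Starting at the array's opening bracket (offset 7), the helper returns offset 15, which is where the closing brace of the object sits.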