message-size-increase #3

Open
sashakoshka wants to merge 75 commits from message-size-increase into main
Showing only changes of commit c4a985f622

@ -40,92 +40,73 @@ designed to allow applications to be presented with data they are not equipped
to handle while continuing to function normally. This enables
backwards-compatible application protocol changes.
The length of a TAPE structure is assumed to be given by the surrounding
protocol, which is usually METADAPT-A or B. The root of a TAPE structure can be
any data value, but is usually a table, which can contain several values that
each have a numeric key. Values can also be nested. Both sides of the connection
must agree on what data type should be the root value, the data type of each
known table value, etc.
TAPE expresses types using tags. A tag is 8 bits in size, and is divided into
two parts: the Type Number (TN), and the Configuration Number (CN). The TN is 3
bits, and the CN is 5 bits. Both are interpreted as unsigned integers. Both
sides of the connection must agree on the semantic meaning of the values and
their arrangement.
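As a minimal sketch of the tag layout described above, the Python below splits an 8-bit tag into its TN and CN and joins them back together. The text does not say which end of the octet holds the TN; this sketch assumes the TN occupies the three most significant bits. `split_tag` and `join_tag` are hypothetical helper names, not part of TAPE.

```python
def split_tag(tag: int) -> tuple:
    """Split an 8-bit tag into (TN, CN).

    Assumption: the TN is the upper 3 bits and the CN is the lower 5 bits.
    """
    if not 0 <= tag <= 0xFF:
        raise ValueError("a tag must fit in 8 bits")
    return tag >> 5, tag & 0x1F


def join_tag(tn: int, cn: int) -> int:
    """Build an 8-bit tag from a 3-bit TN and a 5-bit CN (same assumption)."""
    if not 0 <= tn <= 0x07:
        raise ValueError("the TN must fit in 3 bits")
    if not 0 <= cn <= 0x1F:
        raise ValueError("the CN must fit in 5 bits")
    return (tn << 5) | cn
```

Under that assumption, `split_tag(0b010_00011)` yields TN 2 and CN 3.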
TAPE is based on an encoding method previously developed by silt.
### Data Value Types
The table below lists all data value types supported by TAPE. They are discussed
in detail in the following sections.
| Name | Size | Description | Encoding Method
| ----------- | --------------: | --------------------------- | ---------------
| I8 | 1 | A signed 8-bit integer | BETC
| I16 | 2 | A signed 16-bit integer | BETC
| I32 | 4 | A signed 32-bit integer | BETC
| I64 | 8 | A signed 64-bit integer | BETC
| U8 | 1 | An unsigned 8-bit integer | BEU
| U16 | 2 | An unsigned 16-bit integer | BEU
| U32 | 4 | An unsigned 32-bit integer | BEU
| U64 | 8 | An unsigned 64-bit integer | BEU
| Array[^1] | | An array of any above type | PASTA
| String | | A UTF-8 string | UTF-8
| StringArray | | An array of the String type | VILA
| Table | | A table of any type | TTLV
| TN | Bits | Name | Description
| -: | ---: | ---- | -----------
| 0 | 000 | SI | Small integer
| 1 | 001 | LI | Large integer
| 2 | 010 | FP | Floating point
| 3 | 011 | SBA | Small byte array
| 4 | 100 | LBA | Large byte array
| 5 | 101 | OTA | One-tag array
| 6 | 110 | KTV | Key-tag-value table
| 7 | 111 | N/A | Reserved
[^1]: Array types are written as <E>Array, where <E> is the element type. For
example, an array of I32 would be written as I32Array. StringArray still follows
this rule, even though it is encoded differently from other arrays.
#### No Value (NIL)
NIL is used to encode the absence of a value where there would otherwise be one.
The CN of a NIL is ignored. It has no payload.
[^2]: SOP (sum of parts) refers to the sum of the size of every item in a data
structure.
#### Small Integer (SI)
SI encodes an integer of up to 5 bits, which are stored in the CN. It has no
payload. Whether the bits are interpreted as unsigned or as signed two's
complement is semantic information and must be agreed upon by both sides of the
connection. Thus, the value may range from 0 to 31 if unsigned, and from -16 to
15 if signed.
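A minimal sketch of how a receiver might interpret the CN of an SI, given that both sides have already agreed on signedness. `si_value` is a hypothetical helper name. A 5-bit two's complement value covers -16 through 15.

```python
def si_value(cn: int, signed: bool) -> int:
    """Interpret a 5-bit CN as a small integer.

    `signed` stands for the semantic agreement between both sides;
    it is not carried on the wire.
    """
    if not 0 <= cn <= 0x1F:
        raise ValueError("the CN must fit in 5 bits")
    if signed and cn & 0x10:
        # The top bit of the 5-bit field is the sign bit.
        return cn - 0x20
    return cn
```

The same CN therefore decodes differently depending on the agreement: `0b11111` is 31 when unsigned and -1 when signed.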
### Encoding Methods
Below are all encoding methods supported by TAPE.
#### Large Integer (LI)
LI encodes an integer of up to 256 bits, which are stored in the payload. The CN
determines the length of the payload in bytes. The integer is big-endian. Whether
the payload is interpreted as unsigned or as signed two's complement is semantic
information and must be agreed upon by both sides of the connection. Thus, for a
payload of n bytes, the value may range from 0 to 2^(8n) - 1 if unsigned, and
from -2^(8n - 1) to 2^(8n - 1) - 1 if signed.
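The sketch below shows one way to read and write an LI payload once its length is known. It takes the payload bytes directly and makes no claim about how the CN maps to a byte count. `decode_li` and `encode_li` are hypothetical names, and the 1 to 32 byte bound is inferred from the 256-bit limit.

```python
def decode_li(payload: bytes, signed: bool) -> int:
    """Decode a big-endian LI payload.

    `signed` stands for the semantic agreement between both sides.
    """
    if not 1 <= len(payload) <= 32:
        raise ValueError("an LI payload is assumed to be 1 to 32 bytes")
    return int.from_bytes(payload, "big", signed=signed)


def encode_li(value: int, length: int, signed: bool) -> bytes:
    """Encode `value` as a big-endian payload of exactly `length` bytes."""
    if not 1 <= length <= 32:
        raise ValueError("an LI payload is assumed to be 1 to 32 bytes")
    # to_bytes raises OverflowError if the value does not fit.
    return value.to_bytes(length, "big", signed=signed)
```

For a two-byte payload `FF FE`, the unsigned reading is 65534 and the signed reading is -2.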
#### BETC
Big-Endian, Two's Complement signed integer. The size is defined as the smallest
number of whole octets that can fit all bits in the integer, regardless of
whether the bits are on or off. Therefore, the size cannot change at runtime.
#### Floating Point (FP)
FP encodes an IEEE 754 floating point number of up to 256 bits, which are stored
in the payload. The CN determines the length of the payload in bytes, and the
number may only have one of these widths in bits: 16, 32, 64, 128, or 256.
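A rough sketch of decoding an FP payload with Python's `struct` module. The section does not state the byte order of the payload, so big-endian is an assumption here (chosen to match LI). `struct` only covers the 16, 32, and 64 bit formats, so this sketch rejects the 128 and 256 bit widths rather than guessing at them. `decode_fp` is a hypothetical name.

```python
import struct

# Payload length in bytes -> struct format (big-endian is an assumption).
_FP_FORMATS = {2: ">e", 4: ">f", 8: ">d"}


def decode_fp(payload: bytes) -> float:
    """Decode a 16, 32, or 64 bit IEEE 754 payload."""
    fmt = _FP_FORMATS.get(len(payload))
    if fmt is None:
        raise ValueError(
            f"unsupported FP payload length in this sketch: {len(payload)} bytes"
        )
    return struct.unpack(fmt, payload)[0]
```

A full implementation would need its own handling for the 128 and 256 bit formats, which Python's built-in `float` cannot represent exactly.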
#### BEU
Big-Endian, Unsigned integer. The size is defined as the smallest number of
whole octets that can fit all bits in the integer, regardless of whether the
bits are on or off. Therefore, the size cannot change at runtime.
#### Small Byte Array (SBA)
SBA encodes an array of up to 32 bytes, which are stored in the payload. The
CN determines the length of the payload in bytes.
#### GBEU
Growing Big-Endian, Unsigned integer. The integer is broken up into 8-bit
chunks, where the first bit of each chunk is a CCB. The chunk with its CCB set
to zero instead of one is the last chunk in the integer. Chunks are ordered from
most significant to least significant (big endian). The size is defined as the
smallest number of whole octets that can fit all chunks of the integer. The size
of this type is not fixed and may change at runtime, so this needs to be
accounted for during use.
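The chunking rule for GBEU can be sketched as follows. This assumes "first bit" means the most significant bit of each octet, leaving 7 value bits per chunk. `encode_gbeu` and `decode_gbeu` are hypothetical names.

```python
def encode_gbeu(value: int) -> bytes:
    """Encode a non-negative integer as 7-bit chunks, most significant first.

    Assumption: the CCB is the top bit of each octet. It is 1 on every
    chunk except the last.
    """
    if value < 0:
        raise ValueError("GBEU is unsigned")
    chunks = [value & 0x7F]  # last chunk, CCB = 0
    value >>= 7
    while value:
        chunks.append(0x80 | (value & 0x7F))  # earlier chunk, CCB = 1
        value >>= 7
    return bytes(reversed(chunks))


def decode_gbeu(data: bytes, offset: int = 0) -> tuple:
    """Decode one GBEU starting at `offset`; return (value, next offset)."""
    value = 0
    while True:
        if offset >= len(data):
            raise ValueError("truncated GBEU")
        octet = data[offset]
        offset += 1
        value = (value << 7) | (octet & 0x7F)
        if not octet & 0x80:  # CCB = 0 marks the last chunk
            return value, offset
```

Under these assumptions, 300 encodes to the two octets `82 2C`, and values below 128 fit in a single octet.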
#### Large Byte Array (LBA)
LBA encodes an array of up to 2^256 bytes, which are stored in the second part
of the payload, directly after the length. The length of the data length field
in bytes is determined by the CN.
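A minimal sketch of reading an LBA payload: a length field followed by that many data bytes. The caller supplies the size of the length field, since this sketch does not derive it from the CN. Reading the length field as a big-endian unsigned integer is an assumption (the section states byte order only for LI). `decode_lba` is a hypothetical name.

```python
def decode_lba(payload: bytes, length_size: int) -> tuple:
    """Return (data, bytes consumed) for an LBA payload.

    `length_size` is the size of the length field in bytes, as
    determined from the CN by the caller.
    Assumption: the length field is big-endian and unsigned.
    """
    if length_size < 1 or len(payload) < length_size:
        raise ValueError("truncated LBA length field")
    data_length = int.from_bytes(payload[:length_size], "big")
    end = length_size + data_length
    if len(payload) < end:
        raise ValueError("truncated LBA data")
    return payload[length_size:end], end
```

Returning the number of bytes consumed lets a caller continue parsing whatever follows the array in the same buffer.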
#### PASTA
Packed Single-Type Array. The size is defined as the size of an individual item
times the number of items. Items are placed one after the other with no gaps
in-between them, except as required to align the start of each item to the
nearest whole octet. Items should be of the same type and must be of the same
size.
#### One-Tag Array (OTA)
OTA encodes an array of up to 2^256 items, which are stored in the payload after
the length field and the item tag, where the length field comes first. Each item
must be the same length, as they all share the same tag. The length of the data
length field in bytes is determined by the CN.
#### UTF-8
UTF-8 string. The size is defined as the smallest number of whole octets that
can fit all bits in the string, regardless of whether the bits are on or off.
The size of this type is not fixed and may change at runtime, so this needs to
be accounted for during use.
#### VILA
Variable Item Length Array. The size is defined as the smallest number of whole
octets that can fit each item plus one GBEU per item describing that item's
size. The size of this type is not fixed and may change at runtime, so this
needs to be accounted for during use. The number of items must be greater than
zero. Items are each prefixed by their size (in octets) encoded as a GBEU, and
they are placed one after the other with no gaps in-between them, except as
required to align the start of each item to the nearest whole octet. Items
should be of the same type but do not need to be of the same size.
#### TTLV
TAPE Tag Length Value. The size is defined as the smallest number of whole
octets that can fit each item plus one U16 and one GBEU per item, the latter of
which describes that item's size. The size of this type is not fixed and may
change at runtime, so this needs to be accounted for during use. Items are each
prefixed by their numerical tag encoded as a U16, and their size (in octets)
encoded as a GBEU. Items are placed one after the other with no gaps in-between
them, except as required to align the start of each item to the nearest whole
octet. Items need not be of the same type nor the same size.
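The TTLV item layout can be sketched as a simple loop: read a U16 tag, read a GBEU size, then take that many octets as the value. This assumes the tag comes before the size (the order in which the text lists them), that the GBEU continuation bit is the top bit of each octet, and that a repeated tag simply overwrites the earlier entry. `read_gbeu` and `decode_ttlv` are hypothetical names.

```python
def read_gbeu(data: bytes, offset: int) -> tuple:
    """Decode one GBEU at `offset`; return (value, next offset).

    Assumption: the top bit of each octet is the continuation bit.
    """
    value = 0
    while True:
        if offset >= len(data):
            raise ValueError("truncated GBEU")
        octet = data[offset]
        offset += 1
        value = (value << 7) | (octet & 0x7F)
        if not octet & 0x80:
            return value, offset


def decode_ttlv(data: bytes) -> dict:
    """Split a TTLV structure into a {tag: raw value bytes} mapping."""
    items = {}
    offset = 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated TTLV tag")
        tag = int.from_bytes(data[offset:offset + 2], "big")  # U16 is BEU
        offset += 2
        size, offset = read_gbeu(data, offset)
        if offset + size > len(data):
            raise ValueError("truncated TTLV value")
        items[tag] = data[offset:offset + size]
        offset += size
    return items
```

Because every item carries its own size, a receiver can skip values whose tags it does not recognize without understanding their contents.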
#### Key-Tag-Value Table (KTV)
KTV encodes a table of up to 2^256 key/value pairs, which are stored in the
payload after the length field. The pairs themselves consist of a 16-bit
unsigned big-endian key followed by a tag and then the payload. Pair values can
be of different types and sizes. The order of the pairs is not significant and
should never be treated as such.
## Transports
A transport is a protocol that HOPP connections can run on top of. HOPP
@ -176,7 +157,6 @@ sun will have expanded to swallow earth by then. Your connection will not last
that long.
#### Message Chunking
The most significant bit of the payload size field of an MMB is called the Chunk
Control Bit (CCB). If the CCB of a given MMB is zero, the represented message is
interpreted as being self-contained and the data is processed immediately. If