message-size-increase #3

Open
sashakoshka wants to merge 81 commits from message-size-increase into main
Showing only changes of commit c4a985f622

@@ -40,92 +40,73 @@ designed to allow applications to be presented with data they are not equipped
 to handle while continuing to function normally. This enables backwards
 compatible application protocol changes.
-The length of a TAPE structure is assumed to be given by the surrounding
-protocol, which is usually METADAPT-A or B. The root of a TAPE structure can be
-any data value, but is usually a table, which can contain several values that
-each have a numeric key. Values can also be nested. Both sides of the connection
-must agree on what data type should be the root value, the data type of each
-known table value, etc.
+TAPE expresses types using tags. A tag is 8 bits in size, and is divided into
+two parts: the Type Number (TN), and the Configuration Number (CN). The TN is 3
+bits, and the CN is 5 bits. Both are interpreted as unsigned integers. Both
+sides of the connection must agree on the semantic meaning of the values and
+their arrangement.
TAPE is based on an encoding method previously developed by silt.
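The tag layout described above (an 8-bit tag holding a 3-bit TN and a 5-bit CN) can be sketched in Python as follows. The text does not say which end of the byte holds the TN, so this sketch assumes the TN occupies the upper 3 bits and the CN the lower 5. The function names `split_tag` and `join_tag` are hypothetical.

```python
def split_tag(tag: int) -> tuple[int, int]:
    """Split an 8-bit tag into (TN, CN), both unsigned."""
    if not 0 <= tag <= 0xFF:
        raise ValueError("a tag is exactly 8 bits")
    # Assumption: TN is the upper 3 bits, CN is the lower 5 bits.
    return tag >> 5, tag & 0x1F


def join_tag(tn: int, cn: int) -> int:
    """Build an 8-bit tag from a 3-bit TN and a 5-bit CN."""
    if not 0 <= tn <= 0b111:
        raise ValueError("TN must fit in 3 bits")
    if not 0 <= cn <= 0b11111:
        raise ValueError("CN must fit in 5 bits")
    return (tn << 5) | cn
```

Under that assumption, `split_tag(join_tag(tn, cn))` returns `(tn, cn)` for every valid pair, so the two functions are inverses of each other.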
 ### Data Value Types
-The table below lists all data value types supported by TAPE.
+The table below lists all data value types supported by TAPE. They are discussed
+in detail in the following sections.
-| Name        | Size            | Description                 | Encoding Method
-| ----------- | --------------: | --------------------------- | ---------------
-| I8          |               1 | A signed 8-bit integer      | BETC
-| I16         |               2 | A signed 16-bit integer     | BETC
-| I32         |               4 | A signed 32-bit integer     | BETC
-| I64         |               8 | A signed 64-bit integer     | BETC
-| U8          |               1 | An unsigned 8-bit integer   | BEU
-| U16         |               2 | An unsigned 16-bit integer  | BEU
-| U32         |               4 | An unsigned 32-bit integer  | BEU
-| U64         |               8 | An unsigned 64-bit integer  | BEU
-| Array[^1]   |                 | An array of any above type  | PASTA
-| String      |                 | A UTF-8 string              | UTF-8
-| StringArray |                 | An array the String type    | VILA
-| Table       |                 | A table of any type         | TTLV
+| TN | Bits | Name | Description
+| -: | ---: | ---- | -----------
+|  0 |  000 | SI   | Small integer
+|  1 |  001 | LI   | Large integer
+|  2 |  010 | FP   | Floating point
+|  3 |  011 | SBA  | Small byte array
+|  4 |  100 | LBA  | Large byte array
+|  5 |  101 | OTA  | One-tag array
+|  6 |  110 | KTV  | Key-tag-value table
+|  7 |  111 | N/A  | Reserved
-[^1]: Array types are written as <E>Array, where <E> is the element type. For
-example, an array of I32 would be written as I32Array. StringArray still follows
-this rule, even though it is encoded differently from other arrays.
+#### No Value (NIL)
+NIL is used to encode the absence of a value where there would otherwise be one.
+The CN of a NIL is ignored. It has no payload.
-[^2]: SOP (sum of parts) refers to the sum of the size of every item in a data
-structure.
+#### Small Integer (SI)
+SI encodes an integer of up to 5 bits, which are stored in the CN. It has no
+payload. Whether the bits are interpreted as unsigned or as signed two's
+complement is semantic information and must be agreed upon by both sides of the
+connection. Thus, the value may range from 0 to 31 if unsigned, and from -16 to
+15 if signed.
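As a minimal sketch of the SI rule, the function below interprets a 5-bit CN either as unsigned or as signed two's complement. Five bits of two's complement cover -16 to 15. The name `si_value` and the `signed` flag are illustrative; in TAPE the signedness is agreed out of band rather than carried in the tag.

```python
def si_value(cn: int, signed: bool) -> int:
    """Interpret the 5-bit CN of an SI tag as an integer."""
    if not 0 <= cn <= 0x1F:
        raise ValueError("CN must fit in 5 bits")
    # In two's complement, a set top bit (0x10) means the value is negative.
    if signed and cn & 0x10:
        return cn - 32
    return cn
```

For example, the bit pattern `11111` reads as 31 when unsigned and as -1 when signed.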
-### Encoding Methods
-Below are all encoding methods supported by TAPE.
+#### Large Integer (LI)
+LI encodes an integer of up to 256 bits, which are stored in the payload. The CN
+determines the length of the payload in bytes. The integer is big-endian. Whether
+the payload is interpreted as unsigned or as signed two's complement is semantic
+information and must be agreed upon by both sides of the connection. Thus, the
+value may range from 0 to 2^256 - 1 if unsigned, and from -2^255 to 2^255 - 1 if
+signed.
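The LI payload interpretation can be sketched with Python's built-in `int.from_bytes`. This sketch takes the payload bytes directly and leaves the mapping from CN to payload length to the caller, since the text does not spell that mapping out. The name `li_value` is hypothetical.

```python
def li_value(payload: bytes, signed: bool) -> int:
    """Interpret an LI payload as a big-endian integer."""
    if len(payload) > 32:
        raise ValueError("LI payloads hold at most 256 bits (32 bytes)")
    # Signedness is agreed upon by both sides; it is not stored in the payload.
    return int.from_bytes(payload, byteorder="big", signed=signed)
```

The same two bytes `FF FE` decode to 65534 when unsigned and to -2 when signed, which is why both sides have to agree on the interpretation in advance.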
-#### BETC
-Big-Endian, Two's Complement signed integer. The size is defined as the least
-amount of whole octets which can fit all bits in the integer, regardless if the
-bits are on or off. Therefore, the size cannot change at runtime.
+#### Floating Point (FP)
+FP encodes an IEEE 754 floating point number of up to 256 bits, which are stored
+in the payload. The CN determines the length of the payload in bytes, and the
+payload may only be one of these widths in bits: 16, 32, 64, 128, or 256.
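A rough sketch of FP decoding for the three IEEE 754 widths that Python's `struct` module supports natively (16, 32, and 64 bits). The text only states big-endian order for LI, so big-endian byte order here is an assumption. The 128-bit and 256-bit formats have no native Python type and are left out. The name `fp_value` is hypothetical.

```python
import struct

# Payload length in bytes -> struct format (assumed big-endian).
_FP_FORMATS = {2: ">e", 4: ">f", 8: ">d"}


def fp_value(payload: bytes) -> float:
    """Decode a 16-, 32-, or 64-bit IEEE 754 payload."""
    fmt = _FP_FORMATS.get(len(payload))
    if fmt is None:
        raise ValueError(f"unsupported FP payload length: {len(payload)} bytes")
    return struct.unpack(fmt, payload)[0]
```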
-#### BEU
-Big-Endian, Unsigned integer. The size is defined as the least amount of whole
-octets which can fit all bits in the integer, regardless if the bits are on or
-off. Therefore, the size cannot change at runtime.
+#### Small Byte Array (SBA)
+SBA encodes an array of up to 32 bytes, which are stored in the payload. The
+CN determines the length of the payload in bytes.
-#### GBEU
-Growing Big-Endian, Unsigned integer. The integer is broken up into 8-bit
-chunks, where the first bit of each chunk is a CCB. The chunk with its CCB set
-to zero instead of one is the last chunk in the integer. Chunks are ordered from
-most significant to least significant (big endian). The size is defined as the
-least amount of whole octets which can fit all chunks of the integer. The size
-of this type is not fixed and may change at runtime, so this needs to be
-accounted for during use.
+#### Large Byte Array (LBA)
+LBA encodes an array of up to 2^256 bytes, which are stored in the second part
+of the payload, directly after the length. The length of the data length field
+in bytes is determined by the CN.
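The LBA payload layout (a length field followed by the data) can be sketched as below. The sketch assumes the length field is a big-endian unsigned integer, by analogy with LI, and takes the field's size in bytes as a parameter instead of deriving it from the CN. The names `parse_lba` and `length_field_size` are hypothetical.

```python
def parse_lba(payload: bytes, length_field_size: int) -> tuple[bytes, bytes]:
    """Return (data, rest) for an LBA payload.

    `rest` is whatever follows the byte array in the input buffer.
    """
    if len(payload) < length_field_size:
        raise ValueError("truncated length field")
    # Assumption: the length field is big-endian and unsigned.
    data_len = int.from_bytes(payload[:length_field_size], "big")
    start = length_field_size
    end = start + data_len
    if len(payload) < end:
        raise ValueError("truncated data")
    return payload[start:end], payload[end:]
```

With a 2-byte length field, the buffer `00 03 61 62 63 58 59` yields the data `abc` and leaves `XY` for the next value.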
-#### PASTA
-Packed Single-Type Array. The size is defined as the size of an individual item
-times the number of items. Items are placed one after the other with no gaps
-in-between them, except as required to align the start of each item to the
-nearest whole octet. Items should be of the same type and must be of the same
-size.
+#### One-Tag Array (OTA)
+OTA encodes an array of up to 2^256 items, which are stored in the payload after
+the length field and the item tag, where the length field comes first. Each item
+must be the same length, as they all share the same tag. The length of the data
+length field in bytes is determined by the CN.
-#### UTF-8
-UTF-8 string. The size is defined as the least amount of whole octets which can
-fit all bits in the string, regardless if the bits are on or off. The size of
-this type is not fixed and may change at runtime, so this needs to be accounted
-for during use.
+#### Key-Tag-Value Table (KTV)
+KTV encodes a table of up to 2^256 key/value pairs, which are stored in the
+payload after the length field. The pairs themselves consist of a 16-bit
+unsigned big-endian key followed by a tag and then the payload. Pair values can
+be of different types and sizes. The order of the pairs is not significant and
+should never be treated as such.
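To illustrate the KTV pair layout (16-bit big-endian key, then a tag, then the value payload), here is a deliberately narrow sketch that only accepts SI values, which carry no payload. It rests on several assumptions not stated in the text: the length field is a big-endian count of pairs, the TN sits in the upper 3 bits of the tag, and SI values are read as unsigned. The name `parse_ktv_of_si` is hypothetical.

```python
import struct

TN_SI = 0  # Small integer, per the data value type table


def parse_ktv_of_si(payload: bytes, length_field_size: int) -> dict[int, int]:
    """Parse a KTV payload whose values are all SI (no value payload)."""
    # Assumption: the length field is a big-endian count of pairs.
    count = int.from_bytes(payload[:length_field_size], "big")
    offset = length_field_size
    table: dict[int, int] = {}
    for _ in range(count):
        # Each pair starts with a 16-bit big-endian key and an 8-bit tag.
        key, tag = struct.unpack_from(">HB", payload, offset)
        offset += 3
        # Assumption: TN is the upper 3 bits of the tag, CN the lower 5.
        tn, cn = tag >> 5, tag & 0x1F
        if tn != TN_SI:
            raise ValueError("this sketch only handles SI values")
        table[key] = cn  # read as unsigned
    return table
```

Returning a `dict` reflects the rule that pair order carries no meaning: two payloads with the same pairs in a different order produce equal tables.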
-#### VILA
-Variable Item Length Array. The size is defined as the least amount of whole
-octets which can fit each item plus one GBEU per item describing that item's
-size. The size of this type is not fixed and may change at runtime, so this
-needs to be accounted for during use. The amount of items must be greater than
-zero. Items are each prefixed by their size (in octets) encoded as a GBEU, and
-they are placed one after the other with no gaps in-between them, except as
-required to align the start of each item to the nearest whole octet. Items
-should be of the same type but do not need to be of the same size.
-#### TTLV
-TAPE Tag Length Value. The size is defined as the least amount of whole octets
-which can fit each item plus one U16 and one GBEU per item, where the latter of
-which describes that item's size. The size of this type is not fixed and may
-change at runtime, so this needs to be accounted for during use. Items are each
-prefixed by their numerical tag encoded as a U16, and their size (in octets)
-encoded as a GBEU. Items are placed one after the other with no gaps in-between
-them, except as required to align the start of each item to the nearest whole
-octet. Items need not be of the same type nor the same size.
 ## Transports
 A transport is a protocol that HOPP connections can run on top of. HOPP
@@ -176,7 +157,6 @@ sun will have expanded to swallow earth by then. Your connection will not last
 that long.
 #### Message Chunking
 The most significant bit of the payload size field of an MMB is called the Chunk
 Control Bit (CCB). If the CCB of a given MMB is zero, the represented message is
 interpreted as being self-contained and the data is processed immediately. If