hopp/design/branched-generated-encoder.md


Branched Generated Decoder

Pasted here because Tebitea is down

The problem

TAPE is designed so that the decoder can gloss over data it does not understand. Technically the protocol allows for this, but I completely forgot to implement this in the generated decoder, oops. This would be trivial if TAPE messages were still flat tables, but they aren't, because those aren't useful enough. So, let's analyze the problem.

When it happens

There are two reasons something might not match up with the expected data:

The first and most obvious is unrecognized keys. If a key is not in the set of recognized keys for a KTV, the decoder should leave the corresponding struct field blank. Once #6 has been implemented, it should instead throw an error if the field is not optional.

The second is wrong types. If we are expecting a KTV and get an SBA, we should leave the data empty. The aforementioned concern about #6 also applies here. We don't need to worry about special cases at the structure root, because it would be technically possible to make the structure root an option, so it really is just a normal value. Until #6, we will leave that blank too.

Preliminary ideas

The first is going to be pretty simple. All we need to do is have a skimmer function that skims over TAPE data very quickly, and then call it on the KTV value each time we run into a mystery key. It should only return an error if the structure of the data is malformed in such a way that it cannot continue to the next value. This should be stored in the tape package alongside the dynamic decoding functions, because they will essentially function the same way and could probably share lots of code.
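
Here is a rough sketch of what such a skimmer could look like. Everything in it is made up for illustration: the Reader interface, the Tag shape, and the three kinds are assumptions standing in for whatever the tape package actually exposes, and the semantics (SBA as a length-prefixed payload, OTA as an array with one shared element tag, KTV as a sequence of key/tag/value entries) are guesses made only to show the recursion.

```go
package tape

import "fmt"

// Hypothetical sketch only; none of these names are the real tape package API.

type Kind int

const (
	KindSBA Kind = iota // opaque length-prefixed payload (assumed)
	KindOTA             // array whose elements all share one tag (assumed)
	KindKTV             // table of key/tag/value entries (assumed)
)

type Tag struct {
	Kind   Kind
	Length int // payload length in bytes, used here for SBA (assumed)
}

// Reader is the minimal decoder surface the skimmer needs (assumed).
type Reader interface {
	Skip(n int) error                                // discard n payload bytes
	ReadOTAHeader() (elem Tag, count int, err error) // element tag and element count
	ReadKTVHeader() (count int, err error)           // entry count
	ReadKeyTag() (key uint16, value Tag, err error)  // next entry's key and value tag
}

// Skim consumes exactly one value described by tag without building anything.
// It returns an error only when the data is too malformed to move past.
func Skim(r Reader, tag Tag) error {
	switch tag.Kind {
	case KindSBA:
		return r.Skip(tag.Length)
	case KindOTA:
		elem, count, err := r.ReadOTAHeader()
		if err != nil {
			return err
		}
		for i := 0; i < count; i++ {
			if err := Skim(r, elem); err != nil {
				return err
			}
		}
		return nil
	case KindKTV:
		count, err := r.ReadKTVHeader()
		if err != nil {
			return err
		}
		for i := 0; i < count; i++ {
			_, value, err := r.ReadKeyTag()
			if err != nil {
				return err
			}
			if err := Skim(r, value); err != nil {
				return err
			}
		}
		return nil
	default:
		return fmt.Errorf("tape: cannot skim unknown kind %d", tag.Kind)
	}
}
```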

The second is a bit more complicated because KTV and OTA are aggregate types. Go types work a bit differently: if you have an array of an array of an array of ints, all of that information is represented in one place, whereas TAPE doesn't really do that. All of that information is sort of buried within the data structure, so we don't know what we will be decoding before we actually do it. Whenever we encounter a type we don't expect, we would need to abort decoding of the entire data structure, and then skim over whatever detritus is left, which would be in a half-decoded state. The fact that the code is generated flat, and thus cannot use return or defer statements, contributes to the complexity of this problem. We need to go up, but we can't. There is no up, only forward.

Of course, the dynamic decoder does not have this problem in the first place because it doesn't expect anything, and constructs the destination to fit whatever it sees in the TAPE structure as it is decoding it. KTVs are completely dynamic because they are implemented as maps, so the only time it needs to completely comprehend a type is with OTAs. There is a function called typeOf that gets the type of the current tag and returns it as a reflect.Type, which necessitates recursion and peeking at OTAs and their elements.
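
To make the allocation concern below concrete, here is a rough illustration of what a typeOf-style helper has to do, reusing the assumed Reader and Tag shapes from the sketch above. This is not the real function: among other things, the real one has to peek without consuming the data it inspects, which is exactly where the peek buffer and the tree of type information come from.

```go
package tape

import (
	"fmt"
	"reflect"
)

// typeOfSketch shows why determining the Go type behind an OTA requires
// recursing into its elements. The real typeOf has to peek without consuming
// the data it inspects, which this sketch deliberately glosses over.
func typeOfSketch(r Reader, tag Tag) (reflect.Type, error) {
	switch tag.Kind {
	case KindSBA:
		return reflect.TypeOf([]byte(nil)), nil
	case KindOTA:
		// The slice type is unknown until the element tag (and, recursively,
		// its elements) have been inspected.
		elem, _, err := r.ReadOTAHeader()
		if err != nil {
			return nil, err
		}
		elemType, err := typeOfSketch(r, elem)
		if err != nil {
			return nil, err
		}
		return reflect.SliceOf(elemType), nil
	case KindKTV:
		// Tables decode into maps, so nothing deeper needs to be known up front.
		return reflect.TypeOf(map[uint16]any{}), nil
	default:
		return nil, fmt.Errorf("tape: unknown kind %d", tag.Kind)
	}
}
```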

We could try to do the same thing in the generated decoder, comparing the determined type against the expected type to figure out whether we should decode an array or a table, etc. This is immediately problematic because it requires memory to be allocated, both for the peek buffer and for the resulting tree of type information. Even if we end up with some crazy allocation-free way to keep track of the types, that only solves one half of the allocation problem, and we would still be spending extra cycles going over all of that data twice.

Performance constraints

The generated decoder is supposed to blaze through data, and it can't do that if it does all the singing and dancing that the dynamic decoder does. It's time for some performance constraints:

  • No allocations, except as required to build the destination for the data
  • No redundant work
  • So, no freaking peeking
  • It should take well under 500 lines of generated code to decode one message of reasonable size (i.e. be careful not to bloat the binary)

I'm not really going to do my usual thing here of making a slow version and speeding it up over time based on evidence and experimentation, because these constraints inform the design so much it would be impossible to continue without them. I am 99% confident that these constraints will allow for an acceptable baseline of performance (for generated code), and we can still profile and micro-optimize later. This is good enough for me.

Heavy solution

There is a solution that might work very well, which involves completely redoing the generated decoding code. We could create a function for every source type to destination type mapping that exists in the protocol, and then compose them all together. The decoding methods for each message or type would be wrappers around the correct function for their root TAPE -> Go type mapping. The main benefit of this is that it would make the problem a lot more manageable, because the interface points between pieces of data would be represented by function boundaries. This would allow the use of return and defer statements, and would allow more code sharing, producing a smaller binary. Go would probably inline these where needed.
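
Here is a sketch of what that shape could look like, building on the assumed Reader, Tag, and Skim from the earlier sketch (plus assumed ReadBytes/ReadUint16 methods for payloads). The message type, the key numbers, and the way these particular Go types map onto TAPE are all invented, and everything is collapsed into one package for brevity; only the structure matters here: one small function per mapping, composed by a root function, with the public Decode method reduced to a wrapper.

```go
package tape

// bytesReader extends the assumed Reader with payload access (also assumed).
type bytesReader interface {
	Reader
	ReadBytes(n int) ([]byte, error)
	ReadUint16() (uint16, error)
}

// Connect is a stand-in for any generated message type.
type Connect struct {
	Name string
	Port uint16
}

// One function per TAPE -> Go mapping. A wrong tag is not an error (yet, see
// #6): the value is skimmed and the field is left blank.
func decodeStringFromSBA(r bytesReader, tag Tag, dst *string) error {
	if tag.Kind != KindSBA {
		return Skim(r, tag)
	}
	buf, err := r.ReadBytes(tag.Length)
	if err != nil {
		return err
	}
	*dst = string(buf)
	return nil
}

func decodeUint16FromSBA(r bytesReader, tag Tag, dst *uint16) error {
	if tag.Kind != KindSBA {
		return Skim(r, tag)
	}
	value, err := r.ReadUint16()
	if err != nil {
		return err
	}
	*dst = value
	return nil
}

// The root mapping composes the field mappings. A nested aggregate would get
// its own decodeXFromKTV or decodeXFromOTA function called the same way.
func decodeConnectFromKTV(r bytesReader, dst *Connect) error {
	count, err := r.ReadKTVHeader()
	if err != nil {
		return err
	}
	for i := 0; i < count; i++ {
		key, tag, err := r.ReadKeyTag()
		if err != nil {
			return err
		}
		switch key {
		case 0:
			err = decodeStringFromSBA(r, tag, &dst.Name)
		case 1:
			err = decodeUint16FromSBA(r, tag, &dst.Port)
		default:
			err = Skim(r, tag) // unrecognized key: gloss over it
		}
		if err != nil {
			return err
		}
	}
	return nil
}

// The public method stays as a thin convenience wrapper around the root mapping.
func (c *Connect) Decode(r bytesReader) error {
	return decodeConnectFromKTV(r, c)
}
```

None of this is real generator output; it only shows how the function boundaries give each nesting level a place to return to when it hits something unexpected.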

Would this work? Probably. More investigation is required to make sure. I want to stop re-writing things I don't need to. On the other hand, it is just the decoder.

Light solution

TODO: find a solution that satisfies the performance constraints, keeps the exact same interface, and works off the same code. I am convinced this is doable, and it might even allow us to extract more data from an unexpected structure. However, continuing this way might introduce unmanageable complexity. It is already a little unmanageable and I am just one pony (kind of).

Implementation

The heavy solution is going to be used here, applied only to the points in Generator.generateDecodeValue where it decodes an aggregate data structure. That way, only a minimal amount of code needs to be redone.

Whenever a branch needs to happen, a call shall be generated and a deferred implementation request shall be added to a special FIFO queue within the generator. After generating data structures and their root decoding functions, the generator shall pick away at this queue until no requests remain. The generator shall accept new items during this process, so that recursion is possible. This is all to ensure it is only ever writing one function at a time.
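
A sketch of what that queue could look like inside the generator. The Generator and decodeRequest here are stand-ins with invented fields and names; the real generator has much more state, and a real request would carry the actual TAPE and Go type information for the mapping.

```go
package generate

// decodeRequest records one TAPE -> Go mapping whose decode function still
// needs to be written. The fields here are placeholders.
type decodeRequest struct {
	name string // name of the function to generate, e.g. "decodeConnectFromKTV"
	// ... source (TAPE) and destination (Go) type information would go here
}

// Generator is a stand-in for the real code generator.
type Generator struct {
	queue   []decodeRequest // FIFO of functions still to be written
	emitted map[string]bool // functions already written or already queued
}

// requestDecodeFunc is called at every point in generateDecodeValue where a
// branch needs to happen. It generates nothing itself; the caller emits a
// call to req.name, and writing the body is deferred to the queue.
func (g *Generator) requestDecodeFunc(req decodeRequest) {
	if g.emitted == nil {
		g.emitted = make(map[string]bool)
	}
	if g.emitted[req.name] {
		return
	}
	g.emitted[req.name] = true
	g.queue = append(g.queue, req)
}

// drainQueue runs after the data structures and their root decoding functions
// have been generated. Writing one function may enqueue more requests, which
// is how recursion stays possible while only ever writing one function at a time.
func (g *Generator) drainQueue(write func(decodeRequest) error) error {
	for len(g.queue) > 0 {
		next := g.queue[0]
		g.queue = g.queue[1:]
		if err := write(next); err != nil {
			return err
		}
	}
	return nil
}
```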

The functions shall take a pointer to a type parameter that accepts any type whose underlying type matches (~) the destination's base type. We should also probably just call Generator.generateDecodeValue directly on user-defined types this way, keeping their public Decode methods just for convenience.
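
In Go terms that means the functions are generic, constrained with ~ on the base type, so a user-defined type with the same underlying type flows through the same generated function. A sketch with invented names, reusing the assumed Reader, Tag, Skim, and bytesReader shapes from the earlier sketches, and assuming the caller has already read the root tag:

```go
package tape

// decodeStringValue accepts a pointer to any type whose underlying type is
// string, so both plain string fields and user-defined string types share it.
func decodeStringValue[T ~string](r bytesReader, tag Tag, dst *T) error {
	if tag.Kind != KindSBA {
		return Skim(r, tag) // wrong type: leave the destination blank
	}
	buf, err := r.ReadBytes(tag.Length)
	if err != nil {
		return err
	}
	*dst = T(buf)
	return nil
}

// A user-defined type keeps its public Decode method purely for convenience.
type Nickname string

func (n *Nickname) Decode(r bytesReader, tag Tag) error {
	return decodeStringValue(r, tag, n)
}
```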

The tape package shall contain a skimming function that takes a decoder and a tag, and recursively consumes the decoder given the context of the tag. This shall be utilized by the decoder functions to skip over values if their tags or keys do not match up with what is expected.