Number parsing routine #94
Rust has good number parsing with the parse method, meanwhile C's stuck in 1989. atoi(3) sucks. It returns 0 if the value is 0 or if there's an error, so you can't tell the two apart. strtol(3) is the grown-up version of the function but is a real pain to use (set errno to 0, check it afterwards, and also check the end pointer). I really want a good number parser to use in C programs.
Things to consider:
libbonsai, or should we have specific libraries for specific sets of functionality?
Advocating for only accepting ASCII 0-9. Homoglyphs are annoying to deal with and often end up being a source of sanitization-related security issues. Perhaps an additional Unicode-aware version would be warranted.
related: https://util.unicode.org/UnicodeJsps/confusables.jsp?a=47&r=None
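For illustration, here is a rough sketch of what an ASCII-only parser could look like. The name parse_ascii_long and its signature are made up for this example and are not an existing libbonsai function. It assumes non-negative decimal input and rejects signs, whitespace, and anything else that is not a byte in '0'..'9'.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (name and signature are illustrative only): parse a
 * non-negative decimal integer made only of ASCII '0'..'9'. Returns true
 * and stores the value in *out on success; returns false on an empty
 * string, any other byte (signs, whitespace, non-ASCII digits), or
 * overflow. */
bool parse_ascii_long(const char *s, long *out)
{
	long value = 0;
	size_t i;

	if (s == NULL || out == NULL || s[0] == '\0')
		return false;

	for (i = 0; s[i] != '\0'; i++) {
		int digit;

		/* Compare bytes directly: every byte of a multibyte UTF-8
		 * lookalike digit falls outside '0'..'9' and is rejected. */
		if (s[i] < '0' || s[i] > '9')
			return false;
		digit = s[i] - '0';

		/* Reject before the multiply-add could overflow. */
		if (value > (LONG_MAX - digit) / 10)
			return false;
		value = value * 10 + digit;
	}

	*out = value;
	return true;
}
```

Each byte is looked at once, and zero is a distinct result from failure because the error is reported through the return value rather than through the parsed number.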
I could write C bindings to the Rust functions, that way they just accept anything Rust does.
I’m not really sure what this is all about, though. Can you explain in-depth what this issue is for?
Integer parsing has to be done in our C programs in a couple of places:
I'm starting to implement pg(1) from #44. The code is bad and the branch in which I'm implementing it is mainly serving as a playground in which I can toss shit around and see what works. pg(1) needs to parse numeric arguments to configure page lengths.
This will be my third time figuring out integer parsing with strtol(3) and I have found it awkward and difficult to read every time I've done it. strtol(3) seems incredibly overengineered for this task (base configuration? an end pointer?) but scanning a string with isdigit(3p) and using atoi(3p) seems crude and checks each byte in the input at least twice. There are probably even some subtle inconsistencies between integer parsing in dj(1) and intcmp(1) (I really hope not) - I don't wanna take a third risk at inconsistent behavior.
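To make the strtol(3) dance concrete, here is a minimal sketch of the checks it needs, wrapped in one function. The name parse_long and its bool-plus-out-pointer shape are assumptions for this example, not something that exists in the tree. Note that strtol(3) itself still accepts leading whitespace and a sign, so this is looser than an ASCII-digits-only parser.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical wrapper (name and signature are illustrative only): parse a
 * base-10 long with strtol(3), returning true only if the whole string was
 * a number that fits in a long. */
bool parse_long(const char *s, long *out)
{
	char *end;
	long value;

	if (s == NULL || out == NULL)
		return false;

	errno = 0; /* strtol never clears errno, so reset it first */
	value = strtol(s, &end, 10);

	if (end == s) /* no digits were consumed */
		return false;
	if (*end != '\0') /* trailing junk such as "12abc" */
		return false;
	if (errno == ERANGE) /* out of range, value was clamped */
		return false;

	*out = value;
	return true;
}
```

A caller would then write something like `if (!parse_long(optarg, &lines)) usage();` instead of repeating the three checks at every call site, which is where the inconsistencies between utilities tend to creep in.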
Rust parsing is nice and it would be very welcome in our C utilities. Rust makes this very easy.