Formatting #142
Labels
No Label
bug
duplicate
enhancement
help wanted
invalid
joke
question
wontfix
No Milestone
No project
No Assignees
4 Participants
Notifications
Due Date
No due date set.
Depends on
#138 dj(1): fix yet another stdin skipping bug
bonsai/harakit
Reference: bonsai/harakit#142
Loading…
Reference in New Issue
Block a user
No description provided.
Delete Branch "dj-formatting"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
dj(1): formattingto FormattingThis started as a
dj(1)
-specific branch but I wanted to work through everything.oops, commit
acc3cf3e90
is not forswab(1)
but fornpc(1)
.@trinity please remove the
goto
andpass
label fromstr(1)
, I couldn’t figure out how to make it work otherwise.mm(1)
will be complex to do, I can finish it later.@ -25,3 +26,1 @@
#if !defined EX_OK || !defined EX_OSERR || !defined EX_USAGE
# include <sysexits.h>
#endif
#include <sysexits.h>
Say which ones you're using.
@ -69,3 +68,2 @@
static struct Io *
Io_read(struct Io *io){
static struct Io * Io_read(struct Io *io) {
This is so I can move to
/^function/
to find its definition.This should stay that way because it's somewhat common practice among greybeards (I know many in the 9front community do it).
@ -180,3 +177,2 @@
if(argc > 0){
int c;
if (!argc < 0) { usage(program_name); }
if (!(argc < 0))
https://en.cppreference.com/w/c/language/operator_precedence@ -216,3 +221,1 @@
if((c == 'b' && (io[i].bs = parse(optarg)) > 0)
|| (c == 's' && (io[i].seek = parse(optarg)) >= 0))
break;
c |= 0b00100000; /* (ASCII) make lowercase */
C89 can't do
0b
literals, which is why I used the more portable hex with a comment.@ -272,3 +289,2 @@
t = io[0].bufuse;
if(Io_read(&io[0])->bufuse == t && !noerror && io[0].error == 0)
size_t t = io[0].bufuse;
Variables should be declared at the start of the scope; this is invalid C89.
@ -35,0 +35,4 @@
int usage(char *s) {
fprintf(
stderr, "Usage: %s [-egl] integer integer...\n",
s == NULL ? program_name : s
This ternary should happen somewhere in
main
.@ -44,2 +51,3 @@
if(argc < 3) { return usage(argv[0]); }
while((c = getopt(argc, argv, "egl")) != -1)
while ((c = getopt(argc, argv, "egl")) != -1) {
Test if argc is zero before doing this.
@ -26,3 +27,1 @@
#if !defined EX_IOERR || !defined EX_OK || !defined EX_OSERR \
|| !defined EX_USAGE
# include <sysexits.h>
#include <sysexits.h>
Which sysexits?
@ -37,3 +39,1 @@
case 't': showtab = 1; break;
default: goto usage;
}
if(!argc > 0) { usage(argv[0]); }
if(!(argc > 0))
. Also, this program should work whenargc==0
.I can't add a second comment on the same line so I had to unresolve this irrelevant issue.
program_name
should be set here toargv[0]
, which would be consistent with my suggested change to mm(1) and how dj(1) was before this patchset.This should be moved to the
if (argc > 0)
block.@ -20,2 +42,2 @@
else if(*argv[i-1] < *argv[i]++)
return -1; /* actually 255 */
} else if (*argv[i-1] < *argv[i]++) {
return 255;
This should be -1. While it is in reality 255, the negative is consistent with strcmp(3p) and may be useful on certain systems. Also, at some point retvals here should be more significant.
This code needs to be commented two to three times as much as it is.
I just pushed a draft style guide.
@ -0,0 +1,61 @@
- Braces are mandatory for all control flow
This stands to clutter code significantly. I guess because Rust does it, it makes sense for this project, but C programmers who do this bug the hell out of me.
They increase readability substantially.
seconded. braces are a must.
I think I've come around on this having tried writing this style of code a bit to practice.
@ -0,0 +1,61 @@
- Braces are mandatory for all control flow
- Indentation should be kept to a minimum
Nesting specifically.
@ -0,0 +1,61 @@
- Braces are mandatory for all control flow
- Indentation should be kept to a minimum
- Empty lines should be placed between different kinds of statements:
This is fine. Maybe even good.
@ -0,0 +21,4 @@
return io;
- Cases in switch statements and matches in match statements should be indented
one level
Provide an example.
I am fervently against this:
Indenting cases gives me a headache. All case lines in C always start with
case
which may as well be an indentation itself.Unindented cases are a pain in the ass to read.
from intcmp:
this is hell to read. part of it is the lack of braces for
while
's block, but the switch itself is a big issue, too. the contents of blocks are to be indented, period. switches aren't magically exempt from standard readability guidelines.Bummer but I can accept this.
@ -0,0 +37,4 @@
);
- If Rust function arguments or fields are on their own lines, they should
always have a trailing comma.
This shouldn't be required. Typos are easy to spot in Rust and I believe C disallows this.
This rule is imported directly from the default behavior of
rustfmt(1)
.@ -0,0 +47,4 @@
- If a control flow statement is short enough to be easily understood in a
glance, it may be placed on a single line:
if (!argc < 0) { usage(program_name); }
This is good.
@ -0,0 +52,4 @@
- If a do while loop in C is longer than ~25 lines, place the while statement
in a comment after the opening brace:
do { /* while(count == 0 || --count > 0); */
This is what indentation is for. This introduces extra complexity to program maintenance - what if the command and while statement are mismatched? It's less important to know the constraints of repetition in a do/while because the code inside will always run at least once; the while constraint is only on subsequent runs.
I agree with this generally; however,
dj(1)
’s source code should be special-cased to utilize this practice as it is difficult to figure whichdo
applies to whichwhile
.if we're special-casing it for dj because of its wack source code, then it probably shouldn't a hard rule like this. i agree that it introduces more potential issues and complexity for program maintenance, and i'm really not sure what to do about that. indentation is not enough to make dj's control flow very clear, though.
Recent changes to STYLE reflect this.
I'd like to suggest prohibiting the use of do-while loops. There is almost always a better way, and that better way is usually a normal while loop.
I can sort of get behind this because as far as I know they aren't in Rust. I think having similar C practices to Rust practices is useful for maintainability (at least, remembering practice).
They are not in Rust. I can add it to the style guidelines if it is wanted.
I can go about refactoring all the code that uses them tonight. I would be fine with outright disallowing do-whiles.
@ -185,2 +184,2 @@
.B -b
and
Option variants that have uppercase and lowercase forms could be confused for
each other. The former affects input and the latter affects output.
The opposite is true.
@ -187,3 +188,2 @@
.B -B
could be confused for each other, and so could
.B -s
option could be mistaken for write size, meaning the count in bytes of data
This is write size. But dj(1) writes everything it reads.
I meant the total size of the written data; I will say that.
@ -192,1 +194,3 @@
The lowercase option affects input and the capitalized option affects output.
.B -b
options. The latter sets the size of blocks to be read and the former sets the
number of blocks to be read. The
This sounds wack.
What do you think of my changes?
I'm okay with this.
@ -193,0 +196,4 @@
number of blocks to be read. The
.B -B
option is similar to the latter but sets the size of blocks to be written,
regardless of the amount of data that will actually be written.
That's because the amount of data in total that will be written is totally irrelevant to -B.
It should be highlighted that the input buffer should be very large (e.g. 4M) to take advantage of modern hardware I/O speeds, and for most uses mm(1) is better.
@ -139,3 +140,1 @@
return (*s == '\0' /* no chars left unparsed */ && errno == 0)
? r
: -1;
return (*s == '\0' && errno == 0) ? r : -1; /* no chars left unparsed */
Moving this comment makes it wrong. This ternary makes sure strtol(3p) succeeded.
*s == 0
, which is on its own vague, checks if no chars are left unparsed.@ -180,3 +178,2 @@
if(argc > 0){
int c;
if (!(argc < 0)) { usage(program_name); }
getopt(3p) will still happen when
argc==0
. And dj(1) should run whenargc<0
. If the user's system is broken, don't deny them the basic tools to fix it.@ -186,3 +181,1 @@
switch(c){
case 'i': case 'o': i = (c == 'o');
if(optarg[0] == '-' && optarg[1] == '\0'){ /* optarg == "-" */
int c;
This should be scope-limited to just the getopt(3p) part, which the if statement did.
@ -189,0 +183,4 @@
program_name = argv[0];
while ((c = getopt(argc, argv, "a:b:B:c:i:hHns:S:o:")) != -1) {
switch (c) {
case 'i': case 'o': /* input, output */
Indenting this sucks and looks ugly even in practice. See my criticisms in the style guide.
@ -216,3 +222,1 @@
if((c == 'b' && (io[i].bs = parse(optarg)) > 0)
|| (c == 's' && (io[i].seek = parse(optarg)) >= 0))
break;
c |= 0x20 /* ASCII make lowercase 0b 0010 0000 */
This looks like I threw my code line through an AI blender. The bits should be next to the hex, separate from the unrelated comment. This statement is missing a semicolon.
@ -252,2 +268,2 @@
return oserr(io[1].fn, io[1].error);
}while((io[1].seek -= (t - io[1].bufuse)) > 0 && io[1].bufuse != t);
}
if (io[1].error != 0) { return oserr(io[1].fn, io[1].error); }
This is substantially less readable than before.
Yea, this should be on another line, I think this one was done in a sleepless fugue.
@ -260,3 +278,3 @@
}
do{
do { /* while(count == 0 || --count > 0); */
Again, this commenting is bad. C isn't its preprocessor. If this kind of commenting is needed I would straight-up rather remove C from this project than continue with it. The alternative to do/whiles is infinite loops with conditions at the end which are what do/whiles exist to solve. Commenting the function of basic C language features clutters code significantly.
@ -59,0 +48,4 @@
while ((c = getc(stdin)) != EOF) {
if ((c & 0x80) != 0) { fputs("M-", stdout); }
switch (c ^ 0x80) { /* 0b 1000 0000 */
These bits should be next to the hex.
@ -23,3 +23,1 @@
#ifndef EX_USAGE
# include <sysexits.h>
#endif
#include <sysexits.h>
What macros are used?
Either way, formatting shouldn't be done manually, and I can't approve this until there's a way to do it automatically.
@trinity, @silt: I am waiting on y’all for this to be ready to merge.
I think my work here is done.
This is looking really good.
@ -0,0 +36,4 @@
}
4. In C, spaces should be placed in control flow statements after the keyword
and before the opening brace:
parenthesis*
Not “after the keyword, before the opening parenthesis”, but “after the keyword and before the opening brace”.
got it, thanks. misread.
@trinity: Are you cool with this being merged without
.clang-format
?Didn't mean to make this a review, just comments.
@ -32,3 +31,3 @@
extern int errno;
char *program_name = "dj";
static char *program_name = "dj";
This wasn't static because the program name should show up if something horrendous happens during linking, to make debugging easier.
Should they all not be static, then?
Yes, I've slowly been changing this in all the C code but need to document the rationale.
@ -148,3 +152,2 @@
"\t[-i file] [-b block_size] [-s offset]\n"
"\t[-o file] [-B block_size] [-S offset]\n",
program_name);
"\t[-o file] [-B block_size] [-S offset]\n", s
This is good.
@ -159,3 +163,4 @@
size_t i; /* side of io being modified */
char noerror; /* 0=exits (default) 1=retries on partial reads or writes */
struct Io io[2 /* { in, out } */];
char *s = (argv[0] == NULL ? program_name : argv[0]);
Use program name for this.
I don’t follow.
Instead of making a new variable overwrite the existing program name variable with argv 0. My phone keyboard lacks the underscore.
I didn't see it but this case is already covered on L189;
if argc > 0 { program_name = argv[0]; ...
@ -42,1 +37,3 @@
goto usage;
int main(int argc, char *argv[]) {
char sel[(sizeof args) / (sizeof *args)];
char *s = (argv[0] == NULL ? program_name : argv[0]);
This change was unnecessary.
How’s that?
You made this change to a number of programs, and in some it was warranted, but to replace an argv termary that only occurs when argc is less than some value misses the point of that ternary.
In the set of possible invocations with too few parameters - all of which are invalid - there is one invocation without argv 0. If there are enough parameters, there is an argv 0. Checking whether argv 0 is NULL inside the argc filter makes sense because it is a real possibility, but checking it outside the filter means that check has to be done even in the event of a correct invocation. which in turn means that computation is wasted. I hope a majority of the invocations of this utility are correct because I tried to make this utility easy to use. But with this change a - hopefully - majority of invocations would waste cycles on a redundant check.
argv 0 only needs to be checked if str was invoked incorrectly. and then only to pass a valid value to usage. After that initial check argv 0 can be assumed to be valid. This thoughtfulness is also present in other utilities you changed in this commit. I understand the value of consistency but these utilities' usage varies and so too does argv handling to ensure code is efficient and only errors when it has to.
There is no problem with "wasting" such an incredibly infinitesimal amount of computation here. This code is not in a hot path, and if my understanding is correct, it only runs once within the programs lifespan. It takes vastly more effort for the computer to allocate space for the program, load it into that space, and orchestrate its execution. None of this matters.
If it does not contribute value to the program, the code should be included/excluded based on whether or not its existence makes the code more readable, and the program's control flow easier to understand.
I regard the harakit as art as well as code. Each utility should be simple, but also the best at what it does ("do one thing well"). Lost cycles add up. I don't care how small the issue is, I don't even want my name on sub-par code in a repository dedicated to making good, performant programs.
I've made some suggestions regarding consistency here that not only settle previous gripes about consistency but also are as efficient with regards to comparisons as I see possible. I strongly believe having simple, but not boring and cookie-cutter one-size-fits-all solutions, is better than having ugly kludges like this copy-pasted ternary, or a
goto dumb
, or aNOARGVZERO
macro, all of which I've tried over the last couple years. Ultimately the best solution is to treat programs individually, with shared practices, rather than trying to duplicate code among them (in which case the macro would work better anyway).I would like to be clear: Code duplication that oversteps the lines of shared practice is heedless, ugly, uninteresting - in the worst ways, inefficient, and the opposite of good craftsmanship. Grepping for argv[0] or program_name should be more than enough to trace the very sparse use of either in any program (usually used with an also consistent
usage()
). Though I occasionally struggle to decipher my past code (which is where my best commenting comes from) I've never had an issue withargv[0]
orprogram_name
or even really had it occur to me because they're rarely buggy, or when they are buggy, very easy to debug.This says the code is outdated but as of commenting the relevant snippet is not.
@ -0,0 +65,4 @@
8. If a control flow statement is short enough to be easily understood in a
glance, it may be placed on a single line:
if (!argc < 0) { usage(program_name); }
!any_int < 0
will always evaluate to1
.@ -0,0 +89,4 @@
12. Follow the rules from the paper The Power of 10: Rules for Developing
Safety-Critical Code [0]:
1. Avoid complex flow constructs, such as goto and recursion.
2. All loops must have fixed bounds. This prevents runaway code.
Examples of this would be good.
Further explanation is available in the linked paper.
@ -0,0 +91,4 @@
1. Avoid complex flow constructs, such as goto and recursion.
2. All loops must have fixed bounds. This prevents runaway code.
3. Avoid heap memory allocation.
4. Restrict functions to a single printed page.
the length of, or so
@ -0,0 +92,4 @@
2. All loops must have fixed bounds. This prevents runaway code.
3. Avoid heap memory allocation.
4. Restrict functions to a single printed page.
5. Use a minimum of two runtime assertions per function.
That's intense. An example:
@ -0,0 +94,4 @@
4. Restrict functions to a single printed page.
5. Use a minimum of two runtime assertions per function.
6. Restrict the scope of data to the smallest possible.
7. Check the return value of all non-void functions, or cast to void to
Prepare to cast all fprintf(3p)s.
@ -0,0 +97,4 @@
7. Check the return value of all non-void functions, or cast to void to
indicate the return value is useless.
8. Use the preprocessor sparingly.
9. Limit pointer use to a single dereference, and do not use function
Is this possible for analyzing arrays of strings?
Please elaborate.
Iterating through characters in arguments looks something like:
Assigning multiple bindings to avoid consecutive dereferences is in the spirit of this rule and is also enforced by the Rust compiler with lifetimes.
Got it.
@ -41,3 +49,2 @@
if(argc < 3)
goto usage;
if (argc == 0 | argc < 3) { return usage(program_name); }
The compiler will filter this if down to one condition anyway. Why are there two here?
argc==0
isn't a special case with regards to intcmp(1), except that theprogram_name
fallback needs to be used.Moving the
program_name
ternary over here would be more efficient.To line 50?
Yes, not setting an intermediate variable but relying on
argv[0]
elsewhere, only falling back toprogram_name
here.So:
@ -122,3 +125,4 @@
size_t j;
size_t k; /* loop index but also unbuffer status */
int retval;
program_name = (argv[0] == NULL ? program_name : argv[0]);
Why move this? It was more efficient before this changed. Anyone curious about the self-explanatory
program_name
can just grep for it in the file.@ -153,3 +153,1 @@
break;
case 'e':
if(Files_append(&files[1], stderr, stderr_name) != NULL)
if (argc > 1) {
The previous version of this wasn't pretty, but adopting an
if (argc > 0) { program_name = argv[0]; ...
like dj(1) used to have would be consistent (these are both utilities that can run without arguments, but which take options) and efficient.I would put this in the same block as getopt(3p) to reduce cognitive load; That One Block is where arguments are interpreted.
@ -234,3 +264,4 @@
}
terminate;
}
I'm reviewing this hastily because the Rust rewrite is a superior version anyway.
@ -57,0 +53,4 @@
int main(int argc, char *argv[]){
size_t ctype; /* selected from ctypes.h; index of ctype */
int retval;
char *s = (argv[0] == NULL ? program_name : argv[0]);
This should be
program_name
or, ideally, merged back into theif (argc < 3)
check from whence it came.I couldn’t figure out how the
s
variable works. This code is extremely unreadable even in its current state.I was under the impression you added the
s
variable to this program?Having a silly time with this.
s
didn't work because theargv[0]
you replaced later in the program wasn't the program name but an iterating pointer.@ -57,0 +55,4 @@
int retval;
char *s = (argv[0] == NULL ? program_name : argv[0]);
if (argc < 3) { return usage(s); }
The
s
ternary should be here, without an intermediate variable.Reminder to myself to include this Kernighan quote from The Elements of Programming Style:
@ -45,3 +45,2 @@
Ok(_) => {
/* unwrap because Err(OptError::MissingArg) will be returned if
* opt.arg() is None */
/* delimiter */
Instead of marking that this is to set the delimiter, I'd prefer a more descriptive name for the variable
d
(likedelimiter
). I'd also prefer:Considering the delimiter's purpose is documenting in the man page, after incorporating these changes, I wouldn't even comment this match block.
If I make the
Ok()
arm match onlyOk("d")
, then I have to addOk(_)
to the existing Err arm or the Rust compiler will not compile the program.Could you match
_
instead ofErr(_)
?@ -55,38 +54,46 @@ fn main() {
};
Would it be better to just return from main rather than calling a function?
Using
exit()
throughout the program is consistent.@ -61,2 +60,4 @@
/* argv[0] of the operator command */
let operator = argv.get(command_arg).unwrap_or_else(|| {
eprintln!("{}", usage);
exit(EX_USAGE);
Emma: you told me that returning from main here would be impossible from a closure. Would it be better to match the Result here to facilitate that?
I'm not partial to either matching or using the clojure, but I think it should be consistent between programs.
It would be more verbose.
Got it.
@ -72,3 +76,4 @@
let mut fields = buf.split(&d).collect::<Vec<&str>>();
/* collect arguments for the operator command */
let opts = argv
Instead of
opts
here I would call this a variant ofarg
.opts
is a little misleading; this is used for all given arguments, not just parsed options.@ -58,3 +58,3 @@
#[derive(Clone, PartialEq, PartialOrd, Debug)]
// enum CalcType is a type containing operations used in the calculator
/* enum CalcType is a type containing operations used in the calculator */
Maybe it should be
OpType
to be self-documenting.I dislike this name because the type doesn’t only include options. The CalcType name is inherited from the library from which i forked this code.
Fair enough.
@ -121,2 +120,3 @@
// repeating and it seems this can give it to me
/* I’m no math nerd but I want the highest possible approximation of 0.9
* repeating and it seems this can give it to me */
const PRECISION_MOD: f64 = 0.9 + f64::EPSILON * 100.0;
What does this do?
Sets the float I’m rounding to to be the highest approximation of 0.9 repeating possible on the hardware
Yeah, we can do it later down the line.
@trinity: let’s look this over once more together so we can get it merged.
I can get behind merging this.
This is set to be merged.
Merged.
Pull request closed