Formatting #142

Closed
emma wants to merge 0 commits from dj-formatting into main
Owner
No description provided.
emma added the
enhancement
label 2024-07-12 15:33:46 -06:00
emma added 4 commits 2024-07-12 15:33:46 -06:00
emma added a new dependency 2024-07-12 15:34:18 -06:00
emma added 1 commit 2024-07-12 15:43:19 -06:00
emma changed title from dj(1): formatting to Formatting 2024-07-12 15:43:59 -06:00
emma added 1 commit 2024-07-12 15:44:06 -06:00
Author
Owner

This started as a dj(1)-specific branch but I wanted to work through everything.

This started as a `dj(1)`-specific branch but I wanted to work through everything.
emma added 1 commit 2024-07-12 15:55:30 -06:00
Author
Owner

oops, commit acc3cf3e90 is not for swab(1) but for npc(1).

oops, commit acc3cf3e90 is not for `swab(1)` but for `npc(1)`.
emma added 1 commit 2024-07-12 15:56:31 -06:00
emma added 1 commit 2024-07-12 16:05:02 -06:00
emma added 1 commit 2024-07-12 16:16:50 -06:00
emma added 1 commit 2024-07-12 16:19:43 -06:00
emma requested review from trinity 2024-07-12 16:19:47 -06:00
emma requested review from silt 2024-07-12 16:19:47 -06:00
Author
Owner

@trinity please remove the goto and pass label from str(1), I couldn’t figure out how to make it work otherwise.

@trinity please remove the `goto` and `pass` label from `str(1)`, I couldn’t figure out how to make it work otherwise.
emma added 1 commit 2024-07-12 16:23:50 -06:00
Author
Owner

mm(1) will be complex to do, I can finish it later.

`mm(1)` will be complex to do, I can finish it later.
trinity requested changes 2024-07-13 00:44:10 -06:00
src/dj.c Outdated
@ -25,3 +26,1 @@
#if !defined EX_OK || !defined EX_OSERR || !defined EX_USAGE
# include <sysexits.h>
#endif
#include <sysexits.h>
Owner

Say which ones you're using.

Say which ones you're using.
emma marked this conversation as resolved
src/dj.c Outdated
@ -69,3 +68,2 @@
static struct Io *
Io_read(struct Io *io){
static struct Io * Io_read(struct Io *io) {
Owner

This is so I can move to /^function/ to find its definition.

This is so I can move to `/^function/` to find its definition.
Owner

This should stay that way because it's somewhat common practice among greybeards (I know many in the 9front community do it).

This should stay that way because it's somewhat common practice among greybeards (I know many in the 9front community do it).
emma marked this conversation as resolved
src/dj.c Outdated
@ -180,3 +177,2 @@
if(argc > 0){
int c;
if (!argc < 0) { usage(program_name); }
Owner
`if (!(argc < 0))` https://en.cppreference.com/w/c/language/operator_precedence
emma marked this conversation as resolved
src/dj.c Outdated
@ -216,3 +221,1 @@
if((c == 'b' && (io[i].bs = parse(optarg)) > 0)
|| (c == 's' && (io[i].seek = parse(optarg)) >= 0))
break;
c |= 0b00100000; /* (ASCII) make lowercase */
Owner

C89 can't do 0b literals, which is why I used the more portable hex with a comment.

C89 can't do `0b` literals, which is why I used the more portable hex with a comment.
emma marked this conversation as resolved
src/dj.c Outdated
@ -272,3 +289,2 @@
t = io[0].bufuse;
if(Io_read(&io[0])->bufuse == t && !noerror && io[0].error == 0)
size_t t = io[0].bufuse;
Owner

Variables should be declared at the start of the scope; this is invalid C89.

Variables should be declared at the start of the scope; this is invalid C89.
emma marked this conversation as resolved
src/intcmp.c Outdated
@ -35,0 +35,4 @@
int usage(char *s) {
fprintf(
stderr, "Usage: %s [-egl] integer integer...\n",
s == NULL ? program_name : s
Owner

This ternary should happen somewhere in main.

This ternary should happen somewhere in `main`.
emma marked this conversation as resolved
src/intcmp.c Outdated
@ -44,2 +51,3 @@
if(argc < 3) { return usage(argv[0]); }
while((c = getopt(argc, argv, "egl")) != -1)
while ((c = getopt(argc, argv, "egl")) != -1) {
Owner

Test if argc is zero before doing this.

Test if argc is zero before doing this.
emma marked this conversation as resolved
src/mm.c Outdated
@ -26,3 +27,1 @@
#if !defined EX_IOERR || !defined EX_OK || !defined EX_OSERR \
|| !defined EX_USAGE
# include <sysexits.h>
#include <sysexits.h>
Owner

Which sysexits?

Which sysexits?
emma marked this conversation as resolved
src/npc.c Outdated
@ -37,3 +39,1 @@
case 't': showtab = 1; break;
default: goto usage;
}
if(!argc > 0) { usage(argv[0]); }
Owner

if(!(argc > 0)). Also, this program should work when argc==0.

`if(!(argc > 0))`. Also, this program should work when `argc==0`.
Owner

I can't add a second comment on the same line so I had to unresolve this irrelevant issue.

program_name should be set here to argv[0], which would be consistent with my suggested change to mm(1) and how dj(1) was before this patchset.

I can't add a second comment on the same line so I had to unresolve this irrelevant issue. `program_name` should be set here to `argv[0]`, which would be consistent with my suggested change to mm(1) and how dj(1) was before this patchset.
Owner

This should be moved to the if (argc > 0) block.

This should be moved to the `if (argc > 0)` block.
emma marked this conversation as resolved
src/strcmp.c Outdated
@ -20,2 +42,2 @@
else if(*argv[i-1] < *argv[i]++)
return -1; /* actually 255 */
} else if (*argv[i-1] < *argv[i]++) {
return 255;
Owner

This should be -1. While it is in reality 255, the negative is consistent with strcmp(3p) and may be useful on certain systems. Also, at some point retvals here should be more significant.

This should be -1. While it *is* in reality 255, the negative is consistent with strcmp(3p) and may be useful on certain systems. Also, at some point retvals here should be more significant.
emma marked this conversation as resolved
emma added 8 commits 2024-07-13 17:05:33 -06:00
emma added 1 commit 2024-07-13 17:24:27 -06:00
Author
Owner

This code needs to be commented two to three times as much as it is.

This code needs to be commented two to three times as much as it is.
emma added 2 commits 2024-07-13 18:19:27 -06:00
emma added 1 commit 2024-07-13 18:30:23 -06:00
emma added 1 commit 2024-07-14 02:11:21 -06:00
Author
Owner

I just pushed a draft style guide.

I just pushed a draft style guide.
emma added 1 commit 2024-07-14 02:16:02 -06:00
emma added 1 commit 2024-07-14 02:38:38 -06:00
emma added 1 commit 2024-07-14 02:42:53 -06:00
emma added 1 commit 2024-07-14 02:43:55 -06:00
emma added 3 commits 2024-07-14 03:26:47 -06:00
emma requested review from trinity 2024-07-14 18:43:58 -06:00
trinity requested changes 2024-07-15 02:26:36 -06:00
STYLE Outdated
@ -0,0 +1,61 @@
- Braces are mandatory for all control flow
Owner

This stands to clutter code significantly. I guess because Rust does it, it makes sense for this project, but C programmers who do this bug the hell out of me.

This stands to clutter code significantly. I guess because Rust does it, it makes sense for this project, but C programmers who do this bug the hell out of me.
Author
Owner

They increase readability substantially.

They increase readability substantially.
Owner

seconded. braces are a must.

seconded. braces are a must.
Owner

I think I've come around on this having tried writing this style of code a bit to practice.

I think I've come around on this having tried writing this style of code a bit to practice.
trinity marked this conversation as resolved
STYLE Outdated
@ -0,0 +1,61 @@
- Braces are mandatory for all control flow
- Indentation should be kept to a minimum
Owner

Nesting specifically.

Nesting specifically.
emma marked this conversation as resolved
STYLE Outdated
@ -0,0 +1,61 @@
- Braces are mandatory for all control flow
- Indentation should be kept to a minimum
- Empty lines should be placed between different kinds of statements:
Owner

This is fine. Maybe even good.

This is fine. Maybe even good.
emma marked this conversation as resolved
STYLE Outdated
@ -0,0 +21,4 @@
return io;
- Cases in switch statements and matches in match statements should be indented
one level
Owner

Provide an example.

I am fervently against this:

switch (c) {
    case 'a':
         puts("a option\n");
         if (c == 'a') {
            puts("moar nest\n");
        }
        break;
}

Indenting cases gives me a headache. All case lines in C always start with case which may as well be an indentation itself.

Provide an example. I am fervently against this: ``` switch (c) { case 'a': puts("a option\n"); if (c == 'a') { puts("moar nest\n"); } break; } ``` Indenting cases gives me a headache. All case lines in C always start with `case` which may as well be an indentation itself.
Author
Owner

Unindented cases are a pain in the ass to read.

Unindented cases are a pain in the ass to read.
Owner

from intcmp:

while((c = getopt(argc, argv, "egl")) != -1)
  switch(c){
  case 'e': mode |= EQUAL;   break;
  case 'g': mode |= GREATER; break;
  case 'l': mode |= LESS;    break;
  default:  goto usage;
  }

this is hell to read. part of it is the lack of braces for while's block, but the switch itself is a big issue, too. the contents of blocks are to be indented, period. switches aren't magically exempt from standard readability guidelines.

from intcmp: ```c while((c = getopt(argc, argv, "egl")) != -1) switch(c){ case 'e': mode |= EQUAL; break; case 'g': mode |= GREATER; break; case 'l': mode |= LESS; break; default: goto usage; } ``` this is hell to read. part of it is the lack of braces for `while`'s block, but the switch itself is a big issue, too. the contents of blocks are to be indented, *period*. switches aren't magically exempt from standard readability guidelines.
Owner

Bummer but I can accept this.

Bummer but I can accept this.
trinity marked this conversation as resolved
STYLE Outdated
@ -0,0 +37,4 @@
);
- If Rust function arguments or fields are on their own lines, they should
always have a trailing comma.
Owner

This shouldn't be required. Typos are easy to spot in Rust and I believe C disallows this.

This shouldn't be required. Typos are easy to spot in Rust and I believe C disallows this.
Author
Owner

This rule is imported directly from the default behavior of rustfmt(1).

This rule is imported directly from the default behavior of `rustfmt(1)`.
emma marked this conversation as resolved
STYLE Outdated
@ -0,0 +47,4 @@
- If a control flow statement is short enough to be easily understood in a
glance, it may be placed on a single line:
if (!argc < 0) { usage(program_name); }
Owner

This is good.

This is good.
emma marked this conversation as resolved
STYLE Outdated
@ -0,0 +52,4 @@
- If a do while loop in C is longer than ~25 lines, place the while statement
in a comment after the opening brace:
do { /* while(count == 0 || --count > 0); */
Owner

This is what indentation is for. This introduces extra complexity to program maintenance - what if the command and while statement are mismatched? It's less important to know the constraints of repetition in a do/while because the code inside will always run at least once; the while constraint is only on subsequent runs.

This is what indentation is for. This introduces extra complexity to program maintenance - what if the command and while statement are mismatched? It's less important to know the constraints of repetition in a do/while because the code inside will always run at least once; the while constraint is only on subsequent runs.
Author
Owner

I agree with this generally; however, dj(1)’s source code should be special-cased to utilize this practice as it is difficult to figure which do applies to which while.

I agree with this generally; however, `dj(1)`’s source code should be special-cased to utilize this practice as it is difficult to figure which `do` applies to which `while`.
Owner

if we're special-casing it for dj because of its wack source code, then it probably shouldn't a hard rule like this. i agree that it introduces more potential issues and complexity for program maintenance, and i'm really not sure what to do about that. indentation is not enough to make dj's control flow very clear, though.

if we're special-casing it for dj because of its wack source code, then it probably shouldn't a hard rule like this. i agree that it introduces more potential issues and complexity for program maintenance, and i'm really not sure what to do about that. indentation is *not* enough to make dj's control flow very clear, though.
Author
Owner

if we're special-casing it for dj because of its wack source code, then it probably shouldn't a hard rule like this.

Recent changes to STYLE reflect this.

> if we're special-casing it for dj because of its wack source code, then it probably shouldn't a hard rule like this. Recent changes to STYLE reflect this.
First-time contributor

I'd like to suggest prohibiting the use of do-while loops. There is almost always a better way, and that better way is usually a normal while loop.

I'd like to suggest prohibiting the use of do-while loops. There is almost always a better way, and that better way is usually a normal while loop.
Owner

I'd like to suggest prohibiting the use of do-while loops.

I can sort of get behind this because as far as I know they aren't in Rust. I think having similar C practices to Rust practices is useful for maintainability (at least, remembering practice).

> I'd like to suggest prohibiting the use of do-while loops. I can sort of get behind this because as far as I know they aren't in Rust. I think having similar C practices to Rust practices is useful for maintainability (at least, remembering practice).
Author
Owner

They are not in Rust. I can add it to the style guidelines if it is wanted.

They are not in Rust. I can add it to the style guidelines if it is wanted.
Owner

I can go about refactoring all the code that uses them tonight. I would be fine with outright disallowing do-whiles.

I can go about refactoring all the code that uses them tonight. I would be fine with outright disallowing do-whiles.
emma marked this conversation as resolved
@ -185,2 +184,2 @@
.B -b
and
Option variants that have uppercase and lowercase forms could be confused for
each other. The former affects input and the latter affects output.
Owner

The opposite is true.

The opposite is true.
emma marked this conversation as resolved
docs/dj.1 Outdated
@ -187,3 +188,2 @@
.B -B
could be confused for each other, and so could
.B -s
option could be mistaken for write size, meaning the count in bytes of data
Owner

This is write size. But dj(1) writes everything it reads.

This *is* write size. But dj(1) writes everything it reads.
Author
Owner

I meant the total size of the written data; I will say that.

I meant the total size of the written data; I will say that.
emma marked this conversation as resolved
@ -192,1 +194,3 @@
The lowercase option affects input and the capitalized option affects output.
.B -b
options. The latter sets the size of blocks to be read and the former sets the
number of blocks to be read. The
Owner

This sounds wack.

This sounds wack.
Author
Owner

What do you think of my changes?

What do you think of my changes?
Owner

I'm okay with this.

I'm okay with this.
trinity marked this conversation as resolved
docs/dj.1 Outdated
@ -193,0 +196,4 @@
number of blocks to be read. The
.B -B
option is similar to the latter but sets the size of blocks to be written,
regardless of the amount of data that will actually be written.
Owner

That's because the amount of data in total that will be written is totally irrelevant to -B.

That's because the amount of data in total that will be written is totally irrelevant to -B.
Owner

It should be highlighted that the input buffer should be very large (e.g. 4M) to take advantage of modern hardware I/O speeds, and for most uses mm(1) is better.

It should be highlighted that the input buffer should be very large (e.g. 4M) to take advantage of modern hardware I/O speeds, and for most uses mm(1) is better.
emma marked this conversation as resolved
src/dj.c Outdated
@ -139,3 +140,1 @@
return (*s == '\0' /* no chars left unparsed */ && errno == 0)
? r
: -1;
return (*s == '\0' && errno == 0) ? r : -1; /* no chars left unparsed */
Owner

Moving this comment makes it wrong. This ternary makes sure strtol(3p) succeeded. *s == 0, which is on its own vague, checks if no chars are left unparsed.

Moving this comment makes it wrong. This ternary makes sure strtol(3p) succeeded. `*s == 0`, which is on its own vague, checks if no chars are left unparsed.
emma marked this conversation as resolved
src/dj.c Outdated
@ -180,3 +178,2 @@
if(argc > 0){
int c;
if (!(argc < 0)) { usage(program_name); }
Owner

getopt(3p) will still happen when argc==0. And dj(1) should run when argc<0. If the user's system is broken, don't deny them the basic tools to fix it.

getopt(3p) will still happen when `argc==0`. And dj(1) should run when `argc<0`. If the user's system is broken, don't deny them the basic tools to fix it.
emma marked this conversation as resolved
src/dj.c Outdated
@ -186,3 +181,1 @@
switch(c){
case 'i': case 'o': i = (c == 'o');
if(optarg[0] == '-' && optarg[1] == '\0'){ /* optarg == "-" */
int c;
Owner

This should be scope-limited to just the getopt(3p) part, which the if statement did.

This should be scope-limited to just the getopt(3p) part, which the if statement did.
emma marked this conversation as resolved
src/dj.c Outdated
@ -189,0 +183,4 @@
program_name = argv[0];
while ((c = getopt(argc, argv, "a:b:B:c:i:hHns:S:o:")) != -1) {
switch (c) {
case 'i': case 'o': /* input, output */
Owner

Indenting this sucks and looks ugly even in practice. See my criticisms in the style guide.

Indenting this sucks and looks ugly even in practice. See my criticisms in the style guide.
emma marked this conversation as resolved
src/dj.c Outdated
@ -216,3 +222,1 @@
if((c == 'b' && (io[i].bs = parse(optarg)) > 0)
|| (c == 's' && (io[i].seek = parse(optarg)) >= 0))
break;
c |= 0x20 /* ASCII make lowercase 0b 0010 0000 */
Owner

This looks like I threw my code line through an AI blender. The bits should be next to the hex, separate from the unrelated comment. This statement is missing a semicolon.

This looks like I threw my code line through an AI blender. The bits should be next to the hex, separate from the unrelated comment. This statement is missing a semicolon.
emma marked this conversation as resolved
src/dj.c Outdated
@ -252,2 +268,2 @@
return oserr(io[1].fn, io[1].error);
}while((io[1].seek -= (t - io[1].bufuse)) > 0 && io[1].bufuse != t);
}
if (io[1].error != 0) { return oserr(io[1].fn, io[1].error); }
Owner

This is substantially less readable than before.

This is substantially less readable than before.
Author
Owner

Yea, this should be on another line, I think this one was done in a sleepless fugue.

Yea, this should be on another line, I think this one was done in a sleepless fugue.
emma marked this conversation as resolved
src/dj.c Outdated
@ -260,3 +278,3 @@
}
do{
do { /* while(count == 0 || --count > 0); */
Owner

Again, this commenting is bad. C isn't its preprocessor. If this kind of commenting is needed I would straight-up rather remove C from this project than continue with it. The alternative to do/whiles is infinite loops with conditions at the end which are what do/whiles exist to solve. Commenting the function of basic C language features clutters code significantly.

Again, this commenting is bad. C isn't its preprocessor. If this kind of commenting is needed I would straight-up rather remove C from this project than continue with it. The alternative to do/whiles is infinite loops with conditions at the end which are what do/whiles exist to solve. Commenting the function of basic C language features clutters code significantly.
emma marked this conversation as resolved
src/npc.c Outdated
@ -59,0 +48,4 @@
while ((c = getc(stdin)) != EOF) {
if ((c & 0x80) != 0) { fputs("M-", stdout); }
switch (c ^ 0x80) { /* 0b 1000 0000 */
Owner

These bits should be next to the hex.

These bits should be next to the hex.
emma marked this conversation as resolved
src/scrut.c Outdated
@ -23,3 +23,1 @@
#ifndef EX_USAGE
# include <sysexits.h>
#endif
#include <sysexits.h>
Owner

What macros are used?

What macros are used?
emma marked this conversation as resolved
Owner

Either way, formatting shouldn't be done manually, and I can't approve this until there's a way to do it automatically.

Either way, formatting shouldn't be done manually, and I can't approve this until there's a way to do it automatically.
emma added 2 commits 2024-07-15 13:05:22 -06:00
emma added 5 commits 2024-07-15 13:17:58 -06:00
emma added 1 commit 2024-07-15 13:30:13 -06:00
emma added 1 commit 2024-07-15 13:32:35 -06:00
emma added 2 commits 2024-07-15 13:39:24 -06:00
emma added 1 commit 2024-07-19 16:42:00 -06:00
emma added 1 commit 2024-07-19 17:06:24 -06:00
Author
Owner

@trinity, @silt: I am waiting on y’all for this to be ready to merge.

@trinity, @silt: I am waiting on y’all for this to be ready to merge.
emma added 1 commit 2024-07-19 17:32:25 -06:00
trinity added 1 commit 2024-07-19 18:27:20 -06:00
trinity added 1 commit 2024-07-19 18:40:35 -06:00
trinity added 1 commit 2024-07-19 19:03:39 -06:00
trinity added 3 commits 2024-07-19 19:34:43 -06:00
Owner

I think my work here is done.

I think my work here is done.
Author
Owner

This is looking really good.

This is looking really good.
silt reviewed 2024-07-20 06:29:46 -06:00
@ -0,0 +36,4 @@
}
4. In C, spaces should be placed in control flow statements after the keyword
and before the opening brace:
Owner

parenthesis*

parenthesis*
Author
Owner

Not “after the keyword, before the opening parenthesis”, but “after the keyword and before the opening brace”.

Not “after the keyword, before the opening parenthesis”, but “after the keyword *and* before the opening brace”.
Owner

got it, thanks. misread.

got it, thanks. misread.
silt marked this conversation as resolved
emma added 1 commit 2024-07-20 07:19:55 -06:00
Author
Owner

@trinity: Are you cool with this being merged without .clang-format?

@trinity: Are you cool with this being merged without [`.clang-format`](https://git.tebibyte.media/bonsai/harakit/issues/140#issuecomment-5387)?
trinity reviewed 2024-07-20 09:53:39 -06:00
trinity left a comment
Owner

Didn't mean to make this a review, just comments.

Didn't mean to make this a review, just comments.
src/dj.c Outdated
@ -32,3 +31,3 @@
extern int errno;
char *program_name = "dj";
static char *program_name = "dj";
Owner

This wasn't static because the program name should show up if something horrendous happens during linking, to make debugging easier.

This wasn't static because the program name should show up if something horrendous happens during linking, to make debugging easier.
Author
Owner

Should they all not be static, then?

Should they all not be static, then?
Owner

Yes, I've slowly been changing this in all the C code but need to document the rationale.

Yes, I've slowly been changing this in all the C code but need to document the rationale.
emma marked this conversation as resolved
src/dj.c Outdated
@ -148,3 +152,2 @@
"\t[-i file] [-b block_size] [-s offset]\n"
"\t[-o file] [-B block_size] [-S offset]\n",
program_name);
"\t[-o file] [-B block_size] [-S offset]\n", s
Owner

This is good.

This is good.
emma marked this conversation as resolved
src/dj.c Outdated
@ -159,3 +163,4 @@
size_t i; /* side of io being modified */
char noerror; /* 0=exits (default) 1=retries on partial reads or writes */
struct Io io[2 /* { in, out } */];
char *s = (argv[0] == NULL ? program_name : argv[0]);
Owner

Use program name for this.

Use program name for this.
Author
Owner

I don’t follow.

I don’t follow.
Owner

Instead of making a new variable overwrite the existing program name variable with argv 0. My phone keyboard lacks the underscore.

Instead of making a new variable overwrite the existing program name variable with argv 0. My phone keyboard lacks the underscore.
Owner

I didn't see it but this case is already covered on L189; if argc > 0 { program_name = argv[0]; ...

I didn't see it but this case is already covered on L189; `if argc > 0 { program_name = argv[0]; ...`
emma marked this conversation as resolved
src/scrut.c Outdated
@ -42,1 +37,3 @@
goto usage;
int main(int argc, char *argv[]) {
char sel[(sizeof args) / (sizeof *args)];
char *s = (argv[0] == NULL ? program_name : argv[0]);
Owner

This change was unnecessary.

This change was unnecessary.
Author
Owner

How’s that?

How’s that?
Owner

You made this change to a number of programs, and in some it was warranted, but to replace an argv termary that only occurs when argc is less than some value misses the point of that ternary.

In the set of possible invocations with too few parameters - all of which are invalid - there is one invocation without argv 0. If there are enough parameters, there is an argv 0. Checking whether argv 0 is NULL inside the argc filter makes sense because it is a real possibility, but checking it outside the filter means that check has to be done even in the event of a correct invocation. which in turn means that computation is wasted. I hope a majority of the invocations of this utility are correct because I tried to make this utility easy to use. But with this change a - hopefully - majority of invocations would waste cycles on a redundant check.

argv 0 only needs to be checked if str was invoked incorrectly. and then only to pass a valid value to usage. After that initial check argv 0 can be assumed to be valid. This thoughtfulness is also present in other utilities you changed in this commit. I understand the value of consistency but these utilities' usage varies and so too does argv handling to ensure code is efficient and only errors when it has to.

You made this change to a number of programs, and in some it was warranted, but to replace an argv termary that only occurs when argc is less than some value misses the point of that ternary. In the set of possible invocations with too few parameters - all of which are invalid - there is one invocation without argv 0. If there are enough parameters, there is an argv 0. Checking whether argv 0 is NULL inside the argc filter makes sense because it is a real possibility, but checking it outside the filter means that check has to be done even in the event of a correct invocation. which in turn means that computation is wasted. I hope a majority of the invocations of this utility are correct because I tried to make this utility easy to use. But with this change a - hopefully - majority of invocations would waste cycles on a redundant check. argv 0 only needs to be checked if str was invoked incorrectly. and then only to pass a valid value to usage. After that initial check argv 0 can be assumed to be valid. This thoughtfulness is also present in other utilities you changed in this commit. I understand the value of consistency but these utilities' usage varies and so too does argv handling to ensure code is efficient and only errors when it has to.
First-time contributor

computation is wasted

There is no problem with "wasting" such an incredibly infinitesimal amount of computation here. This code is not in a hot path, and if my understanding is correct, it only runs once within the programs lifespan. It takes vastly more effort for the computer to allocate space for the program, load it into that space, and orchestrate its execution. None of this matters.

If it does not contribute value to the program, the code should be included/excluded based on whether or not its existence makes the code more readable, and the program's control flow easier to understand.

> computation is wasted There is no problem with "wasting" such an incredibly infinitesimal amount of computation here. This code is not in a hot path, and if my understanding is correct, it only runs *once within the programs lifespan*. It takes vastly more effort for the computer to allocate space for the program, load it into that space, and orchestrate its execution. None of this matters. If it does not contribute value to the program, the code should be included/excluded based on whether or not its existence makes the code more readable, and the program's control flow easier to understand.
Owner

I regard the harakit as art as well as code. Each utility should be simple, but also the best at what it does ("do one thing well"). Lost cycles add up. I don't care how small the issue is, I don't even want my name on sub-par code in a repository dedicated to making good, performant programs.

I've made some suggestions regarding consistency here that not only settle previous gripes about consistency but also are as efficient with regards to comparisons as I see possible. I strongly believe having simple, but not boring and cookie-cutter one-size-fits-all solutions, is better than having ugly kludges like this copy-pasted ternary, or a goto dumb, or a NOARGVZERO macro, all of which I've tried over the last couple years. Ultimately the best solution is to treat programs individually, with shared practices, rather than trying to duplicate code among them (in which case the macro would work better anyway).

I would like to be clear: Code duplication that oversteps the lines of shared practice is heedless, ugly, uninteresting - in the worst ways, inefficient, and the opposite of good craftsmanship. Grepping for argv[0] or program_name should be more than enough to trace the very sparse use of either in any program (usually used with an also consistent usage()). Though I occasionally struggle to decipher my past code (which is where my best commenting comes from) I've never had an issue with argv[0] or program_name or even really had it occur to me because they're rarely buggy, or when they are buggy, very easy to debug.

I regard the harakit as art as well as code. Each utility should be simple, but also the best at what it does ("do one thing well"). Lost cycles add up. I don't care how small the issue is, I don't even want my name on sub-par code in a repository dedicated to making good, performant programs. I've made some suggestions regarding consistency here that not only settle previous gripes about consistency but also are as efficient with regards to comparisons as I see possible. I strongly believe having simple, but not boring and cookie-cutter one-size-fits-all solutions, is better than having ugly kludges like this copy-pasted ternary, or a [`goto dumb`](https://git.tebibyte.media/trinity/src/src/commit/4393dcac21884f53bdd84395ff9829e8fe6d99ed/cat/cat.c#L22), or a [`NOARGVZERO` macro](https://git.tebibyte.media/trinity/src/src/commit/293436c5ad1903f197c421dc93ac2507e24382f0/include/usefulmacros.h#L9), all of which I've tried over the last couple years. Ultimately the best solution is to treat programs individually, with shared practices, rather than trying to duplicate code among them (in which case the macro would work better anyway). I would like to be clear: Code duplication that oversteps the lines of shared practice is heedless, ugly, uninteresting - in the worst ways, inefficient, and the opposite of good craftsmanship. Grepping for argv[0] or program_name should be more than enough to trace the very sparse use of either in any program (usually used with an also consistent `usage()`). Though I occasionally struggle to decipher my past code (which is where my best commenting comes from) I've never had an issue with `argv[0]` or `program_name` or even really had it occur to me because they're rarely buggy, or when they are buggy, very easy to debug.
Owner

This says the code is outdated but as of commenting the relevant snippet is not.

This says the code is outdated but as of commenting the relevant snippet is not.
emma marked this conversation as resolved
emma added 1 commit 2024-07-24 19:37:54 -06:00
trinity requested changes 2024-07-25 20:21:41 -06:00
STYLE Outdated
@ -0,0 +65,4 @@
8. If a control flow statement is short enough to be easily understood in a
glance, it may be placed on a single line:
if (!argc < 0) { usage(program_name); }
Owner

!any_int < 0 will always evaluate to 1.

`!any_int < 0` will always evaluate to `1`.
emma marked this conversation as resolved
@ -0,0 +89,4 @@
12. Follow the rules from the paper The Power of 10: Rules for Developing
Safety-Critical Code [0]:
1. Avoid complex flow constructs, such as goto and recursion.
2. All loops must have fixed bounds. This prevents runaway code.
Owner

Examples of this would be good.

Examples of this would be good.
Author
Owner

Further explanation is available in the linked paper.

Further explanation is available in the linked paper.
trinity marked this conversation as resolved
STYLE Outdated
@ -0,0 +91,4 @@
1. Avoid complex flow constructs, such as goto and recursion.
2. All loops must have fixed bounds. This prevents runaway code.
3. Avoid heap memory allocation.
4. Restrict functions to a single printed page.
Owner

the length of, or so

the length of, or so
emma marked this conversation as resolved
STYLE Outdated
@ -0,0 +92,4 @@
2. All loops must have fixed bounds. This prevents runaway code.
3. Avoid heap memory allocation.
4. Restrict functions to a single printed page.
5. Use a minimum of two runtime assertions per function.
Owner

That's intense. An example:

/* silly impractical function */
void *
realloc_and_zero(void *p, size_t n) {
    /* if n==0 realloc(3p) will free(3p); return NULL without zeroing */
    if ((p = realloc(n)) == NULL || n == 0) { return NULL; }
    /* what's there to assert? the edge cases have been handled */
    while (n --> 0) { ((char *)p)[n] = '\0'; }
    /* 1 could be bounds-checking n, which could be expensive. but 2 assertions? */
    return p;
}
That's intense. An example: ```c /* silly impractical function */ void * realloc_and_zero(void *p, size_t n) { /* if n==0 realloc(3p) will free(3p); return NULL without zeroing */ if ((p = realloc(n)) == NULL || n == 0) { return NULL; } /* what's there to assert? the edge cases have been handled */ while (n --> 0) { ((char *)p)[n] = '\0'; } /* 1 could be bounds-checking n, which could be expensive. but 2 assertions? */ return p; } ```
emma marked this conversation as resolved
@ -0,0 +94,4 @@
4. Restrict functions to a single printed page.
5. Use a minimum of two runtime assertions per function.
6. Restrict the scope of data to the smallest possible.
7. Check the return value of all non-void functions, or cast to void to
Owner

Prepare to cast all fprintf(3p)s.

Prepare to cast all fprintf(3p)s.
emma marked this conversation as resolved
@ -0,0 +97,4 @@
7. Check the return value of all non-void functions, or cast to void to
indicate the return value is useless.
8. Use the preprocessor sparingly.
9. Limit pointer use to a single dereference, and do not use function
Owner

Is this possible for analyzing arrays of strings?

Is this possible for analyzing arrays of strings?
Author
Owner

Please elaborate.

Please elaborate.
Owner

Iterating through characters in arguments looks something like:

for(size_t i = optind; i < argc; ++i) {
    for(size_t j = 0; argv[i][j] != '\0'; ++j) { // two dereferences (p[i][j])
        if (argv[i][j] == 'b') { (void)fprintf(stderr, "there's a b\n"); }
    }
}
Iterating through characters in arguments looks something like: ```c for(size_t i = optind; i < argc; ++i) { for(size_t j = 0; argv[i][j] != '\0'; ++j) { // two dereferences (p[i][j]) if (argv[i][j] == 'b') { (void)fprintf(stderr, "there's a b\n"); } } } ```
Author
Owner

Assigning multiple bindings to avoid consecutive dereferences is in the spirit of this rule and is also enforced by the Rust compiler with lifetimes.

Assigning multiple bindings to avoid consecutive dereferences is in the spirit of this rule and is also enforced by the Rust compiler with lifetimes.
Owner

Got it.

Got it.
trinity marked this conversation as resolved
src/intcmp.c Outdated
@ -41,3 +49,2 @@
if(argc < 3)
goto usage;
if (argc == 0 | argc < 3) { return usage(program_name); }
Owner

The compiler will filter this if down to one condition anyway. Why are there two here? argc==0 isn't a special case with regards to intcmp(1), except that the program_name fallback needs to be used.

The compiler will filter this if down to one condition anyway. Why are there two here? `argc==0` isn't a special case with regards to intcmp(1), except that the `program_name` fallback needs to be used.
Owner

Moving the program_name ternary over here would be more efficient.

Moving the `program_name` ternary over here would be more efficient.
Author
Owner

To line 50?

To line 50?
Owner

Yes, not setting an intermediate variable but relying on argv[0] elsewhere, only falling back to program_name here.

So:

if (argc < 3) { return usage(argv[0] == NULL ? program_name : argv[0]); }
Yes, not setting an intermediate variable but relying on `argv[0]` elsewhere, only falling back to `program_name` here. So: ```c if (argc < 3) { return usage(argv[0] == NULL ? program_name : argv[0]); } ```
emma marked this conversation as resolved
src/mm.c Outdated
@ -122,3 +125,4 @@
size_t j;
size_t k; /* loop index but also unbuffer status */
int retval;
program_name = (argv[0] == NULL ? program_name : argv[0]);
Owner

Why move this? It was more efficient before this changed. Anyone curious about the self-explanatory program_name can just grep for it in the file.

Why move this? It was more efficient before this changed. Anyone curious about the self-explanatory `program_name` can just grep for it in the file.
trinity marked this conversation as resolved
src/mm.c Outdated
@ -153,3 +153,1 @@
break;
case 'e':
if(Files_append(&files[1], stderr, stderr_name) != NULL)
if (argc > 1) {
Owner

The previous version of this wasn't pretty, but adopting an if (argc > 0) { program_name = argv[0]; ... like dj(1) used to have would be consistent (these are both utilities that can run without arguments, but which take options) and efficient.

The previous version of this wasn't pretty, but adopting an `if (argc > 0) { program_name = argv[0]; ...` like dj(1) used to have would be consistent (these are both utilities that can run without arguments, but which take options) and efficient.
Owner

I would put this in the same block as getopt(3p) to reduce cognitive load; That One Block is where arguments are interpreted.

I would put this in the same block as getopt(3p) to reduce cognitive load; That One Block is where arguments are interpreted.
trinity marked this conversation as resolved
src/mm.c Outdated
@ -234,3 +264,4 @@
}
terminate;
}
Owner

I'm reviewing this hastily because the Rust rewrite is a superior version anyway.

I'm reviewing this hastily because the Rust rewrite is a superior version anyway.
emma marked this conversation as resolved
src/str.c Outdated
@ -57,0 +53,4 @@
int main(int argc, char *argv[]){
size_t ctype; /* selected from ctypes.h; index of ctype */
int retval;
char *s = (argv[0] == NULL ? program_name : argv[0]);
Owner

This should be program_name or, ideally, merged back into the if (argc < 3) check from whence it came.

This should be `program_name` or, ideally, merged back into the `if (argc < 3)` check from whence it came.
Author
Owner

I couldn’t figure out how the s variable works. This code is extremely unreadable even in its current state.

I couldn’t figure out how the `s` variable works. This code is extremely unreadable even in its current state.
Owner

I was under the impression you added the s variable to this program?

I was under the impression [you added the `s` variable to this program](https://git.tebibyte.media/bonsai/harakit/commit/0282b60e650305182ab13448e58284da5067fb6c)?
Author
Owner

Having a silly time with this.

Having a silly time with this.
Owner

s didn't work because the argv[0] you replaced later in the program wasn't the program name but an iterating pointer.

`s` didn't work because the `argv[0]` you replaced later in the program wasn't the program name but an iterating pointer.
trinity marked this conversation as resolved
emma added 2 commits 2024-07-25 21:34:18 -06:00
trinity reviewed 2024-07-26 07:15:27 -06:00
src/str.c Outdated
@ -57,0 +55,4 @@
int retval;
char *s = (argv[0] == NULL ? program_name : argv[0]);
if (argc < 3) { return usage(s); }
Owner

The s ternary should be here, without an intermediate variable.

The `s` ternary should be here, without an intermediate variable.
trinity marked this conversation as resolved
Author
Owner

Reminder to myself to include this Kernighan quote from The Elements of Programming Style:

Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?

Reminder to myself to include this Kernighan quote from *The Elements of Programming Style*: > Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?
trinity reviewed 2024-07-27 17:13:27 -06:00
@ -45,3 +45,2 @@
Ok(_) => {
/* unwrap because Err(OptError::MissingArg) will be returned if
* opt.arg() is None */
/* delimiter */
Owner

Instead of marking that this is to set the delimiter, I'd prefer a more descriptive name for the variable d (like delimiter). I'd also prefer:

/* this code is obviously addressing -d */
Ok("d") => { ...}

/* this is equivalent to `default:` in C which would be more idiomatic in this context */
_ => { eprintln!... }

Considering the delimiter's purpose is documenting in the man page, after incorporating these changes, I wouldn't even comment this match block.

Instead of marking that this is to set the delimiter, I'd prefer a more descriptive name for the variable `d` (like `delimiter`). I'd also prefer: ```rs /* this code is obviously addressing -d */ Ok("d") => { ...} /* this is equivalent to `default:` in C which would be more idiomatic in this context */ _ => { eprintln!... } ``` Considering the delimiter's purpose is documenting in the man page, after incorporating these changes, I wouldn't even comment this match block.
Author
Owner

If I make the Ok() arm match only Ok("d"), then I have to add Ok(_) to the existing Err arm or the Rust compiler will not compile the program.

If I make the `Ok()` arm match only `Ok("d")`, then I have to add `Ok(_)` to the existing Err arm or the Rust compiler will not compile the program.
Owner

Could you match _ instead of Err(_)?

Could you match `_` instead of `Err(_)`?
emma marked this conversation as resolved
trinity reviewed 2024-07-27 17:16:14 -06:00
@ -55,38 +54,46 @@ fn main() {
};
Owner

Would it be better to just return from main rather than calling a function?

Would it be better to just return from main rather than calling a function?
Author
Owner

Using exit() throughout the program is consistent.

Using `exit()` throughout the program is consistent.
emma marked this conversation as resolved
trinity reviewed 2024-07-27 17:22:55 -06:00
src/fop.rs Outdated
@ -61,2 +60,4 @@
/* argv[0] of the operator command */
let operator = argv.get(command_arg).unwrap_or_else(|| {
eprintln!("{}", usage);
exit(EX_USAGE);
Owner

Emma: you told me that returning from main here would be impossible from a closure. Would it be better to match the Result here to facilitate that?

Emma: you told me that returning from main here would be impossible from a closure. Would it be better to match the Result here to facilitate that?
Owner

I'm not partial to either matching or using the clojure, but I think it should be consistent between programs.

I'm not partial to either matching or using the clojure, but I think it should be consistent between programs.
Author
Owner

It would be more verbose.

It would be more verbose.
Owner

Got it.

Got it.
trinity marked this conversation as resolved
trinity reviewed 2024-07-27 17:25:26 -06:00
src/fop.rs Outdated
@ -72,3 +76,4 @@
let mut fields = buf.split(&d).collect::<Vec<&str>>();
/* collect arguments for the operator command */
let opts = argv
Owner

Instead of opts here I would call this a variant of arg. opts is a little misleading; this is used for all given arguments, not just parsed options.

Instead of `opts` here I would call this a variant of `arg`. `opts` is a little misleading; this is used for all given arguments, not just parsed options.
trinity marked this conversation as resolved
trinity reviewed 2024-07-27 17:27:03 -06:00
@ -58,3 +58,3 @@
#[derive(Clone, PartialEq, PartialOrd, Debug)]
// enum CalcType is a type containing operations used in the calculator
/* enum CalcType is a type containing operations used in the calculator */
Owner

Maybe it should be OpType to be self-documenting.

Maybe it should be `OpType` to be self-documenting.
Author
Owner

I dislike this name because the type doesn’t only include options. The CalcType name is inherited from the library from which i forked this code.

I dislike this name because the type doesn’t only include options. The CalcType name is inherited from the library from which i forked this code.
Owner

Fair enough.

Fair enough.
trinity marked this conversation as resolved
trinity reviewed 2024-07-27 17:27:19 -06:00
@ -121,2 +120,3 @@
// repeating and it seems this can give it to me
/* Im no math nerd but I want the highest possible approximation of 0.9
* repeating and it seems this can give it to me */
const PRECISION_MOD: f64 = 0.9 + f64::EPSILON * 100.0;
Owner

What does this do?

What does this do?
Author
Owner

Sets the float I’m rounding to to be the highest approximation of 0.9 repeating possible on the hardware

Sets the float I’m rounding to to be the highest approximation of 0.9 repeating possible on the hardware
trinity marked this conversation as resolved
trinity added 1 commit 2024-07-27 18:00:00 -06:00
trinity added 1 commit 2024-07-27 18:18:42 -06:00
trinity added 3 commits 2024-07-27 18:32:11 -06:00
trinity added 1 commit 2024-07-27 18:35:09 -06:00
Owner

@trinity: Are you cool with this being merged without .clang-format?

Yeah, we can do it later down the line.

> @trinity: Are you cool with this being merged without [`.clang-format`](https://git.tebibyte.media/bonsai/harakit/issues/140#issuecomment-5387)? Yeah, we can do it later down the line.
emma added 1 commit 2024-07-28 00:00:17 -06:00
emma added 1 commit 2024-07-28 00:13:11 -06:00
emma added 2 commits 2024-07-28 00:36:00 -06:00
emma added 2 commits 2024-07-28 01:17:40 -06:00
emma added 1 commit 2024-07-28 17:50:36 -06:00
emma added 1 commit 2024-07-28 18:28:04 -06:00
emma added 1 commit 2024-07-28 18:31:07 -06:00
emma added 1 commit 2024-07-28 18:33:54 -06:00
trinity added 2 commits 2024-07-28 21:26:25 -06:00
emma added 1 commit 2024-07-29 02:36:24 -06:00
Author
Owner

@trinity: let’s look this over once more together so we can get it merged.

@trinity: let’s look this over once more together so we can get it merged.
trinity added 1 commit 2024-07-29 10:54:23 -06:00
trinity added 1 commit 2024-07-29 11:04:06 -06:00
trinity added 2 commits 2024-07-29 11:10:23 -06:00
trinity added 2 commits 2024-07-29 11:14:04 -06:00
trinity added 1 commit 2024-07-29 11:14:54 -06:00
trinity added 1 commit 2024-07-29 11:30:53 -06:00
Owner

I can get behind merging this.

I can get behind merging this.
trinity added 1 commit 2024-07-29 13:44:05 -06:00
trinity requested review from trinity 2024-07-29 13:53:45 -06:00
emma added 1 commit 2024-07-29 14:40:10 -06:00
trinity added 2 commits 2024-07-29 21:15:01 -06:00
Owner

This is set to be merged.

This is set to be merged.
Author
Owner

Merged.

Merged.
emma closed this pull request 2024-07-29 22:47:05 -06:00
emma deleted branch dj-formatting 2024-07-29 22:47:14 -06:00

Pull request closed

Sign in to join this conversation.
No reviewers
No Milestone
No project
No Assignees
4 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Reference: bonsai/harakit#142
No description provided.