WIP: docs: fixed formatting of many manpages #86
Reference: bonsai/coreutils#86
No description provided.
Upon review I realize my man pages had a lot of errors and wrinkles. Your changes are in sum an improvement but I find some of them to be questionable.
@ -61,1 +52,4 @@
Takes no arguments and pads with nuls.
.RE
.B -B
It would make more sense to order the options `-ibscaAoBSHqd` in explanation: input, input options, alignment, output, output options, diagnostic options. Capitals before lowercases is especially confusing in the context of dj(1).
I am still under the persuasion that we should order them alphabetically.
I am still not.
@silt what are your thoughts
i think that in general, it makes sense to sort/group options logically, rather than alphabetically. when i go to a manpage to look at an option, options that are in some way related are far more useful to have close by than options that just happen to be alphabetical neighbors. the only benefit i see to alphabetical ordering is being able to quickly find a specific option in a long list of them, but that's not a good reason. quickly finding some text is what `grep` and its siblings are for.
@trinity what are your thoughts on the latest change I made?
Love it.
@ -69,0 +61,4 @@
.B -H
.RS
Prints diagnostics messages in an alternate manner as described in the
"in an alternate, human-readable format" would be better; the 'H' stands for Human.
@ -71,2 +69,4 @@
.RS
Skips a number of bytes through the output before starting to write from
the input. If the input is a stream the bytes are read and discarded. If the
output is a stream, nul characters are printed.
"If the output is a stream, nul bytes are printed." Input is irrelevant here (this may be my own error).
@ -80,1 +76,3 @@
The
.RS
Takes one argument of one byte in length and pads the input buffer with
that byte in the event that a read doesn’t fill the input buffer, and the
"and the"?..
`-a` pads the input buffer with the given byte in the event of an incomplete read from the input file. `-A` instead pads with the nul byte.
@ -85,0 +92,4 @@
.B -d
.RS
Prints all debug information, user-specified or otherwise, before program
Specifically prints information related to invocation.
Please elaborate.
See src/dj.c:370 which is the only thing that happens when the `debug` level is greater than `2` (the default).
Here's the stderr of `dj -d`:
`align` is shown to be `ff` here because the two's complement representation of its sentry value (`-1`) is `0b 1111 1111 1111 1111` and only the lower 8 bits are used for alignment, or allowed when taking a new value as an argument (thus the sentry value can never be chosen by the user), or shown in the debug output.
@ -85,0 +98,4 @@
.B -i
.RS
Takes a path as an argument to open and use in place of standard input.
`-` can be used to mean standard input or standard output. This may be noted elsewhere but is relevant here as well.
@ -85,0 +103,4 @@
.B -n
.RS
Causes dj to exit on two consecutive empty reads instead of one.
Causes dj to give failed reads or writes a second try.
@ -85,0 +111,4 @@
Does the same as
.B -i
but in place of standard output. Dj does not truncate output
files and instead writes over the bytes in the existing file.
I think this would be more appropriate in a BUGS or CAVEATS section, perhaps with a "See BUGS" in the "-o" option.
@ -85,0 +117,4 @@
.B -s
.RS
Takes a numeric argument as the number of bytes to skip into the input
before starting to read.
If standard input is used, the bytes are read and discarded.
@ -85,0 +125,4 @@
Suppresses error messages which print when a read or write is partial or
empty. When
.B -q
is specified twice suppresses diagnostic output entirely.
It should be mentioned that `-q` and `-d` respectively decrement and increment the debug level of the program.
@ -151,1 +180,3 @@
use.
The dd(1p) utility specified in POSIX was the basis of this program.
It includes additional features: typical option formatting, allowing seeks to be
What is "it"?
@ -152,0 +183,4 @@
specified in bytes rather than in blocks, allowing arbitrary bytes as padding,
and printing in a format that’s easy to parse for machines. It also neglects
character conversion. This may have been the original intent of dd(1p) but it is
irrelevant to its modern use as a disk utility.
"its modern use". Its modern use is more as a file utility in some contexts (
`doas dd of=/root/accessible/only`, `dd bs=bytes count=1`) and a disk utility (`doas dd of=/dev/disk`, `dd if=/dev/hd bs=512 count=1 of=disktable`) in other contexts - distinguished by user stress. dd(1p) is no more a disk utility than any other UNIX utility and probably not even a great tool for the job (a 512B buffer sucks for disk image writing - it's way too small!).
It varies greatly per user per context so leaving it ambiguous would be best.
@ -15,2 +15,2 @@
False does nothing regardless of operands or standard input.
False will always return an exit code of 1.
Do nothing regardless of operands or standard input.
An exit code of 1 will always be returned.
Better: "Do nothing, unsuccessfully."
I’m worried that slogan would force us under the GNU Free Documentation License as it is from the GNU man page for their implementation of `false(1)`.
@trinity did you have any thoughts on this?
I think you're right here.
@ -20,3 +20,3 @@
.SH DESCRIPTION
Intcmp compares integers.
Compare integers.
This infinitive present tense for descriptions feels off and I think this is a good example of why. "Compare integers" - who, what, when, where, why, how? It's easy to reference but more difficult to puzzle out for the casual reader.
@ -33,0 +29,4 @@
.B -g
or
.B -l
, only adjacent integers in the argument sequence can be equal.
Every comparison only compares with the integers next to it.
Please elaborate on what is wrong here.
See src/intcmp.c:62: `c` is the current integer, `r` is the reference integer to which `c` is compared. Only adjacent integers are ever compared. Equality is always contingent on adjacency; perhaps argv [1] and [3] can be equal in `1 == 1 == 1` whereas `1 >= 2 >= 1` is an invalid equation, but that's just the function of the comparisons there.
I’m not really sure what I need to change here, then.
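The adjacency rule described in this thread can be sketched in C. This is a minimal illustration, not the code at src/intcmp.c:62: the function name `chain_holds` and its boolean parameters (standing in for `-e`, `-g`, and `-l`) are hypothetical, and it assumes `-g` means each integer is greater than the one after it, which is consistent with `1 >= 2 >= 1` being rejected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical sketch, not taken from src/intcmp.c.
 * Returns true if every adjacent pair in v[0..n-1] satisfies at least one
 * permitted relation. eq, gt, and lt stand in for -e, -g, and -l; the
 * direction of gt and lt (reference versus current) is an assumption.
 */
bool chain_holds(const int *v, size_t n, bool eq, bool gt, bool lt)
{
	assert(v != NULL || n == 0);
	for (size_t i = 1; i < n; ++i) {
		int r = v[i - 1]; /* reference integer */
		int c = v[i];     /* current integer, compared only against r */

		if ((r == c && !eq) || (r > c && !gt) || (r < c && !lt))
			return false;
	}
	return true;
}
```

Under these assumptions `1 1 1` passes with only equality permitted, while `1 2 1` fails at its first pair when equality and greater-than are permitted. Non-adjacent integers are never compared directly; they can only end up equal as a consequence of the pairwise checks.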
"Permits adjacent integers to be equal to each other" is sufficient to describe the full functionality.
@ -48,3 +66,3 @@
There are multiple ways to express compound comparisons; “less than or equal
to” can be -le or -el, for example.
.PP
Is this replacement portable?
I could not find reference to `.PP` in `roff(7)`.
Huh. Now that you mention it, I can't either.
@ -39,2 +28,2 @@
standard output. Standard output itself can be specified by giving the
path '-'. Standard error itself can be specified with the
.RS
Opens subsequent outputs for appending rather than updating.
`s/subsequent//`. I realize options are only supported prior to positional arguments.
@ -43,2 +33,2 @@
.PP
The
.RS
Set the output to the standard error.
Use standard error as an output.
@ -61,3 +68,3 @@
.SH BUGS
Mm does not truncate existing files, which may lead to unexpected results.
Existing files are not truncated, which may lead to unexpected results.
This is inconsistent with the changes made to the dj(1) man page, possibly to my original man pages but I haven't checked.
@ -67,1 +74,3 @@
Mm was modeled after the cat and tee utilities specified in POSIX.
The cat(1p) and tee(1p) programs specified in POSIX provide equivalent
functionality. The separation of the two sets of functionality into separate
APIs seemed unncessary.
cat(1p) and tee(1p) don't provide equivalent functionality; cat(1p) doesn't specify a way to ignore SIGINT and tee(1p) doesn't specify a way to ensure output is unbuffered.
Perhaps `sh -ec 'trap SIGINT true; cat'` would ignore `SIGINT` with cat(1p) and sh(1p), and `sh -c 'dd bs=1 >>file'` would append, unbuffered, to `file`. But I'm not sure if sh(1p)'s `trap` works like this, and I don't know if it buffers file redirections, and these are still only achievable with the addition of sh(1p).
@ -25,0 +21,4 @@
The program reads from standard input and writes to standard output, replacing
non-printing characters with printable equivalents. Control characters print as
a carat (“^”) followed by the character “@” through “_” corresponding to the
These were single quoted to indicate, following C conventions, specifically that they are ASCII bytes and not strings.
@ -20,3 +20,2 @@
Str tests each character in an arbitrary quantity of string arguments against
the function of the same name within ctype(3).
Test string arguments against each other.
Against... each other?
@ -38,1 +39,3 @@
('').
Originally, there was an isvalue type as an extension to ctype.h(3), but it
was removed in favor of using strcmp(1) to compare strings against the empty
string ('').
I think we can remove this as I am probably the only one that used `isvalue`.
@ -39,2 +39,2 @@
Unicode strings may need to be normalized if the intent is to check visual
similarity and not byte similarity.
The program will exit unsuccessfully if the given strings are not identical;
therefore, unicode strings may need to be normalized if the intent is to check
Unicode is a proper noun.
Just some nitpicks and one goof on my end.
@ -48,0 +51,4 @@
.RS
If the output is a stream, nul bytes are printed. In other words, it does what
.B -a
does but with null bytes instead.
Nul bytes; "nul" as in ASCII NUL, as in the zero byte (`'\0'`).
But in writing it is a null byte, because it is a byte that is null. The NUL representation is used in the context of displaying control characters.
While that is true, nul with a single L is used to refer to an eight-bit (theoretically, maybe even a seven-bit) zero value given in text encoding. Null with two Ls is often used to refer to the null (zero) memory address, which is typically somewhere between 16 and 48 bits. While the word "null" itself just refers to a zero value, "nul" implies a length in bits that "null" leaves ambiguous. Because otherwise they function identically I prefer to refer to any instance of the byte or character `'\0'` as nul regardless of use.
There are also contexts where dd(1p) pads with `' '` (a literal `0x20` if I recall my ASCII). If someone is unreasonably knowledgeable regarding particular dd(1p) usages, calling it a "nul" byte because it is a single ASCII character may make it clear for them that our alignment is an analogue to dd(1p)'s `conv=sync` without (in the dj(1) man page) discussing in depth a tool that is not dj(1).
But is it not clear if it is a null byte (which is 8 bits in length)?
"Null byte" makes sense but "nul byte" makes sense faster and indicates specifically the ASCII zero value versus an arbitrarily-sized ("byte" is sadly not always specific enough) value.
The problem is that null is a word and nul is a representation in practice.
I think I'm willing to cede this hill.
@ -70,2 +71,3 @@
option skips a number of bytes through the output before starting to write from
.RS
Skips a number of bytes through the output before starting to write from
the input. If the input is a stream the bytes are read and discarded. If the
`-S` only configures the output.
Please elaborate.
Whether or not the input is a stream is irrelevant to the function of the option `-S`.
@ -85,0 +129,4 @@
.SH STANDARD INPUT
The standard input shall be used as an input if one or more of the input files
is “-”.
Or by default.
@ -39,2 +28,2 @@
standard output. Standard output itself can be specified by giving the
path '-'. Standard error itself can be specified with the
.RS
Opens outputs for appending rather than updating.
I know I corrected this but upon further reflection I have to fix this: `-a` opens subsequent outputs for appending, because outputs aren't specified positionally but optionally and therefore invocations like `mm -o - -o start -ao append` do open standard output and `start` for writing to the start and open `append` for appending. I was mistaken.
@ -45,0 +37,4 @@
.B -i
.RS
Opens a path as an input. Without any inputs specified mm will use the
standard input.
"-"
will use standard input or standard output.
@ -66,2 +78,3 @@
Mm was modeled after the cat and tee utilities specified in POSIX.
The cat(1p) and tee(1p) programs specified in POSIX together provide nearly
equivalent functionality. The separation of the two sets of functionality into
"similar functionality".
Increasingly rarer nitpicks.
@ -96,0 +76,4 @@
.B -a
.RS
Takes one argument of one byte in length and pads the input buffer with it in
Could you change `^.*length` to "Accepts a single literal byte"?
@ -96,0 +88,4 @@
.B -c
.RS
Specifies an amount of reads to make, and if 0 (the default) dj will
It would be better if this was two sentences - "`. If zero,`".
@ -107,4 +147,4 @@
.R {records read} {ASCII unit separator} {partial records read}
.R {ASCII record separator} {records written} {ASCII unit separator}
.R {partial records written} {ASCII group separator} {bytes read}
.R {ASCII record separator} {bytes written} {ASCII file separator}
I don't know if this should be noted in the man page but this diagnostic output is intended to be machine readable to make scripting easier. I've found dd(1p) to be not only needlessly verbose but also a pain in the ass in this regard.
I really like dj(1)'s `-H`. It made debugging very easy. Though I would be happy to be corrected with an even better output format.
@ -116,3 +157,4 @@
.RS
.R {records read} '+' {partial records read} '>' {records written}
.R '+' {partial records written} ';' {bytes read} '>' {bytes written}
.R {ASCII line feed}
Though this output prioritizes human readability it was also meant to be machine readable in case that was necessary. I couldn't imagine why and I hope it never would be, but if it is, it's easy.
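If the human-readable format ever does need to be read by a machine, a short C sketch shows how little it takes. It assumes the fields are printed in the order given in the hunk above with no other text on the line, and that every count fits in an `unsigned long long`. The names `struct dj_stats` and `parse_stats` are hypothetical and do not come from the dj(1) source.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for the six counters in the -H format. */
struct dj_stats {
	unsigned long long rec_in, prec_in;    /* records, partial records read */
	unsigned long long rec_out, prec_out;  /* records, partial records written */
	unsigned long long bytes_in, bytes_out;
};

/*
 * Parses one line shaped like
 *   {rec_in}+{prec_in}>{rec_out}+{prec_out};{bytes_in}>{bytes_out}\n
 * Returns 0 on success and -1 if the line does not match.
 * Whitespace around the separators is tolerated, which is an assumption
 * about the format rather than something the man page states.
 */
int parse_stats(const char *line, struct dj_stats *s)
{
	int n;

	assert(line != NULL && s != NULL);
	n = sscanf(line, " %llu + %llu > %llu + %llu ; %llu > %llu",
	    &s->rec_in, &s->prec_in, &s->rec_out, &s->prec_out,
	    &s->bytes_in, &s->bytes_out);
	return n == 6 ? 0 : -1;
}
```

For example, a made-up line such as `1+0>1+0;1024>1024` would fill the structure with one full record each way and 1024 bytes read and written.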
@ -20,3 +20,3 @@
.SH DESCRIPTION
Intcmp compares integers.
Compare integers to each other.
With what else would the integers be compared?
#86 (comment)
Rare self own. I think my much younger, more foolhardy self of March 2024 is overly cocky here; all that is necessary is to know that intcmp (who) compares integers (what) (and the rest of the man page is why, how). My problem was specifically with the infinitive. Next time I will be less arrogant.
To be fair to me, this is not an infinitive, it is the second-person conjugation of the verb. The infinitive would be “to compare integers”, but it reads “[you] compare integers”. This phrasing is in line with many other man page descriptions I have read and I find it to be the best solution to the problem of program names being hard to fit into grammar (to capitalize or not to capitalize).
I don't know if this is exactly what you were trying to convey but I understand now - it's the same tense (and you are right, I was mistaken about it being infinitive :P) as the program names themselves. I like that.
changed title from "docs: fixed formatting of many manpages" to "WIP: docs: fixed formatting of many manpages"
@ -65,1 +56,3 @@
The
.RS
Takes a numeric argument as the size in bytes of the input buffer, with the
default being 1024 bytes or one kibibyte (KiB).
Perhaps that this is a kibibyte shouldn't be noted here. It may give the false impression that one could specify a SI prefix, e.g. `dj -b 1KiB`.
@ -94,0 +100,4 @@
If the output is a stream, null bytes are printed. This option is equivalent to
specifying
.B -a
with a null byte instead of a character.
"-a but with null bytes; pads the input buffer with null bytes in the event of an incomplete read.
It's impossible to specify a null byte instead of a character. This may imply that doing so is possible.
(There is the workaround of having an empty argument; if I recall the sh(1p) builtin `read` supports this at least with its `-b` option - I remember this because an article trended on Hacker News recently where the crux of the issue was that the writer didn't understand nul termination. I might consider using that here but it would be as much of a special case code-wise as `-A` is.)
Option descriptions should start on the same line as the options themselves. See `ls(1)` for an example of what I mean.
Again, comparing the manpages with `ls(1)`, we use far too many newlines to separate items. There should only be one newline before a header, including the first header (`NAME`).
Correct:
Whatever we're doing:
@ -56,2 +51,2 @@
.PP
The
.RS
Takes a file path as an argument to open and use as an input.
This could probably be rephrased for clarity. It's not hard to read this as the argument being the thing that gets opened, rather than the file specified by the path within that argument. Obviously that reading makes no sense, but I still think it could be rephrased.
@ -69,0 +67,4 @@
.B -o
.RS
Takes a file path as an argument to open and use as an output.
See https://git.tebibyte.media/bonsai/coreutils/pulls/86/files#issuecomment-4463
@ -73,2 +81,2 @@
.PP
The
.RS
Skips a number of bytes through the output before starting to write from
Should clarify the difference between skipping n bytes and seeking to the nth byte.
cc @trinity
The former. I'd have to think about how to word it.
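While the wording is being worked out, a C sketch of "skipping n bytes" through an output may help pin down the behavior. It follows the option text under review: move forward by n bytes from the current position, and print nul bytes when the output is a stream. The helper name `skip_output` is hypothetical and this is not claimed to be how src/dj.c does it. Treating `ESPIPE` from lseek(2) as the test for "is a stream" is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from src/dj.c.
 * Skips n bytes forward through fd, relative to the current offset
 * (skipping n bytes, as opposed to seeking to the nth byte).
 * If fd cannot seek, n nul bytes are written instead.
 * Returns 0 on success and -1 on error.
 */
int skip_output(int fd, off_t n)
{
	static const char nuls[512] = {0};

	assert(n >= 0);
	if (lseek(fd, n, SEEK_CUR) != (off_t)-1)
		return 0; /* seekable: the offset moved, nothing was written */
	if (errno != ESPIPE)
		return -1;
	while (n > 0) {
		size_t want = n < (off_t)sizeof nuls ? (size_t)n : sizeof nuls;
		ssize_t w = write(fd, nuls, want);

		if (w < 0 && errno == EINTR)
			continue; /* interrupted before writing anything; retry */
		if (w <= 0)
			return -1;
		n -= w;
	}
	return 0;
}
```

Seeking to the nth byte would use `SEEK_SET` instead of `SEEK_CUR`. The two only differ when the output offset is not already zero, which is the case the man page wording needs to make unambiguous.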
@ -92,2 +93,2 @@
.B -q
is specified a second time. The
.RS
Specifies a number of reads to make. If set to zero (the default), reading will
@emma and I spoke verbally about this; fae wants to rewrite this to be less clunky.
@ -81,1 +89,4 @@
of an incomplete read from the input file.
.RE
.B -c
Reminder to @emma to swap the positions of `-c` and `-A`.
@ -96,0 +129,4 @@
.SH STANDARD INPUT
The standard input shall be used as an input if no inputs are specified one or
@ -106,1 +137,3 @@
.PP
On a partial or empty read, a diagnostic message is printed (unless the
.B -q
option is specified) and the program exits (unless the
error: unmatched parenthesis on line 139
Good catch.
@ -107,0 +140,4 @@
.B -n
option is specified.
By default statistics are printed for input and output to the standard error in
Note the added comma.
@ -128,0 +164,4 @@
If the
.B -d
option is specified, debug output will be printed at the beginning of execution.
This debug information contains information regarding how the program was
Merge this line with the above line.
@ -128,2 +179,4 @@
diagnostic message is printed and the program exits with the appropriate
sysexits.h(3) status.
.SH BUGS
Removed superfluous word.
@ -96,0 +118,4 @@
.B -n
.RS
Retries failed reads once more before exiting.
Default is 1, but a number can be specified. This should be made clear here.
A number can be specified for -n in dj(1)?
https://git.tebibyte.media/bonsai/coreutils/pulls/86/files#issuecomment-4481
That meant using `-n` and `-c` in tandem.
@ -136,25 +189,29 @@ expected (the product of the count multiplied by the input block size). If the
or
.B -A
options are used this could make data written nonsensical.
Added comma.
@ -138,3 +191,3 @@
options are used this could make data written nonsensical.
.PP
Many lowercase options have capitalized variants and vice-versa which can be
@emma says that this should be moved to `CAVEATS`.
@ -148,0 +203,4 @@
This program was based on the dd(1p) utility as specified in POSIX. While
character conversion may have been the original intent of dd(1p), it is
irrelevant to its modern use. Because of this, it eschews character conversion
and adds typical option formatting, allowing seeks to be specified in bytes
Again, seeking vs skipping. We need to land on one of them and stick to it, and make the behavior clear.
@ -147,1 +203,3 @@
features: typical option formatting, allowing seeks to be specified in bytes
This program was based on the dd(1p) utility as specified in POSIX. While
character conversion may have been the original intent of dd(1p), it is
irrelevant to its modern use. Because of this, it eschews character conversion
Clarify the "it" here.
@ -149,3 +208,1 @@
format that's easy to parse for machines. It also neglects character
conversion, which may be dd's original intent but is irrelevant to its modern
use.
format that’s easy to parse for machines.