.\" Copyright (c) 2024 DTB <trinity@trinity.moe>
.\" Copyright (c) 2024–2025 Emma Tebibyte <emma@tebibyte.media>
.\"
.\" This work is licensed under CC BY-SA 4.0. To see a copy of this license,
.\" visit <http://creativecommons.org/licenses/by-sa/4.0/>.
.\"
.TH DJ 1 2024-07-14 "Harakit X.X.X"
.SH NAME
dj \(en disk jockey
.\"
.SH SYNOPSIS

dj
.RB [ -Hn ]
.RB [ -a\ byte ]
.RB [ -c\ count ]

.RB [ -i\ file ]
.RB [ -b\ block_size ]
.RB [ -s\ offset ]

.RB [ -o\ file ]
.RB [ -B\ block_size ]
.RB [ -S\ offset ]
.\"
.SH DESCRIPTION

Perform precise read and write operations on files. This utility is useful for
reading and writing binary data to and from disks.

This manual page uses the terms \(lqskip\(rq and \(lqseek\(rq to refer to
moving to a specified byte by index in the input and output of the program
respectively. This language is inherited from the
.BR dd (1p)
specification in \*(Px and used here to decrease ambiguity.

The offset used when skipping or seeking refers to how many bytes are skipped
or sought. Running
.BR dj (1)
with a skip offset of 1 reads from the second byte onwards. A programmer may
think of a file as a zero-indexed array of bytes; in this analogy, the offset
given is the index of the byte at which to start reading or writing.
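
The zero-indexed array analogy can be sketched in Python. This is only an
illustration of the offset arithmetic, not a description of how the program is
implemented; the function name skip is hypothetical.

```python
def skip(data: bytes, offset: int) -> bytes:
    """Return the bytes that remain after skipping `offset` bytes."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    # Slicing from `offset` starts at the byte whose index is `offset`.
    return data[offset:]

message = b"Hello, world!"
# A skip offset of 1 starts reading at index 1, the second byte.
print(skip(message, 1))
# A skip offset of 7 starts reading at index 7, the letter "w".
print(skip(message, 7))
```

An offset of 0 skips nothing, which matches reading from the first byte.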
.\"
.SH OPTIONS

.IP \fB-i\fP\ \fIfile\fP
Takes a file path as an argument and opens it for use as an input.
.IP \fB-b\fP\ \fIblock_size\fP
Takes a numeric argument as the size in bytes of the input buffer. If this
option is not specified, the size is 1024 bytes.
.IP \fB-s\fP\ \fIoffset\fP
Takes a numeric argument as the index of the byte at which reading will
commence; the program \(lqskips\(rq that number of \fIbytes\fP. If the standard
input is used, bytes read to this point are discarded.
.IP \fB-o\fP\ \fIfile\fP
Takes a file path as an argument and opens it for use as an output.
.IP \fB-B\fP\ \fIblock_size\fP
Takes a numeric argument as the size in bytes of the output buffer. The default
size is 1024. Note that this option only affects the size of output writes and
not the amount of output data itself. See the CAVEATS section.
.IP \fB-S\fP\ \fIoffset\fP
Takes a numeric argument as the index of the byte at which writing will
commence; the program \(lqseeks\(rq that number of bytes. If the standard
output is used, null characters are first printed this many times.
.IP \fB-a\fP\ \fIbyte\fP
Accepts a single literal byte with which the input buffer is padded in the
event of an incomplete read from the input file. If the option argument is
empty, the null byte is used.
.IP \fB-c\fP\ \fIcount\fP
Specifies a number of blocks to read. The default is 0, in which case the input
is read until a partial or empty read is made.
.IP \fB-H\fP
Prints diagnostic messages in a human-readable manner as described in the
DIAGNOSTICS section.
.IP \fB-n\fP
Retries failed reads once before exiting.
.\"
.SH STANDARD INPUT

The standard input shall be used as an input if none is specified or if the
input file is \(lq-\(rq.
.\"
.SH STANDARD OUTPUT

The standard output shall be used as an output if none is specified or if the
output file is \(lq-\(rq.
.\"
.SH EXAMPLES

The following
.BR sh (1p)
line:

.RS
printf 'Hello, world!\(rsn' | dj -c 1 -b 7 -s 7 2>/dev/null
.RE

produces the following output:

.RS
world!
.RE

The following
.BR sh (1p)
lines run sequentially:

.RS
tr '\(rs0' 0 </dev/zero | dj -c 1 -b 6 -o hello.txt

tr '\(rs0' H </dev/zero | dj -c 1 -b 1 -o hello.txt

tr '\(rs0' e </dev/zero | dj -c 1 -b 1 -o hello.txt -S 1

tr '\(rs0' l </dev/zero | dj -c 1 -b 2 -o hello.txt -S 2

tr '\(rs0' o </dev/zero | dj -c 1 -b 1 -o hello.txt -S 4

tr '\(rs0' '\(rsn' </dev/zero | dj -c 1 -b 1 -o hello.txt -S 5

dj -i hello.txt
.RE

produce the following output:

.RS
Hello
.RE

It may be particularly illuminating to print the contents of the example
.B hello.txt
after each
.BR dj (1)
invocation.
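
As a rough aid to following the second example, the Python sketch below
replays the same six writes against a temporary file and records the contents
after each one. It assumes only what this page states: output files are not
truncated, and a seek offset is the index of the first byte written. The names
write_at and replay are hypothetical, and the sketch does not call the program
itself.

```python
import os
import tempfile

def write_at(path: str, data: bytes, offset: int = 0) -> None:
    """Write `data` at byte `offset` without truncating the file."""
    # "r+b" keeps existing contents; create the file first if it is missing.
    if not os.path.exists(path):
        open(path, "wb").close()
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)

def replay(path: str) -> list:
    """Replay the example's writes and return the contents after each one."""
    line_feed = bytes([10])
    steps = [(b"000000", 0), (b"H", 0), (b"e", 1),
             (b"ll", 2), (b"o", 4), (line_feed, 5)]
    history = []
    for data, offset in steps:
        write_at(path, data, offset)
        with open(path, "rb") as f:
            history.append(f.read())
    return history

with tempfile.TemporaryDirectory() as tmp:
    for contents in replay(os.path.join(tmp, "hello.txt")):
        print(contents)
```

Each printed line shows one more placeholder zero replaced, ending with the
five letters followed by a line feed.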
.\"
.SH DIAGNOSTICS

On a partial or empty read, a diagnostic message is printed. Then, unless the
.B -n
option is specified, the program exits.

By default, statistics are printed for input and output to the standard error
in the following format:

.RS
{records read} {ASCII unit separator} {partial records read}
{ASCII record separator} {records written} {ASCII unit separator}
{partial records written} {ASCII group separator} {bytes read}
{ASCII record separator} {bytes written} {ASCII file separator}
.RE
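
A consumer of the default format might split it on the named separators, as
in the Python sketch below. It assumes that each field is a decimal integer
and that the spaces and line breaks in the layout above are not part of the
output; neither point is confirmed by this page. The function name, the field
names, and the sample report are hypothetical.

```python
# ASCII unit, record, group, and file separators named in the format above.
US, RS, GS, FS = (bytes([code]) for code in (0x1F, 0x1E, 0x1D, 0x1C))

NAMES = ("records_read", "partial_read", "records_written",
         "partial_written", "bytes_read", "bytes_written")

def parse_stats(raw: bytes) -> dict:
    """Split one default-format statistics report into named integers."""
    body = raw.split(FS, 1)[0]               # text before the file separator
    record_part, byte_part = body.split(GS)  # records group, then bytes group
    read_part, written_part = record_part.split(RS)
    fields = read_part.split(US) + written_part.split(US) + byte_part.split(RS)
    if len(fields) != len(NAMES):
        raise ValueError("unexpected number of fields")
    return {name: int(field.decode("ascii")) for name, field in zip(NAMES, fields)}

# Hypothetical report: one full record of 7 bytes read and written.
sample = b"1" + US + b"0" + RS + b"1" + US + b"0" + GS + b"7" + RS + b"7" + FS
print(parse_stats(sample))
```

Splitting from the outermost separator inward mirrors the nesting of the
format: the group separator divides record counts from byte counts, and the
record and unit separators divide the fields within each group.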

The default format for diagnostic output is designed to be machine-parseable
for convenience. For a more human-readable format, the
.B -H
option may be specified. In this event, the following format is used instead:

.RS
{records read} '+' {partial records read} '>' {records written}
'+' {partial records written} ';' {bytes read} '>' {bytes written}
{ASCII line feed}
.RE
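
The human-readable format can also be read back by a script if needed. The
Python sketch below matches one such line with a regular expression. It
assumes decimal integer fields and tolerates optional spaces between tokens,
since this page does not say whether any are printed. The function name, the
field names, and the sample line are hypothetical.

```python
import re

NUM = "([0-9]+)"
# Tokens in the order given above; optional spaces between them are allowed.
PATTERN = re.compile(" *".join(
    [NUM, "[+]", NUM, ">", NUM, "[+]", NUM, ";", NUM, ">", NUM]))

NAMES = ("records_read", "partial_read", "records_written",
         "partial_written", "bytes_read", "bytes_written")

def parse_human(line: str) -> dict:
    """Parse one human-readable statistics line into named integers."""
    match = PATTERN.fullmatch(line.strip())
    if match is None:
        raise ValueError("not a human-readable statistics line")
    return {name: int(value) for name, value in zip(NAMES, match.groups())}

# Hypothetical line: one full record of 7 bytes read and written.
sample = "1+0>1+0;7>7" + chr(10)
print(parse_human(sample))
```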

On non-recoverable errors that don\(cqt pertain to the read-write cycle, a
diagnostic message is printed and the program exits with the appropriate
.BR sysexits.h (3)
status.
.\"
.SH BUGS

If
.B -n
is specified along with the
.B -c
option and a count, actual byte output is the product of the count and the
input block size and therefore may be lower than expected. If the
.B -a
option is specified, this could make written data nonsensical.
.\"
.SH CAVEATS

Existing files are not truncated on output and are instead overwritten in place.
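
The practical effect of overwriting without truncation can be seen in a short
Python sketch. It uses ordinary file operations rather than the program
itself, and the function name and file name are hypothetical.

```python
import os
import tempfile

def overwrite(path: str, data: bytes) -> bytes:
    """Write `data` at the start of an existing file and return the result."""
    # "r+b" opens for update without truncating, unlike "wb".
    with open(path, "r+b") as f:
        f.write(data)
    with open(path, "rb") as f:
        return f.read()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "wb") as f:
        f.write(b"0123456789")
    # Only the first three bytes change; the old tail is still present.
    print(overwrite(path, b"abc"))
```

When the new data is shorter than the existing file, the remaining old bytes
stay in the file, so a fresh copy may require removing the file first.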

Option variants that have lowercase and uppercase forms could be confused with
each other. Lowercase options affect input and uppercase options affect output.

The
.B -B
option could be mistaken for the count in bytes of data written to the output.
This conception is intuitive but incorrect, as the
.B -c
option controls the number of blocks to read and the
.B -b
option sets the size of the blocks. The
.B -B
option is similar to the latter but sets the size of blocks to be written,
regardless of the amount of data that will actually be written. In practice,
this means the input buffer should be very large to make use of modern hardware
input and output speeds.

Bytes skipped or sought while processing irregular files, such as streams,
are reported in the diagnostic output, because they were actually read or
written. Bytes skipped while processing regular files, by contrast, are not
reported.

Much of this program shares its functionality with
.BR mm (1).
.\"
.SH RATIONALE

This program was based on the
.BR dd (1p)
utility as specified in \*(Px. While character conversion may have been the
original intent of
.BR dd (1p),
it is irrelevant to its modern use. Because of this, this program eschews
character conversion. It adds typical option formatting, allows seeks to be
specified in bytes rather than in blocks, allows arbitrary bytes as padding,
and prints in a format that\(cqs easy for machines to parse.
.\"
.SH AUTHOR

Written by DTB
.MT trinity@trinity.moe
.ME .
.\"
.SH COPYRIGHT

Copyright \(co 2023 DTB. License AGPLv3+: GNU AGPL version 3 or later
<https://gnu.org/licenses/agpl.html>.
.\"
.SH SEE ALSO
.BR mm (1),
.BR dd (1p),
.BR lseek (3p)