src/cbz/cbarchives.txt

Comic Book Archives (CB7, CBR, CBZ)

A Comic Book Archive is just an archive with a bunch of sequentially-named
images in it which collate ASCIIbetically to the order in which they should be
read. Each image corresponds to a page presented by a comic book reader and a
Comic Book Archive represents a digital comic book.

No guarantees can be made regarding image format, image names (though you can
expect them to match [[:digit:]]*\..*; that is, a run of digits followed by a
file extension), archive attributes (compression or metadata), or what is in
the archive besides the images. In fact, when a comic book has only one image,
the most common type of file in its archive may not be an image at all.
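
As a rough sketch of the above (Python, standard library only), pages can be
picked out of an archive listing by matching names against that pattern and
sorting the survivors. The name page_order and the choice to test only the
basename of each entry are my own assumptions, not part of any specification.

```python
import posixpath
import re

# One or more digits, a dot, then an extension. Slightly stricter than
# [[:digit:]]*\..* in that at least one digit is required.
PAGE_NAME = re.compile(r"[0-9]+\..+")

def page_order(names):
    """Return the entries that look like pages, in reading order."""
    pages = [n for n in names if PAGE_NAME.fullmatch(posixpath.basename(n))]
    # Python compares strings by code point, which for ASCII names is the
    # same thing as ASCIIbetical order.
    return sorted(pages)
```

Note that page_order(["10.jpg", "2.jpg"]) keeps "10.jpg" first, because "1"
sorts before "2". Names only collate into reading order when the numbers are
padded to the same width.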

The extension corresponds to what type of archive is being used. I haven't seen
an extension that hasn't fit within DOS 8.3, which makes sense as amateur
skiddies and repackers mostly use Windows.

<http://justsolve.archiveteam.org/wiki/Comic_Book_Archive>
<https://en.wikipedia.org/wiki/Comic_book_archive>

Here's a table of Comic Book Archive types. I can't imagine this list is
comprehensive but I can't find more on-line.
 ____________________
| Ext | Archive used |
|-----|--------------|
| CB7 | 7-Zip        |
| CBA | ACE          |
| CBR | RAR          |
| CBT | tar(1)       |
| CBZ | PKZip        |
'-----'--------------'
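
The table translates directly into a lookup. This is only a sketch: the names
CB_TYPES and archive_type are made up for illustration, and the extension is
just what the packer claimed, so treat the result as a hint.

```python
import os.path

# Extension -> archive format, copied from the table above.
CB_TYPES = {
    ".cb7": "7-Zip",
    ".cba": "ACE",
    ".cbr": "RAR",
    ".cbt": "tar",
    ".cbz": "PKZip",
}

def archive_type(filename):
    """Return the archive format implied by the extension, or None."""
    ext = os.path.splitext(filename)[1].lower()
    return CB_TYPES.get(ext)
```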

I normalize the files I get to the following settings:

ARCHIVE: PKZip. DEFLATE algorithm. No other configuration.
CONTENTS: ONLY images, in whatever encoding I found them.
	Sequential naming starting from one, with leading zeroes to ensure file
	names are all the same length.
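
A minimal sketch of that normalization using Python's zipfile module. The
function name write_cbz is hypothetical, and it assumes the caller passes
image paths that are already in reading order.

```python
import os
import zipfile

def write_cbz(image_paths, cbz_path):
    """Pack images, already in reading order, into a DEFLATE-compressed CBZ."""
    # Digits in the largest page number, so every name is the same length.
    width = len(str(len(image_paths)))
    with zipfile.ZipFile(cbz_path, "w", compression=zipfile.ZIP_DEFLATED) as cbz:
        for number, path in enumerate(image_paths, start=1):
            ext = os.path.splitext(path)[1]
            # The image bytes are stored as found; nothing is re-encoded.
            cbz.write(path, arcname=f"{number:0{width}d}{ext}")
```

With ten input images the members come out as 01 through 10 plus their
original extensions; with nine or fewer, no padding is needed.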

<https://en.wikipedia.org/wiki/7-Zip>
7-Zip is free and open source. There are a number of implementations and you
can easily extract 7-Zip archives on all modern systems.

<https://en.wikipedia.org/wiki/ACE_(compressed_file_format)>
ACE is a proprietary archive format owned by e-merge GmbH. Nobody uses this.
There is a free software extractor written in Python available from
<https://pypi.org/project/acefile>, and free software Python interpreters are
available for most popular systems.

<https://en.wikipedia.org/wiki/RAR_(file_format)>
RAR is a proprietary archive format owned by win.rar GmbH. It and CBR are both
unfortunately pretty common because RAR is popularly considered better at
compression than PKZip. The reference implementation has a license that is a
little more permissive than a contract with the devil and support for later
versions is spotty in free software. I've found it's best to bite the bullet,
use the reference unrar(1) utility, convert my CBRs to CBZs, and hope I never
need to use it again.
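
The extraction half of that conversion might look like the sketch below. It
assumes an unrar binary is on PATH, and the helper names unrar_argv and
extract_cbr are invented here. Repacking the extracted images as a CBZ is
ordinary zip work afterwards.

```python
import os
import subprocess

def unrar_argv(cbr_path, dest_dir):
    """Build the command line for extracting cbr_path into dest_dir."""
    # "x" extracts with full paths; "-o+" overwrites existing files.
    # unrar only treats the last argument as a directory if it ends in a
    # path separator, hence the join with an empty component.
    return ["unrar", "x", "-o+", cbr_path, os.path.join(dest_dir, "")]

def extract_cbr(cbr_path, dest_dir):
    """Extract a CBR with the reference unrar; raises if unrar fails."""
    os.makedirs(dest_dir, exist_ok=True)
    subprocess.run(unrar_argv(cbr_path, dest_dir), check=True)
```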

<https://en.wikipedia.org/wiki/Tar_(computing)>
TAR (tape archive) is an archive format released in its first incarnation in
1979. It doesn't do any compression and it's easy to extract files even by hand
with a hex editor if you can read the binary structure (which is thoroughly
documented). Tar extractors are ubiquitous, excellent, and built into every
modern operating system (Windows was the notable holdout until Microsoft
started bundling a tar with Windows 10). Later varieties of tar (such as
ustar) are standardized (in IEEE 1003.1-2017) and files in this format will
likely be readable for a very long time.
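
To give an idea of how little there is to read by hand, here is a sketch that
decodes the interesting fields of one 512-byte header block. The function
name read_tar_header is made up, and it assumes short names and octal size
fields (so no ustar prefix field and no files of 8 GiB or more).

```python
def read_tar_header(block):
    """Decode the name, size, and ustar flag from a 512-byte tar header."""
    # Bytes 0-99: file name, NUL-padded.
    name = block[0:100].split(b"\0", 1)[0].decode("utf-8", "replace")
    # Bytes 124-135: file size as octal ASCII digits, NUL- or space-padded.
    size = int(block[124:136].strip(b"\0 ") or b"0", 8)
    # Bytes 257-261: "ustar" in ustar and later formats; older tars differ.
    is_ustar = block[257:262] == b"ustar"
    return name, size, is_ustar
```

The file's data starts in the block right after the header and is padded
with NULs to a multiple of 512 bytes, after which the next header begins.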

<https://en.wikipedia.org/wiki/ZIP_(file_format)>
PKZip is openly documented, has free and open source implementations, and
like tar's later varieties, is standardized (a profile of it, in ISO/IEC
21320-1:2015). Archivers and unarchivers are ubiquitous and available for all
modern operating systems. The format is officially called ZIP but I call it
PKZip after its original implementation (PKZIP, by Phil Katz) and to
differentiate it from other "zip" names such as bzip2, gzip, and xz (which are
all single-file compressors, not archivers).