find(1p) analogue #60

Open
opened 2024-02-15 07:31:57 +00:00 by emma · 4 comments
Owner

What should our find analogue look like?

emma added the enhancement label 2024-02-15 07:31:57 +00:00
Owner

Probably nothing like find(1p), [which is famously awful](https://doc.cat-v.org/unix/find-history) (this history lines up with *[A Research UNIX Reader](https://www.cs.dartmouth.edu/~doug/reader.pdf)*, pg. 4) and [loathed by everybody](https://news.ycombinator.com/item?id=10317964).

There are two reimplementations notable enough to warrant mention on [find(1p)'s Wikipedia page](https://en.wikipedia.org/wiki/Find_(Unix)):

  • [fd(1)](https://github.com/sharkdp/fd), a popular replacement written in Rust, with an unfortunately namespace-collision-prone name,
  • and the pair of [sor(1) and walk(1)](https://github.com/google/walk), which are shockingly elegant in usage considering they came out of Google, though the Plan 9 inspiration must have done a lot of heavy lifting there.

walk(1) is pleasant and worthy of implementation. ASV could be used by default as a separator between filenames (which record separator?), with `-d` for specifying another separator, `-0` for NULs, and `-n` for newlines. `-l` could be used to specify the maximum depth to traverse.
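A rough sketch of that behaviour, in Python purely for illustration (not the eventual implementation). The names `walk`, `render`, and `RECORD_SEPARATOR` are made up for this example; picking ASCII RS (0x1E) as the default is an assumption, since the question of which separator to use is still open, and so is the reading of `-l 0` as "print only the starting path".

```python
import os
from typing import Iterable, Iterator, Optional

# Assumption: ASCII RS (0x1E) as the default; the thread leaves this choice open.
RECORD_SEPARATOR = "\x1e"


def walk(root: str, max_depth: Optional[int] = None, _depth: int = 0) -> Iterator[str]:
    """Yield root and everything beneath it, descending at most max_depth levels.

    max_depth plays the role of the proposed -l flag; None means no limit.
    """
    yield root
    if max_depth is not None and _depth >= max_depth:
        return
    # Do not descend into non-directories or symlinked directories (avoids loops).
    if os.path.islink(root) or not os.path.isdir(root):
        return
    try:
        names = sorted(os.listdir(root))
    except OSError:
        return  # unreadable directory: skipped here; a real tool would complain
    for name in names:
        yield from walk(os.path.join(root, name), max_depth, _depth + 1)


def render(paths: Iterable[str], delimiter: str = RECORD_SEPARATOR) -> str:
    """Terminate every path with the delimiter (-d picks one, -0 is NUL, -n is newline)."""
    return "".join(path + delimiter for path in paths)
```

For example, `render(walk(".", max_depth=1), "\n")` would mimic `walk -n -l 1` under these assumptions.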

sor(1) seems alright but I don't know how I feel about it. It might be better to have a testeach(1) (`Usage: testeach (-!0n) (-d [delimiter]) (!) [command (args...)]`; `0`, `d`, and `n` carrying the same use as in my proposed walk(1)) that passes each line as a single argument after the given command and arguments and prints the line to standard output if the command exits successfully (or, if `!` is used, unsuccessfully). This is as opposed to the given sor(1), which is ambiguous in this respect ([the man page usage](https://github.com/google/walk/blob/master/sor.1#L17) is `sor SNIPPET ...` but the given example is `sor 'test -f'` - you could read the code here but [it is a little frightening](https://github.com/google/walk/blob/master/sor#L86)).
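To make the intended testeach(1) semantics concrete, here is a minimal Python sketch, again illustration only. `split_items` and `testeach` are hypothetical names, the RS default mirrors the assumption made for walk(1), and silencing the command's stdin and stdout is my own choice so that only the passing items reach the output.

```python
import subprocess
from typing import Iterable, Iterator, List


def split_items(data: str, delimiter: str = "\x1e") -> List[str]:
    """Split input on the delimiter, dropping the empty piece after a trailing one."""
    items = data.split(delimiter)
    if items and items[-1] == "":
        items.pop()
    return items


def testeach(items: Iterable[str], command: List[str], negate: bool = False) -> Iterator[str]:
    """Yield each item for which `command item` exits successfully.

    The item is appended as exactly one argument, so spaces in names are safe.
    negate=True corresponds to the proposed `!`: keep items whose command fails.
    """
    for item in items:
        status = subprocess.run(
            command + [item],
            stdin=subprocess.DEVNULL,   # assumption: the command must not eat our input
            stdout=subprocess.DEVNULL,  # assumption: only the items themselves are printed
        ).returncode
        if (status == 0) != negate:
            yield item
```

With this, `testeach(split_items(data), ["test", "-f"])` keeps only regular files, which is what `sor 'test -f'` appears to be for, without needing a shell snippet.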

Any find(1p) wrapper is going to be complicated no matter what, because the spec is descriptive of a number of terrible classic implementations, so the compatibility of walk(1) with find(1p) doesn't really need to be considered for this - there are no clear ways to save work later with decisions made now, and plenty of ways to overcomplicate walk(1) by shoehorning find(1p) features into it. I expect most UNIX users try to avoid find(1p) anyway because it sucks, so there won't be much violation of expectation in ignoring find(1p)'s semantics. I don't think ignoring find(1p)'s semantics is an issue of hubris, either; as *A Research UNIX Reader* mentions, find(1p) was created for [The Programmer's Workbench](https://en.m.wikipedia.org/wiki/PWB/UNIX) - a way to commercialize UNIX, though perhaps this is an oversimplified take - and never meshed well or made much attempt to integrate with UNIX and its philosophy.

Owner

I'm working on [a fork of Google's walk(1) that adds the features necessary for our uses](https://git.sr.ht/~trinity/src/tree/main/item/walk/walk.c). It's very nearly done (compiles and runs, but segfaults if `-l` is specified).

trinity self-assigned this 2024-02-18 20:57:21 +00:00
Owner

> I'm working on a fork of Google's walk(1) that adds the features necessary for our uses. It's very nearly done (compiles and runs, but segfaults if `-l` is specified).

I've finished this: https://git.sr.ht/~trinity/src/tree/main/item/walk/walk.c

I also implemented `-q` for silencing some diagnostic messages.

Author
Owner

Some name ideas:

  • lent (list directory entries)
  • dent (directory entries)
  • rent (recursive directory entries)
Reference: bonsai/harakit#60