qi(1): The qi shell #8

Closed
opened 2023-12-24 21:08:29 -07:00 by emma · 55 comments
Owner

This issue thread has been created for the discussion of the future features of the Bonsai Computer System’s default shell.

emma added the enhancement and question labels 2023-12-24 21:08:29 -07:00
emma self-assigned this 2023-12-24 21:08:29 -07:00
silt was assigned by emma 2023-12-24 21:08:29 -07:00
trinity was assigned by emma 2023-12-24 21:08:29 -07:00
Author
Owner

a problem i often run into is prepending and inserting lines to a file or stream of text. i was thinking our shell could have a redirect similar to `>>` in POSIX shell, but allowing both prepending and appending: `>^` for prepending and `>$` for appending. this would mean no more `>>`, though.

Owner

I would prefer not to support appending or prepending, which can be accomplished with subshells or cat(1p). I would prefer that any functionality that can be moved to external utilities be moved there, to keep code minimal and, in the case of scripting using the shell, idiomatic. There are already many ways to append in POSIX sh(1p):

(dd <in; dd <epilogue) >out
dd <in >out; dd <epilogue >>out
cat in epilogue >out

All of these solutions are also ways to prepend (`in` becomes `prologue`, `epilogue` becomes `in`).
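One caveat with the cat(1p) approach is editing a file in place: `cat prologue in >in` truncates `in` before it is read. A minimal sketch of working around that, assuming mktemp(1) is available; the function names `prepend` and `append` are hypothetical and not part of any existing utility:

```sh
# prepend FILE: standard input, then FILE's old contents, written back to FILE.
# append FILE:  FILE's old contents, then standard input, written back to FILE.
# Both go through a temporary file so FILE is not truncated before it is read.
prepend() {
    tmp=$(mktemp) || return 1
    cat - "$1" >"$tmp" && cat "$tmp" >"$1"
    rm -f "$tmp"
}

append() {
    tmp=$(mktemp) || return 1
    cat "$1" - >"$tmp" && cat "$tmp" >"$1"
    rm -f "$tmp"
}

f=$(mktemp)
printf 'body\n' >"$f"
printf 'prologue\n' | prepend "$f"
printf 'epilogue\n' | append "$f"
cat "$f"        # prints prologue, body, epilogue on separate lines
rm -f "$f"
```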

A stream editor (literally, not like sed(1p)) that supports insertion of text before arbitrary lines in standard input may be nice.
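As a rough illustration of what such a tool might do, here is a sketch written as a POSIX shell function. The name `insert_before` and its argument order are assumptions made for this example only:

```sh
# insert_before LINE TEXT: copy standard input to standard output,
# printing TEXT on its own line just before input line number LINE.
insert_before() {
    target=$1
    text=$2
    n=0
    # The "|| [ -n ... ]" part keeps a final line that lacks a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
        if [ "$n" -eq "$target" ]; then
            printf '%s\n' "$text"
        fi
        printf '%s\n' "$line"
    done
}

printf 'one\nthree\n' | insert_before 2 two    # prints one, two, three
```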

Author
Owner

trinity and i talked about this already but we want to replace case/esac with match statements that act similarly to Rust matches.

Author
Owner

i would also like to see curly braces ({ }) used instead of if then fi and for/while do done, but if someone has a better solution im all ears

Author
Owner

no elif. else if please
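for reference, these are the POSIX sh(1p) constructs the proposals above would replace. the function names `classify` and `kind` are made up for the example:

```sh
# if / elif / else / fi as POSIX spells it today
classify() {
    if [ "$1" -lt 0 ]; then
        echo negative
    elif [ "$1" -eq 0 ]; then
        echo zero
    else
        echo positive
    fi
}

# case / esac, the construct a match statement would stand in for
kind() {
    case $1 in
        *.c|*.h) echo 'C source' ;;
        *.rs)    echo 'Rust source' ;;
        *)       echo unknown ;;
    esac
}

classify 7      # prints positive
kind main.rs    # prints Rust source
```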

Owner

Curly braces could open a block the same way they do in C, where variables created inside the block are local to it and can't escape but variables from outside the block can be modified as usual, and an arbitrary number of statements can go inside blocks. So `if cond { statements... }`, `if cond statement;`. Separating `cond` from `statement` would be a problem though. Parens are kinda gaudy for the purpose. `if(foo) bar(baz);` doesn't visually indicate well enough that `if` is not a function nor similar.

Then `if cond statement; else if cond statement; else statement;` can look like `(if (cond) (statement) (else (if (cond) (statement) (else (statement)))))` internally, the same way C does it, which is pretty nice.

Owner

I think `&&` and `||` should go or they should replace `if` entirely.

scrute -ef file \
    && {
        ls -l $file
        wc -c $file
    }

That doesn't look terrible but I'm not the biggest fan. But then of course how does `if(cond && cond)` work?
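For context, POSIX sh already allows a brace group after `&&`, but `a && b || c` is not a drop-in `if`/`else`: `c` also runs when `b` fails. A small sketch using test(1p) in place of the hypothetical `scrute`; the function names are made up for the example:

```sh
# A brace group after && runs only when the test succeeds.
check() {
    test -e "$1" && {
        echo "$1 exists"
    }
}

# Pitfall: the right side of || runs when the condition fails
# OR when the "then" part fails.
pitfall() {
    true && false || echo 'fallback ran'
}

# With if/else the else branch depends only on the condition.
with_if() {
    if true; then false; else echo 'fallback ran'; fi
}

check /       # prints "/ exists"
pitfall       # prints "fallback ran"
with_if       # prints nothing
```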

Author
Owner

> Curly braces could open a block the same way they do in C, where variables created inside the block are local to it and can't escape but variables from outside the block can be modified as usual, and an arbitrary amount of statements can go inside blocks. So `if cond { statements... }`, `if cond statement;`. Separating `cond` from `statement` would be a problem though. Parens are kinda gaudy for the purpose. `if(foo) bar(baz);` doesn't visually indicate well enough that `if` is not a function nor similar.
>
> Then `if cond statement; else if cond statement; else statement;` can look like `(if (cond) (statement) (else (if (cond) (statement) (else (statement)))))` internally, the same way C does it, which is pretty nice.

i would prefer not to see scoping in the shell. as for the way conditionals work, what exactly is the issue with this?:

if true {
  # do stuff
} else if another-thing {
  # more stuff
} else {
  # jeez, didnt wanna do that other stuff
}
emma closed this issue 2023-12-26 11:06:31 -07:00
emma reopened this issue 2023-12-26 11:06:41 -07:00
Owner

> i would prefer not to see scoping in the shell.

This is probably for the best. I mention it because this is how C functions with statement blocks.

> as for the way conditionals work, what exactly is the issue with this?:
>
> if true {
>   # do stuff
> } else if another-thing {
>   # more stuff
> } else {
>   # jeez, didnt wanna do that other stuff
> }

I find how C does statement blocks to be delightful, with statement blocks being collections of statements but operating as a single statement. e.g.

#include <stdio.h>
int main(){
int j; /* this is a statement */

{ /* this is a statement block */
    int i;

    i = 'a';
    putchar(i);
}

for(j = 10; j > 1; --j) /* for's body is a statement */
    putchar('a'); /* this is a statement */

for(j = 1; j < 10;)
{ /* a statement block can function as a statement */
    ++j;
    putchar('a');
}

return 0;
}

This strikes me as elegant and simple to implement, and the issue of separating the condition from the body arises to support the `if cond statement` pattern. C uses parens (`if (cond) statement;`).

Probably the best option, though, is yours; to enforce curlies all the time, requiring a statement block specifically as the body of a flow control statement. This is fundamentally more readable and also as a result prevents the classic error of adding statements to the body of an `if` only to realize you forgot to put all the statements in a block.

Owner

Both Emma and I plan to have math be in a separate utility or utilities and not built into the shell at all.

Owner

I don't like the shell built-in `shift`.

- It seems to mainly exist as an equivalent to the C idiom `++argv; --argc;` which is (given the benefit of >30 years of hindsight since C89) a complicating and not very readable way to save memory when processing arguments.
- Its definition in POSIX makes it unintuitive for shell scripters. Emma should elaborate here when fae has time; fae has more experience with this.
- It necessitates storing `"$0"` as a new shell variable to be able to continue using it, which shell authors writing diagnostics messages and following good practice always want to do.
- It's inflexible; it can only be used for the one task of shifting through arguments.

`shift` is the only real way to iterate over arguments (`$1`, `$2`, `$3`...) in shell. Particularly with the following idiom:

argv0="$0" # save $0
while test -n "$1" # while there is a remaining argument to use
    do
    printf '%s argument: %s\n' "$argv0" "$1" # example use
    shift # <--- boo!
done
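For comparison, POSIX sh(1p) also accepts a `for` loop with no `in` list, which iterates over the positional parameters without shifting them. Unlike the `test -n "$1"` idiom, it does not stop early at an empty argument. A short sketch; the function name `show_args` is made up for the example:

```sh
show_args() {
    # "for arg" with no "in" list is equivalent to: for arg in "$@"
    for arg do
        printf '%s argument: %s\n' "$0" "$arg"
    done
}

show_args a '' 'b c'    # three lines, including one for the empty argument
```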

What if variables could hold the names of other variables? Not like pointers in C but simply:

foo = 5;
# 5 is a string; this is still stringly typed like sh

bar = foo;

out "$[bar]";
# out means echo (no options allowed)
# the [square brackets] mean expand what is inside bar
# and then use that variable
# $[bar] becomes
# $foo because bar == "foo"
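In today's sh(1p) the closest equivalent is `eval`, which is part of why a dedicated syntax is tempting. A minimal sketch of the same lookup; the helper name `deref` is hypothetical, and `eval` on untrusted variable names is unsafe:

```sh
# deref NAME: print the value of the variable whose name is stored in
# the variable called NAME (two lookups, like $[bar] above).
# Note: uses the global variable "name" as scratch space.
deref() {
    eval "name=\$$1"                    # name = contents of $NAME, e.g. "foo"
    eval "printf '%s\n' \"\$$name\""    # print the contents of $foo
}

foo=5
bar=foo
deref bar    # prints 5
```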

This could be used for better argument usage:

seq 1 $# \
| for i { out "$0 argument: $[i]" }

`for` in this example is a shell built-in that, for each line in standard input, sets the given variable (in this case `i`) to that line's content before running the commands in the curly braces. seq(1) in this usage prints a newline-delimited list of integers in the sequence described in its arguments. `out` is an analogue to echo(1) as currently being described in #27.

This might be a terrible idea that can be used to write terrible scripts. Also, there is no real concept of betta(1) yet, so I'm basing my examples off sh(1p).

Author
Owner

> It necessitates storing `"$0"` as a new shell variable to be able to continue using it, which shell authors writing diagnostics messages and following good practice always want to do.

shift(1) does not modify the value of `$0`, actually. From shift(1p):

DESCRIPTION
       The positional parameters shall be shifted. Positional parameter 1
       shall be assigned the value of parameter (1+n), parameter 2 shall be
       assigned the value of parameter (2+n), and so on. The parameters
       represented by the numbers "$#" down to "$#-n+1" shall be unset, and
       the parameter '#' is updated to reflect the new number of positional
       parameters.

       The value n shall be an unsigned decimal integer less than or equal to
       the value of the special parameter '#'.  If n is not given, it shall be
       assumed to be 1. If n is 0, the positional and special parameters are
       not changed.
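a quick way to check this in any POSIX shell (the function name `demo` is made up for the example):

```sh
demo() {
    before=$0
    shift                       # drops $1, renumbers the rest
    [ "$before" = "$0" ] && echo '$0 unchanged'
    echo "first=$1 count=$#"
}

demo a b c
# prints:
# $0 unchanged
# first=b count=2
```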
Author
Owner

> It's inflexible; it can only be used for the one task of shifting through arguments.

Maybe we could have a shift-like utility that will “shift” any variable with IFS-delineated text (including `$@`)?
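a rough sketch of the idea as a shell function, only to make the behavior concrete. the name `shift_words` is hypothetical, and it assumes the default IFS and that globbing was enabled before the call:

```sh
# shift_words STRING: split STRING on IFS, drop the first field,
# and print the remaining fields joined by a space.
shift_words() {
    set -f              # no pathname expansion while splitting
    set -- $1           # unquoted on purpose: field splitting on IFS
    set +f              # assumes globbing was on beforehand
    if [ "$#" -gt 0 ]; then
        shift
    fi
    printf '%s\n' "$*"
}

list='one two three'
list=$(shift_words "$list")
echo "$list"    # prints "two three"
```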

Author
Owner

> Its definition in POSIX makes it unintuitive for shell scripters. Emma should elaborate here when fae has time, fae has more experience with this.

From: "Lawrence Velázquez" <vq@larryv.me>
To: "Christoph Anton Mitterer" <calestyo@scientia.org>, "Emma Tebibyte" <emma@tebibyte.media>
Cc: <dash@vger.kernel.org>
Bcc:
Date: 2023-12-06 10:29 PM
Subject: Re:

On Thu, Dec 7, 2023, at 12:00 AM, Christoph Anton Mitterer wrote:
> On Wed, 2023-12-06 at 21:40 -0700, Emma Tebibyte wrote:
>> I found a bug in dash version 0.5.12 where when shifting more than
>> ?#,
>> the shell exits before evaluating a logical OR operator.
>
> AFAIU from POSIX this is perfectly valid behaviour:
>
> https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#shift
>
>> EXIT STATUS
>> If the n operand is invalid or is greater than "$#", this may be
>> considered a syntax error and a non-interactive shell may exit; if
>> the shell does not exit in this case, a non-zero exit status shall
>> be returned. Otherwise, zero shall be returned.

See also Section 2.8.1 [*], which states that interactive shells
shall not exit on special built-in utility errors and that:

        In all of the cases shown in the table where an interactive
        shell is required not to exit, the shell shall not perform
        any further processing of the command in which the error
        occurred.

[*] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_08_01

--
vq
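given that `shift n || fallback` cannot be relied on when n is greater than `$#`, one portable workaround is to compare against `$#` before shifting. a sketch; the function name `drop` is made up for the example:

```sh
# drop N ARGS...: print ARGS with the first N removed, or fail cleanly
# instead of asking shift for more than "$#".
drop() {
    n=$1
    shift                       # remove N itself from the arguments
    if [ "$n" -gt "$#" ]; then
        echo "drop: cannot shift $n, only $# left" >&2
        return 1
    fi
    shift "$n"
    echo "$*"
}

drop 1 a b c                                      # prints "b c"
drop 5 a b c || echo 'handled without exiting'    # error message, then fallback
```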
Owner

> > It necessitates storing `"$0"` as a new shell variable to be able to continue using it, which shell authors writing diagnostics messages and following good practice always want to do.
>
> shift(1) does not modify the value of `$0`, actually.

Fuckin hell I've been using `argv0` for years.

emma changed title from betta(1): The betta shell feature thread to qi(1): The qi shell feature thread 2024-01-23 14:57:26 -07:00
Author
Owner

The name is changed because Trinity and I verbally discussed a shorter name to match the short and simple vibe of `/bin/sh`. In the future we may consider an interactive shell that wraps `qi(1)` with some convenience features.

emma changed title from qi(1): The qi shell feature thread to `qi(1)`: The qi shell feature thread 2024-01-23 15:03:26 -07:00
Author
Owner
Thoughts on this? https://drewdevault.com/2023/07/31/The-rc-shell-and-whitespace.html
Owner

I've been asked to submit my grumblings about POSIX shell quoting to this issue. I am lacking sufficient energy to write eloquently (or at all) about this so here's an example.

# normal, makes sense
$ echo "foo"
foo

# normal, makes sense
$ echo "\"foo\""
"foo"

# normal, makes sense
$ echo 'foo'
foo

# what the fuck
$ echo '\'foo\''
# invalid syntax.
# string 1: \
# string 2: foo
# string 3: '
# error bc unclosed '

# solved by bashism. this should not have to be necessary
$ echo $'\'foo\''
'foo'
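For completeness, these are the portable POSIX spellings, none of which is pretty. printf(1p) is used instead of echo so the output does not depend on the echo implementation:

```sh
printf '%s\n' "'foo'"       # wrap in double quotes: prints 'foo'
printf '%s\n' \'foo\'       # escape the quotes outside any string: prints 'foo'
printf '%s\n' 'it'\''s'     # close the string, add \', reopen: prints it's
```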
Owner

Plan 9's rc(1) doubles quote runes to escape them:

; echo "foo"
foo
; echo ""foo""
"foo"
; echo ''foo''
'foo'

Which I've always felt is quite nice.

The reason sh(1p)'s quotes are fucked is because not only is the backslash used to escape quotes in `"`-wrapped strings but it's also used for regular escape sequences. For example:

$ echo "test"  | wc -c # ['t','e','s','t','\n']
5
$ echo "test\n" | wc -c # ['t','e','s','t','\n','\n']
6
$ echo 'test\n' | wc -c # ['t','e','s','t','\\','n','\n']
7
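One way to separate the shell's quoting from echo's own escape handling is to count bytes with `printf '%s'`, which does not interpret its argument. Under POSIX quoting rules a backslash inside double quotes is only special before `$`, a backquote, `"`, `\` and newline, so both forms below hand the same six bytes to the command; counts that differ with echo depend on the echo implementation in use. The helper name `count` is made up for the example:

```sh
# count STRING: number of bytes the shell actually passed as STRING.
count() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

count "test\n"    # prints 6: t e s t \ n
count 'test\n'    # prints 6: same bytes
```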

POSIX shell quoting is abhorrent and the worst part about shell scripting by far. Escape sequences shouldn't be supported except by a theoretical format(1) (printf(1p) but Bonsai and improved).

Drew DeVault's rc(1) bastardization (it is probably nice in practice) seems consistent but still as overcomplicated as shell is.

I'd like Plan 9 rc(1) quotes/escapes without a difference between `"`- and `'`-wrapped strings and no escape sequences besides `""` or `''`, but I think this is *better* rather than *good*. I'm not sure what good would look like. Probably something to do with ASV.

Author
Owner

I like single quotes being the way to literal-ize strings, i.e., `"$(cat)"` is expanded to the output of `cat` but `'$(cat)'` is literally `$(cat)`.
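The same distinction in a form that can be run without waiting on standard input, using `echo hi` in place of `cat`:

```sh
printf '%s\n' "$(echo hi)"    # double quotes: substitution happens, prints hi
printf '%s\n' '$(echo hi)'    # single quotes: printed literally as $(echo hi)
```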

Owner

The `"$()"` pattern is horrid.

- `()` - subshell
- `$()` - oh wait, i need standard output
- `"$()"` - oh wait, quoting
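The last step matters because an unquoted command substitution is split into fields on IFS. A small sketch that counts the resulting arguments, assuming the default IFS; the function name `count_args` is made up for the example:

```sh
count_args() {
    echo "$#"
}

count_args $(printf 'a b\nc')      # unquoted: split into a, b, c; prints 3
count_args "$(printf 'a b\nc')"    # quoted: one argument; prints 1
```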

My ideal (Lisp!) shell syntax would be

$ echo "$(ls)"
.
..
a
b
c
$ fantasysh
( # the lparen is the prompt btw, and on enter an rparen is inserted (^enter to not do that))
( echo $(ls))
a
b
c
( foreach $((ls) |(sed :^\.:d )) file (
    ( mv *(file) $(strcat *(file) .png)))
( ls)
a.png
b.png
c.png
( expand *(set a "five thousand"))
(five thousand)
( echo *(a))
"five thousand"

qi(1) isn't intended to be a Lisp-oid but probably I will write something like that someday.

Owner

Should piping be done by the shell?

I've had an old program idea that I'm still working on called pspipe(1) - it pipes given commands. So pspipe [ cat ] [ less ] is equivalent to sh -c 'cat | less'. It was conceived because I planned to write a shell with the express purpose of doing as little as possible - to offload functionality usually achieved by shell syntax alone to other programs that can be used no matter what the present shell is, and in cases where their use might be more intuitive or elegant than the shell constructs.

There's currently qi and betta planned - qi being the core, simple shell useful for shebangs and scripting, and betta being the better (for interactive use) shell with all the bells and whistles. I think | could be left out of qi and a piping utility used instead, and betta could either have built-in piping or automatically use a piping utility when it sees the pipe syntax.

I'm not sure how much I like this myself but my intent is to have as little code in qi as possible so it's easily dissectible by anyone who wishes to use it (and knows Rust|C) and has an hour to burn.
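For illustration only, here is a rough Python sketch of what a pspipe-style utility could do, assuming the bracket syntax shown above (`pspipe [ cat ] [ less ]`). The names `parse_groups`, `run_pipeline`, and `main` are made up for this sketch and are not taken from any existing pspipe code.

```python
import subprocess
import sys


def parse_groups(argv):
    """Split ['[', 'cat', ']', '[', 'less', ']'] into [['cat'], ['less']]."""
    groups, current = [], None
    for arg in argv:
        if current is None:
            if arg != "[":
                raise ValueError(f"expected '[', got {arg!r}")
            current = []
        elif arg == "]":
            if not current:
                raise ValueError("empty command")
            groups.append(current)
            current = None
        else:
            current.append(arg)
    if current is not None:
        raise ValueError("missing closing ']'")
    return groups


def run_pipeline(commands, capture=False):
    """Run commands left to right, wiring each stdout to the next stdin."""
    if not commands:
        raise ValueError("no commands given")
    procs, upstream = [], None
    for index, command in enumerate(commands):
        is_last = index == len(commands) - 1
        stdout = subprocess.PIPE if (capture or not is_last) else None
        proc = subprocess.Popen(command, stdin=upstream, stdout=stdout)
        if upstream is not None:
            upstream.close()  # drop the parent's copy of the read end
        upstream = proc.stdout
        procs.append(proc)
    output = upstream.read() if capture else None
    for proc in procs:
        proc.wait()
    return procs[-1].returncode, output


def main(argv=None):
    """Entry point: treat the arguments as bracketed commands and pipe them."""
    argv = sys.argv[1:] if argv is None else argv
    status, _ = run_pipeline(parse_groups(argv))
    return status
```

Calling `main(["[", "cat", "]", "[", "less", "]"])` would then behave roughly like `sh -c 'cat | less'`. This sketch does not handle nested brackets.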

Author
Owner

I think piping and redirects should be in both shells and I’m not too sure about the shells actually having differing syntax. I do think we should interrogate how piping and redirects work in POSIX and work from there.

emma changed title from `qi(1)`: The qi shell feature thread to `qi(1)`: The qi shell 2024-02-06 23:32:15 -07:00
emma pinned this 2024-02-06 23:32:29 -07:00
Owner

The way POSIX shell does variable assignment is awful.

#!/bin/env -i /bin/sh
# ^^ don't inherit an existing environment

echo "$x" # unassigned (vars are "" by default), so it'll echo ['\n']

#x = hello # syntax error
# no spaces can be between the name, '=', and the value

x=hello # infix (subject verb object)
echo "$x" # prefix (verb subject)

Shell variable assignment isn't consistent with most syntax and really finicky about whitespace. I propose a better way:

set x hello

set would need to be a shell builtin to change the local environment; it would have the usage set [variables...] [value] to facilitate setting multiple variables at the same time:

set x y z hello
echo "$x $y $z" # hello hello hello

POSIX shell already has a set builtin for modifying "$@" and its components (as well as configuring the behavior of the shell itself); we should rethink the functionality offered by POSIX set (reserved shell variables for configuration, for example).

Author
Owner

The way POSIX shell does variable assignment is awful.

#!/bin/env -i /bin/sh
# ^^ don't inherit an existing environment

echo "$x" # unassigned (vars are "" by default), so it'll echo ['\n']

#x = hello # syntax error
# no spaces can be between the name, '=', and the value

x=hello # infix (subject verb object)
echo "$x" # prefix (verb subject)

Shell variable assignment isn't consistent with most syntax and really finicky about whitespace. I propose a better way:

set x hello

set would need to be a shell builtin to change the local environment; it would have the usage set [variables...] [value] to facilitate setting multiple variables at the same time:

set x y z hello
echo "$x $y $z" # hello hello hello

POSIX shell already has a set builtin for modifying "$@" and its components (as well as configuring the behavior of the shell itself); we should rethink the functionality offered by POSIX set (reserved shell variables for configuration, for example).

I like this, but could we use let instead?

Owner

How will variables behave?

POSIX shell doesn't use variable declarations; any variable is the empty string at start (giving test -z its utility).

We could copy that behavior but it sort of sucks and makes typos sometimes catastrophic. Remember the Steam rm -rf "$STEAMROOT/"* bug (https://github.com/ValveSoftware/steam-for-linux/issues/3671)? An unset variable would evaluate as the empty string, thus making the expression used /* and nuking everything.

In other languages (Rust, for example), let expressions declare variables (and their nonexistence prior to that expression). I think we should use let and treat variable use before declaration as an error.

We could also use set to declare environment variables to export to child processes, like sh(1p)'s export. I don't think it should be possible to export variables already declared with let, because knowing at declaration whether or not a variable will be exported seems to me to be more useful than being able to change it.

We could default to variables being immutable like Rust but I don't know how useful this is and it seems like it would cause more problems than it would solve.
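To make the difference concrete, here is a small Python sketch of the two behaviors. It borrows the `$name` syntax of Python's `string.Template` only because that is convenient to run; it is not proposed qi syntax. `expand_posix`, `expand_strict`, and `UndeclaredVariable` are names invented for this sketch.

```python
from collections import defaultdict
from string import Template


class UndeclaredVariable(Exception):
    """Raised when a variable is used before anything declared it."""


def expand_posix(text, variables):
    """POSIX-like: an unset variable silently becomes the empty string."""
    return Template(text).substitute(defaultdict(str, variables))


def expand_strict(text, variables):
    """Proposed behavior: using an undeclared variable is an error."""
    try:
        return Template(text).substitute(variables)
    except KeyError as err:
        raise UndeclaredVariable(err.args[0]) from None
```

With no variables set, `expand_posix("$STEAMROOT/*", {})` yields `/*`, which is the Steam failure mode, while `expand_strict` raises before any command could be run.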

Author
Owner

We could default to variables being immutable like Rust but I don't know how useful this is and it seems like it would cause more problems than it would solve.

I don’t see much point to this.

We could also use set to declare environment variables to export to child processes, like sh(1p)'s export. I don't think it should be possible to export variables already declared with let, because knowing at declaration whether or not a variable will be exported seems to me to be more useful than being able to change it.

I would prefer to clash with POSIX shell as little as possible on syntax. Maybe set could be an exception but if there’s a better idea I’d rather that.

POSIX shell doesn't use variable declarations; any variable is the empty string at start (giving test -z its utility).

We could copy that behavior but it sort of sucks and makes typos sometimes catastrophic (remember the Steam rm -rf "DIR/*" glitch?).

In other languages (Rust, for example), let expressions declare variables (and their nonexistence prior to that expression). I think we should use let and treat variable use before declaration as an error.

This makes sense to me.

Owner

I have further thoughts on shell quoting.

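When I think of program execution I think of the exec function family in C's <unistd.h> (https://www.man7.org/linux/man-pages/man3/execvp.3p.html):

#include <unistd.h>

static char *args[] = {
    (char []){ "mm" },
    (char []){ "-i" },
    (char []){ "a b" },
    NULL /* execvp(3) requires a null-terminated argument vector */
};

int main(void){
    execvp("mm", args);
    return 1; /* only reached if execvp fails */
}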

Or subprocess.run in Python (https://docs.python.org/3/library/subprocess.html#subprocess.run):

#!/usr/bin/env python3

import subprocess

subprocess.run(["mm", "-i", "a b"])

Or Rust's std::process::Command (https://doc.rust-lang.org/std/process/struct.Command.html):

use std::process::Command;

// I don't know Rust well but I think this is valid
fn main() {
    let _output = Command::new("mm")
        .args(["-i", "a b"])
        .output()
        .expect("failed to run mm");
}

What these all have in common is that they have clear distinction between arguments, and if one wanted to use a variable as an argument it would be easy:

import subprocess
var="a b"
subprocess.run(["mm", "-i", var])

Meanwhile POSIX shell wants you to die:

#!/bin/sh

var="a b"

var2=$var # assignments are not field-split, so var2 is `a b`
# (the unquoted literal form `var2=a b` would instead run `b`
# with `var2` set to `a` in the child's environment)

mm -i $var
# expands to `mm -i a b` which is invalid usage

No wonder people are desperate to use any interpreted programming language as a shell, asking if Python is a good fit (https://softwareengineering.stackexchange.com/questions/182077/is-it-possible-to-use-python-as-a-shell-replacement) and actually using Common Lisp (https://clisp.sourceforge.io/clash.html). That being said, quoting every shell argument is at best inconvenient, with the example "mm" "-i" "a b" being 4 extra keypresses to type and up to 8 including the shift key.

I think we should start by mandating some useful rules that are already often followed by cautious scribes:

  • Always quote strings that contain whitespace; do not escape whitespace.

I don't think this will be very controversial. While escapes are convenient (an easy way to avoid navigating back to the beginning of the line, adding a quote, and then going back to the end just for one or two spaces) they're easy to mess up catastrophically:

#!/bin/sh

# removes one file
rm -f "A Super Duper Story (Draft).tex"

# removes one file
rm -f A\ Super\ Duper\ Story\ \(Draft\).tex

# removes two files (spot the typo!)
rm -f A\ Super\ Duper\ Story \(Draft\).tex
  • Always quote values in variable assignments.
#!/bin/sh

# again as in the example earlier, runs `b` with `var` set to
# `a` in the child environment
var=a b

# sets `var` to `a b`
var="a b"

# this is functionally equivalent to line 5 but makes
# it clear that the intention is that behavior rather
# than line 8's
var="a" b

This also seems non-controversial. I have more to say but am out of time to write so will comment this right now.

Owner

Alright, this is the continuation of my last comment.

The traditional POSIX shell's splitting of unquoted variable expansions is sometimes useful, but usually unwanted and a pain to deal with. In Python if I wanted that behavior I'd use str.split (https://docs.python.org/3/library/stdtypes.html#str.split):

import subprocess

var="a b"

subprocess.run(
    # ["mm", "-i", "a", "b"]
    ["mm", "-i"] + var.split()
)

The C standard library has no direct equivalent (the functionality offered by str.split could be replicated though, for example with strtok(3)) and Rust is as of now beyond me.
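One way to picture a shell with this property is a command line where a variable always expands to exactly one argument unless splitting is requested. The sketch below is only a model of that idea in Python; `Var` and `build_argv` are hypothetical names, not part of any planned qi interface.

```python
from dataclasses import dataclass


@dataclass
class Var:
    """A variable reference in a command line; splitting is opt-in."""
    name: str
    split: bool = False


def build_argv(words, variables):
    """Turn literal words and Var references into an argument vector."""
    argv = []
    for word in words:
        if isinstance(word, Var):
            value = variables[word.name]
            # One variable is one argument unless the writer asks to split.
            argv.extend(value.split() if word.split else [value])
        else:
            argv.append(word)
    return argv
```

Here `build_argv(["mm", "-i", Var("var")], {"var": "a b"})` keeps `a b` as a single argument, and `Var("var", split=True)` reproduces the unquoted POSIX behavior on purpose.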

trinity reopened this issue 2024-02-12 11:46:28 -07:00
Owner

Accidentally tapped the button. That thought's incomplete and I'll finish it later.

Owner

alright i haven't slept in a few days and have important things to be doing rn, what better idea than to propose shell syntax. this is poorly thought-out and full of holes. have fun deciphering and feel free to harass me if that takes too long

i haven't read through like any of the posts here so i might repeat or redefine things that've been discussed already

i will be using words here. maybe (read: probably) even misusing words. here's a best-effort explanation of my nonsense:

  • term: an evaluable substring of a term (yeah it's a recursive definition, weep). terms may evaluate to their literal text, or evaluate to the result of their contained sub-terms if present. dw if this doesn't make sense, i'll explain it more below.

editor's note: i simplified this quite a bit, only one word remains standing. you're welcome.

emma and i were discussing variable assignment and the potential usage of a let term, which is used to define terms. some pseudoishcode snippets from the conversation to elucidate on that a bit:

; let a meow  # the term `a` evaluates to `meow`
; let b "hru 2 3 -"  # the term `b` evaluates to `hru 2 3 -` which itself evaluates to the results of the command
; a
/bin/qi: meow: not found
; b
-1
; let a meowzers
# a <- meowzers
; let meowzers b
; let a meowzers
# a <- b
; let 'the bomb dot com' meow
; 'the bomb dot com'
/bin/qi: meow: not found
; let a b c meowzers
# a, b, and c all equal meowzers
;let a let
;a b c
# b <- c

some important observations from this that i've already made on your behalf:

  • a term is any valid rust string. keywords can be redefined. let itself can be redefined. also, spaces and symbols are valid. # can be redefined.
  • a term may take arguments from its right hand side. let does this. hru (we have an hru?) also does this. let's call these argument-taking terms "operator terms", or maybe just "operators", and their arguments "operands".
  • one could describe # as an operator that takes all subsequent terms as its operands and does nothing
  • looking at let's operands, they're just terms. they may be taken literally, but they may also be operator terms.
  • this all unlocks some very interesting metaprogramming capabilities

so we have operators, which operate on and consume the terms to their right. these terms may be operators themselves, but they may also just be literals. do you smell it yet? i smell it. it's the smell of polish notation. alright so what if we did polish notation in more places.

suppose there was a pn(1) utility, serving as the prefix version of rpn(1):

# math in the shell can still be concise!
let + 'pn +'
let - 'pn-ng -'  # you could even swap out implementations for various operations if desirable
+ 3 - 2 1  # `pn + 3 pn-ng - 2 1` -> `pn + 3 1` -> `4`

as the reader it is now your job to come up with a more exciting example than that bc that's as far as my thoughts are willing to go at this time of night.

there are issues with this. the main one that jumps out is that in your typical pn language, the interpreter is aware of how many terms will be consumed as operands by a given operator. however, when an operator can be an arbitrary executable, the number of terms consumed is completely ambiguous. there is a fix, which is to let the programmer define those bounds themselves:

(let + (pn +))
(let - (pn-ng -)) (# is it just me or does this syntax seem kinda familiar)
(+ 3 (- 2 1))

oops (http://lambda-the-ultimate.org/node/2352). we should probably not do this.

realized that about halfway through typing out this textwall but was told to post it anyways so here ya go
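The arity problem described above can be seen in a tiny prefix evaluator. This Python sketch assumes every operator has a fixed, known arity (the `ARITY` table), which is exactly the knowledge a shell lacks when an operator is an arbitrary executable. `eval_prefix` and both tables are invented for this illustration and say nothing about how a real pn(1) would work.

```python
# Operators the evaluator knows about, and how many operands each consumes.
ARITY = {"+": 2, "-": 2}
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b}


def eval_prefix(tokens):
    """Evaluate a prefix (Polish notation) token list such as ['+', '3', '1']."""

    def parse(pos):
        if pos >= len(tokens):
            raise ValueError("missing operand")
        token = tokens[pos]
        pos += 1
        if token in OPS:
            operands = []
            # Knowing the arity is what tells us where this operator's operands end.
            for _ in range(ARITY[token]):
                value, pos = parse(pos)
                operands.append(value)
            return OPS[token](*operands), pos
        return int(token), pos

    value, end = parse(0)
    if end != len(tokens):
        raise ValueError("trailing tokens")
    return value
```

`eval_prefix("+ 3 - 2 1".split())` gives 4, matching the example. Without the `ARITY` table there is no way to tell whether `1` belongs to `-` or to `+`, which is why the parenthesized form appears.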

Author
Owner

This extends from the syntax I was considering yesterday:

; let a b
; a
/bin/qi: b: Not found.
; let b 'a'
; 'b'
/bin/qi: b: Not found.
; let y { rpn 2 1 - }
; out y
1
; let a b c 10
# a == 10; b == 10; c == 10
; let string str
; string
Usage: str [type] [string...]
Owner
; let string str
; string
Usage: str [type] [string...]

why not update argv0? just curious, i don't have any strong opinions on the matter

Author
Owner
; let string str
; string
Usage: str [type] [string...]

why not update argv0? just curious, i don't have any strong opinions on the matter

Because string expands to str and the behavior is consistent with how aliasing works in POSIX shell.

Author
Owner

A question to ask is if I do this:

; let a b
; let c 'a'
; c

should this:

; c
/bin/qi: a: Not found.

or this:

; b
/bin/qi: b: Not found.

occur?

I like the idea of preserving the literal string ('a') because otherwise this becomes a whole lot more complex to utilize.

Author
Owner

Actually on second thought I just realized that this is kind of pointless. I can’t think of any scenario where preserving the 'a' would make sense, considering if you run c then it is intuitive that it would try to run a, expanding it, anyway. If you want a string literal 'a' in the output you can do it like this:

; let a b
; out 'a'
a
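The two outcomes being weighed in this and the previous comment differ only in whether lookup stops after one step or keeps following definitions. A minimal Python model of both, assuming definitions are stored as a plain name-to-value table (`resolve` is a made-up helper, not qi code):

```python
def resolve(term, definitions, recursive=True):
    """Look a term up; undefined terms evaluate to their own text."""
    if not recursive:
        # One step only: `c` becomes `a` and stays `a`.
        return definitions.get(term, term)
    seen = []
    while term in definitions:
        if term in seen:
            chain = " -> ".join(seen + [term])
            raise ValueError("cyclic definition: " + chain)
        seen.append(term)
        term = definitions[term]
    return term
```

With `let a b` and `let c a` modeled as `{"a": "b", "c": "a"}`, the one-step lookup of `c` gives `a` and the recursive lookup gives `b`. The recursive form needs the cycle check, since something like `let a b` followed by `let b a` would otherwise never terminate.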
Author
Owner

I think preserving parameter expansion from POSIX shell is a good idea:

; let a b
; out a
b
; out 'a' "a c"
a b c
; out "drop the aass"
drop the aass
; out "drop the {a}ass"
drop the bass
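As a sanity check on the examples above, here is a Python sketch of one possible expansion rule for double-quoted strings: a whole word that names a defined term is replaced, and `{name}` is replaced even when it touches other characters. The rule and the `expand` helper are guesses made for illustration, not settled qi behavior.

```python
import re

# Either {name} anywhere, or a bare word delimited by whitespace or string edges.
_TOKEN = re.compile(r"\{(\w+)\}|(?<!\S)(\w+)(?!\S)")


def expand(text, definitions):
    """Expand defined terms inside a double-quoted string."""

    def replace(match):
        name = match.group(1) or match.group(2)
        # Undefined names are left exactly as written.
        return definitions.get(name, match.group(0))

    return _TOKEN.sub(replace, text)
```

Under this rule, with `a` defined as `b`, `"a c"` becomes `b c`, `"drop the aass"` is untouched, and `"drop the {a}ass"` becomes `drop the bass`.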
Author
Owner

I think preserving parameter expansion from POSIX shell is a good idea:

; let a b
; out a
b
; out 'a' "a c"
a b c
; out "drop the aass"
drop the aass
; out "drop the {a}ass"
drop the bass

I guess this specific syntax won’t work if we’re planning on using { } as a subshell, but we could do something similar.

Author
Owner

I was also thinking that perhaps all quoting should make characters literal so that you have to wrap them in our equivalent to { } to use them. I’m not sure how good of an idea that is but it would help with not having to do something like:

; let number 8
; out 'number:' number
number: 8
Owner

Perhaps there should be a way to change which file is used for randomness, in the shell. Like a random_source variable that expands to a file (like /dev/random).

This could potentially be catastrophic though (try changing the random_source to /dev/zero and running a program that needs to be secure). I'm also not sure how this would be implemented.

This would replicate the functionality of GNU shuf(1)'s --random-source in a way that makes more sense to me (related to #55).

Owner

emma and i were discussing variable assignment and the potential usage of a let term, which is used to define terms. some pseudoishcode snippets from the conversation to elucidate on that a bit:

; let a meow  # the term `a` evaluates to `meow`
; let b "hru 2 3 -"  # the term `b` evaluates to `hru 2 3 -` which itself evaluates to the results of the command
; a
/bin/qi: meow: not found
; b
-1
; let a meowzers
# a <- meowzers
; let meowzers b
; let a meowzers
# a <- b
; let 'the bomb dot com' meow
; 'the bomb dot com'
/bin/qi: meow: not found
; let a b c meowzers
# a, b, and c all equal meowzers
;let a let
;a b c
# b <- c

some important observations from this that i've already made on your behalf:

  • a term is any valid rust string. keywords can be redefined. let itself can be redefined. also, spaces and symbols are valid. # can be redefined.
  • a term may take arguments from its right hand side. let does this. hru (we have an hru?) also does this. let's call these argument-taking terms "operator terms", or maybe just "operators", and their arguments "operands".
  • one could describe # as an operator that takes all subsequent terms as its operands and does nothing
  • looking at let's operands, they're just terms. they may be taken literally, but they may also be operator terms.
  • this all unlocks some very interesting metaprogramming capabilities
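
The `let` behaviour in the snippets above can be modelled with a small term table. This is only a sketch under assumptions read off the examples: every operand of `let` except the last is a name taken literally, the last operand is the value and is resolved once when `let` runs, and undefined terms stand for themselves. The `Terms` type and its methods are hypothetical names, and splitting and running commands is left out.

```rust
use std::collections::HashMap;

/// Term table: maps a term to the text it evaluates to.
#[derive(Default)]
struct Terms {
    table: HashMap<String, String>,
}

impl Terms {
    /// Looks a word up once; undefined words stand for themselves.
    fn resolve(&self, word: &str) -> String {
        self.table
            .get(word)
            .cloned()
            .unwrap_or_else(|| word.to_string())
    }

    /// Evaluates one line of already-split words. Returns the resolved
    /// words to hand to command execution, or None if the line was
    /// consumed by `let` (or was empty).
    fn eval(&mut self, words: &[&str]) -> Option<Vec<String>> {
        let (head, rest) = words.split_first()?;
        if self.resolve(head) == "let" {
            // Names are taken literally; only the value is resolved.
            let (value, names) = rest.split_last()?;
            let value = self.resolve(value);
            for name in names {
                self.table.insert(name.to_string(), value.clone());
            }
            return None;
        }
        Some(words.iter().map(|w| self.resolve(w)).collect())
    }
}
```

Because the head of a line is resolved before it is compared with `let`, a term defined as `let` acts as `let` itself, which is how `;a b c` ends up defining `b` in the last snippet.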

I really like this though some caveats come to mind that I'll mention when I think them through.

It would be nice to have a non-redefinable shell command def:

; let a b
; out a # b
; def
Builtin usage: def [variable name] (index)
; def a # default index is -1
; out a # a
; let a b
; let a c
; out a # c
; def a
; out a # b
; def a 2 # the second definition of a
; out a # which is c
; def a -2 # move back two definitions
; out a # undefined; a

All definitions are saved and def traverses definitions, so it's possible to use a previous definition of a variable.

; let random_source /dev/music_random # found this kernel driver on supersecure.info.ru
; ls /media/audio/ | shuffle | xargs mpv # grooving time
; def random_source # i only wanted that for that one command though

This is inspired by Forth though I'm not very familiar with it.

Owner

On second thought def when used without arguments should display variable usages. Here are some examples of its use to help explain it.

First let's set some variables:

; let a b
; let c d
; let a d
; let a b
; let c b
; let c d

Variable content history with def with no arguments:

; def
a    b    1    -2
c    d    1    -2
a    d    2    -1
a    b    3    <-
c    b    2    -1
c    d    3    <-

This is a mess of information. Let's pick out one variable, a:

; def a
     0    -3
b    1    -2
d    2    -1
b    3    <-

Let's see c now:

; def c
     0    -3
d    1    -2
b    2    -1
d    3    <-

So this is a linear history of c's assignments. For each row showing assignment information, the first column is the value, the second is the number of definitions so far including itself, and the third is the placement relative to the current definition (marked with a <-).

You can assign variables to previous definitions with def [variable] [placement]:

; def c 1
; def c
     0    -1
d    1    <-
b    2    +1
d    3    +2
; out c
d

And use the relative placements with +n or -n:

; def c +1
; def c
     0    -2
d    1    -1
b    2    <-
d    3    +1
; out c
b

Thankfully, changing things when time traveling doesn't affect the future:

; let c h
; def c
     0    -3
d    1    -2
b    2    -1
h    3    <-
; def c 0
; def c
     0    <-
d    1    +1
b    2    +2
h    3    +3
; let c j
; def c
     0    -1
j    1    <-
b    2    +1
h    3    +2

I would implement this similarly to [how the Forth dictionary is implemented](https://github.com/nornagon/jonesforth/blob/master/jonesforth.S#L166), which was the inspiration for this idea.
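
The history semantics shown in the transcripts above could be sketched as a vector of slots plus a cursor. This is an illustration, not the Forth-style dictionary layout. The `History` type and its method names are hypothetical, and it assumes that `let` writes into the slot after the cursor while leaving later slots untouched, as in the last transcript.

```rust
/// Definition history of one variable. Slot 0 is the undefined state.
struct History {
    slots: Vec<Option<String>>,
    cursor: usize,
}

impl History {
    fn new() -> Self {
        History { slots: vec![None], cursor: 0 }
    }

    /// `let var value`: write into the slot after the cursor,
    /// overwriting it if it exists and keeping any later slots.
    fn set(&mut self, value: &str) {
        self.cursor += 1;
        if self.cursor < self.slots.len() {
            self.slots[self.cursor] = Some(value.to_string());
        } else {
            self.slots.push(Some(value.to_string()));
        }
    }

    /// `def var N`: jump to absolute slot N; false if out of range.
    fn seek(&mut self, index: usize) -> bool {
        if index < self.slots.len() {
            self.cursor = index;
            true
        } else {
            false
        }
    }

    /// `def var +N` or `def var -N`: move relative to the cursor.
    fn step(&mut self, offset: isize) -> bool {
        let index = self.cursor as isize + offset;
        if index < 0 {
            false
        } else {
            self.seek(index as usize)
        }
    }

    /// `out var`: the current value, or None while undefined.
    fn get(&self) -> Option<&str> {
        self.slots[self.cursor].as_deref()
    }
}
```

Replaying the `c` transcript (set `d`, `b`, `d`; seek 1; step +1; set `h`; seek 0; set `j`) leaves the slots as undefined, `j`, `b`, `h` with the cursor on `j`. A failed `step` returning false is what would let a loop stop at the end of the history.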

Owner

Goodness... I think that would be a vector implementation. I don't know how to feel about it now.

Our flow control is still up in the air so ignore the weirdness besides let and def:

#!/bin/qi-potentially

cat file | foreach i {
    let a i
} # store a file in 'a' definitions
def a 1
while true {
    out a
    def a +1 || break
} # print it
Author
Owner

I’m really not into def or having arrays/vectors in qi shell. If you want them you can use ASV.

Speaking of, I’d like to have variables representing ASV characters since typing them is not possible.

Author
Owner

I was thinking sysexits.h(3) values could be exposed by qi to the shell session:

if strcmp var '' {
  format "Usage: {} [whatever]"
  exit EX_USAGE
}

since both Trinity and I use those values in shell scripts, but they are hard-coded to whatever our sysexits.h says, so they’re not portable.
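
As a rough sketch of how the shell might resolve such names, the table below uses the conventional BSD `sysexits.h` values. That is an assumption made for illustration: since the concern here is portability, a real implementation would presumably take the numbers from the platform's own header at build time. `SYSEXITS` and `sysexit` are hypothetical names.

```rust
/// Exit status names with the conventional BSD sysexits.h values.
/// Assumption: hard-coded here only to keep the sketch self-contained.
const SYSEXITS: [(&str, i32); 16] = [
    ("EX_OK", 0),
    ("EX_USAGE", 64),
    ("EX_DATAERR", 65),
    ("EX_NOINPUT", 66),
    ("EX_NOUSER", 67),
    ("EX_NOHOST", 68),
    ("EX_UNAVAILABLE", 69),
    ("EX_SOFTWARE", 70),
    ("EX_OSERR", 71),
    ("EX_OSFILE", 72),
    ("EX_CANTCREAT", 73),
    ("EX_IOERR", 74),
    ("EX_TEMPFAIL", 75),
    ("EX_PROTOCOL", 76),
    ("EX_NOPERM", 77),
    ("EX_CONFIG", 78),
];

/// Looks up the numeric status for a name such as "EX_USAGE".
fn sysexit(name: &str) -> Option<i32> {
    SYSEXITS
        .iter()
        .find(|(n, _)| *n == name)
        .map(|&(_, code)| code)
}
```

With this in place, `exit EX_USAGE` in a script would amount to looking up `sysexit("EX_USAGE")` and exiting with the result.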

Author
Owner

What should we use for subshells? My initial thought was to use curly braces ({}) but if we do that it will conflict with our plans for [format(1)](https://git.tebibyte.media/bonsai/coreutils/issues/43).

emma added this to the `qi(1)` project 2024-03-24 13:31:59 -06:00

> What should we use for subshells? My initial thought was to use curly braces ({}) but if we do that it will conflict with our plans for format(1).

have the curly braces start with a period, zig does it something like this.

const foo = struct {
    num: i32,
};

const x: foo = .{
    .num = 5,
};

does time have to be a shell built in? how about a simple coreutil for bench-marking instead?


> Both Emma and I plan to have math be in a separate utility or utilities and not built into the shell at all.

how different is it going to be from expr

Owner

> > Both Emma and I plan to have math be in a separate utility or utilities and not built into the shell at all.
>
> how different is it going to be from expr

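This was implemented as [rpn(1)](https://git.tebibyte.media/bonsai/coreutils/src/branch/main/docs/rpn.1) in [src/rpn.rs](https://git.tebibyte.media/bonsai/coreutils/src/branch/main/src/rpn.rs), discussed in #21.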

Owner

> does time have to be a shell built in? how about a simple coreutil for bench-marking instead?

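time isn't a [built-in in dash](https://git.kernel.org/pub/scm/utils/dash/dash.git/tree/src/bltin) (there is a times that does something else) nor [in bash](https://git.savannah.gnu.org/cgit/bash.git/tree/builtins), though bash does treat time as a reserved word for timing pipelines. I believe on my laptop time(1) is [the busybox implementation](https://git.busybox.net/busybox/tree/miscutils/time.c). I wonder how simply a good benchmarking tool could be implemented, and whether it's relevant to Bonsai; you should make a new issue for it.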

Author
Owner

I’ve been thinking about it and I think the qi shell needs to be rethought from the ground up. There’s a reason variables are separate from plaintext and I think probably the complications that would come with using plaintext variable names are not worth the convenience.

Owner

> What should we use for subshells? My initial thought was to use curly braces ({}) but if we do that it will conflict with our plans for format(1).

What if a subshell left-brace necessarily preceded a newline or a comment?

if cond {
    out this is a subshell
}
if not cond { # this is fine
    # because only whitespace or comments separate
    # the left-brace from the newline
    out this is a subshell too
    format "this is not a subshell -> "{} \
"because the left brace didn't precede a newline"
}

# though this makes some things funky
out {
    ls .
} {
    ls ..
}

The funkiness would be fine, though, because subshells really aren't often necessary in shell scripting (use xargs(1p)!), and if they are they often benefit from their contents being on a new line anyway (though I tried and failed to find code that demonstrates subshell commands on a new line).
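
The rule proposed above can be sketched as a check on whatever follows the brace on its line: a left brace opens a subshell only if nothing but whitespace or a comment comes after it. The function name `opens_subshell` is hypothetical, and the sketch assumes the caller has already found an unquoted `{`, so quoting is not handled here.

```rust
/// Returns true if the `{` at byte offset `brace` in `line` would open
/// a subshell under the proposed rule: only whitespace or a `#` comment
/// may separate it from the end of the line.
fn opens_subshell(line: &str, brace: usize) -> bool {
    if line.as_bytes().get(brace) != Some(&b'{') {
        return false;
    }
    let rest = line[brace + 1..].trim_start();
    rest.is_empty() || rest.starts_with('#')
}
```

Under this check the brace in `if not cond { # this is fine` opens a subshell, while the `{}` in the format(1) line does not, because a `}` follows the brace.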

Owner

not sure if this has been said already, but i can't quickly find any mention of it so i'll do so now. [due to shebangs](https://github.com/torvalds/linux/blob/c85af715cac0a951eea97393378e84bb49384734/fs/binfmt_script.c#L41), it's basically required that # is the comment character. while we could register a binfmt handler for //! or whatever, that's obv a stupid hack and shouldn't be done. it doesn't look like anyone is considering anything other than #, but i feel it's worth stating regardless.
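
For illustration, here is a simplified version of the check a script loader makes: a file counts as a script only if it begins with the two bytes `#!`, and the first word after them names the interpreter. This is a sketch, not the kernel's code, and `shebang_interpreter` is a hypothetical name. The interpreter then reads that same first line itself, which is why it has to parse as a comment.

```rust
/// Returns the interpreter path from a script's first line, if the
/// file begins with the two bytes `#!`.
fn shebang_interpreter(script: &[u8]) -> Option<&str> {
    let rest = script.strip_prefix(b"#!")?;
    // Only the first line matters; everything after is script body.
    let line = rest.split(|&b| b == b'\n').next()?;
    std::str::from_utf8(line).ok()?.split_whitespace().next()
}
```

A script starting with `#!/bin/qi` yields `/bin/qi`, while one starting with `//!/bin/qi` yields nothing, so it would need the separate binfmt handler mentioned above.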

Author
Owner

I’m going to work on getting this issue split out into a few others to make it easier to work with.

emma closed this issue 2024-08-04 09:57:42 -06:00
emma unpinned this 2024-08-04 09:57:52 -06:00
Reference: bonsai/harakit#8