`qi(1)`: The qi shell #8
This issue thread has been created for the discussion of the future features of the Bonsai Computer System’s default shell.
a problem i often run into is prepending and inserting lines to a file or stream of text. i was thinking our shell could have a redirect similar to `>>` in POSIX shell, but allow prepending and appending: for prepending, `>^` and for appending, `>$`. this would mean no more `>>`, though.

I would prefer not to support appending or prepending, which can be accomplished with subshells or cat(1p). I would prefer any functionality that can be moved to external utilities be so to keep code minimal and, in the case of scripting using the shell, idiomatic. Currently there are many ways to append in sh(1p) in POSIX:
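A rough sketch of three such approaches, assuming two scratch files named `in` and `epilogue` (the file names and the temporary directory are only for illustration):

```shell
#!/bin/sh
# Work in a scratch directory so nothing outside it is touched.
dir=$(mktemp -d) || exit 1
cd "$dir" || exit 1

printf 'body\n' > in
printf 'end\n'  > epilogue

# 1. cat(1p) concatenates its operands in the order given.
cat in epilogue > out1

# 2. A brace group lets several commands share one redirection.
{ cat in; cat epilogue; } > out2

# 3. A subshell does the same in a child shell environment.
( cat in; cat epilogue ) > out3

result1=$(cat out1)
result2=$(cat out2)
result3=$(cat out3)

cd / && rm -rf "$dir"
printf '%s\n' "$result1"
```

All three outputs should be identical: the contents of `in` followed by the contents of `epilogue`.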
All of these solutions are also ways to prepend (`in` becomes `prologue`, `epilogue` becomes `in`).

A stream editor (literally, not like sed(1p)) that supports insertion of text before arbitrary lines in standard input may be nice.
trinity and i talked about this already but we want to replace case/esac with match statements that act similarly to Rust matches.
i would also like to see curly braces ({ }) used instead of if then fi and for/while do done, but if someone has a better solution im all ears
no elif. else if please
Curly braces could open a block the same way they do in C, where variables created inside the block are local to it and can't escape but variables from outside the block can be modified as usual, and an arbitrary amount of statements can go inside blocks. So `if cond { statements... }`, `if cond statement;`. Separating `cond` from `statement` would be a problem though. Parens are kinda gaudy for the purpose. `if(foo) bar(baz);` doesn't visually indicate well enough that `if` is not a function nor similar.

Then `if cond statement; else if cond statement; else statement;` can look like `(if (cond) (statement) (else (if (cond) (statement) (else (statement)))))` internally, the same way C does it, which is pretty nice.

I think `&&` and `||` should go or they should replace `if` entirely.

That doesn't look terrible but I'm not the biggest fan. But then of course how does `if(cond && cond)` work?..

i would prefer not to see scoping in the shell. as for the way conditionals work, what exactly is the issue with this?:
This is probably for the best. I mention it because this is how C functions with statement blocks.
I find how C does statement blocks to be delightful, with statement blocks being collections of statements but operating as a single statement. e.g.
This strikes me as elegant and simple to implement, and the issue of separating the condition from the body arises to support the `if cond statement` pattern. C uses parens (`if (cond) statement;`).

Probably the best option, though, is yours; to enforce curlies all the time, requiring a statement block specifically as the body of a flow control statement. This is fundamentally more readable and also as a result prevents the classic error of adding statements to the body of an `if` only to realize you forgot to put all the statements in a block.

Both Emma and I plan to have math be in a separate utility or utilities and not built into the shell at all.
I don't like the shell built-in `shift`.

It seems to mainly exist as an equivalent to the C idiom `++argv; --argc;` which is (given the benefit of >30 years of hindsight since C89) a complicating and not very readable way to save memory processing arguments.

Its definition in POSIX makes it unintuitive for shell scripters. Emma should elaborate here when fae has time, fae has more experience with this.

It necessitates storing `"$0"` as a new shell variable to be able to continue using it, which shell authors writing diagnostics messages and following good practice always want to do.

It's inflexible; it can only be used for the one task of shifting through arguments.

`shift` is the only real way to iterate over arguments (`$1`, `$2`, `$3`...) in shell. Particularly with the following idiom:

What if variables could hold the names of other variables? Not like pointers in C but simply:

This could be used for better argument usage:

`for` in this example is a shell built-in that, for each line in standard input, sets the given variable (in this case `i`) to that line's content before running the commands in the curly braces. seq(1) in this usage prints a new-line delimited list of integers in the sequence described in its arguments. `out` is an analogue to echo(1) as currently being described in #27.

This might be a terrible idea that can be used to write terrible scripts. Also, there is no real concept of betta(1) yet, so I'm basing my examples off sh(1p).
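For readers who haven't met it, the `shift` iteration idiom in sh(1p) usually looks something like the sketch below. `each_arg` is a hypothetical helper name, not anything proposed in this thread:

```shell
#!/bin/sh
# Hypothetical helper: print each argument on its own line, consuming
# the positional parameters one at a time with shift.
each_arg() {
    while [ "$#" -gt 0 ]; do
        printf '%s\n' "$1"   # $1 is always the next unprocessed argument
        shift                # drop $1; $2 becomes $1, and $# shrinks by one
    done
}

each_arg one 'two words' three
```

Each pass handles `$1` and then discards it, so the loop ends once `$#` reaches zero.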
shift(1) does not modify the value of `$0`, actually. From shift(1p):

Maybe we could have a shift-like utility that will “shift” any variable with IFS-delineated text (including $@)?

Fuckin hell I've been using `argv0` for years.

Title changed from "betta(1): The betta shell feature thread" to "qi(1): The qi shell feature thread"

The name is changed because Trinity and I verbally discussed a shorter name to match the short and simple vibe of `/bin/sh`. In the future we may consider an interactive shell that wraps `qi(1)` with some convenience features.

Title changed from "qi(1): The qi shell feature thread" to "`qi(1)`: The qi shell feature thread"

Thoughts on this?
https://drewdevault.com/2023/07/31/The-rc-shell-and-whitespace.html
I've been asked to submit my grumblings about POSIX shell quoting to this issue. I am lacking sufficient energy to write eloquently (or at all) about this so here's an example.
Plan 9's rc(1) doubles quote runes to escape them:
Which I've always felt is quite nice.
The reason sh(1p)'s quotes are fucked is because not only is the backslash used to escape quotes in `"`-wrapped strings but it's also used for regular escape sequences. For example:

POSIX shell quoting is abhorrent and the worst part about shell scripting by far. Escape sequences shouldn't be supported except by a theoretical format(1) (printf(1p) but Bonsai and improved).
Drew Devault's rc(1) bastardization (it is probably nice in practice) seems consistent but still as overcomplicated as shell is.
I'd like Plan 9 rc(1) quotes/escapes without a difference between `"`- and `'`-wrapped strings and no escape sequences besides `""` or `''`, but I think this is better rather than good. I'm not sure what good would look like. Probably something to do with ASV.

I like single quotes being the way to literal-ize strings, i.e., `"$(cat)"` is expanded to the output of `cat` but `'$(cat)'` is literally `$(cat)`.

The `"$()"` pattern is horrid.

`()` - subshell

`$()` - oh wait, i need standard output

`"$()"` - oh wait, quoting

My ideal (Lisp!) shell syntax would be
qi(1) isn't intended to be a Lisp-oid but probably I will write something like that someday.
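The single- versus double-quote behavior described above, as it works in sh(1p) today (a minimal sketch; `echo hello` stands in for `cat` so the example doesn't wait on standard input):

```shell
#!/bin/sh
double="$(echo hello)"   # the command substitution runs; value is its output
single='$(echo hello)'   # nothing runs; value is the literal text

printf '%s\n' "$double" "$single"
```

The first line printed is `hello`; the second is the unexpanded text `$(echo hello)`.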
Should piping be done by the shell?
I've had an old program idea that I'm still working on called pspipe(1) - it pipes given commands. So `pspipe [ cat ] [ less ]` is equivalent to `sh -c 'cat | less'`. It was conceived because I planned to write a shell with the express purpose of doing as little as possible - to offload functionality usually achieved by shell syntax alone to other programs that can be used no matter what the present shell is, and in cases where their use might be more intuitive or elegant than the shell constructs.

There's currently `qi` and `betta` planned - `qi` being the core, simple shell useful for shebangs and scripting, and `betta` being the better (for interactive use) shell with all the bells and whistles. I think `|` could be left out of `qi` and a piping utility used instead, and `betta` could either have built-in piping or automatically use a piping utility when it sees the pipe syntax.

I'm not sure how much I like this myself but my intent is to have as little code in `qi` as possible so it's easily dissectible by anyone who wishes to use it (and knows Rust|C) and has an hour to burn.

I think piping and redirects should be in both shells and I’m not too sure about the shells actually having differing syntax. I do think we should interrogate how piping and redirects work in POSIX and work from there.
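As one data point for that interrogation: the plumbing behind `|` can already be reproduced without the pipe operator by using a named pipe, which is roughly what an external piping utility would have to arrange. This is only a sketch of the idea, not how pspipe(1) works; the file names are made up:

```shell
#!/bin/sh
# Connect two commands without `|`, using a FIFO created by mkfifo(1p).
dir=$(mktemp -d) || exit 1
fifo=$dir/pipe
mkfifo "$fifo" || exit 1

# Writer runs in the background; its standard output goes into the FIFO.
printf '%s\n' banana apple cherry > "$fifo" &

# Reader takes the FIFO as its standard input.
sorted=$(sort < "$fifo")

wait            # reap the background writer
rm -rf "$dir"
printf '%s\n' "$sorted"
```

It still leans on redirection and `&`, so it moves the problem rather than removing it, which may itself be an argument for keeping pipes in the shell.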
Title changed from "`qi(1)`: The qi shell feature thread" to "`qi(1)`: The qi shell"

The way POSIX shell does variable assignment is awful.

Shell variable assignment isn't consistent with most syntax and really finicky about whitespace. I propose a better way:

`set` would need to be a shell builtin to change the local environment; it would have the usage `set [variables...] [value]` to facilitate setting multiple variables at the same time:

POSIX shell already has a `set` builtin for modifying `"$@"` and its components (as well as configuring the behavior of the shell itself); we should rethink the functionality offered by POSIX `set` (reserved shell variables for configuration, for example).

I like this, but could we use `let` instead?

How will variables behave?
POSIX shell doesn't use variable declarations; any variable is the empty string at start (giving `test -z` its utility).

We could copy that behavior but it sort of sucks and makes typos sometimes catastrophic. Remember the Steam `rm -rf "$STEAMROOT/"*` bug? An unset variable would evaluate as the empty string, thus making the expression used `/*` and nuking everything.

In other languages (Rust, for example), `let` expressions declare variables (and their nonexistence prior to that expression). I think we should use `let` and treat variable use before declaration as an error.

We could also use `set` to declare environment variables to export to child processes, like sh(1p)'s `export`. I don't think it should be possible to `export` variables already declared with `let`, because knowing at declaration whether or not a variable will be `export`ed seems to me to be more useful than being able to change it.

We could default to variables being immutable like Rust but I don't know how useful this is and it seems like it would cause more problems than it would solve.
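The unset-variable hazard, and the two guards sh(1p) already offers against it (`${var:?}` and `set -u`), can be sketched as follows. `ROOT` is a stand-in variable name, and `echo` replaces `rm` so nothing is actually deleted:

```shell
#!/bin/sh
# ROOT is deliberately never assigned, standing in for a typo or failed lookup.
unset ROOT

# Default behaviour: the unset variable expands to nothing, leaving "/*".
plain=$(echo "rm -rf $ROOT/*")

# ${ROOT:?} makes the subshell fail instead of expanding to the empty string.
guarded=$( (echo "rm -rf ${ROOT:?}/*") 2>/dev/null ) || guarded=refused

# set -u turns any use of an unset variable into an error.
strict=$( (set -u; echo "rm -rf $ROOT/*") 2>/dev/null ) || strict=refused

printf '%s\n' "plain:   $plain" "guarded: $guarded" "strict:  $strict"
```

Treating use-before-declaration as an error, as proposed, would make the `strict` behaviour the default rather than something scripts must opt into.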
I don’t see much point to this.
I would prefer to clash with POSIX shell as little as possible on syntax. Maybe `set` could be an exception but if there’s a better idea I’d rather that.

This makes sense to me.
I have further thoughts on shell quoting.
When I think of program execution I think of the `exec` function family in C's `<unistd.h>`:

Or `subprocess.run` in Python:

Or Rust's `std::process::Command`:

What these all have in common is that they have clear distinction between arguments, and if one wanted to use a variable as an argument it would be easy:
Meanwhile POSIX shell wants you to die:
No wonder people are desperate to use any interpreted programming language as a shell, asking if Python is a good fit and actually using Common Lisp. That being said, quoting every shell argument is at best inconvenient, with the example `"mm" "-i" "a b"` being 4 extra keypresses to type and up to 8 including the shift key.

I think we should start by mandating some useful rules that are already often followed by cautious scribes:
I don't think this will be very controversial. While escapes are convenient (an easy way to avoid navigating back to the beginning of the line, adding a quote, and then going back to the end just for one or two spaces) they're easy to mess up catastrophically:
This also seems non-controversial. I have more to say but am out of time to write so will comment this right now.
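The argument-boundary problem in sh(1p) comes down to field splitting of unquoted expansions. A minimal sketch, where `count_args` is a hypothetical helper that just reports how many arguments it received:

```shell
#!/bin/sh
# Hypothetical helper: report how many arguments were received.
count_args() { echo "$#"; }

file='a b'                      # one value that happens to contain a space

unquoted=$(count_args $file)    # field splitting turns it into two arguments
quoted=$(count_args "$file")    # quoting keeps it as one argument

echo "unquoted: $unquoted, quoted: $quoted"
```

With the default `IFS` this prints `unquoted: 2, quoted: 1`.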
Alright, this is the continuation of my last comment.
The behavior of the traditional POSIX shell with regards to unquoted variable expansion is useful, sometimes, but usually unwanted and a pain to deal with. In Python if I wanted that behavior I'd use str.split:
The C standard library has no such helper function (the functionality offered by `str.split` could be replicated though) and Rust is as of now beyond me.

Accidentally tapped the button. That thought's incomplete and I'll finish it later.
alright i haven't slept in a few days and have important things to be doing rn, what better idea than to propose shell syntax. this is poorly thought-out and full of holes. have fun deciphering and feel free to harass me if that takes too long
i haven't read through like any of the posts here so i might repeat or redefine things that've been discussed already
i will be using words here. maybe (read: probably) even misusing words. here's a best-effort explanation of my nonsense:
editor's note: i simplified this quite a bit, only one word remains standing. you're welcome.
emma and i were discussing variable assignment and the potential usage of a `let` term, which is used to define terms. some pseudoishcode snippets from the conversation to elucidate on that a bit:

some important observations from this that i've already made on your behalf:

- `let` itself can be redefined. also, spaces symbols are valid. `#` can be redefined.
- `let` does this. `hru` (we have an `hru`?) also does this. let's call these argument-taking terms "operator terms", or maybe just "operators", and their arguments "operands".
- `#` as an operator that takes all subsequent terms as its operands and does nothing
- `let`'s operands, they're just terms. they may be taken literally, but they may also be operator terms.

so we have operators, which operate on and consume the terms to their right. these terms may be operators themselves, but they may also just be literals. do you smell it yet? i smell it. it's the smell of polish notation. alright so what if we did polish notation in more places.
suppose there was a `pn(1)` utility, serving as the prefix version of `rpn(1)`:

as the reader it is now your job to come up with a more exciting example than that bc that's as far as my thoughts are willing to go at this time of night.
there are issues with this. the main one that jumps out is that in your typical pn language, the interpreter is aware of how many terms will be consumed as operands by a given operator. however, when an operator can be an arbitrary executable, the number of terms consumed is completely ambiguous. there is a fix, which is to let the programmer define those bounds themselves:
oops. we should probably not do this.
realized that about halfway through typing out this textwall but was told to post it anyways so here ya go
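To make the arity point concrete, here is a minimal prefix-notation evaluator in sh(1p) that only works because every operator is known to take exactly two operands. `pn` here is a hypothetical shell function, not the proposed utility, and `x` stands in for multiplication purely to sidestep globbing of `*`:

```shell
#!/bin/sh
# Evaluate a prefix (Polish notation) integer expression in which every
# operator (+ - x /) takes exactly two operands.
pn() {
    # Reverse the tokens so they can be scanned right to left.
    rev=
    for tok in "$@"; do rev="$tok $rev"; done

    set --   # the positional parameters now serve as the operand stack
    for tok in $rev; do
        case $tok in
            +|-|x|/)
                a=$1 b=$2; shift 2      # pop left and right operands
                case $tok in
                    +) r=$((a + b)) ;;
                    -) r=$((a - b)) ;;
                    x) r=$((a * b)) ;;
                    /) r=$((a / b)) ;;
                esac
                set -- "$r" "$@" ;;     # push the result
            *)  set -- "$tok" "$@" ;;   # push a literal operand
        esac
    done
    echo "$1"
}

pn - 10 x 2 3
```

`pn - 10 x 2 3` reads as 10 - (2 × 3) and prints `4`. Once an operator can be an arbitrary executable, the `shift 2` has no fixed number to use, which is the ambiguity described above.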
This extends from the syntax I was considering yesterday:
why not update argv0? just curious, i don't have any strong opinions on the matter
Because `string` expands to `str` and the behavior is consistent with how aliasing works in POSIX shell.

A question to ask is if I do this:
should this:
or this:
occur?
I like the idea of preserving the literal string (`'a'`) because otherwise this becomes a whole lot more complex to utilize.

Actually on second thought I just realized that this is kind of pointless. I can’t think of any scenario where preserving the `'a'` would make sense, considering if you run `c` then it is intuitive that it would try to run `a`, expanding it, anyway. If you want a string literal `'a'` in the output you can do it like this:

I think preserving parameter expansion from POSIX shell is a good idea:
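For reference, these are some of the parameter expansions sh(1p) already provides. The variable names and the path are made up for illustration:

```shell
#!/bin/sh
path=/usr/local/bin/qi
unset editor

base=${path##*/}       # strip the longest prefix matching */
dir=${path%/*}         # strip the shortest suffix matching /*
chosen=${editor:-ed}   # fall back to a default when unset or empty
length=${#base}        # length of the value in characters

printf '%s\n' "$base" "$dir" "$chosen" "$length"
```

This prints `qi`, `/usr/local/bin`, `ed` and `2`, one per line.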
I guess this specific syntax won’t work if we’re planning on using `{ }` as a subshell, but we could do something similar.

I was also thinking that perhaps all quoting should make characters literal so that you have to wrap them in our equivalent to `{ }` to use them. I’m not sure how good of an idea that is but it would help with not having to do something like:

Perhaps there should be a way to change which file is used for randomness, in the shell. Like a `random_source` variable that expands to a file (like `/dev/random`).

This could potentially be catastrophic though (try changing the `random_source` to `/dev/zero` and running a program that needs to be secure). I'm also not sure how this would be implemented.

This would replicate the functionality of GNU shuf(1)'s `--random-source` in a way that makes more sense to me (related to #55).

I really like this though some caveats come to mind that I'll mention when I think them through.
It would be nice to have a non-redefinable shell command `def`:

All definitions are saved and `def` traverses definitions, so it's possible to use a previous definition of a variable.

This is inspired by Forth though I'm not very familiar with it.

On second thought `def` when used without arguments should display variable usages. Here are some examples of its use to help explain it.

First let's set some variables:
Variable content history with `def` with no arguments:

This is a mess of information. Let's pick out one variable, `a`:

Let's see `c` now:

So this is a linear history of `c`'s assignments. For each row showing assignment information, the first column refers to the values, the second refers to the number of definitions so far including itself, and the third refers to the relative placement to the current definition (marked with a `<-`).

You can assign variables to previous definitions with `def [variable] [placement]`:

And use the relative placements with `+n` or `-n`:

Thankfully, changing things when time traveling doesn't affect the future:
I would implement this similarly to how the Forth dictionary is implemented which was the inspiration for this idea.
Goodness... I think that would be a vector implementation. I don't know how to feel about it now.
Our flow control is still up in the air so ignore the weirdness besides `let` and `def`:

I’m really not into `def` or having arrays/vectors in qi shell. If you want them you can use ASV.

Speaking of, I’d like to have variables representing ASV characters since typing them is not possible.
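In sh(1p) today, the ASCII delimiter characters can be put into variables with printf(1p) octal escapes. The variable names below (`FS`, `GS`, `RS`, `US`) are only a guess at what such variables might be called:

```shell
#!/bin/sh
# ASCII delimiter characters, written as octal escapes for printf(1p).
FS=$(printf '\034')   # file separator
GS=$(printf '\035')   # group separator
RS=$(printf '\036')   # record separator
US=$(printf '\037')   # unit separator

# Join three fields, two of which contain spaces, into one record.
record="first field${US}second${US}third field"

# Split the record back into positional parameters on the unit separator.
old_ifs=$IFS
IFS=$US
set -- $record
IFS=$old_ifs

echo "$# fields; field 1 is '$1'"
```

Because the separator is not whitespace, fields containing spaces survive the round trip, which is most of the appeal of ASV over arrays.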
I was thinking `sysexits.h(3)` values could be exposed by qi to the shell session:

since both Trinity and I use those values in shell scripts, but they are hard-coded to whatever our sysexits.h says, so they’re not portable.
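For context, this is roughly what the hard-coding looks like in a script today. The numbers are the ones from BSD `<sysexits.h>`; `check_args` is a hypothetical function used only to show one of them in use:

```shell
#!/bin/sh
# Values as defined in BSD <sysexits.h>. Hard-coding them like this is
# the portability problem being described.
EX_OK=0
EX_USAGE=64
EX_DATAERR=65
EX_NOINPUT=66
EX_UNAVAILABLE=69
EX_SOFTWARE=70
EX_OSERR=71

# Hypothetical script fragment: require exactly one argument.
check_args() {
    if [ "$#" -ne 1 ]; then
        printf 'Usage: %s file\n' "$0" >&2
        return "$EX_USAGE"
    fi
    return "$EX_OK"
}

status=0
check_args 2>/dev/null || status=$?   # called with no arguments
echo "status: $status"
```

If the shell exposed these names itself, the block of assignments at the top would disappear and the values would follow the system's own header.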
What should we use for subshells? My initial thought was to use curly braces (`{}`) but if we do that it will conflict with our plans for `format(1)`.

have the curly braces start with a period, zig does it something like this.
does `time` have to be a shell built in? how about a simple coreutil for bench-marking instead?

how different is it going to be from `expr`

This was implemented as rpn(1) in src/rpn.rs, discussed in #21.

`time` isn't a built-in in dash (there is a `times` that does something else) nor in bash. I believe on my laptop time(1) is the busybox implementation. I wonder how simply a good benchmarking tool could be implemented, and if it's relevant to Bonsai - you should make a new issue for it.
qi
shell needs to be rethought from the ground up. There’s a reason variables are separate from plaintext and I think probably the complications that would come with using plaintext variable names are not worth the convenience.What if a subshell left-brace necessarily preceded a newline or a comment?
The funkiness would be fine, though, because subshells really aren't often necessary in shell scripting (use xargs(1p)!), and if they are they often benefit from their contents being on a new line anyway (though I tried and failed to find code that demonstrates subshell commands on a new line).
not sure if this has been said already, but i can't quickly find any mention of it so i'll do so now. due to shebangs, it's basically required that `#` is the comment character. while we could register a binfmt handler for `//!` or whatever, that's obv a stupid hack and shouldn't be done. it doesn't look like anyone is considering anything other than `#`, but i feel it's worth stating regardless.

I’m going to work on getting this issue split out into a few others to make it easier to work with.