To list the line count and the name of each script, I used
find /usr/bin /usr/sbin /bin /sbin -maxdepth 1 -type f -perm /o+x -print0 | (T=0; while IFS= read -r -d '' NAME ; do T=$((T+1)); file "$NAME" | grep -qe 'shell script' || continue ; wc -l "$NAME" ; done ; printf '%d files total\n' "$T" >&2 ) | awk ' NF>1 { L=L+$1 ; n=n+1 ; print } END { printf "%d scripts containing %d lines\n", n, L }'
And that's proof that bash sucks for anything other than basic stuff.
Well, it does use find, bash, file, grep, wc, and awk to get the work done.
You need to first construct a list of executable files. In Linux, all bytes except NUL and the slash are valid in file names, so I used -print0 to print the path to each file, NUL-separated.
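As a minimal sketch of the NUL-safe reading side (the single directory here is just an example, and bash is assumed for the process substitution), the shell end of such a pipe looks like this:

```shell
#!/bin/bash
# Read NUL-separated names safely: IFS= keeps leading/trailing
# whitespace, -r keeps backslashes, and -d '' splits at NULs.
count=0
while IFS= read -r -d '' name; do
    count=$((count + 1))
done < <(find /usr/bin -maxdepth 1 -type f -print0)
printf '%d files\n' "$count"
```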
Next, you need to pick out the files that are shell scripts: just the task for file and grep -qe 'shell script', as the former's output includes 'shell script' for both POSIX ('POSIX shell script') and Bash ('Bourne-Again shell script') scripts, and the latter returns success if the pattern is found, and failure otherwise. I used a Bash/POSIX shell loop, with continue skipping all non-matching file types. For shell scripts, we want the line count, which wc -l was designed to report; its output is one line per file, with the line count first and then the file name or path. The loop also counts the total number of files seen, and prints that count to standard error. For the count to survive the loop, the entire loop section must run in a single subshell.
Finally, the awk part counts the number of shell-script files and sums up their line counts, printing each input record as it goes.
Because any file name with a newline in it will make the wc output confuse awk, it does not actually work for all possible file names, though.
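To see the failure mode concretely, create a file whose name contains a newline (in a throwaway directory from mktemp); wc then emits what looks like two records to any line-oriented reader:

```shell
# A file name containing a newline splits the wc output across two
# lines, so a line-oriented awk reader miscounts records.
dir=$(mktemp -d)
printf 'echo hi\n' > "$dir/bad
name.sh"
wc -l "$dir/bad
name.sh"        # the count and the name now span two output lines
rm -rf "$dir"
```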
This is better:
find /usr/bin /usr/sbin /bin /sbin -maxdepth 1 -type f -perm /o+x -print0 | xargs -r0 file -00 | gawk 'BEGIN { RS=FS="\0" } { files++; name=$0; getline; if ($0 !~ /shell script/) next; scripts++; n=0; RS="\n"; while ((getline < name) > 0) n++; close(name); RS="\0"; lines+=n; printf "%9d %s\n", n, name } END { printf "%d files, of which %d shell scripts, containing %d lines\n", files, scripts, lines }'
or, split into logical lines:
find /usr/bin /usr/sbin /bin /sbin -maxdepth 1 -type f -perm /o+x -print0 \
| xargs -r0 file -00 \
| gawk '
    BEGIN {
        RS = FS = "\0"
    }
    {
        files++
        name = $0
        getline
        if ($0 !~ /shell script/)
            next
        scripts++
        n = 0
        RS = "\n"
        while ((getline < name) > 0)
            n++
        close(name)
        RS = "\0"
        lines += n
        printf "%9d %s\n", n, name
    }
    END {
        printf "%d files, of which %d shell scripts, containing %d lines\n", files, scripts, lines
    }'
The "find /usr/bin /usr/sbin /bin /sbin -maxdepth 1 -type f -perm /o+x -print0 | xargs -r0 file -00" part of the command produces a sequence of "filename\0type\0", i.e. file name and type as text, separated by NULs. It does this by piping the NUL-separated file names or paths (-print0) to xargs, which splits them back apart (-0 means NUL-separated, and -r says not to run the command if there are no parameters to supply) and executes file -00 with each file name or path as a separate parameter, as many as fit at a time. The -00 option tells file to output a single NUL after each file name, and another after the type of the preceding file.
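You can inspect that filename\0type\0 stream by translating the NULs to newlines; /bin/sh is assumed to exist here, which holds on virtually every Unix system:

```shell
# Each file -00 record is two NUL-terminated fields: the file name,
# then the type description. tr makes the boundaries visible.
file -00 /bin/sh | tr '\0' '\n'
```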
Other than executing and piping the commands together, it does not use Bash at all.
gawk can handle NUL-separated records by simply setting RS="\0". I also set the field separator to the same, because we don't want gawk to waste time splitting records (or lines) into fields. The main rule is applied to each executable file found in those directories. If file did not report it as a shell script, it is only counted (as a file, not as a script). For shell scripts, we count the number of lines by reading the file line by line (temporarily using newline as the record separator), print the per-script count, and update the tallies. The END rule is applied after all input has been processed, and it prints the summary.
In POSIX C, we could use nftw() to walk the trees, or scandir() to obtain the list of files in each directory. After checking the stats of each file (to make sure we only look at regular, other-executable files, matching the -type f -perm /o+x tests above), we can read the first line using getline(), and if it looks like a valid shebang line for bash, dash, ash, or sh, count the number of lines in the file (by reading each line); otherwise, ignore it. Count the number of files, the number of scripts, and the number of lines in script files, print them, and you're done.
Do note that while the shell script stanza is concise, it also uses more than one process at a time. xargs will buffer file names so that we execute file the fewest number of times; this is faster than executing it once for each file (which you can do with xargs -r0 -n 1 file -00 instead). The piped processes run in parallel, all at the same time, which means that on multi-core machines at least three cores are used, if available. To do the same in C, you'd need to use threads, since otherwise a single core (at a time) performs all the tasks. (You could use e.g. popen(), but executing anything once for each file found will be slow.)
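The batching itself is easy to observe; in this small sketch, sh -c 'echo "$#"' simply reports how many arguments each invocation received:

```shell
# Without -n, xargs packs all three names into one invocation:
printf 'a\0b\0c\0' | xargs -r0 sh -c 'echo "$#"' sh      # prints 3
# With -n 1, the command runs once per name:
printf 'a\0b\0c\0' | xargs -r0 -n 1 sh -c 'echo "$#"' sh # prints 1 three times
```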