
Hey, I have a problem with a bash script. It should run the InterProScan REST client. I'm new to bash scripting. I found this script on the web:

#!/bin/bash
i=1
waitevery=30

mkdir -p out

for j in $(find `pwd` -type f -name "*.fa")
do
    echo "Iteration: $i; File: $j"
    filename=$(basename "$j")
    python3 iprscan5.py \
        --goterms \
        --pathways \
        --email=blabla@protonmail.com \
        --outfile=out/${filename} \
        --outformat=tsv \
        --quiet \
        $j & (( i++%waitevery==0 )) && wait
done

Output:

Iteration: 1; File: /home/marcin/interproscan-5.57-90.0-64-bit/interproscan-5.57-90.0/test/xtr.fa
interpro_batch_submit.sh: 18: i++%waitevery==0: not found
Iteration: 1; File: /home/marcin/interproscan-5.57-90.0-64-bit/interproscan-5.57-90.0/test/xts.fa
interpro_batch_submit.sh: 18: i++%waitevery==0: not found
Iteration: 1; File: /home/marcin/interproscan-5.57-90.0-64-bit/interproscan-5.57-90.0/test/xtt.fa
interpro_batch_submit.sh: 18: i++%waitevery==0: not found
Iteration: 1; File: /home/marcin/interproscan-5.57-90.0-64-bit/interproscan-5.57-90.0/test/xtu.fa
interpro_batch_submit.sh: 18: i++%waitevery==0: not found
Iteration: 1; File: /home/marcin/interproscan-5.57-90.0-64-bit/interproscan-5.57-90.0/test/xtv.fa
interpro_batch_submit.sh: 18: i++%waitevery==0: not found
Iteration: 1; File: /home/marcin/interproscan-5.57-90.0-64-bit/interproscan-5.57-90.0/test/xtw.fa

Edit:

I think I forgot to add --sequence, but something was still wrong after I added it.

gringer
MTG
  • Also, I'm assuming you're saving the script as `submit.sh` and then running it. How are you running it exactly? – Ram RS Aug 16 '22 at 17:19
  • @RamRS `wait` will wait until all background jobs have finished. `(( i++%waitevery==0 ))` will be true whenever the current value of `i` modulo `waitevery` is 0, so this is a really neat trick to make the script wait for all background processes to execute every N processes. Quite clever, that. – terdon Aug 17 '22 at 11:15

2 Answers


First of all, never do for foo in $(find ...). That is very bad practice, as it can't handle even slightly odd file names, such as those containing a space. See Bash Pitfall #1.
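
As a small illustration of that pitfall, the sketch below creates one file whose name contains a space and loops over it the unsafe way. The temporary directory and the file name are made up for the demonstration.

```shell
# Demonstration only: the directory and file name are hypothetical.
tmpdir=$(mktemp -d)
touch "$tmpdir/my proteins.fa"

count=0
# The unquoted $(find ...) result is split on whitespace,
# so the single file name is handed to the loop in two pieces.
for f in $(find "$tmpdir" -type f -name "*.fa"); do
    count=$((count + 1))
done
echo "Loop ran $count times for 1 file"
rm -r "$tmpdir"
```

Neither piece is a valid path, so any command run on $f inside such a loop would fail.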

Next, you don't need pwd; just run find -type f -name "*.fa", which searches the current directory by default (with GNU find) and, unlike find `pwd` ..., won't break if your path contains whitespace.

That's just general best practice. The real issue here is that you are probably running the script like this:

sh /path/to/script.sh

Right? And you are using a Debian-based system like Ubuntu, right? On such systems, /bin/sh is not bash but a very basic POSIX-compatible shell called dash, and dash doesn't understand the (( i++%waitevery==0 )) syntax.
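
If the script ever has to run under sh, the same "pause every N jobs" idea can be written with POSIX arithmetic expansion, which dash does understand. This is only a minimal sketch: sleep 0 stands in for the real job, and waitevery is set to 3 purely for illustration.

```shell
i=1
waitevery=3
waited=""
for item in a b c d e f g; do
    sleep 0 &                       # stand-in for a real background job
    # $(( ... )) is POSIX arithmetic expansion; [ ... -eq 0 ] tests the result.
    if [ $(( i % waitevery )) -eq 0 ]; then
        wait                        # block until all background jobs finish
        waited="$waited $i"
    fi
    i=$(( i + 1 ))
done
wait                                # catch the jobs left over after the loop
echo "Waited after jobs:$waited"
```

The counter is incremented on its own line because dash has no ++ operator.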

Since you have a shebang line (#!/bin/bash), you can simply make your script executable (chmod a+x /path/to/script.sh) and then run the script directly like this:

/path/to/script.sh

Or, alternatively, run it explicitly with bash instead of sh:

bash /path/to/script.sh

Either should work, and if your file names and paths are sane, the issues I mentioned above won't matter. But to keep things nice and clean, I would also change the script as follows (note that this assumes you are using Linux and GNU find):

#!/bin/bash
i=1
waitevery=30

mkdir -p out

find -type f -name "*.fa" -print0 |
  while IFS= read -r -d '' j; do
    echo "Iteration: $i; File: $j"
    filename=$(basename "$j")
    python3 iprscan5.py \
        --goterms \
        --pathways \
        --email="blabla@protonmail.com" \
        --outfile=out/"$filename" \
        --outformat=tsv \
        --quiet \
        "$j" & (( i++%waitevery==0 )) && wait
  done
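
To see the NUL-delimited loop on its own, here is a small bash-only sketch with made-up file names. It is a variation rather than the script above: it feeds the loop through process substitution instead of a pipe, so the counter is still visible after the loop ends.

```shell
# Demonstration only: the temporary directory and file names are hypothetical.
tmpdir=$(mktemp -d)
touch "$tmpdir/my proteins.fa" "$tmpdir/plain.fa"

count=0
# -print0 ends each path with a NUL byte and read -d '' splits on NUL,
# so a space inside a file name is kept intact.
while IFS= read -r -d '' j; do
    count=$((count + 1))
    echo "File: $j"
done < <(find "$tmpdir" -type f -name "*.fa" -print0)

# With < <(...) the loop body runs in the current shell, not a subshell.
echo "Loop ran $count times for 2 files"
rm -r "$tmpdir"
```

Running the loop in the current shell also means that a final wait placed after done would cover background jobs started inside it, which may be worth adding if something else needs the results once the script returns.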
terdon
  • I asked how they're running `submit.sh` for precisely the reason you picked: they're probably using `sh` and hitting a POSIX compatibility wall. – Ram RS Aug 17 '22 at 20:02

I'd guess that the specific error that's coming up is because there are spaces missing from the wait line. This seems to be a problem in the source script, i.e.:

$j & (( i++%waitevery==0 )) && wait

should probably be:

$j & (( i++ % waitevery == 0 )) && wait

but for me (following @terdon's reproduction advice via dash -c "i=9; ((++i%5==0)) && echo 'worked!'"), that only changes the thing that is not found from i++%waitevery==0 to i++. That suggests the shell is not treating the contents of (( )) as arithmetic at all, but is trying to run them as a command; dash supports neither the (( )) construct nor the ++ operators. Both are supported in bash, which suggests that this script is being run (as @RamRS commented) in an unexpected way.
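
For comparison, the same test expression can be handed to bash explicitly. A minimal sketch, assuming bash is installed and on the PATH:

```shell
# Same expression as the dash test above, but run by bash.
# ++i makes i equal to 10, and 10 % 5 == 0, so the echo runs.
result=$(bash -c 'i=9; ((++i%5==0)) && echo "worked!"')
echo "$result"
```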

Try running the script directly:

./interpro_batch_submit.sh

Or explicitly using bash:

bash ./interpro_batch_submit.sh

... but see @terdon's answer for a better approach to doing this.
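
If it is unclear which shell ends up interpreting the script, a quick check along these lines may help. It is only a sketch: it relies on bash setting the BASH_VERSION variable for itself, which dash does not do, and the interpreter variable name is made up.

```shell
# BASH_VERSION is non-empty under bash and unset under dash.
if [ -n "${BASH_VERSION:-}" ]; then
    interpreter="bash"
else
    interpreter="not bash"
fi
echo "Running under: $interpreter"

# Show what /bin/sh resolves to on this system.
ls -l /bin/sh
```

Placing the first few lines near the top of a script, and running it both as sh script.sh and as bash script.sh, should make the difference visible.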

gringer