|New terms in this lesson||
|$#|Number of arguments|
|$(cmd)|Save output of cmd|
|$1, $2, ...|Individual positional parameters|
|$?|Exit status of previous command|
|$@|All positional parameters|
Ultimately, we want the script to work on the command line like this:
$ revComp sequence
[reverse complement of sequence]
From within the script, there has to be a way to access what the user types on the prompt. The technical term for this is positional parameters, in that you can access them by “position” on the command line as variables in the script.
$ scriptname first second third etc
#!/bin/bash
echo "$1 equals first"
echo "$2 equals second"
echo "$3 equals third"
echo "$4 equals etc"
There are also special variables that are useful:
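As a rough sketch of the special variables named in the table at the top of this lesson, the function below reports the number of arguments and loops over all of them. Functions receive positional parameters the same way scripts do, so the hypothetical function show_args stands in for a script here; its name and the sample arguments are made up for illustration.

```shell
#!/bin/bash
# show_args is a hypothetical helper; inside a function, $#, $@ and $1
# refer to the function's own arguments, just as they would in a script.
show_args() {
    echo "count: $#"            # number of arguments
    echo "first: $1"            # an individual positional parameter
    for arg in "$@"; do         # all positional parameters, one at a time
        echo "arg: $arg"
    done
}

show_args first second "third and fourth"
```

Quoting "$@" keeps "third and fourth" together as a single argument, so the count reported is 3 and the loop runs three times.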
Let's change revComp to work on command line arguments.
#!/bin/bash
# Assign a DNA sequence to a variable (sequence):
#sequence=ACTGTACGGTACAC
sequence=$1
echo $sequence | tr [ACTGactg] [TGACtgac] | rev
I have commented-out the hard-coded declaration of sequence and replaced it with a statement that uses the first argument of the command.
Now test it by running the script, but giving it a sequence on the command line.
$ revComp ACTGTACGGTACAC
GTGTACCGTACAGT
$ revComp TAT
ATA
$ revComp GGG
CCC
But what happens if we give it something unanticipated? Like nothing, or something that's not a sequence?
$ revComp
$ revComp trash
hstra
The most common way to handle unexpected input is to have your program tell you how to use it, or what you did wrong.
In order to respond to different types of input, we need to read variables and react to their values.
$ revComp
revComp -- Return the reverse complement of a DNA sequence.
Usage: revComp sequence
We're going to tell the user (that's you, or whoever else runs your script) how to run the script when it is not given any arguments.
There are a few approaches here, but an easy way is to check if the 1st positional parameter is blank. This indicates that there were no arguments supplied on the command line.
Here is the syntax:
if [ -z "$1" ]; then
    # no arguments given
    # print usage message and exit
fi
if condition; then
    statement(s)
elif other condition; then
    statement(s)
else
    statement(s)
fi
if must be followed by conditional expressions. The keyword then marks the beginning of statements that are executed if the conditionals resolve to TRUE. Optionally, an if construct can contain elif and else clauses. The fi (“if” backwards) closes the statement.
Spaces are important: The single bracket enclosure, isolated by spaces, signifies a simple conditional. A conditional always evaluates to true or false. I use TRUE and FALSE in all caps from here on in order to signify these special boolean values.
You can think of it as asking a question that is expressed through operators and operands. Is one quantity greater than another? Does a file exist? Is a variable blank?
The last question, “is a variable blank?”, is evaluated with the -z comparison operator.
-z evaluates whether a string is null (or has zero length).
-n evaluates whether a string is not empty (or has greater-than-zero length).
Let's do an exercise to get a feel for it.
[ -z "" ]      # evaluates to TRUE
[ -n "" ]      # evaluates to FALSE
[ -z "hi" ]    # evaluates to FALSE
[ -n "hi" ]    # evaluates to TRUE
Let's incorporate an alternate, but equivalent syntax on the command line:
$ test -z ""
$ [ -z "" ]
$ test -z "nonempty"
Unfortunately, there is no output on the command line to show you the outcome of the conditional. It is behind the scenes.
To make use of the hidden outcome, you can use the if statement:
$ if test -z ""; then echo "it is empty"; fi
Up-arrow and change the tested string to something non-null, such as:
$ if test -z "funkadelic"; then echo "it is empty"; fi
Obviously here the conditional evaluates to false, and nothing is printed. Let's add statements for the alternate outcome:
$ if [ -z "funkadelic" ]; then echo "it is empty"; else echo "it is not empty"; fi
it is not empty
A convenient shorthand for the above is available for statement blocks that only have one line, like the examples above.
$ [ -z "funkadelic" ] && echo "it is empty" || echo "it is not empty"
it is not empty
There is much less typing in this example, but the operators are a little obscure. The && is a boolean AND operator, but here you can read it as then, while the || is a boolean OR operator, but here you can read it as else.
We'll see more about how this works when we talk about exit statuses in this lesson, and compound conditionals in a future lesson.
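One caution about the shorthand, sketched below: a && b || c is not a true if/else, because c also runs when a succeeds but b fails. The built-in commands true and false stand in for real commands here, and the function names shorthand and full_if are hypothetical.

```shell
#!/bin/bash
# shorthand: the || branch runs if EITHER the test or the && command fails
shorthand() {
    true && false || echo "else branch ran"
}

# full_if: the else branch runs only when the condition itself is FALSE
full_if() {
    if true; then
        false
    else
        echo "else branch ran"
    fi
}

echo "shorthand: $(shorthand)"
echo "full if:   $(full_if)"
```

The shorthand prints "else branch ran" even though its condition was TRUE, while the full if statement prints nothing. With echo as the middle command this rarely matters, since echo almost never fails.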
When you type on the command line, semicolons are a substitute for hitting the return key. Try doing the above example, but hitting the return key everywhere there is a semicolon.
$ if test -z "funkadelic"
>
Hold up. See your prompt has changed to a '>'. That means that more input is expected. If you screw up here, you'll get a syntax error, so the stakes are high!
$ if test -z "funkadelic"
> fi
-bash: syntax error near unexpected token `fi'
The shell is not having it.
The shell needs a then keyword to tell it that you're done building the conditional. It's just the way it is.
$ if test -z "funkadelic"
> then
> echo "yes"
> else
> echo "no"
> fi
no
If you up-arrow, you'll see the single-line version.
$ if test -z "funkadelic"; then echo "yes"; else echo "no"; fi
Let's add this decision-making capacity to our script, and also print a usage message:
#!/bin/bash
if [ -z "$1" ]; then
    echo "revComp -- Return the reverse complement of a DNA sequence."
    echo "Usage: revComp sequence"
    exit
fi
sequence=$1
echo $sequence | tr [ACTGactg] [TGACtgac] | rev
Try it out!
$ ./revComp
revComp -- Return the reverse complement of a DNA sequence.
Usage: revComp sequence
$ ./revComp ATG
CAT
The code now gives a usage message and exits when no argument is supplied.
Why do we use
[ -z "$1" ]
instead of
[ -z $1 ]
?
The answer is an issue of best practice. In a loosely-typed language such as bash, the programmer has to enforce certain constraints. This practice is taught in this class, but is best learned by experience. Quote your variables in comparisons: an unquoted variable that is empty disappears from the command entirely, and one that contains spaces splits into several words, so the test can give a wrong answer or a syntax error.
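A minimal sketch of what goes wrong without the quotes, using -n because the failure is easiest to see there. The function names check_unquoted and check_quoted are hypothetical and exist only for this comparison.

```shell
#!/bin/bash
# check_unquoted and check_quoted are hypothetical names for illustration.
check_unquoted() {
    # With an empty $1 this collapses to [ -n ], which tests the
    # literal string "-n" and therefore reports TRUE.
    # Errors from multi-word values are hidden with 2>/dev/null.
    if [ -n $1 ] 2>/dev/null; then echo "non-empty"; else echo "empty"; fi
}

check_quoted() {
    # The quotes keep $1 as exactly one word, even when it is empty.
    if [ -n "$1" ]; then echo "non-empty"; else echo "empty"; fi
}

echo "unquoted, empty argument: $(check_unquoted "")"
echo "quoted, empty argument:   $(check_quoted "")"
```

The unquoted version reports "non-empty" for an empty argument, which is the wrong answer; the quoted version reports "empty".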
What do we do about the other problem, of getting a non-DNA sequence? This may not seem like a very likely problem, but it will be a useful illustration of how to check the input and respond.
We can build on the tools we've used so far to filter the input sequence.
#!/bin/bash

#### check for no arguments ####
if [ -z "$1" ]; then
    echo "revComp -- Return the reverse complement of a DNA sequence."
    echo "Usage: revComp sequence"
    exit
fi

sequence=$1

#### check for invalid input ####
filtered=$(echo $sequence | tr -d [ACTGactg])
if [ -n "$filtered" ]
then
    echo "Invalid characters detected in '$sequence' = '$filtered'" >&2
    exit 1
fi

#### take the reverse complement ####
echo $sequence | tr [ACTGactg] [TGACtgac] | rev
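I used a new feature to accomplish this last task: saving the output of a command in a variable with $( ).
And I did some unusual things to respond to the error: I printed the message to standard error with >&2, and I ended the script with exit 1.
I used tr -d to delete the valid characters, leaving only invalid ones. By storing the result of this command, I was able to use it later.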
filtered=$(echo $sequence | tr -d [ACTGactg])
An alternate formulation uses backquotes (`), an older syntax that is harder to read and awkward to nest, so the $( ) form is generally preferred.
filtered=`echo $sequence | tr -d [ACTGactg]`
As always, there can be no spaces surrounding the equal sign in the assignment.
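A minimal sketch of saving command output in a variable, assuming a hypothetical helper named invalid_chars that wraps the tr -d step from the script. The character set is quoted here so the shell passes it to tr unchanged.

```shell
#!/bin/bash
# invalid_chars is a hypothetical helper: it prints whatever is left of
# its argument after the valid DNA letters are deleted.
invalid_chars() {
    printf '%s\n' "$1" | tr -d 'ACTGactg'
}

# $( ) runs the command and stores its standard output in the variable
leftover=$(invalid_chars "CACx")
echo "leftover: '$leftover'"

# $( ) can be nested directly inside another $( )
shouted=$(echo "$(invalid_chars "AAxxTT")" | tr 'a-z' 'A-Z')
echo "shouted: $shouted"
```

The first echo prints leftover: 'x' and the second prints shouted: XX. The trailing newline produced by the command is removed by $( ), which is why the stored value can be compared or tested with -n directly.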
Let's try it out:
$ revComp CACx
Invalid characters detected in 'CACx' = 'x'
$ revComp AAAeCCGTGAwGTGGAwGcgtwAC!
Invalid characters detected in 'AAAeCCGTGAwGTGGAwGcgtwAC!' = 'ewww!'
Conventionally, messages that are intended to be separate from normal output are printed to standard error.
On the command line, you can do (use single quotes):
$ echo 'BAD!' >&2
BAD!
It looks the same as:
$ echo 'BAD!'
The difference is that the first message travels on standard error rather than standard output, so error messages don't get mixed into the normal program output.
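The two streams only look the same on the terminal. A minimal sketch of telling them apart, assuming a hypothetical function named report that writes one line to each stream:

```shell
#!/bin/bash
# report is a hypothetical function that writes one line to each stream.
report() {
    echo "normal output"          # goes to standard output
    echo "error message" >&2      # goes to standard error
}

# $( ) captures only standard output; the error message still
# reaches the terminal because it travels on a separate stream.
captured=$(report)
echo "captured: $captured"

# Sending standard error to /dev/null hides the error message.
report 2>/dev/null
```

Only "normal output" ends up in the variable captured, which is what keeps error messages out of a pipeline or an output file.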
The exit status of a program is used by the shell to see if a job failed. We used exit once before without arguments; in that case it returns the status of the last command that ran, which was a successful echo, so the status was 0. Anything greater than 0 signifies a non-standard program completion, including error.
You can use a hidden variable, $?, to see the exit status of the previous command.
$ echo "hi"
hi
$ echo $?
0
0 means OK.
$ asdfasdf
-bash: asdfasdf: command not found
$ echo $?
127
127 means command not found.
1 refers to miscellaneous errors, which is how we ended our script when the input was wrong.
In pipelines, you aren't interested in the specific error code if there is one. It only makes sense to check to see that the exit status of a command is ZERO (or NOT ZERO), to see if you should continue.
if [ $? -eq 0 ]; then echo 'OK!'; else echo 'error!'; fi
if [ $? -ne 0 ]; then echo 'error!'; else echo 'OK'; fi
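Because if branches on the exit status of whatever command follows it, the same check can also be written without $?. A minimal sketch, assuming a hypothetical function is_dna that returns 0 for valid input and 1 otherwise, mirroring the exit 1 in the script:

```shell
#!/bin/bash
# is_dna is a hypothetical function: it returns 0 when its argument
# contains only DNA letters and 1 otherwise.
is_dna() {
    leftover=$(printf '%s\n' "$1" | tr -d 'ACTGactg')
    if [ -n "$leftover" ]; then
        return 1
    fi
    return 0
}

# if runs the command and branches on its exit status directly
if is_dna "ACGT"; then echo "ACGT: OK"; else echo "ACGT: error"; fi

# $? holds the same information, but only for the most recent command
is_dna "trash"
echo "status for trash: $?"
```

Since $? is overwritten by every command, it has to be read immediately after the command of interest; putting the command straight into the if avoids that pitfall.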