Basics of Scripting 🙆

Variables and Constants

In programming, when you want to act on a value, you include it in a statement.

For example, this statement prints the value “hello” to the screen.

$ echo "hello"
hello

The command echo prints values to the screen (stdout), and can be used in pipes.

It is common to act on a value that is stored in a variable.

$ greeting="hello"
$ echo $greeting
hello

Values can be

  • typed in (literal),
  • computed,
  • or passed from other sources
    • such as read from a file or
    • from the arguments to your script.

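Variables are necessary for the last case: a value that comes from a file or from the script's arguments is not known when you write the script, so it needs a name.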
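A minimal sketch of the kinds of values listed above. The variable names are made up for illustration, and a temporary file stands in for a real input file.

```shell
#!/usr/bin/env bash

# 1. A literal value, typed in directly.
greeting="hello"

# 2. A computed value: arithmetic expansion evaluates the expression.
total=$((3 + 4))

# 3a. A value read from a file (a temporary file stands in for real input).
tmpfile=$(mktemp)
echo "ACTG" > "$tmpfile"
from_file=$(cat "$tmpfile")
rm -f "$tmpfile"

# 3b. A value passed as the first argument to the script.
#     "${1:-none}" falls back to "none" when no argument is given.
first_arg="${1:-none}"

echo "$greeting $total $from_file $first_arg"
```

Only the first two values are known when the script is written. The last two depend on what the file contains and on how the script is called.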

Variable syntax

Variables are so-named because they can change. They can be assigned different values throughout the lifetime of the script. Constants can only be set once.

Although constants are important in low-level languages, they have limited use in BASH scripting. We'll focus only on variables.
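For completeness, here is a small sketch of the difference, assuming bash's readonly builtin is used to make a constant. The names and values are hypothetical.

```shell
#!/usr/bin/env bash

# A variable can be reassigned as often as you like.
color="black"
color="orange"

# readonly marks a name as constant: later assignments fail.
readonly genome_build="hg38"

# Try the reassignment in a subshell (the parentheses) so that the
# error cannot stop the rest of the script.
if ( genome_build="hg19" ) 2>/dev/null; then
    status="changed"
else
    status="unchanged"
fi

echo "$color $genome_build $status"
```

The variable ends up with its last assigned value, while the constant keeps its first one.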

The dollar sign: '$'

The dollar sign '$' has multiple uses in different contexts.

As a prompt

Until now, you have only seen the dollar sign '$' as the single character prompt.

$ command

The prompt is a string that the shell prints to the screen to show you that it's OK to type a new command. It is not part of the command or an actual BASH statement. We will go through an exercise below to learn about it.

To dereference variables

In the context of scripting, the dollar sign is used to dereference a variable.

# set a variable
variable="Lorem Ipsum"
# access the value of a variable: dereference it
echo $variable

Two new concepts:

  • dereference: access the value of a variable
  • string interpolation: substituting a placeholder in a template string

The '$' is necessary to distinguish two similar statements:

# 1. Regular (a literal) string 
echo "This is a statement."
# 2. String with a variable.
echo "This is a $statement."

Output from the first version is the string as written.

In the second version, the shell substitutes the value of $statement, if one has been assigned, before echo runs. The value has been interpolated into the string that echo prints.
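The sketch below shows a few cases that the two statements above do not cover: single quotes, a variable that was never assigned, and braces around the name. The variable names are chosen only for illustration.

```shell
#!/usr/bin/env bash

statement="test"

# Double quotes: the shell substitutes the value of $statement.
double="This is a $statement."

# Single quotes: no substitution, the text is kept exactly as typed.
single='This is a $statement.'

# A variable that was never assigned expands to an empty string.
unset undefined_name
empty="This is a $undefined_name."

# Braces mark where the variable name ends, so text can follow directly.
braces="These are ${statement}s."

echo "$double"   # This is a test.
echo "$single"   # This is a $statement.
echo "$empty"    # This is a .
echo "$braces"   # These are tests.
```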

★ Interactive Exercise: Variable Interpolation

Let's try it:

$ catcolor="black"
$ echo "my cat is $catcolor"
my cat is black
$ catcolor="orange"
$ echo "my cat is $catcolor"
my cat is orange
$ statement="statement of purpose"
$ echo "This is a statement"
This is a statement
$ echo "This is a $statement"
This is a statement of purpose

Try some others.

$ hi="Hello"
$ echo "$hi there!"
Hello there!
$ question="How"
$ echo "$question is this possible?"
How is this possible?
$ question="Why"
$ echo "$question is this possible?"
Why is this possible?

And so on.

Notice how we assign a string to a variable, and then use that variable in an interpolation later.

Let's work with a special variable, PS1, which holds a string to define your prompt.

★ Interactive Exercise: Changing your Prompt

In a previous lesson, I introduced environmental variables. By convention, they are in all capital letters.

Here are some common ones:

Environmental variable name   Purpose
HOME                          Your home directory
HOSTNAME                      Name of your computer
PATH                          Colon-delimited list of directories searched for executables
PS1                           Pattern to use as your prompt
PWD                           Your current directory (as returned by pwd)
USER                          Who you are logged in as (same as returned by whoami)

Some are informational, such as USER, HOSTNAME, and HOME, and it doesn't make sense to change them. Others, such as PATH and PS1, have a useful effect if changed.
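A short sketch of both kinds of use: reading the informational variables, and changing PATH. The directory $HOME/bin is a hypothetical location for your own scripts, not something this lesson has set up.

```shell
#!/usr/bin/env bash

# Informational variables: read them, but leave them alone.
# "${USER:-unknown}" prints "unknown" if the variable is not set.
echo "Logged in as: ${USER:-unknown}"
echo "Home directory: ${HOME:-unknown}"
echo "Current directory: $PWD"

# PATH is a colon-delimited list. Prepending a directory makes the shell
# search it first. "$HOME/bin" is a hypothetical location.
old_path="$PATH"
PATH="$HOME/bin:$PATH"
echo "$PATH"
```

A change made this way lasts only for the current shell session or script.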

The prompt, PS1, is a string that the shell evaluates and prints each time it is ready for a new command.

On my desktop it is:

$ echo $PS1
\h:\w\$ 

On my summit account, it is:

$ echo $PS1
[\h \w]\$

The escape sequences have special meaning to the shell so that your prompt can be informative.

Here are some useful ones:

escape sequence   description                                                          my opinion
\h                The hostname up to the first '.'                                     Pretty useful if you frequently log on to other computers.
\H                The full hostname                                                    Often too long to print with every command.
\w                The current working directory, with $HOME abbreviated with a tilde   Can be too long if you have a deep directory structure, but sometimes it's necessary. I use this one on summit.
\W                The basename of the current working directory                        Only the last part of your $PWD. Use if \w is too long.

A more complete list

You can also use a plain string without escape sequences. Here are some things to try:

$ PS1='> '
> echo $PS1
>
> PS1='--> '
--> PS1='hello, human: '
hello, human: echo $PS1 
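After experimenting, you will probably want your old prompt back. One way to do that, sketched here with the made-up variable names old_PS1 and new_PS1, is to save the prompt before changing it:

```shell
# Save the current prompt so it can be restored later.
# "${PS1-}" expands to an empty string if PS1 is not set.
old_PS1="${PS1-}"

# Combine escape sequences with plain text: host, short directory, then '$'.
new_PS1='[\h \W]\$ '
PS1="$new_PS1"

# Restore the original prompt when you are done experimenting.
PS1="$old_PS1"
```

Opening a new terminal also brings back the original prompt, because an assignment typed at the command line affects only the current session.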

Scripting revComp

We're going to use what we learned about variables to write revComp.

0. Return to your dev directory.

$ cd ~/dev 

1. Create a new file called "revComp"

Open a new file in your text editor. Save it as revComp.

Enter the following text:

#!/usr/bin/env bash
 
# Assign a DNA sequence to a variable (sequence):
sequence=ACTGTACGGTACAC
echo $sequence

1a. Save the file. Then run it.

$ bash revComp
ACTGTACGGTACAC

1b. Make it executable.

Last time we used a+x, which adds execute permission for all users. This time we'll use a numeric argument to chmod (change mode).

$ chmod 755 revComp
$ ./revComp
ACTGTACGGTACAC

The number 755 means

  • owner - 7 - Read/Write/Execute
  • group - 5 - Read/Execute
  • other - 5 - Read/Execute

The numbers here reflect the use of a binary number as a checklist. Each bit is 1 if the permission is granted, 0 otherwise, so read counts 4, write counts 2, and execute counts 1. It's a way to pack an option list into a single number.

r  w  x  Binary  Decimal  Permissions
No bits set
-  -  -  000     0        No permissions
One bit set
-  -  ✔  001     1        Execute only
-  ✔  -  010     2        Write only
✔  -  -  100     4        Read only
Two bits set
-  ✔  ✔  011     3        Write/Execute
✔  -  ✔  101     5        Read/Execute
✔  ✔  -  110     6        Read/Write
All bits set
✔  ✔  ✔  111     7        Read/Write/Execute

The three decimal digits together (for owner/group/other) constitute the file mode, i.e., the file's permissions.
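The arithmetic behind 755 can be sketched as follows. The variable names and the scratch file are only for illustration.

```shell
#!/usr/bin/env bash

# Each permission has a bit value: read = 4, write = 2, execute = 1.
owner=$((4 + 2 + 1))   # read + write + execute = 7
group=$((4 + 1))       # read + execute = 5
other=$((4 + 1))       # read + execute = 5
mode="$owner$group$other"
echo "$mode"

# Apply the mode to a scratch file and look at the result.
scratch=$(mktemp)
chmod "$mode" "$scratch"
perms=$(ls -l "$scratch" | cut -c1-10)
echo "$perms"

# The symbolic form below requests the same permissions as 755.
chmod u=rwx,go=rx "$scratch"
rm -f "$scratch"
```

The first ten characters printed by ls -l show the file type followed by the three rwx groups for owner, group, and other.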


Now we're going to add functionality step by step:

2. Complement the sequence using tr:

The command tr is a tool to translate one set of characters to another.

# Complement the sequence using tr:
echo $sequence | tr ACTGactg TGACtgac

Save and run again

$ bash revComp
TGACATGCCATGTG
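To see how tr pairs up characters, here is a small sketch with two other translations. The variable names dna, rna, and upper are made up for this example.

```shell
#!/usr/bin/env bash

dna=ACTGTACGGTACAC

# tr maps each character in the first set to the character at the same
# position in the second set: here every T becomes U.
rna=$(echo "$dna" | tr T U)
echo "$rna"      # ACUGUACGGUACAC

# Sets can be ranges: convert lowercase letters to uppercase.
upper=$(echo "actg" | tr 'a-z' 'A-Z')
echo "$upper"    # ACTG
```

Characters that are not in the first set pass through unchanged.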

3. Reverse the translated sequence using rev:

The command rev is a tool to reverse the order of a stream of characters.

echo $sequence | tr ACTGactg TGACtgac | rev

Save and run again

$ bash revComp
GTGTACCGTACAGT
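Because tr changes characters and rev changes their order, the two steps do not interfere with each other. The sketch below, with the made-up names complement_first and reverse_first, checks that either order gives the same answer.

```shell
#!/usr/bin/env bash

sequence=ACTGTACGGTACAC

# Complement first, then reverse.
complement_first=$(echo "$sequence" | tr ACTGactg TGACtgac | rev)

# Reverse first, then complement.
reverse_first=$(echo "$sequence" | rev | tr ACTGactg TGACtgac)

echo "$complement_first"   # GTGTACCGTACAGT
echo "$reverse_first"      # GTGTACCGTACAGT
```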

4. The finished script

#!/usr/bin/env bash
 
# Assign a DNA sequence to a variable (sequence):
sequence=ACTGTACGGTACAC
echo $sequence | tr ACTGactg TGACtgac | rev
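A quick sanity check for a script like this: taking the reverse complement twice should give back the original sequence. A minimal sketch, using the made-up names once and twice:

```shell
#!/usr/bin/env bash

sequence=ACTGTACGGTACAC

# Reverse complement once.
once=$(echo "$sequence" | tr ACTGactg TGACtgac | rev)

# Reverse complement the result again: this should restore the input.
twice=$(echo "$once" | tr ACTGactg TGACtgac | rev)

echo "$once"    # GTGTACCGTACAGT
echo "$twice"   # ACTGTACGGTACAC
```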

This has given us a template to take the reverse complement of a sequence that is already known. To make it useful, we want to be able to specify the sequence on the command line.
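As a preview, and only as a sketch of one possible approach, the script could read the sequence from its first argument, which bash makes available as $1. The fallback to the example sequence is an assumption added here so that the script still runs without arguments.

```shell
#!/usr/bin/env bash

# Use the first command-line argument as the sequence.
# If no argument is given, fall back to the example sequence.
sequence="${1:-ACTGTACGGTACAC}"

result=$(echo "$sequence" | tr ACTGactg TGACtgac | rev)
echo "$result"
```

With this version, running ./revComp GATTACA would print TGTAATC.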

Intro to Arguments, Flow control 🙎

wiki/2018basic_scripting.txt · Last modified: 2018/09/04 11:22 by david