# Computational biology at CSU

~~NOTOC~~
=====INTRODUCTION TO SCRIPTING=====
----

====OUTLINE====
  - Why scripts are useful - a prerequisite to scripting.
  - Text editors.
  - Bash scripting basics.

**Script:** A series of commands or instructions that automates a task. The commands are written in a text file that is then executed by a program without first being compiled (converted into binary machine code). \\

**Scripting language:** A computer programming language that supports scripts. Scripts are typically interpreted by a program and do not have to be compiled.

====★ Interactive Exercise: Prerequisite to scripting====
In this exercise, we'll discuss variable assignment and manipulation. The goal is to demonstrate the need to automate repeated tasks, which we will do in the next exercise, where we introduce scripts.

**1.** Open a shell or terminal window.\\

**2.** Assign a DNA sequence to a variable (**sequence**):
  $ sequence=ACTGTACGGTACAC
**3.** Complement the sequence using **tr**:
  $ echo $sequence | tr [ACTGactg] [TGACtgac]
**4.** Reverse the sequence using **rev**:
  $ echo $sequence | rev
**5.** Reverse complement the sequence using **tr** and **rev**:
  $ echo $sequence | rev | tr [ACTGactg] [TGACtgac]
**6.** Calculate the length of the sequence using **wc -c**:
  $ echo -n $sequence | wc -c

This exercise required multiple steps. If it is something we wanted to repeat, it could easily be automated with a script, as demonstrated below.

====Text Editors====
A good text editor is essential for writing scripts. Microsoft Word and Mac TextEdit should not be used for writing scripts.

**Common Free Text Editors**
  * Mac - TextWrangler http://www.barebones.com/products/textwrangler/
  * Windows - Notepad++ https://notepad-plus-plus.org
  * Linux - gedit https://wiki.gnome.org/Apps/Gedit

====★ Interactive Exercise: Bash scripting basics====
In this exercise, we will write a bash script that reports the length, the complement, the reverse, and the reverse complement of a DNA sequence.

**1.** Open a shell or terminal window.\\ \\
**2.** Create a new directory named **Bash_Scripts** using **mkdir**.\\ \\
**3.** Change into the **Bash_Scripts** directory using **cd**.\\ \\
**4.** Open a text editor, such as **TextWrangler** or **Notepad++**, using **open** and use it to create a new file called **iseq.sh** within the **Bash_Scripts** directory. In the next steps, we will write a bash script by adding commands to the file.\\ \\
**5.** Confirm that you are in the **Bash_Scripts** directory using **pwd** and that the **iseq.sh** script is in the directory using **ls**.\\ \\
**6.** Insert a shebang as the first line of the **iseq.sh** file so that the script is run with bash: #!/bin/bash\\ \\
**7.** Prompt the user for a sequence and store it in a variable (**seq**) using **read -p**:
  read -p "Enter a sequence: " seq
**8.** Determine whether the sequence entered is DNA or RNA using a conditional statement:
  if [[ $seq =~ U ]]
  then
    type=RNA
  elif [[ $seq =~ T ]]
  then
    type=DNA
  else
    type=unknown
  fi
**9.** Complement the sequence using **tr** and store it in a new variable (**comp**):
  comp=`echo $seq | tr [ACTGactg] [TGACtgac]`
**10.** Reverse the sequence using **rev** and store it in a new variable (**reverse**):
  reverse=`echo $seq | rev`
**11.** Reverse complement the sequence using **tr** and **rev** and store it in a new variable (**revcomp**):
  revcomp=`echo $seq | tr [ACTGactg] [TGACtgac] | rev`
**12.** Calculate the length of the sequence using **wc -c** and store it in a new variable (**length**):
  length=`echo -n $seq | wc -c`
**13.** Print the result of each of the above steps to the shell along with a brief description:
  echo ""
  echo "Original sequence: $seq"
  echo "Your sequence is $type"
  echo "Length: $length"
  echo "Complement: $comp"
  echo "Reverse: $reverse"
  echo "Reverse complement: $revcomp"
  echo ""
**14.** Return to your terminal window and execute the script using **./iseq.sh**. //Did you receive an error message? How can you make the script executable?// \\ \\
**15.** Edit **iseq.sh** so that it repeats indefinitely using a while loop, as follows.\\ \\
**a.** Add **while** before the read command:
  while read -p "Enter a sequence: " seq
**b.** Insert a new line containing the command **do** directly after the **while** line:
  do
**c.** At the end of the script, add a new line containing the command **done**:
  done
**16.** Return to your terminal window and execute the script again.
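
For reference, the steps of the exercise can be collected into a single bash function. This is only a sketch under a few assumptions, not the script the exercise builds: the function name **describe_seq** is hypothetical, the **tr** sets are quoted and written without brackets so the shell cannot expand them as filename patterns, command substitution uses $(...) instead of backticks, and the length comes from the bash expansion ${#seq} rather than **wc -c**.

```shell
#!/bin/bash
# Sketch only: describe_seq is a hypothetical helper name, not part of the
# original exercise. It takes the sequence as its first argument.
describe_seq() {
    local seq=$1
    local type comp reverse revcomp length

    # Step 8: a U marks RNA, a T marks DNA (uppercase only, as in the exercise).
    if [[ $seq =~ U ]]; then
        type=RNA
    elif [[ $seq =~ T ]]; then
        type=DNA
    else
        type=unknown
    fi

    # Steps 9-11: quoted tr sets without brackets, $(...) instead of backticks.
    comp=$(echo "$seq" | tr 'ACTGactg' 'TGACtgac')
    reverse=$(echo "$seq" | rev)
    revcomp=$(echo "$comp" | rev)

    # Step 12: bash parameter expansion used here in place of wc -c.
    length=${#seq}

    # Step 13: report each result with a short description.
    echo "Original sequence: $seq"
    echo "Your sequence is $type"
    echo "Length: $length"
    echo "Complement: $comp"
    echo "Reverse: $reverse"
    echo "Reverse complement: $revcomp"
}

# Example call with the sequence from the first exercise.
describe_seq ACTGTACGGTACAC

# To prompt repeatedly, as in step 15, the function could be wrapped in a loop:
#   while read -p "Enter a sequence: " seq; do describe_seq "$seq"; done
```

For the example sequence, the function reports a length of 14 and a reverse complement of GTGTACCGTACAGT. Wrapping the steps in a function keeps the loop from step 15 to a single line and makes it easy to run the same steps on a sequence passed on the command line.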
wiki/2018scripting.txt · Last modified: 2018/07/05 15:41 by david