assignments:ex10

Differences

This shows you the differences between two versions of the page.

assignments:ex10 [2017/10/27 14:19]
dokuroot
assignments:ex10 [2018/11/05 15:14] (current)
dokuroot
Line 4: Line 4:
 ---- ----
  
-===== Small RNA Data Analysis Part 1: quality control and read mapping =====
+===== Small RNA data analysis (part 1): quality control and read mapping =====
  
 \\ Document the exercise in your electronic lab notebook. \\
Line 10: Line 10:
 NOTE: text in quotes within commands must be changed based on your directory and naming of files. \\ \\
  
-**1.** Download the small RNA data from the montgomery lab server using ''Cyberduck'' or ''FileZilla''. The path to the folder containing the data is as follows:
+**1.** Open a terminal window\\
  
-  Documents/SmallRNA_Data/prg1
+**2.** Change into the ''prg1'' directory: ''/Users/bz360/Documents/SmallRNA_data/prg1''.\\
  
-**2.** Change into the **prg1** directory.\\
-
-**3.** There are two small RNA libraries: 1) **wt_fastq.txt.gz** and 2) **prg-1_fastq.txt.gz**. Examine the contents of the two files using ''zmore''. \\
+**3.** There are two small RNA libraries: 1) ''wt_fastq.txt.gz'' and 2) ''prg-1_fastq.txt.gz''. Examine the contents of the two files using ''zmore''. \\
  
 **4.** Decompress the files using ''gzip''. \\
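
One way to decompress while keeping the original archive is sketched below. The helper name ''decompress_keep'' is hypothetical and not part of the exercise; it relies only on the standard ''gzip -d'' (decompress) and ''-c'' (write to standard output) options.

```shell
# Hypothetical helper: decompress FILE.gz to FILE and leave FILE.gz in place.
decompress_keep() {
    # -d decompresses, -c writes to stdout so the .gz file is not removed;
    # ${1%.gz} strips the trailing .gz to form the output name.
    gzip -dc "$1" > "${1%.gz}"
}

# Example call (replace with your own file name):
# decompress_keep 'wt_fastq.txt.gz'
```

Plain ''gzip -d 'file.gz''' also works; it replaces the compressed file with the decompressed one.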
Line 22: Line 20:
 **5.** Examine the fastq files in ''FastQC''. How many reads are there in each library and what is the read length? Record this information in your lab notebook. \\
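
The read count reported by ''FastQC'' can be cross-checked from the terminal. The sketch below assumes an uncompressed fastq file with exactly four lines per read (no wrapped sequence or quality lines); the function name ''count_fastq_reads'' is made up for illustration.

```shell
# Hypothetical helper: print the number of reads in an uncompressed fastq file.
# Assumes each read occupies exactly four lines.
count_fastq_reads() {
    lines=$(wc -l < "$1")     # total number of lines in the file
    echo $(( lines / 4 ))     # four lines per read
}

# Example call (replace with your own file name):
# count_fastq_reads 'wt_fastq.txt'
```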
  
-**6.** Use ''Trimmomatic'' to remove adapter sequences and reads shorter than 16 nt:
+**6.** Use ''Trimmomatic'' to remove adapter sequences and filter poor quality reads and reads shorter than 16 nt:
  
-  $ trimmomatic SE -phred33 'input_file' 'output_file' ILLUMINACLIP:/Users/bz360/Documents/Trimmomatic_Files/TruSeq-smallRNA.fa:2:30:10 MINLEN:16
+  $ trimmomatic SE 'input_file' 'output_file' ILLUMINACLIP:/Users/bz360/Documents/Trimmomatic_Files/TruSeq-smallRNA.fa:2:30:10 MINLEN:16 AVGQUAL:30
  
 What proportion of reads were dropped?
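
The proportion dropped is (input reads - surviving reads) / input reads. A minimal sketch of that arithmetic follows; ''proportion_dropped'' is a hypothetical helper, and the two counts are assumed to come from your own records (for example, the read totals before and after trimming).

```shell
# Hypothetical helper: proportion_dropped INPUT_READS SURVIVING_READS
# Prints the fraction of reads removed by trimming, to four decimal places.
proportion_dropped() {
    awk -v total="$1" -v kept="$2" 'BEGIN {
        if (total <= 0) { print "NA"; exit }   # avoid division by zero
        printf "%.4f\n", (total - kept) / total
    }'
}

# Example call with made-up counts:
# proportion_dropped 1000 850
```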
  
-**7.** Examine the contents of the two original files and the two new files using ''more''. What is different from the original files? \\
+**7.** Examine the contents of the two original files and the two new files using ''more''. What is different between the trimmed and original files? \\
  
-**8.** Examine the trimmed fastq files in FastQC. Now how many reads are there in each library and what is the read length? Record this information in your lab notebook. \\
-
-**9.** Index the C. elegans genome for Bowtie2 (notice that a fasta formatted file containing the genome sequence is in the **prg1** folder):
+**8.** Index the C. elegans genome for Bowtie2 (notice that a fasta formatted file containing the genome sequence is in the ''prg1'' folder):
  
   $ bowtie2-build 'genome_input_file.fa' 'genome_name'
  
-**10.** Create a new folder within the **prg1** directory called **bowtie_cel** using ''mkdir''. Move all the files related to the bowtie index and the fasta formatted genome sequence into the **bowtie_cel** folder using ''mv'':
+**9.** Create a new folder within the ''prg1'' directory called ''bowtie_cel'' using ''mkdir''. Move all the files related to the bowtie index and the fasta formatted genome sequence into the ''bowtie_cel'' folder using ''mv'':
  
   $ mv c* bowtie_cel
  
-**11.** Map reads from each adapter-trimmed library (wt and prg-1 from step 6) to the C. elegans genome using ''Bowtie2''. While waiting on bowtie, if you haven't done so already, examine the original and trimmed fastq files in ''FastQC''.
+**10.** Map reads from each adapter-trimmed library (wt and prg-1 from step 6) to the C. elegans genome using ''Bowtie2''. While waiting on bowtie, if you haven't done so already, examine the original and trimmed fastq files in ''FastQC''.
  
-  $ bowtie2 -x 'path_to_bowtie_index/prefix' -U 'fastq_file_name' -S 'strain.sam'
-
-NOTE: record the total number of reads and total mapped reads for each library in your lab notebook.\\
+  $ bowtie2 -x 'path_to_bowtie_index/prefix' -U 'fastq_file_name' -S 'sampleID.sam'
 
-> TOTAL READS     TOTAL MAPPED READS
-> wt:   prg-1:   wt:   prg-1:
+NOTE: record the total number of reads and total mapped reads for each library in your lab notebook:
+\\
+Total Reads wt:
+> Total Reads prg-1:
+> Total Mapped Reads wt:
+> Total Mapped Reads prg-1:
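
''Bowtie2'' prints an alignment summary to the terminal when it finishes. As a cross-check, the totals can also be tallied from the SAM file itself. The sketch below assumes one alignment record per read (single-end input, as in the command above) and treats a read as unmapped when bit 4 of the SAM FLAG field is set; ''count_sam_reads'' is a hypothetical helper name.

```shell
# Hypothetical helper: count total and mapped reads in a SAM file.
# Assumes one record per read; FLAG bit 4 set means the read is unmapped.
count_sam_reads() {
    awk -F '\t' '
        /^@/ { next }                        # skip header lines
        { total++ }                          # every remaining line is a read
        int($2 / 4) % 2 == 0 { mapped++ }    # bit 4 of FLAG (column 2) not set
        END { printf "total=%d mapped=%d\n", total, mapped }
    ' "$1"
}

# Example call (replace with your own file name):
# count_sam_reads 'sampleID.sam'
```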
  
 **Submit your answer to the following question on Canvas:** \\
assignments/ex10.1509135597.txt.gz · Last modified: 2017/10/27 14:19 by dokuroot