2018pipes2

Differences

This shows you the differences between two versions of the page.

2018pipes2 [2018/07/10 14:17]
erin created
2018pipes2 [2018/08/30 09:28] (current)
erin
Line 7: Line 7:
 **tee** - redirect stdout or stderr to multiple locations
 
-:!: **Exercise**: Let's make a test file. Copy and paste the text below into a file called mini.gff
+:!: **Exercise**: You probably already have the file called ''mini.gff''. If you don't, copy and paste the text below into a file by that name.
 
 <code>
-# A tester gff file.
-# For testing pipes.
-chrV test CDS 789 809 . + . annotation info
-chrII test CDS 24558 26798 . + . annotation info
-chrV test CDS 789 809 . + . annotation info
-chrI test CDS 233 236 . + . annotation info
-chrIV test CDS 1234 7654 . - . annotation info
-chrI test CDS 233 236 . + . annotation info
-chrII test CDS 24558 26798 . + . annotation info
-CHRI test CDS 11565 11951 . + . annotation info
-chrII test CDS 24558 26798 . + . annotation info
-chrIII test CDS 13678 137888 . + . annotation info
-CHRII test CDS 7997 8547 . + . annotation info
-chrIII test CDS 13678 137888 . + . annotation info
-chrIV test CDS 1234 7654 . - . annotation info
-chrV test CDS 13363 13743 . + . annotation info
-chrIV test CDS 1234 7654 . - . annotation info
-chrIV test CDS 1234 7654 . - . annotation info
-chrV test CDS 789 809 . + . annotation info
+# A tester gff file.
+# For testing pipes.
+chrV sacCer3_ensGene CDS 574807 575379 0.000000 - 0 gene_id "YER190C-A"; transcript_id "YER190C-A";
+chrII sacCer3_ensGene CDS 805038 805256 0.000000 - 0 gene_id "YBR298C-A"; transcript_id "YBR298C-A";
+chrV sacCer3_ensGene start_codon 575377 575379 0.000000 - . gene_id "YER190C-A"; transcript_id "YER190C-A";
+chrII sacCer3_ensGene start_codon 805254 805256 0.000000 - . gene_id "YBR298C-A"; transcript_id "YBR298C-A";
+chrII sacCer3_ensGene exon 805035 805256 0.000000 - . gene_id "YBR298C-A"; transcript_id "YBR298C-A";
+chrIII sacCer3_ensGene exon 309070 310155 0.000000 + . gene_id "YCR105W"; transcript_id "YCR105W";
+CHRII sacCer3_ensGene start_codon 805351 805353 0.000000 + . gene_id "YBR299W"; transcript_id "YBR299W";
+CHRIII sacCer3_ensGene start_codon 310958 310960 0.000000 + . gene_id "YCR106W"; transcript_id "YCR106W";
+chrV sacCer3_ensGene exon 574804 575379 0.000000 - . gene_id "YER190C-A"; transcript_id "YER190C-A";
+chrV sacCer3_ensGene stop_codon 575680 575682 0.000000 - . gene_id "YER190C-B"; transcript_id "YER190C-B";
 </code>
  
Line 37: Line 30:
 We can use sort to sort a file's lines into a new order...
 
-**sort usage:**\\
+//**sort usage:**//\\
 **sort** [options] <file.txt> ...
  
Line 43: Line 36:
 
 <code bash>
-$sort mini.gff
+$ sort mini.gff
 </code>
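Plain ''sort'' compares whole lines as text. As a minimal sketch of sorting on a single column instead, the example below builds a tiny tab-delimited file (the temporary ''demo.gff'' and its three rows are made up for illustration, not taken from ''mini.gff'') and orders it by a numeric start coordinate in column 3.

```shell
# Build a tiny tab-delimited sample in a temporary directory.
# demo.gff and its rows are hypothetical, for illustration only.
dir=$(mktemp -d)
printf 'chrV\tCDS\t574807\n'    >  "$dir/demo.gff"
printf 'chrII\tCDS\t805038\n'   >> "$dir/demo.gff"
printf 'chrIII\texon\t309070\n' >> "$dir/demo.gff"

tab=$(printf '\t')

# Default behaviour: whole lines compared as text.
# LC_ALL=C forces byte-wise ordering so the result is reproducible.
LC_ALL=C sort "$dir/demo.gff"

# -t sets the field delimiter, -k 3,3 limits the key to column 3,
# and the trailing n compares that key numerically.
sort -t "$tab" -k 3,3n "$dir/demo.gff"

# Keep the first chromosome from each ordering for comparison.
first_by_line=$(LC_ALL=C sort "$dir/demo.gff" | head -n 1 | cut -f 1)
first_by_start=$(sort -t "$tab" -k 3,3n "$dir/demo.gff" | head -n 1 | cut -f 1)
echo "first by line: $first_by_line, first by start: $first_by_start"
```

The two orderings differ because the second one ignores the chromosome name entirely and looks only at the coordinate.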
  
Line 59: Line 52:
 We can identify unique (or duplicated) lines in a pre-sorted file using the command **uniq**.
 
-**uniq usage:**\\
+//**uniq usage:**//\\
 **uniq** [options] <sortedFile.txt>
  
Line 69: Line 62:
 
 <code bash>
-$sort mini.gff | uniq
+$ sort mini.gff | uniq
+</code>
+
+That wasn't very interesting. What if we do this just for the first column...
+
+<code bash>
+$ cut -f 1 mini.gff
+$ cut -f 1 mini.gff | sort
+$ cut -f 1 mini.gff | sort | uniq
 </code>
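Two details of ''uniq'' are easy to miss: it only collapses adjacent duplicates, which is why the input is sorted first, and it is case-sensitive, so ''chrII'' and ''CHRII'' count as different values. The sketch below illustrates both on a made-up list of first-column values (the temporary ''chroms.txt'' is hypothetical; it only mirrors the mixed-case names in ''mini.gff'').

```shell
# Hypothetical first-column values, one per line, in a temporary directory.
dir=$(mktemp -d)
printf 'chrV\nchrII\nchrV\nCHRII\nchrII\nchrV\n' > "$dir/chroms.txt"

# Without sorting, no duplicates are adjacent, so uniq removes nothing.
# tr -d ' ' strips the padding some wc implementations add.
unsorted_count=$(uniq "$dir/chroms.txt" | wc -l | tr -d ' ')

# After sorting, duplicates sit next to each other and collapse.
sorted_count=$(sort "$dir/chroms.txt" | uniq | wc -l | tr -d ' ')

# -c prefixes each distinct line with the number of times it occurred.
sort "$dir/chroms.txt" | uniq -c

# Folding everything to lower case first merges CHRII into chrII.
folded_count=$(tr '[:upper:]' '[:lower:]' < "$dir/chroms.txt" | sort | uniq | wc -l | tr -d ' ')

echo "unsorted: $unsorted_count, sorted: $sorted_count, case-folded: $folded_count"
```

Whether case should be folded depends on the data; here it is only a way to show that the mixed-case names are treated as distinct.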
  
Line 88: Line 89:
 
 <code bash>
-$wc mini.gff | tee wc_output.txt
+$ wc mini.gff | tee wc_output.txt
 </code>
  
Line 94: Line 95:
 
 <code bash>
-$wc mini.gff skdjfldj 2>&1 | tee wc_stdoutstderr.txt
+$ wc mini.gff skdjfldj 2>&1 | tee wc_stdoutstderr.txt
 </code>
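To see what ''2>&1'' changes, it can help to run the same command with and without it and compare the two log files. This is a small sketch under assumed names: ''real.txt'', ''missing.txt'' and the log file names are hypothetical, and ''missing.txt'' is deliberately never created so that ''wc'' reports an error.

```shell
# Work in a temporary directory; all file names here are hypothetical.
dir=$(mktemp -d)
printf 'one\ntwo\n' > "$dir/real.txt"

# Only stdout enters the pipe, so the error about the missing file
# goes to the terminal and never reaches tee's output file.
# wc exits non-zero here; || true keeps strict-mode scripts running.
wc -l "$dir/real.txt" "$dir/missing.txt" | tee "$dir/stdout_only.txt" || true

# 2>&1 points stderr at stdout before the pipe, so tee receives both streams.
wc -l "$dir/real.txt" "$dir/missing.txt" 2>&1 | tee "$dir/both.txt" || true

# Count the lines in each log that mention the missing file.
# grep -c prints 0 but exits non-zero when nothing matches, hence || true.
in_stdout_only=$(grep -c 'missing.txt' "$dir/stdout_only.txt" || true)
in_both=$(grep -c 'missing.txt' "$dir/both.txt" || true)
echo "stdout only: $in_stdout_only, with 2>&1: $in_both"
```

The exact wording of the error message differs between systems, so the sketch only checks whether the missing file's name appears in each log.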
  
 ----
 
-:!: **Exercise**: Can you write a series of pipes that will determine how many unique chromosomes are represented in mini.gff?
+:!: **Exercise**: Can you write a series of pipes that will output a list of unique transcript_id entries from mini.gff?
   * [[wiki:pipes2_hint1|hint1]]
   * [[wiki:pipes2_hint2|hint2]]
+  * [[wiki:pipes2_hint3|answer]]
 
 
-[[wiki:exercises2|{{:wiki:slide2_arrow.png?40|}}]] [[wiki:exercises2|Further Exercises2]]
+[[assignments:2018assignment3|{{:wiki:slide2_arrow.png?40|}}]] [[assignments:2018assignment3|Assignment 3]]
  
2018pipes2.1531253850.txt.gz · Last modified: 2018/07/10 14:17 by erin