assignment6

Differences

This shows you the differences between two versions of the page.

assignment6 [2018/10/25 08:52]
dokuroot
assignment6 [2018/10/29 09:30] (current)
dokuroot
Line 6: Line 6:
 ===Exercise 1=== ===Exercise 1===
  
-Write a function, ''fasta_to_csv(input_file, output_file)'', that converts a fasta file, such as c_elegans_mirnas.fa, to a comma-separated file (csv) **using regular expressions**.
+Write a function, ''fasta_to_csv(input_file, output_file)'', that converts a fasta file, such as {{ :c_elegans_mirnas.fa.gz | c_elegans_mirnas.fa }}, to a comma-separated file (csv) **using regular expressions**.
 \\ \\
-The function should accept the two arguments from the command line.
+\\
 +The function should exit gracefully if the files can't be opened.
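The requirements for Exercise 1 can be sketched as follows. This is a minimal illustration, not a reference solution: the two-column CSV layout (header, sequence), the ''FASTA_RECORD'' pattern name, and the choice to exit via ''sys.exit'' with a message are all assumptions, since the page does not specify the output format. It also assumes the input has already been decompressed to plain text.

```python
import csv
import re
import sys

# Assumed record shape: a '>' header line, then sequence lines up to the next '>'.
FASTA_RECORD = re.compile(r"^>(?P<header>[^\n]*)\n(?P<seq>[^>]*)", re.MULTILINE)


def fasta_to_csv(input_file, output_file):
    """Write one CSV row (header, sequence) per FASTA record."""
    try:
        with open(input_file) as handle:
            text = handle.read()
    except OSError as err:
        # Exit with a readable message instead of a traceback.
        sys.exit(f"Could not open {input_file}: {err}")

    try:
        out = open(output_file, "w", newline="")
    except OSError as err:
        sys.exit(f"Could not open {output_file}: {err}")

    with out:
        writer = csv.writer(out)
        for match in FASTA_RECORD.finditer(text):
            # Remove line breaks and other whitespace inside the sequence.
            sequence = re.sub(r"\s+", "", match.group("seq"))
            writer.writerow([match.group("header").strip(), sequence])
```

Reading the whole file at once keeps the regular expression simple, which is reasonable for a small miRNA file; a much larger input might call for line-by-line parsing instead.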
  
   Input File   Input File
Line 25: Line 26:
 ===Exercise 2=== ===Exercise 2===
  
-Write a function ''motif_finder(input_file, motif)'', that returns the number of times a sequence motif occurs in a sequence file, such as {{ :c_elegans_chri.fa.gz | c_elegans_chrI.fa}} (note that the sequence is lowercase). The function should allow for any number of Ns to be present in the motif (e.g. TGANNNTCA) and should require the user to pass the input file name and motif to the function from the command line
+Write a function ''motif_finder(input_file, motif)'', that returns the number of times a sequence motif occurs in a sequence file, such as {{ :c_elegans_chri.fa.gz | c_elegans_chrI.fa}} (note that the sequence is lowercase).
 +\\ 
 +\\ 
 +The function should have the following attributes:
 +\\ 
 +\\ 
 +1) The function should allow any number of Ns in the motif, where N is a wildcard that matches any nucleotide (A, C, G, or T), e.g. TGANNNTCA. The user specifies the exact motif, including how many Ns it contains and where they occur. Some possible motifs: TTTTCNGA, NATAAA, NATNAA.
 +\\ 
 +\\ 
 +2) The function should exit gracefully if the file can't be opened.
 +\\
 \\ \\
-Your program should count motifs that span multiple lines.
+3) The function should count motifs that span multiple lines.
 \\ \\
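The three attributes above can be sketched as follows. This is a hedged illustration rather than a reference solution: the helper ''motif_to_regex'' is a hypothetical name, and counting overlapping occurrences (via a lookahead) is an assumption, since the page does not say whether overlaps count. The sketch also assumes a decompressed, single-record file, so that joining all sequence lines cannot create matches across record boundaries.

```python
import re
import sys


def motif_to_regex(motif):
    """Translate a motif into a regex, turning each N into a wildcard class."""
    parts = ["[acgt]" if base == "n" else re.escape(base) for base in motif.lower()]
    # A lookahead matches without consuming text, so overlapping hits are counted.
    return re.compile("(?=" + "".join(parts) + ")")


def motif_finder(input_file, motif):
    """Return how many times the motif occurs in the sequence file."""
    try:
        with open(input_file) as handle:
            # Skip header lines and join the rest so motifs can span line breaks.
            sequence = "".join(
                line.strip() for line in handle if not line.startswith(">")
            )
    except OSError as err:
        # Exit with a readable message instead of a traceback.
        sys.exit(f"Could not open {input_file}: {err}")

    # Lowercase both sides because the sequence file is lowercase.
    return sum(1 for _ in motif_to_regex(motif).finditer(sequence.lower()))
```

A call such as ''motif_finder("c_elegans_chrI.fa", "TGANNNTCA")'' would then return the count as an integer. If overlapping matches should not be counted, dropping the lookahead wrapper and using the plain pattern would give non-overlapping counts.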
  
  
 **Submit your assignment as a file upload on canvas.** **Submit your assignment as a file upload on canvas.**
assignment6.1540479171.txt.gz · Last modified: 2018/10/25 08:52 by dokuroot