Due date: 10/30/18 by 10 am
Write a function, fasta_to_csv(input_file, output_file), that converts a FASTA file, such as c_elegans_mirnas.fa, to a comma-separated values (CSV) file using regular expressions.
The function should exit gracefully if the files can't be opened.
Input File
>cel-let-7
TGAGGTAGTAGGTTGTATAGTT
>cel-lin-4
TCCCTGAGACCTCAAGTGTGA
>cel-miR-1
TGGAATGTAAAGAAGTATGTA

Output File
cel-let-7,TGAGGTAGTAGGTTGTATAGTT
cel-lin-4,TCCCTGAGACCTCAAGTGTGA
cel-miR-1,TGGAATGTAAAGAAGTATGTA
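One possible shape for fasta_to_csv is sketched below. It is a minimal sketch, not a reference solution, and it rests on several assumptions that the assignment does not state: "exit gracefully" is interpreted as printing a message and returning False rather than terminating the program, the record name is taken to be the first whitespace-delimited token of the header line, and sequences that wrap over several lines are joined into one. The RECORD_RE name is hypothetical.

```python
import re
import sys

# One FASTA record: a ">" header line, then everything up to the next ">".
# Assumption: the record name is the first whitespace-delimited header token.
RECORD_RE = re.compile(r"^>(\S+)[^\n]*\n([^>]*)", re.MULTILINE)


def fasta_to_csv(input_file, output_file):
    """Write one "name,sequence" line per FASTA record; return True on success."""
    try:
        with open(input_file) as fin:
            text = fin.read()
    except OSError as err:
        # Assumption: "exit gracefully" means report the problem and return.
        print(f"Could not open {input_file}: {err}", file=sys.stderr)
        return False

    records = []
    for name, raw_seq in RECORD_RE.findall(text):
        # Remove line breaks and stray whitespace inside the sequence.
        sequence = re.sub(r"\s+", "", raw_seq)
        records.append((name, sequence))

    try:
        with open(output_file, "w") as fout:
            for name, sequence in records:
                fout.write(f"{name},{sequence}\n")
    except OSError as err:
        print(f"Could not open {output_file}: {err}", file=sys.stderr)
        return False
    return True
```

A call such as fasta_to_csv("c_elegans_mirnas.fa", "c_elegans_mirnas.csv") would then produce one line per record in the format shown above. The output file name here is only an example.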
Write a function, motif_finder(input_file, motif), that returns the number of times a sequence motif occurs in a sequence file, such as c_elegans_chrI.fa (note that the sequence in this file is lowercase).
The function should have the following attributes:
1) The function should allow the motif to contain any number of Ns, where N is a wildcard that matches any nucleotide (A, C, G, or T), e.g. TGANNNTCA. The user specifies the exact motif, including how many Ns it contains and where they occur. Some possible motifs: TTTTCNGA, NATAAA, NATNAA.
2) The function should exit gracefully if the file can't be opened.
3) The function should count motifs that span multiple lines.
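The three attributes above can be sketched as follows. This is a minimal sketch under stated assumptions, not a reference solution: "exit gracefully" is interpreted as printing a message and returning None, overlapping occurrences are counted (the assignment does not say whether they should be), matching ignores case so that an uppercase motif finds lowercase sequence, and a motif is not allowed to match across two different FASTA records.

```python
import re
import sys


def motif_finder(input_file, motif):
    """Return how many times motif occurs in a FASTA file (N = any of A, C, G, T)."""
    motif = motif.upper()
    if not re.fullmatch(r"[ACGTN]+", motif):
        raise ValueError("motif may contain only A, C, G, T and N")

    try:
        with open(input_file) as fin:
            lines = fin.read().splitlines()
    except OSError as err:
        # Assumption: "exit gracefully" means report the problem and return.
        print(f"Could not open {input_file}: {err}", file=sys.stderr)
        return None

    # Join sequence lines so that motifs spanning line breaks are found.
    # Each header line becomes a single space, which no motif character can
    # match, so an occurrence never straddles two records.
    sequence = "".join(
        " " if line.startswith(">") else line.strip() for line in lines
    )

    # Each N becomes a character class matching any single nucleotide.
    pattern = motif.replace("N", "[ACGT]")

    # Assumption: overlapping occurrences count. The lookahead matches at
    # every start position without consuming characters.
    regex = re.compile("(?=" + pattern + ")", re.IGNORECASE)
    return sum(1 for _ in regex.finditer(sequence))
```

If only non-overlapping occurrences should be counted, compiling the pattern without the lookahead and using len(regex.findall(sequence)) would give that count instead.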
Submit your assignment as a file upload on Canvas.