assignment5

Differences

This shows you the differences between two versions of the page.

assignment5 [2018/10/18 08:51]
dokuroot created
assignment5 [2018/10/20 17:37] (current)
dokuroot
Line 1: Line 1:
 +~~NOTOC~~
 ===== ASSIGNMENT 5 =====
 ----
Line 8: Line 9:
  
 ====Exercise 1====
-Write a function, 'transposer(matrix)', that swaps the rows and columns of a matrix and returns the results as a new matrix.
+Write a function, ''transposer(matrix)'', that swaps the rows and columns of a matrix and returns the results as a new matrix.
 
 For example:
Line 19: Line 20:
  
 \\
 +
 +**Additional Hints** (avoid reading these until you get stuck)\\
 +\\
 +1) Your code will likely have a nested for loop. Use the length of the nested list within the matrix (''len(matrix[0])'') for the number of iterations in the outer for loop.\\
 +\\
 +2) Use the length of the matrix (''len(matrix)'') for the number of iterations in the nested for loop.\\
 +\\
 +3) Append each element from the original nested lists within the matrix (''matrix[n][i]'', where n and i are the two iteration variables in the for loops - but which order?) to a new list within the nested for loop and then append the new list to a second new list in the outer for loop.
 +\\
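The three hints above can be put together roughly as follows. This is only one possible sketch, not the official solution: the variable names ''transposed'' and ''new_row'' and the sample matrix are made up for illustration, and it assumes the matrix is a non-empty list of equal-length lists.

```python
def transposer(matrix):
    """Return a new matrix whose rows are the columns of `matrix`."""
    transposed = []                       # the "second new list" from hint 3
    for i in range(len(matrix[0])):       # outer loop: one pass per column
        new_row = []                      # the "new list" built in the inner loop
        for n in range(len(matrix)):      # inner loop: one pass per row
            new_row.append(matrix[n][i])  # row n, column i of the original
        transposed.append(new_row)
    return transposed


# Illustrative input (not the example from the assignment page).
print(transposer([[1, 2, 3], [4, 5, 6]]))  # [[1, 4], [2, 5], [3, 6]]
```

Because the function builds new lists instead of editing the input, the original matrix is left unchanged.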
 +
 ====Exercise 2====
 
-Write a function, 'miRNA_counter(input_fastq_file, input_fasta_file, output_file)', counts the number of times each miRNA appears in a small RNA high-throughput sequencing library.
+Write a function, ''miRNA_counter(input_fastq_file, input_fasta_file, output_file)'', that counts the number of times each miRNA in a fasta file appears in a small RNA high-throughput sequencing library.
 
-The function should read in, line by line in a for loop, a fastq file containing the small RNA library data and create a dictionary with the key being the sequence and the value being the number of reads for that sequence (you'll need to use the get method and use the approach demonstrated in class)
-Next, the function read in a fasta file containing the miRNA sequences, again line by line, store the miRNA name as a variable, and then calculate the number of reads for the corresponding sequence, using the get method and the dictionary of fastq sequences from above.
+fastq file: [[ http://rna.colostate.edu/dokuwiki/doku.php?id=sample_data | small_RNAs.fastq]]\\
+fasta file: [[ http://rna.colostate.edu/dokuwiki/doku.php?id=sample_data | c_elegans_miRNAs.fa]]
  
 +The function should:
 +\\
 +1. Read in, line by line in a for loop, a fastq file containing the small RNA library data. \\
 +\\
 +2. Create a dictionary in which each key is a sequence and each value is the number of reads for that sequence (you'll need to use the get method and the approach demonstrated in class). \\
 +\\
 +3. Read in a fasta file containing the miRNA sequences, again line by line.\\
 +\\
 +4. Store the miRNA name as a variable, and then look up the number of reads for the corresponding sequence, using the get method and the dictionary of fastq sequences from above.\\
 +\\
 +5. Write the name of each miRNA and its number of reads to an output file in tab-delimited format.
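The five steps above could be sketched roughly as follows. This is a hedged illustration, not the reference solution. It assumes that every fastq record spans exactly four lines, that each fasta entry is a ''>name'' line followed by a single sequence line, and that both files use the same alphabet so sequences can be compared directly. The use of ''with'', ''enumerate'' and ''strip'' is a choice made here and may differ from the approach shown in class.

```python
def miRNA_counter(input_fastq_file, input_fasta_file, output_file):
    # Steps 1-2: count how often each read sequence occurs in the fastq file.
    fastq_dictionary = {}
    with open(input_fastq_file) as fastq:
        # Assumption: line numbers start at 1, so the sequence is line 2, 6, 10, ...
        for line_number, line in enumerate(fastq, start=1):
            if (line_number + 3) % 4 == 1:
                sequence = line.strip()
                fastq_dictionary[sequence] = fastq_dictionary.get(sequence, 0) + 1

    # Steps 3-5: walk through the fasta file and write "name<TAB>count" lines.
    with open(input_fasta_file) as fasta, open(output_file, "w") as outfile:
        mirna_name = None
        for line in fasta:
            line = line.strip()
            if not line:
                continue                  # skip blank lines
            if line.startswith(">"):
                mirna_name = line[1:]     # drop the leading ">"
            elif mirna_name is not None:
                reads = fastq_dictionary.get(line, 0)  # 0 if never sequenced
                outfile.write(mirna_name + "\t" + str(reads) + "\n")
                mirna_name = None
```

Using ''get'' with a default of 0 in both places avoids a KeyError: once when a sequence is seen for the first time, and again when a miRNA never appears in the library.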
  
   #input_fastq_file:
Line 49: Line 71:
   lin-4 0
  
 +**Additional Hints** (avoid reading these until you get stuck)\\
 +1) Your first for loop should loop through the fastq file, creating a dictionary with the sequence lines (''if (line_number + 3) % 4 == 1'') as the keys and the number of times each sequence appears as the values. You can increment the values using get (''fastq_dictionary[sequence] = fastq_dictionary.get(sequence, 0) + 1'').\\
 +\\
 +2) The second for loop should loop through the fasta file, retrieve the number of reads for each sequence, and write the miRNA name and read count to a new file (''outfile.write(mirna_name[1:] + '\t' + str(fastq_dictionary.get(line, 0)) + '\n')'').
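
The line test in hint 1 only selects the sequence lines if ''line_number'' is 1 for the first line of the file. The small check below illustrates this with made-up reads; the helper name ''select_sequence_lines'' is hypothetical and not part of the assignment.

```python
def select_sequence_lines(lines, first_line_number):
    """Return the lines that pass the (line_number + 3) % 4 == 1 test."""
    selected = []
    for line_number, line in enumerate(lines, start=first_line_number):
        if (line_number + 3) % 4 == 1:
            selected.append(line)
    return selected


# Two made-up fastq records, four lines each.
fastq_lines = ["@read1", "ACGT", "+", "IIII",
               "@read2", "TTGA", "+", "IIII"]

print(select_sequence_lines(fastq_lines, 1))  # ['ACGT', 'TTGA']
print(select_sequence_lines(fastq_lines, 0))  # ['+', '+']
```

If your counter starts at 0 instead, the equivalent test is ''line_number % 4 == 1''.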
assignment5.1539874298.txt.gz · Last modified: 2018/10/18 08:51 by dokuroot