# Computational biology at CSU

## DSCI 512: RNA-seq

## ASSIGNMENT 5

Due date: 10/23/18 by 10 am

The exercises can be completed using only material covered so far.

### Exercise 1

Write a function, `transposer(matrix)`, that swaps the rows and columns of a matrix (a list of lists) and returns the result as a new matrix.

For example:

```
#Original matrix:
[[1,2],[3,4],[5,6]]
```

```
#Transposed matrix:
[[1,3,5],[2,4,6]]
```
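
One possible approach, shown only as a sketch: build each new row by walking down one column of the original matrix. It assumes the matrix is rectangular (every row has the same length); the variable names other than `transposer` and `matrix` are made up for illustration.

```python
def transposer(matrix):
    """Return a new matrix whose rows are the columns of `matrix`."""
    # Assumption: an empty matrix transposes to an empty matrix.
    if not matrix:
        return []

    transposed = []
    # One new row per column of the original (assumes equal-length rows).
    for col_index in range(len(matrix[0])):
        new_row = []
        for row in matrix:
            new_row.append(row[col_index])
        transposed.append(new_row)
    return transposed


original = [[1, 2], [3, 4], [5, 6]]
print(transposer(original))  # [[1, 3, 5], [2, 4, 6]]
```

Because the function appends to fresh lists, the original matrix is left unchanged.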

### Exercise 2

Write a function, `miRNA_counter(input_fastq_file, input_fasta_file, output_file)`, that counts the number of times each miRNA in a fasta file appears in a small RNA high-throughput sequencing library.

fastq file: small_RNAs.fastq
fasta file: c_elegans_miRNAs.fa

The function should:
1. Read in a fastq file containing the small RNA library data, line by line in a for loop.

2. Create a dictionary in which each key is a read sequence and each value is the number of reads with that sequence (use the dictionary `get` method, following the approach demonstrated in class).

3. Read in a fasta file containing the miRNA sequences, again line by line.

4. Store the miRNA name as a variable, then look up the number of reads for the corresponding sequence using the `get` method and the dictionary of fastq sequences from above.

5. Write the name of the miRNA and the number of reads to an output file in tab-delimited format.
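
The counting idiom in steps 2 and 4 can be sketched on a small in-memory list before any files are involved. The `reads` list below is made up for illustration (it is not the contents of the course data), and this may differ in detail from the approach shown in class.

```python
# Hypothetical reads, standing in for the sequence lines of a fastq file.
reads = [
    "TGAGGTAGTAGGTTGTATAGTT",
    "TCAATATTTGCATAGGGTATC",
    "TGAGGTAGTAGGTTGTATAGTT",
]

read_counts = {}
for sequence in reads:
    # get() returns 0 the first time a sequence is seen, so no KeyError.
    read_counts[sequence] = read_counts.get(sequence, 0) + 1

print(read_counts["TGAGGTAGTAGGTTGTATAGTT"])        # 2
print(read_counts.get("TCCCTGAGACCTCAAGTGTGA", 0))  # 0 (never seen)
```

The same `get` call with a default of 0 covers the lookup in step 4: a miRNA whose sequence never appears in the library gets a count of 0 instead of raising an error.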

```
#input_fastq_file:
@D64TDFP1:248:C50DMACXX:5:1101:1241:2095 1:N:0:ATCACG
TGAGGTAGTAGGTTGTATAGTT
+
CCCFFFFFHHHHHJIJGHJJJJI
@D64TDFP1:248:C50DMACXX:5:1101:1371:2154 1:N:0:ATCACG
TCAATATTTGCATAGGGTATC
+
CCCFFFFFHHHHHJJJJGFHI
```

```
#input_fasta_file:
>cel-let-7
TGAGGTAGTAGGTTGTATAGTT
>cel-lin-4
TCCCTGAGACCTCAAGTGTGA
```

```
#output_file:
let-7	1
lin-4	0
```
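
Putting the five steps together, one way the function might look is sketched below. It rests on two assumptions that the assignment does not state: every fastq record is exactly four lines with the sequence on the second line, and every fasta sequence sits on a single line directly after its `>` header. It also writes the miRNA name exactly as it appears in the header (for example `cel-let-7`); the sample output above shows shorter names, which would need an extra step not shown here.

```python
def miRNA_counter(input_fastq_file, input_fasta_file, output_file):
    """Write 'miRNA name <tab> read count' for each miRNA in the fasta file."""
    # Steps 1-2: count how many times each read sequence occurs.
    read_counts = {}
    with open(input_fastq_file) as fastq:
        for line_number, line in enumerate(fastq):
            # Assumption: four lines per record, sequence on the second line.
            if line_number % 4 == 1:
                sequence = line.strip()
                read_counts[sequence] = read_counts.get(sequence, 0) + 1

    # Steps 3-5: look up each miRNA sequence and write its count.
    with open(input_fasta_file) as fasta, open(output_file, "w") as out:
        name = None
        for line in fasta:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                name = line[1:]  # header line: remember the miRNA name
            elif name is not None:
                # Assumption: the whole sequence is on this one line.
                count = read_counts.get(line, 0)
                out.write(name + "\t" + str(count) + "\n")
                name = None
```

With the course files in the working directory, a call might look like `miRNA_counter("small_RNAs.fastq", "c_elegans_miRNAs.fa", "miRNA_counts.txt")`, where the output file name is a placeholder.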