# Computational biology at CSU

## DSCI 512: RNA-seq

## Exam (100 pts)

INSTRUCTIONS

You cannot use any modules that have to be imported into your code.

Upload a single file named `exam_yourname.py` containing your functions. Any content outside the functions must be commented out (#).

1. [20 pts] Write a function, `nt_content(sequence)`, that calculates the proportion of each nucleotide (A, C, G, T) in a sequence. For example, the sequence AATTGCCCCC would return A: 0.2, C: 0.5, G: 0.1, T: 0.2

2. [20 pts] Write a function, `character_count(input_file)`, that returns the number of characters (all characters including white space, except new lines) in a file. The return value should be a single number. The function should work with any size file (i.e. don't read the entire file into memory). The function should exit gracefully if the file can't be opened and should not leave the file open.

For example:

```
>seq1
ATG
>seq2
CTA
```

would return 16 (5 + 3 + 5 + 3 characters; the newlines are not counted).
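As a study aid, here is a minimal sketch of one way question 2 could be approached. It is not an official solution. It assumes that "exit gracefully" means printing a message and returning `None`, which the exam does not specify, and it reads the file line by line, so only one line is held in memory at a time.

```python
def character_count(input_file):
    """Count every character in a file except newline characters."""
    total = 0
    try:
        # 'with' closes the file even if an error occurs while reading.
        with open(input_file) as handle:
            # Iterating over the handle yields one line at a time,
            # so the whole file is never loaded into memory.
            for line in handle:
                # Drop the trailing newline, keep all other white space.
                total += len(line.rstrip("\n"))
    except OSError:
        # Assumption: a graceful exit is a message plus a None return value.
        print("Could not open file:", input_file)
        return None
    return total
```

Note that a file consisting of one extremely long line would still be read as a single line; reading fixed-size chunks with `handle.read(size)` would be one way around that.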

3. Write a function, `matrix_mean(matrix)`, that returns the mean value for each column in a matrix as a list. For example, the matrix [[1,2,3], [5,6,7]] would return [3, 4, 5].

4. Write a function, `element_finder(some_list, some_element)`, that returns a dictionary with a single key-value pair giving the number of occurrences of a particular element (`some_element`) in a list (`some_list`). For example, element_finder(['ATG', 'TAG', 'TTT', 'TAG', 'TTT'], 'TTT') would return {'TTT': 2}.

5. Write a function, `motif_finder(sequence, motif)`, that returns the number of occurrences of the substring motif in the string sequence. For example, given the sequence 'CCCGAGCCCGAGGAGCCC' and the motif 'GAG', the function would return 3.
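For review, the remaining questions could be sketched along the following lines, using no imports. These are illustrative attempts, not an answer key, and they rest on a few assumptions the exam leaves open: `nt_content` returns a dictionary and divides by the full sequence length, `matrix_mean` expects a rectangular list of lists, and `motif_finder` counts overlapping matches (the built-in `str.count` counts non-overlapping ones; both give 3 for the example above).

```python
def nt_content(sequence):
    """Proportion of A, C, G and T in a sequence, as a dictionary."""
    sequence = sequence.upper()
    length = len(sequence)
    proportions = {}
    for nt in "ACGT":
        # Guard against division by zero for an empty sequence.
        proportions[nt] = sequence.count(nt) / length if length else 0.0
    return proportions


def matrix_mean(matrix):
    """Mean of each column of a rectangular list-of-lists matrix."""
    if not matrix:
        return []
    n_rows = len(matrix)
    means = []
    for col in range(len(matrix[0])):
        column_total = 0
        for row in matrix:
            column_total += row[col]
        means.append(column_total / n_rows)
    return means


def element_finder(some_list, some_element):
    """Single-entry dictionary: the element mapped to its count in the list."""
    return {some_element: some_list.count(some_element)}


def motif_finder(sequence, motif):
    """Count occurrences of motif in sequence, overlapping ones included."""
    if not motif:
        return 0
    count = 0
    # Slide a window of len(motif) along the sequence.
    for start in range(len(sequence) - len(motif) + 1):
        if sequence[start:start + len(motif)] == motif:
            count += 1
    return count
```

If only non-overlapping matches should be counted, `sequence.count(motif)` would do the job in a single line.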