# Computational biology at CSU


## Exam (100 pts)

INSTRUCTIONS

You cannot use any modules that have to be imported into your code.

Upload a single file named exam_yourname.py containing your functions. Any content outside of the functions must be commented out (#).

1. [20 pts] Write a function, `gc_content(sequence)`, that calculates the proportion of nucleotides in a sequence that are either G or C. For example, the sequence ATGC would return 0.5.
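One possible sketch of `gc_content` follows; it is an illustration, not an official answer key. Two behaviors are assumptions not stated in the question: lowercase bases are counted, and an empty sequence returns 0.0 instead of raising a division error.

```python
def gc_content(sequence):
    """Return the proportion of nucleotides in `sequence` that are G or C."""
    # Assumption: an empty sequence has a GC content of 0.0
    # (the question does not say what should happen here).
    if not sequence:
        return 0.0
    # Assumption: treat lowercase and uppercase bases the same way.
    seq = sequence.upper()
    gc_total = seq.count("G") + seq.count("C")
    return gc_total / len(seq)
```

Calling `gc_content("ATGC")` counts one G and one C out of four bases.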

2. [20 pts] Write a function, `word_count(input_file)`, that returns the number of lines and characters (all characters including white space, except new lines) in a file. The return value should be a tuple. The function should work with any size file (i.e. don't read the entire file into memory). The function should exit gracefully if the file can't be opened and should not leave the file open.

For example:

```
>seq1
ATG
>seq2
CTA
```

would return (4, 16) (i.e. 4 lines and 16 characters).
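A minimal sketch of `word_count` is shown here as one way to meet the requirements, not as the official solution. It iterates over the file one line at a time so the whole file is never held in memory, and the `with` block closes the file even if an error occurs. Returning `None` when the file cannot be opened is an assumption; the question only asks for a graceful exit.

```python
def word_count(input_file):
    """Return (number_of_lines, number_of_characters) for a file.

    Newline characters are not counted; all other characters are.
    """
    try:
        handle = open(input_file)
    except OSError:
        # Assumption: "exit gracefully" is interpreted as returning None.
        return None

    lines = 0
    chars = 0
    # The with block guarantees the file is closed when the loop ends.
    with handle:
        # Iterating over the handle reads one line at a time.
        for line in handle:
            lines += 1
            chars += len(line.rstrip("\n"))
    return (lines, chars)
```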

3. [20 pts] Write a function, `matrix_mean(matrix)`, that returns the mean value for each row in a matrix as a list. For example, the matrix [[1,2,3], [5,6,7]] would return [2, 6].
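A short sketch of `matrix_mean`, offered as one possible approach. It assumes the matrix is a list of non-empty lists of numbers; an empty row would raise a `ZeroDivisionError`.

```python
def matrix_mean(matrix):
    """Return a list holding the mean of each row of `matrix`."""
    means = []
    for row in matrix:
        # Assumption: every row has at least one value.
        means.append(sum(row) / len(row))
    return means
```

In Python 3 the division produces floats, so the example matrix yields values that compare equal to `[2, 6]`.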

4. [20 pts] Write a function, `element_counter(some_list)`, that returns a dictionary of each unique element and the corresponding number of occurrences of the element in the list. For example, the list ['ATG', 'TAG', 'TTT', 'TAG', 'TTT'] would return {'ATG': 1, 'TAG': 2, 'TTT': 2}.
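One way to sketch `element_counter` without importing anything is to build the dictionary by hand, using `dict.get` to supply a count of 0 the first time an element is seen. This is an illustration and assumes the list elements are hashable (strings, numbers, tuples).

```python
def element_counter(some_list):
    """Return a dict mapping each unique element to its number of occurrences."""
    counts = {}
    for element in some_list:
        # get() returns 0 when the element has not been seen yet.
        counts[element] = counts.get(element, 0) + 1
    return counts
```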

5. [20 pts] Write a function, `motif_coordinates(sequence, motif)`, that returns the start and end position of the first occurrence of a motif within a sequence as a tuple. Report positions counting from 1, not from Python's first index of 0. For example, given the sequence ATGCTGTTAGCAG and the motif CAG, the function would return (11, 13).
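A possible sketch of `motif_coordinates` uses `str.find`, which returns the 0-based index of the first match or -1 when there is none. Converting to 1-based coordinates means adding 1 to the start, while the inclusive end is the 0-based start plus the motif length. Returning `None` for a missing or empty motif is an assumption, since the question does not cover those cases.

```python
def motif_coordinates(sequence, motif):
    """Return the 1-based (start, end) of the first occurrence of `motif`."""
    index = sequence.find(motif)  # 0-based start, or -1 if not found
    # Assumption: return None when the motif is empty or absent.
    if not motif or index == -1:
        return None
    start = index + 1          # shift from 0-based to 1-based
    end = index + len(motif)   # inclusive 1-based end position
    return (start, end)
```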

exam2.txt · Last modified: 2018/11/05 09:24 by dokuroot