Introduction to Computational Biology


Comp167, Spring 2016

Prof. Donna Slonim


slonim_AT_cs.tufts.edu

Monday and Wednesday, 1:30-2:45pm, Halligan Room 111B

http://www.cs.tufts.edu/~slonim

Office hours: Halligan 107B, Mondays 3:00-4:30; Fridays, 11-12; or by appointment.

TA office hours: Halligan 107, Tuesdays, 4-5:30 or by appointment.

Piazza site




Tentative Schedule:

DATE TOPICS READING OPTIONAL READING
Mon., Jan. 25 Class overview and administrivia.
Introduction to sequences and sequence comparison.
Syllabus handout.
Zvelebil & Baum (ZB): Chapter 1 and Section 4.1
For CS students new to biology: Larry Hunter's article, Molecular Biology for Computer Scientists.
For BME students or others with less formal CS background: either Cormen, Leiserson, Rivest and Stein Chapters 2 + 3, or Jones and Pevzner, Chapter 2: Big O notation, NP-completeness.
Weds., Jan. 27 Sequence alignment:
Global alignment. Dynamic programming. Local alignment.
ZB: Sections 4.2, 4.5 (pp. 87-89 only); 5.2 Global alignment: Durbin, pp. 17-22.
Local alignment: Durbin, pp. 23-24, 29-30
Mon., Feb. 1 Sequence alignment: gaps, scoring matrices ZB: Sections 4.3, 4.4, 5.1
Weds., Feb. 3 Database search, BLAST, FASTA
Significance of alignment scores.
ZB: 4.6-4.7, 5.3-5.4 Altschul's tutorial on statistics of sequence similarity scores. Warren Gish's webpage on information theory and alignment scoring statistics.
Mon., Feb. 8 Multiple sequence alignment Ron Shamir's MSA notes ZB: 4.5 (pp. 90-93), 6.4-6.5; Durbin, 6.1-6.4
Weds., Feb. 10 DNA motifs, profiles. Gibbs sampling. ZB: 6.1, 6.6, short paper on EM algorithms Original paper on the Gibbs sampler for local multiple alignment
Mon., Feb. 15 NO CLASS: Tufts Holiday
Weds., Feb. 17 Gene finding, intro to HMMs ZB: 9.2-9.7; 10.2-10.8 Rabiner handout, pp. 257-266.
Thurs., Feb. 18 (Tufts Monday schedule) Finish HMMs and their use in gene finding.
ZB: 4.8-4.9; 9.2-9.7; 10.2-10.8 Durbin: chapter 3
Mon., Feb. 22 Profile HMMs; introduction to sequence assembly ZB: 6.2, Nagarajan and Pop's Sequencing overview Eric Green's historical review article on genomic sequencing methods
Weds., Feb. 24 Sequence assembly, overlap graphs and suffix trees GAGE: Evaluating short-read assemblies; this will be useful in completing homework 3. Schuster's review article on next generation sequencing; Mardis' more detailed article about next generation sequencing technologies. The paper about the SOAPdenovo assembler.
Mon., Feb. 29 Sequence assembly, de Bruijn graphs and Eulerian paths ZB: 5.3
Weds., Mar. 2 EXAM 1
Mon., Mar. 7 Gene expression: technology, normalization, detecting differential expression ZB: 15.1, 16.1 Slonim review article
Weds., Mar. 9 Gene expression: gene set analysis methods, the Gene Ontology, functional enrichment ZB: 16.4 Gene Set Enrichment Analysis
Mon., Mar. 14 Gene expression: clustering and classification ZB: 16.2-16.3, 16.5 Golub and Slonim et al., on leukemia classification
Weds., Mar. 16 Introduction to phylogeny ZB: chapter 7 Mona Singh's phylogeny notes
Mon., Mar. 21 NO CLASS: SPRING BREAK
Weds., Mar. 23 NO CLASS: SPRING BREAK
Mon., Mar. 28 Phylogeny ZB: 8.1-8.4
Weds., Mar. 30 Protein interaction networks Alm and Arkin review of biological networks Yu, et al., on bottlenecks in protein networks; Przytycka, Singh, and Slonim review of network dynamics.
Mon., Apr. 4 Networks and systems biology ZB: chapter 17 or TBA
Weds., Apr. 6 EXAM 2
Mon., Apr. 11 (Daniels) Big data and Compressive BLAST
Weds., Apr. 13 (Daniels) Sublinear search techniques for big data
Mon., Apr. 18 NO CLASS: Patriots' Day
Weds., Apr. 20 Introduction to protein structure prediction ZB: chapter 2; 11.1, 11.4-11.5
Mon., Apr. 25 Predicting secondary and super-secondary structure, evaluation ZB: 13.2-13.5
Weds., Apr. 27 Project presentations
Mon., May 2 Project presentations

Course Description

No biology background required! This is a computer science elective aimed at upper-level undergraduates and graduate students. It offers students a chance to explore the practical computer science challenges arising from real-world applications.

In this course, students will develop an understanding of the key computational challenges in molecular biology, or any field in which the onslaught of data requires sophisticated algorithms and data structures for scalability. We will discuss algorithms used to solve some of these problems, and we will introduce ongoing areas of research in the fields of bioinformatics and computational biology. Grading will be based on homework assignments (both written and computer-based), two in-class exams, and a written course project. Students will also be expected to contribute to class discussion and group activities, to do the assigned reading, and to read supplementary background materials as they find necessary.

Course Staff:

We are likely to have some guest lecturers this semester, including:

The graduate teaching assistant for the course is Mengfei Cao, mengfei.cao _AT_ tufts.edu.
TA office hours: Halligan 107, time TBA, tentatively Tues., 4-5:30 and Sunday afternoons as needed.

Course Requirements:


Policies



Course Materials

For homeworks, slides, and other class information, go to the private course materials page.
Last updated March 1, 2016