Comp167, Fall 2008
Prof. Donna Slonim
Introduction to Computational Biology
slonim_AT_cs.tufts.edu
Tuesdays and Thursdays, 10:30-11:45am, Halligan 111B
Office hours: Halligan-234, Tuesdays and Thursdays, 1:30-2:30pm

| DATE | TOPICS | READING | OPTIONAL READING | ||
| Sept. 2 | Class Introduction. | Syllabus handout. For CS students: Larry Hunter's article, Molecular Biology for Computer Scientists. For BME students: either Cormen, Leiserson, Rivest and Stein, Chapters 2 and 3, or Jones and Pevzner, Chapter 2. | Mount, Chapter 1. | ||
| Sept. 4 | Sequence alignment I: global alignment. Dynamic programming | Mount, Chapter 3, especially pp. 70-94. | Global alignment: Durbin, pp. 17-22. | ||
| Sept. 9 | Sequence alignment II: local alignment, scoring matrices. | Mount, pp. 94-112 and 240-258. | Local alignment: Durbin, pp. 23-24, 29-30 | ||
| Sept. 11 | Database searching. Significance of sequence alignment scores. | Altschul's tutorial on statistics of sequence similarity scores. | Warren Gish's webpage on information theory and alignment scoring statistics. Pevsner, Chapter 2. | ||
| Sept. 16 ADD DATE | Online bioinformatics resources. Multiple Sequence Alignment: uses, scoring, optimal methods, progressive and iterative alignment methods | Mount, Chapter 5, pp. 163-189. | Durbin, 6.1-6.4 | ||
| Sept. 18 | Genome sequencing background and technology; overlap detection | Mount, pp. 33-41. Eric Green's review article on genomic sequencing methods | The sequence assembler Arachne and a follow-up paper about it. | ||
| Sept. 23 | Genome Sequencing methods | Eric Green's review article on genomic sequencing methods | Setubal and Meidanis, Chapter 4. | ||
| Sept. 25 | Mapping algorithms | ||||
| Sept. 30 | Introduction to Gene Finding | Mount, pp. 361-385. | |||
| Oct. 2 | QUIZ 1 | ||||
| Oct. 7 DROP and PASS/FAIL DATE | Hidden Markov Models | Rabiner handout, pp. 257-266. | Durbin, Chapter 3 | ||
| Oct. 9 | Gene Finding with HMMs; DNA regulatory motifs. PROJECT PROPOSAL DUE | Mount, pp. 384-400 | References describing a prokaryotic gene finder, EasyGene, and GenScan. | ||
| Oct. 14 | Protein motifs, profile HMMs | Mount, pp. 189-215 | |||
| Oct. 16 | Phylogeny I | Mount, Chapter 7 | Durbin, Chapter 7 | ||
| Oct. 21 | Phylogeny II | Mount, Chapter 7 | Mona Singh's phylogeny notes | ||
| Oct. 23 | Gene Expression Microarrays: clustering, classification, functional analysis | Mount, Chapter 13 | Golub, et al., on leukemia classification, Pomeroy, et al., on medulloblastomas, and Furey, et al., on support vector machines | ||
| Oct. 28 | Gene Expression Microarrays: differential expression and clustering | Slonim review article | Alizadeh, et al., on lymphoma | ||
| Oct. 30 | Gene Expression Microarrays: technology, normalization | Quackenbush review: cDNA normalization; and Bolstad, et al., on oligo normalization; paper on normalization method comparisons | |||
| Nov. 4 | RNA structure prediction (?) | ||||
| Nov. 6 | QUIZ 2 | ||||
| Nov. 11 | NO CLASS (Veterans Day observed) | ||||
| Nov. 13 | Protein Structure Prediction | Mount, Chapter 10 | Pevsner, Chapter 9; Rick Lathrop's threading review | ||
| Nov. 18 | Proteomics. PROJECT UPDATE DUE (= emailed progress report) | ||||
| Nov. 20 | Systems Biology I | ||||
| Nov. 25 | Systems Biology II | TBA | |||
| Nov. 27 | NO CLASS (Thanksgiving holiday) | ||||
| Dec. 2 | Student presentations | ||||
| Dec. 4 | Student presentations | ||||
| Dec. 8 | Last Day of Classes. TERM PAPER / PROJECT DUE | ||||

In this course, students will develop an understanding of the key computational challenges in molecular biology. We will discuss algorithms used to solve some of these problems, and we will introduce ongoing areas of research in the fields of bioinformatics and computational biology. Grading will be based on homework assignments, two in-class quizzes, a written course paper/project, and a project presentation. Students will also be expected to contribute to class discussion and to read supplementary background materials as they find necessary.
Course Requirements:
Prerequisites: Comp15 and at least one 100-level computer science class, or permission of the instructor. Graduate standing in computer science or a related field may be sufficient with no further prerequisites; check with the instructor. Comfort writing programs in some language is essential, as we will be implementing some of the algorithms we discuss. In the past, students have successfully used Perl, Python, Java, C, and C++. If you have another preference, please discuss your choice of language with the TA.
Readings: The course textbook is Bioinformatics: Sequence and Genome Analysis, Second Edition, by David Mount, published by Cold Spring Harbor Laboratory Press. Softcover versions are available in the Medford campus bookstore. If you'd rather have the hardcover edition (expensive but more durable), you can order it online. Note that the First Edition of the book has different page numbers for the readings and does not cover some topics (such as microarrays) at all.
The textbook was chosen because it covers, in some detail, most of the material we will discuss in the class. Readings from this text are listed in the syllabus where appropriate. For individual topics, however, other textbooks may be better, so the syllabus also lists supplementary readings from some of the recommended textbooks below. If you have no biology background, you may also want to supplement the readings with a good introductory molecular biology text. (Several are available online via PubMed/Books; we'll discuss how to access these in an early lecture.)
Other recommended books:
Computational resources: You will need access to a computer with an internet connection and support for whatever programming language / tools you intend to use. If you need help in obtaining resources for this purpose, please contact the instructor or teaching assistant as soon as possible.
The typical review paper will involve selecting a topic and writing a thorough survey and critical analysis that illustrates your depth of understanding of that topic. The paper should present the details of the topic as if you were teaching the material to your classmates, and indeed, you will then present some of this material to the class at the end of the term. The paper should cite appropriate primary literature, and should include clear original figures and/or tables to help explain the material. Typical papers will be 5-10 pages in length, excluding the bibliography. You are not required to do any original research or programming for this option, though you are welcome to include a section on future work that outlines a project you might wish to pursue. However, a
Possible topics include (but are not limited to):
Choosing a term project is most appropriate if you already know something about the topic you want to pursue, or if you start to study an area and find that you can think of a better way to solve the problem than anyone has yet designed. Generally, projects will involve either writing a program that would be useful for a biomedical research project, designing and testing a new method of solving an existing problem, or testing a biological hypothesis by a novel data analysis. In the past, successful term projects have included such topics as:
This project must be accompanied by a written report describing the problem, the solution or the results obtained, and the implications of the work. Because the term project usually spans only a few weeks, a description of future work will be an important component of the written report, whether or not you actually choose to pursue the project any further.
All sources used should be cited. In other words, if you discuss a homework problem with a classmate, you should list that classmate as one of your references for that problem. A special note about finding solutions on the web: be warned that not everything you read online is correct. Even data from supposedly reputable sources, such as slides posted by faculty at this or other universities, may not have been reviewed by an editor and might contain crucial typos. For this reason, I'd like to discourage you from using Google to tackle the problem sets, but if you choose to do so, you must cite the URL(s) that you used. Directly copying text from any source without attribution is plagiarism and will be dealt with accordingly.
Internet Resources for Computational Biology and Bioinformatics
Genome Browsers / Viewers:
You can generally get access to these resources (often online) and others through Tisch library or the Health Sciences Library. Bibliographic search tools such as PubMed, CiteSeer, MedMiner, PubGene and Chilibot are also your friends... Consider signing up for some of the table of contents alert services, too.
Note that online access to some journals may be limited to computers within the tufts.edu domain. You can get at these from other domains by going through the Tisch Library page to its electronic journals page, searching for the journal of choice, and then typing in your Tufts ID number.
You can find a useful calendar of bioinformatics events at the ISCB website. [By the way, I encourage you to join the ISCB -- it's quite cheap for grad students and cost-effective if you go to one conference or subscribe to one journal.]