WCBMF4761-1: Computational Genomics for Spring 2022
Days and Time
Mondays and Wednesdays 5:40 PM-6:55 PM
Location
833 MUDD
Allowed For:
- Undergraduate
- Masters
- Professional
- PhD
Prerequisites:
The class is geared toward graduate and advanced undergraduate students. It is offered by the Department of Computer Science and assumes students are independent programmers. The class is interdisciplinary and, in addition to computer science students, encourages participation of students with a primarily biological, rather than computational, background. It assumes working knowledge of probability and statistics; students are offered extra introductory work to take on if they feel their background requires a refresher. The class assumes no background knowledge in biology.
Notes:
None
Instructor:
Pe'er, Itsik G
Description
Disruptive technologies in genomics continuously drive an exponential increase in data throughput that outpaces available computing resources and therefore requires sophisticated analysis methods. The class is designed to provide a broad introduction to computational methods used in genomics, bringing students to the level of carrying out a project in the field.
The class syllabus should be viewed from a dual perspective. In terms of genomics applications, the class covers high-throughput ("Next Generation") sequencing of genomic DNA (mapping, calling, de novo assembly, structural variants), RNA (transcript modeling, quantification) and functional assays (peak calling), homology and evolution, basic statistical and population genetics, and, depending on students' interest, some topics chosen from natural selection, ancient DNA, microbiome analysis, personal genomes and other offerings. In terms of computational methodology, tools covered in depth include error-tolerant hashing, indexing using the Burrows-Wheeler transform, de Bruijn graphs, hypothesis testing, dynamic programming, hidden Markov models, and coalescent theory.
90% of the grade is split equally (30% each) among (1) weekly problem sets up to a late midterm, (2) the midterm itself, and (3) a final project that includes presentation and reporting tasks throughout the semester (replacing the problem sets in some weeks). The ideal team size for the project is 2, with experience showing that interdisciplinary teams often come up with the best work. If the class size does not allow all projects to be presented, a separate section will be scheduled. The remaining 10% of the grade is for class participation.
Bibliography:
No books are required. Students often find one or both of the following useful:
Durbin et al., Biological Sequence Analysis, Cambridge, 1998, ISBN: 9780521629713
Compeau & Pevzner, Bioinformatics Algorithms: An Active Learning Approach, Active Learning Publishers, 2014, ISBN: 099374602