CS 6998/EE 6898/EECS 6898: Topics - Information Processing: From Data to Solutions, Fall 2012
Time: Friday, 1:10-3:00pm
Place: CSB 453 (CS Conference Room)
Professors
Julia Hirschberg (Office Hours: TBD) julia_AT_cs.columbia.edu, 212-939-7114
Shih-fu Chang (Office Hours: TBD) sfchang_AT_ee.columbia.edu, 212-854-6894
Teaching Assistant TBD
Description
This course is designed for participants in the NSF IGERT program "From Data to Solutions". Students in the seminar may be IGERT Trainees, IGERT Affiliates, or other students who have the permission of one of the instructors. The course will consist of a series of presentations by faculty and staff at Columbia and CUNY, who will describe interesting problems involving very large amounts of data (text, audio, image, video) that require interdisciplinary collaboration among faculty and students in Computer Science, Electrical Engineering, Statistics, Psychology, Biomedical Informatics, Business, and Journalism. Students taking the course will complete short reading assignments for each class, turn in one-page reports on each of the presentations, and, as a final project, prepare a longer report on one of the problems presented. Experimental implementations are welcome but not mandatory. Some proposed projects may be selected and invited to continue in the following semester or summer under the supervision of the instructors or other participating faculty or researchers from industry. There are no prerequisites for the course and no exams; however, students who are members of the IGERT: From Data to Solutions project (Trainees and Affiliates) will have preference in enrollment. This is a required course for IGERT Trainees.
Requirements/Assignments
Students will be expected to complete all reading assignments before the class for which they are assigned. Students will prepare short reports on each of the presentations. These must be submitted in CourseWorks before the following class. Each student will prepare a longer report outlining an approach to one of the interdisciplinary problems described in the presentations. There will be no midterm or final exam. Grades will be based on class participation, the weekly short reports, and the final report.
A guide to weekly reporting can be found here.
For information on final reports and presentations, follow the links in the listing for the last session of the course, below.
Grading
Class participation: 30%
Short reports: 30%
Final report: 40%
Academic Integrity
Readings
Required readings are available online from links in the syllabus below.
Announcements
Resources
Syllabus
Date | Topic | Readings | Presenters |
---|---|---|---|
Week 1 (9/7) | Disruption and Resurrection: New Models for Saving Narrative Journalism | Stories, optional: Shirky12, Fallows10 | Michael Schapiro (Journalism) and David Elson (Google) |
Week 2 (9/14) | Mining Audio | Ellis&Lee06 | Dan Ellis (Electrical Engineering) |
Week 3 (9/21) | Statistical Machine Translation (Slides1, Slides2, Slides3, Slides4) | Knight97 | Michael Collins (Computer Science) |
Week 4 (9/28) | On a road to personalized therapy, understanding cancer through mining big data: A lesson on heterogeneity | Akaviaetal12 | Dana Pe'er (Biology) |
Week 5 (10/5) | Mine Your Own Business | Netzeretal12 | Oded Netzer (Business) |
Week 6 (10/12) | Workshop on Computational and Online Social Science. This is an all-day workshop. Please register at http://caossnyc.org/registration if you are not already registered. | | |
Week 7 (10/19) | Research Methods and Design | Gleitmanetal11 | Michelle Levine (Psychology) |
Week 8 (10/26) | Observational Studies in Big Healthcare Data: Are They Any Good? | Ioannidis05 | David Madigan (Statistics), Tony Jebara (Computer Science), and Chris Wiggins (APAM) |
Week 9 (11/2) | Data-Intensive Science: Methods for Reproducibility and Dissemination of Computational Results | Donohoetal09; Roundtable10; Stoddenetal10 | Victoria Stodden (Statistics) |
Week 10 (11/9) | Redaction and Declassification of Government Archives | Trachtenberg, Burr, RFK papers (optional) | Matt Connelly (History) |
Week 11 (11/16) | Inferring Gold Standards from Multiple, Noisy Annotations | | Bob Carpenter (Statistics) |
Week 12 (11/30) | Learning from electronic health records and online health communities | Hripcsak&Albers12 | Noemie Elhadad and George Hripcsak (Biomedical Informatics) |
Week 13 (12/7) | Introduction to University Tech Transfer and Patents 101 | UnsoldPatents, PatentasSword, SmartphoneDeals, CTVFAQ, ToPromoteInnovation (Ch1, pp. 1-2 (and footnote 7), pp. 4-7; 9-12; 26-27; 31-35; 37 (heading 2)) and suggested (Ch2, 3-8 (start with I. to end of section), Ch3, 1-2, 30-33 (start with IV to end of D)) | Orin Herskowitz (CTV) and Jeff Sears (OGC) |
Finals week (12/14) 1:10-4pm | Final Reports due | | |