Summary
It is very hard to hand-design programs to solve many real-world problems, e.g. distinguishing images of cats from dogs. Machine learning algorithms allow computers to learn from example data and produce a program that does the job. Neural networks are a class of machine learning algorithms originally inspired by the brain, which have recently seen a lot of success in practical applications. They are at the heart of production systems at companies like Google and Facebook for image processing, speech-to-text, and language understanding. This course gives an overview of both the foundational ideas and the recent advances in neural net algorithms.
Roughly the first 2/3 of the course focuses on supervised learning --- training the network to produce a specified behavior when one has lots of labeled examples of that behavior. The last 1/3 covers unsupervised learning and reinforcement learning.
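To make the supervised setting concrete, here is a minimal sketch of a supervised training loop in PyTorch (the framework introduced in Tutorial 2). The dataset, network size, and hyperparameters below are illustrative placeholders, not course materials.

```python
import torch
import torch.nn as nn

# Hypothetical labeled data: 100 examples, 4 features each, 3 classes.
X = torch.randn(100, 4)
y = torch.randint(0, 3, (100,))

# A small multilayer perceptron (the model class of Lecture 2).
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for epoch in range(100):
    optimizer.zero_grad()        # reset gradients from the previous step
    loss = loss_fn(model(X), y)  # mismatch between predictions and labels
    loss.backward()              # backpropagation (Lecture 2)
    optimizer.step()             # gradient-descent update (Lecture 4)
```

Each update nudges the weights to better reproduce the labeled behavior; the unsupervised and reinforcement learning methods in the last third of the course replace these explicit labels with other training signals.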
Logistics
Time | MW 2:40-3:55
---|---
Location | CSB 451
Instructor | Richard Zemel <zemel[at]cs.columbia.edu>
Teaching Assistants | Arpit Prashant Bahety <ab5232[at]columbia.edu>, Xudong Lin <xudong.lin[at]columbia.edu>, Sachit Menon <sachit.menon[at]columbia.edu>, Anusha Misra <am5684[at]columbia.edu>
Office Hours |
Lectures & Tutorials
Lecture 1 Introduction & Linear Models
Sep 13, 15
Readings: Roger Grosse's notes: Linear Regression, Linear Classifiers, Training a Classifier
Tutorial 1 Multivariable Calculus Review
Sep 15
iPython notebook. You can view the notebook via Colab.
Lecture 2 Multilayer Perceptrons & Backpropagation
Sep 20, 22
Readings: Roger Grosse's notes: Multilayer Perceptrons, Backpropagation
Tutorial 2 Autograd and PyTorch
Sep 15
iPython notebook. You can view the notebook via Colab.
Lecture 3 Automatic Differentiation & Distributed Representations
Sep 27, 29
Readings: Roger Grosse's notes: AutoDiff, Distributed Representations
Lecture 4 Optimization
Oct 4
Readings: Roger Grosse's notes: Optimization
Lecture 5(a) Convolutional Neural Networks, Part I
Oct 6
Tutorial 3 Neural Network Training
Oct 6
Lecture 6 Generalization
Oct 18
Readings: Roger Grosse's notes: Generalization
Lecture 7 Interpretability
Oct 20
Tutorial 4 Best Practices in CNNs
Oct 20
Lecture 8 Recurrent Neural Networks
Oct 27
Readings: Roger Grosse's notes: RNNs, Exploding and Vanishing Gradients
Related Papers: Long Short-Term Memory, Neural Machine Translation, Show, Attend and Tell
Tutorial 5 Recurrent Neural Networks
Nov 3
Lecture 9 Attention
Nov 8
Related Papers: Show, Attend and Tell
Lecture 10 Transformers
Nov 10
Related Papers: Transformers, BERT Pre-training
Tutorial 6 Natural Language Processing
Nov 10
Lecture 11 Generative Modelling: Autoregressive Models and GANs
Nov 15
Related Papers: Pixel RNNs, WaveNet, Pixel CNNs, Generative Adversarial Nets, Cycle GANs
Assignments
Exams
Projects
Prerequisites
To be successful in this course, you should have a strong knowledge of the following subjects (as covered in a previous undergraduate course):
- Machine Learning
- Multivariable Calculus
- Linear Algebra
- Probability & Statistics
Grading
Component | Weight
---|---
Assignments | 36%
Midterm (Oct 13, in class) | 16%
Project (due Mon Dec 13) | 20%
Final (Wed Dec 22, 1-2:30) | 28%
Attendance and participation | Bonus