Generative vs. Discriminative classifiers, Nearest Neighbor classifier, Coping with drawbacks of k-NN, Decision Trees, Model Complexity and Overfitting
[lec 2]
notes/reading: NN and k-d trees | decision trees | CML 1.3, 2.2-2.5, 3.1-3.2
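For orientation, a minimal k-nearest-neighbor sketch in Python/NumPy (an illustrative sketch, not course-provided code; the function name and conventions are my own):

    import numpy as np

    def knn_predict(X_train, y_train, x_query, k=3):
        # Distances from the query point to every training point.
        dists = np.linalg.norm(X_train - x_query, axis=1)
        # Labels of the k nearest training points.
        nearest = y_train[np.argsort(dists)[:k]]
        # Majority vote among those labels.
        labels, counts = np.unique(nearest, return_counts=True)
        return labels[np.argmax(counts)]

A k-d tree (see the notes above) is one way to cope with the main drawback of this sketch: the linear scan over X_train at prediction time.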
Topic 3
Decision boundaries for classification, Linear decision boundaries (Linear classification), The Perceptron algorithm, Coping with non-linear boundaries, Kernel feature transform, Kernel trick
[lec 3]
notes/reading: CML 4.1-4.8, 11.1-11.2
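As a quick reference for the Perceptron algorithm listed above, a minimal sketch (assuming labels in {-1, +1}; illustrative, not the course's reference implementation):

    import numpy as np

    def perceptron(X, y, max_epochs=100):
        # X: (n, d) feature matrix; y: labels in {-1, +1}.
        w, b = np.zeros(X.shape[1]), 0.0
        for _ in range(max_epochs):
            mistakes = 0
            for xi, yi in zip(X, y):
                # Update only on a misclassified point: w <- w + y*x, b <- b + y.
                if yi * (np.dot(w, xi) + b) <= 0:
                    w += yi * xi
                    b += yi
                    mistakes += 1
            if mistakes == 0:  # no mistakes in a full pass: converged
                break
        return w, b

The kernel trick from this lecture amounts to replacing the inner products above with kernel evaluations, which is what makes non-linear decision boundaries reachable without computing an explicit feature transform.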
Exam #1
Topic 4
Support Vector Machines, Large margin formulation, Constrained Optimization, Lagrange Duality, Convexity, Duality Theorems, Dual SVMs
[lec 4]
notes/reading: SVM basics | Lagrange Duality | CVX book | CML 7.7, 11.5-11.6
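For reference, the large-margin (soft-margin) formulation that the Lagrange-duality material is applied to, in standard form (a standard statement, not quoted from these notes):

    % primal
    \min_{w, b, \xi} \; \tfrac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i
    \quad \text{s.t.} \quad y_i (w^\top x_i + b) \ge 1 - \xi_i, \;\; \xi_i \ge 0

    % dual
    \max_{\alpha} \; \sum_{i} \alpha_i - \tfrac{1}{2} \sum_{i,j} \alpha_i \alpha_j y_i y_j \, x_i^\top x_j
    \quad \text{s.t.} \quad 0 \le \alpha_i \le C, \;\; \sum_{i} \alpha_i y_i = 0

Since the data enter the dual only through inner products x_i^\top x_j, those can be replaced by a kernel k(x_i, x_j), connecting back to the kernel trick of Topic 3.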
Topic 5
Regression, Parametric vs. non-parametric regression, Ordinary least squares regression, Logistic regression, Lasso and ridge regression, Optimal regressor, Kernel regression, Consistency of kernel regression
[lec 5]
notes/reading: Logistic regression | Kernel regression
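A minimal ordinary least squares sketch (assuming NumPy; illustrative rather than the course's implementation, and the intercept handling shown is just one common convention):

    import numpy as np

    def ols_fit(X, y):
        # Append a ones column so the model learns an intercept.
        A = np.hstack([X, np.ones((X.shape[0], 1))])
        # Solve min_w ||A w - y||^2; lstsq is numerically safer than
        # explicitly inverting A^T A in the normal equations.
        w, *_ = np.linalg.lstsq(A, y, rcond=None)
        return w  # last entry is the intercept

Ridge regression adds a penalty lambda * ||w||^2 to this objective and lasso uses lambda * ||w||_1 instead; both shrink the coefficients, but only the lasso drives some of them exactly to zero.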
Topic 6 (if time permits)
Statistical theory of learning, PAC-learnability, Occam's razor theorem, VC dimension, VC theorem, Concentration of measure
[lec 6]
notes/reading: PAC and VC tutorial | Intro. Learning Theory | CML 12.3, 12.5-12.6
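For the Occam's razor theorem listed above, the standard finite-class sample-complexity bound reads as follows (a standard statement, not quoted from the course notes): for a finite hypothesis class H and a learner that outputs a hypothesis consistent with the training sample,

    m \ge \frac{1}{\epsilon} \left( \ln |H| + \ln \frac{1}{\delta} \right)

samples suffice for the output hypothesis to have true error at most \epsilon with probability at least 1 - \delta. The VC theorem replaces \ln |H| with a term scaling in the VC dimension, extending the guarantee to infinite classes.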
There is no textbook for the course. You may find the books in the Resources section helpful as a complement to the topics covered in the lectures.
Resources
Books on ML
The Elements of Statistical Learning by Hastie, Tibshirani and Friedman (link)
Pattern Recognition and Machine Learning by Bishop (link)
A Course in Machine Learning by Daume (link)
Deep Learning by Goodfellow, Bengio and Courville (link)
Understanding Machine Learning by Shalev-Shwartz and Ben-David (link)
Prerequisites
Applied Probability: Events, discrete and continuous random variables, densities, expectations, joint, conditional, and marginal distributions, independence, the concepts of standard deviation, variance, covariance, and correlation, the law of large numbers, and the central limit theorem.
Applied Statistics: Bayes Rule, Priors, Posteriors, Maximum Likelihood Principle (MLE), Basic distributions such as Bernoulli, Binomial, Multinomial, Poisson, Gaussian. Multivariate versions of these distributions, especially Multivariate Gaussian Distribution.
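As a quick self-check on this background, the two statements used most often in the course, in the usual notation (my phrasing, not the course's):

    p(\theta \mid x) = \frac{p(x \mid \theta)\, p(\theta)}{p(x)}
    \qquad
    \hat{\theta}_{\text{MLE}} = \arg\max_{\theta} \prod_{i=1}^{n} p(x_i \mid \theta)

i.e., Bayes' rule relating posterior, likelihood, and prior, and the maximum likelihood estimate for i.i.d. data.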
Linear Algebra: Vector spaces, subspaces, matrix inversion, matrix multiplication, linear independence, rank, determinants, orthonormality, basis, solving systems of linear equations. Eigenvectors/values, Eigen- and Singular Value Decomposition. Identifying and working with popular types of matrices - e.g. symmetric matrices, positive (semi-) definite matrices, non-singular matrices, unitary matrices, rotation matrices, etc. Linear maps, fundamental subspaces (column space, row space, null space, left null space), operators, (orthogonal) projections.
Multivariate Calculus: Limits and sequences of functions. Taylor expansions and approximations. Derivatives and integrals of common functions; gradient, Jacobian, and Hessian; classification of stationary points; computing maxima and minima of common functions. Differentiation of vector-valued functions.
Mathematical maturity: Ability to communicate technical ideas clearly.
Basic algorithm design and analysis: Time and space complexity analysis, asymptotic notation (e.g., big-O, big-Ω, big-Θ), complexity analysis of iterative and recursive processes.
Basic data structures: Graphs, Trees, Lists, Tables. Basic representation, traversal, and analysis techniques on such data structures.
Programming: Ability to program in a high-level language of your choice, including familiarity with basic algorithm design, data structures, coding principles, and efficient data processing.
Students are expected to adhere to the Academic Honesty policy of the Computer Science Department; the full policy can be found here.
Violations
Violation of any portion of these policies will result in a penalty to be assessed at the instructor's discretion. This may include receiving a zero grade for the assignment in question and a failing grade for the whole course, even for the first infraction.