E6998 Robustness and Security in ML Systems

Spring 2020 -- Junfeng Yang

  • Location: MUD 327
  • Time: T 10:10am-12:00pm
  • Credits: 3
  • Instructor: Junfeng Yang
  • Address: 519 CSB
  • Office Hours: By appointment
  • TA: Vikram Nitin
  • Office Hours: TBD
  • Email: sysml-course@lists.cs. Best way to reach us.

Course Description

Over the past few years, Machine Learning (ML) has made tremendous progress, achieving or surpassing human-level performance on a diverse set of tasks including image classification, speech recognition, and playing games such as Go. These advances have led to widespread adoption and deployment of ML in security- and safety-critical systems such as self-driving cars, malware detection, and aircraft collision avoidance systems. This wide adoption of ML techniques presents new challenges, as the predictability and correctness of such systems are of crucial importance. Unfortunately, ML systems, despite their impressive capabilities, often demonstrate unexpected or incorrect behaviors in corner cases for several reasons, such as biased training data, overfitting, and underfitting of the models. In safety- and security-critical settings, such incorrect behaviors can lead to disastrous consequences such as a fatal collision of a self-driving car. For example, a Google self-driving car recently crashed into a bus because it expected the bus to yield under a set of rare conditions, but the bus did not. A Tesla car in autopilot crashed into a trailer because the autopilot system failed to recognize the trailer as an obstacle due to its “white color against a brightly lit sky” and the “high ride height.” Such corner cases were not part of Google’s or Tesla’s test set and thus never showed up during testing. Other examples include Microsoft’s Tay chatbot posting racist tweets after being misled by malicious Twitter users, and Google removing “gorilla” as an image class after its image classification algorithm incorrectly classified dark-skinned people as gorillas.

These challenges have drawn huge attention from researchers in the machine learning, security, systems, and programming languages communities. A number of techniques and theories have been proposed to increase the robustness and security of machine learning. In this course, we will study the most practical and most important of these techniques and theories, with a focus on deep learning. For details on the topics we'll cover, please see the Course Syllabus page.

Course Goal

The general goal of this course is to help you understand the challenges of making ML robust and secure, and the solutions proposed so far. This understanding will make you a more effective ML programmer or scientist. If you are interested in doing research in this emerging area, this course will help you get started.

Course Format and Student Workload

This course will center around paper readings, presentations, and discussions, plus a final project. The course readings include a list of research papers selected from top machine learning, security, systems, and programming languages conferences. We will discuss roughly two papers every class meeting. For in-depth discussions to be possible, you will have to read the papers carefully before class.

You have three main responsibilities in the course:

Prerequisite

COMS W3137 Data Structures and Algorithms, COMS W3157 Advanced Programming (or good working knowledge of C/C++), and COMS W3827 Fundamentals of Computer Systems; or equivalents of these three courses. Good working knowledge of machine learning and deep learning.

Familiarity with the Linux environment. For instance, you should know how to write a Makefile.

Enrollment

Enrollment for this class will be limited this semester. Please register early if you plan to take it. If the class is full and you would still like to take it, please email the instructor and come to the first day of class.

Enrollment is open to PhD, MS, and undergraduate students. If you are an undergraduate and would like to take the course, please email the instructor for permission.

Materials

There is no required textbook; all relevant materials will be made available online at the Course Syllabus page.

Grading

40%: Class participation. To encourage in-depth discussion, this portion of the grade covers in-class participation and paper presentation.
60%: Final project.