The goal of the Columbia DSI/TRIPODS Deep Learning Workshop is to showcase research in the foundations and applications of deep learning at Columbia University and beyond, and to identify research directions, open problems, and potential collaborations. The workshop is organized by the Center for Foundations of Data Science and the Columbia TRIPODS Institute.
Event Contact Information: Data Science Institute, 212-854-5660, datascience@columbia.edu
Registration
Registration is free but required.
- Registration page (click the button labeled “Reserve your seat” at the bottom of the page)
Schedule
- 9:00-9:05: Opening remarks
- 9:05-9:35: Xuan (Sharon) Di - “Deep Learning in Transportation”
- 9:35-10:05: Maxim Topaz - “NimbleMiner: a user driven multi-lingual approach to healthcare text mining with deep learning”
- 10:05-10:35: Mingoo Seok - “Recent Advances in AI and ML Hardware Design”
- 10:35-11:00: Break
- 11:00-12:00: Matus Telgarsky - “Open problems and recent advances in deep learning theory: representation, generalization, and optimization”
- 12:00-1:30: Lunch
- 1:30-2:00: Zoran Kostic - “Tracking objects and fields in videos using deep learning”
- 2:00-2:30: Suman Jana - “Building Verifiably Secure Machine Learning Systems”
- 2:30-3:00: Break
- 3:00-3:30: Ali Hirsa - “Deep Learning & its applications in Quantitative Finance”
- 3:30-4:00: Paul Sajda - “Deep Learning for Fusion and Inference in Multimodal Neuroimaging”
- 4:00-4:30: Panel discussion (Matus Telgarsky, Suman Jana, Zoran Kostic), moderated by John Wright
Talk abstracts and speaker bios
Matus Telgarsky
Assistant Professor of Computer Science, University of Illinois at Urbana-Champaign
- Title: Open problems and recent advances in deep learning theory: representation, generalization, and optimization
- Abstract: This talk will survey recent progress in deep learning theory, with a focus on three specific results. First is a representation result: there exist deep networks which can be approximated by shallow networks only if the shallow networks have exponentially more nodes. Second, standard generalization tools will be adapted to deep networks by being sensitive to not just the data and the class of predictors, but also their relationship to the training algorithm (gradient descent). Lastly, preliminary results on deep network optimization will be presented for the special case of deep linear networks.
- Bio: Matus Telgarsky is an assistant professor at the University of Illinois, Urbana-Champaign, specializing in machine learning theory. He received his PhD from UCSD under Sanjoy Dasgupta, and is a recipient of a 2018 NSF CAREER Award. This summer he will co-organize a Simons Institute program on deep learning theory with Aleksander Madry and Elchanan Mossel.
Xuan (Sharon) Di
Assistant Professor of Civil Engineering and Engineering Mechanics, Columbia University
- Title: Deep Learning in Transportation
- Abstract: In this talk, I introduce an urban taxi demand forecasting problem using deep learning. Two deep neural networks for taxi demand prediction, ST-ResNet and FLC-Net, are compared on the NYC taxi trip record dataset. Our experimental results show that DNNs outperform most traditional machine learning techniques, but such superior results can only be achieved with proper design of the DNN architecture, where domain knowledge plays a key role.
- Bio: Xuan (Sharon) Di has been a tenure-track Assistant Professor in the Department of Civil Engineering and Engineering Mechanics at Columbia University in the City of New York since September 2016 and serves on a committee for the Smart Cities Center in the Data Science Institute. Prior to joining Columbia, she was a Postdoctoral Research Fellow at the University of Michigan Transportation Research Institute (UMTRI). She received her Ph.D. degree from the Department of Civil, Environmental, and Geo-Engineering at the University of Minnesota, Twin Cities in 2014. Dr. Di has received a number of awards, including the Transportation Data Analytics Contest Winner from the Transportation Research Board (TRB), the Dafermos Best Paper Award Honorable Mention from the TRB Network Modeling Committee, the Outstanding Presentation Award from INFORMS, and the Best Paper Award and Best Graduate Student Scholarship from the North-Central Section Institute of Transportation Engineers (ITE). She also serves as a reviewer for a number of journals, including Transportation Science, Transportation Research Part B/C/D, European Journal of Operational Research, Networks and Spatial Economics, and Transportation. Dr. Di directs the DitecT (Data and innovative technology-driven Transportation) Lab @ Columbia University. Her research interests include emerging transportation systems optimization, shared mobility modeling, and data-driven urban mobility modeling, leveraging optimization, game theory, and data analytics. Details about DitecT Lab and Prof. Sharon Di’s research can be found at the following link: http://sharondi-columbia.wixsite.com/ditectlab.
Ali Hirsa
Professor of Professional Practice in the Department of Industrial Engineering and Operations Research, Columbia University
- Title: Deep Learning & its applications in Quantitative Finance
- Abstract: TBD
- Bio: Professor Ali Hirsa joined IEOR in July 2017. He has been associated with Columbia University as an Adjunct Professor since 2000. He is also Managing Partner at Sauma Capital, LLC, a New York hedge fund. Previously he was Managing Director and Global Head of Quantitative Strategy at DV Trading, LLC, and a Partner and Head of Analytical Trading Strategy at Caspian Capital Management, LLC. Prior to joining Caspian, Ali worked in a variety of quantitative positions at Morgan Stanley, Banc of America Securities, and Prudential Securities. Ali was also a Fellow at the Courant Institute of New York University in the Mathematics of Finance Program from 2004 to 2014. Ali’s research interests are algorithmic trading, machine learning, data mining, computational/quantitative finance, and optimization. His focus has been on developing learning algorithms for signal extraction from data. Ali is the author of “Computational Methods in Finance”, Chapman & Hall/CRC 2012, co-author of “An Introduction to Mathematics of Financial Derivatives”, third edition, Academic Press, and the editor of the Journal of Investment Strategies. He has several publications and is a frequent speaker at academic and practitioner conferences. Ali is co-inventor of “Methods for Post Trade Allocation” (US Patent 8,799,146). The method focuses on the allocation of filled orders (post-trade) on any security to multiple managed accounts, which has to be fair and unbiased. Existing methods lead to biases, and the invention provides a solution to this problem. He has been vice chair of the Board of Visitors of the College of Computer, Mathematical, and Natural Sciences at the University of Maryland, College Park since June 2016, was on the Board of Visitors at the A. James Clark School of Engineering from 2008 to 2018, and served as a trustee of the University of Maryland College Park Foundation from 2011 to 2016.
Ali received his PhD in Applied Mathematics from the University of Maryland at College Park under the supervision of Professors Howard C. Elman and Dilip B. Madan.
Suman Jana
Assistant Professor of Computer Science, Columbia University
- Title: Building Verifiably Secure Machine Learning Systems
- Abstract: Machine Learning (ML) models are increasingly deployed in safety- and security-critical domains including self-driving cars and malware detection, where the correctness and predictability of a system’s behavior for corner case inputs are of great importance. However, state-of-the-art ML systems, despite their impressive capabilities, often have unexpected/incorrect behaviors in corner cases as demonstrated by the recent fatal collision involving Tesla’s autopilot system. Unfortunately, traditional software verification techniques are not well-suited for modern ML systems like deep neural networks. In this talk, I will describe new approaches to developing efficient systematic verification techniques that can ensure the safety and security of real-world systems using deep neural networks.
- Bio: Suman Jana is an assistant professor in the Department of Computer Science and the Data Science Institute at Columbia University. His primary research interest is at the intersection of computer security and machine learning. His research has received six best paper awards, a Google faculty fellowship, and an ARO young investigator award.
Zoran Kostic
Associate Professor of Professional Practice in the Department of Electrical Engineering, Columbia University
- Title: Tracking objects and fields in videos using deep learning
- Abstract: Many visually observable phenomena contain important details which need to be tracked in position and/or appearance. We have been investigating two topics of practical value which require tracking in video recordings: the measurement of peripheral edema and the estimation of trajectories of multiple objects in a busy traffic intersection. We give an overview of the video preprocessing and deep learning techniques which we have deployed for these two problems. Although preliminary results are promising, both projects now require a significant increase in the number of reliably labeled ground truth samples. We discuss issues that we are tackling in acquiring new data for these two problems, one in a medical outpatient setting and the other in the context of project COSMOS.
- Bio: Zoran Kostic completed his Ph.D. in Electrical Engineering at the University of Rochester and his Dipl. Ing. degree at the University of Novi Sad in Serbia. He spent most of his career in industry where he worked in research, product development and in leadership positions. Zoran’s expertise spans mobile, wireless and statistical communications, signal processing, multimedia, system-on-chip development and applications of parallel computing. His present research addresses Internet of Things systems and physical data analytics, smart cities, and applications of deep learning in autonomous vehicle navigation, medicine and health.
Paul Sajda
Professor of Biomedical Engineering, Radiology (Physics) and Electrical Engineering, Columbia University
- Title: Deep Learning for Fusion and Inference in Multimodal Neuroimaging
- Abstract: TBD
- Bio: Paul Sajda is Professor of Biomedical Engineering, Electrical Engineering and Radiology (Physics) at Columbia University. He is also a Member of Columbia’s Data Science Institute. Sajda is interested in what happens in our brains when we make a rapid decision and, conversely, what processes and representations in our brains drive our underlying preferences and choices, particularly when we are under time pressure. His work in understanding the basic principles of rapid decision-making in the human brain relies on measuring human subject behavior simultaneously with cognitive and physiological state. Important in his approach is his use of machine learning and data analytics to fuse these measurements for predicting behavior and inferring brain responses to stimuli. Sajda applies the basic principles he uncovers to construct real-time brain-computer interfaces that are aimed at improving interactions between humans and machines. He is also applying his methodology to understand how deficits in rapid decision-making may underlie and be diagnostic of many types of psychiatric diseases and mental illnesses. Of particular interest to Sajda is how different areas in the human brain interact to change our arousal state and modulate our decision-making. Specifically, he is using simultaneous EEG and fMRI together with pupillometry to identify and track spatiotemporal interactions between the anterior cingulate cortex, dorsolateral prefrontal cortex, and subcortical nuclei such as the locus coeruleus. He has found that the dynamics of these interactions are altered under stress, particularly when dealing with high-pressure decisions with critical performance boundaries. These findings are being transitioned to applications ranging from tracking pilot cognitive state while operating fighter aircraft to identifying biomarkers of healthy thought patterns in patients being treated for major depressive disorder and/or complicated grief.
Sajda is a co-founder of several neurotechnology companies and works closely with a range of scientists and engineers, including neuroscientists, psychologists, computer scientists, and clinicians.
Mingoo Seok
Associate Professor of Electrical Engineering, Columbia University
- Title: Recent Advances in AI and ML Hardware Design
- Abstract: Computing technology has been a backbone of our society. Its importance is hard to overemphasize. Today, we again confirm its extreme importance with recent advances in artificial intelligence and deep learning. Those emerging workloads impose an unprecedented amount of arithmetic complexity and data access that our existing computing systems can barely handle. Computing systems of every scale, from data centers to mobile devices to extreme implants, will face a major challenge in achieving short computational delay, energy efficiency, and accuracy for truly enabling intelligent systems. In this seminar, we will outline the bottlenecks, notably the end of Moore’s Law and the memory wall problem. We will then discuss several approaches that our group has been working on, including in-memory computing, analog mixed-signal (AMS) feature extraction, and algorithm-hardware optimization. We will introduce several test-chip prototypes and their measurement results.
- Bio: Mingoo Seok is an associate professor of Electrical Engineering at Columbia University. He received his BS from Seoul National University, South Korea, in 2005, and his MS and Ph.D. degrees from the University of Michigan in 2007 and 2011, respectively, all in electrical engineering. His research interests are various aspects of VLSI circuits and architecture, including ultra-low-power integrated systems, cognitive and machine-learning computing, adaptive techniques for process, voltage, and temperature variations and transistor wear-out, event-driven controls, and hybrid continuous and discrete computing. He won the 2015 NSF CAREER award. He is a technical program committee member for multiple conferences, including the IEEE International Solid-State Circuits Conference (ISSCC). He has served as an associate editor for IEEE Transactions on Circuits and Systems Part I (TCAS-I) (2014-2016), IEEE Transactions on VLSI Systems (TVLSI) (2015-present), and IEEE Solid-State Circuits Letters (SSCL) (2017-present).
Maxim Topaz
Elizabeth Standish Gill Associate Professor of Nursing, Columbia University Medical Center
- Title: NimbleMiner: a user driven multi-lingual approach to healthcare text mining with deep learning
- Abstract: Background: Massive amounts of important clinical data are captured as text narratives within electronic health records. These narrative data require innovative tools that help clinicians and researchers to extract meaning. Natural language processing (NLP) is a set of powerful techniques that can help in processing and deriving meaning from clinical narratives. However, for nursing and allied health professions, NLP development and implementation remains relatively poor. We developed and conducted several case studies evaluating an open source software called NimbleMiner that allows clinicians to create rapid NLP solutions to extract concepts of interest from clinical narratives in multiple languages (including English, Hebrew and Russian). The presentation will demonstrate the software architecture and share the results of several promising case studies. Methods: NimbleMiner is built to help users (clinicians and/or researchers) interact with word embedding language models (skip-gram word2vec models) while creating and refining predictive models to identify concepts of interest in clinical narratives. These clinical narratives might include inpatient nursing notes, homecare visit notes, and discharge summaries, among others. First, NimbleMiner allows users to rapidly create vocabularies of clinical terms within a particular area of interest (e.g., identify words and expressions describing fall history or alcohol abuse). Next, NimbleMiner applies the resulting vocabulary terms to identify and label the positive cases (positive cases are defined as clinical notes where the concept of interest is described). NimbleMiner creates a training set of notes on which machine learning predictive models are developed. Finally, the resulting machine learning model can be applied to generate predictions for new (unknown) notes. Results: We conducted several studies where NimbleMiner was compared with other NLP systems. For example, NimbleMiner was successfully used to identify fall-related information in homecare visit notes, find alcohol abuse instances in hospital discharge summaries, and identify several diseases in hospital notes in Hebrew (including diabetes, heart failure, etc.). In most projects, NimbleMiner outperformed traditional NLP systems in terms of system accuracy and required less time for NLP implementation. Conclusions: We developed an open access NLP system for rapid clinician-driven machine learning of clinical narratives in multiple languages. The presentation will demonstrate the application and encourage participants to make use of the important, previously underutilized narrative clinical data.
- Bio: Dr. Maxim (Max) Topaz, PhD, RN, MA, is the Elizabeth Standish Gill Associate Professor of Nursing at the Columbia University Medical Center. He is also affiliated with the Columbia University Data Science Institute and the Center for Home Care Policy & Research at the Visiting Nurse Service of New York. His research focuses on data science, and he finds innovative ways to use the most recent technological breakthroughs, like text or data mining, to improve human health. Dr. Topaz’s research motto is “Data for good”. Dr. Topaz is one of the pioneers in applying natural language processing to data generated by nurses. His current work focuses on developing natural language processing solutions to advance clinical decision making.
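The labeling step described in the NimbleMiner abstract above (applying a user-built vocabulary to mark notes as positive cases) can be illustrated with a minimal Python sketch. This is not NimbleMiner's actual code: the function name `label_notes`, the whole-word matching rule, and the example notes and terms are all assumptions made for illustration. The vocabulary would in practice come from the embedding-assisted step, and negated mentions (e.g., "denies falls") are not handled here.

```python
import re
from typing import Iterable, List, Tuple


def label_notes(notes: Iterable[str], vocabulary: Iterable[str]) -> List[Tuple[str, int]]:
    """Label each note 1 if it mentions any vocabulary term, else 0.

    Hypothetical matching rule: case-insensitive match on whole words or
    phrases, so "fall" does not match inside "falling" or "rainfall".
    """
    # Normalize terms and try longer phrases first.
    terms = sorted(
        {t.strip().lower() for t in vocabulary if t.strip()},
        key=len,
        reverse=True,
    )
    if not terms:
        return [(note, 0) for note in notes]
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    return [(note, 1 if pattern.search(note) else 0) for note in notes]


# Illustrative (made-up) vocabulary for the "fall history" concept.
vocabulary = ["fall", "fell", "slipped", "found on floor"]

# Illustrative (made-up) clinical notes.
notes = [
    "Patient fell at home last week, no injury reported.",
    "Wound dressing changed; patient tolerating diet well.",
    "Pt found on floor by caregiver this morning.",
]

labeled = label_notes(notes, vocabulary)
for note, label in labeled:
    print(label, note)
```

The resulting (note, label) pairs play the role of the training set mentioned in the abstract; a classifier trained on them could then score new, unlabeled notes.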
About Columbia TRIPODS Institute
The Columbia TRIPODS Institute aims to articulate methodological foundations for data science, spanning mathematics, statistics, and computing. Our emphasis is on foundations to support practice, through the analysis of successful heuristics, the development of well-structured computational toolkits, and the development of theory to support the entire cycle of data science.
We thank the National Science Foundation for financial support through the award CCF 1740833, and Jessica Rodriguez and Sharnice Ottley for administrative support.