Colloquium: Learning Like a Human: How, Why, and When

The computer science colloquium takes place on Mondays and Fridays from 11:15 a.m. to 12:15 p.m.

This week's speaker, Tianyi Zhou (University of Washington), will be giving a talk titled "Learning Like a Human: How, Why, and When".

Abstract

Machine learning (ML) can surpass humans on certain complicated yet specific tasks. However, most ML methods treat samples/tasks equally, e.g., by taking a random batch per step and repeating training on all data for many epochs. This may work well on well-processed data given sufficient computation, but it is highly suboptimal and inefficient from a human perspective, since we would never teach children or students in such a way. In contrast, human learning is more strategic and smarter in selecting/generating the training contents for different learning stages, e.g., via experienced teachers, collaboration among learners, curiosity and diversity in exploration, tracking of learned knowledge and progress, decomposition of a task into sub-tasks, etc., all of which have been underexplored in ML. The selection and scheduling of data/tasks is another type of intelligence, as important as the optimization of model parameters on given data/tasks. My recent work aims to bridge this gap between human and machine intelligence. As we enter a new era of hybrid intelligence between humans and machines, it is important to make AI not only perform like humans in its outputs but also benefit from human-like strategies in its training.

In this talk, I will present several curriculum learning techniques we developed for improving supervised/semi-supervised/self-supervised learning, robust learning with noisy data, reinforcement learning, ensemble learning, etc., especially when the data are imperfect and a curriculum can therefore make a big difference. First, I will show how to translate human strategies for curriculum generation into discrete-continuous hybrid optimizations, which are challenging to solve in general but for which we can develop efficient and provable algorithms using techniques from submodular and convex/non-convex optimization. Curiosity and diversity play important roles in these formulations. Second, we build both empirical and theoretical connections between curriculum learning and the training dynamics of ML models on individual samples. Empirically, we find that deep neural networks are fast to memorize some data but also fast to forget others, so we can accurately identify the easily forgotten data from the training dynamics at very early stages and focus future training on them. Moreover, we find that the consistency of model outputs over time for an unlabeled sample is a reliable indicator of its prediction correctness and reflects the forgetting effects on previously learned data. In addition, the learning speed on samples/tasks provides critical information for future exploration. These discoveries are consistent with human learning strategies and lead to more efficient curricula for a rich class of ML problems. Theoretically, we derive a data selection criterion solely from the optimization of learning dynamics in continuous time. Interestingly, the resulting curriculum matches the earlier empirical observations and has a natural connection to the neural tangent kernel in recent deep learning theory.
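To make the forgetting-based selection idea concrete, here is a minimal, hypothetical sketch (not the speaker's published algorithm): during a few early "probing" epochs, we count how often each training sample flips from correctly to incorrectly predicted, then restrict later training to the most frequently forgotten samples. The model, data, and budget choices below are illustrative placeholders only.

```python
# Illustrative sketch: track per-sample "forgetting events" during early
# training, then focus later training on the samples forgotten most often.
# All model, data, and threshold choices are hypothetical placeholders.
import torch
import torch.nn as nn

torch.manual_seed(0)
N, D, C = 1000, 20, 5                      # samples, features, classes
X, y = torch.randn(N, D), torch.randint(0, C, (N,))
model = nn.Sequential(nn.Linear(D, 64), nn.ReLU(), nn.Linear(64, C))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

prev_correct = torch.zeros(N, dtype=torch.bool)
forget_counts = torch.zeros(N)

# Early probing epochs: record each sample's correct/incorrect transitions.
for epoch in range(5):
    perm = torch.randperm(N)
    for i in range(0, N, 64):
        idx = perm[i:i + 64]
        opt.zero_grad()
        loss = loss_fn(model(X[idx]), y[idx])
        loss.backward()
        opt.step()
    with torch.no_grad():
        correct = model(X).argmax(dim=1) == y
    # A forgetting event: correct at the previous epoch, incorrect now.
    forget_counts += (prev_correct & ~correct).float()
    prev_correct = correct

# Curriculum: concentrate subsequent training on the most-forgotten samples.
k = N // 4
hard_idx = forget_counts.topk(k).indices
print(f"Selected {k} frequently forgotten samples for continued training.")
```

In this toy setup, `hard_idx` would then define the subset fed to later epochs; the abstract's consistency-based indicator for unlabeled data would play an analogous role when labels are unavailable.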

Biography

Tianyi Zhou is a Ph.D. candidate in the Paul G. Allen School of Computer Science and Engineering at the University of Washington, advised by Professor Jeff A. Bilmes. His research interests are in machine learning, optimization, and natural language processing. His recent research focuses on transferring human learning strategies to machine learning in the wild, especially when the data are unlabeled, redundant, noisy, biased, or collected via interaction, e.g., how to automatically generate a curriculum of data/tasks during the course of training. The studied problems cover supervised/semi-supervised/self-supervised learning, robust learning with noisy data, reinforcement learning, meta-learning, ensemble methods, spectral methods, etc. He has published ~50 papers at NeurIPS, ICML, ICLR, AISTATS, NAACL, COLING, KDD, AAAI, IJCAI, Machine Learning (Springer), IEEE TIP, IEEE TNNLS, IEEE TKDE, etc., with ~2000 citations. He is the recipient of the Best Student Paper Award at ICDM 2013 and the 2020 IEEE Computer Society Technical Committee on Scalable Computing (TCSC) Most Influential Paper Award.

Start date
Friday, Feb. 12, 2021, 11:15 a.m.
End date
Friday, Feb. 12, 2021, 12:15 p.m.
Location

Online - Zoom Link
