Abstract: This talk is about the two ends of LLM training: pre-training and in-deployment learning. I will present an approach to disentangling knowledge from skill in model pre-training. This yields a new class of LLMs that externalize knowledge, with dramatically different characteristics from common LLMs along the dimensions of scale, factuality, and updateability. On the other end, I will discuss two in-deployment learning methods. I will describe how in-context learning abilities extend beyond supervised settings, showing that LLMs display in-context reinforcement learning from rewards. Finally, if time allows, I will describe continual learning from implicit interaction signals, demonstrating that LLMs can retrospectively decode latent interaction cues by observing how humans respond to their outputs.
Bio: Yoav Artzi is an Associate Professor in the Department of Computer Science and Cornell Tech at Cornell University, a visiting faculty researcher at Google DeepMind, and arXiv's associate faculty director. His research focuses on language modeling and learning in interactive and situated scenarios. His work has been recognized with awards and honorable mentions at ACL, EMNLP, NAACL, and IROS, as well as a TACL test-of-time award. Yoav holds a B.Sc. from Tel Aviv University and a Ph.D. from the University of Washington.
Location: NCS 120