ASSET Seminar featuring Misha Belkin (University of California, San Diego)

04-15-26
12:00 pm

Talk Title: From kernel machines to the linear representation hypothesis for monitoring and steering LLMs

Abstract: A trained Large Language Model (LLM) contains much of human knowledge. Yet it is difficult to gauge the extent or accuracy of that knowledge, as LLMs do not always "know what they know" and may even be unintentionally or actively misleading. In this talk I will discuss feature learning, introducing Recursive Feature Machines, a powerful generalization of classical kernel methods designed for extracting relevant features from tabular data. I will demonstrate how this technique enables us to detect and precisely guide LLM behaviors toward almost any desired concept by manipulating a fixed vector in the LLM activation space. I will also discuss how the same method allows probing whether an LLM exhibits motivated reasoning.

Speaker

Misha Belkin, Professor of Computer Science and Engineering at UC San Diego; Professor at the Halicioglu Data Science Institute; Amazon Scholar

Mikhail Belkin is a Professor at the Halicioglu Data Science Institute and the Computer Science and Engineering Department at UC San Diego, and an Amazon Scholar. Previously, he was a Professor in the Department of Computer Science and Engineering and the Department of Statistics at The Ohio State University. He received his Ph.D. in Mathematics from the University of Chicago, where he was advised by Partha Niyogi. His research broadly spans the theory and applications of machine learning, deep learning, and data analysis.