Learning to Unlearn: Robust and Efficient Machine Unlearning for Large Foundation Models

Abstract: As large foundation models gain widespread adoption, the ability to remove sensitive information without retraining from scratch becomes crucial. This talk introduces three recent works on machine unlearning that selectively erase data or knowledge from pretrained models. First, instance-wise unlearning leverages adversarial examples and weight importance metrics to erase unwanted information without access to the pretraining dataset. Next, we examine gradient ascent for post-training unlearning in large language models, demonstrating how targeted token sequences can be forgotten efficiently to reduce privacy risks with minimal impact on overall performance. Finally, we propose the novel Inverted Hinge Loss and data-adaptive LoRA, which together address unstable optimization and knowledge degradation. After reviewing the frontiers of generative AI and various safety problems, the talk concludes by highlighting current and future topics at LG AI Research.

Bio: Moontae Lee is an Assistant Professor of Information and Decision Sciences at the University of Illinois Chicago and concurrently serves as Head of the Superintelligence Laboratory at LG AI Research. His journey into Large Language Models (LLMs) began in 2019 as a Visiting Scholar at Microsoft Research Redmond, where he helped initiate an ambitious Universal Language Modeling project. His research spans Question Answering, Textual Reasoning, and the development of LLMs for both natural and programming languages. Moontae’s current focus centers on two key directions: (1) constructing high-quality training and reasoning datasets, and (2) strategically controlling reasoning paths within large models. In parallel, he is actively engaged in AI Safety and Ethics, particularly in aligning model behaviors with both societal values and individual preferences, with the ultimate aim of expanding collaborative and value-sensitive AI behavior as a foundation for machine morality.

Trustworthiness of Machine-Learning-Based Systems (TrustML) Research Cluster
