Multi-Agent Language Models: Games, Reasoning, and Alignment

Xiaowu Dai, University of California, Los Angeles
Time: TBD Location: TBD

Abstract

Large Language Models (LLMs) are prone to inconsistencies and hallucinations. We introduce Peer Elicitation Games (PEG), a training-free, game-theoretic framework for aligning LLMs through a peer elicitation mechanism involving a generator and multiple discriminators instantiated from distinct base models. Discriminators interact in a peer evaluation setting, where rewards are computed using a determinant-based mutual information score that provably incentivizes truthful reporting without requiring ground-truth labels. We establish theoretical guarantees showing that each agent, via online learning, achieves sublinear regret, in the sense that its cumulative performance approaches that of the best fixed truthful strategy in hindsight. Moreover, we prove last-iterate convergence to a truthful Nash equilibrium, ensuring that the actual policies used by agents converge to stable and truthful behavior over time. I will also discuss extensions of PEG to multi-agent reasoning and inference-time alignment.
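The abstract does not spell out the determinant-based mutual information score, so the following is only an illustrative sketch of one DMI-style construction: two discriminators' verdicts on a shared batch of tasks are split into two halves, an empirical joint verdict matrix is formed on each half, and the reward is the product of the two determinants. The function name, the two-half split, and the normalization are assumptions for illustration, not the paper's exact mechanism.

```python
import numpy as np

def dmi_reward(reports_a, reports_b, num_classes):
    """Sketch of a determinant-based mutual information (DMI) style reward
    between two discriminators' verdicts (integers in {0,...,num_classes-1})
    on a shared batch of tasks. Illustrative only; the actual PEG score may
    differ.
    """
    reports_a = np.asarray(reports_a)
    reports_b = np.asarray(reports_b)
    n = len(reports_a) // 2  # split the batch into two disjoint halves

    def joint_matrix(a, b):
        # Empirical joint distribution of the two agents' verdicts.
        m = np.zeros((num_classes, num_classes))
        for x, y in zip(a, b):
            m[x, y] += 1
        return m / len(a)

    m1 = joint_matrix(reports_a[:n], reports_b[:n])
    m2 = joint_matrix(reports_a[n:], reports_b[n:])
    # Product of determinants: positive in expectation under informative,
    # truthful reporting; zero when either agent reports uninformatively.
    return np.linalg.det(m1) * np.linalg.det(m2)
```

Note the label-free flavor of such scores: an agent that always reports the same verdict produces a joint matrix with a zero row, so its determinant, and hence its reward, is zero, regardless of any ground truth.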

Biography

Xiaowu Dai is an assistant professor in the Departments of Statistics and Data Science, and of Biostatistics at UCLA. Before joining UCLA, he did a postdoc at UC Berkeley working with Prof. Mike Jordan, and received a Ph.D. in Statistics at UW-Madison advised by Prof. Grace Wahba. His research focuses on statistical theory and methodology for real-world problems that blend computational, inferential, and economic considerations.