While ML-based AI systems are increasingly deployed in safety-critical settings, they remain unreliable under adverse conditions that violate their underlying statistical assumptions. In my work, I aim to (i) understand the conditions under which a lack of reliability can occur and (ii) reason rigorously about the limits of robustness, during both training and test phases.
In the first part of the talk, I demonstrate the existence of strong but stealthy training-time attacks on federated learning, a recent paradigm in distributed learning. I show how a small number of compromised agents can modify model parameters via optimized updates so that the global model misclassifies desired data, while bypassing custom detection methods. In experiments on standard datasets, this model poisoning attack makes the global model's predictions unreliable.
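To make the mechanism concrete, here is a minimal toy sketch of why a single compromised agent can steer an averaged global model. It is not the optimized, stealthy attack from the talk: it assumes plain averaging of updates on the server and a simple "boosting" heuristic in which the attacker scales its update by the number of agents. The function names, the target weights, and the noise scale are all hypothetical.

```python
import numpy as np

def federated_average(global_w, updates):
    """Server step (assumed): add the mean of the agents' updates to the global weights."""
    return global_w + np.mean(updates, axis=0)

def boosted_malicious_update(global_w, target_w, num_agents):
    """Hypothetical attacker step: scale the desired shift by the number of
    agents so that it survives the server's averaging."""
    return num_agents * (target_w - global_w)

rng = np.random.default_rng(0)
num_agents = 10
global_w = np.zeros(5)
target_w = np.full(5, 0.3)  # weights the attacker wants (hypothetical)

# Nine benign agents send small updates, as they might near convergence.
benign = [rng.normal(scale=1e-3, size=5) for _ in range(num_agents - 1)]
malicious = boosted_malicious_update(global_w, target_w, num_agents)

new_w = federated_average(global_w, benign + [malicious])
gap = np.max(np.abs(new_w - target_w))  # distance from the attacker's target
```

In this toy setting the global weights land almost exactly on the attacker's target. The boosted update also has a much larger norm than the benign ones, so a naive version like this would be easy to flag, which suggests why stealth requires optimized updates.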
Test-time attacks via adversarial examples, i.e., imperceptible perturbations to test inputs, have sparked an attack-defense arms race. In the second part of the talk, I step away from this arms race to provide model-agnostic fundamental limits on the loss under adversarial input perturbations. The robust loss is shown to be lower bounded in terms of the optimal transport cost between class-wise distributions under an appropriate adversarial point-wise cost. For empirical distributions of interest, this transport cost can be computed efficiently via a linear program.
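As a rough illustration, the transport cost between two empirical class distributions can be posed as a small linear program. The sketch below makes several assumptions that are not stated in the abstract: two classes with uniform weights on their samples, an L2 perturbation budget `eps`, and a 0-1 point-wise cost that charges 1 only when two points are more than `2 * eps` apart (so no single perturbed input can be reached from both). The function name is hypothetical, and the solver is SciPy's `linprog`.

```python
import numpy as np
from scipy.optimize import linprog

def adversarial_transport_cost(x_pos, x_neg, eps):
    """Optimal transport cost between two empirical distributions under an
    assumed 0-1 cost: 1 if the points are more than 2*eps apart, else 0."""
    n, m = len(x_pos), len(x_neg)
    dists = np.linalg.norm(x_pos[:, None, :] - x_neg[None, :, :], axis=2)
    cost = (dists > 2 * eps).astype(float)

    # Coupling variables pi[i, j] are flattened row-major into a vector.
    a_eq = np.zeros((n + m, n * m))
    for i in range(n):
        a_eq[i, i * m:(i + 1) * m] = 1.0   # row i sums to 1/n
    for j in range(m):
        a_eq[n + j, j::m] = 1.0            # column j sums to 1/m
    b_eq = np.concatenate([np.full(n, 1.0 / n), np.full(m, 1.0 / m)])

    res = linprog(cost.ravel(), A_eq=a_eq, b_eq=b_eq,
                  bounds=(0, None), method="highs")
    return res.fun

x_pos = np.array([[0.0, 0.0], [1.0, 0.0]])
x_neg = np.array([[0.1, 0.0], [5.0, 0.0]])
cost_small = adversarial_transport_cost(x_pos, x_neg, eps=0.1)
cost_zero = adversarial_transport_cost(x_pos, x_neg, eps=0.0)
```

With `eps=0.1`, only the pair (0, 0) and (0.1, 0) can be perturbed to a common point, so half of the mass moves for free and the cost is 0.5. With `eps=0.0`, no pair collides and the cost is 1. A lower cost means more unavoidable collisions between the classes, and hence a larger floor on the robust loss.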
To conclude, I will discuss my ongoing efforts and future vision towards building continuously reliable and accessible ML systems by accounting for novel attack vectors and new ML paradigms such as generative AI, as well as developing algorithmic tools to improve performance in data-scarce regimes.
Short Bio:
Arjun Bhagoji is a Research Scientist in the Department of Computer Science at the University of Chicago. He obtained his Ph.D. in Electrical and Computer Engineering from Princeton University, where he was advised by Prateek Mittal. Before that, he received his Dual Degree (B.Tech+M.Tech) in Electrical Engineering at IIT Madras, where he was advised by Andrew Thangaraj and Pradeep Sarvepalli. Arjun's research has been recognized with a Spotlight at the NeurIPS 2023 conference, the Siemens FutureMakers Fellowship in Machine Learning (2018-2019) and the 2018 SEAS Award for Excellence at Princeton University. He was a 2021 UChicago Rising Star in Data Science, a finalist for the 2020 Bede Liu Best Dissertation Award in Princeton's ECE Department and a finalist for the 2017 Bell Labs Prize.