Abstract: Large-scale data mining of private information (such as medical or financial records) has the potential to revolutionize the way we live. On the positive side, it can enable large-scale studies on the comparative effectiveness of treatments, novel risk factors for diseases, and patient-centered personalized medicine. There are many challenges to overcome in order to realize these benefits, ranging from the way we represent data to legal and policy arrangements between medical institutions. Uniting these is a concern about privacy: we want to guarantee both low privacy risk and high precision, or utility. The fundamental question to answer is this: how much data do we need to guarantee acceptable levels of privacy and utility?
In this talk I describe practical approaches for managing this tradeoff. These methods guarantee differential privacy, a cryptographically motivated definition of privacy that has been widely adopted in the computer science community. I will discuss basic ideas from differential privacy and how to use them to build algorithms for classification and dimensionality reduction, two of the most common tasks in machine learning. I will also describe some exciting future prospects for privacy-preserving algorithms in signal processing, optimization, and learning. This is joint work with Kamalika Chaudhuri (UCSD), Claire Monteleoni (GWU), Kaushik Sinha (Wichita State U), and Shuang Song (UCSD).
Bio: Anand Sarwate is currently a Research Assistant Professor at the Toyota Technological Institute at Chicago, a philanthropically endowed academic institute located on the University of Chicago campus. Prior to that, he was a postdoc in the Information Theory and Applications Center (ITA) at UC San Diego. He received his PhD from UC Berkeley in 2008 and undergraduate degrees in Mathematics and Electrical Engineering from MIT in 2002. He received the Demetri Angelakos and Samuel Silver awards from the EECS department at UC Berkeley. He is broadly interested in algorithms applied to problems in distributed systems, signal processing, machine learning, statistics, and privacy and security. He will be joining the Department of Electrical and Computer Engineering as an Assistant Professor in January 2014.