Statistical inference with privacy and computational constraints
The vast amount of digital data we create and collect has revolutionized many scientific fields and industrial sectors. Yet, despite our success in harnessing this transformative power of data, computational and societal trends emerging from the current practices of data science necessitate upgrading our toolkit for data analysis. In this talk, we discuss how practical considerations such as privacy and memory limits affect statistical inference tasks. In particular, we focus on two examples: First, we consider hypothesis testing with privacy constraints. More specifically, how one can design an algorithm that tests whether two data features are independent or correlated with a nearly-optimal number of data points while preserving the privacy of the individuals participating in the data set. Second, we study the problem of entropy estimation of a distribution by streaming over i.i.d. samples from it. We determine how bounded memory affects the number of samples we need to solve this problem.
Maryam Aliakbarpour is a postdoctoral researcher at Boston University and Northeastern University, where she is hosted by Prof. Adam Smith and Prof. Jonathan Ullman. Before that, she was a postdoctoral research associate at the University of Massachusetts Amherst for a year, hosted by Prof. Andrew McGregor. In the Fall of 2020, she was a visiting participant in the Probability, Geometry, and Computation in High Dimensions Program at the Simons Institute at Berkeley. Maryam received her Ph.D. and M.S. degrees from MIT, where Prof. Ronitt Rubinfeld was her advisor. Her research focuses on theoretical computer science, statistical learning theory, and differential privacy.