Systems for ML: It’s all about the choices

November 3, 2022
3:00-4:15pm ET
Cummings 270, Zoom
Speaker: Neeraja Yadwadkar, University of Texas at Austin, ECE Dept.
Host: Raja Sambasivan

Abstract

This talk focuses on a fundamental question: what does Machine Learning (ML) need from Systems? Using inference serving as an example, I will make the case that "Systems for ML" research is primarily about making the right choices. Today, many applications rely on inference from machine learning models, especially neural networks. For instance, applications at Facebook issue tens of trillions of inference queries per day with differing privacy, performance, accuracy, and cost constraints. Unfortunately, existing distributed inference serving systems ignore ease of use and hence incur significant cost inefficiency, especially at large scale. They force developers to manually search through thousands of model-variants (versions of already-trained models that differ in hardware, resource footprint, latency, cost, privacy, and accuracy) to meet diverse application requirements. As requirements, query load, and the applications themselves evolve over time, developers must make these decisions dynamically, for each inference query, to avoid the excessive costs of naive autoscaling. To sidestep this large and complex trade-off space, developers often fix a single variant across queries and replicate it as load increases. However, given the diversity of variants and of hardware platforms across the cloud-edge spectrum, this lack of understanding of the trade-off space incurs significant cost. For applications to use machine learning effectively, we must automate the decisions that affect ease of use, privacy, performance, and cost efficiency for both users and providers. We argue for managed distributed inference serving for a variety of models across the cloud-edge spectrum.

Bio

Neeraja is an assistant professor in the Department of Electrical and Computer Engineering at UT Austin. She is a cloud computing systems researcher with a strong background in Machine Learning (ML). Most of her research straddles the boundary between Systems and ML: using and developing ML techniques for systems, and building systems for ML. Before joining UT Austin, she was a postdoctoral fellow in the Computer Science department at Stanford University, and before that she received her PhD in Computer Science from UC Berkeley. She earned her bachelor's degree in Computer Engineering from the Government College of Engineering, Pune, India.

Please join the meeting in Cummings 270 or via Zoom.

Join Zoom Meeting: https://tufts.zoom.us/j/96038251227

Meeting ID: 960 3825 1227

Passcode: see colloquium email

Dial by your location: +1 646 558 8656 US (New York)
