Paradigms of AI alignment: components and enablers
Abstract:
AI alignment aims to get advanced AI systems to do what we want them to do and not knowingly act against our interests. Alignment research either develops components of an aligned AI system (e.g. outer and inner alignment) or enables more effective work on those components (e.g. through improved interpretability or progress on foundational questions). This talk will give an overview of research directions in each of these areas.
Bio:
Victoria is a senior research scientist at DeepMind focusing on AGI alignment. She has worked on a range of alignment problems and is perhaps best known for creating a comprehensive database of specification gaming examples. She has a PhD in statistics and machine learning from Harvard. Victoria is also a co-founder of the Future of Life Institute, a non-profit organization working to mitigate technological risks to humanity and increase the chances of a positive future.