Dealing with Sparse Data and/or Domain Mismatch: The Case for Hybrid Approaches

April 13, 2023

3:00-4:15 pm ET

Cummings 270

Speaker: Abeer Alwan

Host: Soha Hassoun

Abstract

The performance of Automatic Speech Recognition (ASR) and speaker identification (SID) systems degrades significantly under domain mismatch and/or when training data are limited. Examples include ASR systems trained on adult speech and tested on children’s speech, and SID systems trained on one speaking style and tested on another (for example, read versus conversational speech). In this talk, I will describe several of our robust ASR and SID techniques that are inspired by linguistics and by models of human speech production and perception. In the child ASR area, which presents a low-resource and domain-mismatched challenge, our work focuses on developing meaningful data augmentation and normalization approaches in addition to a novel Self-Supervised Learning (SSL) framework. I will also describe recent work on dialect and depression identification in low-resource scenarios. In the SID area, we focus on understanding what aspects of a voice are talker-specific. This information helps not only in developing robust SID algorithms, especially in low-resource and domain-mismatched cases, but also in understanding the human limits in perceiving speaker differences.

Bio

Abeer Alwan received her Ph.D. in EECS from MIT in 1992. Since then, she has been with the ECE department at UCLA, where she is a Full Professor, established the Speech Processing and Auditory Perception Laboratory, and has served as Vice Chair for ECE Undergraduate Affairs and for Graduate Affairs. Dr. Alwan is a recipient of several awards, including the NSF Research Initiation Award, the NIH FIRST Award, the UCLA-TRW Excellence in Teaching Award, the Okawa Foundation Award in Telecommunication, and the Engineer’s Council Educator Award. She is a Fellow of the Acoustical Society of America, the IEEE, and the International Speech Communication Association (ISCA). She was a Fellow at the Radcliffe Institute, Harvard University, a Distinguished Lecturer for ISCA, co-Editor in Chief of Speech Communication, Chair of the IEEE Flanagan Committee, Vice Chair of the IEEE Awards committee, and an elected member of the Board of Governors of the IEEE Signal Processing Society.