Phonetic String Transduction

September 2, 2010
11a-12:15p
Halligan 111A

Abstract

Phonetic string transduction transforms a source string into a target representation according to its pronunciation. Grapheme-to-phoneme conversion and name transliteration are two important example applications. The problem is challenging because: 1) the source string does not unambiguously specify the target representation, and 2) the training data includes only example source-target pairs without the structural information that indicates subword alignments. In this talk, I present a discriminative training framework based on efficient, online, max-margin learning. The framework unifies character segmentation and sequence modeling into a single learning model by using a phrase-based decoder. The system achieved the best results on several language pairs in the NEWS Machine Transliteration Shared Task. It is also the state-of-the-art system in grapheme-to-phoneme conversion.

Bio

Sittichai Jiampojamarn is a PhD candidate in the Department of Computing Science at the University of Alberta. His main research areas are Natural Language Processing (NLP) and Machine Learning (ML). He is particularly interested in structure prediction problems that include complex latent variables, such as grapheme-to-phoneme conversion and name transliteration. He has published 15 papers in academic conferences and contributed two open-source software projects.