Language Processing as Signal Processing

Thu Aug 22 15:30:00 EDT 2013

Signal processing has traditionally involved continuous-valued signals (audio, video, sonar) that we transform, enhance, and recognize. Language, represented in terms of word sequences, is incorporated into signal processing as a discrete process generated by a Markov source that is used as a prior in, e.g., speech recognition. Words are characterized with non-parametric multinomial distributions depending on the word history or other categorical variables. However, social media and online interactions have opened up many more applications for language processing, and there is growing interest in continuous-space representations of language, which offer the potential for using signal processing tools to solve these problems. In this talk, we survey work in continuous-space modeling of language, including latent semantic analysis, neural network models, and an exponential model that treats unseen events as a rank regularization problem. These models provide transformations of language that map words to a continuous space where neighbors have syntactic/semantic similarity. We can extend this approach to consider mixed discrete and continuous models by incorporating methods for learning sparse elements of language. Inspection of the sparse component provides insights into the idiosyncrasies of speakers and speaking style as well as vocabulary acquisition.
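
As a rough illustration of the continuous-space idea, latent semantic analysis can be sketched as a truncated singular value decomposition of a word-by-document count matrix, so that words used in similar documents end up close together. The toy corpus, the choice of k = 2, and the helper names below are assumptions made for illustration only; they are not taken from the talk.

```python
import numpy as np

# Toy corpus (hypothetical, for illustration only).
docs = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "stocks fell as markets closed",
    "markets rose as stocks rallied",
]

# Build the vocabulary and a word -> row index map.
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}

# Word-by-document count matrix: one row per word, one column per document.
counts = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        counts[index[w], j] += 1

# Truncated SVD: keep only the k largest singular directions.
k = 2  # assumed dimensionality of the continuous space
U, s, Vt = np.linalg.svd(counts, full_matrices=False)
word_vecs = U[:, :k] * s[:k]  # each row is a k-dimensional word vector


def cosine(a, b):
    """Cosine similarity, with a small constant to avoid division by zero."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))


def similarity(w1, w2):
    """Similarity of two vocabulary words in the reduced space."""
    return cosine(word_vecs[index[w1]], word_vecs[index[w2]])


if __name__ == "__main__":
    print("cat ~ dog:   ", round(similarity("cat", "dog"), 3))
    print("cat ~ stocks:", round(similarity("cat", "stocks"), 3))
```

In this toy setting, "cat" and "dog" never occur in the same document, yet they land near each other because they share context words, while "cat" and "stocks" remain far apart. Practical systems typically apply a weighting such as tf-idf before the decomposition and use far larger corpora.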



Mari Ostendorf

Mari Ostendorf is a Professor of Electrical Engineering at the University of Washington. After receiving her PhD in electrical engineering from Stanford University, she worked at BBN Laboratories, then Boston University, and then joined the University of Washington (UW) in 1999. At UW, she is an Endowed Professor of System Design Methodologies in Electrical Engineering and an Adjunct Professor in Computer Science and Engineering and in Linguistics. From 2009 to 2012, she served as the Associate Dean for Research and Graduate Studies in the College of Engineering. She has previously been a visiting researcher at the ATR Interpreting Telecommunications Laboratory and at the University of Karlsruhe, a Scottish Informatics and Computer Science Alliance Distinguished Visiting Fellow at the University of Edinburgh, and an Australia-America Fulbright Scholar at Macquarie University. Prof. Ostendorf's research interests are in dynamic and linguistically motivated statistical models for speech and language processing. Her work has resulted in over 200 publications and 2 paper awards. Prof. Ostendorf has served as co-Editor of Computer Speech and Language and as the Editor-in-Chief of the IEEE Transactions on Audio, Speech and Language Processing, and is currently the VP Publications for the IEEE Signal Processing Society. She is also a member of the ISCA Advisory Council. She is a Fellow of IEEE and ISCA, a recipient of the 2010 IEEE HP Harriett B. Rigas Award, and a 2013 IEEE Signal Processing Society Distinguished Lecturer.

University of Washington