This paper describes techniques developed to enhance the search coverage of domain-specific Language Models (LMs) by incorporating domain-independent language knowledge from the Google 1 trillion n-gram corpus, which was created from general World Wide Web (WWW) documents. The purpose of our study is to explore a Natural Language (NL) based multimodal search interface for TV-oriented online Electronic Programming Guides (EPGs) that supports both typed and spoken queries. The proposed method uses a two-pass procedure that combines the domain knowledge derived from an EPG-specific training corpus with the Google n-gram corpus, which serves as a domain-independent phrase dictionary whose built-in counts reflect usage by web page authors. The enhanced LMs achieve an absolute improvement of 23% in model coverage (recall) without degrading precision, as measured by Word Error Rate (WER), on a test set of in-domain spoken query utterances.
