The World-Wide Web contains vast quantities of structured data across a variety of domains, such as hobbies, products and reference data. Moreover, the Web provides a platform that can encourage governments and other public organizations to publish more data sets, and that can support new data management opportunities, such as effective crisis response, data journalism and crowd-sourcing of data sets. To enable such widespread dissemination and use of structured data on the Web, we need to create an ecosystem that makes it easier for users to discover, manage, visualize and publish structured data on the Web.
I will describe two projects that aim to create pieces of this ecosystem, both of which started in research and went into production at Google. The first, Google Fusion Tables, is a service that makes it easy for users to manage and share data, and to create visualizations that can be published on the Web. The second, the WebTables Project, helps users discover interesting data sets by finding high-quality tables on the Web and providing effective search over the resulting collection of 200 million tables.
Alon Halevy heads the Structured Data Management Research group at Google. Prior to that, he was a professor of Computer Science at the University of Washington in Seattle, where he founded the database group, and a principal member of technical staff at AT&T Research. In 1999, Dr. Halevy co-founded Nimble Technology, one of the first companies in the Enterprise Information Integration space, and in 2004 he founded Transformic Inc., a company that created search engines for the deep web and was acquired by Google. Dr. Halevy is a Fellow of the Association for Computing Machinery, received the Presidential Early Career Award for Scientists and Engineers (PECASE) in 2000, and was a Sloan Fellow (1999–2000). He received his Ph.D. in Computer Science from Stanford University in 1993. He is the author of The Infinite Emotions of Coffee, a book depicting stories of coffee culture in 30 countries, and co-author of Principles of Data Integration, to be published in June 2012 by Morgan Kaufmann.