att_abstract={{The Web contains a significant volume of structured data in various
domains, but a lot of data are dirty and erroneous, and they can be
propagated through copying. While data integration techniques allow
querying structured data on the Web, they take the union of the
answers retrieved from different sources and can thus return conflicting
information. Data fusion techniques, on the other hand, aim
to find the true values, but are designed for offline data aggregation
and can take a long time.

This paper proposes the first online data fusion system. It starts
by returning answers from the first probed source, and refreshes
the answers as it probes more sources and applies fusion techniques
to the retrieved data. For each returned answer, it shows the likelihood
that the answer is correct, and stops retrieving data for it after
gaining enough confidence that data from the unprocessed sources
are unlikely to change the answer. We address key problems in
building such a system and show empirically that the system can
start returning correct answers quickly and terminate fast without
sacrificing the quality of the answers.}},
	att_authors={xd0649, ds8961},
	att_copyright={{VLDB Foundation}},
	att_copyright_notice={{The definitive version was published in Very Large Databases, 2011 (2011-08-29).}},
	author={Xuan Liu and Xin Dong and Beng Chin Ooi and Divesh Srivastava},
	institution={{VLDB Conference}},
	title={{Online Data Fusion}},