Thursday, July 30, 2009

Cooperative Update Exchange in the Youtopia System

In the Youtopia project, we are building a system to allow communities to share structured data. There is an increasing amount of such data being shared on the Web, but there are few tools to make this sharing and management easy for everyone. What is needed is the equivalent of Wiki software for structured data, i.e. a collaborative database management system.

Designing a collaborative DBMS poses unique and exciting technical challenges because all functionality must be fully decentralized. In a recent paper to appear in VLDB 2009, we tackle one such challenge, namely, the problem of update exchange.

Update exchange is a process where changes to data in one table are automatically propagated to other tables as directed by a set of mappings (or tuple-generating dependencies). For example, suppose table A contains information about restaurants in New York State and table B contains data about restaurants in Ithaca, as well as restaurant reviews. If a new restaurant is opened in Ithaca and
entered into table A, a mapping can automatically propagate that information to table B, notifying the owner of table B of the new restaurant and giving them a chance to supply a review.

In our paper, we introduce a model for update exchange which is fundamentally cooperative. It is a variant of the classical chase that includes human intervention at crucial points. Thus, the users have a great deal of control over how the mappings fire. An additional advantage of our model is that the mappings may form arbitrary cycles - we are able to drop the acyclicity restrictions often found in the literature without risking infinite sequences of new tuple insertions.

Our paper presents our update exchange model and explains how to design a practical system that incorporates it. The key observation is that human intervention is slow and unpredictable, and the system must never delay new incoming queries and updates just because it is awaiting human input on an older update. This means that in general, multiple chases will be going on in the system simultaneously. We study the potential for interference among such chases. We provide a serializability framework and concrete algorithms that make it possible to avoid all such interference, should it be unacceptable in a given deployment scenario.

We are currently putting our work into practice and building a prototype of the Youtopia system. Stay tuned for updates, and we hope to see you at the talk in Lyon!

No comments:

Post a Comment