On-Demand Data Integration: Dataspaces by Refinement

The broad objective of dataspaces is to make structured data available in an integrated way, with minimal effort spent on developing and maintaining the mappings that are central to classical data integration. Given this emphasis on automation and reduced cost, our approach is to support the incremental refinement of automatically generated mappings, using information from a range of sources (e.g. the users, the data in the sources themselves). The aim of our project is therefore to investigate how a dataspace management system (DSMS) can provide incremental integration of heterogeneous sources. We assume that, for incrementality to be effective, a DSMS should: (i) provide different qualities of data integration at different costs; (ii) indicate to users the likely quality, or at least the origin, of query answers; (iii) allow users to influence the behaviour of a dataspace by stating their non-functional requirements, providing feedback on the quality of answers, and supplying sample answers that would meet their expectations; and (iv) enable users to share or personalise their use of the dataspace according to their preferences or distinctive requirements. The resulting objective is to improve understanding of dataspaces by designing, evaluating and revising techniques that enable incremental, user-directed data integration. In particular, we propose: