The broad objective of dataspaces is to make structured data available in an integrated way, with minimal effort spent on developing and maintaining the mappings that are central to classical integration. In keeping with this emphasis on automation and reduced cost, we seek to support the incremental refinement of automated mappings, using information of different origins (e.g. the users, the data in the different sources). The aim of our project is therefore to investigate how a dataspace management system (DSMS) can provide incremental integration of heterogeneous sources. We assume that, for incrementality to be effective, a DSMS should: (i) provide different qualities of data integration at different costs; (ii) indicate to users the likely quality, or at least the origin, of query answers; (iii) allow users to influence the behaviour of a dataspace by stating their non-functional requirements, providing feedback on the quality of answers, and supplying sample answers that would meet their expectations; and (iv) enable users to share or personalise their usage of the dataspace according to their preferences or distinctive requirements. The ensuing objectives are to improve understanding of dataspaces by designing, evaluating and revising techniques that enable incremental, user-directed data integration. In particular, we propose:
- To design a software framework that supports the flexible development of dataspace management systems through the replacement of key components, including schema mapping and result ranking algorithms.
- To investigate the annotation of schema mappings with measures of their likely quality, derived from a range of sources.
- To explore how lineage information, combined with indications as to likely quality, can be conveyed to users, and in turn to identify how feedback from users on the quality of results can be reflected in annotations.
- To investigate how the quality of query answers and mappings can be improved given explicit user direction.
- To explore how annotations and user preferences can be used in the ranking of query results.
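
To make the interplay between these objectives concrete, the following is a minimal sketch in Python, not a design of the proposed framework. It assumes that each mapping carries a single quality score in [0, 1], that each answer records the mapping that produced it as its lineage, and that the ranking algorithm is a replaceable function. All names (`Mapping`, `Answer`, `Dataspace`, `rank_by_mapping_quality`, `apply_feedback`) and the exponential-smoothing update rule are hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Mapping:
    """A schema mapping annotated with an assumed quality score in [0, 1]."""
    name: str
    quality: float
    feedback_count: int = 0


@dataclass
class Answer:
    """A query answer together with its lineage (the producing mapping)."""
    value: str
    lineage: str


# A ranker is any function from answers and mapping annotations to an ordering,
# so alternative ranking algorithms can be swapped in without other changes.
Ranker = Callable[[List[Answer], Dict[str, Mapping]], List[Answer]]


def rank_by_mapping_quality(answers: List[Answer],
                            mappings: Dict[str, Mapping]) -> List[Answer]:
    """Order answers by the quality annotation of their producing mapping."""
    return sorted(answers, key=lambda a: mappings[a.lineage].quality, reverse=True)


def apply_feedback(mapping: Mapping, correct: bool, weight: float = 0.5) -> None:
    """Move the quality annotation towards 1.0 (correct) or 0.0 (incorrect).

    Exponential smoothing is an illustrative assumption, not a proposed method.
    """
    target = 1.0 if correct else 0.0
    mapping.quality = (1.0 - weight) * mapping.quality + weight * target
    mapping.feedback_count += 1


class Dataspace:
    """Holds annotated mappings and a replaceable ranking component."""

    def __init__(self, mappings: List[Mapping], ranker: Ranker) -> None:
        self.mappings = {m.name: m for m in mappings}
        self.ranker = ranker

    def rank(self, answers: List[Answer]) -> List[Answer]:
        return self.ranker(answers, self.mappings)

    def feedback(self, answer: Answer, correct: bool) -> None:
        # Lineage tells us which mapping's annotation the feedback should revise.
        apply_feedback(self.mappings[answer.lineage], correct)
```

In this sketch, marking an answer as incorrect lowers the annotation of the mapping it came from, so later rankings favour answers produced by other mappings; a different ranker or update rule could be substituted without touching `Dataspace`.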