Translating Web Data

Authors: 
Popa, Lucian; Velegrakis, Yannis; Miller, Renee; Hernandez, Mauricio; Fagin, Ronald
Year: 
2002
Venue: 
In Proceedings of VLDB, pages 598--609, 2002
URL: 
http://www.cs.toronto.edu/~miller/papers/PVHMF02.pdf
Citations: 
437

Mapping and translating data stored in different formats continues to be an important problem in modern information systems. We present a novel framework for mapping among XML and relational schemas in which a high-level mapping is translated into semantically meaningful queries that transform source data into the target representation. Our approach works in two phases. In the first phase, a high-level mapping, expressed as a set of attribute-to-attribute correspondences, is processed and converted into a logical mapping that captures the design choices made in the source and target schemas (including their hierarchical organization and the grouping of attributes into nested tables and sets). The second phase translates the logical mapping into a query that can be executed over the source schemas and is guaranteed to produce data satisfying the constraints and structure of the target schema. To this end, target attribute values may need to be invented to ensure that the data respects the constraints (including nested referential constraints) and the (possibly nested) structure of the target schema. Our approach is unique in that (1) we consider not only relational schemas, but also XML schemas with (nested) constraints; (2) for this large class of schemas, the mapping algorithm is complete in that it produces all mappings that are consistent with the schema constraints; (3) our data translation algorithm correctly translates source data even if there is missing data in the target (attributes with no correspondence to the source). We have implemented the mapping algorithm in a high-level schema mapping tool.
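
The idea of inventing target values so that nested structure and referential constraints hold can be illustrated with a small sketch. This is not the paper's algorithm or query language; it is a minimal Python illustration under assumed names. The source rows, the target layout (`orgs` with nested `fundings`, plus a top-level `financials` set), and the helper `skolem` are all hypothetical. The target key `fin_id` has no correspondence in the source, so it is generated deterministically from the source values it depends on, which keeps each nested `fin_ref` consistent with the key it references.

```python
def skolem(name, *args):
    """Invent a deterministic placeholder value from the source values it depends on.

    Hypothetical helper: the same arguments always yield the same invented value.
    """
    return f"{name}({', '.join(map(str, args))})"


def translate(source_rows):
    """Translate flat source rows into a nested target instance.

    Assumed correspondences (illustrative only):
      company -> orgs.name
      grant   -> orgs.fundings.grant
      amount  -> financials.amount
    financials.fin_id and fundings.fin_ref have no source correspondence,
    so they are invented with a Skolem-style function.
    """
    orgs = {}        # one target organization per distinct company (grouping)
    financials = []  # top-level set referenced from the nested fundings
    for row in source_rows:
        org = orgs.setdefault(
            row["company"], {"name": row["company"], "fundings": []}
        )
        # Invented key: depends only on the source values identifying the grant.
        fin_id = skolem("FinId", row["company"], row["grant"])
        org["fundings"].append({"grant": row["grant"], "fin_ref": fin_id})
        financials.append({"fin_id": fin_id, "amount": row["amount"]})
    return {"orgs": list(orgs.values()), "financials": financials}


rows = [
    {"company": "Acme", "grant": "G1", "amount": 100},
    {"company": "Acme", "grant": "G2", "amount": 250},
    {"company": "Initech", "grant": "G3", "amount": 75},
]
target = translate(rows)
```

In this sketch the two rows for the same company are grouped under one organization, and every `fin_ref` in a nested funding matches a `fin_id` in `financials`, so the assumed referential constraint holds even though the source supplies no key values.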