Inverting Schema Mappings: Bridging the Gap between Theory and Practice

Arenas, M; Perez, J; Reutter, J; Riveros, C
VLDB 2009
Citations range: 10 - 49

The inversion of schema mappings has been identified as one of the fundamental operators for the development of a general framework for metadata management. In fact, in recent years three alternative notions of inversion for schema mappings have been proposed (Fagin-inverse [10], quasi-inverse [14] and maximum recovery [2]). However, the procedures that have been developed for computing these operators have some features that limit their practical applicability. First, these algorithms work in exponential time and produce inverse mappings of exponential size. Second, these algorithms express inverses in mapping languages that include features that are difficult to use in practice. A typical example is the use of disjunction in the conclusion of the mapping rules, which makes the process of exchanging data much more complicated. In this paper, we propose solutions for the two problems mentioned above. First, we provide a polynomial time algorithm that computes the three inverse operators mentioned above given a mapping specified by a set of tuple-generating dependencies (tgds). This algorithm uses an output mapping language that can express these three operators in a compact way and, in fact, can compute inverses for a much larger class of mappings. Unfortunately, it has already been proved that mapping languages of this type must include some features that are difficult to use in practice and, hence, this is also the case for our output mapping language. Thus, as our second contribution, we propose a new and natural notion of inversion that overcomes this limitation. In particular, every mapping specified by a set of tgds admits an inverse under this new notion that can be expressed in a mapping language that slightly extends tgds, and that has the same good properties for data exchange as tgds. Finally, as our last contribution, we provide an algorithm for computing such inverses.