Integration workbench: Integrating schema integration tools

Mork, P; Rosenthal, A; Seligman, L; Korb, J; Samuel, K
Proc. ICDE Workshops
Citations range: 10 - 49

A key aspect of any data integration endeavor is establishing a transformation that translates instances of one or more source schemata into instances of a target schema. This schema integration task must be tackled regardless of the integration architecture or mapping formalism. In this paper we provide a task model for schema integration. We use this breakdown to motivate a workbench for schema integration in which multiple tools share a common knowledge repository. In particular, the workbench facilitates the interoperation of research prototypes for schema matching (which automatically identify likely semantic correspondences) with commercial schema mapping tools (which help produce instance-level transformations). Currently, each of these tools provides its own ad hoc representation of schemata and mappings; combining these tools requires aligning these representations. The workbench provides a common representation so that these tools can more rapidly be combined.

Researchers have built many systems to semiautomatically perform schema matching [1]. Schema mapping tools generally provide the user with a graphical interface in which lines connecting related entities and attributes can be annotated with functions or code to perform any necessary transformations. From these mappings, they synthesize transformations for entire databases or documents. These tools have been developed by commercial vendors (including Altova's MapForce, BEA's AquaLogic, and Stylus Studio's XQuery Mapper) and research projects (such as Clio [2], COMA++ [3] and the wrapper toolkit in TSIMMIS [4]).

Currently an integration engineer can choose to embrace a specific development environment. The engineer benefits from the automated support provided by that vendor, but cannot leverage new tools as they become available. The alternative is to splice together a number of tools, each of which has its own internal representation for schemata and mappings. In one case, we needed four different pieces of software to transform a mapping from one tool's representation into another. By adopting an open, extensible workbench, integration engineers can more easily leverage automated tools as they become available and choose the best tool for the problem at hand.
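
To make the idea of a shared repository concrete, the following is a minimal sketch of how a matching tool and a mapping tool might cooperate through one common representation. It is not the paper's implementation: the names `Repository`, `Correspondence`, `name_matcher`, and `annotate` are hypothetical, and the string-similarity matcher is only a stand-in for a real schema matcher.

```python
from dataclasses import dataclass, field, replace
from difflib import SequenceMatcher
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Correspondence:
    """Hypothetical link between a source and a target element."""
    source: str
    target: str
    confidence: float
    transform: Optional[str] = None  # transformation code, added by a mapping step


@dataclass
class Repository:
    """Hypothetical shared store that every tool reads from and writes to."""
    schemata: Dict[str, List[str]] = field(default_factory=dict)
    correspondences: List[Correspondence] = field(default_factory=list)


def name_matcher(repo: Repository, source: str, target: str,
                 threshold: float = 0.6) -> None:
    """Toy matcher: propose correspondences from element-name similarity."""
    for s in repo.schemata[source]:
        for t in repo.schemata[target]:
            score = SequenceMatcher(None, s.lower(), t.lower()).ratio()
            if score >= threshold:
                repo.correspondences.append(
                    Correspondence(f"{source}.{s}", f"{target}.{t}", score))


def annotate(repo: Repository, source: str, target: str, transform: str) -> bool:
    """Toy mapping step: attach transformation code to a stored correspondence."""
    for i, c in enumerate(repo.correspondences):
        if c.source == source and c.target == target:
            repo.correspondences[i] = replace(c, transform=transform)
            return True
    return False  # no such correspondence in the repository


# Both tools operate on the same repository, so no format conversion is needed.
repo = Repository()
repo.schemata["src"] = ["cust_name", "dob"]
repo.schemata["tgt"] = ["customer_name", "birth_date"]
name_matcher(repo, "src", "tgt")
annotate(repo, "src.cust_name", "tgt.customer_name", "upper(cust_name)")
```

Because the matcher writes its proposals into the repository and the mapping step refines those same records, adding another tool in this sketch means writing one adapter to the shared representation rather than one converter per pair of tools.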