Clio Grows Up: From Research Prototype to Industrial Tool

Authors: 
Haas, Laura; Hernandez, Mauricio; Ho, Howard; Popa, Lucian; Roth, Mary
Year: 
2005
Venue: 
Proc. SIGMOD 2005
URL: 
http://www.dit.unitn.it/~p2p/RelatedWork/Matching/clio-industrial.pdf
DOI: 
http://doi.acm.org/10.1145/1066157.1066252
Citations: 
212
Citations range: 
100 - 499
Attachment: 
Haas2005ClioGrowsUpFromResearch.pdf (131.55 KB)

Clio, the IBM Research system for expressing declarative schema mappings, has progressed in the past few years from a research prototype into a technology that is behind some of IBM's mapping technology. Clio provides a declarative way of specifying schema mappings between either XML or relational schemas. Mappings are compiled into an abstract query graph representation that captures the transformation semantics of the mappings. The query graph can then be serialized into different query languages, depending on the kind of schemas and systems involved in the mapping. Clio currently produces XQuery, XSLT, SQL, and SQL/XML queries. In this paper, we revisit the architecture and algorithms behind Clio. We then discuss some implementation issues, optimizations needed for scalability, and general lessons learned on the road toward creating an industrial-strength tool.