A Randomized Approach for the Incremental Design of an Evolving Data Warehouse

Theodoratos, D; Dalamagas, T; Simitsis, A; A Simitsis, M
Proc. 20th International Conference on Conceptual Modeling (ER 01), LNCS 2224
Citations range: 10 - 49

A Data Warehouse (DW) can be used to integrate data from multiple distributed data sources. A DW can be seen as a set of materialized views that determine its schema and its content in terms of the schema and the content of the data sources. DW applications require high query performance. For this reason, the design of a typical DW consists of selecting views to materialize that are able to answer a set of input user queries. However, the cost of answering the queries has to be balanced against the cost of maintaining the materialized views. In an evolving DW application, new queries need to be answered by the DW. An incremental selection of materialized views uses the materialized views already in the DW to answer parts of the new queries, and avoids the re-implementation of the DW from scratch. This incremental design is complex and an exhaustive approach is not feasible. We have developed a randomized approach for incrementally selecting a set of views that are able to answer a set of input user queries locally while minimizing a combination of the query evaluation and view maintenance cost. In this process we exploit “common sub-expressions” among new queries and between new queries and old views. Our approach is implemented and we report on its experimental evaluation.
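
The selection problem described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it is a plain randomized iterative-improvement search over a toy cost model, and every name in it (`total_cost`, `randomized_selection`, the `base_cost`, `answers`, and `maintenance` fields, the `weight` parameter) is hypothetical. It assumes that each candidate view lists the cost of answering the queries it can serve, that a query with no usable view falls back to a fixed `base_cost` (a simplification of the requirement that queries be answered locally), and that views already in the DW are always kept.

```python
import random


def total_cost(selected, queries, views, weight=1.0):
    """Toy objective: query evaluation cost plus weighted view maintenance cost."""
    query_cost = 0.0
    for q in queries:
        # Cheapest option for q: the fallback cost, or any selected view that answers it.
        options = [q["base_cost"]]
        for v in selected:
            if q["name"] in views[v]["answers"]:
                options.append(views[v]["answers"][q["name"]])
        query_cost += min(options)
    maintenance = sum(views[v]["maintenance"] for v in selected)
    return query_cost + weight * maintenance


def randomized_selection(queries, views, old_views=(), steps=200, seed=0):
    """Randomized iterative improvement: flip one candidate view at a time
    and keep the change only if the total cost decreases."""
    rng = random.Random(seed)
    # Old views stay materialized; only the remaining views are candidates.
    candidates = sorted(set(views) - set(old_views))
    current = set(old_views)
    best = total_cost(current, queries, views)
    for _ in range(steps):
        if not candidates:
            break
        v = rng.choice(candidates)
        neighbor = current ^ {v}  # add v if absent, drop it if present
        cost = total_cost(neighbor, queries, views)
        if cost < best:
            current, best = neighbor, cost
    return current, best


# Hypothetical example: one old view, two candidate views, two new queries.
example_queries = [
    {"name": "q1", "base_cost": 100},
    {"name": "q2", "base_cost": 80},
]
example_views = {
    "v_old": {"answers": {"q1": 40}, "maintenance": 10},
    "v_new": {"answers": {"q1": 30, "q2": 20}, "maintenance": 15},
    "v_bad": {"answers": {"q2": 70}, "maintenance": 50},
}
chosen, cost = randomized_selection(example_queries, example_views, old_views={"v_old"})
```

In this toy instance the search keeps the old view, adds `v_new` because it serves both new queries cheaply, and rejects `v_bad` because its maintenance cost outweighs the query savings. A search of this kind only reaches a local optimum; restarts or a simulated-annealing acceptance rule are common ways to reduce that risk.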