Automating Database Schema Evolution in Information System Upgrades

Curino, Carlo; Moon, Hyun J.; Zaniolo, Carlo
Curino, C
Zaniolo, C
Moon, H
Hot Topics In Software Upgrade
Citations range: 
10 - 49

The complexity, cost, and down-time currently created by the database schema evolution process is the source of incessant problems in the life of information systems and a major stumbling block that prevent graceful upgrades. Furthermore, our studies shows that the serious problems encountered by traditional information systems are now further exacerbated in web information systems and cooperative scientific databases where the frequency of schema changes has increased while tolerance for downtimes has nearly disappeared. The PRISM project seeks to develop the methods and tools that turn this error-prone and time-consuming process into one that is controllable, predictable and avoids down-time. Toward this goal, we have assembled a large testbed of schema evolution histories, and developed a language of Schema Modification Operators (SMO) to express concisely these histories. Using this language, the database administrator can specify new schema changes, and then rely on PRISM to (i) predict the effect of these changes on current applications, (ii) translate old queries and up- dates to work on the new schema version, (iii) perform data migration, and (iv) generate full documentation of intervened changes. Furthermore, PRISM achieves good usability and scalability by incorporating recent advances on mapping composition and invert- ibility in the implementation of (ii). The progress in automating schema evolution so achieved provides the enabling technology for other advances, such as light-weight database design methodologies that embrace changes as the regular state of software. While these topics remain largely unexplored, and thus provide rich opportunities for future research, an important area which we have been investigated is that of archival information systems, where PRISM query mapping techniques were used to support flash- back and historical queries for database archives under schema evolution.