10:15 a.m., Monday, March 8, 1993
719 Broadway, 12th floor conference room
We have devised a simple framework for automating physical design in relational or post-relational DBMSs and in database programming languages. Within this framework, design options are uniformly represented as ``features'', and designs are represented by ``conflict''-free sets of features. (Mutually exclusive features conflict. An example would be two primary indexes on the same table.) The uniform representation of design options as features accommodates a greater variety of design options than previous approaches; adding a new design option (e.g. a new index type) merely entails characterizing it as a feature with appropriate parameters. We propose an approximation algorithm, based on this framework, that finds low-cost physical designs. In an initial phase, the algorithm examines the logical schema, data statistics, and queries, and generates ``useful features''-features that might reduce query costs. In a subsequent phase, the algorithm uses the DBMS's cost estimates to find ``best features''-features that belong to the lowest-cost designs for each individual query. Finally, the algorithm searches among conflict-free subsets of the best features of all the queries to find organizations with low global cost estimates. We have implemented a prototype physical design assistant for the INGRES relational DBMS, and we evaluate its designs for several benchmarks, including ASSSAP. Our experiments with the prototype show that it can produce good designs, and that the critical factor limiting their quality is the accuracy of query cost estimates. The prototype implementation isolates dependencies on INGRES, permitting our framework to produce design assistants for a wide range of relational, nested-relational, and object-oriented DBMSs.