This section
evaluates
the Model D optimizer
in terms of the
four measures described in section 5:
extensibility,
efficiency,
accuracy and
the quality of plans produced.
The basis for this evaluation
is the performance of the Model D optimizer
on the 17 TPC-D queries.
For the evaluation,
each of the 17 TPC-D queries was hand-parsed into
an initial query tree in
the Model D
logical algebra
and submitted to the Model D optimizer.
The initial query tree
used a left-deep join ordering
so that the left-deep-only join order heuristic
rules could be applied.
The optimizer then output
the optimal plan
and its estimated cost.
The cost model parameters that were used
are shown in Appendix CM.
The stored relation
cardinality information
is in Appendices C.3 and C.4.
A TPC-D scale factor of 1.0
was used,
resulting in a LINEITEM table of 6 million tuples,
and an overall database size of about 3 gigabytes [Li 96].
The indices defined are shown in Appendix C.2.
The optimizer
was run on
a sun4m workstation
with 115 megabytes of memory,
running SunOS 4.1.4.
The optimization time
reported is the user time
from the Unix time command.
The reported time
is still somewhat dependent
on the load
on the machine
(many other users were on the machine during the tests),
but the use of user time
instead of system time or elapsed time
reduced the effect
of other unrelated jobs
on the system.
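The distinction between user, system and elapsed time can be illustrated with a short Python sketch. This is not the procedure used for the tests (those used the Unix time command); the helper name time_command and the sample child command are hypothetical. The sketch assumes a Unix system, where os.times() reports the CPU time accumulated by child processes that have been waited for.

```python
import os
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd and return its user, system and elapsed times in seconds.

    Hypothetical helper for illustration only; it mimics the three
    figures printed by the Unix time command.
    """
    before = os.times()
    start = time.perf_counter()
    subprocess.run(cmd, check=True)  # waits for the child to finish
    elapsed = time.perf_counter() - start
    after = os.times()
    return {
        # CPU time the child spent executing its own code
        "user": after.children_user - before.children_user,
        # CPU time the kernel spent on behalf of the child
        "system": after.children_system - before.children_system,
        # Wall-clock time, which also includes time spent waiting
        # while other jobs held the processor
        "elapsed": elapsed,
    }


if __name__ == "__main__":
    # Sample child command (an assumption, standing in for the optimizer).
    print(time_command([sys.executable, "-c", "sum(range(10**6))"]))
```

On a loaded machine the elapsed figure grows with the number of competing jobs, while the user figure stays closer to the work the process itself performed.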
In the tables below,
a substitute query,
8y,
is used in addition
to TPC-D query 8.
The difference
between TPC-D query 8
and query 8y
is that the ORDER relation
used by 8y
has been altered
to include
an explicit ship year attribute
in addition to
the shipdate attribute
specified in TPC-D.
As a result,
in 8y
there is no need
for a FUNC_OP
to compute the year value.
This improves cardinality estimates
(the column unique cardinalities
for FUNC_OPs
are not computed very well in Model D)
and simplifies the query.
Because of
the improved (lower) cardinality estimates
of query 8y,
its optimal plan has a slightly lower estimated cost
than the plan for query 8.
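One plausible reading of why a stored year attribute lowers the estimates can be sketched in Python. The sketch assumes a naive estimator that gives the output of a function as many distinct values as its input column; this estimator is hypothetical and is not claimed to be the rule Model D applies. The seven-year date range is also an arbitrary choice for illustration, not the TPC-D date range.

```python
from datetime import date, timedelta


def distinct_counts(first, last):
    """Return (distinct dates, distinct years) for the closed range [first, last]."""
    days = (last - first).days + 1
    dates = [first + timedelta(days=i) for i in range(days)]
    return len(set(dates)), len({d.year for d in dates})


# Arbitrary illustrative range, not taken from the TPC-D specification.
n_dates, n_years = distinct_counts(date(1992, 1, 1), date(1998, 12, 31))

# Hypothetical naive estimate: year(date) is assumed to have as many
# distinct values as the date column it is computed from.
naive_year_groups = n_dates

# With an explicit year column, the distinct count is a stored statistic.
stored_year_groups = n_years

print(naive_year_groups, stored_year_groups)
```

Under these assumptions, an operator that groups on the computed year would be estimated to produce thousands of groups, whereas the stored attribute gives a count equal to the number of years.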
More significantly,
query 8y uses
a special relation,
NATION-REGION,
not part of the TPC-D specification,
which is the Cartesian product
of relations NATION and REGION.
Use of this relation
reduces the number of joins from 7 to 6,
which significantly reduces
the optimization time.
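The effect of removing one join can be illustrated with a rough count of left-deep join orders. The sketch below assumes that every permutation of the relations is a candidate order, which is an upper bound chosen for illustration and not a description of the space Model D actually searches. The function name left_deep_orders is hypothetical. The NATION and REGION cardinalities (25 and 5 tuples) are those of the TPC-D specification.

```python
from math import factorial


def left_deep_orders(num_joins):
    """Upper bound on left-deep join orders for a query with num_joins joins.

    A query with num_joins joins has num_joins + 1 relations, and each
    permutation of those relations is counted as one left-deep order.
    """
    return factorial(num_joins + 1)


orders_q8 = left_deep_orders(7)   # query 8: 7 joins, 8 relations
orders_q8y = left_deep_orders(6)  # query 8y: 6 joins, 7 relations

# Size of the NATION-REGION relation as a Cartesian product.
NATION_TUPLES = 25
REGION_TUPLES = 5
nation_region_tuples = NATION_TUPLES * REGION_TUPLES

print(orders_q8, orders_q8y, orders_q8 // orders_q8y, nation_region_tuples)
```

Under this assumption the count of candidate orders drops by a factor of 8, at the price of a precomputed relation of only 125 tuples.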