For the initial
column unique cardinalities
(cucards),
minimum values (cmin),
and maximum values (cmax)
of all of the TPC-D attributes,
see Appendix C.4,
Catalog Statistics for TPC-D.
The logical properties of new groups
are computed by the function find_log_prop,
defined for every logical operator.
This function is described for every logical (bulk) operator
in Appendix L.
However, a few general observations can be made.
When a logical operator updates
an equivalence class
of the schema logical property,
it updates the cucard (see footnote 3)
but not cmin and cmax.
This is because
the schema
represents dynamic information
and cucards change
as operations are applied to a table.
Although the cmin and cmax values
would, in general,
change
(e.g., when a SELECT operator applies a range predicate),
this is not modeled in Model D.
The initial cucard, cmin, and cmax values
(as calculated by the GET logical operator)
would be determined
by accessing the catalog
using the ATTR_CAT
entries of the schema.
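The treatment of the schema statistics described above can be sketched in Python.
This is a minimal illustration, not the optimizer's actual code:
the names AttrStats, CATALOG, get_schema and update_schema,
the attribute names, and all numeric values are hypothetical,
and the rule used to update cucard (capping it at the output cardinality)
is an assumption made only for this example.
The actual rules for each operator are those of find_log_prop.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AttrStats:
    """Per-attribute entry of the schema logical property (hypothetical layout)."""
    cucard: float  # column unique cardinality
    cmin: float    # minimum value
    cmax: float    # maximum value


# Hypothetical stand-in for the catalog (ATTR_CAT) entries; values are made up.
CATALOG = {
    "T.a": AttrStats(cucard=50.0, cmin=1.0, cmax=50.0),
    "T.b": AttrStats(cucard=1000.0, cmin=0.0, cmax=9999.0),
}


def get_schema(attr_names):
    """GET: read the initial cucard, cmin and cmax of each attribute from the catalog."""
    return {name: CATALOG[name] for name in attr_names}


def update_schema(schema, output_card):
    """Update cucard only; cmin and cmax are carried over unchanged.

    Assumed rule (not from the source): a column cannot hold more
    distinct values than the output table has tuples.
    """
    return {
        name: replace(stats, cucard=min(stats.cucard, output_card))
        for name, stats in schema.items()
    }


schema = get_schema(["T.a", "T.b"])
updated = update_schema(schema, output_card=200.0)
```

In this sketch the cucard of T.b is reduced while its cmin and cmax keep their catalog values,
which mirrors the statement that the value range is not remodeled in Model D.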
The
selectivity
of a predicate
is an estimate of
the fraction of tuples
for which the predicate
will evaluate to TRUE.
Recall that
in order
to determine
the cardinality
of a table
produced by the SELECT logical operator,
the selectivity
of the predicate
(the second input of the SELECT operator)
must be determined.
The problem (addressed below)
is how to compute
the selectivity of such predicates.
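To make the definition concrete, the following Python sketch computes a selectivity
directly as the fraction of tuples satisfying a predicate,
and then uses it to estimate the cardinality of a SELECT output.
It only illustrates the definition: an optimizer has to estimate the selectivity
from catalog statistics without scanning the data.
The function names, the sample rows, and the attribute name qty are hypothetical.

```python
def observed_selectivity(rows, predicate):
    """Fraction of the given tuples for which the predicate evaluates to True."""
    rows = list(rows)
    if not rows:
        return 0.0  # convention chosen for this sketch: empty input has selectivity 0
    matches = sum(1 for row in rows if predicate(row))
    return matches / len(rows)


def select_output_cardinality(input_card, selectivity):
    """Estimated number of tuples produced by SELECT: input cardinality times selectivity."""
    if not 0.0 <= selectivity <= 1.0:
        raise ValueError("selectivity must lie in [0, 1]")
    return input_card * selectivity


# Eight made-up tuples with qty = 1..8; the predicate qty <= 2 holds for two of them.
sample = [{"qty": q} for q in range(1, 9)]
sel = observed_selectivity(sample, lambda row: row["qty"] <= 2)

# Apply the same selectivity to a hypothetical input table of 2000 tuples.
estimated_card = select_output_cardinality(2000, sel)
```

Here two of the eight sample tuples satisfy the predicate,
so the selectivity is one quarter of the input.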
3. In Model D, the dynamically updated cucard
is not used in selectivity estimates
(see discussion below)
but is used in calculating join cardinalities.