Explanation of the formula 3n - (2(n+1) + 1) + n for the number of expressions


           Make a table of all n range variables, corresponding to the n stored relations that will be joined. Then mark each range variable as either -, L or R. There are 3n different assignments of these symbols. Each assignment can be thought of as a logical expression -- except that as we see below some assignments would correspond to expressions that are not WELL-FORMED (I call them illegal). The L means that the stored relation is on the left hand side of a join expression, the R means that the stored relation is on the right hand side of a join expression and the - indicates that the stored relation is not in this expression.

           Then subtract off assignments which have only L's or only R's (not both). There will be 2n -1 assignments with only L's, and the same number with only R's. Since a join always takes two inputs, these correspond to invalid join expressions. But, in Cascades, you will have a single GET expression for every stored relation, add back n assignments -- one for each of the n stored relations.

           Finally subtract off the assignment of all -'s (no L's or R's) -- this corresponds to nothing -- there isn't even a group for this.

           So we start with 3n assignments, subtract off 2n - 1, 2n - 1 and 1, and add back n. This adds up to
3n - (2(n+1) + 1) + n.

As we said before, the final addition of n is due to logical expressions with Model D GET operator -- these are not be counted in [Pellenkoft 96] because they are only considering the number of join expressions.


           Here is a sample of a few table entries (in NO PARTICULAR ORDER ) and their corresponding logical expressions for the number of possible expressions in the query AXBXC

table			expressions or why its illegal -- 

A B C			A B and C are tables to be joined

- - -			illegal expression -- corresponds to nothing: subtract 1
- - R			corresponds to GET C (Cascades only)
- R -			corresponds to GET B (Cascades only)
- R R			illegal - no left hand side -- there are 2 ~n of these
- - L			corresponds to GET C but already counted as - - R
- L L			illegal - no right hand side -- there are 2 ~n of these
L L L			also illegal - no right hand side -- part of the 2 ~n 
L R R			A X BC
R L L			BC X A
L R L			AC X B
R L R			B X AC
L L R			AB X C
R R L			C X AB
- L R			B X C
- R L			C X B
L R -			A X B
...

 ^
 |
3 ~n entries in table

there are 3 ~n total entries (and about 2 * 2 ~n of these are illegal --)



           Note that each group will correspond to a number m of relations that have been joined, where m is less than n, the total number of relations joined in the query. The number of logical expressions in a group is equal to
2m - 2

This is because the binary representation of every number from 1 to 2m represents a unique (binary) partition (where order matters) of the set of m relations, and hence an expression in a group which represents the join of m relations. For the query AXBXC (i.e. n=3), consider the group AXB (i.e. m = 2):

table			expression

A B 			A B are tables to be joined in this group

0 1			A X B	think of 0 as L, and 1 as R
1 0			B X A

or the group AXBXC (i.e. m = 3, the final group):

A B C			A B and C are tables to be joined in this group

0 0 1			AB X C
0 1 0			AC X B
0 1 1			A X BC
1 0 0			BC X A
1 0 1			B X AC
1 1 0			C X AB

This corresponds to the 6 logical expressions in group 4 of Figure 3.3.
So another way to compute the total number of logical expressions (in this case not including the n GET logical expressions) is:
sum from m=1 to n of
(n choose m) * 2m - 2

From private conversation between Len Shapiro and Bennet Vance -- [Vance 96].