The description of a logical model is based on the logical operators in the model. The logical part of Model D consists of the operators from relational algebra [2], plus a few to handle the SQL extensions (aggregation, ordering, and duplicates).

Logical Operators

In Cascades, each logical operator takes [3] a fixed number of inputs, called the operator's arity. In Cascades relational models, logical operators generally take tables as inputs and produce a single table as output.
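The relationship between an operator, its arity, and its inputs can be sketched as follows. This is an illustration only, not the Cascades implementation: the class names `LogicalOp` and `Expr`, the `params` field, and the arities given to GET and EQJOIN are assumptions made for the example. Following footnote 3, the inputs are held by the expression rather than by the operator, and the operator's arity fixes how many inputs the expression must have.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LogicalOp:
    """Hypothetical operator type: a name, a fixed arity, and parameters."""
    name: str
    arity: int
    params: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Expr:
    """Hypothetical expression type: an operator plus its input expressions."""
    op: LogicalOp
    inputs: Tuple["Expr", ...] = ()

    def __post_init__(self) -> None:
        # The operator's arity determines how many inputs are expected.
        if len(self.inputs) != self.op.arity:
            raise ValueError(
                f"{self.op.name} expects {self.op.arity} inputs, "
                f"got {len(self.inputs)}"
            )


# Assumed arities: GET takes no table inputs, EQJOIN takes two.
get_a = Expr(LogicalOp("GET", 0, ("A",)))
get_b = Expr(LogicalOp("GET", 0, ("B",)))
join_ab = Expr(LogicalOp("EQJOIN", 2, ("A.x", "B.x")), (get_a, get_b))

# Inputs can be "transferred" to a new expression without touching the
# operator itself, here by swapping the two join inputs.
join_ba = Expr(join_ab.op, (join_ab.inputs[1], join_ab.inputs[0]))
```

Because the operator carries no inputs, the same `LogicalOp` value is reused unchanged in `join_ba`; only the expression wrapper differs.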

In Cascades relational models, we distinguish the logical concept of a table (roughly, the internal representation of relations) from a stored relation (roughly, a disk file). Stored relations are persistent and are listed in the database catalog. In Cascades, stored relations are accessible only through certain operators, such as the logical operator GET. Tables are temporary collections [4] of tuples produced during the evaluation of a query. Tables are the output of GET and of all other logical operators, and they are usually also the inputs to logical operators [5].
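The distinction between stored relations and tables can be illustrated with a small sketch. Everything here is assumed for the example: the `CATALOG` dictionary, the `EMP` relation and its rows, and the `get` and `project` functions are hypothetical stand-ins, not Model D code. A stored relation is modelled as a duplicate-free set registered in a catalog, while a table is a temporary list of tuples that may contain duplicates (see footnote 4).

```python
from typing import Dict, FrozenSet, List, Tuple

Row = Tuple[str, ...]
Table = List[Row]  # temporary collection of tuples; duplicates allowed

# Hypothetical catalog: stored relations are persistent and duplicate-free.
CATALOG: Dict[str, FrozenSet[Row]] = {
    "EMP": frozenset({("ann", "sales"), ("bob", "sales"), ("cy", "ops")}),
}


def get(name: str) -> Table:
    """GET-like access: the only route from a stored relation to a table."""
    return sorted(CATALOG[name])


def project(table: Table, columns: Tuple[int, ...]) -> Table:
    """Keep only the given column positions, without removing duplicates."""
    return [tuple(row[i] for i in columns) for row in table]


# The stored relation has no duplicates, but the projected table does.
departments = project(get("EMP"), (1,))
```

The stored relation is named only as a parameter to `get`; every later step consumes and produces tables.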

2. Model D does not implement all of the relational algebra operators: it has no union, difference, intersection, or division. These were not needed to represent the TPC-D queries.
3. We say that an operator takes an input. Strictly speaking, the Cascades operator datatype does not have inputs; rather, expressions (defined later), which contain operators, have inputs. This distinction between the operator and expression datatypes, fairly distinctive to Cascades, provides a cleaner way to transform expressions, because the inputs of one expression may be "transferred" to another expression. Each operator has an arity, which determines the number of inputs expected in an expression containing that operator.
4. In Model D, stored relations must be true relations (no duplicates allowed), but tables are allowed to have duplicates in order to support the SQL extensions to the relational model.
5. The same is true of physical operators: the names of stored relations are parameters to the physical operators FILESCAN and INDEXED_FILTER. The output of each of these (and of all logical and physical operators) is a table. The specifics of the internal representation of tables (i.e. whether they are stored on disk, in memory or are just streams of tuples, etc.) are in the domain of the physical model and are discussed in section 6.

