The Hugs 98 User Manual
7 An overview of Hugs extensions
The Hugs interpreter can be run in two different modes.
- Haskell 98 mode: This should be used for the highest level of compatibility
with the Haskell 98 standard; known deviations
from the standard are documented in Section 9. In this mode,
any attempt to use Hugs specific extensions should trigger an error
message. Although there are some fairly substantial differences
between Haskell 1.4 and Haskell 98, our experience is that most
programs written for Haskell 1.4 or earlier will need only minor
modifications before they can be loaded and used from Hugs in
Haskell 98 mode. Note, however, that some of the demo programs
included in the standard Hugs distribution will not work
in Haskell 98 mode.
- Hugs mode: This enables a number of advanced Hugs features such as
type system extensions, restricted type synonyms, etc. Most of
these features
are described in more detail in the following sections. The
underlying core language remains as in Haskell 98 mode: For example,
the member function of the Functor class is still
called fmap, there is no Eval class, fixity declarations
can appear anywhere that a type signature is permitted, comprehension
syntax is still restricted to lists, and so on.
The choice between the two modes is made when the interpreter is started,
and it is (by design) not possible to change mode without exiting and
restarting Hugs.
The default mode is usually Haskell 98; this can also be set explicitly
by starting Hugs with the command line option +98. To select
the Hugs mode, you should start the interpreter with the command line
option -98. The mode in which the interpreter is running is
displayed as part of the startup banner, and is also included in the
information produced by using the :set command without any arguments.
The intention here is that beginners will get Haskell 98 mode by
default, while more experienced users will be able to set up aliases,
batch or script files, or file associations, etc. to provide simple
ways of invoking the interpreter in either mode. On Win 32 machines,
for example, you can set up file associations so that right clicking on
a .hs or .lhs file offers a choice of loading the file into
either a Haskell 98 or Hugs mode session.
The remainder of this section sketches some of the extensions that are
currently supported when the interpreter is running in Hugs mode.
7.1 Type class extensions
In Hugs mode, several of the Haskell 98 restrictions on type classes
are relaxed. This allows the use of multiple parameter classes, and
more flexible forms of instance declarations.
7.1.1 Multiple parameter classes
Haskell 98 allows only one type argument to be specified for any given
type class. As a result, each type class corresponds to a set of types.
For example, a class constraint Eq t tells us that the
type t is assumed or required to be an instance of the
class Eq, and the class Eq itself corresponds to the set
of all equality types. In Hugs mode, this restriction is relaxed so
that programmers can also define classes with multiple parameters, each
of which corresponds to a multi-place relation on types.
Multiple parameter type classes seem to have many potentially interesting
applications [multi]. However, some practical attempts to use them
have failed as a result of frustrating ambiguity problems. This occurs
because the mechanisms that are used to resolve overloading are not
aggressive enough. Or, to put it another way, the type relations that
are defined by a collection of class and instance declarations are often
too general for practical applications, where programmers might expect
stronger dependencies between parameters. In the rest of this section
we will describe these problems in more detail. We will also describe
the mechanisms introduced in the September 1999 release of Hugs that
allow programmers to declare explicit dependencies between parameters,
avoiding these difficulties in many cases, and making multiple parameter
classes more useful for some important practical applications.
7.1.1.1 Ambiguity problems
During the past ten years, many Haskell users have looked into the
possibility of building a library for collection types, using a
multiple parameter type class that looks something like the
following:
class Collects e ce where
empty :: ce
insert :: e -> ce -> ce
member :: e -> ce -> Bool
The type variable e used here represents the element type,
while ce is the type of the container itself. Within this
framework, we might want to define instances of this class for
lists or characteristic functions (both of which can be used to
represent collections of any equality type), bit sets (which can
be used to represent collections of characters), or hash tables
(which can be used to represent any collection whose elements
have a hash function). Omitting standard implementation details,
this would lead to the following declarations:
instance Eq e => Collects e [e] where ...
instance Eq e => Collects e (e -> Bool) where ...
instance Collects Char BitSet where ...
instance (Hashable e, Collects e ce)
=> Collects e (Array Int ce) where ...
All this looks quite promising; we have a class and a range
of interesting implementations. Unfortunately, there are
some serious problems with the class declaration. First,
the empty function has an ambiguous type:
empty :: Collects e ce => ce
By `ambiguous' we mean that there is a type variable e that
appears on the left of the => symbol, but not on the right.
The problem with this is that, according to the theoretical foundations
of Haskell overloading, we cannot guarantee a well-defined semantics
for any term with an ambiguous type. For this reason, Hugs rejects
any attempt to define or use such terms:
ERROR: Ambiguous type signature in class declaration
*** ambiguous type : Collects a b => b
*** assigned to : empty
We can sidestep this specific problem by removing the empty member
from the class declaration. However, although the remaining
members, insert and member, do not have ambiguous types, we
still run into problems when we try to use them. For example, consider
the following two functions:
f x y = insert x . insert y
g = f True 'a'
for which Hugs infers the following types:
f :: (Collects a c, Collects b c) => a -> b -> c -> c
g :: (Collects Bool c, Collects Char c) => c -> c
Notice that the type for f allows the two
parameters x and y to be assigned different
types, even though it attempts to insert each of the two
values, one after the other, into the same collection.
If we're trying to model collections that contain only
one type of value, then this is clearly an inaccurate
type. Worse still, the definition for g is
accepted, without causing a type error. As a result,
the error in this code will not be flagged at the point
where it appears. Instead, it will show up only when
we try to use g, which might even be in a different
module.
7.1.1.2 An attempt to use constructor classes
Faced with the problems described above, some Haskell programmers
might be tempted to use something like the following version of
the class declaration:
class Collects e c where
empty :: c e
insert :: e -> c e -> c e
member :: e -> c e -> Bool
The key difference here is that we abstract over the type
constructor c that is used to form the collection
type c e, and not over that collection type itself,
represented by ce in the original class declaration.
This avoids the immediate problems that we mentioned above:
- empty has type Collects e c => c e,
which is not ambiguous.
- The function f from the previous section has a
more accurate type:
f :: (Collects e c) => e -> e -> c e -> c e
- The function g from the previous section is now
rejected with a type error as we would hope because the type
of f does not allow the two arguments to have different
types.
This, then, is an example of a multiple parameter class that does
actually work quite well in practice, without ambiguity problems.
There is, however, a catch. This version of the Collects class
is nowhere near as general as the original class seemed to be: only
one of the four instances in Section 7.1.1 can be
used with this version of Collects because only one of
them---the instance for lists---has a collection type
that can be written in the form c e, for some type
constructor c, and element type e.
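As a rough illustration of the one instance that does fit, the following sketch fills in the list instance for the constructor class version. The method bodies and the main function are assumptions made for this example, and the LANGUAGE pragma is only there for readers trying the code in GHC; Hugs should treat it as a comment when run in Hugs mode.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
module Main where

-- Constructor class version: c is a type constructor of kind * -> *.
class Collects e c where
  empty  :: c e
  insert :: e -> c e -> c e
  member :: e -> c e -> Bool

-- The list instance fits because [e] can be written as [] e.
instance Eq e => Collects e [] where
  empty  = []
  insert = (:)
  member = elem

-- Both arguments are forced to share the element type e.
f :: Collects e c => e -> e -> c e -> c e
f x y = insert x . insert y

main :: IO ()
main = do
  let xs = f 'a' 'b' empty :: [Char]
  print xs               -- "ab"
  print (member 'b' xs)  -- True
```

The other three collection types (characteristic functions, BitSet, and the hash table) have no corresponding instance here because none of them has the shape c e.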
7.1.1.3 Adding dependencies
To get a more useful version of the Collects class, Hugs provides a
mechanism that allows programmers to specify dependencies between the
parameters of a multiple parameter class. (For readers with an
interest in theoretical foundations and previous work: the use of
dependency information can be seen either as a
generalization of the proposal for `parametric type classes' that was
put forward by Chen, Hudak, and Odersky [paramTC], or as a special
case of the later framework for improvement [improvement] of
qualified types. The underlying ideas are also discussed in a more
theoretical and abstract setting in a manuscript [implparam], where
they are identified as one point in a general design space for systems
of implicit parameterization.)
To start with an abstract example, consider a declaration such as:
class C a b where ...
which tells us simply that C can be thought of as a binary
relation on types (or type constructors, depending on the kinds
of a and b). Extra clauses can be included in the
definition of classes to add information about dependencies
between parameters, as in the following examples:
class D a b | a -> b where ...
class E a b | a -> b, b -> a where ...
The notation a -> b used here between
the | and where symbols---not to be confused with a
function type---indicates that the a parameter
uniquely determines the b parameter, and might be read
as "a determines b." Thus D is not just
a relation, but actually a (partial) function. Similarly, from the two
dependencies that are included in the definition of E, we can
see that E represents a (partial) one-one mapping between types.
More generally, dependencies take the form x1 ... xn -> y1 ... ym,
where x1, ..., xn, and y1, ..., ym are
type variables with n>0 and m>=0, meaning that the
y parameters are uniquely determined by the x parameters.
Spaces can be used as separators if more than one variable appears
on any single side of a dependency, as in t -> a b. Note that a
class may be annotated with multiple dependencies using commas as
separators, as in the definition of E above.
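A dependency with more than one variable on its left hand side might look like the following sketch. The class Mul, its member mul, and both instances are hypothetical names chosen for this example; the LANGUAGE pragma is an assumption for readers using GHC and should be ignored by Hugs in Hugs mode.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}
module Main where

-- Hypothetical class: the two argument types together determine
-- the result type, written as the dependency a b -> c.
class Mul a b c | a b -> c where
  mul :: a -> b -> c

instance Mul Int Int Int where
  mul = (*)

instance Mul Int Double Double where
  mul n x = fromIntegral n * x

main :: IO ()
main = do
  -- No annotation is needed on the results: the dependency fixes c
  -- once the argument types a and b are known.
  print (mul (2 :: Int) (3 :: Int))       -- 6
  print (mul (2 :: Int) (1.5 :: Double))  -- 3.0
```

Without the dependency, the result type of each call to mul would be left undetermined, and print would have no way to choose a Show instance.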
Some dependencies that we can write in this notation are redundant,
and will be rejected by Hugs because they don't serve any useful purpose,
and may instead indicate an error in the program. Examples of dependencies
like this include a -> a, a -> a a, a ->, etc.
There can also be some redundancy if multiple dependencies are given,
as in a->b, b->c, a->c, where some subset implies the
remaining dependencies. Examples like this are not treated as errors.
Note that dependencies appear only in class declarations, and not in
any other part of the language. In particular, the syntax for instance
declarations, class constraints, and types is completely unchanged.
By including dependencies in a class declaration, we provide a
mechanism for the programmer to specify each multiple parameter class
more precisely. The compiler, on the other hand, is responsible for
ensuring that the set of instances that are in scope at any given
point in the program is consistent with any declared dependencies.
For example, the following pair of instance declarations cannot
appear together in the same scope because they violate the dependency
for D, even though either one on its own would be acceptable:
instance D Bool Int where ...
instance D Bool Char where ...
Note also that the following declaration is not allowed, even by itself:
instance D [a] b where ...
The problem here is that this instance would allow one particular
choice of [a] to be associated with more than one choice
for b, which contradicts the dependency specified in the
definition of D. More generally, this means that, in
any instance of the form:
instance D t s where ...
for some particular types t and s, the only variables
that can appear in s are the ones that appear in t,
and hence, if the type t is known, then s will be
uniquely determined.
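The rule can be seen in a small sketch. The member conv and both instance bodies are assumptions invented for this example, and the pragma is intended only for GHC users. The two rejected declarations from the text are left as comments.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}
module Main where

class D a b | a -> b where
  conv :: a -> b   -- hypothetical member, not part of the manual's D

instance D Bool Int where
  conv b = if b then 1 else 0

-- Accepted: the only variable in the second argument (Maybe a)
-- also appears in the first argument ([a]).
instance D [a] (Maybe a) where
  conv []    = Nothing
  conv (x:_) = Just x

-- instance D Bool Char where ...  -- rejected: conflicts with D Bool Int
-- instance D [a] b where ...      -- rejected: b is not determined by [a]

main :: IO ()
main = do
  print (conv True)    -- 1
  print (conv "hugs")  -- Just 'h'
```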
The benefit of including dependency information is that it allows us to
define more general multiple parameter classes, without ambiguity
problems, and with the benefit of more accurate types. To illustrate
this, we return to the collection class example, and annotate the
original definition from Section 7.1.1 with a
simple dependency:
class Collects e ce | ce -> e where
empty :: ce
insert :: e -> ce -> ce
member :: e -> ce -> Bool
The dependency ce -> e here specifies that the
type e of elements is uniquely determined by the
type of the collection ce. Note that both
parameters of Collects are of kind *; there
are no constructor classes here. Note too that all of
the instances of Collects that we gave in
Section 7.1.1 can be used together
with this new definition.
What about the ambiguity problems that we encountered
with the original definition? The empty function
still has type Collects e ce => ce, but it is no
longer necessary to regard that as an ambiguous type:
Although the variable e does not appear on the
right of the => symbol, the dependency for
class Collects tells us that it is uniquely
determined by ce, which does appear on the right
of the => symbol. Hence the context in
which empty is used can still give enough information
to determine types for both ce and e, without
ambiguity. More generally, we need only regard a type as
ambiguous if it contains a variable on the left of the => that
is not uniquely determined (either directly or indirectly) by
the variables on the right.
Dependencies also help to produce more accurate types for user defined
functions, and hence to provide earlier detection of errors, and less
cluttered types for programmers to work with. Recall the previous
definition for a function f:
f x y = insert x . insert y
for which we originally obtained a type:
f :: (Collects a c, Collects b c) => a -> b -> c -> c
Given the dependency information that we have for Collects,
however, we can deduce that a and b must be equal
because they both appear as the first parameter in
a Collects constraint with the same second parameter c.
Hence we can infer a shorter and more accurate type for f:
f :: (Collects a c) => a -> a -> c -> c
In a similar way, the earlier definition of g will now be
flagged as a type error.
Although we have given only a few examples here, it should be clear
that the addition of dependency information can help to make multiple
parameter classes more useful in practice, avoiding ambiguity problems,
and allowing more general sets of instance declarations.
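To make this concrete, here is a minimal sketch of the annotated class with only the list instance filled in. The method bodies and main are assumptions for the example, and the pragma is meant for GHC users; under Hugs the file would be loaded in Hugs mode.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}
module Main where

class Collects e ce | ce -> e where
  empty  :: ce
  insert :: e -> ce -> ce
  member :: e -> ce -> Bool

-- The collection type [e] determines the element type e.
instance Eq e => Collects e [e] where
  empty  = []
  insert = (:)
  member = elem

-- The shorter type from the text; both elements share the type a.
f :: Collects a c => a -> a -> c -> c
f x y = insert x . insert y

-- g = f True 'a'   -- now a type error: Bool and Char cannot both be
--                  -- the element type of the same collection

main :: IO ()
main = do
  -- empty is usable: fixing the collection type also fixes e.
  let xs = f 'a' 'b' empty :: [Char]
  print (member 'a' xs)  -- True
  print (member 'z' xs)  -- False
```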
7.1.2 More flexible instance declarations
Hugs mode does not place any syntactic restrictions on the form of
type expression or class constraints that can be used in an instance
declaration. (Apart from the normal restrictions to ensure that such
type expressions are well-formed, of course.)
For example, the following definitions are all acceptable:
instance (Eq [Tree a], Eq a) => Eq (Tree a) where ...
instance Eq a => Eq (Bool -> a) where ...
instance Num a => Num (String,[a]) where ...
Compare this with the restrictions of Haskell 98, which allow only
variables (resp. `simple' types) as the arguments of classes on the
left (resp. right) hand side of the => sign. The price for
this extra flexibility is that it is possible to code up arbitrarily
complex instance entailments, which means that checking entailments,
and hence calculating principal types, is, in the general case,
undecidable. The setting for the -c option, described
in Section 4.2, will cause the type checker to fail if the complexity
of checking entailments rises above a certain level. Usually, this
results from examples that would otherwise cause the type checker
to go into an infinite loop.
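As an illustration, the second of the instances above could be completed along the following lines. The body of (==) is an assumption made for this sketch (two functions from Bool are treated as equal when they agree on both arguments), and the pragma is included only for readers using GHC.

```haskell
{-# LANGUAGE FlexibleInstances #-}
module Main where

-- Two functions from Bool are compared by checking both possible
-- arguments. The instance head Bool -> a is not a Haskell 98
-- `simple' type because Bool is not a type variable.
instance Eq a => Eq (Bool -> a) where
  f == g = f False == g False && f True == g True

main :: IO ()
main = do
  print (not == (\b -> if b then False else True))  -- True
  print (not == id)                                 -- False
```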
It is possible that some syntactic restrictions on instance declarations
might be introduced at some point in the future in a way that will offer
much of the flexibility of the current approach, but in a way that
guarantees decidability.
7.1.3 Overlapping instances
The command line option +o can be used to enable support for
overlapping instance declarations, provided that one of each overlapping
pair is strictly more specific than the other. This facility has been
introduced in a way that does not compromise the coherence of the type
system. However, its semantics differs slightly from the semantics of
overlapping instances in Gofer, so users may sometimes be surprised with
the results. This is why we have decided to allow this feature to be
turned on or off by a command line option (the default is off).
If practical experience with overlapping instances is positive then
we may change the current default, or even remove the option.
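A pair of instances of the permitted kind, where one is strictly more specific than the other, might look like the sketch below. The class Describe and its instances are hypothetical. Under Hugs the file would be loaded with both -98 and +o; the pragmas are assumptions added so that the same file can be tried in GHC, which marks overlap per instance rather than with a command line option, and whose resolution rules may differ in detail from those of Hugs.

```haskell
{-# LANGUAGE FlexibleInstances #-}
module Main where

class Describe a where
  describe :: a -> String

-- General instance: applies to lists of any element type.
instance Describe [a] where
  describe _ = "some list"

-- Strictly more specific instance: applies only to lists of Char.
instance {-# OVERLAPPING #-} Describe [Char] where
  describe _ = "a string"

main :: IO ()
main = do
  putStrLn (describe [True, False])  -- some list
  putStrLn (describe "hello")        -- a string
```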
If the command line option +m is selected, then a lazier
form of overlapping instances is supported, which we refer to
as `multi instance resolution.' The main idea is to omit the
normal tests for overlapping instances, but to generate an
error message if the type checker can find more than one way
to resolve overloading for a particular instance of the class.
For example, with the +m option selected, the two
instance declarations in the following program are accepted,
even though they have overlapping (in fact, identical) constraints
on the right of the => symbol:
class Numeric a where describe :: a -> String
instance Integral a => Numeric a where describe n = "Integral"
instance Floating a => Numeric a where describe n = "Floating"
As it turns out, these instances do not cause any problems in
practice because they can be distinguished by the contexts on
the left of the => symbol; no standard type is an instance
of both the Integral and the Floating classes:
Main> describe (23::Int)
"Integral"
Main> describe (23::Float)
"Floating"
Main>
Note that this experimental feature may not be supported
in future releases.
7.1.4 More flexible contexts
Haskell 98 allows only class constraints of the form C (a t1 ... tn)
to appear in the context of any declared or inferred type, where C is
a class, a is a variable, and t1, ..., tn are
arbitrary types (n>=0). Class constraints of this form are sometimes
characterized as being in head normal form. In many practical cases, we
have n=0, corresponding to class constraints of the form C a.
In Hugs mode, these restrictions are relaxed, and any type,
whether in head normal form or not, is permitted to appear
in a context. For example, the principal type of an
expression (\x -> x==[]) is Eq [a] => [a] -> Bool,
reflecting the fact that the equality function is used to compare
lists of type [a]. In previous versions of Hugs, and in Haskell 98,
an inferred type of Eq a => [a] -> Bool would have been produced
for this term. The latter type can still be used if an explicit type
signature is provided for the term, assuming that an instance declaration
of the form:
instance Eq a => Eq [a] where ...
is in scope. For example, the following program is valid:
f :: Eq a => [a] -> Bool
f x = x==[]
Note that contexts are not reduced by default because this gives more
general types (and potentially more efficient handling of overloading).
7.2 Extensible records: Trex
Hugs supports a flexible system of extensible records, sometimes
referred to as "Trex". The theoretical foundations for this,
and a comparison with related work, are provided in a report by
Gaster and Jones [GasterJones]. This section provides some
background details for anybody wishing to experiment with the
implementation of extensible records that is supported in the
current distribution of Hugs. Please note that support for this
extension in any particular build of the Hugs system is determined
by a compile-time setting. If the version of Hugs that you are
using was built without including support for extensible records,
then you will not be able to use the features described here.
The current implementation does not use our preferred syntax for record
operations; too many of the symbols that we would like to have used are
already used in conflicting ways elsewhere in the syntax of Haskell 98.
7.2.1 Basic concepts
In essence, records are just collections of values, each of which is
associated with a particular label. For example:
(a = True, b = "Hello", c = 12::Int)
is a record with three components: an a field, containing a boolean
value, a b field containing a string, and a c field containing
the number 12. The order in which the fields are listed is not
significant, so the same record value could also be written as:
(c = 12::Int, a = True, b = "Hello")
These examples show simple ways to construct record values. We can
also inspect the values held in a record using selector functions.
These are written with a # character, followed immediately
by the name of a field.
For example:
Prelude> #a (a = True, b = "Hello", c = 12::Int)
True
Prelude> #b (a = True, b = "Hello", c = 12::Int)
"Hello"
Prelude> #c (a = True, b = "Hello", c = 12::Int)
12
Prelude>
Note, however, that there is a conflict here
with the syntax of Haskell 98 that you should be aware of if you are
running in Hugs mode with an infix operator # and with support
for records enabled. Under these circumstances, an expression of
the form f#g will parse as f (#g) --- the application
of a function f to a selector function #g --- and not
as f # g --- the application of an infix # operator to two
arguments f and g. To obtain the second of these
interpretations, there must be at least
one space between the # and g tokens.
Record values can also be inspected by using pattern matching, with a
syntax that mirrors the notation used for constructing a record. For
example:
Prelude> (\(a=x, c=y, b=_) -> (y,x)) (a = True, b = "Hello", c = 12::Int)
(12,True)
Prelude>
The order of fields in a record pattern is significant because it
determines the order---from left to right---in which they are
matched. In the following example, an attempt to match the
pattern (a=[x], b=True) against the record (b=undefined, a=[])
fails because [x] does not match the empty list, but a match
against (a=[2],b=True) succeeds, binding x to 2:
Prelude> [ x | (a=[x], b=True) <- [(b=undefined, a=[]), (a=[2],b=True)]]
[2]
Prelude>
Changing the order of the fields in the pattern
to (b=True, a=[x]) forces matching to start with
the b component. But the first element in the list
of records used above has undefined in its b component,
so now the evaluation produces a run-time error message:
Prelude> [ x | (b=True, a=[x]) <- [(b=undefined, a=[]), (a=[2],b=True)]]
Program error: {undefined}
Prelude>
Although Hugs lets you work with record values, it does not, by
default, allow you to print them. More accurately, it does not
automatically provide instances of the Show class for
record values. So a simple attempt to print a record value
will result in an error like the following:
Prelude> (a = True, b = "Hello", c = 12::Int)
ERROR: Cannot find "show" function for:
*** expression : (a=True, b="Hello", c=12)
*** of type : Rec (a::Bool, b::[Char], c::Int)
Prelude>
The problem here occurs because Hugs attempts to display the
record by applying the show function to it, and no
version of show has been defined. If you do want to
be able to display record values, then you should load or
import the Trex module---which is usually included
in the lib/hugs directory of the Hugs distribution:
Prelude> :load Trex
Trex> (a = True, b = "Hello", c = 12::Int)
(a=True, b="Hello", c=12)
Trex> (c = 12::Int, a = True, b = "Hello")
(a=True, b="Hello", c=12)
Trex>
Note that the fields are always displayed with their labels in
alphabetical order. The fact that the fields appear in a specific
(but, frankly, arbitrary) order is very important---show is
a normal function, so its output must be uniquely determined by its
input, and not by the way in which that input value is written.
The records used in the example above have exactly the same value,
so we expect exactly the same output for each.
In a similar way, it is sometimes useful to test whether two records
are equal by using the == operator. Any program that requires
this feature can obtain the necessary instances of the Eq class
by importing the Trex library, as shown above.
Of course, like all other values in Haskell, records have types,
and these are written using expressions of the
form Rec r where Rec is a built-in type constructor
and r represents a `row' that associates labels with
types. For example:
Trex> :t (c = 12::Int, a = True, b = "Hello")
(a=True, b="Hello", c=12) :: Rec (a::Bool, b::[Char], c::Int)
Trex>
The type here tells us, unsurprisingly, that the
record (a=True,b="Hello",c=12) has three components:
an a field containing a Bool,
a b field containing a String, and
a c field of type Int.
As with record values themselves, the order of the components in a
row is not significant:
Trex> (a=True, b="Hello", c=12) :: Rec (b::String, c::Int, a::Bool)
(a=True, b="Hello", c=12)
Trex>
However, the type of a record must be an accurate reflection of the
fields that appear in the corresponding value. The following example
produces an error because the specified type does not list all of the
fields in the record value:
Trex> (a=True, b="Hello", c=12) :: Rec (b::String, c::Int)
ERROR: Type error in type signature expression
*** term : (a=True, b="Hello", c=12)
*** type : Rec (a::Bool, b::[Char], c::a)
*** does not match : Rec (b::String, c::Int)
*** because : field mismatch
Trex>
Notice that Trex does not allow the kind of subtyping on record values
that would allow a record like (a=True, b="Hello", c=12) to be
treated implicitly as having type Rec (b::String, c::Int), simply
by `forgetting' about the a field. Finding an elegant, efficient,
and tractable way to support this kind of implicit coercion in a way that
integrates properly with other aspects of the Hugs type system remains an
interesting problem for future research.
7.2.2 Extensibility
An important property of the Trex system is that the same label name
can appear in many different record types, and potentially with a
different value type in each case. However, all of the features that we
have seen so far deal with records of some fixed `shape', where the set of
labels and the type of values associated with each one are fixed, and there
is no apparent relationship between records of different type.
In fact, all record values and record types in Trex are built-up
incrementally, starting from an empty record and extending it with
additional fields, one at a time. It is for this reason that Trex
values are often referred to as extensible records.
In the simplest case, any given record r can be extended with a
new field labelled l, provided that r does not already
include an l field. For example, we can
construct (a=True, b="Hello") by
extending (a = True) with a field b="Hello":
Trex> (b = "Hello" | (a = True))
(a=True, b="Hello")
Trex>
Alternatively, we can construct the same result by
extending (b = "Hello") with a
field a=True:
Trex> (a = True | (b = "Hello"))
(a=True, b="Hello")
Trex>
The syntax of the current implementation allows us to add several new
fields at a time (the corresponding syntax for pattern matching is
also supported):
Trex> (a=True, b="Hello", c=12::Int | (b1="World"))
(a=True, b="Hello", b1="World", c=12)
Trex>
On the other hand, a record cannot be extended with a field of the same
name, even if it has a different type. The following examples illustrate
this:
Trex> (a=True | (a=False))
ERROR: Repeated label "a" in record (a=True, a=False)
Trex> (a=True | r) where r = (a=12::Int)
ERROR: (a::Int) already includes a "a" field
Trex>
Notice that Hugs produced two different kinds of error message here.
In the first case, the presence of a repeated label was detected
syntactically. In the second example, the problem was
detected using information about the type of the record r.
Much the same syntax can be used in patterns to decompose record values:
Trex> (\(b=bval | r) -> (bval,r)) (a=True, b="Hello")
("Hello",(a=True))
Trex>
In the previous examples, we saw how a record could be extended with
new fields. As this example shows, we can use pattern matching to do
the reverse operation, removing fields from a record.
We can also use pattern matching to understand how selector functions
like #a, #b, and so on are implemented. For example,
the selector #x is equivalent to the
function (\ (x=value | _) -> value).
A selector function like this is polymorphic in the sense that it can
be used with any record containing an x field, regardless of
the type associated with that particular component, or of any other fields
that the record might contain:
Trex> (\(x=value | _) -> value) (x=True, b="Hello")
True
Trex> (\(x=value | _) -> value) (name="Hugs", age=2, x="None")
"None"
Trex>
To understand how this works, it is useful to look at the type that
Hugs assigns to this particular selector function:
Trex> :type (\(x=value | _) -> value)
\(x=value | _) -> value :: r\x => Rec (x::a | r) -> a
Trex>
There are two important pieces of notation here that deserve further
explanation:
- Rec (x::a | r) is the type of a record with
an x component of type a. The row
variable r represents the rest of the row;
that is, it represents any other fields in the record
apart from x. This syntax---for record type
extension---was chosen to mirror the syntax that we have
already seen in the examples above for record
value extension.
- The constraint r\x tells us that the type on the right
of the => symbol is only valid if "r lacks x,"
that is, if r is a row that does not contain an x field.
If you are already familiar with Haskell type classes, then you may
like to think of \x as a kind of class constraint,
written with postfix syntax, whose instances are precisely the
rows without an x field.
For example, if we apply our selector function to a
record (x=True,b="Hello") of type Rec (b::String, x::Bool),
then we instantiate the variables a and r in the type above
to Bool and (b::String), respectively.
In fact, the built-in selector functions have exactly the same type as
the user-defined selector shown above:
Prelude> :type #x
#x :: b\x => Rec (x::a | b) -> a
Prelude>
The row constraints that we see here can also occur in the type of any
function that operates on record values if the types of those records
are not fully determined at compile-time. For example, given the following
definition:
average r = (#x r + #y r) / 2
Hugs infers a principal type of the form:
average :: (Fractional a, b\y, b\x) => Rec (y::a, x::a | b) -> a
However, any of the following, more specific types could be specified in
a type declaration for the average function:
average :: (Fractional a) => Rec (x::a, y::a) -> a
average :: (r\x, r\y) => Rec (x::Double, y::Double | r) -> Double
average :: Rec (x::Double, y::Double) -> Double
average :: Rec (x::Double, y::Double, z::Bool) -> Double
Each of these types is an instance of the principal type given above.
These examples show an important difference between the system of
records described here, and the record facilities provided by SML.
In particular, SML prohibits definitions that involve records for
which the complete set of fields cannot be determined at compile-time.
So, the SML equivalent of the average function described above would
be rejected because there is no way to determine if the record r will
have any fields other than x or y. SML programmers usually
avoid such problems by giving a type annotation that completely specifies the
structure of the record. But, of course, if a definition is limited
in this way, then it is also less flexible.
With the current implementation of our type system, there is an advantage
to knowing the full type of a record at compile-time because it allows
the compiler to generate more efficient code. However, unlike SML, the
type system also offers the extra flexibility of polymorphism and
extensibility over records if that is needed.
7.3 Other type system extensions
In this section, we describe several other type system extensions that
are currently available in Hugs mode.
7.3.1 Enhanced polymorphic recursion
As required by the Haskell 98 report, Hugs supports full polymorphic
recursion, even for functions with overloaded types. This means
that Hugs will accept definitions like the following:
p :: Eq a => a -> Bool
p x = x==x && p [x]
(Note that the type signature here is not optional.)
In fact, Hugs goes further than is implied by the Haskell 98 report
by using programmer supplied type signatures to reduce type checking
dependencies within individual binding groups. For example, the
following definitions are acceptable, even though there is no
explicit type signature for the function q:
p :: Eq a => a -> Bool
p x = x==x && q [x]
q x = x==x && p [x]
This is made possible by the observation that we can calculate
a type for q, without needing to calculate the type
of p at the same time because the type of p is
already specified.
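As a small, terminating illustration of the same idea, the sketch below uses two hypothetical functions, nest and wrap, that are not part of Hugs or its libraries. The recursion reaches nest at a different type on every step, so its signature is required, while wrap has no signature and relies on the declared type of nest. It is written on the assumption that the type checker uses declared signatures in the way described above.

```haskell
-- nest and wrap are hypothetical names used only for illustration.
-- Each recursive step changes the type of the value being passed along,
-- so the signature for nest cannot be omitted.
nest :: Show a => Int -> a -> String
nest 0 x = show x
nest n x = wrap (n - 1) [x]

-- No signature is given for wrap; its type is inferred using the
-- declared type of nest.
wrap n x = nest n (x, x)
```

For instance, nest 1 'a' first wraps the character in a list, then pairs that list with itself, and so produces the string ("a","a").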
7.3.2 Rank 2 polymorphism
Hugs provides a facility that allows the definition of functions
that take polymorphic arguments. This includes functions defined
at the top-level, in local definitions, in class members, and in
primitive declarations. In addition, Hugs allows the definition of
datatypes with polymorphic and qualified types. The following examples
illustrate the syntax that is used:
amazed :: (forall a. a -> a) -> (Bool,Char)
amazed i = (i True, i 'a')
twice :: (forall b. b -> f b) -> a -> f (f a)
twice f = f . f
There are a number of important points to note here.
- In Hugs mode, forall is a reserved word.
- Quantified variables may be of any kind,
including * (types) or * -> * (unary type
constructors), as in the examples above.
- Variables quantified in a forall type must appear
in the scope of the quantifier. Unused quantified variables would
serve no useful purpose, and are perhaps most likely to occur as
the result of misspelling a variable name.
- Nested quantifiers are not allowed, and quantifiers can only appear
in the types of function arguments, not in the results.
- A function can only take polymorphic arguments if an explicit type
signature is provided for that function. Any call to such a function
must have at least as many arguments as are needed to include the
rightmost argument with a quantified type. For example, neither of
the functions amazed or twice defined above can be
partially applied.
- It is not necessary for all polymorphic arguments to appear at the
beginning of a type signature. For example, the following type
signature is valid:
eg :: Int -> (forall a. [a] -> [a]) -> Int -> [Int]
However, as a consequence of the rules given above,
the eg function defined here must always be applied
to at least two arguments, even though the first of these
does not have a polymorphic type.
- In the definition of a function, there must be at least as many
arguments on the left hand side of the definition as are needed to
include the rightmost argument with a quantified type. Only
variables (or a wildcard, _) can be used as arguments
on the left hand side of a function definition where a value of
polymorphic type is expected.
- Arbitrary expressions can be used for polymorphic arguments in a
function call, provided that they can be assigned the necessary
polymorphic type. For example, all of the following expressions
are valid calls to the amazed function defined above:
amazed (let i x = x in i)
amazed (\x -> x)
amazed (id . id . id . id)
amazed (id id id id id)
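The rules above can be seen at work in a slightly larger sketch. The function applyBoth is a hypothetical example, not part of any Hugs library; it uses its polymorphic argument at two different list types in the same body, which is exactly what an ordinary (rank 1) type would not allow. The LANGUAGE pragma is an assumption made so that the same file also loads in GHC; Hugs in -98 mode should treat it as a comment.

```haskell
{-# LANGUAGE RankNTypes #-}

-- applyBoth is a hypothetical name used only for illustration.
-- The explicit signature is required because the first argument
-- has a quantified type.
applyBoth :: (forall a. [a] -> [a]) -> ([Int], String) -> ([Int], String)
applyBoth f (ns, cs) = (f ns, f cs)   -- f is used at [Int] and at [Char]
```

A call such as applyBoth reverse ([1,2,3], "abc") supplies both arguments at once, as the rules require, and reverses each component of the pair. A function like (map succ) would be rejected as the first argument because it only works for lists of enumerable elements, not for all a.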
A similar syntax can be used to include polymorphic components in
datatypes, as illustrated by the following examples:
data Monad1 m = MkMonad1 {
                  unit1 :: (forall a. a -> m a),
                  bind1 :: (forall a b. m a -> (a -> m b) -> m b)
                }
data Monad2 m = MkMonad2 (forall a. a -> m a)
                         (forall a b. m a -> (a -> m b) -> m b)
listMonad1 = MkMonad1 {unit1 = \x->[x],
                       bind1 = \x f -> concat (map f x)}
listMonad2 = MkMonad2 (\x->[x]) (\x f -> concat (map f x))
In this case, MkMonad1 and MkMonad2 have types:
(forall b. b -> m b) -> (forall b c. m b -> (b->m c) -> m c) -> Monad1 m
(forall b. b -> m b) -> (forall b c. m b -> (b->m c) -> m c) -> Monad2 m
respectively, while listMonad1 and listMonad2 have types:
Monad1 []
Monad2 []
Note that an expression like (MkMonad2 (\x->[x])) will
not be allowed because, by the rules above, the
constructor MkMonad2 can only be used when both arguments
are provided. An attempt to correct this problem by eta-expansion,
such as (\b -> MkMonad2 (\x->[x]) b), will also fail because
the new variable, b, that this introduces is now lambda-bound
and hence the type that we obtain for it will not be as general as
the MkMonad2 constructor requires. We can, however, use an
auxiliary function with an explicit type signature to achieve the
desired effect:
halfListMonad :: (forall a b. [a] -> (a -> [b]) -> [b]) -> Monad2 []
halfListMonad b = MkMonad2 (\x -> [x]) b
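To show why polymorphic components are useful, the following sketch unpacks a Monad1 value by pattern matching and then uses each component at more than one type. The helper pairs is a hypothetical name introduced for this example, and the datatype is repeated so that the fragment stands on its own. The LANGUAGE pragma is an assumption made for GHC; Hugs in -98 mode should treat it as a comment.

```haskell
{-# LANGUAGE RankNTypes #-}

data Monad1 m = MkMonad1 {
                  unit1 :: (forall a. a -> m a),
                  bind1 :: (forall a b. m a -> (a -> m b) -> m b)
                }

listMonad1 :: Monad1 []
listMonad1 = MkMonad1 {unit1 = \x -> [x],
                       bind1 = \x f -> concat (map f x)}

-- pairs is a hypothetical helper. The variable bind is used at two
-- different instances of its polymorphic type, which is only possible
-- because the component was declared with a forall.
pairs :: Monad1 m -> m a -> m b -> m (a, b)
pairs (MkMonad1 unit bind) ma mb =
  bind ma (\a -> bind mb (\b -> unit (a, b)))
```

With the list dictionary, pairs listMonad1 [1,2] "ab" enumerates every combination of an element from the first list with an element from the second.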
In the current implementation, the named update syntax for Haskell
datatypes (in expressions like exp{field=newValue}) cannot be
used with datatypes that include polymorphic components.
The runST primitive that is used in work with lazy state
threads is now handled using the facilities described here to define
it as a function:
runST :: (forall s. ST s a) -> a
As a result, it is no longer necessary to build the ST type
into the interpreter; to make use of these facilities, a program
should instead import the ST library (or its lazier
variant, LazyST).
A further consequence of this is that the ST and LazyST
libraries cannot be used when Hugs is running in Haskell 98 mode,
because that prevents the definition and use of values
like runST that require rank 2 types.
7.3.3 Type annotations in patterns
Hugs allows patterns of the form (pat :: type) to be used as type
annotations (in the style of Standard ML). To allow effective
type inference, the type specified here must
be a monotype (no forall part or class constraints are allowed), but
it may include variables, which, with one exception noted below, have the
same scope as the patterns in which
they appear. For example, the term \(x::Int) -> x has
type Int -> Int, while the
expression \(x::a) (xs::[a]) -> xs ++ [x] has
type a -> [a] -> [a].
Use of this feature is subject to the following rules:
- It is an error for a variable to be used in a type where a more
specific type is inferred. For example, (\(x::a) -> not x) is
not a valid expression.
- It is an error for distinct variables to be used where the
types concerned are the same. For example, the expression
(\(x::a) (y::b) -> [x,y]) is not valid.
- Type variables bound in a pattern may be used in type signatures
or further pattern type annotations within the scope of the
binding. For example:
f (x::a) = let g :: a -> [a]
               g y = [x,y]
           in g x
In current versions of Haskell, there is no way to write a type
for the local function g in this example because of the convention
that free type variables are implicitly bound by a universal
quantifier. In this example, the variable is instead bound in
the pattern (x::a) and so the type assigned to g is actually
monomorphic.
- Type signatures do not introduce bindings for type variables,
but may involve type variables bound in an enclosing scope.
For example, there is no direct relation between the
variable t appearing in the type signature and the
variable t appearing in the pattern annotation in the
following code:
pair :: t -> s -> (t,s)
pair x (y::t) = (x,y::t)
The explanation for this is that the type signature
for pair (which might, in practice, be separated from
the definition) is not in the scope of the binding of the
variables x and y.
- In the current implementation, pattern type annotations that
include variables are allowed on the left hand side of a
pattern binding, but scope only over the right hand side of the
binding.
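The scoping rules above can be combined in one short sketch. The function tagAll is a hypothetical example. Its top-level signature uses the variables t and s, while the pattern annotation binds a separate variable a that is then mentioned in the signature of the local function tag. The variable b in that local signature is not bound anywhere, so it is implicitly quantified in the usual way. The LANGUAGE pragma is an assumption made for GHC, whose corresponding extension is ScopedTypeVariables; Hugs in -98 mode should treat it as a comment.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

-- tagAll is a hypothetical name used only for illustration.
tagAll :: t -> [s] -> [(t, s)]
tagAll (x :: a) ys = map tag ys
  where
    -- a refers to the type of x, bound by the pattern annotation;
    -- b is free here and so is universally quantified.
    tag :: b -> (a, b)
    tag y = (x, y)
```

For example, tagAll 'k' [1,2] pairs the character with each element of the list.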
7.3.4 Existential types
Hugs supports a form of existential types in datatype definitions in
the style originally suggested by Perry and by
Laufer. Existentially
quantified type variables must be bound by an explicit forall construct
preceding the name of the constructor in which the existentially quantified
variables appear. The apparently counterintuitive use of forall to
capture existentially quantified variables becomes clearer when we look at
an example:
data Appl = forall a. MkAppl (a -> Int) a (a -> a)
and consider that the MkAppl constructor defined here
does indeed have a fully polymorphic type:
MkAppl :: (a -> Int) -> a -> (a -> a) -> Appl.
Because the variable a does not appear in the result type, the choice
of a in any particular use of MkAppl will be hidden. As a result,
when a MkAppl constructor is used in a pattern match, we must be
careful that the hidden type does not `escape' into the result type
or into the enclosing assumptions. For example, the following
definitions are acceptable:
good1 (MkAppl f x i) = f x
good2 (MkAppl f x i) = map f (iterate i x)
but the next two definitions are not:
bad1 (MkAppl f x i) = x
bad3 y = let g (MkAppl f x i) = length [x,y] + 1 in True
The facilities for type annotations in patterns that were described
in Section 7.3.3 can be used in conjunction with existentials, as
in the example:
good (MkAppl f (x::a) i) = map f (iterate i x :: [a])
In this case, the typing annotations are redundant, although they do
still provide potentially useful information for the programmer.
A datatype whose definition involves existentially quantified variables
cannot use the standard Haskell mechanisms for deriving instances
of standard classes like Eq and Show. If instances of
these classes are required, then they must be provided explicitly by
the programmer. It is possible, however, to attach type class
constraints to existentially quantified variables in a datatype
definition. For example, we can define a type of "show"able
values using the definition:
data Showable = forall a. Show a => MkShowable a
This will mean that all of the operations of the specified classes, in
this case just Show, are available when a value of this type is
unpacked during pattern matching. For example, this can be put to
good use to define a simple instance of Show for
the Showable datatype:
instance Show Showable where
  show (MkShowable x) = show x
This definition can now be used in examples like the following:
Main> map show [MkShowable 3, MkShowable True, MkShowable 'a']
["3", "True", "'a'"]
Main>
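Existential types are most useful when values built from different hidden types are stored together. The sketch below reuses the Appl datatype from earlier in this section; the names firstThree and samples are hypothetical and exist only for this example. The LANGUAGE pragma is an assumption made for GHC; Hugs in -98 mode should treat it as a comment.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

data Appl = forall a. MkAppl (a -> Int) a (a -> a)

-- The hidden type never escapes: only Int values appear in the result.
firstThree :: Appl -> [Int]
firstThree (MkAppl f x i) = take 3 (map f (iterate i x))

-- Two values with different hidden types (String and Int) in one list.
samples :: [Appl]
samples = [ MkAppl length "ab" ('x' :)
          , MkAppl id 10 (* 2) ]
```

Here map firstThree samples yields [[2,3,4],[10,20,40]]: the first entry measures a string that grows by one character per step, and the second doubles an integer.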
7.3.5 Restricted type synonyms
Hugs supports the use of restricted type synonyms, first introduced in
Gofer, and similar to the mechanisms for defining abstract datatypes
that were provided in several earlier languages. The purpose of
a restricted type synonym is to restrict the expansion
of a type synonym to a particular set of functions. Outside of the
selected group of functions, the synonym constructor behaves like
a standard datatype. More precisely, a restricted type synonym
definition is a top level declaration of the form:
type T a1 ... am = rhs in f1, ..., fn
where T is a new type constructor name
and rhs is a type expression typically involving some of the
(distinct) type variables a1, ..., am.
The major difference with a normal type synonym definition
is that the expansion of
the type synonym can only be used within the binding group of one
of the functions f1, ..., fn (all of which must be
defined by top-level definitions in the module containing
the restricted type synonym definition). In the definition of any
other value, T
is treated as if it had been introduced by a definition of the form:
data T a1 ... am = ...
For a simple example of this,
consider the following definition of a datatype of stacks in terms of
the standard list type:
type Stack a = [a] in emptyStack, push, pop, top, isEmpty
emptyStack :: Stack a
emptyStack = []
push :: a -> Stack a -> Stack a
push = (:)
pop :: Stack a -> Stack a
pop [] = error "pop: empty stack"
pop (_:xs) = xs
top :: Stack a -> a
top [] = error "top: empty stack"
top (x:_) = x
isEmpty :: Stack a -> Bool
isEmpty = null
The type signatures here are particularly important. For example,
because emptyStack is mentioned in the definition of the restricted
type synonym Stack, the definition of emptyStack is type
correct. The declared type for emptyStack is Stack a which
can be expanded to [a], agreeing with the type for the empty
list []. However, in an expression outside the binding group
of these functions, the Stack a type is quite distinct from
the [a] type:
? emptyStack ++ [1]
ERROR: Type error in application
*** Expression : emptyStack ++ [1]
*** Term : emptyStack
*** Type : Stack b
*** Does not match : [a]
?
The binding group of a value refers to the set of values whose
definitions are in the same mutually recursive group of bindings.
In particular, this does not extend to class and instance declarations
so we can define instances such as:
instance Eq a => Eq (Stack a) where
  s1 == s2 | isEmpty s1 = isEmpty s2
           | isEmpty s2 = isEmpty s1
           | otherwise  = top s1 == top s2 && pop s1 == pop s2
As a convenience, Hugs allows the type signatures of functions
mentioned in the type synonym declaration to be specified within the
definition. Thus the above example could also have been written as:
type Stack a = [a] in
  emptyStack :: Stack a,
  push :: a -> Stack a -> Stack a,
  pop :: Stack a -> Stack a,
  top :: Stack a -> a,
  isEmpty :: Stack a -> Bool
emptyStack = []
...
If a type signature is included as part of the definition of a restricted
type synonym, then the declaration should not be repeated elsewhere in the
module; Hugs will reject any attempt to do this by complaining about
a repeated type signature.
7.4 Implicit parameters
Hugs supports an experimental implementation of
Implicit Parameters, which provides a technique for introducing
dynamic binding of variables into a language with a Hindley-Milner based
type system. This is based on as-yet-unpublished work by Jeff Lewis,
Erik Meijer and Mark Shields. The prototype implementation, and much
of the following description, was provided by Jeff Lewis.
A variable is called dynamically bound when it is bound by the calling
context of a function and statically bound when bound by the callee's
context. In Haskell, all variables are statically bound. Dynamic binding
of variables is a notion that goes back to Lisp, but was later discarded
in more modern incarnations, such as Scheme. Dynamic binding can be very
confusing in an untyped language, and unfortunately, typed languages, in
particular Hindley-Milner typed languages like Haskell, only support static
scoping of variables.
However, by a simple extension to the type class system of Haskell,
we can support dynamic binding. Basically, we express the use
of a dynamically bound variable as a constraint on the type.
These constraints lead to types of the form (?x::t') => t,
which says "this function uses a dynamically-bound variable
?x of type t'".
For example, the following expresses the type of a sort
function, implicitly parameterized by a comparison function
named cmp.
sort :: (?cmp :: a -> a -> Bool) => [a] -> [a]
The dynamic binding constraints are just a new form of
predicate in the type class system.
An implicit parameter is introduced by the special form ?x,
where x is any valid identifier. Use of this construct
also introduces new dynamic binding constraints.
For example, the following definition shows how we can define
an implicitly parameterized sort function in terms of an
explicitly parameterized sortBy function:
sortBy :: (a -> a -> Bool) -> [a] -> [a]
sort :: (?cmp :: a -> a -> Bool) => [a] -> [a]
sort = sortBy ?cmp
Dynamic binding constraints behave just like other type class constraints
in that they are automatically propagated. Thus, when a function is used,
its implicit parameters are inherited by the function that called it.
For example, our sort function might be used to pick out the least
value in a list:
least :: (?cmp :: a -> a -> Bool) => [a] -> a
least xs = head (sort xs)
Without lifting a finger, the ?cmp parameter is propagated to
become a parameter of least as well.
With explicit parameters, the default is that parameters must always
be explicitly propagated. With implicit parameters,
the default is to always propagate them.
However, an implicit parameter differs from other type class
constraints in the following way: All uses of a particular implicit
parameter must have the same type.
This means that the type of (?x, ?x) is (?x::a) => (a, a),
and not (?x::a, ?x::b) => (a, b), as would be the case for
type class constraints.
An implicit parameter is bound using an expression of the
form e with binds, or equivalently as dlet binds in e,
where both with and dlet (dynamic let) are new keywords.
These forms bind the implicit parameters arising in the body, not
the free variables as a let or where would do.
For example, we define the min function by binding ?cmp.
min :: Ord a => [a] -> a
min = least with ?cmp = (<=)
Syntactically, the binds part of a with or dlet construct
must be a collection of simple bindings to variables (no function-style
bindings, and no type signatures); these bindings are neither polymorphic
nor recursive.
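The pieces of this section can be assembled into one complete sketch. Note the assumptions: the code below targets GHC's ImplicitParams extension, which binds implicit parameters with let ?x = ... in e rather than with the with and dlet forms described here, so it illustrates the same idea but is not Hugs syntax. The names sortBy', sortIP, smallest and largest are hypothetical and were chosen to avoid clashing with Prelude and library names; sortBy' is a simple insertion sort standing in for the explicitly parameterized sortBy.

```haskell
{-# LANGUAGE ImplicitParams #-}

-- Explicitly parameterized insertion sort (hypothetical stand-in for sortBy).
sortBy' :: (a -> a -> Bool) -> [a] -> [a]
sortBy' le = foldr insert []
  where
    insert x [] = [x]
    insert x (y : ys)
      | le x y    = x : y : ys
      | otherwise = y : insert x ys

-- Using ?cmp introduces the dynamic binding constraint.
sortIP :: (?cmp :: a -> a -> Bool) => [a] -> [a]
sortIP = sortBy' ?cmp

-- The constraint is propagated from sortIP to least.
least :: (?cmp :: a -> a -> Bool) => [a] -> a
least xs = head (sortIP xs)

-- Binding ?cmp discharges the constraint (GHC syntax, see above).
smallest :: Ord a => [a] -> a
smallest xs = let ?cmp = (<=) in least xs

largest :: Ord a => [a] -> a
largest xs = let ?cmp = (>=) in least xs
```

The same least function serves both callers: smallest [3,1,2] gives 1 and largest [3,1,2] gives 3, because each caller supplies its own binding for ?cmp.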
The Hugs 98 User Manual
May 22, 1999