CS 581: Fall 2011

Tools Tutorial

This year Tim Sheard and I have implemented some of the constructions used in this class in Haskell.

Code fragments can be found in the directory: Code/FA

Haskell Platform: http://hackage.haskell.org/platform/

The dot visualization tool is part of Graphviz: http://www.graphviz.org/ (downloads available for Windows, Mac and Linux)

Tim Sheard's Haskell resource page: http://web.cecs.pdx.edu/~sheard/course/Cs163/Haskell/HaskellLinks.html

Getting started:

I recommend downloading the Haskell Platform and the Graphviz tools for your operating system.

The Haskell Platform includes the interpreter ghci, which provides a read-eval-print loop. ghci directives are prefixed with ":". Shell commands are prefixed with ":!".

This tutorial is structured around a partial solution to Assignment 1 that I have created in the file HW1.hs.

From the shell, invoke ghci.

bash-3.2$ ghci
GHCi, version 6.10.4: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer ... linking ... done.
Loading package base ... linking ... done.
Prelude>

The prompt "Prelude>" indicates that the Prelude module is loaded. This is the name of the basic, common environment provided to all Haskell programs. To load the file HW1.hs, use the command ":l HW1" or ":l HW1.hs".

Prelude> :l HW1.hs 
[ 1 of 14] Compiling DFA              ( DFA.hs, interpreted )
[ 2 of 14] Compiling PP               ( PP.hs, interpreted )
[ 3 of 14] Compiling Auxfuns          ( Auxfuns.hs, interpreted )
[ 4 of 14] Compiling NFAe             ( NFAe.hs, interpreted )
[ 5 of 14] Compiling NFA              ( NFA.hs, interpreted )
[ 6 of 14] Compiling RegExp           ( RegExp.hs, interpreted )
[ 7 of 14] Compiling PrintRegExp      ( PrintRegExp.hs, interpreted )
[ 8 of 14] Compiling Graphviz         ( Graphviz.hs, interpreted )
[ 9 of 14] Compiling PrintNFAe        ( PrintNFAe.hs, interpreted )
[10 of 14] Compiling PrintNFA         ( PrintNFA.hs, interpreted )
[11 of 14] Compiling PrintDFA         ( PrintDFA.hs, interpreted )
[12 of 14] Compiling Transform        ( Transform.hs, interpreted )
[13 of 14] Compiling Lecture2         ( Lecture2.hs, interpreted )
[14 of 14] Compiling HW1              ( HW1.hs, interpreted )
Ok, modules loaded: HW1, Lecture2, Graphviz, PrintRegExp, RegExp, Auxfuns, 
PrintNFAe, NFAe, PP, PrintNFA, NFA, PrintDFA, DFA, Transform.
*HW1>

This shows that ghci responds by loading the module HW1 and all of the modules that it references. Some of these modules are named explicitly in the file HW1.hs; others are referenced by the named modules. The prompt is set to the name of the most recently successfully loaded module. All identifiers in scope in that module are in scope at the prompt.

To get a list of everything allowed at the prompt, type :?

To quit, type :q

Let's look at the file we just loaded. The following bits are incomplete fragments.

module HW1
where
import Text.PrettyPrint.HughesPJ(Doc,text,char,int,(<>),(<+>),($+$),($$),render)
import PP
import Auxfuns
import List

import qualified DFA as D
import qualified NFA as N
import qualified NFAe as Ne

import PrintDFA
import PrintNFA
import PrintNFAe
import Graphviz

import Lecture2

This segment declares the module being defined to be HW1. This can be optionally followed by a list of symbols to be exported. This module exports all symbols (which is the default).

The import clauses declare which modules are to be included. Text.PrettyPrint.HughesPJ is the name of a module from the standard library, which is hierarchically structured. This one is for pretty printing (formatting output). The list of symbols specifies that only those symbols will be introduced into the scope, and that they will be named directly (without dots).

The import qualified clauses import modules with qualified names. In this case this is done because my definitions of DFA, NFA and NFA with epsilon moves are very similar and use the same names. In this file, D.foo refers to foo from DFA, N.foo is the foo from NFA, and Ne.foo is the foo from NFAe.
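As a small illustration of qualified names, independent of the course modules, the following sketch imports two standard library modules that both export a function called sort. The names sortedList and sortedNonEmpty are made up for this example.

```haskell
import qualified Data.List as L
import qualified Data.List.NonEmpty as NE

-- L.sort is the sort from Data.List; it works on ordinary lists.
sortedList :: [Int]
sortedList = L.sort [3, 1, 2]

-- NE.sort is the sort from Data.List.NonEmpty; the qualifier keeps the
-- two functions apart even though they share a name.
sortedNonEmpty :: NE.NonEmpty Int
sortedNonEmpty = NE.sort (3 NE.:| [1, 2])
```

Without the qualifiers, a bare use of sort would be ambiguous if both modules were imported unqualified.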

The next fragment of the file solves problem 1:

-- Problem 1:  Sipser 1.3
data UD = U | D deriving (Eq, Ord, Show)

instance PP UD where pp = text . show

m13 = D.DFA { D.states = [1..5], D.symbols = [U,D], D.delta = d, D.start = 3, D.final =[3] }
      where d 1 U = 1
            d 1 D = 2
            d 2 U = 1
            d 2 D = 3
            d 3 U = 2
            d 3 D = 4
            d 4 U = 3
            d 4 D = 5
            d 5 U = 4
            d 5 D = 5

m13Diag = pdfG "p1" m13

The line beginning with -- is a comment: "--" begins a comment that runs to the end of the line. {- and -} surround multi-line comments, which can be nested.
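A minimal sketch of the comment forms just described. The definition answer is only there so that the fragment contains some code; it is not part of HW1.hs.

```haskell
-- A line comment runs to the end of the line.

{- A block comment can span several lines.
   {- Block comments nest, so this inner pair does not end the outer one. -}
   This line is still inside the outer comment.
-}

answer :: Int
answer = 40 {- a block comment can also sit inside an expression -} + 2
```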

The line beginning data defines an algebraic data type UD. In this case it has two constructors, U and D.

The keyword deriving specifies that this data type is to automatically be an instance of the classes Eq, Ord, and Show. This means that its values can be tested for equality, compared with less than, and displayed as strings.
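To see what the derived instances provide, here is a short sketch that repeats the UD declaration and uses each class once. The names sameSymbol, upBeforeDown and shown are invented for the illustration.

```haskell
data UD = U | D deriving (Eq, Ord, Show)

-- Eq: a constructor is equal only to itself.
sameSymbol :: Bool
sameSymbol = U == U

-- Ord: the derived ordering follows the order in which the constructors
-- are listed in the declaration, so U is less than D.
upBeforeDown :: Bool
upBeforeDown = U < D

-- Show: the derived show produces the constructor name as a string.
shown :: String
shown = show D
```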

The next bit, beginning instance PP, is called an instance declaration. This is related to pretty printing. It specifies how the pp function will act on UD. Don't worry about this for now.

The declaration of m13 defines the DFA in Sipser's problem 1.3. DFA is a record type, defined in the module DFA.hs. Since the module is imported with qualified names, every name from that module is written with the prefix "D.". Here we see that the states are given by the list [1..5], which elaborates to [1,2,3,4,5]. The alphabet is given by the list [U,D]. The transition function is given by d, which is specified in the where clause by enumerating the values from the table in Sipser. The start state is 3, and the set of final states is the list [3], which contains only state 3.
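The module DFA.hs is not reproduced on this page, so the following is only a sketch of what such a record might look like, inferred from the field names used in m13. The record declaration, the catch-all case in d, and the helper accepts are assumptions made for illustration, not the course's definitions.

```haskell
-- Assumed shape of the DFA record; the real definition lives in DFA.hs.
data DFA q s = DFA
  { states  :: [q]          -- state space
  , symbols :: [s]          -- alphabet
  , delta   :: q -> s -> q  -- transition function
  , start   :: q            -- start state
  , final   :: [q]          -- accepting states
  }

data UD = U | D deriving (Eq, Ord, Show)

-- The machine from Sipser 1.3, written here without the D. qualifier.
m13 :: DFA Int UD
m13 = DFA { states = [1..5], symbols = [U, D], delta = d, start = 3, final = [3] }
  where
    d :: Int -> UD -> Int
    d 1 U = 1
    d 1 D = 2
    d 2 U = 1
    d 2 D = 3
    d 3 U = 2
    d 3 D = 4
    d 4 U = 3
    d 4 D = 5
    d 5 U = 4
    d 5 D = 5
    d _ _ = error "state outside 1..5"  -- added so the function is total

-- Hypothetical helper: run the machine from the start state over the input
-- and test whether the state reached is an accepting state.
accepts :: Eq q => DFA q s -> [s] -> Bool
accepts m w = foldl (delta m) (start m) w `elem` final m
```

With these definitions, accepts m13 [U, D] follows the path 3, 2, 3 and so ends in the final state.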

Finally, m13Diag wraps a call to pdfG in a value that, when demanded, invokes the dot tool to write a cartoon of the machine to the file p1.pdf. It is invoked at the top level loop as follows:

*HW1> m13Diag
Loading package syb ... linking ... done.
Loading package array-0.2.0.0 ... linking ... done.
Loading package containers-0.2.0.1 ... linking ... done.
Loading package unix-2.3.2.0 ... linking ... done.
Loading package filepath-1.1.0.2 ... linking ... done.
Loading package old-locale-1.0.0.1 ... linking ... done.
Loading package old-time-1.0.0.2 ... linking ... done.
Loading package directory-1.0.0.3 ... linking ... done.
Loading package process-1.0.1.1 ... linking ... done.
Loading package pretty-1.0.1.0 ... linking ... done.
Loading package random-1.0.0.1 ... linking ... done.
Loading package haskell98 ... linking ... done.
Loading package parsec-2.1.0.1 ... linking ... done.
Loading package mtl-1.1.0.2 ... linking ... done.
ExitSuccess
*HW1>

The "Loading package" messages will only appear once per session. ExitSuccess is the return value.

The graphic output can be displayed by viewing HW1/p1.pdf.

The next fragment relates to problem 2, where we need to build the intersection of two DFAs using the product construction. The product construction was used in Lecture2.hs (previously published) to build the union of two DFAs. Here we modify it to build the intersection by changing the final states: a pair of states is final only when both components are final.

The function intersectionDFA computes the intersection. The first line gives its type declaration. The bits to the left of the => are called class constraints. They specify that the two state spaces, q1 and q2, must be in the Ord class (the less than operation must be defined on them). On the right hand side we see the functional type. Given a DFA with state space q1 and alphabet s, and a second DFA with state space q2 and alphabet s, this builds a new DFA with state space q1 cross q2 and alphabet s.

The next three lines give the name of the function and its two arguments, which are both records.

The body of the function begins with =. It specifies the product DFA that we are constructing to compute the intersection.

The state space is built from lists using a "list comprehension". This is a very powerful feature of Haskell that I encourage you to read about.
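For readers new to list comprehensions, here is a small standalone sketch. The names pairs and evenSquares are invented for the example.

```haskell
-- Every pairing of a number with a character. The leftmost generator
-- varies slowest, so all pairs for 0 come before any pair for 1.
pairs :: [(Int, Char)]
pairs = [(n, c) | n <- [0, 1], c <- "ab"]

-- A boolean guard after the generators filters the results.
evenSquares :: [Int]
evenSquares = [n * n | n <- [1 .. 6], even n]
```

Here pairs is [(0,'a'),(0,'b'),(1,'a'),(1,'b')] and evenSquares is [4,16,36].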

The transition function is specified by an anonymous function (a "lambda"). The \ symbol begins the anonymous function. It is followed by a specification of its arguments. The body of the anonymous function follows the ->.
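A short sketch of anonymous functions, using made-up names (addOne, addOne', stepBoth) that do not appear in HW1.hs:

```haskell
-- A named function and the equivalent anonymous function.
addOne :: Int -> Int
addOne n = n + 1

addOne' :: Int -> Int
addOne' = \n -> n + 1

-- A lambda may pattern match on its arguments. This one takes a pair and
-- then a second argument, the same shape as a transition function on
-- pairs of states.
stepBoth :: (Int, Int) -> Int -> (Int, Int)
stepBoth = \(p, q) a -> (p + a, q * a)
```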

The final states are specified with a list comprehension. The nub function is applied to this list. nub is a function on lists that deletes duplicate elements. It is from the List library module. The $ is an alternative notation for application. It functions the same as an open parenthesis that extends as far to the right as possible.
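The following sketch shows nub and $ on their own. It imports nub from Data.List, which is the hierarchical name of the list library in current GHC releases; the file on this page uses the older Haskell 98 module name List. The names noDuplicates, withParens and withDollar are invented for the example.

```haskell
import Data.List (nub)

-- nub keeps the first occurrence of each element and drops later duplicates.
noDuplicates :: [Int]
noDuplicates = nub [1, 2, 1, 3, 2]

-- These two definitions mean the same thing; $ stands in for the parentheses.
withParens :: Int
withParens = length (nub [1, 1, 2])

withDollar :: Int
withDollar = length $ nub [1, 1, 2]
```

Here noDuplicates is [1,2,3], and both withParens and withDollar are 2.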

--  Problem 2:  1.4g  even length, odd number of a's
intersectionDFA :: (Ord q1, Ord q2) => D.DFA q1 s -> D.DFA q2 s -> D.DFA (q1, q2) s
intersectionDFA 
         (D.DFA { D.states = bigQ1, D.symbols = sigma, D.delta = d1, D.start = q10, D.final = f1})
         (D.DFA { D.states = bigQ2, D.delta = d2, D.start = q20, D.final = f2})  
    = D.DFA { D.states = [(q1,q2) | q1 <- bigQ1, q2 <- bigQ2],
              D.symbols = sigma,
              D.delta = \ (q1,q2) a -> (d1 q1 a, d2 q2 a),
              D.start = (q10,q20),
              D.final = nub $ [(q1,q2) | q1 <- f1, q2 <- f2]}

meven = D.DFA { D.states = [0,1], D.symbols = ['a','b'], D.delta = d, D.start = 0, D.final = [0] }
    where d 0 _ = 1
          d 1 _ = 0

moddas = D.DFA { D.states = [0,1], D.symbols = ['a','b'], D.delta = d, D.start = 0, D.final = [1] }
    where d i 'a' = (i+1) `mod` 2
          d i 'b' = i

m14g = intersectionDFA meven moddas

m14gDiag = do { pdfG "p2.1" meven
              ; pdfG "p2.2" moddas
              ; pdfG "p2.3" m14g
              }

After the definition of intersection, we give the definitions of the two simple DFAs mentioned in the problem. The first recognizes strings over a,b with even length. The second recognizes strings over a,b with an odd number of a's.

m14g is defined to be the intersection machine.

Evaluating m14gDiag generates cartoons in p2.1.pdf, p2.2.pdf, and p2.3.pdf.
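Besides looking at the cartoons, one can check the intersection machine by running it on a few strings. The sketch below is self-contained and therefore restates the construction without the D. qualifier. The DFA record declaration and the helper accepts are assumptions made for this illustration; the real record is defined in DFA.hs, which is not shown here.

```haskell
import Data.List (nub)

-- Assumed shape of the DFA record (the real one is in DFA.hs).
data DFA q s = DFA
  { states  :: [q]
  , symbols :: [s]
  , delta   :: q -> s -> q
  , start   :: q
  , final   :: [q]
  }

-- Product construction: run both machines in lock step and accept only
-- when both components are in a final state.
intersectionDFA :: (Ord q1, Ord q2) => DFA q1 s -> DFA q2 s -> DFA (q1, q2) s
intersectionDFA m1 m2 = DFA
  { states  = [(q1, q2) | q1 <- states m1, q2 <- states m2]
  , symbols = symbols m1
  , delta   = \(q1, q2) a -> (delta m1 q1 a, delta m2 q2 a)
  , start   = (start m1, start m2)
  , final   = nub [(q1, q2) | q1 <- final m1, q2 <- final m2]
  }

-- Strings of even length.
meven :: DFA Int Char
meven = DFA { states = [0,1], symbols = "ab", delta = d, start = 0, final = [0] }
  where d q _ = (q + 1) `mod` 2

-- Strings with an odd number of a's.
moddas :: DFA Int Char
moddas = DFA { states = [0,1], symbols = "ab", delta = d, start = 0, final = [1] }
  where d i 'a' = (i + 1) `mod` 2
        d i _   = i

m14g :: DFA (Int, Int) Char
m14g = intersectionDFA meven moddas

-- Hypothetical helper: run the machine and test the state reached.
accepts :: Eq q => DFA q s -> [s] -> Bool
accepts m w = foldl (delta m) (start m) w `elem` final m
```

For example, accepts m14g "ab" holds (length 2, one a), while accepts m14g "aa" does not (two a's).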

Here is the rest of the file:

-- Problem 3:  1.5d

-- DFAs are closed under complement
--
compDFA :: (Ord q) => D.DFA q s -> D.DFA q s
compDFA (m@(D.DFA { D.states = bigQ, D.final = f}) )
    = m { D.final = filter (\p -> not $ p `elem` f) bigQ }

masbs = D.DFA{D.states = [0,1,2], D.symbols = ['a','b'], D.delta = d, D.start = 0, D.final = [0,1] }
   where d 0 'a' = 0
         d 0 'b' = 1
         d 1 'a' = 2
         d 1 'b' = 1
         d 2 _ = 2

masbsDiag = do { pdfG "p3.1" masbs
               ; pdfG "p3.2" (compDFA masbs)
               }

-- Problem 4:  1.31

-- The reverse of the language of a DFA is a regular language
--

reverseDFA :: (Ord q) => D.DFA q s -> Ne.NFAe (Either q ()) s
reverseDFA (D.DFA { D.states = bigQ, 
                    D.symbols = sigma, 
                    D.delta = d, 
                    D.start = q0, 
                    D.final = f}) 
    = Ne.NFAe {Ne.states = (Right ()) : map Left bigQ,
               Ne.symbols = sigma,
               Ne.delta = delta',
               Ne.start = Right (),
               Ne.final = [Left q0]}
      where delta' (Right ()) Nothing  = map Left f
            delta' (Right ()) (Just _) = []
            delta' (Left p) (Just a)   = canonical $ [Left p' | p' <- bigQ, d p' a == p]
            delta' (Left p) Nothing    = []


-- Problem 5:  1.32

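-- Binary addition, least significant column first.
-- Each symbol (a,b,c) is a column of bits; the state is the carry (Nothing is a dead state).
-- A column is consistent when (carry + a + b) `mod` 2 == c; the machine accepts when no carry remains.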

bits = [0,1]

d (Just i) (a,b,c) 
    | (i + a + b) `mod` 2 == c = Just $ (i + a + b) `div` 2
    | otherwise = Nothing
d Nothing _ = Nothing

sums = D.DFA { D.states = [Just 0, Just 1, Nothing],
               D.symbols = [(a,b,c) | a <- bits, b <- bits, c <- bits],
               D.delta = d,
               D.start = Just 0,
               D.final = [Just 0]}

rsums = reverseDFA sums

rsum2 = nfaeToDfa rsums

m132Diag = do { pdfG "p5.1" sums
              ; pdfG "p5.2" rsums
              ; pdfG "p5.3" rsum2
              }

Check out the cartoons in HW1