Avl Trees

Data Definition

An Avl tree is a binary search tree. It maintains the search invariant, and it also maintains an invariant about the heights of its sub-trees. This second invariant ensures that an Avl tree is always roughly balanced, so a search always takes time proportional to log n, where n is the number of elements stored in the tree.

A definition in Haskell for an Avl tree is:

{-# LANGUAGE GADTSyntax #-}  -- enables the "data ... where" declaration syntax

data Avl a where
  Tip  :: Avl a
  Node :: Balance -> Avl a -> a -> Avl a -> Avl a

data Balance where
  Same :: Balance
  Less :: Balance
  More :: Balance
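
As a minimal sketch of why the search invariant gives fast lookups, a search only has to follow one branch per level of the tree. The names member, leaf and example are illustrative and not part of the original notes, and the two types are restated in ordinary data syntax (assumed equivalent to the declarations above) so that the block loads on its own.

```haskell
-- Same types as above, written in ordinary data syntax.
data Balance = Same | Less | More deriving (Eq, Show)
data Avl a = Tip | Node Balance (Avl a) a (Avl a) deriving (Eq, Show)

-- Search: compare with the element at the root and descend into
-- exactly one sub-tree, so the work is bounded by the tree's height.
member :: Ord a => a -> Avl a -> Bool
member _ Tip = False
member x (Node _ l y r) = case compare x y of
  LT -> member x l
  EQ -> True
  GT -> member x r

-- Hypothetical helper: a one-element tree.
leaf :: a -> Avl a
leaf x = Node Same Tip x Tip

-- A hand-built tree with 2 at the root and 1 and 3 as leaves.
example :: Avl Int
example = Node Same (leaf 1) 2 (leaf 3)
```

In GHCi, member 3 example is True and member 4 example is False. The Balance tag is ignored by the search; it only matters when the tree is built.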


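The invariants are:

The search invariant: in every Node, all elements of the left sub-tree are smaller than the element stored in the Node, and all elements of the right sub-tree are larger.

The balance invariant: in every Node, the heights of the two sub-trees differ by at most one, and the Balance tag records which case holds. As the tags are used in the code in these notes, Same means the two sub-trees have equal height, More means the left sub-tree is one taller than the right, and Less means the left sub-tree is one shorter than the right.

When a tree is created using the Node constructor, care must be taken to always use the correct tag (Same, Less, or More) to encode the exact relationship.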
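
One way to make the two invariants concrete is to write a checker for them. The sketch below assumes the reading of the tags that the insertion code later in these notes relies on (More: the left sub-tree is one taller, Less: the left sub-tree is one shorter, Same: equal heights). The helper names height, balanced, toList and ordered are hypothetical, and the types are restated in ordinary data syntax so the block stands alone.

```haskell
data Balance = Same | Less | More deriving (Eq, Show)
data Avl a = Tip | Node Balance (Avl a) a (Avl a) deriving (Eq, Show)

-- Height computed from the shape alone, ignoring the tags.
height :: Avl a -> Int
height Tip = 0
height (Node _ l _ r) = 1 + max (height l) (height r)

-- Balance invariant: every tag agrees with the actual heights.
-- (Recomputing heights at every node is slow; this is only a checker.)
balanced :: Avl a -> Bool
balanced Tip = True
balanced (Node bal l _ r) = tagOk && balanced l && balanced r
  where
    d = height l - height r
    tagOk = case bal of
      Same -> d == 0
      More -> d == 1
      Less -> d == (-1)

-- In-order traversal.
toList :: Avl a -> [a]
toList Tip = []
toList (Node _ l x r) = toList l ++ [x] ++ toList r

-- Search invariant: the in-order traversal is strictly increasing.
ordered :: Ord a => Avl a -> Bool
ordered t = and (zipWith (<) xs (drop 1 xs))
  where xs = toList t
```

For example, balanced (Node More (Node Same Tip 1 Tip) 2 Tip) holds, while the same shape tagged Same at the root fails the check because the left sub-tree is actually one taller.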

Using Rotations to build balanced trees.

The strategy is that whenever a new tree is created by applying the Node constructor and the height relationship cannot be captured by Same, Less, or More, we apply rotations to transform the tree until one of the tags reflects the correct relationship of the transformed tree.

If we are clever and careful, the only transformation we need is one where the two sub-trees we are connecting using Node have a height difference of exactly two. We can capture this case with a pair of functions. If the left sub-tree has height 2 greater than the right sub-tree we use rotR, and if the right sub-tree has height 2 greater than the left sub-tree we use rotL.

rotR :: Avl a -> a -> Avl a -> (Balance, Avl a)
--        n+2            n       n+2 if Balance = Same, or
--                               n+3 if Balance = More

rotL :: Avl a -> a -> Avl a -> (Balance, Avl a)
--        n             n+2       n+2 if Balance = Same, or
--                                n+3 if Balance = More

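The resulting tree might be one taller than the taller of the two sub-trees, depending upon the exact heights involved, and we tag this possibility by returning a pair as outlined in the comments above: the Balance component is Same if the result has height n+2 and More if it has height n+3.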

There are a number of possibilities for the heights of the two sub-trees. The possibilities for rotR are shown graphically below. The possibilities for rotL are similar and are not shown.


We can capture this by the following Haskell functions.

rotR :: Avl a -> a -> Avl a -> (Balance, Avl a)
--        n+2            n       n+2 if Balance = Same, or
--                               n+3 if Balance = More
rotR Tip u a = unreachable
rotR (Node Same b v c) u a = grew(Node Less b v (Node More c u a))
rotR (Node More b v c) u a = same(Node Same b v (Node Same c u a))
rotR (Node Less b v Tip) u a = unreachable
rotR (Node Less b v (Node Same x m y)) u a = same(Node Same (Node Same b v x) m (Node Same y u a))
rotR (Node Less b v (Node Less x m y)) u a = same(Node Same (Node More b v x) m (Node Same y u a))
rotR (Node Less b v (Node More x m y)) u a = same(Node Same (Node Same b v x) m (Node Less y u a))

rotL :: Avl a -> a -> Avl a -> (Balance, Avl a)
--        n             n+2       n+2 if Balance = Same, or
--                                n+3 if Balance = More
rotL a u Tip = unreachable
rotL a u (Node Same b v c) = grew(Node More (Node Less a u b) v c)
rotL a u (Node Less b v c) = same(Node Same (Node Same a u b) v c)
rotL a u (Node More Tip v c) = unreachable
rotL a u (Node More (Node Same x m y) v c) = same(Node Same (Node Same a u x) m (Node Same y v c))
rotL a u (Node More (Node Less x m y) v c) = same(Node Same (Node More a u x) m (Node Same y v c))
rotL a u (Node More (Node More x m y) v c) = same(Node Same (Node Same a u x) m (Node Less y v c))

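The clauses with unreachable cannot occur if the taller sub-tree really has a height exactly two greater than the other one. A Tip has height zero, so it can never be the taller sub-tree. The patterns Node Less b v Tip (in rotR) and Node More Tip v c (in rotL) cannot match a well-formed tree at all, because their tags claim that a sub-tree is shorter than a Tip.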

We have been careful to show all possibilities of the tags.
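
To see the rotation at work, here is a small worked example with n = 0: the left sub-tree has height 2 and the right sub-tree is a Tip. The block restates rotR from above (with a catch-all clause standing in for the two unreachable clauses) so that it can be loaded on its own; leaf, singleCase and doubleCase are hypothetical names introduced only for this illustration.

```haskell
data Balance = Same | Less | More deriving (Eq, Show)
data Avl a = Tip | Node Balance (Avl a) a (Avl a) deriving (Eq, Show)

same, grew :: t -> (Balance, t)
same tree = (Same, tree)
grew tree = (More, tree)

-- rotR as defined above; the final clause covers the unreachable cases.
rotR :: Avl a -> a -> Avl a -> (Balance, Avl a)
rotR (Node Same b v c) u a = grew (Node Less b v (Node More c u a))
rotR (Node More b v c) u a = same (Node Same b v (Node Same c u a))
rotR (Node Less b v (Node Same x m y)) u a = same (Node Same (Node Same b v x) m (Node Same y u a))
rotR (Node Less b v (Node Less x m y)) u a = same (Node Same (Node More b v x) m (Node Same y u a))
rotR (Node Less b v (Node More x m y)) u a = same (Node Same (Node Same b v x) m (Node Less y u a))
rotR _ _ _ = error "rotR: left sub-tree is not two taller than the right"

-- Hypothetical helper: a one-element tree.
leaf :: a -> Avl a
leaf x = Node Same Tip x Tip

-- Left child leans left (tag More): a single rotation suffices.
--       3                2
--      /                / \
--     2        ==>     1   3
--    /
--   1
singleCase :: (Balance, Avl Int)
singleCase = rotR (Node More (leaf 1) 2 Tip) 3 Tip

-- Left child leans right (tag Less): the grandchild 2 moves to the root.
--     3                  2
--    /                  / \
--   1          ==>     1   3
--    \
--     2
doubleCase :: (Balance, Avl Int)
doubleCase = rotR (Node Less Tip 1 (leaf 2)) 3 Tip
```

Both calls return (Same, Node Same (leaf 1) 2 (leaf 3)): the result has height 2 = n+2, so the first component is Same.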

The insertion function.

Insertion inserts an element into an arbitrary tree that meets both the search invariant and the balance invariant and returns a new tree that also meets the two invariants.

It is similar in structure to the insertion algorithm we saw last week, which I include here for reference.

insertTree :: (Eq key,Ord key) => (key,value) -> BinSearchTree (key,value) ->  BinSearchTree (key,value)
insertTree (k,v) Empty = Branch Empty (k,v) Empty
insertTree (k,v) (Branch left (key,satelite) right)
  | k>=key =   Branch left                    (key,satelite) (insertTree (k,v) right)
  | k < key =  Branch (insertTree (k,v) left) (key,satelite) right

Note that if the tree is empty, we build a new tree of height one, and if the tree is not empty, we go down either the left or the right sub-tree by comparing the key to be inserted with the key stored in the Branch.

The insertion algorithm for Avl trees has a similar structure, except that we need extra care to maintain the balance invariant. This extra care requires observing the Balance tags of the existing tree, and creating the correct Balance tags for the tree being built.

ins :: (Ord a) => a -> Avl a -> (Balance, Avl a)
--                        n         n if Balance = Same, or
--                                  n+1 if Balance = More
ins x Tip = grew(Node Same Tip x Tip)
ins x (Node bal lc y rc)
  | x == y = same(Node bal lc y rc)
  | x < y  = case ins x lc of
               (Same,lc2) -> same(Node bal lc2 y rc)
               (More,lc2) ->
                 case bal of
                   Same -> grew(Node More lc2 y rc)
                   Less -> same(Node Same lc2 y rc)
                   More -> rotR lc2 y rc -- rebalance
  | x > y  = case ins x rc of
               (Same,rc2) -> same(Node bal lc y rc2)
               (More,rc2) ->
                 case bal of
                   Same -> grew(Node Less lc y rc2)
                   More -> same(Node Same lc y rc2)
                   Less -> rotL lc y rc2 -- rebalance

We use the following helper functions to tag whether the height of the created tree grew.

same :: t -> (Balance, t)
same tree = (Same,tree)

grew :: t -> (Balance, t)
grew tree = (More,tree)

unreachable :: a
unreachable = undefined
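
Putting the pieces together, the following sketch restates the definitions from these notes in one self-contained block and adds a thin wrapper around ins. The names insert, fromList, height and toList are assumptions made for this illustration, as is the error message used for the unreachable cases; the types are written in ordinary data syntax.

```haskell
data Balance = Same | Less | More deriving (Eq, Show)
data Avl a = Tip | Node Balance (Avl a) a (Avl a) deriving (Eq, Show)

same, grew :: t -> (Balance, t)
same tree = (Same, tree)
grew tree = (More, tree)

rotR, rotL :: Avl a -> a -> Avl a -> (Balance, Avl a)
rotR (Node Same b v c) u a = grew (Node Less b v (Node More c u a))
rotR (Node More b v c) u a = same (Node Same b v (Node Same c u a))
rotR (Node Less b v (Node Same x m y)) u a = same (Node Same (Node Same b v x) m (Node Same y u a))
rotR (Node Less b v (Node Less x m y)) u a = same (Node Same (Node More b v x) m (Node Same y u a))
rotR (Node Less b v (Node More x m y)) u a = same (Node Same (Node Same b v x) m (Node Less y u a))
rotR _ _ _ = error "rotR: unreachable"

rotL a u (Node Same b v c) = grew (Node More (Node Less a u b) v c)
rotL a u (Node Less b v c) = same (Node Same (Node Same a u b) v c)
rotL a u (Node More (Node Same x m y) v c) = same (Node Same (Node Same a u x) m (Node Same y v c))
rotL a u (Node More (Node Less x m y) v c) = same (Node Same (Node More a u x) m (Node Same y v c))
rotL a u (Node More (Node More x m y) v c) = same (Node Same (Node Same a u x) m (Node Less y v c))
rotL _ _ _ = error "rotL: unreachable"

-- ins as in the notes: the Balance in the result says whether the tree grew.
ins :: Ord a => a -> Avl a -> (Balance, Avl a)
ins x Tip = grew (Node Same Tip x Tip)
ins x (Node bal lc y rc)
  | x == y = same (Node bal lc y rc)
  | x < y  = case ins x lc of
               (More, lc2) -> case bal of
                                Same -> grew (Node More lc2 y rc)
                                Less -> same (Node Same lc2 y rc)
                                More -> rotR lc2 y rc
               (_, lc2)    -> same (Node bal lc2 y rc)
  | otherwise = case ins x rc of          -- x > y
               (More, rc2) -> case bal of
                                Same -> grew (Node Less lc y rc2)
                                More -> same (Node Same lc y rc2)
                                Less -> rotL lc y rc2
               (_, rc2)    -> same (Node bal lc y rc2)

-- Hypothetical wrapper: drop the "did it grow" tag.
insert :: Ord a => a -> Avl a -> Avl a
insert x = snd . ins x

-- Hypothetical helper: insert the elements in list order.
fromList :: Ord a => [a] -> Avl a
fromList = foldl (flip insert) Tip

height :: Avl a -> Int
height Tip = 0
height (Node _ l _ r) = 1 + max (height l) (height r)

toList :: Avl a -> [a]
toList Tip = []
toList (Node _ l x r) = toList l ++ [x] ++ toList r
```

Inserting 1 through 7 in increasing order would produce a chain of height 7 with insertTree, whereas here height (fromList [1..7]) is 3. Duplicates are ignored, so toList (fromList [3,1,2,1]) is [1,2,3].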
