Graphs in Computer Science

Introduction

Graphs are mathematical concepts that have found many uses in computer science. Graphs come in many different flavors, many of which have found uses in computer programs. Some flavors are:

• Simple graphs
• Undirected or directed graphs
• Cyclic or acyclic graphs
• Labeled graphs
• Weighted graphs
• Infinite graphs
• ... and many more too numerous to mention.

Most graphs are defined as a slight alteration of the following rules.

• A graph is made up of two sets called Vertices and Edges.
• The Vertices are drawn from some underlying type, and the set may be finite or infinite.
• Each element of the Edge set is a pair consisting of two elements from the Vertices set.
• Graphs are often depicted visually, by drawing the elements of the Vertices set as boxes or circles, and drawing the elements of the edge set as lines or arcs between the boxes or circles. There is an arc between v1 and v2 if (v1,v2) is an element of the Edge set.

Adjacency. If (u,v) is in the edge set we say u is adjacent to v (which we sometimes write as u ~ v).

For example, the graph drawn below has the following parts.

• The underlying set for the Vertices set is the integers.
• The Vertices set = {1,2,3,4,5,6}
• The Edge set = {(6,4),(4,5),(4,3),(3,2),(5,2),(2,1),(5,1)}
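
As a minimal sketch, the sample graph can be written down directly as two sets in Haskell. The names `SetGraph`, `sample`, `adjacent`, and `wellFormed` are illustrative and not part of these notes; `Data.Set` is assumed to be available from the containers package.

```haskell
import qualified Data.Set as Set

-- A graph as two sets: the vertices, and the edges (pairs of vertices).
-- SetGraph and its field names are illustrative assumptions.
data SetGraph v = SetGraph
  { vertexSet :: Set.Set v
  , edgeSet   :: Set.Set (v, v)
  }

-- The sample graph from the text.
sample :: SetGraph Int
sample = SetGraph
  { vertexSet = Set.fromList [1,2,3,4,5,6]
  , edgeSet   = Set.fromList [(6,4),(4,5),(4,3),(3,2),(5,2),(2,1),(5,1)]
  }

-- u ~ v exactly when the pair (u,v) is in the edge set.
adjacent :: Ord v => SetGraph v -> v -> v -> Bool
adjacent g u v = Set.member (u, v) (edgeSet g)

-- Every edge must mention only elements of the vertex set.
wellFormed :: Ord v => SetGraph v -> Bool
wellFormed g = all ok (Set.toList (edgeSet g))
  where ok (u, v) = Set.member u (vertexSet g) && Set.member v (vertexSet g)
```

Read literally, `adjacent sample 6 4` holds while `adjacent sample 4 6` does not, because only the pair (6,4) was written down. Whether the order of a pair matters is exactly what separates undirected from directed graphs.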

Kinds of Graphs

Various flavors of graphs have the following specializations and particulars about how they are usually drawn.
• Undirected Graphs.

In an undirected graph, the order of the vertices in the pairs in the Edge set doesn't matter. Thus, if we view the sample graph above as undirected, we could equally have written the Edge set as {(4,6),(4,5),(3,4),(3,2),(2,5),(1,2),(1,5)}. Undirected graphs usually are drawn with straight lines between the vertices.

The adjacency relation is symmetric in an undirected graph, so if u ~ v then it is also the case that v ~ u.
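
One way to sketch this symmetry, assuming the edge set is stored as a plain list with each edge written once, is to test for the pair in either order. The names `sampleEdges`, `adjacentU`, and `symmetricOnSample` are illustrative only.

```haskell
-- The sample graph's edge set, each undirected edge listed once.
sampleEdges :: [(Int, Int)]
sampleEdges = [(6,4),(4,5),(4,3),(3,2),(5,2),(2,1),(5,1)]

-- In an undirected graph the order within a pair is irrelevant,
-- so we look for the pair in either orientation.
adjacentU :: Eq v => [(v, v)] -> v -> v -> Bool
adjacentU es u v = (u, v) `elem` es || (v, u) `elem` es

-- Symmetry check: u ~ v agrees with v ~ u for all vertices 1..6.
symmetricOnSample :: Bool
symmetricOnSample =
  and [ adjacentU sampleEdges u v == adjacentU sampleEdges v u
      | u <- [1..6], v <- [1..6] ]
```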

• Directed Graphs.

In a directed graph the order of the vertices in the pairs in the Edge set matters. Thus u is adjacent to v only if the pair (u,v) is in the Edge set. For directed graphs we usually use arrows for the arcs between vertices. An arrow from u to v is drawn only if (u,v) is in the Edge set.

The directed graph below has the following parts.

• The underlying set for the Vertices set is capital letters.
• The Vertices set = {A,B,C,D,E}
• The Edge set = {(A,B),(B,C),(D,C),(B,D),(D,B),(E,D),(B,E)}

Note that both (B,D) and (D,B) are in the Edge set, so the arc between B and D is an arrow in both directions.
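
A small sketch of how direction shows up when the edge set is queried. Here the vertices are represented as `Char` values, and `directedEdges`, `successors`, and `predecessors` are illustrative names, not part of these notes.

```haskell
-- Edge set of the directed example; vertices are capital letters.
directedEdges :: [(Char, Char)]
directedEdges =
  [('A','B'),('B','C'),('D','C'),('B','D'),('D','B'),('E','D'),('B','E')]

-- Vertices at the head of an arrow leaving u.
successors :: Eq v => [(v, v)] -> v -> [v]
successors es u = [ v | (u', v) <- es, u' == u ]

-- Vertices at the tail of an arrow entering v.
predecessors :: Eq v => [(v, v)] -> v -> [v]
predecessors es v = [ u | (u, v') <- es, v' == v ]
```

With this edge set, `successors directedEdges 'C'` is empty even though two arrows point into C, which is the asymmetry that an undirected graph does not have.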

• Vertex labeled Graphs.

In a labeled graph, each vertex is labeled with some data in addition to the data that identifies the vertex. Only the identifying data is present in the pair in the Edge set. This is similar to the (key,satellite) data distinction for sorting.

Here we have the following parts.

• The underlying set for the keys of the Vertices set is the integers.
• The underlying set for the satellite data is Color.
• The Vertices set = {(2,Blue),(4,Blue),(5,Red),(7,Green),(6,Red),(3,Yellow)}
• The Edge set = {(2,4),(4,5),(5,7),(7,6),(6,2),(4,3),(3,7)}
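
The key and satellite split can be sketched with a finite map from keys to labels, while the edges mention keys only. This assumes `Data.Map` from the containers package; `labels`, `labeledEdges`, `labelOf`, and `withColor` are illustrative names.

```haskell
import qualified Data.Map as Map

data Color = Blue | Red | Yellow | Green deriving (Eq, Show)

-- Vertex key -> satellite data (the label).
labels :: Map.Map Int Color
labels = Map.fromList [(2,Blue),(4,Blue),(5,Red),(7,Green),(6,Red),(3,Yellow)]

-- Edges refer to vertices by key only.
labeledEdges :: [(Int, Int)]
labeledEdges = [(2,4),(4,5),(5,7),(7,6),(6,2),(4,3),(3,7)]

-- Label of a vertex, or Nothing if the key is not a vertex.
labelOf :: Int -> Maybe Color
labelOf k = Map.lookup k labels

-- All vertex keys carrying a given label, in ascending key order.
withColor :: Color -> [Int]
withColor c = [ k | (k, c') <- Map.toList labels, c' == c ]
```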

• Cyclic Graphs.

A cyclic graph is a directed graph with at least one cycle. A cycle is a path along the directed edges from a vertex to itself. The vertex labeled graph above has several cycles. One of them is 2 » 4 » 5 » 7 » 6 » 2.
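
The definition can be checked mechanically. The sketch below reads the edge set of the vertex labeled example as directed edges (an assumption, since the figure is not reproduced here) and tests whether a list of vertices forms a cycle. The names `cycEdges`, `isPath`, and `isCycle` are illustrative.

```haskell
-- Edge set of the vertex labeled example, read as directed edges.
cycEdges :: [(Int, Int)]
cycEdges = [(2,4),(4,5),(5,7),(7,6),(6,2),(4,3),(3,7)]

-- A path is a list of vertices in which each consecutive pair is an edge.
isPath :: Eq v => [(v, v)] -> [v] -> Bool
isPath es vs = and [ (u, v) `elem` es | (u, v) <- zip vs (drop 1 vs) ]

-- A cycle is a path with at least one edge that ends where it started.
isCycle :: Eq v => [(v, v)] -> [v] -> Bool
isCycle es vs@(v0 : _ : _) = v0 == last vs && isPath es vs
isCycle _  _               = False
```

Under that reading, `isCycle cycEdges [2,4,5,7,6,2]` holds, as does the check for the cycle through 3, `[2,4,3,7,6,2]`.
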
• Edge labeled Graphs.

An edge labeled graph is a graph where the edges are associated with labels. One can indicate this by making the Edge set be a set of triples. Thus if (u,v,X) is in the Edge set, then there is an edge from u to v with label X.

Edge labeled graphs are usually drawn with the labels drawn adjacent to the arcs specifying the edges.

Here we have the following parts.

• The underlying set for the Vertices set is Color.
• The underlying set for the edge labels is sets of Color.
• The Vertices set = {Red,Green,Blue,White}
• The Edge set = {(Red,White,{White,Green}) ,(White,Red,{Blue}) ,(White,Blue,{Green,Red}) ,(Red,Blue,{Blue}) ,(Green,Red,{Red,Blue,White}) ,(Blue,Green,{White,Green,Red})}

• Weighted Graphs.

A weighted graph is an edge labeled graph where the labels can be operated on by the usual arithmetic operators, and compared using less than and greater than. In Haskell we'd say the edge labels are in the Num class (and in the Ord class, for the comparisons). Usually they are integers or floats. The idea is that some edges may be more (or less) expensive, and this cost is represented by the edge label, or weight. In the graph below, which is an undirected graph, the weights are drawn adjacent to the edges and appear in dark purple.

Here we have the following parts.

• The underlying set for the Vertices set is Integer.
• The underlying set for the weights is Integer.
• The Vertices set = {1,2,3,4,5}
• The Edge set = {(1,4,5) ,(4,5,58) ,(3,5,34) ,(2,4,5) ,(2,5,4) ,(3,2,14) ,(1,2,2)}
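
To illustrate what the arithmetic on weights buys us, here is a sketch that adds up the cost of a path in the weighted example. The triples are taken from the Edge set above; `weightedEdges`, `weight`, and `pathCost` are illustrative names.

```haskell
-- Weighted, undirected edge set from the example: (u, v, weight).
weightedEdges :: [(Int, Int, Int)]
weightedEdges = [(1,4,5),(4,5,58),(3,5,34),(2,4,5),(2,5,4),(3,2,14),(1,2,2)]

-- Weight of the edge between u and v, in either orientation, if present.
weight :: [(Int, Int, Int)] -> Int -> Int -> Maybe Int
weight es u v =
  case [ w | (a, b, w) <- es, (a, b) == (u, v) || (a, b) == (v, u) ] of
    (w : _) -> Just w
    []      -> Nothing

-- Total cost of a path: the sum of its edge weights,
-- or Nothing if some consecutive pair is not an edge.
pathCost :: [(Int, Int, Int)] -> [Int] -> Maybe Int
pathCost es vs = sum <$> mapM (uncurry (weight es)) (zip vs (drop 1 vs))
```

For instance, the path 1, 2, 5 costs 2 + 4 = 6, while the path 1, 4, 5 costs 5 + 58 = 63, so the two routes from 1 to 5 differ greatly in cost.
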
• Directed Acyclic Graphs.

A DAG (directed acyclic graph) is a directed graph without cycles. They appear as special cases in CS applications all the time.

Here we have the following parts.

• The underlying set for the Vertices set is Integer.
• The Vertices set = {1,2,3,4,5,6,7,8}
• The Edge set = {(1,7) ,(2,6) ,(3,1),(3,5) ,(4,6) ,(5,4),(5,2) ,(6,8) ,(7,2),(7,8)}
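
One way to check that a directed graph really has no cycles is to repeatedly remove a vertex that has no incoming edge. The sketch below does this on the example, assuming every edge mentions only listed vertices. It is a simple quadratic version written for clarity, not an efficient implementation, and `dagVertices`, `dagEdges`, and `topoOrder` are illustrative names.

```haskell
dagVertices :: [Int]
dagVertices = [1..8]

dagEdges :: [(Int, Int)]
dagEdges = [(1,7),(2,6),(3,1),(3,5),(4,6),(5,4),(5,2),(6,8),(7,2),(7,8)]

-- Repeatedly remove a vertex with no incoming edge, together with its
-- outgoing edges. If every vertex can be removed the graph is acyclic,
-- and the removal order lists each vertex before all of its successors.
-- If at some point every remaining vertex has an incoming edge,
-- the remaining vertices contain a cycle.
topoOrder :: Eq v => [v] -> [(v, v)] -> Maybe [v]
topoOrder [] _  = Just []
topoOrder vs es =
  case [ v | v <- vs, v `notElem` map snd es ] of
    []      -> Nothing
    (s : _) -> (s :) <$> topoOrder (filter (/= s) vs)
                                   [ e | e@(u, _) <- es, u /= s ]
```

On the example this yields an ordering that starts at vertex 3, the only vertex with no incoming arrows. Adding an edge such as (8,3) would make the result `Nothing`.
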
• Disconnected Graphs

Vertices in a graph do not need to be connected to other vertices. It is legal for a graph to have disconnected components, and even lone vertices without a single connection.

Vertices (like 5, 7, and 8) with only in-arrows are called sinks. Vertices with only out-arrows (like 3 and 4) are called sources. Vertex 2 is a lone vertex: it has no arrows at all.

Here we have the following parts.

• The underlying set for the Vertices set is Integer.
• The Vertices set = {1,2,3,4,5,6,7,8}
• The Edge set = {(1,7) ,(3,1),(3,8) ,(4,6) ,(6,5)}
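
Sources, sinks, and lone vertices can be computed straight from the two sets. A minimal sketch, with illustrative names throughout:

```haskell
discVertices :: [Int]
discVertices = [1..8]

discEdges :: [(Int, Int)]
discEdges = [(1,7),(3,1),(3,8),(4,6),(6,5)]

-- Does v have an incoming (respectively outgoing) arrow?
hasIn, hasOut :: Int -> Bool
hasIn  v = v `elem` map snd discEdges
hasOut v = v `elem` map fst discEdges

-- Sinks have only in-arrows, sources only out-arrows,
-- and isolated vertices have no arrows at all.
sinks, sources, isolated :: [Int]
sinks    = [ v | v <- discVertices, hasIn v, not (hasOut v) ]
sources  = [ v | v <- discVertices, hasOut v, not (hasIn v) ]
isolated = [ v | v <- discVertices, not (hasIn v), not (hasOut v) ]
```

For this graph the sketch gives sinks 5, 7, 8, sources 3, 4, and the single isolated vertex 2.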

Representing graphs in a computer

Graphs are often used to represent physical entities (a network of roads, the relationships between people, etc.) inside a computer. There are numerous mechanisms used. A good choice of mechanism depends upon the operations that the computer program needs to perform on the graph to achieve its needs. Possible operations include:

• Compute a list of all vertices.
• Compute a list of all edges.
• For each vertex, u, compute a list of edges (u,v). This is often called the adjacency function.
• If the graph is labeled (either vertex labeled or edge labeled), compute the label for each vertex (or edge).

Not all programs will need all of these operations, so for some programs an efficient representation that supports only the operations needed (but not the others) will suffice.

• Graphs as sets.

One way to represent graphs would be to directly store the Vertices set and the Edge set. This can make it difficult to compute adjacency information for a particular vertex quickly, so this representation is not used too often.
• Graphs as adjacency information.

Most programs need to compute all the vertices adjacent to a given vertex. This corresponds to finding a 1-step path in the graph. In fact, for many programs this is the only operation needed, so data structures that support this operation quickly and efficiently are often used. Possible choices include arrays, balanced trees, hash tables, etc.
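
As one hedged illustration of the balanced tree option, adjacency information can be kept in a `Data.Map` (a balanced tree from the containers package) built once from an edge list. The names `AdjMap`, `fromUndirected`, and `neighbours` are assumptions made for this sketch.

```haskell
import qualified Data.Map as Map

-- Each vertex maps to the list of vertices adjacent to it.
type AdjMap v = Map.Map v [v]

-- Build adjacency lists from an undirected edge list by recording
-- every edge in both directions.
fromUndirected :: Ord v => [(v, v)] -> AdjMap v
fromUndirected es =
  Map.fromListWith (++)
    ([ (u, [v]) | (u, v) <- es ] ++ [ (v, [u]) | (u, v) <- es ])

-- The 1-step paths out of u. A vertex with no entry has no neighbours.
neighbours :: Ord v => AdjMap v -> v -> [v]
neighbours g u = Map.findWithDefault [] u g
```

A lookup then costs time logarithmic in the number of vertices, instead of a scan over the whole Edge set. The order of the returned neighbours depends on how the map was built and should not be relied upon.
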
1. Graphs as functions.

One useful abstraction is to think of the adjacency information as a function. Under this abstraction a graph is nothing more than a function.
```
type Graph vertex = vertex -> [vertex]
```
For example the undirected graph below:

can be represented as the function.

```
graph1 :: Graph Int
graph1 6 = [4]
graph1 5 = [1,2,4]
graph1 4 = [3,5,6]
graph1 3 = [4,2]
graph1 2 = [1,3,5]
graph1 1 = [2,5]
graph1 _ = []
```
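
A short sketch of working with this representation. It repeats `Graph` and `graph1` so that it stands alone. `edgesOf` and `isUndirected` are illustrative helpers, and both need the list of vertices passed in separately, because the function itself cannot tell us which arguments are vertices.

```haskell
type Graph vertex = vertex -> [vertex]

graph1 :: Graph Int
graph1 6 = [4]
graph1 5 = [1,2,4]
graph1 4 = [3,5,6]
graph1 3 = [4,2]
graph1 2 = [1,3,5]
graph1 1 = [2,5]
graph1 _ = []

-- Recover the pairs (u,v) with v in the adjacency list of u.
edgesOf :: [v] -> Graph v -> [(v, v)]
edgesOf vs g = [ (u, v) | u <- vs, v <- g u ]

-- An undirected graph lists every edge from both endpoints.
isUndirected :: Eq v => [v] -> Graph v -> Bool
isUndirected vs g = and [ u `elem` g v | u <- vs, v <- g u ]
```

Here `isUndirected [1..6] graph1` holds, and `edgesOf [1..6] graph1` has 14 entries, two for each of the 7 undirected edges.
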
This mechanism can be extended to a wide variety of graph types by slightly altering or enhancing the kind of function that represents the graph. Here are a few examples.

• Directed graph.

```
type Dgraph vertex = vertex -> [vertex]
```
The representation is the same as for an undirected graph, but the interpretation is different. In an undirected graph, f, with edge (2,3), we would have both
```
f 2  --->  [3, ... ]
f 3  --->  [2, ... ]
```
but in a directed graph we would have only the first of the results. Consider the directed graph below:

We could represent this as a Dgraph as follows:

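```
data Node = A | B | C | D | E

-- One equation per vertex; every constructor of Node is covered,
-- so no catch-all case is needed.
graph2 :: Dgraph Node
graph2 A = [B]
graph2 B = [C,D,E]
graph2 C = []          -- C has no outgoing arrows
graph2 D = [B,C]
graph2 E = [D]
```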
• Vertex labeled graph.

```
type VLgraph label vertex = vertex -> ([vertex],label)
```
Here the function not only returns the adjacency list for a vertex but also the label. For example:

```
data Color = Blue | Red | Yellow | Green

graph4 :: VLgraph Color Int
graph4 2 = ([4],Blue)
graph4 3 = ([7],Yellow)
graph4 4 = ([3,5],Blue)
graph4 5 = ([7],Red)
graph4 6 = ([2],Red)
graph4 7 = ([6],Green)
graph4 _ = ([],undefined)   -- not a vertex: no neighbors, and no label to return
```
• Edge labeled graph.

```
type ELgraph label vertex = vertex -> [(vertex,label)]
```
Here, each element of the adjacency list is a tuple: the adjacent vertex, and the label of the edge to that vertex.
```
graph6 :: ELgraph Int Int
graph6 1 = [(4,5),(2,2)]
graph6 2 = [(1,2),(4,5),(3,14),(5,4)]
graph6 3 = [(2,14),(5,34)]
graph6 4 = [(1,5),(2,5),(5,58)]
graph6 5 = [(2,4),(3,34),(4,58)]
graph6 _ = []
```
• DAG.

Here we use the plain adjacency function, read as a directed graph, but the data must satisfy an invariant ensuring that there are no cycles.
```
graph7 :: Graph Int
graph7 1 = [7]
graph7 2 = [6]
graph7 3 = [1,5]
graph7 4 = [6]
graph7 5 = [2,4]
graph7 6 = [8]
graph7 7 = [2,8]
graph7 8 = []
graph7 _ = []
```

Advantages of representing graphs as functions

• Simple and easy to understand
• Adapts easily to different kinds of graphs

Disadvantages of using graphs as functions

• Cannot be extended to accommodate queries about the set of Vertices or the set of Edges.
• Depending upon the compiler, the compiled functions may not be very efficient. In fact the worst case time for a lookup could be proportional to the number of vertices.
• The graph must be known statically at compile time.

• Graphs as arrays of adjacent vertices.

One mechanism that can ameliorate the disadvantages of using functions as a way to represent graphs is to use arrays instead. Using this mechanism requires that the underlying domain of Vertices be some type that can be used as indexes into an array.

In the rest of this note we will assume that Vertices are of type Int, and that the Vertices set is a finite range of the type Int. Thus a graph can be represented as follows:

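```
type ArrGraph = Array [Int]
```
We can now answer a number of queries about graphs quickly and efficiently.
```
-- ArrGraph generalized over the type i stored in the adjacency lists.
type ArrGraph i = Array [i]

vertices :: ArrGraph i -> IO [Int]
edges    :: ArrGraph i -> IO [(Int,i)]
children :: ArrGraph i -> Int -> IO [i]

-- Every index between the array bounds is a vertex.
vertices g =
  do { (lo,hi) <- boundsArr g
     ; return [lo..hi] }

-- Pair each index with every entry of its adjacency list.
edges g =
  do { (lo,hi) <- boundsArr g
     ; ees <- toListArr g
     ; return [ (i,j) | (i,cs) <- zip [lo..hi] ees, j <- cs ] }

-- The adjacency list stored at index node.
children g node = readArr g node
```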
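
The operations `boundsArr`, `toListArr`, and `readArr` used above are not part of the standard `Data.Array` module. For comparison only, here is a sketch of the same three queries written against the standard immutable `Data.Array` (from the array package). This is a different array type from the one these notes use, and the names `AGraph`, `buildG`, `verticesA`, `edgesA`, and `childrenA` are illustrative.

```haskell
import Data.Array

-- Index = vertex, element = list of adjacent vertices.
type AGraph = Array Int [Int]

-- Build the array at run-time from the vertex bounds and a directed
-- edge list; each edge (u,v) adds v to the list stored at index u.
buildG :: (Int, Int) -> [(Int, Int)] -> AGraph
buildG bnds es = accumArray (flip (:)) [] bnds es

verticesA :: AGraph -> [Int]
verticesA = indices

edgesA :: AGraph -> [(Int, Int)]
edgesA g = [ (u, v) | (u, vs) <- assocs g, v <- vs ]

-- Constant-time lookup of the adjacency list of u.
childrenA :: AGraph -> Int -> [Int]
childrenA g u = g ! u
```

Because these arrays are immutable, the lookups are ordinary functions, but changing a single edge means building a new array. That is a different trade-off from the one described for the arrays in this note.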

Advantages of representing graphs as arrays

• Simple and easy to understand
• Efficient access
• Graphs can be constructed at run-time
• Adapts easily to different kinds of graphs
```
type VLArrGraph label = Array ([Int],label)  -- Vertex labeled graphs
type ELArrGraph label = Array [(Int,label)]  -- Edge labeled graphs
```

Disadvantages of representing graphs as arrays

• Requires that graph access be a command (an action in the IO monad, as in the types above) rather than a pure computation.
• The domain of Vertices must be a type that can be used as an index into an array.

Algorithms on Graphs

There are many, many algorithms on graphs. In this note we will look at a few of them. They include:

• Searching Graphs
• Detecting Cycles in Graphs
• Shortest Path algorithms
See the code for some examples.
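
As one small illustration of searching, here is a sketch of depth-first search over the function representation. It repeats `Graph` and the DAG `graph7` so that it stands alone. `dfs` is an illustrative name, and keeping the visited vertices in a list is chosen for simplicity rather than speed.

```haskell
type Graph vertex = vertex -> [vertex]

graph7 :: Graph Int
graph7 1 = [7]
graph7 2 = [6]
graph7 3 = [1,5]
graph7 4 = [6]
graph7 5 = [2,4]
graph7 6 = [8]
graph7 7 = [2,8]
graph7 _ = []

-- Depth-first search: every vertex reachable from start, in the order
-- first visited. The first argument of go is a stack of vertices still
-- to explore. Skipping visited vertices makes the search terminate
-- on cyclic graphs as well.
dfs :: Eq v => Graph v -> v -> [v]
dfs g start = reverse (go [start] [])
  where
    go []       visited = visited
    go (x : xs) visited
      | x `elem` visited = go xs visited
      | otherwise        = go (g x ++ xs) (x : visited)
```

Starting from the source vertex 3, the search reaches all eight vertices of the DAG, while starting from 6 it reaches only 6 and 8.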