Graphs in Computer Science
Introduction
Graphs are mathematical structures that have found many uses
in computer science. They come in many different flavors, many of
which appear in computer programs. Some flavors are:
- Simple graphs
- Undirected or directed graphs
- Cyclic or acyclic graphs
- Labeled graphs
- Weighted graphs
- Infinite graphs
- ... and many more too numerous to mention.
Most graphs are defined as a slight alteration of the following
rules.
- A graph is made up of two sets called Vertices and Edges.
- The Vertices are drawn from some underlying type, and the set may be finite or infinite.
- Each element of the Edge set is a pair consisting of two elements
from the Vertices set.
- Graphs are often depicted visually, by drawing the elements
of the Vertices set as boxes or circles, and drawing the elements of the
edge set as lines or arcs between the boxes or circles. There is an arc
between v1 and v2 if (v1,v2) is an element of the Edge set.
Adjacency. If (u,v) is in the edge set we say u is adjacent to v
(which we sometimes write as u ~ v).
For example, the graph drawn below has the following parts.
- The underlying set for the Vertices set is the integers.
- The Vertices set = {1,2,3,4,5,6}
- The Edge set = {(6,4),(4,5),(4,3),(3,2),(5,2),(2,1),(5,1)}
Kinds of Graphs
Various flavors of graphs have the following specializations
and particulars about how they are usually drawn.
- Undirected Graphs.
In an undirected graph, the order of the vertices in the pairs in the Edge set
doesn't matter. Thus, for the sample
graph above, we could equally have written the Edge set as {(4,6),(4,5),(3,4),(3,2),(2,5),(1,2),(1,5)}.
Undirected graphs are usually drawn with straight lines between
the vertices.
The adjacency relation is symmetric in an undirected graph, so if
u ~ v then it is also the case that v ~ u.
- Directed Graphs.
In a directed graph, the order
of the vertices in the pairs in the Edge set matters. Thus
u is adjacent to v only if the pair (u,v) is in the Edge set.
For directed graphs we usually use arrows for the arcs between
vertices. An arrow from u to v is drawn only if (u,v) is in the Edge set.
The directed graph below has the following parts.
- The underlying set for the Vertices set is capital letters.
- The Vertices set = {A,B,C,D,E}
- The Edge set = {(A,B),(B,C),(D,C),(B,D),(D,B),(E,D),(B,E)}
Note that both (B,D) and (D,B) are in the Edge set, so the
arc between B and D is an arrow in both directions.
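The difference between the two adjacency relations can be sketched in Haskell. This is only an illustration: the names Edge, directedEdges, adjacentD, and adjacentU are assumptions introduced here, and a list stands in for the Edge set.

```haskell
-- An edge is a pair of vertices.
type Edge v = (v, v)

-- Edge set of the directed example graph above (vertices as characters).
directedEdges :: [Edge Char]
directedEdges =
  [('A','B'),('B','C'),('D','C'),('B','D'),('D','B'),('E','D'),('B','E')]

-- Directed adjacency: u ~ v only when the pair (u,v) itself is present.
adjacentD :: Eq v => [Edge v] -> v -> v -> Bool
adjacentD es u v = (u, v) `elem` es

-- Undirected adjacency: the order inside the pair is ignored.
adjacentU :: Eq v => [Edge v] -> v -> v -> Bool
adjacentU es u v = (u, v) `elem` es || (v, u) `elem` es
```

Read as a directed graph, A is adjacent to B but B is not adjacent to A. Read as an undirected graph, the same edge list makes the relation hold both ways.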
- Vertex labeled Graphs.
In a vertex labeled graph, each vertex
is labeled with some data in addition to the data that identifies
the vertex. Only the identifying data is present in the
pairs in the Edge set. This is similar to the (key,satellite) data
distinction for sorting.
Here we have the following parts.
- The underlying set for the keys of the Vertices set is the integers.
- The underlying set for the satellite data is Color.
- The Vertices set = {(2,Blue),(4,Blue),(5,Red),(7,Green),(6,Red),(3,Yellow)}
- The Edge set = {(2,4),(4,5),(5,7),(7,6),(6,2),(4,3),(3,7)}
- Cyclic Graphs.
A cyclic graph is a directed graph
with at least one cycle. A cycle is a path of one or more directed edges from a vertex back to itself.
The vertex labeled graph above has several cycles.
One of them is 2 » 4 » 5 » 7 » 6 » 2
- Edge labeled Graphs.
An edge labeled graph is a
graph where the edges are associated with labels. One can indicate this by
making the Edge set a set of triples. Thus if (u,v,X) is in
the Edge set, then there is an edge from u to v with label X.
Edge labeled graphs are usually drawn with the labels adjacent to
the arcs specifying the edges.
Here we have the following parts.
- The underlying set for the Vertices set is Color.
- The underlying set for the edge labels is sets of Color.
- The Vertices set = {Red,Green,Blue,White}
- The Edge set = {(Red,White,{White,Green})
,(White,Red,{Blue})
,(White,Blue,{Green,Red})
,(Red,Blue,{Blue})
,(Green,Red,{Red,Blue,White})
,(Blue,Green,{White,Green,Red})}
- Weighted Graphs.
A weighted graph is an edge
labeled graph where the labels can be operated on by the
usual arithmetic operators and compared using
less than and greater than. In Haskell we'd say the
edge labels are in the Num and Ord classes. Usually they are integers
or floats. The idea is that some edges may be more (or
less) expensive, and this cost is represented by the edge
label, or weight. In the graph below, which is an
undirected graph, the weights are drawn adjacent to the
edges and appear in dark purple.
Here we have the following parts.
- The underlying set for the Vertices set is Integer.
- The underlying set for the weights is Integer.
- The Vertices set = {1,2,3,4,5}
- The Edge set = {(1,4,5)
,(4,5,58)
,(3,5,34)
,(2,4,5)
,(2,5,4)
,(3,2,14)
,(1,2,2)}
- Directed Acyclic Graphs.
A DAG is a directed graph
without cycles. DAGs appear as special cases in CS applications all the time.
Here we have the following parts.
- The underlying set for the Vertices set is Integer.
- The Vertices set = {1,2,3,4,5,6,7,8}
- The Edge set = {(1,7)
,(2,6)
,(3,1),(3,5)
,(4,6)
,(5,4),(5,2)
,(6,8)
,(7,2),(7,8)}
- Disconnected Graphs.
Vertices in a graph do not need to be connected to other vertices.
It is legal for a graph to have disconnected components, and even lone vertices
without a single connection (like 2 in the graph below).
Vertices with only in-arrows (like 5, 7, and 8) are called sinks.
Vertices with only out-arrows (like 3 and 4) are called sources.
Here we have the following parts.
- The underlying set for the Vertices set is Integer.
- The Vertices set = {1,2,3,4,5,6,7,8}
- The Edge set = {(1,7)
,(3,1),(3,8)
,(4,6)
,(6,5)}
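As a small check of the definitions of sinks and sources, the following sketch computes them from the two sets listed above. The names vertexSet, edgeSet, hasOut, hasIn, sinks, sources, and lone are assumptions made for this illustration, and lists stand in for sets.

```haskell
-- The disconnected example graph above, stored directly as its two sets.
vertexSet :: [Int]
vertexSet = [1 .. 8]

edgeSet :: [(Int, Int)]
edgeSet = [(1,7),(3,1),(3,8),(4,6),(6,5)]

-- Vertices with at least one outgoing (respectively incoming) edge.
hasOut, hasIn :: [Int]
hasOut = map fst edgeSet
hasIn  = map snd edgeSet

-- Sinks have only in-arrows, sources have only out-arrows,
-- and lone vertices have no arrows at all.
sinks, sources, lone :: [Int]
sinks   = [v | v <- vertexSet, v `elem` hasIn,  v `notElem` hasOut]
sources = [v | v <- vertexSet, v `elem` hasOut, v `notElem` hasIn]
lone    = [v | v <- vertexSet, v `notElem` hasIn, v `notElem` hasOut]
```

For this graph the sinks are 5, 7, and 8, the sources are 3 and 4, and vertex 2 is the only lone vertex.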
Representing graphs in a computer
Graphs are often used to represent physical entities
(a network of roads, the relationships between people, etc.) inside
a computer. There are numerous mechanisms used. A good choice of
mechanism depends upon the operations that the computer program needs to perform on the graph.
Possible operations include:
- Compute a list of all vertices.
- Compute a list of all edges.
- For each vertex, u, compute a list of edges (u,v).
This is often called the adjacency function.
- If the graph is labeled (either vertex labeled or edge labeled),
compute the label for each vertex (or edge).
Not all programs will need all of these operations, so for some
programs an efficient representation that supports only the
operations needed (but not the others) will suffice.
- Graphs as sets.
One way to represent graphs
would be to directly store the Vertices set and the Edge set. This
can make it difficult to compute adjacency
information for a particular vertex quickly, since every edge may have to be examined, so this representation
is not used too often.
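A minimal sketch of this representation shows where the cost comes from. The record SetGraph, its fields, and the function neighbours are hypothetical names chosen for this illustration, and lists stand in for sets.

```haskell
-- A graph stored directly as its two sets.
data SetGraph v = SetGraph
  { vertexList :: [v]
  , edgeList   :: [(v, v)]
  }

-- The undirected sample graph from the introduction.
sample :: SetGraph Int
sample = SetGraph
  { vertexList = [1,2,3,4,5,6]
  , edgeList   = [(6,4),(4,5),(4,3),(3,2),(5,2),(2,1),(5,1)]
  }

-- Listing all vertices or all edges is immediate, but finding the
-- neighbours of one vertex has to scan the whole edge list.
-- Both components of each pair are checked because the graph is undirected.
neighbours :: Eq v => SetGraph v -> v -> [v]
neighbours g u =
  [v | (x, v) <- edgeList g, x == u] ++ [x | (x, v) <- edgeList g, v == u]
```

Each call to neighbours takes time proportional to the number of edges, which is why representations organized around adjacency are usually preferred.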
- Graphs as adjacency information.
Most programs need
to compute all the vertices adjacent to a given vertex. This
corresponds to finding a 1-step path in the graph. In fact, for
many programs this is the only operation needed, so data structures
that support this operation quickly and efficiently are often used.
Possible choices include arrays, balanced trees, hash tables, etc.
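As one example of the balanced tree option, the sketch below keeps the adjacency information in a Data.Map from the containers package. The names AdjMap, fromEdges, and adjacentTo are assumptions made for this illustration.

```haskell
import qualified Data.Map.Strict as Map

-- Adjacency information kept in a balanced tree keyed by vertex.
type AdjMap v = Map.Map v [v]

-- Build the map from a directed edge list: each edge (u,v) adds v to the
-- list stored for u. Using flip (++) keeps the edges in their input order.
fromEdges :: Ord v => [(v, v)] -> AdjMap v
fromEdges es = Map.fromListWith (flip (++)) [(u, [v]) | (u, v) <- es]

-- Vertices adjacent to u; a vertex with no entry has no neighbours.
adjacentTo :: Ord v => AdjMap v -> v -> [v]
adjacentTo g u = Map.findWithDefault [] u g
```

With this structure the 1-step query is a single map lookup, logarithmic in the number of vertices that have outgoing edges.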
- Graphs as functions.
One useful abstraction is to
think of the adjacency information as a function. Under this abstraction
a graph is nothing more than a function.
type Graph vertex = vertex -> [vertex]
For example, the undirected graph below
can be represented as the function
graph1:: Graph Int
graph1 6 = [4]
graph1 5 = [1,2,4]
graph1 4 = [3,5,6]
graph1 3 = [4,2]
graph1 2 = [1,3,5]
graph1 1 = [2,5]
graph1 _ = []
This mechanism can be extended to a wide variety of graph types
by slightly altering or enhancing the kind of function that represents
the graph. Here are a few examples.
- Directed graph.
type Dgraph vertex = vertex -> [vertex]
The representation is the same as for an undirected graph,
but the interpretation is different. In an undirected
graph, f, with edge (2,3), we would have both
f 2 ---> [3, ... ]
f 3 ---> [2, ... ]
but in a directed graph we would have only the first of the results.
Consider the directed graph below:
We could represent this as a Dgraph as follows:
data Node = A | B | C | D | E deriving (Eq, Show)
graph2:: Dgraph Node
-- Every constructor of Node is covered, so no catch-all case is needed.
graph2 A = [B]
graph2 B = [C,D,E]
graph2 C = []
graph2 D = [B,C]
graph2 E = [D]
- Vertex labeled graph.
type VLgraph label vertex = vertex -> ([vertex],label)
Here the function not only returns the adjacency list for a vertex but also
the label. For example:
data Color = Blue | Red | Yellow | Green
graph4:: VLgraph Color Int
graph4 2 = ([4],Blue)
graph4 3 = ([7],Yellow)
graph4 4 = ([3,5],Blue)
graph4 5 = ([7],Red)
graph4 6 = ([2],Red)
graph4 7 = ([6],Green)
graph4 _ = ([],undefined)
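The catch-all case of graph4 returns undefined as the label, which fails only when that label is inspected. One possible alternative, sketched here as an assumption rather than as the approach of these notes, is to report unknown vertices with Maybe. The names VLgraphM, graph4M, and labelOf are hypothetical.

```haskell
data Color = Blue | Red | Yellow | Green
  deriving (Eq, Show)

-- Variant of VLgraph in which unknown vertices are reported explicitly.
type VLgraphM label vertex = vertex -> Maybe ([vertex], label)

graph4M :: VLgraphM Color Int
graph4M 2 = Just ([4], Blue)
graph4M 3 = Just ([7], Yellow)
graph4M 4 = Just ([3,5], Blue)
graph4M 5 = Just ([7], Red)
graph4M 6 = Just ([2], Red)
graph4M 7 = Just ([6], Green)
graph4M _ = Nothing

-- Label of a vertex, if the vertex belongs to the graph.
labelOf :: VLgraphM label vertex -> vertex -> Maybe label
labelOf g v = fmap snd (g v)
```

The trade-off is that every caller must now handle the Nothing case, in exchange for never meeting an undefined label at run time.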
- Edge labeled graph.
type ELgraph label vertex = vertex -> [(vertex,label)]
Here each entry of the adjacency list is a tuple containing the adjacent
vertex and the label of the edge to that vertex.
graph6:: ELgraph Int Int
graph6 1 = [(4,5),(2,2)]
graph6 2 = [(1,2),(4,5),(3,14),(5,4)]
graph6 3 = [(2,14),(5,34)]
graph6 4 = [(1,5),(2,5),(5,58)]
graph6 5 = [(2,4),(3,34),(4,58)]
graph6 _ = []
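Because the labels of graph6 are numbers, they can be added up along a path. The sketch below repeats graph6 so that it stands alone; the helper names edgeWeight and pathCost are assumptions introduced for this illustration.

```haskell
type ELgraph label vertex = vertex -> [(vertex, label)]

-- The weighted example graph, as given above.
graph6 :: ELgraph Int Int
graph6 1 = [(4,5),(2,2)]
graph6 2 = [(1,2),(4,5),(3,14),(5,4)]
graph6 3 = [(2,14),(5,34)]
graph6 4 = [(1,5),(2,5),(5,58)]
graph6 5 = [(2,4),(3,34),(4,58)]
graph6 _ = []

-- Weight of the edge from u to v, if there is one.
edgeWeight :: Eq vertex => ELgraph label vertex -> vertex -> vertex -> Maybe label
edgeWeight g u v = lookup v (g u)

-- Total weight of a path given as a list of vertices; Nothing when two
-- consecutive vertices are not joined by an edge.
pathCost :: (Eq vertex, Num label) => ELgraph label vertex -> [vertex] -> Maybe label
pathCost g vs = fmap sum (sequence (zipWith (edgeWeight g) vs (drop 1 vs)))
```

For example, the path 1, 2, 5 costs 2 + 4 = 6, while the path 1, 4, 5 costs 5 + 58 = 63, so the first route from 1 to 5 is the cheaper one.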
- DAG.
Here we have a simple graph, but the data must meet some invariants
ensuring no cycles.
graph7:: Graph Int
graph7 1 = [7]
graph7 2 = [6]
graph7 3 = [1,5]
graph7 4 = [6]
graph7 5 = [2,4]
graph7 6 = [8]
graph7 7 = [2,8]
graph7 8 = []
graph7 _ = []
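The type Graph Int does not enforce the no-cycles invariant, so it has to be checked separately. The following is one possible check, a sketch under the assumption that the caller supplies the list of vertices; onCycle and isAcyclic are hypothetical names.

```haskell
type Graph vertex = vertex -> [vertex]

-- The DAG example, as given above.
graph7 :: Graph Int
graph7 1 = [7]
graph7 2 = [6]
graph7 3 = [1,5]
graph7 4 = [6]
graph7 5 = [2,4]
graph7 6 = [8]
graph7 7 = [2,8]
graph7 _ = []

-- True when some path of one or more edges leads from start back to itself.
-- The list seen holds the vertices already expanded, so the search
-- terminates even when the graph contains cycles.
onCycle :: Eq v => Graph v -> v -> Bool
onCycle g start = go [] (g start)
  where
    go _ [] = False
    go seen (v : rest)
      | v == start    = True
      | v `elem` seen = go seen rest
      | otherwise     = go (v : seen) (g v ++ rest)

-- The function alone cannot enumerate its vertices, so they are passed in.
isAcyclic :: Eq v => Graph v -> [v] -> Bool
isAcyclic g vs = not (any (onCycle g) vs)
```

Here isAcyclic graph7 [1..8] confirms the invariant for the example. The membership tests on a list make this quadratic in the worst case, which is acceptable for a sketch but not for large graphs.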
Advantages of representing graphs as functions
- Simple and easy to understand.
- Adapts easily to different kinds of graphs.
Disadvantages of representing graphs as functions
- Cannot be extended to accommodate queries about
the set of Vertices or the set of Edges.
- Depending upon the compiler that compiles the functions, lookup
may not be very efficient. In fact the worst case time could be proportional to the number of vertices.
- The graph must be known statically at compile time.
Graphs as arrays of adjacent vertices.
One mechanism that can ameliorate the disadvantages of using functions
as a way to represent graphs is to use arrays instead. Using this
mechanism requires that the underlying domain of Vertices be some
type that can be used as indexes into an array.
In the rest of this note we will assume that Vertices are of type Int,
and that the Vertices set is a finite range of the type Int.
Thus a graph can be represented as follows:
type ArrGraph = Array [Int]
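We can now answer a number of queries about graphs quickly and efficiently.
-- Array, boundsArr, toListArr, and readArr are assumed to come from the
-- array library used with these notes; they are not part of Data.Array.
-- This generalizes the definition above: i is the type stored in the adjacency lists.
type ArrGraph i = Array [i]
vertices:: ArrGraph i -> IO [Int]
edges:: ArrGraph i -> IO [(Int,i)]
-- The array is indexed by Int, so the vertex argument is an Int.
children:: ArrGraph i -> Int -> IO [i]
vertices g =
  do { (lo,hi) <- boundsArr g
     ; return [lo..hi] }
edges g =
  do { (lo,hi) <- boundsArr g
     ; ees <- toListArr g
     ; return [ (i,j) | (i,cs) <- zip [lo..hi] ees, j <- cs ] }
children g node = readArr g node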
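For comparison, the same three queries can be written against the standard immutable Data.Array from the array package. This is a sketch and not the Array type used in these notes; the names PureGraph, dag, verticesP, edgesP, and childrenP are assumptions.

```haskell
import Data.Array (Array, listArray, bounds, assocs, (!))

-- Pure counterpart of ArrGraph: index v holds the vertices adjacent to v.
type PureGraph = Array Int [Int]

-- The DAG example from earlier in the notes.
dag :: PureGraph
dag = listArray (1, 8) [[7],[6],[1,5],[6],[2,4],[8],[2,8],[]]

-- All vertices: the index range of the array.
verticesP :: PureGraph -> [Int]
verticesP g = [lo .. hi]
  where (lo, hi) = bounds g

-- All edges: pair each index with every entry of its adjacency list.
edgesP :: PureGraph -> [(Int, Int)]
edgesP g = [(u, v) | (u, vs) <- assocs g, v <- vs]

-- Adjacent vertices: a single array lookup.
childrenP :: PureGraph -> Int -> [Int]
childrenP g v = g ! v
```

With an immutable array the queries are ordinary functions rather than IO actions, at the price that changing the graph means building a new array.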
Advantages of representing graphs as arrays
- Supports queries about the set of Vertices and the set of Edges.
- The vertices adjacent to a given vertex are found with a single array lookup.
Disadvantages of representing graphs as arrays
- Requires that graph access be a command (an action in the IO monad) rather than a pure computation.
- The domain of Vertices must be a type that can be used as an index into an array.
Algorithms on Graphs
There are many, many algorithms on graphs. In this note we will
look at a few of them. They include:
- Searching Graphs
- Detecting Cycles in Graphs
- Shortest Path algorithms
See the code for some examples.
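As one illustration of searching a graph, here is a minimal depth-first sketch over the function representation. It is not the course's own code; the name reachable is an assumption, and graph1 is repeated from above so that the example stands alone.

```haskell
type Graph vertex = vertex -> [vertex]

-- The undirected sample graph from the introduction.
graph1 :: Graph Int
graph1 6 = [4]
graph1 5 = [1,2,4]
graph1 4 = [3,5,6]
graph1 3 = [4,2]
graph1 2 = [1,3,5]
graph1 1 = [2,5]
graph1 _ = []

-- Depth-first search: every vertex reachable from start, in visit order.
-- The first argument of go is the list of visited vertices (newest first);
-- the second is the stack of vertices still to be explored.
reachable :: Eq v => Graph v -> v -> [v]
reachable g start = reverse (go [] [start])
  where
    go seen [] = seen
    go seen (v : rest)
      | v `elem` seen = go seen rest
      | otherwise     = go (v : seen) (g v ++ rest)
```

Starting from vertex 6, the search visits 6, 4, 3, 2, 1, 5 in that order, which shows that the sample graph is connected.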