CS163 Data Structures

Week #3 Notes

Stacks and Abstract Data Types

• Chapter 6 - Stacks

• The stack as an example of an Abstract Data Type in Program Development

• ADT Stack Operations

• Implementing Stacks using arrays

• Implementing Stacks using dynamic memory

Queues and Abstract Data Types

• Chapter 7 - Queues

• The Abstract Data Type Queue

• Implementations of the ADT Queue

Implementing ADT's

• Now that you understand what an ADT is, let's talk about the data structures used in the implementation of an ADT!

• First, we will examine implementing an ADT Ordered List. Here, we can insert and retrieve items from the list by their corresponding position (or location) in the list.

Our first thought might be to implement an ADT Ordered List using an array. It seems natural since both the array and an ordered list reference their items by position number. Thus, you can store the ordered list's ith item in items[i]. This array would be implemented as a data member.

If this is the case, we need to consider how much of the array the ordered list will occupy. Since it will probably not fill the entire array, we need to keep track of the array elements that have been assigned to the ordered list...and which ones are available for use in the future. To do this we simply keep track of the number of items in the list, assuming there is a maximum length defined for the list.

Then, each ADT operation should be implemented as a member function. When defining the member functions, think about the arguments you might need. If you implement the ADT using the class construct, consider whether or not you need to pass the array as one of the arguments. What about the length of the list? These considerations should be part of your design...and determined way before you begin implementation!

Therefore, our first attempt at an ordered list declaration might look like:

const int maxlen=100;

class ordered

{

public:

// place the constructor here...to perform all initialization steps

// place the prototypes for the member functions here

private:

int length;

int items[maxlen];

};

Then, in my main(), I can create as many ordered lists as is desired by saying:

ordered list1;

ordered list2, list3; //etc.

• Our design for each of the ADT operations is:

For Initialization:

Initialize the length data member to zero

To find the length of the list:

Return the value of the length data member

To insert an item at a specified position:

First, perform some error checking to make sure that the specified position is valid. Make sure that the specified position is not beyond the maximum length of the list; if it is report an error. Also, make sure that the position requested is not less than zero; if it is report an error.

Next, since we are working with an array data structure, we must consider shifting to the right all of the data items that have positions greater than or equal to the specified position. We only need to shift these data items up by one.

Lastly, we insert the new data item at the specified location. This data item must be specified as an argument to the insert member function. And, we update the length for this particular ordered list by increasing it by one.

To delete an item at a specified position:

First, perform some error checking to make sure that the specified position is valid. Make sure that the position requested is not less than zero; if it is report an error. And, make sure that the specified position represents a data item that has already been inserted; this can be done by comparing the specified position with the current length of the list.

Next, since we are working with an array data structure, we must consider shifting to the left all of the data items that have positions greater than the specified position. We only need to shift these data items down by one. This takes care of the "delete" operation.

Lastly, we update the length for this particular ordered list, by decreasing it by one.

To retrieve an item from the list...at a specified position:

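First, perform the same error checking as for delete to make sure that the specified position represents a data item that has already been inserted. Next, return to the client the data item located at the specified position in the array. Remember...the client shouldn't know that we are using an array to implement our ordered list ADT!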
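The array design above can be sketched as follows. This is only one possible reading of the design, not a definitive implementation: the member function names (get_length, insert, remove, retrieve) and the bool return values are assumptions, positions are taken to be zero-based, and insert is assumed to reject positions that would leave a gap in the list.

```cpp
const int maxlen = 100;

class ordered
{
public:
    ordered() : length(0) {}                    // initialization: the list starts empty

    int get_length() const { return length; }  // hypothetical name for the length operation

    // Insert new_item at position pos. Returns false if pos is out of
    // range or the array is already full.
    bool insert(int pos, int new_item)
    {
        if (pos < 0 || pos > length || length >= maxlen)
            return false;
        for (int i = length; i > pos; --i)      // shift items at pos and beyond up by one
            items[i] = items[i - 1];
        items[pos] = new_item;
        ++length;
        return true;
    }

    // Delete the item at position pos. Returns false if no item is stored there.
    bool remove(int pos)
    {
        if (pos < 0 || pos >= length)
            return false;
        for (int i = pos; i < length - 1; ++i)  // shift later items down by one
            items[i] = items[i + 1];
        --length;
        return true;
    }

    // Copy the item at position pos into item. Returns false if pos is invalid.
    bool retrieve(int pos, int & item) const
    {
        if (pos < 0 || pos >= length)
            return false;
        item = items[pos];
        return true;
    }

private:
    int length;          // number of items currently in the list
    int items[maxlen];   // items[0] through items[length - 1] are in use
};
```

Notice that none of the arguments mention the array; a client only supplies a position and a data item.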

• Alternatively, we could implement this scheme using pointers. This solves the problem of running out of room because we can allocate memory dynamically as we need it. Arrays are really not good for ordered lists since they can't allow the list to exceed the fixed size of the array.

Implementing Abstract Data Types...using dynamic memory

• Now, let's look at an implementation of dynamic memory allocation for an ADT of an Ordered List. We will use the techniques learned in CS162 to link nodes (allocated using new) together, creating a linked list that represents the ADT Ordered List. As the length of the ordered list increases, you only need to use new to allocate new nodes for additional items. Unlike using arrays, using pointers does not impose a fixed maximum length on the size of the list (unless of course you run out of memory!).

• Therefore, our first attempt at an ordered list declaration might look like:

struct node

{

int data;

node *link;

};

class ordered

{

public:

// place the constructor here...to perform all initialization steps

// place the prototypes for the member functions here

private:

int length;

node *head;

node *current;

};

• Now let's look at the design considerations for some of the operations performed on the ordered list:

For Initialization:

Initialize the length data member to zero

and we must initialize our pointers to NULL!

To find the length of the list:

Return the value of the length data member

• Now think about the retrieve, insert, and delete operations. Consider adding another function which returns a pointer to the node at a specified position in the list. This will reduce duplication of code. Why? Because a linked list does not provide direct access to a specified position. Therefore, this new function (let's call it setptr), must traverse the list from its beginning until the specified point is reached. As you will see, this task is common to the implementation of retrieve, insert and delete. Notice performance issues....with arrays we were restricted to a fixed number of elements that can be in our ordered list...but we could very quickly access each of the array elements...without traversing the list!

• Here are the design considerations for setptr:

To traverse to a position in the list:

Goal: To return a pointer to the node at a given position in the ordered list; if the position is invalid, returns NULL.

First, perform some error checking to make sure that the specified position is valid. Make sure that the position requested is not less than zero; if it is return NULL. And, make sure that the specified position represents a data item that has already been inserted; this can be done by comparing the specified position with the current length of the list.

Next, begin at the head of the list and traverse the specified number of times to the next node. Once the ith position is reached, return the pointer to that node.

To retrieve an item from the list...at a specified position:

Use setptr to traverse to the specified position. Return the data item at that location...if the pointer is not NULL.

(This is substantially less efficient than the array implementation...since we can't access the specified element directly.)

To insert an item at a specified position:

If we are going to insert an item at the first location, we need to insert the new item at the beginning of the list (you learned how to do this in CS162!)

Otherwise, use setptr to traverse to the node before the specified position. Insert the new item after this node.

• On your own...write the design considerations for delete.

• Thinking about "walls"...because setptr returns a pointer you would not want any module outside of the ADT to call it; otherwise the "walls" around the ADT would be violated. The key is that your modules should be able to use the ADT without knowledge of the pointers that the implementation uses. Thus, only retrieve, insert, and delete should call setptr....and it should be hidden from the user. In addition, arguments should be selected for retrieve, insert, and delete that do not bind the user to either an array implementation or a linked list implementation!
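A minimal sketch of the linked list version, assuming zero-based positions and the same hypothetical member function names and bool return values as a client might expect from the array version. setptr is private, so it stays behind the "wall". The current data member is left out because this sketch does not need it, a destructor is added so the nodes are released, and delete is deliberately omitted since it is left as an exercise.

```cpp
#include <cstddef>   // NULL

struct node
{
    int data;
    node *link;
};

class ordered
{
public:
    ordered() : length(0), head(NULL) {}

    ~ordered()                        // release every node still in the list
    {
        while (head != NULL)
        {
            node *temp = head;
            head = head->link;
            delete temp;
        }
    }

    int get_length() const { return length; }

    // Copy the item at position pos into item. Returns false if pos is invalid.
    bool retrieve(int pos, int & item) const
    {
        node *ptr = setptr(pos);
        if (ptr == NULL)
            return false;
        item = ptr->data;
        return true;
    }

    // Insert new_item at position pos. Returns false if pos is out of range.
    bool insert(int pos, int new_item)
    {
        if (pos < 0 || pos > length)
            return false;
        node *fresh = new node;
        fresh->data = new_item;
        if (pos == 0)                 // new first node
        {
            fresh->link = head;
            head = fresh;
        }
        else                          // insert after the node before pos
        {
            node *prev = setptr(pos - 1);
            fresh->link = prev->link;
            prev->link = fresh;
        }
        ++length;
        return true;
    }

private:
    // Return a pointer to the node at position pos, or NULL if pos is invalid.
    node *setptr(int pos) const
    {
        if (pos < 0 || pos >= length)
            return NULL;
        node *ptr = head;
        for (int i = 0; i < pos; ++i)  // traverse pos links from the head
            ptr = ptr->link;
        return ptr;
    }

    ordered(const ordered &);              // copying is not supported in this sketch
    ordered & operator=(const ordered &);

    int length;
    node *head;
};
```

The public functions take the same arguments as they would with an array, so a client cannot tell which implementation is behind them.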

• Make sure to compare approaches. Usually various implementations of an ADT have advantages and disadvantages which must be weighed when deciding which one to use. When comparing arrays versus dynamic memory allocation, our trade off is search speed versus memory limitations.

The approach we used with arrays was easy to implement. It behaves just like an ordered list and we can quickly find items. But, the array has a fixed size. When choosing how to implement an ADT, you must decide if the fixed-size restriction is a problem for your particular application. The answer is based on whether or not your application can predict in advance the maximum number of items in the ADT at any one time; if it can't, then using arrays is not adequate. In addition, if you CAN predict in advance the maximum number of items, you need to consider if you are willing to waste storage by declaring an array to be large enough to hold the maximum number of items. Suppose the maximum number of items is very large, but this case rarely occurs (like a maximum of 1000 where the list usually only has 10 items!). In this case you must reserve space for all 1000 items even if you are "wasting" 990!

For such storage considerations, the dynamic method is preferable...since it only provides as much storage as the list needs. You don't need to predict the maximum size and you will not be wasting storage. But, what about implementing this with a dynamically allocated array? This might actually be the best of both worlds!

Other things to consider: If you are able to determine the maximum size, using arrays will actually save space because it does not have to store explicit pointer information. When dealing with large lists, this could save significant memory! And, provide faster access time!! This is called "direct access".

• As you may have already realized, the beauty of implementing our list using the concept of ADT's is that you can change from one implementation to another without impacting the rest of your program! You should be able to change from an array implementation to one using linked lists! Because the "wall" created by data abstraction isolates the rest of the program from how you implement an ADT, changing from one implementation to another should not require changes to your program.

ADTs: In Summary

• Remember that data abstraction is a technique for controlling the interaction between a program and its data structures and the operations performed on this data. It builds "walls" around a program's data structures. Such walls make programs easier to design, implement, read, and modify.

• Abstract Data Types are the specification of the operations to manage your data...together with the actual data values.

• Only after you specify your ADTs should you begin to think about implementing the data structures. Remember that your program should not depend on HOW your ADT is implemented!

Stacks

• Stacks are considered to be the easiest type of list to use. The only operations we have on stacks are to add things to the top of the stack and to remove things from the top of the stack. We never need to insert, delete, or access data that is somewhere other than at the top of the stack.

Think of a stack as a pile of paper on a desk. For example, let's think of a stack as a pile of quizzes; as each of you turns in your quiz...the quiz gets added to the top of the stack. When I grade them, I take them off of the stack, one at a time starting from the top! This means that the last quiz placed on the top of the stack will be the first quiz graded! This is called last in -- first out (abbreviated: LIFO)! Using stacks, I can't go and take the quiz at the bottom or from the middle; I must always start from the top.

• When we add things to the top of the stack we say we are pushing data onto the stack. When we remove things from the top of the stack we say we are popping data from the stack.

• Many computers implement function calls using a mechanism of pushing and popping information concerning the local variables on a stack. When you make a function call, it is like pushing the current state on the stack and starting with a new set of information relevant to this new function. When we return to the calling routine, it is like popping; the information concerning the execution of the function is lost!

• When implementing the code for a stack's operations, we need to keep in mind that we should have functions to push data and to pop data. This way we can hide the implementation details from the user as to how these routines actually access our memory...why? because you can actually implement stacks using either arrays or using linked lists with dynamic memory allocation.

• There are five important stack operations:

1) Determine whether the stack is empty

2) Add a new item onto the stack...push

3) Remove an item from the stack...pop

4) Initialize a stack (this operation is easy to overlook!)

5) Retrieve the item at the top of the stack...without modifying the stack

• Let's draw what a pointer-based stack implementation looks like:

• What are the differences between this and the array based implementation?

Array Implementation of Stacks

• Remember that with an array implementation, we are restricted by the size of the array. This limits the number of push operations we can do; we can't add an item to the stack if the stack's size limit has been reached. If this restriction is not acceptable, then you shouldn't implement your stack using arrays...think about dynamic memory as an alternative.

• With our first implementation of a stack, we will represent a stack as a class, which contains the array of items and an index to the top position in the array:

const int maxlen=100;

class stack

{

public:

// place the constructor here...to perform all initialization steps

// place the prototypes for the member functions here

private:

int top;

int items[maxlen];

};

• Now we are ready to design the ADT operations on the stack using arrays:

To Initialize a stack:

We must set the top of the stack to zero.

To determine if the stack is empty or not:

(Return TRUE if the stack is empty, FALSE otherwise.)

Check if the stack's top is zero. If it is, return TRUE; if it is not, return FALSE!

To Push a data item onto a stack:

(Modify the stack with the new item specified, if the stack isn't full. If the stack is full then nothing should be done and a FALSE flag should be returned.)

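First check if the top of the stack is greater than or equal to the maximum size allowable. If it is, return FALSE.

Otherwise, add the new item to the array at the location given by the top of stack and then increment the top of stack. (Because top starts at zero and zero means empty, top is the index of the next free slot; incrementing first would skip items[0] and could write past the end of the array.) Return a TRUE flag...meaning success.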

To Pop a data item off of the stack:

(If the stack is not empty, then the item at the top of the stack is removed and success is TRUE; otherwise, deletion is impossible and success is FALSE.)

If the stack is empty, return FALSE. Otherwise, decrement the top of stack and return TRUE.
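The array-based stack operations can be sketched as shown here. The function names (is_empty, push, pop, peek) are assumptions, and the sketch takes top to be the number of items on the stack, so the top item lives at items[top - 1] and the next free slot is items[top].

```cpp
const int maxlen = 100;

class stack
{
public:
    stack() : top(0) {}                 // initialize: an empty stack has top == 0

    bool is_empty() const { return top == 0; }

    // Push new_item. Returns false (and changes nothing) if the stack is full.
    bool push(int new_item)
    {
        if (top >= maxlen)
            return false;
        items[top] = new_item;          // store in the next free slot...
        ++top;                          // ...then advance top
        return true;
    }

    // Pop the top item. Returns false if the stack is empty.
    bool pop()
    {
        if (is_empty())
            return false;
        --top;
        return true;
    }

    // Copy the top item into item without modifying the stack.
    bool peek(int & item) const
    {
        if (is_empty())
            return false;
        item = items[top - 1];
        return true;
    }

private:
    int top;            // number of items on the stack
    int items[maxlen];
};
```

Pop never touches the array; decrementing top is enough, because the old value will simply be overwritten by the next push.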

Advantages

• We should remember that when we use ADT operations, we can very easily change the way in which we implement the code. Only the definition of the class and the implementation of the member functions must be altered...nothing else!

• In addition, just seeing the words push and pop in a routine will immediately tell anyone reading the code that you are using a stack data structure. Self documenting!!

• Lastly, separating the use of data structures from their implementation will help us improve the top-down design of both our data structures and our programs.

• There are many ways to design ADT operations, with different user interfaces depending on your objectives. There is nothing magical about the routines implemented in the previous example. There are a variety of ways to define the ADT operations. To a large extent, these different ways reflect style differences and coding standards.

Pointer-based Implementation of Stacks

• Many applications require a pointer-based implementation of a stack so that the stack can grow and shrink dynamically! It isn't limited in size like an array.

• The pseudo code necessary to implement a pointer-based implementation is as follows. The data structure would be defined as:

struct node

{

int data;

node *link;

};

class stack

{

public:

// place the constructor and public prototypes here...

private:

int length;

node *head;

node *current;

};

• The design of our ADT operations would look like:

For Initialization:

Initialize the length data member to zero

and we must initialize our pointers to NULL!

(Think about whether or not you really need a length??)

To determine if a stack is empty:

One approach would be to check if the length is zero. Another approach would be to check if the head pointer is NULL!

To Push a data item onto a stack:

Allocate memory for a new node. Save the data item in this node.

Then, connect the link pointer to the previous top of stack...because we now have a new top of stack. And, lastly, update the head pointer to point to this new node.

But...is this complete? Would we need to consider the case when the stack is empty to begin with?? Think about it!

To Pop a data item from a stack:

If the head pointer is NULL, then there is nothing to pop and deletion is impossible; in this case our success is FALSE.

Otherwise, return to the user the data item that was on the top of stack and deallocate the top of the stack memory. Remember to update the head pointer to now point to the second item...the new top of stack. To perform these tasks you will need a temporary pointer...why?

• Notice with a linked list implementation we are not concerned about the stack being full.

• In addition, when using a linked list for stacks, we need to determine whether the beginning or the end of the linked list will be the top of the stack. As you can see, we have implemented the stack with the top at the beginning of the list. What is the drawback of using the tail of the list? How would pop be implemented? Notice, pop would have to trace all of the way from the head of the list to find out what the new top of stack would be. And, in order to easily add nodes to the end of the stack, we would need to keep an end of stack pointer.

• Notice that using the beginning of the list as the top of stack, the only information we need to keep track of is the location of the top...which is the pointer to the list of linked nodes!
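One way the pointer-based stack might look in code, keeping only the head pointer as the notes suggest. The function names and the bool return from pop are assumptions; length and current are omitted because this sketch does not use them, and a destructor is added so leftover nodes are released.

```cpp
#include <cstddef>   // NULL

struct node
{
    int data;
    node *link;
};

class stack
{
public:
    stack() : head(NULL) {}

    ~stack()                       // pop until nothing is left
    {
        int discard;
        while (pop(discard)) {}
    }

    bool is_empty() const { return head == NULL; }

    // Push new_item onto the top of the stack (the front of the list).
    void push(int new_item)
    {
        node *fresh = new node;
        fresh->data = new_item;
        fresh->link = head;        // previous top (NULL when the stack was empty)
        head = fresh;              // the new node is now the top
    }

    // Pop the top item into item. Returns false if the stack is empty.
    bool pop(int & item)
    {
        if (head == NULL)
            return false;
        node *temp = head;         // hold the old top so it can be deallocated
        item = temp->data;
        head = head->link;         // second item becomes the new top
        delete temp;
        return true;
    }

private:
    stack(const stack &);              // copying is not supported in this sketch
    stack & operator=(const stack &);

    node *head;                    // top of the stack
};
```

Trace push on an empty stack by hand and compare it with the general case before deciding whether a special case is needed.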

Queues

• We can think of stacks as having only one end, because all operations are performed at the top of the stack. That is why we call it a Last in -- First out data structure. A queue, on the other hand, has two ends: a front and a rear (or tail). With a queue, data items are only added at the rear of the queue and items are only removed at the front of the queue. This is what we call a First in -- First out structure (FIFO). Queues are very useful for modeling real-world characteristics (like lines at a bank).

• When establishing a queue data structure, we must consider specifying our Queue Abstract Data Type Operations.

For example, the Queue ADT Operations could be:

Create -- creates an empty queue

IsEmpty -- determines if the queue is empty

EnQueue -- adds an item to the rear of the queue;

DeQueue -- remove from the queue the item at the front

QueueFront -- retrieve the item at the front of the queue without modifying the queue itself.

Implementations of the ADT Queue: Circular Array

• Just like with Stacks, there are choices in how to implement queues. You might choose either an array based implementation or a pointer based implementation. First, we will examine an array based implementation; later we will examine a pointer based implementation.

• If we choose arrays, the size of our queue is fixed. Our first attempt at an array based implementation might start with a data structure such as: (Assume that our queue consists of integer values.)

const int MaxQueue=100;

class queue

{

public:

//member functions

private:

int front; //index to the front of the queue

int rear; //index to the rear of the queue

int items[MaxQueue+1]; //queue of data items

};

We might first initialize front to 1 and rear to 0. When a new item is added to the queue, rear is incremented and the item is placed at index rear. When an item is removed, front is incremented. The queue is empty when rear < front. The queue is full when rear = MaxQueue. Let's draw a picture. Does anyone see a problem with this?

• Yes, this results in a rightward drift. After a sequence of data items is added and removed, the items drift toward the end of the array; you might get to the point where rear equals MaxQueue even when there are only a few items in the queue!
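To see the drift concretely, here is a small sketch of the first attempt. The class name naive_queue, the function names, and the count helper are hypothetical, and MaxQueue is shrunk to 5 only to keep the experiment short.

```cpp
const int MaxQueue = 5;   // deliberately small for this illustration

class naive_queue
{
public:
    naive_queue() : front(1), rear(0) {}

    bool is_empty() const { return rear < front; }
    bool is_full() const { return rear == MaxQueue; }
    int count() const { return rear - front + 1; }   // items actually stored

    bool enqueue(int new_item)
    {
        if (is_full())
            return false;
        ++rear;
        items[rear] = new_item;
        return true;
    }

    bool dequeue()
    {
        if (is_empty())
            return false;
        ++front;             // the slot left behind is never reused
        return true;
    }

private:
    int front;                  // index of the front item
    int rear;                   // index of the rear item
    int items[MaxQueue + 1];    // index 0 is unused
};
```

After five calls to enqueue and four calls to dequeue, count() reports one item, yet is_full() is true and the next enqueue fails.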

• One possible solution is to shift the array elements to the left after each deletion or whenever the rear reaches the end of the queue (rear = MaxQueue). We call this a linear implementation of a queue. It is not very efficient since most of the time would be spent shifting data items.

• A much better approach -- and the one we will concentrate on today -- is to use a circular array. With a circular approach, we advance the front to remove an item...and advance the rear to add an item. But, when either front or rear advances past the MaxQueue location, it wraps around and starts all over at location 1. This wrap around gets rid of the problem of rightward drift.

• The main difficulty with this approach is in determining if the queue is empty or full. We can simplify this by knowing that the queue is either empty or full when front is one position ahead of the rear. The problem is that we can't distinguish between empty or full this way.

• Therefore, we will also keep a counter of the number of items in the queue. Before adding an item to the queue, we can always check to see if the count is equal to the MaxQueue size; if it is, the queue is full. Before removing an item from the queue, we should check to see if the count is zero; if it is, then we know the queue is empty.

• Using this method, our declarations would be:

const int MaxQueue=100;

class queue

{

public:

//member functions

private:

int front; //index to the front of the queue

int rear; //index to the rear of the queue

int items[MaxQueue+1]; //queue of data items; indices 1..MaxQueue are used

int size; //current number of data items in the queue

};

• We need to first initialize front to 1; and rear to MaxQueue; and size to 0; this will be done in our constructor.

• To add a new item to the queue, we want to first increment rear, then add the item at rear's position. We can automatically get the wraparound effect by using % operator (mod). Therefore, to add a new item to the queue we need to:

rear = (rear % MaxQueue) +1;

items[rear] = new_data_item;

size++;

• Notice that if rear equals MaxQueue --- (rear % MaxQueue is zero), the new rear would be 1!!

• To remove an item from the queue we need to:

front = (front % MaxQueue) +1;

size--;

• The following attempts to implement our Queue ADT operations using a circular array. Here is our algorithm:

Constructor: When a queue object is defined, we want to initialize the front, rear, and size data items. Front is set to index 1, rear to index MaxQueue, and size to zero. You might want to consider default arguments as an alternative, and allow for a dynamically allocated array of integers of user-defined size!

IsEmpty: To determine if the queue is empty, the size is compared with zero. If size is zero, true is returned; otherwise, false is returned.

EnQueue: Before an item can be added to a queue, we must check whether or not the queue is already full. Therefore, size must be compared with MaxQueue; if it is greater than or equal to MaxQueue then a failure flag must be set and returned. Otherwise, the new item (supplied as an argument) is added to the queue. To do this, we first increment the rear index, save the data item at that location in the array, and increment the size by one. Don't forget to set a success flag to true!

DeQueue: Before an item can be removed from a queue, we must check whether or not the queue is empty. If the size is zero, then a failure flag must be set and returned...because our queue is empty! Otherwise, the front of the queue should be incremented and the size decremented by one. Don't forget to set a success flag to true!
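The circular array algorithm above might be coded along these lines. The function names and bool return values are assumptions. Because front and rear run from 1 to MaxQueue, the sketch declares the array with MaxQueue + 1 elements and leaves index 0 unused.

```cpp
const int MaxQueue = 100;

class queue
{
public:
    queue() : front(1), rear(MaxQueue), size(0) {}

    bool is_empty() const { return size == 0; }

    // Add new_item at the rear. Returns false if the queue is full.
    bool enqueue(int new_item)
    {
        if (size >= MaxQueue)
            return false;
        rear = (rear % MaxQueue) + 1;    // MaxQueue wraps around to 1
        items[rear] = new_item;
        ++size;
        return true;
    }

    // Remove the front item. Returns false if the queue is empty.
    bool dequeue()
    {
        if (size == 0)
            return false;
        front = (front % MaxQueue) + 1;  // same wraparound for front
        --size;
        return true;
    }

    // Copy the front item into item without modifying the queue.
    bool queue_front(int & item) const
    {
        if (size == 0)
            return false;
        item = items[front];
        return true;
    }

private:
    int front;                  // index of the front item
    int rear;                   // index of the rear item
    int items[MaxQueue + 1];    // indices 1..MaxQueue hold data
    int size;                   // current number of items
};
```

Starting rear at MaxQueue means the very first enqueue wraps it to 1, which is exactly where front is waiting.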

Implementations of the ADT Queue: Pointer-based

• Now implement the Queue ADT operations using dynamic memory.

• For queues, we will see that the pointer-based implementation is more straightforward than the array method. And, it doesn't restrict us to a fixed size queue.

• There are two ways to implement a Queue ADT using pointers...using two external pointers or using a circular linked list.

• Using a circular linked list, we can get away with only having one pointer; we no longer require front and back pointers. Our queue pointer points to the node at the rear of the queue to begin with. And, our queue pointer's "link pointer" points to the node at the front.

• Insertion and deletion are straightforward. To insert a node we need to do the following:

1) allocate memory for the new node

2) change the new node's link pointer

3) change the queue pointer's link pointer (i.e., the rear node)

4) update the queue pointer to point to the new node

• Before we get into the code to do this...deskcheck the algorithm for all boundary conditions; for example, does this algorithm work when the queue is empty?

• Deletion is simpler than insertion. Items are deleted at the front of the queue (versus inserting at the rear). We only need to change one pointer! However, make sure to consider the special case when there is only one item in the queue. In this case, our queue pointer needs to be reset to NULL.

• The class for a pointer-based implementation of a queue might resemble:

struct node {

int data;

node * link;

};

class queue {

public:

//member functions

private:

node *queue_ptr; //points to the rear of the queue

};

• Using a circular linked list to create the Queue ADT operations:

Constructor: When a queue object is defined, we want to initialize the queue_ptr to NULL. No memory needs to be dynamically allocated until a node is added to the linked list!

IsEmpty: To determine if the queue is empty, the queue_ptr is compared with zero (NULL). If it is NULL, true is returned; otherwise, false is returned.

EnQueue: To add the new item (supplied as an argument) to the rear of the queue, memory must first be allocated for a node. Then, the data supplied as an argument must be saved in that node. If the queue is empty, then this new node's link pointer should point to itself. Otherwise, this new node's link pointer should point to the front of the queue (queue_ptr's link). Then, the previous rear of the list should be altered to point to this new node. And lastly, queue_ptr must be updated to point to the new rear of the list.

DeQueue: Before an item can be removed from a queue, we must check whether or not the queue is empty. If queue_ptr is NULL, then a failure flag must be set and returned...because our queue is empty! Otherwise, implement the following steps:

a) define a temporary pointer to a node

b) set this pointer to the front of the list

c) if there was only one node in the list, then set the queue_ptr to NULL;

d) otherwise, set queue_ptr's link value to point to the new front of the list

e) deallocate the memory pointed to by our temporary pointer.

f) return success!
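A sketch of the circular linked list queue following the steps above. The function names, the bool return values, and the added queue_front function and destructor are assumptions, not part of the design as stated.

```cpp
#include <cstddef>   // NULL

struct node
{
    int data;
    node *link;
};

class queue
{
public:
    queue() : queue_ptr(NULL) {}

    ~queue() { while (dequeue()) {} }    // release any remaining nodes

    bool is_empty() const { return queue_ptr == NULL; }

    // Add new_item at the rear of the queue.
    void enqueue(int new_item)
    {
        node *fresh = new node;
        fresh->data = new_item;
        if (queue_ptr == NULL)
            fresh->link = fresh;             // only node: it points to itself
        else
        {
            fresh->link = queue_ptr->link;   // new rear points to the front
            queue_ptr->link = fresh;         // old rear points to the new node
        }
        queue_ptr = fresh;                   // new node is now the rear
    }

    // Remove the front item. Returns false if the queue is empty.
    bool dequeue()
    {
        if (queue_ptr == NULL)
            return false;
        node *temp = queue_ptr->link;        // the front of the list
        if (temp == queue_ptr)
            queue_ptr = NULL;                // that was the only node
        else
            queue_ptr->link = temp->link;    // rear now points to the new front
        delete temp;
        return true;
    }

    // Copy the front item into item without modifying the queue.
    bool queue_front(int & item) const
    {
        if (queue_ptr == NULL)
            return false;
        item = queue_ptr->link->data;
        return true;
    }

private:
    queue(const queue &);                // copying is not supported in this sketch
    queue & operator=(const queue &);

    node *queue_ptr;                     // points to the rear of the queue
};
```

Deskcheck it with zero, one, and two nodes; those are the cases where the rear pointer and its link can refer to the same node.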