Department of
Computer Science

CS 350 Algorithms & Complexity
Winter 2019

 

Term Project

This page provides details about the term project component of CS350. The purpose of this project is to give each student an opportunity to do some in-depth practical work on the design, implementation, and analysis of algorithms.

Important: Although you will have approximately one month to work on the project, you are required to submit a project proposal by 21st February to receive credit (details below).

General Description

The theme of the project is to implement one or more of the algorithms that we have read about or studied during this course, and to run tests to determine whether the behavior that you see in practice agrees with the predictions of the formal analysis. More specifically, your project will explore one of the following three topics:

  • Convex Hull: Implement the brute force and the Quickhull algorithms for convex hull.  Compare their time efficiency empirically, and see how well these empirical results conform to what analysis predicts.

  • Dictionaries: Implement a dictionary using various kinds of hash table and various kinds of tree.  By “dictionary” I mean a data structure that maps keys (such as strings) to values (such as data records), not a lexicon of meanings for English words.  See how these data structures compare, for both space and time, and compare your empirical results with those predicted by analysis. Your data should include situations where entries are deleted as well as where they are added; for static data, good old binary search is hard to beat!

  • Good Sorts: Implement some of the asymptotically good sort algorithms, that is, those that run in O(n lg n) time on average.  Run them on a wide range of examples to see whether the formal analysis is a good predictor of each algorithm's behavior in practice, how the various improvements change their behavior, and how their space requirements compare.

If you have a burning desire to work on something other than one of the three project ideas listed above, please check with me and I'll be happy to give you my thoughts and a green light to proceed if I think the idea is appropriate and interesting.   If you want to read ahead and select some of the more-advanced algorithms from the textbook, that's fine.

You may use any reasonable programming language or platform for your implementation; please check with me if you have any concerns about your specific choices in this area. 

Collaboration

I strongly encourage students to form teams of two or three.  Each team will submit a single report.  I will not accept larger teams; experience has shown that the coordination problems of a larger team outweigh the technical problems, and it's on the latter that I want you to focus your energies.  Students who have a special need to work on their own should ask for permission to do so.

Team projects will enable you to investigate your topic in greater depth—for example, with more thorough testing and analysis, or with a broader range of algorithms.  Generally, I believe that you learn more, and understand more deeply, by working with a partner or two, for example, by pair programming. 

What won't work is a project in which you try to divide the work among team members and then hope to put the pieces together at the end.  Such projects are doomed!  Please don’t try this.  Plan to work as a team; this means meeting in person.
If you have had prior success working remotely with partners, using screen-sharing and voice chat, you may want to try this.  My own experience is that nothing beats sitting together at the same keyboard.

Project Report

At the end of the project, you will submit a written report that will be used as the sole basis for evaluating your project. Specifically, your report will be expected:

  1. To show that you understand and can implement standard algorithms (10%)
  2. To show that you can write programs that are understandable, and algorithmically sound (30%)
  3. To provide evidence that you understand how complexity theory shows up in practice (15%)
  4. To demonstrate your initiative, originality, and algorithmic insights (15%)
  5. To show that you can communicate your work clearly and concisely in a well-structured document that sets out the purpose of your project, the experimental procedures, your results, and your conclusions. (30%)
  6. For group projects, I expect a single group report.  I also expect a one-page write-up from each individual, describing that individual's role in the project, the role taken on by the other team members, and how well the collaboration followed the plan. 

Specific items that may appear in a report include:

  • Descriptions of the algorithm(s) that you are working with, in your own words.
  • Source code for particularly intricate or interesting parts of the algorithm.
  • Worked examples of your own devising to show how the algorithms work. (This means that you may not copy examples from the book, the slides, the original papers where the algorithms were introduced, or resources authored by other people.)
  • Details of the testing strategy that you have used. This is likely to involve the construction of code to generate large pseudo-random test cases, and code to verify that the results produced by your code are correct.
  • Summaries of experimental data (e.g., tables or graphs showing the algorithms' behavior over a range of different inputs).
  • Reflections on what you have learned as a result of your experience.

The rubric that I will use to grade the project reports is here.

Deadlines, Project Proposal

The final project report is due by the start of class on Thursday 14th March, which is the last scheduled class for CS350 before finals. However, you are also expected to submit a “proposal” for your project by 18:00 on Thursday 21st February; the proposal is worth five percent of the overall score for the class. You will forfeit those points if you do not submit a proposal by this time; there will be no extensions for project proposals (except in case of documented illness).  Please submit using d2l.

My advice is to submit your proposal early, using d2l, since that way you will get feedback earlier.  This is especially important for students who propose a custom project (something other than Convex Hull, Dictionaries, or Good Sorts). You can also discuss custom projects with me informally on Piazza or in office hours.

Your proposal should identify:

  1. the name(s) of the student(s) working on the project;
  2. the choice of project topic;
  3. the implementation language;
  4. a list of the specific features that you expect to include in your final report;
  5. a time plan that identifies at least 3 specific goals for each of the remaining 3 weeks of the term; and
  6. a collaboration plan that describes how your team intends to work together.

The rubric for grading the project proposals is here.

In addition, although they are not required, I encourage you to include additional preliminary materials with your proposal (e.g., in-progress implementations, testing code); I will review these materials and provide feedback.

Advice

Here are some general comments and thoughts that I hope will help you to focus your time and efforts where they are most effective.

Read the Rubrics!  Points are awarded according to the rubrics, so if  you miss something that's required, you won't get points for it.  Conversely, if you do something that's not on the rubric, you won't get points for it.

Budget your time. The most common cause of failure in the project is running out of time.  A day this week is worth just as much as the day before the deadline! With that in mind, you are very strongly encouraged to make a substantial start on the project immediately. I won't have any way to check that you’ve taken my advice on this, but you will put yourself at a huge disadvantage if you do not get started right away.

Effort where it matters most: The grading scheme for the project is based on the items listed above in the section about the project report, so you should follow that list, and pay attention to the rubric, as you prepare your proposal and your report. For example, if your final project report does not include a significant component illustrating “how complexity theory shows up in practice”, then you will miss out on the points that are allocated for that item, even if the overall quality of your work is very high. For the same reason, you should avoid spending too much time on details that aren’t going to score you points. For example, writing code that provides sophisticated ways of entering test cases or viewing results might help in debugging or understanding the behavior of your implementation. However, if it doesn’t relate fairly directly to the items in the grading scheme, then that code will not contribute much to your final grade.

Automated testing: You’ll want to run your implementations on a lot of test cases so that you can get a good idea of the performance of the algorithms over a wide range of inputs and input sizes.   For example, you may be sorting lists that contain millions of values as you compare sort algorithms.  So, you’ll likely want to write some code for generating test cases automatically, and for checking that your algorithms are working correctly. There might even be more opportunity for demonstrating originality and initiative in the methods you devise for generating and checking test cases than in any other part of the project. (After all, I'm not asking you to be original with your algorithms.) It is often much easier to get an understanding of general trends in program behavior by running a large number of tests automatically than by running just a few examples by hand.
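As one possible illustration of the idea above, here is a minimal sketch in Python of a test-case generator and a correctness checker for a sort. The names make_test_case, is_correct_sort, and run_checks are hypothetical, and the sketch assumes that the sort under test returns its result as a new list; adapt it to whatever language and interface your own implementation uses.

```python
import random
from collections import Counter


def make_test_case(n, seed, low=0, high=10**6):
    """Return a reproducible pseudo-random list of n integers."""
    rng = random.Random(seed)  # private generator: the seed fully determines the data
    return [rng.randint(low, high) for _ in range(n)]


def is_correct_sort(original, result):
    """Check that result is in non-decreasing order and is a permutation of original."""
    in_order = all(result[i] <= result[i + 1] for i in range(len(result) - 1))
    same_items = Counter(original) == Counter(result)
    return in_order and same_items


def run_checks(sort_fn, sizes=(0, 1, 2, 10, 1000), seeds=(1, 2, 3)):
    """Run sort_fn on several sizes and seeds; return the (n, seed) pairs that failed."""
    failures = []
    for n in sizes:
        for seed in seeds:
            data = make_test_case(n, seed)
            result = sort_fn(list(data))  # pass a copy so an in-place sort cannot alter data
            if not is_correct_sort(data, result):
                failures.append((n, seed))
    return failures
```

Recording the seed with each test case means that any failure can be reproduced exactly. Checking that the output is a permutation of the input, and not merely that it is in order, catches implementations that lose or duplicate elements.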

Automated management of results: As you run tests, you will accumulate lots of data.  It’s easy to lose track of it.  Consider writing results to (systematically-named) files, and making sure that the provenance of the data is clear: the date and version of the code, where the input data came from, etc.  Also consider writing the files in a form (such as tab-separated text) that can be opened by a spreadsheet without any additional hand processing.
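A minimal sketch of this kind of record keeping, assuming Python and its standard csv module, might look like the following. The column names, the file-name pattern, and the functions results_filename and write_results are all hypothetical choices made for illustration, not a required format.

```python
import csv
import datetime

# Illustrative column names; the first three columns record provenance.
FIELDS = ["timestamp", "code_version", "input_source", "algorithm", "n", "seconds"]


def results_filename(algorithm, code_version, when):
    """Build a systematic file name such as results_quicksort_v1_20190221-180000.tsv."""
    return f"results_{algorithm}_{code_version}_{when:%Y%m%d-%H%M%S}.tsv"


def write_results(out, rows, code_version, input_source, when):
    """Write a header line, then one tab-separated line per (algorithm, n, seconds) row."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(FIELDS)
    stamp = when.isoformat(timespec="seconds")
    for algorithm, n, seconds in rows:
        writer.writerow([stamp, code_version, input_source, algorithm, n, seconds])
```

To use it, open the file named by results_filename with open(name, "w", newline="") and pass the file object as out. Repeating the provenance columns on every line is redundant, but it means that rows pasted from several files into one spreadsheet still say where they came from.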

Measurements: You’ll likely want to find a method to measure execution times as part of your program instead of having to rely on manual readings from a watch or external script. Have you found out how to access a clock or timer from whatever programming language you are using?  If the timer that you use doesn’t have a very high resolution, then you might want to divide the time that it takes to run the same test k times (for some large enough k) by k to obtain a more accurate measurement for a single test run.  In other words, time 10 or 100 rounds, and then divide the time by 10 or 100.
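For example, in Python the standard library's time.perf_counter provides a high-resolution clock. The sketch below times k runs and divides by k, as described above; the function name time_per_run and its parameters are hypothetical, and other languages offer comparable timers under different names.

```python
import time


def time_per_run(fn, make_input, k=100):
    """Estimate the time of one call to fn by timing k calls and dividing by k."""
    # Build all inputs before starting the clock, so that generating the
    # test data is not counted as part of the algorithm's running time.
    inputs = [make_input() for _ in range(k)]
    start = time.perf_counter()
    for data in inputs:
        fn(data)
    elapsed = time.perf_counter() - start
    return elapsed / k


if __name__ == "__main__":
    # Time Python's built-in sorted on a reversed list of 1000 integers.
    per_run = time_per_run(sorted, lambda: list(range(1000, 0, -1)), k=100)
    print(f"about {per_run:.6f} seconds per run")
```

Each run gets its own fresh input because some algorithms behave very differently on data that an earlier run has already processed, for example a list that has already been sorted in place.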

Throw one (or more) away:  The results from the first (and sometimes the first few hundred ...) runs of an algorithm are often atypical.  The implementation may have been compiling or loading code, warming caches, or doing other housekeeping tasks.  Time the first few runs separately from those that follow.  Does the time start high, decrease, and then level off?  You may well decide to “throw away” those results.  This is especially true for language implementations that do “just-in-time” compilation: you want to time your algorithm, not the JIT compiler.
If you are measuring elapsed time, you may also find that most runs take the same time, but a few outliers take significantly longer.  Think why this might be.  Are they reproducible?  What do these outliers tell you about the algorithm?
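One way to look at both effects is to record every run individually and then separate the warm-up runs from the rest. The following Python sketch does this under some assumptions of my choosing: the names timed_runs and summarize are hypothetical, the number of warm-up runs to set aside (5) and the threshold for calling a run an outlier (3 times the median) are arbitrary starting points, and there must be more runs than warm-up runs.

```python
import statistics
import time


def timed_runs(fn, make_input, runs=30):
    """Time each of several runs separately and return the list of elapsed times."""
    times = []
    for _ in range(runs):
        data = make_input()  # fresh input, created outside the timed region
        start = time.perf_counter()
        fn(data)
        times.append(time.perf_counter() - start)
    return times


def summarize(times, warmup=5, outlier_factor=3.0):
    """Set aside the first warmup timings, then report the median and any outliers."""
    steady = times[warmup:]
    median = statistics.median(steady)
    # Keep the original run number with each outlier so it can be investigated.
    outliers = [(warmup + i, t) for i, t in enumerate(steady)
                if t > outlier_factor * median]
    return {"warmup": times[:warmup], "median": median, "outliers": outliers}
```

The warm-up timings are returned and not silently discarded, so that you can see whether they really do start high and level off. The median is used as the summary because, unlike the mean, it is not pulled upward by a few slow runs.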

Questions?

In spite of the details here, the project component of CS350 is still rather open-ended. Please do not hesitate to ask if you have any questions or need more guidance or input.  Piazza is the best venue, so that others can benefit from the answers.



Andrew P. Black