CS510 - Concurrent Systems

When: Monday 17:00-19:45
Where: Capital Center 1025
Instructor: Jonathan Walpole
Instructor Office Hours: By Appointment

If you are signed up for this class, please send me your email address!

Description

Hardware designers have reached an inflection point at which the performance gains promised by increases in clock frequency are outweighed by those promised by increases in the number of cores on a processor. Hence, we are entering an era of multicore processing in which parallelism at the hardware level will increase inexorably, and the ability of software systems to achieve higher performance will rest on their ability to exploit this parallelism. The effect of these changes on software will be profound. This is arguably the biggest challenge to face computer scientists in several decades.

This course explores the challenges of writing concurrent software with scalable performance characteristics on modern multicore architectures. We study the issue primarily in the context of operating system kernel code, since operating system developers have extensive experience with concurrent programming and hence the operating system domain is rich with interesting approaches. It is also rich with difficult challenges, since system-wide performance and correctness depend on the performance and correctness of operating system code.

The course is based on research papers. It will involve a lot of reading and assimilation of ideas. The reading list has been carefully selected from classical and recent research papers in order to introduce and develop the key concepts and developments in concurrent programming and shared memory multiprocessor operating system design. These concepts include various uses of locking in OS kernels, synchronization in interrupt and process contexts, issues in the design of scalable spin locks, lock-free and non-blocking synchronization techniques, hardware and software transactional memory, hazard pointers, read-copy update (RCU), and memory consistency models and their implications for the performance and correctness of concurrent code.

You will be required to read each paper carefully, write a brief summary of it, and submit it before the start of each class. Each student will be assigned one or more papers to present in class. These presentations, your in-class participation and paper summaries will contribute directly to your grade. The goal of this class is not only to help you learn about different systems, but also to teach you how to read and evaluate a research paper, how to compose a concise summary of a research paper, and how to synthesize concepts and ideas across several papers. Toward the end of the class each student will write a short position paper describing their views on the current and future directions of concurrent programming.


Prerequisites

If you have not taken an introductory course in operating systems and passed it with a grade of B or better, this is not the course for you! Take CS333 first. Ideally, you will also have taken and passed CS533 with a grade of B or better.


Textbook

There are no required textbooks for this class. Instead, the class is based on a collection of research papers, listed below in the Schedule & Syllabus section.


Grading

Your final grade will be calculated as follows: paper reviews - 25%; in-class paper presentations - 50%; position paper - 25%.


Paper Reviews

Prior to each class, starting with class 2, you must submit a concise review of each of the papers listed for that class. Reviews should briefly explain, for example, (a) what the topic of the paper is, (b) what the contribution is and what problem it solves, (c) how the proposed solution differs from previous work, and in particular how it relates to papers covered so far in this class, (d) how the claims of the paper were validated, i.e., what research methodology was used, (e) whether the paper succeeded in solving an important problem and whether the validation was convincing, (f) what the overall message of the paper was and what aspects of it you found most interesting. Reviews should be written in plain text format and submitted by email to walpole@cs.pdx.edu with subject heading [Paper Review]. Be sure to include your name with your review.


Presentations

Each class session will involve two paper presentations followed by a discussion session, led by me, to integrate and evaluate the key concepts. You will be assigned at least one paper (probably several) to present in class during the quarter. Preliminary assignments are listed below each paper. Presentations should last about 40 minutes and should emphasise the key ideas of the paper. Don't waste your presentation time on mundane issues. You must extract the important contributions and present them clearly. Be sure to work through examples where possible. Your presentation should be formal, and you should prepare slides as necessary. If you reuse material from the web, be sure to cite the source. You must do whatever additional background reading is necessary to enable you to understand and present your paper well. I advise you to start preparing for your presentations early!


Position Papers

Each student must write a short position paper (1500 words) discussing current trends in concurrent programming and multiprocessor operating system design, and predicting future research challenges and likely developments. This paper will be due at the start of the final class.


Mailing List

A Mailman mailing list will be maintained for this class. The list, called cs510walpole@cs.pdx.edu, is for communicating information relating to the course, and can be used by students as well as the instructor. All students should subscribe to this list. Go to the following web page and follow the instructions:

https://mailhost.cecs.pdx.edu/mailman/listinfo/cs510walpole


Schedule & Syllabus

Class 1
01-07-08
Course Overview; Review of Kernel Locking Techniques; Spin Lock Performance Considerations.
Overview of the course, plus a discussion of various forms of lock-based synchronization used in OS kernels. A case study of the locking primitives used in the Linux kernel. Implementation strategies for spin locks, with emphasis on contention issues and design strategies that improve performance.

Reading:

Class 2
01-14-08
Communication and Synchronization Strategies for SMMP Kernels; Locality Issues in Scalable Kernel Design
Implications of interrupt-level vs process-level execution and the use of remote access vs remote invocation for communication across processors in shared memory multiprocessor kernels. Locality issues, object distribution and replication, scalable locking and interaction with memory management.

Reading:

Class 3
01-28-08
Lock-Free and Non-Blocking Synchronization in the Synthesis and Cache Kernels
Lock-free synchronization strategies for common kernel data structures, kernel design based on extensive use of lock-free synchronization and other strategies to improve locality. General strategies for non-blocking synchronization based on double compare and swap (DCAS) and their use in an OS kernel.

Reading:

Class 4
02-04-08
Practical Blocking & Non-Blocking Queue Algorithms; General Methods for Non-Blocking Synchronization
Practical blocking and non-blocking queue algorithms using Compare and Swap (CAS). Effects of preemption on performance of locking and non-blocking synchronization. A general methodology for converting sequential objects to non-blocking objects, and its interactions with memory management.

Reading:

Class 5
02-11-08
Safe Memory Reclamation for Lock-Free Objects; Comparison of Deferred Reclamation Strategies
The use of "hazard pointers" for safe memory reclamation with lock-free objects. Comparison of hazard pointers and read-copy update (RCU) based deferred reclamation strategies, and performance implications on modern CPU architectures.

Reading:

Class 6
02-18-08
Memory Consistency Models at the Hardware and Programming Language Levels
Memory consistency models of modern CPUs, sequential consistency, weak consistency models, memory barriers. Correctness implications of compiler and architecture-level reordering when concurrency is implemented outside the compiler (i.e., through a thread library).

Reading:

Class 7
02-25-08
RCU Case Study
Case study of the RCU API with discussion of correct use and misuse of primitives and an in-depth look at how various versions of RCU are implemented in Linux.

Reading:

Class 8
03-03-08
Transactional Memory
Introduction to the transactional memory abstraction as a means of simplifying concurrent programming. Comparison of various software implementations of Transactional Memory.

Reading:

Class 9
03-10-08
Practical Issues for Transactional Memory
Discussion of performance, I/O and scheduling interactions of transactional memory systems. Comparison of relative merits of transactional memory and other synchronization techniques.

Reading:

Position paper due today!

