CS410/CS510 - Concurrent Systems

When: Mon/Wed 1200-1350
Where: UTS 304
Instructor: Jonathan Walpole
Instructor Office Hours - By Appointment

If you are signed up for this class, please send me your preferred email address!


This course explores a variety of shared memory concurrent programming techniques with special emphasis on their performance and scalability characteristics on modern CPU architectures.

The course is based on research papers. The reading list has been carefully selected from classical and recent research papers in order to introduce the key concepts and developments in concurrent programming on shared memory multiprocessors. These concepts include a review of various practical uses of locking in OS kernels, synchronization among interrupt and normal process contexts, the design of scalable spin locks, lock-free and non-blocking synchronization techniques, hardware and software transactional memory, synchronization strategies based on deferred reclamation, read-copy update (RCU), and memory consistency models and their implications for the performance and correctness of concurrent code.

You will be required to read each paper carefully, write a brief summary of it, and submit the summary before the start of each class. Each student will be assigned a paper to present in class. These presentations, your in-class participation and your paper summaries will contribute directly to your grade. The goal of this class is not only to help you learn about different systems, but also to teach you how to read and evaluate a research paper, how to compose a concise summary of a research paper, and how to synthesize concepts and ideas across several papers. Toward the end of the class, students will write a short position paper describing their views on the current and future directions of concurrent programming.


If you have not taken an introductory course in operating systems and passed it with a grade of B or better, this is not the course for you! Take CS333 first. Ideally, you will also have taken and passed CS533 with a grade of B or better.

Text Book

There are no required text books for this class. Instead, the class is based on a collection of research papers, listed below in the Schedule & Syllabus section.


The final grade for graduate students will be calculated as follows: paper reviews - 10%; midterm exam - 25%; in-class paper presentations - 20%; final exam - 25%; position paper or project - 20%.

Paper Reviews

Prior to each class, starting with class 2, you must submit a concise review of each of the papers listed for that class. Reviews should briefly explain (IN YOUR OWN WORDS!!!), for example, (a) what the topic of the paper is, (b) what the contribution is and what problem it solves, (c) how the proposed solution works and how it differs from previous work, and in particular how it relates to papers covered so far in this class, (d) how the claims of the paper were validated, i.e., what research methodology was used, (e) whether the paper succeeded in solving an important problem and whether the validation was convincing, (f) what the overall message of the paper was and what aspects of it you found most interesting. Reviews should be written in plain text format and submitted by email to walpole@pdx.edu with subject heading [CS410/510 Paper Review]. Be sure to include your name with your review.


Each class session will involve a paper presentation followed by a discussion session, led by me, to integrate and evaluate the key concepts. You will be assigned one paper to present in class during the quarter. Preliminary assignments are listed below each paper. Presentations should be targeted to last 40 minutes and should emphasise the key ideas of the paper. Don't waste your presentation time on mundane issues. You must extract the important contributions and present them clearly. Be sure to work through examples where possible. Your presentation should be formal, and you should prepare slides as necessary. If you reuse material from the web, be sure to cite the source. You must do whatever additional background reading is necessary to enable you to understand and present your paper well. I advise you to start preparing for your presentations early!

Position Paper or Project

You must either complete a programming project or write a short position paper (1500 words) discussing current trends in concurrent programming and multiprocessor operating system design, predicting future research challenges and likely developments. The paper or project report will be due at the start of the final class.

Mailing List

A "MailMan" e-mailing list will be maintained for this class. The list, called cs510walpole@cs.pdx.edu, is for communicating information relating to the course, and can be used by students as well as the instructor. All students should subscribe to this list. Go to the following web page and follow the instructions:


Schedule & Syllabus
Class 1
Locking: Introduction
Introduction to concurrency, data races, synchronization via mutual exclusion, locks, semaphores and monitors.

Presenter: Jonathan Walpole
Course Overview slides: [pptx] [pdf]
Class 1 Slides: [pptx] [pdf]

The following book (free) contains a lot of useful background information and programming examples: "The Little Book of Semaphores, Second Edition," Allen B. Downey, 2008.
Class 2
Locking: Kernel Locking Techniques
A discussion of various forms of lock-based synchronization used in OS kernels. A case study of locking primitives used in the Linux kernel.

Class 3
Locking: Spin Lock Performance
Implementation strategies for spin locks, with emphasis on contention issues and design strategies that improve performance and scalability.
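
As a rough illustration of the kind of design choice this class covers, the following is a minimal sketch of a test-and-test-and-set spin lock in C11. It is not taken from the assigned reading; all names (ttas_lock, ttas_acquire, ttas_release, run_spinlock_demo) are hypothetical, and POSIX threads are assumed for the demonstration. Waiters spin on an ordinary load and attempt the atomic exchange only when the lock appears free, which is one common way to reduce contention on the lock word.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout: an illustrative sketch only. */
typedef struct { atomic_bool locked; } ttas_lock;

static void ttas_acquire(ttas_lock *l) {
    for (;;) {
        /* Spin on a plain load first, so waiters read a cached copy
           rather than issuing an atomic write on every iteration. */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            ;
        /* Attempt the atomic exchange only once the lock looks free. */
        if (!atomic_exchange_explicit(&l->locked, true, memory_order_acquire))
            return;
    }
}

static void ttas_release(ttas_lock *l) {
    atomic_store_explicit(&l->locked, false, memory_order_release);
}

enum { SPIN_THREADS = 4, SPIN_ITERS = 100000 };
static ttas_lock spin_demo_lock;   /* zero-initialized: unlocked */
static long spin_demo_counter;     /* protected by spin_demo_lock */

static void *spin_worker(void *arg) {
    (void)arg;
    for (int i = 0; i < SPIN_ITERS; i++) {
        ttas_acquire(&spin_demo_lock);
        spin_demo_counter++;
        ttas_release(&spin_demo_lock);
    }
    return NULL;
}

/* Runs the workers and returns the final counter value. */
long run_spinlock_demo(void) {
    pthread_t t[SPIN_THREADS];
    spin_demo_counter = 0;
    for (int i = 0; i < SPIN_THREADS; i++) {
        int rc = pthread_create(&t[i], NULL, spin_worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < SPIN_THREADS; i++)
        pthread_join(t[i], NULL);
    return spin_demo_counter;
}
```

Calling run_spinlock_demo() from main should return SPIN_THREADS * SPIN_ITERS if the lock provides mutual exclusion. The sketch makes no attempt at fairness or backoff, which are among the refinements the reading examines.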

Class 4
Non-blocking Synchronization: Lock-Free Algorithms
Lock-free synchronization strategies for common kernel data structures, kernel design based on extensive use of lock-free synchronization and other strategies to improve locality.

Class 5
Non-blocking Synchronization: Practical Blocking & Non-Blocking Queue Algorithms
Practical blocking and non-blocking queue algorithms using Compare-and-Swap (CAS), which is a readily available instruction on modern CPUs. Effects of preemption on the performance of locking and non-blocking synchronization.
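
To make the CAS retry pattern concrete, here is a minimal sketch in C11. Note that it is deliberately simpler than the queue algorithms in the reading: it shows only the push operation of a linked stack, not a queue, and it omits concurrent removal entirely because that raises the memory reclamation issues covered later in the course. All names (cas_node, cas_push, run_cas_push_demo) are hypothetical, and POSIX threads are assumed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout: an illustrative sketch only. */
struct cas_node { int value; struct cas_node *next; };

static _Atomic(struct cas_node *) cas_top;   /* zero-initialized: empty */

/* Lock-free push: read the current top, link the new node to it, and
   publish the node with CAS. If another thread changed the top in the
   meantime, the CAS fails, 'old' is refreshed, and the loop retries. */
static void cas_push(struct cas_node *n) {
    struct cas_node *old = atomic_load_explicit(&cas_top, memory_order_relaxed);
    do {
        n->next = old;
    } while (!atomic_compare_exchange_weak_explicit(
                 &cas_top, &old, n,
                 memory_order_release, memory_order_relaxed));
}

enum { CAS_THREADS = 4, CAS_PUSHES = 1000 };

static void *cas_worker(void *arg) {
    (void)arg;
    for (int i = 0; i < CAS_PUSHES; i++) {
        struct cas_node *n = malloc(sizeof *n);
        assert(n != NULL);
        n->value = i;
        cas_push(n);
    }
    return NULL;
}

/* Pushes concurrently, then counts and frees the nodes after all
   threads have been joined (so the traversal is single-threaded). */
int run_cas_push_demo(void) {
    pthread_t t[CAS_THREADS];
    for (int i = 0; i < CAS_THREADS; i++) {
        int rc = pthread_create(&t[i], NULL, cas_worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < CAS_THREADS; i++)
        pthread_join(t[i], NULL);

    int count = 0;
    struct cas_node *n = atomic_exchange(&cas_top, NULL);
    while (n != NULL) {
        struct cas_node *next = n->next;
        free(n);
        n = next;
        count++;
    }
    return count;
}
```

If no push is lost, run_cas_push_demo() returns CAS_THREADS * CAS_PUSHES. The weak form of compare-exchange may fail spuriously, which is harmless here because the operation already sits in a retry loop.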

Class 6
Non-blocking Synchronization: General Methodology
A general methodology for converting sequential objects to non-blocking objects. Introduction to simple memory management approaches and issues.

Class 7
Concurrency Concerns: Hardware Reordering
Review of memory consistency models used by modern CPUs. Algorithms discussed earlier assume sequential consistency, which is not typical of modern CPUs. Safety net techniques, in the form of memory barriers, must be used to make such algorithms safe in the presence of weak consistency models, which are typical.
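
The following minimal C11 sketch illustrates the kind of safety net described above, under the assumption of a simple flag-based handoff between two POSIX threads. The names (mp_payload, mp_ready, run_message_passing_demo) are hypothetical. Without the two fences, a weakly ordered CPU or the compiler could make the flag visible before the data, or let the consumer read the data before observing the flag.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout: an illustrative sketch only. */
static int mp_payload;          /* ordinary, non-atomic data */
static atomic_bool mp_ready;    /* flag announcing that the data is set */
static int mp_result;

static void *mp_producer(void *arg) {
    (void)arg;
    mp_payload = 42;
    /* Release fence: the payload write is ordered before the flag write. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&mp_ready, true, memory_order_relaxed);
    return NULL;
}

static void *mp_consumer(void *arg) {
    (void)arg;
    while (!atomic_load_explicit(&mp_ready, memory_order_relaxed))
        ;   /* wait for the flag */
    /* Acquire fence: the payload read is ordered after the flag read. */
    atomic_thread_fence(memory_order_acquire);
    mp_result = mp_payload;
    return NULL;
}

/* Runs one producer and one consumer; returns what the consumer read. */
int run_message_passing_demo(void) {
    pthread_t p, c;
    int rc;
    mp_payload = 0;
    mp_result = -1;
    atomic_store(&mp_ready, false);
    rc = pthread_create(&c, NULL, mp_consumer, NULL);
    assert(rc == 0);
    rc = pthread_create(&p, NULL, mp_producer, NULL);
    assert(rc == 0);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return mp_result;
}
```

With both fences in place, the C11 memory model guarantees that the consumer reads the value written by the producer. The same effect could be obtained by using release and acquire orderings directly on the flag store and load; explicit fences are used here only to mirror the barrier-based framing of this class.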

Class 8
Concurrency Concerns: Compiler Reordering
Discussion of the correctness implications of compiler and architecture-level reordering when concurrency is implemented outside the compiler (i.e., through a thread library).

Class 9
Concurrency Concerns: Safe Memory Reclamation via Hazard Pointers
Discussion of the implications of non-blocking synchronization on memory management. The use of "hazard pointers" for safe and scalable memory reclamation with lock-free objects.

Class 10
Concurrency Concerns: Safe Memory Reclamation via RCU
Review of the Read-Copy Update (RCU) technique, which combines lock-based writers, lock-free readers, and quiescent-state-based deferred reclamation, for practical scalable concurrent programming on modern architectures.

Class 11
Midterm Exam
Class 12
Scalable Concurrent Programming: Linux Kernel RCU Case Study
Case study of RCU usage in the Linux kernel.


For more information about Linux kernel implementations of RCU, see the following: "The read-copy-update mechanism for supporting realtime applications on shared-memory multiprocessor systems with Linux," D. Guniguntala, P. E. McKenney, J. Triplett, and J. Walpole, IBM Systems Journal, Volume 47, Number 2, 2007.

Class 13
Scalable Concurrent Programming: Relativistic Programming
Case study in relativistic programming - a general methodology for using RCU-like primitives.

Class 14
Scalable Concurrency: How Strong is Weak Ordering?
A discussion of the real-world ordering properties of linearizable and non-linearizable FIFO queue implementations.

  • "How FIFO is Your Concurrent FIFO Queue?",
    Andreas Haas, Christoph Kirsch, Hannes Payer and Michael Lippautz
    In Proceedings of RACES'12, Workshop on Relaxing Synchronization for Multicore and Manycore Scalability, Tucson, Arizona, 2012.
    Presenter: Jonathan Walpole
    Slides: [.pdf]
Class 15
Transactional Memory: Hardware Implementations
Introduction to the transactional memory abstraction as a means of simplifying concurrent programming. Outline of a hardware implementation of transactional memory.

Class 16
Transactional Memory: Case Study HTM Use in an OS Kernel
Discussion of performance, I/O and scheduling interactions of transactional memory in an operating system.

Class 17
Transactional Memory: Software Implementations
Comparison of various software implementations of Transactional Memory.


The following is an interesting and relevant paper I just came across ...

Understanding Transactional Memory Performance
Donald Porter and Emmett Witchel, In the Proceedings of the 2010 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS '10), White Plains, NY, March 2010
Class 18
Review: A Comparison of Synchronization Technologies
Comparison of relative merits of transactional memory and other synchronization techniques.

Class 19
Guest lecture by Paul McKenney

Slides: [Talk 1 .pdf] [Talk 2 .pdf]
