Instructor
David Devecsery
Course Materials
Course Description
How do we manage our ever increasing use of parallel and shared data? As computer systems have evolved we have gone from an era of single-core monolithic tasks to an era in which parallelism is driving software scalability, operating on several cores in a desktop machine, or possibly thousands of cores in cloud environments. In an era where complex tasks must share data, how do we dictate what happens to concurrent accesses of that data? What operations and semantics are allowed on shared memory? These problems are dictated by the memory model of the system.Memory models present a unique challenge in computer science. They are necessary to understand and bound how parallel systems operate over shared data; however, they must also balance usability and understand-ability to programmers, with inherent performance trade-offs that stem from the underlying system's construction.
Memory models are by their nature a full software stack problem, with foundations in architecture, compilers, programming languages, and runtime systems. This course explores the full stack of existing memory models. We begin with the architectural foundations behind today's low-level memory models. We work our way up the stack, exploring how memory models effect compiler optimizations and language construction. We also explore proposed language-level and runtime systems proposed to enable higher-level, more understandable runtime systems. Finally, we venture into non-conventional memory models. We look at how distributed systems change how data is shared across wide networks, and look at how lack of specification of timing information has allowed attackers to exploit shared data to discover other user's secrets.
Prerequisites
As memory models are by their nature a full-stack problem, this course will cover a wide range of topics, including software systems, compilers, and computer architecture. Students are not expected to have complete mastery over all of these topic areas, and background will be provided for more advanced issues in each of these areas. Students should have knowledge in the following:- Required:
- Software Systems (CS 6200) or equivalent: Memory models exist to define how multiple threads can interact through shared memory. Student's should be comfortable writing multi-threaded programs, and should understand the challenges of synchronizing between them.
- Useful, but not required:
- Computer Architecture (CS 6290): The properties of the architectural memory system are intrinsic to the memory-model actually visible to programmers. Students do not need to be architects to take this course and more advanced topics will be reviewed, but they should at minimum understand pipeline processors, and caching systems. Knowledge of more advanced topics (out-of-order processing, cache coherence protocols, load/store buffering) is beneficial, but will be reviewed.
- Compilers (CS 6241): Compilers convert from a high-level description of a program, down to lower-level. In order to guarantee a program meets a certain memory model the compiler must follow a set of rules. Knowledge of how compilers transform programs is beneficial to understanding the challenges they face with respect to memory models. Necessary background material will be covered.
Course Calendar
NOTE: This calendar is only an approximation of classes and material covered. It is subject to change.
Week | Date | Readings / Discussion Topics / Tutorials | Lead |
Foundations of Memory Models | |||
1 | Aug 20 | CS8803 Introduction [slides] | David Devecsery |
Aug 22 |
Classic Hardware Memory Models Background [slides, worksheet] Recommended Readings: Writing Reviews How To Read a Paper |
David Devecsery | |
2 | Aug 27 | Relaxed consistency part 2 [slides]
Classic Hardware Memory Models Readings: How to Make a Multiprocessor Computer that Correctly Executes Multiprocess Programs [PDF] No review required Leslie Lamport x86-TSO: A Rigorous and Usable Programmer’s Model for x86 Multiprocessors [PDF] No review Required Peter Sewell, Susmit Sarkar, Scott Owens, Francesco Zappa Nardelli, and Magnus O. Myreen CACM 2010 Weak Ordering - A New Definition [PDF] Sarita V. Adve, Mark D. Hill ISCA 1990 |
David Devecsery |
Aug 29 |
Challenges of Memory Models: [slides] The silently shifting semicolon [PDF] Daniel Marino, Todd Millstein, Madanlal Musuvathi, Satish Narayanasamy, and Abhayendra Singh SNAPL 2015 Frightening small children and disconcerting grown-ups: Concurrency in the Linux kernel [PDF] Jade Alglave, Luc Maranget, Paul E. McKenney, Andrea Parri, Alan Stern ASPLOS 2018 |
David Devecsery | |
3 | Sep 3 | No Class, Labor Day | Holiday |
Sep 5 |
Compilers and Memory Models: Background [slides] Foundations of the C++ Concurrency Memory Model [PDF] No review required Hans-J. Boehm, and Sarita V. Adve PLDI '08 |
David Devecsery | |
4 | Sep 10 |
Compilers and Memory Models: Readings [slides] Threads Cannot be Implemented as a Library [PDF] No review required Hans-J. Boehm The Java Memory Model [PDF] Jeremy Manson, William Pugh, and Sarita V. Adve POPL 2005 A Case for an SC-Preserving Compiler [PDF] Daniel Marino, Abhayendra Singh, Todd Millstein, Madanlal Musuvathi, and Satish Narayanasamy PLDI 2011 |
David Devecsery |
Sep 12 |
Java Memory Model Examples: Good, Bad and Ugly [PDF] No review required David Aspinall and Jaroslav Sevcık A Type and Effect System for Deterministic Parallel Java [PDF] Robert L. Bocchino Jr., Vikram S. Adve, Danny Dig, Sarita V. Adve, Stephen Heumann, Rakesh Komuravelli, Jeffrey Overbey, Patrick Simmons, Hyojin Sung, and Mohsen Vakilian OOPSLA 2009 Common Compiler Optimisations are Invalid in the C11 Memory Model and what we can do about it [PDF] Viktor Vafeiadis, Thibaut Balabonski, Soham Chakraborty, Robin Morisset, Francesco Zappa Nardelli POPL 2015 |
David Devecsery | |
Creating Better Memory Models | |||
5 | Sep 17 |
Weak Memory Models Hearding cats: Modeling, Simulation, Testing, and Data-mining for Weak Memory [PDF] No review required Jade Alglave, Luc Maranget, and Michael Tautschnig TOPLAS 2014 A Unified Formalization of Four Shared-Memory Models [PDF] Sarita V. Adve and Mark D. Hill TOPDS 1993 |
David Devecsery |
Sep 19 |
Weak Memory Models Continued DeNovoND: efficient hardware support for disciplined non-determinism [PDF] Hyojin Sung, Rakesh Komuravelli, and Sarita V. Adve ASPLOS 2013 Racer: TSO Consistency via Race Detection [PDF] Alberto Ros and Stefanos Kaxiras MICRO 2016 |
Students | |
6 | Sep 24 |
Strengthening Memory Models Intro Data Races vs. Data Race Bugs: Telling the Difference with Portend [PDF] Baris Kasikci, Cristian Zamfir, and George Candea ASPLOS 2012 INVISIFENCE: Performance-Transparent Memory Ordering in Conventional Multiprocessors [PDF] Colin Blundell, Milo M. K. Martin, and Thomas F. Wenisch ISCA 2009 |
Students |
Sep 26 |
Strengthening Memory Models Continued Making Sequential Consistency Practical in Titanium [PDF] Amir Kamil, Jimmy Su, and Katherine Yelick SC 2005 End-To-End Sequential Consistency [PDF] Abhayendra Singh, Satish Narayanasamy, Daniel Marino, Todd Millstein, and Madanlal Musuvathi ISCA 2012 |
Students | |
7 | Oct 1 |
Strengthening Memory Models Continued DRFx: A Simple and Efficient Memory Model for Concurrent Programming Languages [PDF] Daniel Marino, Abhayendra Singh, Todd Millstein, Madanlal Musuvathi, and Satish Narayanasamy PLDI 2010 Efficient Processor Support for DRFx, a Memory Model with Exceptions [PDF] Abhayendra Singh, Daniel Marino, Satish Narayanasamy, Todd Millstein, Madanlal Musuvathi ASPLOS 2011 |
Students |
Oct 3 | Project Proposals | David Devecsery | |
8 | Oct 8 | Fall Recess | Holiday |
Oct 10 | Midterm Prep (No class) | No Class | |
9 | Oct 15 | Midterm in class! | Exam! |
Oct 17 |
Strengthening Memory Models Finale A Volatile-by-Default JVM for Server Applications [PDF] Lun Liu, Todd Millstein, and Madanlal Musuvathi OOPSLA 2017 Hybrid Static–Dynamic Analysis for Statically Bounded Region Serializability [PDF] Aritra Sengupta, Swarnendu Biswas, Minjia Zhang, Michael D. Bond, and Milind Kulkarni ASPLOS 2015 |
Students | |
10 | Oct 22 | Exam Return / Review / Discussion | David Devecsery |
Oct 24 | Project updates / work / advice time | David Devecsery | |
11 | Oct 29 |
Transactional Memory LogTM: Log-based Transactional Memory [PDF] Kevin E. Moore, Jayaram Bobba, Michelle J. Moravan, Mark D. Hill, and David A. Wood HPCA 2012 OmniOrder: Directory-Based Conflict Serialization of Transactions [PDF] Xuehai Qian † Benjamin Sahelices, and Josep Torrellas ISCA 2014 |
Students |
Oct 31 |
Non Volatile Memories Memory Persistency [PDF] Stephen Pelley, Peter M. Chen, and Thomas F. Wenisch ISCA 2014 High-Performance Transactions for Persistent Memories [PDF] Aasheesh Kolli, Steven Pelley, Ali Saidi, Peter M. Chen, and Thomas F. Wenisch |
Students | |
12 | Nov 5 |
Memory Timing and Security Meltdown: Reading Kernel Memory from User Space [PDF] Moritz Lipp, Michael Schwarz, Daniel Gruss, Thomas Prescher, Werner Haas, Anders Fogh, Jann Horn, Stefan Mangard, Paul Kocher, Daniel Genkin, Yuval Yarom, and Mike Hamburg Usenix Security 2018 InvisiSpec: Making Speculative Execution Invisible in the Cache Hierarchy [PDF] Mengjia Yan , Jiho Choi , Dimitrios Skarlatos, Adam Morrison , Christopher W. Fletcher, and Josep Torrellas MICRO 2018 |
Students |
Nov 7 | Project updates / work / advice time | Students | |
13 | Nov 12 |
Distributed Systems Intro Dynamo: Amazon’s Highly Available Key-value Store [PDF] Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall and Werner Vogels SOSP 2007 Spanner: Google’s Globally Distributed Database [PDF] James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, JJ Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson Hsieh, Sebastian Kanthak, Eugene Kogan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan, Rajesh Rao, Lindsay Rolig, Yasushi Saito, Michal Szymaniak, Christopher Taylor, Ruth Wang, Dale Woodford SOSP 2011 |
Students |
Nov 14 |
Distributed Systems: Part 2 FaRM: Fast Remote Memory [PDF] Aleksandar Dragojević, Dushyanth Narayanan, Orion Hodson, and Miguel Castro NSDI 2014 Latency-Tolerant Software Distributed Shared Memory [PDF] Jacob Nelson, Brandon Holt, Brandon Myers, Preston Briggs, Luis Ceze, Simon Kahan, and Mark Oskin Usenix ATC 2015 |
Students | |
14 | Nov 19 | Project updates / work / advice time | Students |
Nov 21 | Thanksgiving | Holiday! | |
15 | Nov 26 |
Distributed Systems Continued Fast In-memory Transaction Processing using RDMA and HTM [PDF] Xingda Wei, Jiaxin Shi, Yanzhe Chen, Rong Chen, Haibo Chen SOSP 2015 FaSST: Fast, Scalable and Simple Distributed Transactions with Two-Sided (RDMA) Datagram RPCs [PDF] Anuj Kalia, Michael Kaminsky, and David G. Andersen OSDI 2016 |
Students |
Nov 28 |
Non-traditional architectures Review on GPGPU Shared Memory [PDF] No review required Devirtualizing memory in Heterogeneous Systems [PDF] Swapnil Haria, Mark D. Hill, and Michael M. Swift ASPLOS 2018 |
Students | |
16 | Dec 3 | Projects | Students |
Dec 5 | Class is Over | No One! |