Important details
- This schedule may (and is highly likely to) change during the semester.
- You must read the required readings by the day they will be discussed in class.
- You need to be connected to a Tufts network to access some of the readings. You can connect to a Tufts network directly while on campus or via this VPN while off campus.
- Lecture videos are either linked from this website or available in the "recordings" section of Canvas.
- Please read required readings before the class next to which they are listed.
- Paper summaries for required readings are due starting with Lecture 6. Please see the summaries page for more information. Please submit your summaries to the appropriate discussion thread on Canvas.
- Starting with Module 2:
  - You must additionally submit one question about the required reading's technical content. This question must be submitted to the appropriate discussion thread on Canvas by 11:59PM on the day of the lecture.
  - You must present the required reading in your project groups. The schedule lists which group will present on which day. Please see the presentations page for more information.
  - You must present short 10-minute project updates at the end of lecture in your project groups. The schedule lists which group will present on which day. Please see the projects page for more information.
1 - Class Intro
Lec | Date | Topic | Instructor slides | Student Slides | Project Update | Required Readings | Optional readings | Notes |
1 | W, 01/18 | Intro to cloud computing & why care about debugging? | Slides | N/A | N/A | N/A | ||
2 | M, 01/23 | Single process vs. distributed systems | Pre Post | N/A | Operating Systems: Three Easy Pieces, processes, threads (pages 1-6) | |||
3 | W, 01/25 | Single process vs. distributed systems cont. | Pre, Post | N/A | ||||
4 | M, 01/30 | Distributed systems tour | Pre Post | N/A | Kleppmann Chapter 1 (up to reliability), Kleppmann Chapter 4 (up to schema evolution rules) | RPC Demo | ||
5 | W, 02/01 | Distributed systems tour cont. & Metrics | Pre Post | N/A | Kleppmann Chapter 1 (up to describing performance), Kleppmann Chapter 9 (Ordering guarantees to end of chapter) | Metrics Handout (do before class; we will discuss it in class). Project choices out. | ||
6 | M, 02/06 | Single-node slack analysis | Pre Post | N/A | Kleppmann Chapter 1 (rest of chapter), Curtsinger'15 | Reading summaries for required readings are due by the start of class, starting with this lecture. Team membership and individual project preferences due Wednesday 02/08 at 11:59PM Eastern. Link to preferences survey on Piazza. | ||
7 | W, 02/08 | Single-node slack analysis wrap up | Pre Post | N/A | N/A | Class will focus on critical analysis of the Curtsinger paper | ||
8 | M, 02/13 | *Guest lecture: Peter Portante, Red Hat* | slides | N/A | N/A |
2 - Microservices & distributed tracing
Lec | Date | Topic | Instructor slides | Student Slides | Project Update | Required Readings | Optional readings | Notes |
9 | W, 02/15 | Microservices intro | Group 1 | Jamshidi'18 | Containers intro | Q: Why did containers enable microservice architectures? | ||
N/A | M, 02/20 | *No class, Presidents' Day* | ||||||
10 | W, 02/22 | Discussion | Group 2 | N/A | N/A | Project startup documents due Friday 02/24 at 11:59PM Eastern on Canvas. | ||
11 | Th, 02/23 | Open-source microservice testbeds | Group 3 | Group 7, Group 8 | Gan19 | DeathStarBench GitHub, Lamport Clocks | Q: Why are microservices more likely to exhibit tail latency violations than single monoliths (Fig. 12)? | |
12 | M, 02/27 | Discussion | Group 4 | Group 6 | N/A | |||
13 | W, 03/01 | Intro to distributed tracing | Pre, Pre | Group 5 | Group 4, Group 5 | Sigelman10, Fonseca10 | Sambasivan'16 | Q: How do the tracing approaches in Sigelman10 and Fonseca10 differ? Be aware of OpenTelemetry, an industry-standard distributed tracing framework. |
14 | M, 03/06 | Discussion | Group 7 & 8 | Group 2, Group 1 | ||||
15 | W, 03/08 | *Guest lecture: Yuri Shkuro, Meta* | N/A | N/A | N/A |
3 - Trace collection & analysis
Lec | Date | Topic | Instructor slides | Student Slides | Project Update | Required Readings | Optional readings | Notes |
16 | M, 03/13 | Trace Sampling Strategies | Group 6 | Group 4 | Shkuro'19 | Q: What types of applications/workloads might be best suited for head-based vs. tail-based sampling? Why? | ||
17 | W, 03/15 | Discussion | Group 1 | Group 7 | ||||
N/A | M, 03/20 | *No class, Spring Break* | ||||||
N/A | W, 03/22 | *No class, Spring Break* | ||||||
18 | M, 03/27 | Intelligent Sampling Strategies | Group 2 | Group 6, Group 5 | Zhang'23 | Las-Casas'19 | Q: What is the event horizon as per Hindsight? Why is it important? | |
19 | W, 03/29 | Discussion | Darby | Group 8, Group 1 | Midterm project writeup due Friday 03/31 11:59PM Eastern on Canvas. | |||
20 | M, 04/03 | Critical Path Analysis | Group 4 | Group 2 | Zhang'21 | Q: How does CRISP define the critical path? What are some potential limitations of this definition? | ||
21 | W, 04/05 | Discussion | Group 5 | Group 4, Group 8 | ||||
22 | M, 04/10 | Distributed Tracing Visualizations | Group 7 & 8 | Group 6, Group 5 | Davidson'23 | Sambasivan'13 | Q: The paper lists many unsupported interactions (C3). What might some of the challenges be in addressing C3? | |
23 | W, 04/12 | Discussion | Group 6 | Group 1, Group 7 | ||||
N/A | M, 04/17 | *No class, Patriots' Day* | ||||||
N/A | W, 04/19 | *No class* |
5 - Final project presentations & wrap-up
Lec | Date | Final presentation | Notes |
24 | F, 04/21 | Group 2 (Project Update), Group 7 | N/A |
25 | M, 04/24 | Group 4, Group 8 | N/A |
26 | W, 04/26 | Group 1, Group 2 | N/A |
27 | M, 05/01 | Group 5, Group 6 | N/A |
28 | W, 05/10 | Final writeup due (no extensions) | N/A |