Debugging Cloud Computing

Tufts Cloud Computing | COMP 150-DCC | Spring 2023

Schedule


Important details

  • The schedule may (and is highly likely to) change during the semester.
  • You must read the required readings by the day they will be discussed in class.
  • You need to be connected to a Tufts network to access some of the readings. You can connect to a Tufts network directly while on campus or via this VPN while off campus.
  • Lecture videos are either linked from this website or available in the "recordings" section of Canvas.
  • Please read required readings before the class next to which they are listed.
  • Paper summaries for required readings are due starting with Lecture 6. Please see the summaries page for more information. Please submit your summaries to the appropriate discussion thread on Canvas.
  • Starting with Module 2:
    • You must additionally submit one question about the required reading's technical content. This question must be submitted to the appropriate discussion thread on Canvas by 11:59PM on the day of the lecture.
    • You must present the required reading in your project groups. The schedule lists which group will present on which day. Please see the presentations page for more information.
    • You must present short 10-minute project updates at the end of lecture in your project groups. The schedule lists which group will present on which day. Please see the projects page for more information.

1 - Class Intro

Lec Date Topic Instructor slides Student Slides Project Update Required Readings Optional readings Notes
1 W, 01/18 Intro to cloud computing & why care about debugging? Slides N/A   N/A N/A  
2 M, 01/23 Single process vs. distributed systems Pre Post N/A   Operating Systems: Three Easy Pieces, processes, threads (pages 1-6)    
3 W, 01/25 Single process vs. distributed systems cont. Pre, Post N/A        
4 M, 01/30 Distributed systems tour Pre Post N/A   Kleppmann Chapter 1 (up to reliability), Kleppmann Chapter 4 (up to schema evolution rules)   RPC Demo
5 W, 02/01 Distributed systems tour cont. & Metrics Pre Post N/A   Kleppmann Chapter 1 (up to describing performance), Kleppmann Chapter 9 (Ordering guarantees to end of chapter)   Metrics Handout (Do before class and we will discuss in class), Project choices out.
6 M, 02/06 Single-node slack analysis Pre Post N/A   Kleppmann Chapter 1 (rest of chapter), Curtsinger'15,   Reading summaries for required readings due by start of class starting with this lecture. Team membership Individual project preferences due Wednesday 02/08 11:59PM Eastern. Link to preferences survey on Piazza.
7 W, 02/08 Single-node slack analysis wrap up Pre Post N/A   N/A   Class will focus on critical analysis of the Curtsinger paper
8 M, 02/13 *Guest lecture: Peter Portante, Red Hat* slides N/A   N/A    

2 - Microservices & distributed tracing

Lec Date Topic Instructor slides Student Slides Project Update Required Readings Optional readings Notes
9 W, 02/15 Microservices intro   Group 1   Jamshidi'18, Containers intro Q: Why did containers enable microservice architectures?
N/A M, 02/20 *No class, Presidents' Day*            
10 W, 02/22 Discussion   Group 2   N/A N/A Project startup documents due Friday 02/24 at 11:59PM Eastern on Canvas.
11 Th, 02/23 Open-source microservice testbeds   Group 3 Group 7, Group 8 Gan19 DeathStarBench github, Lamport Clocks Q: Why are microservices more likely to exhibit tail latency violations than single monoliths (Fig. 12)?
12 M, 02/27 Discussion   Group 4 Group 6 N/A    
13 W, 03/01 Intro to distributed tracing Pre, Pre Group 5 Group 4, Group 5 Sigelman10, Fonseca10 Sambasivan'16 Q: How do the tracing approaches in Sigelman10 and Fonseca10 differ? Be aware of OpenTelemetry, an industry standard for distributed tracing.
14 M, 03/06 Discussion   Group 7 & 8 Group 2, Group 1      
15 W, 03/08 *Guest lecture: Yuri Shkuro, Meta*   N/A N/A N/A    

3 - Trace collection & analysis

Lec Date Topic Instructor slides Student Slides Project Update Required Readings Optional readings Notes
16 M, 03/13 Trace Sampling Strategies   Group 6 Group 4 Shkuro'19   Q: What types of applications/workloads might be best suited for head vs tail-based sampling? Why?
17 W, 03/15 Discussion   Group 1 Group 7      
N/A M, 03/20 *No class, Spring Break*            
N/A W, 03/22 *No class, Spring Break*            
18 M, 03/27 Intelligent Sampling Strategies   Group 2 Group 6, Group 5 Zhang'23 Las-Casas'19 Q: What is the event horizon as per Hindsight? Why is it important?
19 W, 03/29 Discussion   Darby Group 8, Group 1     Midterm project writeup due Friday 03/31 11:59PM Eastern on Canvas.
20 M, 04/03 Critical Path Analysis   Group 4 Group 2 Zhang'21   Q: How does CRISP define the critical path? What are some potential limitations of this definition?
21 W, 04/05 Discussion   Group 5 Group 4, Group 8      
22 M, 04/10 Distributed Tracing Visualizations   Group 7 & 8 Group 6, Group 5 Davidson'23 Sambasivan'13 Q: The paper lists many unsupported interactions (C3). What might some of the challenges be in addressing C3?
23 W, 04/12 Discussion   Group 6 Group 1, Group 7      
N/A M, 04/17 *No class, Patriots' Day*            
N/A W, 04/19 *No class*            

5 - Final project presentations & wrap-up

Lec Date Final presentation Notes
24 F, 04/21 Group 2 (Project Update), Group 7 N/A
25 M, 04/24 Group 4, Group 8 N/A
26 W, 04/26 Group 1, Group 2 N/A
27 M, 05/01 Group 5, Group 6 N/A
28 W, 05/10 Final writeup due (no extensions) N/A