Debugging Cloud Computing

Tufts Cloud Computing | COMP 150-DCC | Spring 2023


Piazza Site:
Meeting Times: Mon/Wed, 4:30 - 5:45PM
In-person location: JCC 170

1   Description

Society relies on cloud computing (i.e., the practice of developing and using software services that run on top of rented, remotely located software and hardware) for almost everything. For example, we rely on it when shopping (e.g., Amazon), when conducting financial transactions (e.g., online brokers), when traveling (e.g., flight-booking software), when entertaining ourselves (e.g., Netflix), when collaborating at work (e.g., Google Docs), and even when getting medical diagnoses (e.g., radiology scans in the cloud).

As a result of society's dependence on cloud datacenters, failures or performance problems within them, or within the networks that connect them to each other, can be devastating. Consider these three examples: 1) In April 2018, a cloud datacenter responsible for managing flight schedules suffered an outage, which left 500,000 passengers stranded all across Europe. 2) On June 2, 2019, an outage at Google's datacenters disrupted services such as YouTube, Gmail, Snapchat, and Vimeo for over four hours. 3) In November 2018, a failure within Amazon Web Services (AWS) resulted in several cryptocurrency exchanges becoming unavailable.

In this course, we will examine the causes of problems in cloud datacenters. We will also explore the vast collection of recent research on tools and techniques that use domain knowledge, machine learning, and statistics to help engineers diagnose cloud problems. To provide students with the necessary background, we will start with a brief introduction to cloud computing and the distributed systems that make cloud computing possible.

Key outcomes: After completing the course, students will be able to:

  1. Understand and use basic debugging tools for software systems such as GDB and gprof.
  2. Read and critically analyze research papers that focus on state-of-the-art diagnosis techniques for distributed systems in terms of their contributions, strengths, and limitations.
  3. Understand how to contrast different diagnosis tools and techniques in terms of their design, tradeoffs, and applicability to various types of distributed systems.
  4. Conduct original research or engineering in the area of cloud problem diagnosis.

Non-goals: This class is not about:

  1. Learning in detail the infrastructure and software systems that comprise cloud computing. This is a separate class (CS 118). We welcome you to take this class if you have not taken CS 118, but you may have to do some extra work.
  2. Learning how to write code.
  3. Doing a lot of onerous problem sets.
  4. Memorizing tons of material.

2   Instructors and TAs

Instructor: Raja Sambasivan
Office hours: Mon: 5:45PM - 6:45PM
Location: JCC 453
E-mail: raja AT cs[.]tufts[.]edu

TA: Darby Huye
Office hours: Wed: 3:15 PM - 4:30 PM
Location: JCC 440H
E-mail: Darby.Huye AT tufts[.]edu

TA: Grace Ye
Office hours: TBA
Location: TBA
E-mail: Grace.Ye AT tufts[.]edu


3   Important details & prerequisites

  • This is a project-intensive, research-based course intended for graduate students and advanced undergraduates. Students will drive much of the class discussions and are expected to complete a semester-long project of their choosing.
  • For undergraduates: COMP 40 (Machine Structure & Assembly Language Programming) is required. Though not required, having taken COMP 111 (Operating Systems), COMP 112 (Networking), COMP 117 (Internet-scale distributed systems), or COMP 118 (Cloud Computing) will help significantly.
  • For PhD students and Master's students: a graduate or undergraduate course in computer systems, networking, or distributed systems taken at any university.
  • If you are not sure you have the background needed for this class, please email one of the instructors! We will meet with you to discuss your background and the skills you should independently acquire to do well in this class. We want everyone to excel in this class.

4   Required textbook

The required textbook for this class is Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems, First edition, by Martin Kleppmann. It is a great book that broadly covers distributed systems fundamentals. E-book versions are available for free via Tufts' library. You can also purchase it at the Tufts bookstore or online (typically for $25).

5   Evaluation goals & structure

The final course grade will be calculated as follows:

Item Percentage of final grade
Reading summaries & homeworks 25%
In-class lecture lead 10%
Course project proposals 15%
Project Progress presentations 15%
Project midterm writeup 15%
Project final writeup & presentation 15%
Meet your friendly prof. 5%

Students will be assigned the following grades based on the percentages above.

Grade Interval
A 90 - 100%
B 80 - 89.9%
C 70 - 79.9%
D 65 - 69.9%
F < 65%

We may make these grade intervals more generous, but we will not adjust them to be less so. We will use +/- grading.
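As a rough illustration of how the two tables above combine, the following Python sketch computes a weighted final percentage and maps it to a base letter grade. The weights and interval lower bounds are copied from this syllabus; the function names and the example scores are hypothetical, and the sketch does not model +/- grading or any more generous intervals the instructors may choose to apply.

```python
# Weights copied from the grading table in this syllabus (percent of final grade).
WEIGHTS = {
    "Reading summaries & homeworks": 25,
    "In-class lecture lead": 10,
    "Course project proposals": 15,
    "Project progress presentations": 15,
    "Project midterm writeup": 15,
    "Project final writeup & presentation": 15,
    "Meet your friendly prof.": 5,
}

# Lower bounds of the letter-grade intervals, highest first.
# Anything below the last bound is an F.
CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (65, "D")]


def final_percentage(scores):
    """Weighted average of per-item scores, each given on a 0-100 scale."""
    assert sum(WEIGHTS.values()) == 100, "weights must sum to 100%"
    return sum(WEIGHTS[item] * scores[item] for item in WEIGHTS) / 100


def letter_grade(percentage):
    """Base letter grade (without +/-) for a final percentage."""
    for lower_bound, letter in CUTOFFS:
        if percentage >= lower_bound:
            return letter
    return "F"


if __name__ == "__main__":
    # Hypothetical student: full marks everywhere except 80% on summaries.
    example = {item: 100 for item in WEIGHTS}
    example["Reading summaries & homeworks"] = 80
    pct = final_percentage(example)
    print(pct, letter_grade(pct))
```

Because the instructors may only make the intervals more generous, a result from this sketch is best read as a lower bound on the letter grade.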

6   What are the various grading items about?

Homeworks: We may assign a small number of assignments, which will be handed in via Gradescope or Canvas.

Project (teams of three): This is a semester-long project that you will complete in teams of three. We will provide a list of projects, and your team can pick which one to work on. Students may work individually with the explicit permission of the instructor. Your overall project grade is broken into several parts: 1) an initial project proposal in which you describe the work your team plans to do, explain how it relates to previous work in the area, and propose a schedule for your progress; 2) periodic 10-minute project progress presentations during which your team will deliver updates on your progress; 3) a project midterm report and presentation; and 4) a project final writeup and presentation.

Paper summaries (individual): Students must submit summaries for most required readings for this class. Summaries must be posted as replies to discussion threads that the instructors will create on Canvas. We ask that students peruse the discussion thread to read other students' summaries before class. Please see this link for more information about how to read research papers and write your summaries.

Lecture discussion lead (in your project teams): During most lectures, one team will lead the discussion about the reading for the lecture. (The teams will be the same as those for the project.) To lead the discussion, your team must prepare a ~25-minute presentation about the reading material. Your team must meet with the instructor or TAs to obtain feedback on a complete draft of your presentation at least five days before your presentation slot. Please see this link for more guidance about how to construct your presentations.

Meet your friendly prof (individual): You must meet individually with your course instructor (Raja) at least once during the semester during his office hours, either in person or virtually. This meeting is a vehicle for your professor to get to know you, your interests, and your concerns. It is meant to be a friendly, fun chat.

7   Office hours

Please come to office hours so that we can get to know every one of you. Valid reasons to come to office hours include, but are not limited to: chatting about the course, discussing any of the readings, discussing course projects, discussing research ideas or anything you find cool about the class, or just wanting to say hi and introduce yourselves.

8   Where should I look for important information?

  • Course website: This website contains invaluable resources to help you with your project and paper critiques. It is a living document that will be updated regularly throughout the semester. Please check it frequently.
  • Canvas: You will submit reading summaries as replies to discussions on Canvas. We will also use Canvas to host your grades.
  • Piazza: All course announcements will be posted here. You can also use Piazza to ask questions to the instructors or to the class. Please check this board frequently and subscribe to new notifications.
  • Gradescope: Some written homeworks will be graded and handed back here.

9   Course policies

Attendance: We encourage you to attend class and participate in discussions. Doing so will help foster a lively, positive class environment for both you and your fellow classmates. However, attendance is not mandatory.

Extensions & Late policy: You may miss up to one paper summary per course module. However, you cannot miss the summary worth 100 points. Each group has three days of total slack time that they can use to submit the startup document and midterm writeup late. For example, you can submit the startup document one day late and the midterm writeup two days late without penalty. Otherwise, extensions cannot be granted. To keep the course running smoothly, we cannot easily grant extensions for presentations. Please talk to us if extenuating circumstances prevent you from submitting material on time and we will try to be flexible.

Extra credit easter egg: For 5 points of extra credit added to Homework 0, post a private note on Piazza with a link to a picture of your favorite (extinct or current) flying creature. This note must be private, addressed only to your instructor and both your TAs, and submitted by Wednesday, February 1st, 2023.

Collaboration: Please talk to other people and share ideas! Much of this class is collaborative. We strongly encourage you to discuss ideas and obtain feedback on drafts from your classmates.

  • Paper summaries: Please write these up individually. After submitting a summary, you may look at other students' summaries on Canvas to get a sense of how they understood the reading.
  • Homeworks: We expect you to work individually. After completing a question, you may discuss your answers and obtain feedback from other members of the class. You must write up answers to questions individually.
  • Projects: We expect you to work in teams of three. You may work individually if approved by the instructor. You may use or modify any open-source codebase as part of your project. The core of your code, the entirety of your writeups, and the entirety of your presentations must be written by you and your teammates. Each team member must contribute equally and their contributions must be explicitly listed in each of the project documents (code, writeup, presentation). If you feel that a team member is not pulling their weight or if you feel your teammates are assigning you insufficient work or inappropriate work, please come talk to us.
  • Presentations: We expect you to work in your project teams. The presentation must be created by you and your teammates and you must explicitly cite any external resources you use. Team members must speak for equal amounts of time during presentations. You are not allowed to adapt or use existing presentations on the topic that you find online. This is because we want you to think deeply about how best to present ideas to the class; doing so will greatly enrich your own understanding of the material.

Academic Integrity: This course will strictly follow the Academic Integrity Policy of Tufts University. Students are expected to finish course work independently when instructed and to acknowledge all collaborators appropriately when group work is allowed. Submitted work should truthfully represent the time and effort applied. Please refer to Tufts' Academic Integrity Policy, available here.

10   Diversity statement

The instructors and TAs of this class welcome students from all backgrounds. We view diversity and differences in viewpoints as critical strengths to be celebrated, not squashed. This is especially important to emphasize in systems because, as you will see, the answer to many questions is often "it depends."

We expect that all members of this class contribute to a respectful and inclusive environment for every other member of the class. This does not mean we cannot disagree or have different ideas. It does mean we must consider perspectives other than our own, though they may differ from our own beliefs/experiences. If something in class or in the course materials makes you uncomfortable, please arrange a time to talk with me about ways I can improve the course. Alternatively, you can arrange to speak with the CS Department Chair about providing feedback that is anonymous to me.

11   Accommodations and wellness

Accommodations for the pandemic: The course instructors realize that we are all doing our best in a very difficult time. As a result of the pandemic, we know you might be experiencing stress or might be more strained for time. If you are experiencing any such difficulties, please come talk to us and we will do our best to help you succeed in the class.

Accommodations for students with disabilities: Tufts University values the diversity of our body of students, staff, and faculty and recognizes the important contribution each student makes to our unique community. Tufts is committed to providing equal access and support to all qualified students through the provision of reasonable accommodations so that each student may fully participate in the Tufts experience. If a student has a disability that requires reasonable accommodations, they should contact the StAAR Center (formerly Student Accessibility Services) at 617-627-4539 to make an appointment with an accessibility representative to determine appropriate accommodations. Please be aware that accommodations cannot be enacted retroactively, making timeliness a critical aspect for their provision.

Children in class: All exclusively breastfeeding babies are welcome over Zoom (or in-person) as often as is necessary. For older children and babies, I understand that unforeseen disruptions in child-care often put parents in the position of having to choose between missing class to stay at home with a child and leaving them with someone you or the child does not feel comfortable with. While this is not meant to be a long-term child-care solution, occasionally bringing a child to class in order to cover gaps in care is perfectly acceptable.

Student wellness: As a student, you may experience a range of challenges that can interfere with learning, such as strained relationships, increased anxiety, substance use, feeling down, difficulty concentrating, and/or lack of motivation. These mental health concerns or stressful events may diminish your academic performance and/or reduce your ability to participate in daily activities. There are resources at Tufts to help. You can learn more about confidential health services available on campus here. For mental health emergencies, please contact the CMHS front office at 617-627-3360 during regular business hours or the counselor-on-call at 617-627-3030 after hours.

Academic support: The StAAR Center (formerly the Academic Resource Center and Student Accessibility Services) offers a variety of resources to all students (both undergraduate and graduate) in the Schools of Arts and Sciences, and Engineering, the SMFA, and The Fletcher School; services are free to all enrolled students. Students may make an appointment to work on any writing-related project or assignment, attend subject tutoring in a variety of disciplines, or meet with an academic coach to hone fundamental academic skills like time management or overcoming procrastination. Students can make an appointment for any of these services through the StAAR Center.

In case of emergency: If you, a family member, or a close friend are experiencing an emergency or crisis: absolutely do not worry about contacting me until you have some free time. In collaboration with other university resources, we will take care of getting your course work back on track after the crisis has passed.

12   Acknowledgments

The course policies and course descriptions are informed in part by the New Computer Science Faculty Workshop and numerous other classes at Tufts and other universities. The course website's theme is based on Pelican Alchemy and Jason K. Moore's Mechanical Engineering Capstone Course at UC Davis.