Tufts COMP 117 (Spring 2019):
Internet-scale Distributed Systems

Should I Take
COMP-117: Internet-scale Distributed Systems?

This page is intended to provide information for students who are considering taking COMP-117: Internet-scale Distributed Systems in spring of 2019.

Registration for Spring 2019

This is a discussion class, and there are also written assignments (see below) that are graded and reviewed directly by the professor. For both of these reasons, the class size must be limited.

For Spring term 2019 you must fill out a Web-based application by Nov. 7, 2018 in order to be considered for admission. Information on how to apply is available on the Applying For Admission to COMP 117 page.

Course Content

Q. What is COMP-117 about?

A. COMP 117 explores the most important principles and rules of thumb for large-scale software system design. Many of these are core principles that every system designer should know. Understanding these principles will help you identify the key design points and architectural structures that will be most important to the success of the systems that you build.

The course covers a combination of some of the most famous principles (e.g. the End-to-End Principle and Postel's law), and also some less well-known challenges that have proven important in practice (e.g. Leaky Abstractions). Several programming assignments provide experience with these principles, and with the implementation of distributed systems.

The course focuses especially on the World Wide Web, which embodies in particularly clean, comprehensible form many of the most important architectural principles we explore. The Web is an extraordinarily large and successful system, but its core constructs are surprisingly simple, powerful and scalable. In addition to the Web itself, we explore principles that enabled the success of the Internet (on which the Web is built), and other important systems such as the Unix/Linux operating system.

Q. Can you give me a more detailed example of a topic discussed in the course?

A. Sure. Idempotence is a somewhat forbidding name for a simple concept. Operations that are idempotent have the same effect regardless of whether you perform them just once or many times. Setting your bank balance to $100 is idempotent (do it twice and your balance is still $100); adding $10 to your balance is not. Idempotent operations are typically easier to implement, to optimize, and to reason about. Not everything can conveniently be done in an idempotent way, but when designing a protocol or interface it's often worth asking which operations can or should be idempotent. Other important course topics include naming, designing for evolution (versioning), and the end-to-end principle.
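The bank-balance example can be sketched in a few lines of C++. This is only an illustration, not course material: the `Account` type and the functions `set_balance`, `deposit`, and `deposit_once` are hypothetical names invented here. The last function shows one common way (an assumption, not something the course prescribes) to make a non-idempotent operation safe to retry, by tagging each request with a unique ID and ignoring repeats.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical account type, used only for illustration.
// Money is stored in cents to avoid floating-point rounding.
struct Account {
    long balance_cents = 0;
    std::unordered_set<std::string> seen_requests;  // IDs of deposits already applied
};

// Idempotent: applying it once or many times leaves the same balance.
void set_balance(Account& acct, long cents) {
    acct.balance_cents = cents;
}

// Not idempotent: every repeated call changes the balance again.
void deposit(Account& acct, long cents) {
    acct.balance_cents += cents;
}

// A retry-safe deposit (illustrative assumption): the caller supplies a
// unique request ID, and a repeated ID is ignored.
// Returns true if the deposit was applied, false if it was a duplicate.
bool deposit_once(Account& acct, const std::string& request_id, long cents) {
    if (!acct.seen_requests.insert(request_id).second) {
        return false;  // this request ID was already applied
    }
    acct.balance_cents += cents;
    return true;
}
```

Calling `set_balance(acct, 10000)` twice leaves the balance at $100, while calling `deposit(acct, 1000)` twice adds $20 rather than $10. In a network setting, where a client may resend a request because it cannot tell whether the first copy arrived, that difference decides whether a retry is harmless.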

Q. What are distributed systems?

A. Distributed systems use multiple computers working together to solve a problem or implement an application. We focus mainly on Internet-scale systems, such as the Web, e-mail, etc., but the principles we study apply to many smaller systems and to non-distributed systems as well.

Q. Is this a new course?

A. No. The course was first offered in fall of 2012 and has been taught several times since.

Who should take the course?

In 2012, this course attracted a roughly equal mix of undergraduates, regular master's/PhD students, and part-time evening students (the latter are allowed to do programming projects individually). Most students who took the course reported that they liked it, and success did not correlate highly with graduate/undergraduate status.

Q. What are the formal prerequisites?

A. COMP-40 or permission of the instructor (please email noah@cs.tufts.edu if you have questions). The conceptual material should be accessible to anyone who has taken COMP-40 and who is interested in principles of system design. The programming projects are challenging, but most students last year felt they were worthwhile; if you enjoyed and did reasonably well with the harder COMP-40 assignments, you should do fine. If you still feel uncertain programming at the COMP 40 level, you may have trouble.

We are aware that SIS is buggy, and that it may not accept your registration if you have not actually taken COMP 40. If you have Noah's permission to register and are having trouble, please let him know immediately and we will try to get it done for you.

Q. Does it matter if I took COMP-112: Networking?

A. There's a little bit of overlap, but the emphasis in the two courses is quite different. COMP-112 mainly teaches the multi-layer stack of network protocols. COMP 117 teaches principles of system design. COMP 112 is definitely not required for COMP 117; conversely, if you have taken 112, then COMP 117 should still be very worthwhile. Last year, perhaps 1/4 of students in COMP 117 had already completed 112. Occasionally, alternate versions of assignments will be offered to those who have already taken 112.

Instructor

The course is taught by Professor of the Practice Noah Mendelsohn. Noah has been doing research and development on distributed systems since the 1970s. He helped design the XML stack of document technologies, and until recently he co-chaired (with Tim Berners-Lee) the W3C Technical Architecture Group, which is the senior technical steering committee for the World Wide Web. This course is, to a significant degree, designed to share insights gained from working with the designers of the Web on the most challenging technical problems facing the Web and the Internet today. The lessons should be valuable for anyone designing large software systems.

Assignments and workload

Q. What are the assignments and tests like?

A. Although there are some challenging programming assignments, we also read several classic papers in computer science, selected chapters from textbooks (all available from Safari), and part of the autobiography of Tim Berners-Lee. Most weeks there is a short assignment asking you to provide written answers to questions about the reading. A rough estimate for this work is 2-4 hours/week, including reading and writing.

There are a few (2-4) team programming assignments. For many students, these will be a first opportunity to write distributed systems. Doing that is hard, but very exciting! Writing two programs that communicate with each other is an important and rewarding experience. The programming assignments are designed to illustrate the principles and rules of thumb that are the subject of the course.

Students in 2012 reported that the larger programming assignments were similar in complexity and challenge to the harder COMP 40 assignments, but more time is given for each assignment, and there is time between programming assignments.

A final exam will be given during the last week of class. During reading and finals period, you will work on a final paper covering a topic of current controversy relating to the Web or the Internet (you choose the topic). There will also be an in-class midterm.

Q. Overall, how much time will the course take?

A. Much less than COMP 40, but it's still a significant course. A few times during the term, you will be very busy for a week or two with team programming, and you will also likely spend a few days during reading/finals period on the final paper.

Q. Are there labs?

A. No. Occasionally, the instructor may suggest an optional lab-scale exercise to help you learn some topic, but doing those is up to you.

Q. What programming language is used?

A. C++. No, it's not a particularly beautiful language, and it's much less handy than Python, Ruby, etc. Nonetheless, most scalable networking systems and most of the browsers you use are written in C, C++ or a variant. Just as COMP 40 gives you experience with machine-level programming, programming in COMP-117 gives you a sense of how many large-scale distributed systems are built today, and of how core networking APIs like sockets are used. For part of one project we do allow you to use Python or Ruby if you prefer, but knowledge of those languages is not required.

Q. Are there reading and writing assignments as well as programming?

A. Yes. Compared to the introductory sequence of CS courses at Tufts, COMP-117 involves more reading and writing about concepts and principles. There are several reasons for this. First of all, for some students this will be their first opportunity to read and analyze important research papers that were influential in setting out fundamental concepts of computer science. Furthermore, the Internet and the Web are designed and managed by a worldwide community of programmers and system designers who exchange ideas about how to evolve the system. The reading and writing in COMP-117 are designed to help you build the skills to participate effectively in communities like this. As noted above, there are also some very interesting and rewarding programming assignments.

Note: Some students who aren't fluent in English occasionally find some of the reading and writing to be challenging. The professor will try to help you succeed, and occasional lapses in grammar or usage won't reduce your grade, but it's essential that you be able to learn and to clearly explain the concepts covered in the readings. If you have any concerns, please check in advance with the instructor, who can show you some of the reading materials and assignments from last year.