Tufts CS 117 (Fall 2022):
Internet-scale Distributed Systems

Cross-cutting Issues*:
Principles, Guidelines and Bits of Wisdom

The fundamental goal of CS 117 is to uncover and explore a variety of principles, guidelines, and other useful insights relating to computer systems in general, and to large-scale distributed systems in particular. As we encounter each one, a brief entry will be recorded here. The entries below will be grouped into major themes, but in many cases the most important principles are indeed cross-cutting, in that they impact many facets of system architecture simultaneously.

Some of the items listed below are well understood fundamentals of good system design. Others are controversial and remain the subject of debate in the system design community.

* The term cross-cutting issues is lifted from one of the best textbooks ever written on computer hardware architecture: Computer Architecture: A Quantitative Approach, by John Hennessy and David Patterson. That book has sections throughout discussing cross-cutting issues relating to hardware design. For anyone who wants to do some self-study on hardware architecture, the book is highly recommended!

We will fill in this page as the term progresses.

General

KISS — Keep It Simple Stupid!

When in doubt, choose the simplest approach to solving a problem. Keep interfaces simple and straightforward. As the designers of UNIX advised: have each component of the system do one thing well.

Adding features or options to a system tends to create complexity that is out of proportion to the individual function that's added. Such complexity makes systems slower to develop and harder to maintain, and it often results in worse runtime performance as well. Even small additions to a system can have far-reaching ramifications.

Example: Unicode allows the Ö (O with umlaut) character to be represented either as the single value U+00D6 or as the two-character sequence U+004F (O) followed by the combining character U+0308 ( ̈ ). Allowing such alternative representations in documents can facilitate integration with certain systems (on some old typewriters, the two were typed separately), but it greatly complicates the comparison of otherwise similar strings from two documents. This has been a major headache in many Unicode-based systems, including the Web. Extra code is needed in the comparison routines, and the pathlengths when doing comparisons may be longer. Such overhead is typically incurred for all comparisons, as it's usually hard to tell in advance which strings actually contain combining characters. Furthermore, the added comparison code can fill up processor caches, causing other parts of the system to run slower too. Similar complications arise if combining characters are allowed in filenames or variable names.
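
As a small illustration, the following Python sketch (illustrative only; a real system would need this logic in every comparison path) shows that the two encodings compare as unequal until both strings are normalized, e.g. with the standard library's unicodedata module:

import unicodedata

precomposed = "\u00D6"        # Ö as a single code point
combining = "\u004F\u0308"    # O followed by the combining diaeresis

print(precomposed == combining)    # False: the code point sequences differ
print(unicodedata.normalize("NFC", precomposed) ==
      unicodedata.normalize("NFC", combining))    # True: equal once normalized

Every comparison that must tolerate combining characters pays for a normalization step like this one, which is exactly the kind of pervasive overhead described above.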

History: The KISS principle was first articulated (according to Wikipedia) by the legendary aircraft designer Kelly Johnson. According to the same source, the word "stupid" was not referring to those making the design decisions. Rather, the goal was to build a system that you didn't have to be very smart to understand and maintain. So, the principle is "Keep it simple (and) stupid", not "Keep it simple, stupid".

Tradeoffs: Sometimes a complex solution is needed to achieve the required result. In such situations, trying to make things overly simple can actually result in unnecessary complication or a poor design (as an example, the optimizers in programming language compilers often perform quite tricky transformations that result in code that is difficult to understand and debug, but that runs very fast). There is a related principle attributed to Albert Einstein that "everything should be as simple as it can be, but not simpler".

References:

Tradeoffs

Almost every decision in the design of a system involves compromises. Even when applying the most fundamental of good practices, there are usually tradeoffs. A well-layered system may be easier to understand, evolve, and maintain, but it might be slower than one in which layers are integrated. Clean abstractions are essential to good system design, but such abstraction often prevents control of specifics that may be important. Be alert for tradeoffs and compromises when applying the principles and guidelines described here.

Example: A filesystem provides many useful facilities for storing, organizing and retrieving information using a hard disk, but access through a filesystem prevents the use of certain low-level optimizations that may be important in unusual cases (that's why many high-performance databases control the hard disk directly, rather than working through a filesystem).

Idempotence

The term idempotent describes an operation that produces the same result regardless of how many times it is executed. Setting someone's bank account balance to $100 is an idempotent operation; adding $10 to their account is not: if the addition is repeated, the second deposit adds to the first.

In many cases, idempotent operations can be supported more efficiently, and by simpler protocols, than otherwise similar operations that are not idempotent. For example, if the (idempotent) balance-setting operation above fails, it can simply be retried until the desired result is achieved. If the (non-idempotent) bank deposit operation is sent through a network and clear confirmation is not received, then any retry must be labeled as a duplicate so the receiving system will know to perform the update exactly once. To do this, the receiving system will normally have to keep a record of operations that have already been performed, and all requests must be checked against that record.

Not all operations can be designed to be idempotent, but when practical, idempotent operations are usually simpler to implement and incur less overhead than others. Sometimes, though not always, a choice is available: if we know that the bank account has $90 before the deposit, then either an "add $10" or a "set balance to $100" operation will yield the same final result. The latter is idempotent; the former is not.
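
To make the contrast concrete, here is a hypothetical Python sketch (the in-memory account table and the request identifiers are invented for illustration): the idempotent "set balance" operation can be retried blindly, while the non-idempotent "deposit" needs a record of already-processed requests so that a retry is not applied twice.

accounts = {"alice": 90}      # hypothetical in-memory account store
processed = set()             # ids of deposit requests already applied

def set_balance(account, amount):
    # Idempotent: repeating the call leaves the same final state.
    accounts[account] = amount

def deposit(account, amount, request_id):
    # Not idempotent: without the duplicate check, a retry would add twice.
    if request_id in processed:
        return                # duplicate of an earlier request; ignore it
    accounts[account] += amount
    processed.add(request_id)

set_balance("alice", 100)
set_balance("alice", 100)                    # safe to retry blindly
deposit("alice", 10, request_id="req-1")
deposit("alice", 10, request_id="req-1")     # retry detected and ignored
print(accounts["alice"])                     # 110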

Scaling

Metcalfe's Law and Network Effects

Metcalfe's Law states that for many networks, there is value in each connection that can be made. The number of such connections grows as the square of the number of things that can be connected, so the value of the network grows quickly as nodes or resources are added. Because growth is much faster than linear, a single large network tends to be much more valuable than multiple smaller networks, even though the smaller networks are often easier to build and maintain. The term network effects is often used to refer to the super-linear gain in value from integrating a node or resource into an existing network.
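
The arithmetic behind the claim is simple; the sketch below (Python) just counts the distinct pairs among n nodes, n(n-1)/2, which grows roughly as the square of n:

def possible_connections(n):
    # Number of distinct pairs among n nodes.
    return n * (n - 1) // 2

for n in (10, 100, 1000):
    print(n, possible_connections(n))    # 10 -> 45, 100 -> 4950, 1000 -> 499500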

Example: the telephone system became more valuable when international direct dialing became possible. Phones in Europe could be used to directly call the United States, something that had not been possible when the networks were separate.

Example: Metcalfe's law is often used to explain why there "can be only one World Wide Web". A new page added to the Web potentially adds value to the many other pages that can now link to it. Similarly, the new page itself may be more valuable as part of a Web in which it too can link to many other things. For example, an airline page can link to weather reports for the cities to which it flies, and the weather site can offer links to flights. Neither could be done if the weather reports and flights were in separate networks.

Implications: Metcalfe's law gives useful perspective on a surprising range of issues relating to the Web, including:

  • A single Web should connect users world wide
  • A single Web should support the broadest possible range of devices and operating systems
  • It should be possible to publish existing content using Web technologies
  • All scaling limits should be avoided if possible
  • Decentralized control and extensibility tend to be necessary if such a broad range of users and content is to be supported.
  • If content is "locked up" in proprietary systems, then that content neither contributes to nor benefits from the Web
  • Creating walled gardens, i.e. parts of the Web that can be accessed only by some users, tends to undermine the benefits of network effects (a movie review that can be accessed only by logged-in Facebook users contributes less to network effects than one that can be accessed by everyone).
  • Etc.

Tradeoffs: building and maintaining large networks may be more difficult than building multiple smaller ones.

History and references: this law is attributed to Ethernet inventor Bob Metcalfe, who supposedly made the point in a talk in 1980. His focus at the time was more on the value of a network of compatibly connected devices than on users (of a phone system) or Web pages, but the essence of the argument applies broadly. George Gilder also apparently contributed to the formulation of the law, but everyone calls it Metcalfe's law. Boston-area entrepreneur Sim Simeonov offers some clarification of the history in a 2006 blog posting.

Simplicity supports scalability

Having noted Metcalfe's law, we must observe that building very large systems is inherently difficult. Many of the challenges are obvious: it's hard to design algorithms that work at global scale, hard to manage a large system running under control of many separate organizations, hard to keep the software running as updates are incrementally installed in different locations, etc., etc.

One might therefore be tempted to assume that the largest systems are the most complex in their design and interfaces, but often the opposite is true. If the interface required for a system to join a network is simple, then the barriers to connecting are much lower. Indeed, one reason the Web's key architectural elements (URIs, HTTP, and HTML) are quite simple to learn and implement is to reduce the barrier to connecting a wide range of systems to the Web. We all know that many large organizations run Web servers, but simple Web servers are also now found in printers, home appliances, and even traffic lights! Building a simple but completely standards-compliant Web server is quite easy to do.
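
As a rough illustration of how low that barrier is, the sketch below (Python, standard library only; it answers every request with the same page, ignores the request headers, and is nowhere near fully standards-compliant) is already enough to respond to an ordinary browser:

import socket

# Minimal single-threaded responder: accept a connection, read the request,
# and reply with a fixed HTML page.
body = b"<html><body><p>Hello from a tiny server</p></body></html>"
response = (b"HTTP/1.1 200 OK\r\n"
            b"Content-Type: text/html\r\n"
            b"Content-Length: " + str(len(body)).encode() + b"\r\n"
            b"Connection: close\r\n\r\n" + body)

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("", 8080))    # port 8080 chosen arbitrarily for the example
    server.listen()
    while True:
        conn, _ = server.accept()
        with conn:
            conn.recv(4096)    # read (and ignore) the request
            conn.sendall(response)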

Tradeoffs: as noted above under KISS, some things just can't be properly implemented using simple techniques. One could argue, for example, that public key cryptography is conceptually complex and also difficult to implement properly. Nonetheless, this form of crypto is the best way we've found to implement (relatively) secure communication and identification checking on a global scale.

Related to: KISS principle and Metcalfe's Law.

Networks

The End-to-End Principle

One of the most famous and influential papers relating to the design of distributed systems is End to End Arguments in System Design by Jerry Saltzer, David Reed, and David Clark. The Association for Computing Machinery's (ACM's) SIGOPS group voted this paper into its Hall of Fame, stating:

"This paper gave system designers, and especially Internet designers, an elegant framework for making sound decisions. A paper that launched a revolution and, ultimately, a religion."

Maybe not a religion, but this paper has helped a lot of designers to think clearly about tradeoffs in organizing distributed systems, and it neatly explains some of the tradeoffs embodied in the architecture of the Internet. This is a surprisingly easy paper to read and understand.

Stated informally, the principle says:

If you don't check the entire function of a multistep process from end to end, making sure that the overall operation completed successfully, then there are almost surely steps that are unchecked. Errors in those steps are likely to go undetected. Therefore, end to end checking is typically required to ensure that a system operates successfully!

Additional checking and recovery for particular steps within the system (e.g. on some particular network link) is therefore at best an optimization, because the end-to-end checks are both necessary and sufficient to ensure correctness. However, such incremental error recovery is often required to achieve practical performance; end-to-end recovery tends to result in repeating large operations in bulk (e.g. retransmitting an entire file), while incremental recovery of particular errors (e.g. a dropped or corrupted packet) can often be done very efficiently.
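
The classic embodiment of the end-to-end check is a whole-file digest: whatever per-hop checks the network performs, the sender and the ultimate receiver compare a checksum of the complete data. A Python sketch (the file names and the retransmit() call are hypothetical):

import hashlib

def digest(path):
    # Hash the entire file; an undetected error anywhere along the way
    # (disk, memory, network, intermediate copies) changes the digest.
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

# Sending end:      sent = digest("payroll.dat")
# Receiving end, after the transfer completes:
#     if digest("payroll_copy.dat") != sent:
#         retransmit()    # end-to-end recovery: repeat the whole transfer

Per-link checks can still catch most corrupted packets cheaply, but only the final comparison establishes that the file arrived intact.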

The principle also applies to systems in which end-to-end checking is not done, correctly predicting that those systems are at risk of having undetected errors.

From the paper:

"This paper discusses one class of function placement argument that has been used for many years with neither explicit recognition nor much conviction. However, the emergence of the data communication network as a computer system component has sharpened this line of function placement argument by making more apparent the situations in which and reasons why it applies. This paper articulates the argument explicitly, so as to examine its nature and to see how general it really is. The argument appeals to application requirements, and provides a rationale for moving function upward in a layered system, closer to the application that uses the function."

In short, the principle encourages clear thinking as to which functions or guarantees must be provided at the endpoints of a system, and which might be handled by interior layers or communication nodes.

Not only is this a wonderfully visionary paper on how to think about reliability and error recovery in a wide variety of computer data manipulation systems, it forms the key philosophical underpinning for the Internet as we know it. These were the "arguments" used to justify building a reliable system out of unreliable datagrams and, crucially, to justify putting the smarts at the periphery of the Internet rather than in the network itself (as in traditional AT&T phone systems). Thus, indirectly, this paper justifies building networks in which you can innovate by providing new functions (e.g. the Web, BitTorrent, VoIP, etc.) without necessarily rearchitecting or redeploying the infrastructure that implements the network fabric.

Thus, end-to-end arguments are closely related to the principle of network neutrality, i.e. that providers of network infrastructure should carry packets indiscriminately for any application, and should provide qualities of service based only on what's paid for and the underlying capabilities of the system. Tim Berners-Lee cites network neutrality as key to his ability to create and deploy the Web at a time when few others believed such a system was necessary or even possible (video).

Documents, Data Formats, Data Management

Text vs. Binary Formats

One of the most important tradeoffs in designing a data or message format is the choice of text or binary representation. By text representation we mean one in which the bits of the format are interpreted as encodings of characters that are suitable for presentation to users: ASCII or Unicode are the most common choices. Any information to be conveyed is first represented as characters, and then those are encoded as bits for storage or transmission. For example, the number 321 would be represented either as:

Text:     321     (i.e. the three ASCII characters 0x33 0x32 0x31)
Binary:   0x0141  (i.e. the two bytes with decimal values 1 and 65)

Actually, in the case of binary, the order of the bytes may vary according to whether the machine is big-endian or little-endian, i.e. they may be stored in memory or transmitted as either [0x01,0x41] or [0x41,0x01]. Indeed, this incompatibility underlies one of the important tradeoffs favoring the use of text: widely used text encodings such as ASCII and Unicode tend to be stored and transmitted the same way by all systems, while binary formats have multiple incompatible representations.
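
The difference is easy to see in a short Python sketch (illustrative only; the choice of a 16-bit field is arbitrary):

import struct

print("321".encode("ascii"))     # b'321'    -> bytes 0x33 0x32 0x31 on every system
print(struct.pack(">H", 321))    # b'\x01A'  -> 0x01 0x41, big-endian 16-bit integer
print(struct.pack("<H", 321))    # b'A\x01'  -> 0x41 0x01, little-endian 16-bit integer

The text form is the same everywhere; the binary form depends on a byte-order convention that sender and receiver must agree on.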

In the early days of computing, when memory was scarce, binary formats were almost always preferred when practical, as they tend to be more compact. As memory and bandwidth have become far cheaper, text has become the default in many settings: to a striking degree, the application-level protocols used on the Internet, including those for e-mail (RFC 2822), the Web's HTTP (RFC 2616), and HTML itself, tend to be text-based.

History: This tradeoff has existed since the early days of computing. As noted above, the rapid drop in the price of computer memory and processing power has facilitated the widespread use of text-based formats.

Tradeoffs: As noted above, binary formats tend to be more compact. When all systems involved use the same binary formats (word size, endianness, etc.), then binary can be much faster as well. In such cases, in-memory data structures can often be written and read directly to or from a disk or network.

Text formats have many important advantages, including:

  • Machine-independent format eases transmission between incompatible systems
  • Easy to process using widely available tools, including text editors, programming language print statements, regular expression processors, or even spreadsheets
  • Easy for humans to debug
  • Easy to teach and learn: text-based formats are much more easily explained in books or Web pages. The View Source effect in systems like the Web means that users can easily inspect a file or message to see what it looks like, and thus learn more about how a system works.

Related to:

Mixed Content

One interesting characteristic of a document, data format, or programming language is its ability, or lack of ability, to easily handle what SGML and XML call mixed content, i.e. text strings that contain structured markup. The most familiar example for most users is HTML, which allows markup like this:

<p>This is a <em>really</em> important point!</p>

In the example above, the <em>really</em> markup turns the string into mixed content, because there is markup within the text. Mixed content is particularly important in HTML because it enables links like this one to be embedded directly within text like this sentence, using markup like this!

...because it enables <a href="https://en.wikipedia.org/wiki/Hyperlink">links 
like this one</a> to be embedded directly within text...

Mixed content tends to be important for the machine-processable representation of highly structured documents. Systems that lack mixed content are fine for simple text documents (e.g. plain text e-mail), but tend not to represent with fidelity the logical structure of a more complex document. For example, both XML and JSON are useful for storing simple typed key/value pairs, but XML (which supports mixed content) is better able to convey the structure of an insurance policy document that has many nested sections, specific fields for the customer's name and amount of coverage, etc. JSON is, however, simpler to implement and use for the cases it does handle well.

History: as far as I know, mixed content was first popularized in (the precursors of) SGML. Specifically, Charles Goldfarb developed a series of document formats at IBM in the 1970s; these evolved into SGML, which was widely used outside of IBM and is indeed the precursor of XML. Tim Berners-Lee chose to make HTML a variant of SGML, in part to leverage the skills, expertise, and expectations of those who already knew SGML, and to avoid "reinventing the wheel".

Tradeoffs: Mixed content is powerful and useful, but it greatly complicates the handling of text strings in a programming language. Almost all popular languages have built-in support for ASCII or Unicode strings; the only ones that handle mixed content really well tend to be languages like XQuery and XSLT that were designed with systems like XML in mind.
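
The complication is visible even in a tiny example. Parsing the paragraph shown earlier with Python's ElementTree (a sketch; any XML or HTML parser exposes a similar shape) shows that the "string" is no longer a single string at all: it is split across element text and tails.

from xml.etree import ElementTree as ET

p = ET.fromstring("<p>This is a <em>really</em> important point!</p>")
print(p.text)                  # 'This is a '
print(p[0].tag, p[0].text)     # 'em' 'really'
print(p[0].tail)               # ' important point!'
print("".join(p.itertext()))   # reassembled: 'This is a really important point!'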

References: mixed content in the XML Recommendation

Model / View Separation

When designing a document or data representation format, the content (sometimes referred to as the model) should, to the extent practical, be stored separately from the presentation formatting instructions (the view). A number of benefits result from this approach, including:

  • Content tends to be easier to extract, search and update, both for machines and for people
  • The same content can more easily be re-styled, for presentation on different devices or purposes
  • This approach tends to facilitate making information accessible to those with perceptual deficiencies or other disabilities, e.g. for processing with screen readers for the blind
  • In many cases, the resulting encodings are more compact, because presentation information is shared across multiple documents

Many systems include not just model and presentation information, but also logic for managing interactions with a user and transitions between program states. There are benefits to separating such control logic from both the model and the view. The resulting architecture is known as Model/View/Controller (MVC). For example, in a user interface framework that's used to fill out data forms, the stored data (the model) should be separate from the form layout (the view), and both should be separate from the logic that decides which form to present next after the first is filled in.
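
A minimal sketch of the separation (Python; the record fields and view functions are invented for illustration). The model holds only content; either view can change, or a new one can be added, without touching the data:

# Model: just the content, with no formatting information.
policy = {"customer": "Jane Doe", "coverage": 250000}

# Views: two interchangeable presentations of the same model.
def text_view(model):
    return f"{model['customer']}: coverage ${model['coverage']:,}"

def html_view(model):
    return (f"<p>Customer: <b>{model['customer']}</b><br>"
            f"Coverage: ${model['coverage']:,}</p>")

print(text_view(policy))
print(html_view(policy))

A controller, in the MVC sense, would be the additional code that reacts to user input and decides which view (or which form) to present next.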

Examples: The separation of CSS stylesheets from HTML is the most obvious example on the Web. It should be noted that this separation was incompletely realized for many years: for a long time, the preferred way of styling HTML content was with attributes in the HTML; only over time did CSS emerge and become sufficiently capable to cover most styling needs. Other examples include XSLT for styling XML (though in practice, XSLT is often used to produce HTML, which in turn uses CSS!). To see great examples of Model/View separation with HTML and CSS, check out the CSS Zen Garden.

Many presentation tools such as PowerPoint provide ways of changing the "Design" of a document after it has been created. Likewise, the user interface frameworks for many languages, including Smalltalk, embody an MV or MVC layering.

History: Model / view / controller separation was first seriously explored during development of the famous Smalltalk language, which was created at Xerox PARC in the 1970s.

Related to: Layering and separation of concerns; tradeoffs

Tradeoffs: Tradeoffs in supporting MV or MVC separation include:

  • Layering can result in complexity or performance problems
  • Separating content from presentation can be inconvenient or a bad choice when:
    • Users need to repeatedly update content and formatting together. This is a common case when formatting word processor documents, slide presentations, etc.; when updating Web pages that use CSS, it is often necessary to update the HTML and the CSS separately
    • There will be only one document of a given style, so formatting information is never shared across documents anyway

Errors and Error Handling

Design for error handling

The overall architecture of a system tends to affect the options available for reporting on and recovering from errors. Indeed, providing robust error handling may significantly constrain or in some cases complicate the design of a system. For all of these reasons, the requirements for dealing with errors should be set out early in the design process, and where appropriate the architecture should be chosen to support clean handling of the necessary reporting and recovery.

Example: Many programming languages provide a syntax for character string literals. A design choice that has a significant effect on error reporting is whether such literals can extend beyond one line of source, for example:

str = "Hello
       World";

One subtlety when such continuations are allowed is determining whether the newline is part of the string, whether it becomes a blank space, etc. There is, however, a more serious problem relating to error reporting. Consider the following example:

str = "Hello World; <— Note missing quote mark

if (a > B) {
  printf("It's a beautiful day");
} else {
  printf("The weather is bad");
}

Note the missing " character after World. Because strings may extend beyond one line, the literal being assigned to str starts with Hello as intended, but it extends to the characters printf(, and everything to that point is legal syntax. The compiler will therefore report an error not on the assignment to str, but instead on the first printf. It may go on to claim that there is an else clause with no corresponding if, since the latter was included in the string literal. This is confusing in the example above, but in a real program there may be hundreds of lines of code before the next quote character; the error report may indicate a line far from the true source of trouble, and it will likely say nothing about unterminated strings.

In a language where strings cannot extend beyond one line, or, as in C and C++, where an explicit line continuation marker is required, the compiler can localize such errors to the line in which the problem occurs, and can provide a more helpful message, such as "Unterminated string literal on line xxx". Furthermore, the compiler may be able to go on and provide meaningful error reports for the remainder of the program.

Example: Robust error handling must typically be anticipated as code is organized and as invariants are established. Very often exceptions provide a structured framework for doing this, especially in networked and interactive systems. Consider a server system that maintains multiple connections supporting multiple remote users. A typical recovery architecture might have the following general features:

  • Invariant: each remote user is served by a single process.
  • Invariant: the exception handling code associated with that process will, when invoked, cleanly terminate the connection to the associated remote user, without affecting communication with any other user
  • Invariant: whenever a non-recoverable error or timeout occurs while communicating with a user, an exception is thrown in the associated process

By constraining the design of the system in this way, it becomes relatively easy to write code that will terminate the connection to one user without affecting work done by others. It also becomes easy to reason about the system's response to error conditions.
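
A hypothetical sketch of that structure (Python, using one thread per connection rather than one process, but with the same invariants; handle() is a stand-in for the real application logic and the port is arbitrary):

import socket
import threading

def serve_client(conn, addr):
    try:
        while True:
            request = conn.recv(4096)
            if not request:
                break                        # client closed the connection
            conn.sendall(handle(request))    # the application logic may raise
    except Exception as e:
        # Invariant: any unrecoverable error for THIS user lands here, and
        # recovery touches only this user's connection.
        print("dropping", addr, "after error:", e)
    finally:
        conn.close()

def handle(request):
    return b"ok\n"                           # placeholder application logic

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
    server.bind(("", 9000))
    server.listen()
    while True:
        conn, addr = server.accept()
        threading.Thread(target=serve_client, args=(conn, addr), daemon=True).start()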

Tradeoffs: The choices described above constrain the design of a system. Often they make the system simpler and more powerful, but in other cases designing for robust error handling can compromise performance, convenience for users, or clarity of the program source code.

Clean abstractions for error handling

When designing a system of significant size, one of the key principles all good designers learn is: create clean abstractions, and where practical, use clean interfaces that are easy to reason about. The same good advice applies when planning to handle errors and failures. Clean abstractions are often the key to a robust and scalable approach to handling errors.

Example: one surprisingly effective approach to dealing with errors is to just stop. Do nothing! Although it's tempting to assume that some elaborate approach to recovering from errors might be more effective (as sometimes it is!), guaranteeing that a component or part of a system will just stop rather than proceed in the face of errors yields a very powerful guarantee: the system will never do anything incorrect!

Of course, it is often undesirable for an entire system to stop, so the "fail stop" architecture is very often applied to components or subsystems. In a cloud-scale computing system with thousands of connected computers, each one that fails may stop, but the others will rapidly notice and arrange for the computation either to restart or to be cleanly terminated. The power of this approach comes from the fact that it applies in a uniform way to a very broad class of errors, and that the consequences are easy to reason about rigorously. If the system is running, then no unrecoverable errors have been detected and the computation is believed to be correct; higher-level error recovery is reduced to one challenge, i.e. dealing with components that have stopped.
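
A sketch of the fail-stop discipline applied to a single worker component (Python; work() and the exit code are illustrative assumptions):

import sys

def work(item):
    ...                           # placeholder for the real computation

def fail_stop_worker(items):
    for item in items:
        try:
            work(item)
        except Exception as e:
            # Don't guess and keep going: report the error, then stop cleanly.
            print("worker stopping after unrecoverable error:", e, file=sys.stderr)
            sys.exit(1)           # a supervisor can notice the exit and restart or reassign

If the worker is still running, its results so far are believed correct; everything the supervisor must reason about reduces to noticing workers that have stopped.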

Copyright 2012, 2013, 2014, 2015, 2016, 2017, 2018 & 2019 — Noah Mendelsohn & Tufts University