Tufts CS 117 (Fall 2024):
Internet-scale Distributed Systems

Tufts CS 117 Programming Assignment
End-to-End File Copy

Table of Contents

  1. Goals
  2. Overview
  3. Success criteria
  4. What to Do
    1. Getting Started
    2. Implementing end-to-end checks
    3. Your program's command line arguments
    4. Implementing a simple file copy protocol
    5. Improving your protocol
  5. Hints
  6. Instrumenting your code for grading
    1. Writing to the grading logs
    2. IMPORTANT: what to put in the grading logs
    3. The grading logs vs. the debug logs
  7. Preparing your report
  8. Submitting your work for grading
    1. Preparing your work for submission
    2. Submitting your preliminary end-to-end check and design document
    3. Submitting your work
    4. Instructions for team members
  9. A note on commenting and code quality
  10. Utility programs supplied to you
    1. SHA1
    2. Makedatafile
  11. Collaboration

These instructions are somewhat long, in part because there are a lot of details you'll eventually have to get right. We suggest you set aside some time to read through them casually first, to get a general idea of the scope of the project. Then go back, and try to note the key steps you'll have to take, and maybe also note things that don't make sense. With that done, you and your partner should get together to compare what each of you has learned, and to start making a plan. Also: remember that Piazza is the place to ask questions if you are confused.

Goals

In this programming assignment, which is the first really significant one of the term, you will deeply explore the use of the end-to-end principle, and the design of packet-based protocols. Your goal will be to build a system that copies a directory full of files from a client on one machine to a server on another. Your focus here will not be to build the most optimized and efficient protocol, but rather to cleanly separate end-to-end recovery logic from other lower-level error recovery, and to build a system that successfully copies the files. You will also have the opportunity to explore idempotence, text vs. binary formats (which we will discuss in detail later), and other key concepts we've discussed in the course.

Your submission will consist of a client program, a server program, a Makefile (which can be adapted from the one supplied), and an HTML file explaining your project, documenting its protocols and design tradeoffs and answering questions set out in the report template.

This is not an easy assignment, and complete success is not required for a good grade! There is a section below outlining Success criteria. In your report, you will indicate how far you think you've gotten, and that will guide our review of your submission. For example, we will ask you which "nastiness" levels you think are appropriate for testing your code, and why, and we will test it accordingly: if you know your code doesn't work for network-nastiness > 1, then there's no sense in our trying to test it beyond that.

To complete this assignment, you will work in teams of two, doing pair programming. Please be sure you are familiar with the course rules on pair programming: in short you do not split the work; both partners must be present (in person or via video) when design decisions are made, when coding or debugging is done, and when the report is being written. Both must contribute to all phases, and both must agree to any decisions before they are implemented. The same grade will be awarded to both partners. (Exceptions allowing you to work alone will be very rare, and must be approved in advance by the professor.)

Overview

Your specific task will be to write a UDP-based client and a corresponding server program that will copy all the files in a source directory on the client machine to a target directory on the server machine. (*See note below). Unfortunately, this will be greatly complicated by two types of challenges, one of which you dealt with in the pingtest program:

  1. All your network traffic must be sent using the same c150nastydgmsocket class that you used in the pingtest programs (note that in the ping programs, the nasty version is used only in the server; please use the nasty version for both client and server in this assignment, and run with the same nastiness levels on client and server).
  2. All your file reads and writes will be done with a class you haven't yet used: c150nastyfile, which is documented in c150nastyfile.h. Its API is almost identical to the usual Unix fopen/fread/fwrite/fseek, etc. but you can guess the twist: you give the constructor a "nastiness" level that may cause it to sometimes give you erroneous data when reading file data, or to corrupt the file when writing or closing. Note that a sample program called nastyfile is provided for you, with source, in the FileCopy project directory. If you take a look at that, and try commands like man fread, you will quickly see that the class is just a thin (if occasionally malicious) wrapper over well-documented Unix functions.

To make sure that you don't declare success on any files that aren't properly copied, you will implement an end-to-end checking protocol that reads each file back from the disk after it's copied, and sends your choice of the entire file, or a sha1 hash code back to the client. The client then verifies whether the file has been correctly copied. To learn about sha1, look it up on the Internet and also see the sha1test.cpp sample that's provided for you (details below). You may steal code from the sample (don't forget the header files you need to include, and be sure to get the -lssl and -lcrypto switches into your Makefile, or you'll spend a lot of time wondering why your builds don't work!)

There's still a problem though. After the above steps, the client knows whether the copy has succeeded, but the server doesn't. Therefore, when you first write each file in the target directory, you will do what many network file copy programs do: you will store it under a temporary name like "filename.TMP", i.e. with .TMP appended to the name of the temporary copy. When the client discovers that the copy operation has succeeded, it must tell the server to rename its file to the proper name; optionally, you can also try to alert the server in the case of failure, to clean up its temporary file. (Renaming files can be done with a rename system call; look it up in Kerrisk, or else do a man rename.)
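The server side of the ".TMP then rename" pattern can be sketched as follows. The function names and directory-handling here are illustrative, not a required structure; the real system call is rename(2):

```cpp
#include <cstdio>
#include <string>

// Called only after the client reports a successful end-to-end check.
// rename is atomic on modern Linux: a crash mid-rename leaves either
// the old name or the new one, never both.
bool commitFile(const std::string &targetDir, const std::string &name) {
    std::string tmpPath   = targetDir + "/" + name + ".TMP";
    std::string finalPath = targetDir + "/" + name;
    return std::rename(tmpPath.c_str(), finalPath.c_str()) == 0;
}

// Optional cleanup when the client reports that the copy failed.
bool discardFile(const std::string &targetDir, const std::string &name) {
    std::string tmpPath = targetDir + "/" + name + ".TMP";
    return std::remove(tmpPath.c_str()) == 0;
}
```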

Note that this design gives you an interesting and important invariant: Except for files with the .TMP suffix, every file at the server is known to be a correct copy of the file at the source. Many "real world" systems use similar tricks to ensure that users are never trusting incomplete or incorrect file data (many editors write to temporary files first, then rename once the entire file is safely written). Also note that in modern Linux systems rename is an atomic operation; if there's a crash right while rename is running, then after restart you will either find the old name or the new, but never both.

There are other protocol variations you could use, e.g. sending sha codes to the server and checking there, but then you'd need to ensure that the server waits to be sure the client knows about success. The rest of your protocol is likely driven by the client, and in such cases it's usually easier to have all of it driven by the client. That's why this design is suggested, even though it may add a bit of overhead.

The result of all the above trickery is that if your end-to-end checking is good, then it will never incorrectly claim to have successfully copied a file! This is the most important part of the assignment. Occasionally failing to successfully copy a file in the face of significant nastiness may impact your grade a little. Claiming even once that you successfully copied a file when you didn't shows us that your end-to-end check is faulty, and that will cost you significantly more.

So, we'll be asking you to implement and test your end-to-end check first: that submission is due on Tuesday October 08. Note that you can do this without any network file copy code at all. Just preload your target directory with files that are either good or bad copies of the source files, and make sure your end-to-end check can successfully detect any corrupt or missing files. Also at that time we ask you to submit a very preliminary design document, outlining your plans for eventually doing the file copying.

Once that's done, you can start with very simple, dumb copying algorithms. They may not succeed when nastiness is high, but at least you'll know when they failed, and when nastiness is low, they may work anyway. That's the essence of the end-to-end principle in action! We separated checking for success from using sophisticated techniques to improve the chances of success. As an example of a simple strategy, you can blindly send the data to the server, hope it gets there, and rely on the end-to-end check to have you redo the whole thing if necessary. Of course, if the nastiness gets high enough, then it might be a very, very long time before the whole file makes it successfully. As you proceed, you can make your packet protocols increasingly sophisticated in an effort to get more files copied in the face of higher levels of disk and network nastiness. That said, you won't come up with a good design for high nastiness by incremental "hacking". Quite early in your work, after you've done some successful experiments with low nastiness, you should try to come up with a clean, well organized approach to efficiently handling both disk and network nastiness. Your final submission including your filecopy code and a detailed report is due on Tuesday October 15.
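The "dumb copy plus authoritative check" structure described above can be sketched as a small control loop. Here sendFile and endToEndCheck are hypothetical stand-ins for your real protocol steps, injected as callables so the retry logic is visible on its own:

```cpp
#include <functional>

// Client-side control loop suggested by the end-to-end principle: a
// possibly failing copy step, followed by an authoritative check,
// retried up to a limit.
int copyWithRetries(const std::function<void(int)> &sendFile,
                    const std::function<bool(int)> &endToEndCheck,
                    int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
        sendFile(attempt);            // may silently fail under nastiness
        if (endToEndCheck(attempt))   // only this decides success
            return attempt;           // report which attempt succeeded
    }
    return -1;                        // honest failure: never claim success
}
```

The key design point is that returning -1 (an honest failure) is a correct outcome; only the end-to-end check is ever allowed to declare success.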

Success criteria

Again, the main points of the project are to get a feel for how end-to-end checking can organize your whole approach to reliability, and to give you a feel for the challenges of designing protocols using unreliable datagrams.

Success in this project is a matter of degree. Showing that your end-to-end check works is the bare minimum for a passing grade. Your grade will improve somewhat if you implement a simple file copy protocol that copies files successfully in the case where nastiness=0 (no disk or network errors), and that might succeed sometimes on small files even when nastiness>0. Improving your grade further depends on two related factors:

  1. Improving your protocol to succeed with higher levels of disk and network nastiness
  2. Showing in your report that you understand the relationship between the design choices you make and the correctness of your results (correctly reporting that you have failed to copy a file is a correct answer, and indeed it may be the only practical answer when the nastiness is high -- most real systems abort when error rates get sufficiently high)

Don't under-estimate the second factor. Understanding and explaining the choices you make in this project is as important as writing code that runs. Furthermore, it can be almost impossible to figure out how a protocol works by reading just the code, so we will be depending on your written explanation of what you've done as a guide to our grading of the code too.

What to do

Here is a summary of what you will do. Although these steps are outlined in order and should be coded and tested incrementally, it's also essential that you start thinking ahead. The two protocols you'll design, i.e. the end-to-end check and the file copy, are the two hardest parts of this assignment, and they'll ultimately have to work together: file copy packets that are delayed in the network may show up in the middle of your end-to-end check once you turn up the nastiness. Stray packets from an earlier file copy may show up during a later one. You'll probably want to start thinking about how you'll handle tough challenges like that, while coding the easier versions that don't deal with high nastiness levels.
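One hypothetical way to keep stray or delayed datagrams from confusing your protocol is to prefix every packet with a small header identifying the transfer it belongs to, and to drop anything that doesn't match. The field names and sizes below are illustrative, not a required format:

```cpp
#include <cstdint>
#include <cstring>

// Illustrative packet header: every datagram says which file-copy
// attempt it belongs to and where it falls within that file.
struct PacketHeader {
    uint32_t fileID;     // unique per file-copy attempt
    uint32_t seqNum;     // position of this packet within the file
    uint8_t  type;       // e.g. DATA, END_TO_END_REQUEST, RENAME, ...
};

// On receipt, a packet for the wrong fileID is simply dropped: it is
// residue from an earlier (or duplicated) transfer.
bool belongsToCurrentTransfer(const char *packet, size_t len,
                              uint32_t currentFileID) {
    if (len < sizeof(PacketHeader))
        return false;                     // too short to be well-formed
    PacketHeader h;
    std::memcpy(&h, packet, sizeof(h));   // avoid unaligned access
    return h.fileID == currentFileID;
}
```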

You should do the following roughly in order. Remember, you can get a decent if not spectacular grade by merely copying the directory with nastiness=0 or nastiness=1. You are strongly urged to get that much working first using a straightforward protocol, while thinking in advance about the changes that might deal with higher error rates. Handling the higher nastiness levels is very tricky, and if you don't create (and keep!) a version of the code that handles the easier cases, you risk having nothing running when the assignment is due! (git is a great tool for this if you know how to use it, or if not, just make a habit of copying your entire development directory into a dated backup copy every few hours, and especially when you've reached useful milestones. If you're not using git, which has commit messages, we suggest you leave yourself a note in each backup indicating what level of function it provides. That way, if you get in trouble, it will be easy to go back to something that works.)

Getting Started

Implementing end-to-end checks

There's a nice trick that makes it easy to try out your end-to-end checks without implementing file copy code at all: our servers and clients share a filesystem! So, for this first step, your program won't copy files at all! Rather, you will use the nastyfiletest program to fill the TARGET directory for you. Depending on the nastiness, it will either make clean copies, or it will include some files with errors, and your end-to-end code should catch those.

Specifically, you should:

Note that Linux comes with cmp and diff programs you can use to see for yourself which files are clean and which are corrupt, so that makes it easy to see if your end-to-end checks agree.

Your first submission will consist of the end-to-end check code described in this section, along with a design document for your file copy protocol.

Your program's command line arguments

Your programs should have the following names, and take the following arguments:

         fileclient <server> <networknastiness> <filenastiness> <srcdir>
         fileserver <networknastiness> <filenastiness> <targetdir>

Among the tests we will likely run on your programs is:

         fileclient <server> <networknastiness> <filenastiness> /comp/117/files/FileCopy/SRC
         fileserver <networknastiness> <filenastiness> TARGET

SRC is provided to you, and it's pre-stocked with files for you to copy. For the final filecopy submission, TARGET should start out empty before each test, and will hold the results written by your server. Of course, you should run other tests too, but the supplied SRC directory is a good one to start with.

For your initial end-to-end submission, the commands, arguments and SRC directories are the same as for the filecopy submission, but as described above in Implementing end-to-end checks you should pre-populate the TARGET directory either with a correct copy of SRC, or with your choice of missing and/or corrupted files.

In your report, you will tell us the nastiness levels we should use and how much function you believe is working (if you've only managed to do end-to-end checks and no file copying, you will need to give us a TARGET.TST directory with the nasty'd files you've used for checking).

Implementing a simple file copy protocol

If the above is all you do, you'll get a passing grade. Next it's time to improve that, by doing your own file copies through the network. Again, we strongly urge you to start with a simple protocol that will work with nastiness=0. Getting that to work will raise your grade very significantly, and building protocols to run over lossy networks is very tricky.

Improving your protocol

Your grade will improve significantly if you can reliably handle nastiness levels >0. Improve your protocol to recover from errors and to retry. Suggestion: always keep copies of earlier versions that worked. That way, if you get in trouble, you'll have something to hand in.

Hints

We'll add to these as we get questions from students, so check back here often.

Instrumenting your code for grading

To help us understand what your program is doing, you MUST include in your program statements to output information into a grading log that will alert us to your program's progress. Doing this should be straightforward and shouldn't take much time. This section gives the specific instructions.

First, as noted above you must copy the code from the start of nastyfile.cpp that says:

  //
  //  DO THIS FIRST OR YOUR ASSIGNMENT WON'T BE GRADED!
  //
  
  GRADEME(argc, argv);

into the main program of each of your executables. This will cause the program to create a new GRADELOG_XXXX.txt file each time you run your program. You will not submit these log files, but you are welcome to look at them and check them to help verify your program. We will generate new logs ourselves when we test your code.

By default the grading logs will not contain much useful information, but you MUST instrument your code with additional output statements as described here. Read your GRADELOG files during testing to make sure they're working. Note that grading log filenames include the program name and date, so server and client automatically get their own logs, and successive runs don't wipe out older files. You'll be deleting any you create before submission anyway, so there's no need to keep extra ones around.

Writing to the grading logs

An output stream pointer with the global name GRADING is created for you automatically when you issue the GRADEME call shown above. As long as you include c150grading.h in a source file, you can write to the grading log like this:

*GRADING << "The sum of 100 + 20 + 3 is: " << 100+20+3 << endl;

In other words, this is an ordinary C++ ostream and you can do all the usual things with it. Note that GRADING is a pointer so you must write to *GRADING, including the splat (*).

IMPORTANT: what to put in the grading logs

You MUST use the technique shown above to log significant events to the grading log. You'll need to put grading log information in both your client and server. We need to see each time a client attempts to send a new file, each time a server starts writing a new file, when transmission of a file completes at the client, when end to end checks succeed and fail, etc. Some of these events will be common to everyone's work, since everyone must try to send a file somehow.

We've standardized a set of logging message formats for these common events. Please follow the formats when logging those events.

For the client:

Event: When the client starts to send a file named <name> at attempt #<attempt>
(If your protocol does not support retries, then make <attempt> 0.)
Format: File: <name>, beginning transmission, attempt <attempt>

Event: When the client finishes sending all of file <name> to the server during attempt #<attempt>
Format: File: <name> transmission complete, waiting for end-to-end check, attempt <attempt>

Event: When the end-to-end check for the file is successful
Format: File: <name> end-to-end check succeeded, attempt <attempt>

Event: When the end-to-end check for the file is unsuccessful
Format: File: <name> end-to-end check failed, attempt <attempt>

And for the server:

Event: When the server starts to receive a file named <name>
Format: File: <name> starting to receive file

Event: When the server has finished receiving the file and an end-to-end check is starting
Format: File: <name> received, beginning end-to-end check

Event: When the end-to-end check for the file is successful
Format: File: <name> end-to-end check succeeded

Event: When the end-to-end check for the file is unsuccessful
Format: File: <name> end-to-end check failed

There will likely be other significant events to log depending on the protocols you've designed.

For example, if you happen to use an approach where you resend just part of a file that's in error, then you should log something like "File: myfile.txt resending bytes 5000-6000 attempt 2". If your program does something different, like giving up on an entire file in case of an error, then log that.

It is also helpful to include information about the type of end-to-end check you're performing and log something like "File: myfile.txt sending sha1 checksum edef9723...af5bc00 to client" (or "sending whole file to client").

In other words, the logs will be our guide to what your program is doing. Try to make sure they tell the story. We'll read your report and your code as a guide and a cross check. When nastiness=0, we'll expect to see just a few lines per file, indicating start/end of transmission, successful end-to-end check etc. When nastiness is higher, we'll expect to see more about your attempts to retransmit or recover.

Guideline: we do not want to see a line in the log for things like successful transmission of an individual packet or file block, because that would generate too many entries. We do want to see when you use strategies for error recovery, e.g. when you re-attempt sending a file, if you ask for or receive retransmission of a missing packet, if you do anything particular to recover disk nastiness, etc. Typically, except where you are doing lots of error recovery, we would expect a few lines for each file transmission attempt. 

The grading logs vs. the debug logs

In case there's confusion: here's how to think about the difference between the debug logs and the grading logs:

Of course, you may sometimes want to duplicate information in the two logs, e.g. so you can see significant events in your debug log as they occur: that's up to you. If you do that a lot, you can always make a little helper function that will write the same message to both.

Preparing your report

The report you submit with your project will be as significant for grading as the code itself. In fact, the report will be our guide to understanding your code.

A template for your report is available for download from the links below: the procedure is similar to what we've been using for question/answer assignments. Download the HTML file and adapt it to include your report. Please keep the heading information intact, but feel free to adapt the rest of the HTML to your needs, and to include <style> tags with CSS at the top if necessary. Remember that the <pre> HTML tag is very useful for quoting multi-line code fragments (like structs to explain your packet structures).

There are a few questions at the top of the template that you should be sure to answer. These will tell us how far you think you got with the assignment, how you recommend testing your code, etc. In particular be sure to indicate the highest nastiness levels at which your code should succeed in copying the entire directory.

The main part of the report will be an explanation of your approach, what you think you've achieved, and what we should expect when we test your code. We want to know what your code is doing, and we want you to explain why it does or does not work in particular cases. For example, if you know that delayed packets will cause your program to get confused as to which file it's working on, tell us. If you think you're handling that case, explain the technique you're using to keep things straight.

Overall, your report should cover:

Include anything else that will demonstrate your understanding of this assignment and your results. Your comments on the assignment, and suggestions for future revisions of it are also welcome. Also, please include a statement confirming that both team members were present for (substantially) all coding, and that both worked out the design together (obviously, you can do some individual design work, but it should be roughly balanced, and you and your team mate must make all final decisions together, and with shared understanding.)

Template for report - Download template for report

Submitting your work for grading

The following are the steps you should take to prepare and submit your work. If you are on a team, one student should follow these steps, and the other should follow the instructions under team submission below.

Preparing your work for submission

As described below, you will be making two submissions:

  1. A preliminary submission demonstrating your end-to-end check and including a design document for your proposed file copy protocol.
  2. A final submission with your running file copy code, and the report described above.

There are sections below telling you how to do each of these submissions. For both of them, please consider the following checklist before submitting:

Submitting your preliminary end-to-end check and design document (due Tuesday October 08)

Approximately one week ahead of the final due date for the project, you must submit the following (see the assignment calendar for the exact due date):

It will be fine if you later change your design, but we want you to think about and document a design before you try coding and debugging

One team member from each team should submit the code and design document:

cd <parent-directory>
provide comp117 filecopycheck FileCopy design.pdf <explain.txt>

Note that the submission name for this preliminary submission is filecopycheck.

You may submit design.txt in place of design.pdf if you prefer.

FileCopy must be the name of the directory in which your code is built. design.pdf or design.txt is your design document. In most cases explain.txt need not be provided, but this is the place for the submitting team member to give explanations of any personal issues that might need attention (explanations for lateness, illness, etc.)

For this preliminary submission, just note the partner's name and login in the design document. As described below, we use a more formal procedure for the final submission.

Your final submission (due Tuesday October 15)

The same team member who made the preliminary submission should submit the code and your report. Be sure to reread the instructions for the report; it is not the same as your design document (though you are welcome to copy pieces from your earlier design document submission...be sure to use the supplied report template though.)

cd <parent-directory>
provide comp117 filecopy FileCopy filecopyreport.html <explain.txt>

FileCopy must be the name of the directory in which your code is built. filecopyreport.html is your report. In most cases explain.txt need not be provided, but this is the place for the submitting team member to give explanations of any personal issues that might need attention (explanations for lateness, illness, etc.) Information related to actually running and grading the submission should be in the report itself.

Instructions for team members

If you are a member of a team, then one of you should submit your complete project as described above. Immediately after that's done, the other should:

provide comp117 filecopy teamreport.txt <explain.txt>

teamreport.txt should be a short text file indicating your name, and your team member's name. It should indicate: "I hereby certify that the submission by <partner's name> on <date> and <time> is our joint submission. Both the code and the report included with that submission are our joint work, and should be the basis for my grade for the file copy assignment."

Again, to emphasize what's stated in the first paragraph of this section: the student who submits the code must not make an additional submission with a teamreport; only the student(s) who do not do the code submissions submit the team report.

If there is any additional information, e.g. relating to personal issues (illness etc.) then either partner can provide that in an explain.txt file, as usual. As noted above, information related to actually running and grading the submission should be in the report itself.

A note on commenting and code quality

In programs of this complexity it's particularly important that you organize and comment your code so that a grader can figure out how it works. If your code is not pleasant to read, well organized, and reasonably well commented, you will lose credit. Even code that works may not be judged well if we can't easily figure out why it works.

Please be sure to follow the CS 117 coding standards.

Utility programs supplied to you

SHA1 Code

You'll note in the project directory a program named sha1test.cpp. We encourage you to look at it, and steal code from it if you like. It provides SHA1 hashes that may be useful in your end-to-end checks.

To try sha1test pass it a list of filenames and it will compute sha1 hashes for each. The crucial call that does the checksum looks like this:

char *buffer;                       // points to your file data
int length;                         // length of your file data
unsigned char shaComputedHash[20];  // hash goes here
SHA1((const unsigned char *)buffer, length, shaComputedHash);

Also, it's very useful for testing to know that you can compute sha1 hashes from the command line using:

openssl dgst -sha1 file [file...]

You do not have to use SHA1 hashes for your end-to-end checks, but if you decide to, sha1test.cpp may be a useful guide.
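Note that SHA1 produces 20 raw bytes, which are not printable. If you want to log or compare hashes as text (for example, in the grading log messages described earlier), a small hex formatter like the following may help. This is generic illustrative code, not part of sha1test.cpp:

```cpp
#include <cstdio>
#include <string>

// Convert a raw binary digest into a printable lowercase hex string,
// suitable for log messages or text-based comparison.
std::string toHex(const unsigned char *digest, size_t len) {
    std::string out;
    char byteText[3];
    for (size_t i = 0; i < len; i++) {
        std::snprintf(byteText, sizeof(byteText), "%02x", digest[i]);
        out += byteText;
    }
    return out;
}
```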

Important: if you are using the supplied SHA1 method shown above, you must link against the ssl and crypto libraries. So, the Makefile entry for sha1test looks like this:

sha1test: sha1test.cpp
	$(CPP) -o sha1test sha1test.cpp -lssl -lcrypto

Makedatafile

Also in the project directory is a file called makedatafile.cpp. I wrote this for my own testing, and it may be useful to you. It's a simple, not particularly well-written program that takes a filename and a linecount. It creates a file of the given name, with the specified number of lines. The lines are filled with ascending numbers. When debugging the file copy protocol, files like this can be useful, because it's easy to figure out which buffer goes where! It's provided as-is "without any warranty". Bug reports welcome, as always.

Collaboration

Most of you will work in teams. The rules we will follow will be those set out for pair programming in COMP 40.

That said, this is hard stuff! Distributed programming is tricky: even simple bugs can be difficult to find, and if you fall into the (common) pitfall of designing a messy protocol, things can head downhill fast. So, you should get help when you need it.

Piazza should be your first stop for getting help. As usual, if your question is of general interest, please post publicly. If your question discloses any aspect of your design or design ideas, then please post privately. You are also welcome to mail us design documents or notes for comment, and we will do our best to take a look and help you. Getting reports about what's confusing you also helps us to know what we need to clarify for everyone. When in doubt, send us your questions, and we'll do our best to get you some help.

Our TAs are also available to meet with you and help you. Similarly, you may seek out help from others who are expert in networking who are not currently taking the course. When you get such help, it MUST be acknowledged with your report, and you must explain what kind of help you got. In general, it's appropriate for helpers to point out flaws in your design, and to point you to useful references. It is never appropriate for someone (other than the course staff) to fix your design or code for you.