Comp111 Final Exam Answers
Dec 14, 2017 - open books and notes

No electronic devices of any kind allowed, including cellphones, calculators, pagers, etc. Please turn these off before beginning the exam.

Please show all work on these pages. Feel free to use the backs of pages as appropriate. Make sure that your work for each problem is clearly marked. Please place your name on each sheet.

Please note that the amount of space after each question does not indicate length requirements for an answer! If an answer is short, write it succinctly.

These pages will be graded and returned electronically, like exercises.

  1. Explain how a single Linux file can have more than one pathname.
    Answer: The file itself has a unique inode, but that inode can be made part of several directories (or even the same directory, under a different name) via "hard links" constructed with the ln command. Symbolic links, constructed via ln -s, are also possible. The important fact is that an inode can be referenced from any number of directories, thus giving the file several different names.
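    The same effect can be seen from Python, which exposes the ln mechanism as os.link. This is a minimal sketch (using temporary paths of my own choosing, not anything from the assignment): after the link, both names resolve to one inode.

```python
import os
import tempfile

# Create a file, then give it a second pathname via a hard link.
d = tempfile.mkdtemp()
a = os.path.join(d, "a")
b = os.path.join(d, "b")
with open(a, "w") as f:
    f.write("hello")
os.link(a, b)               # second directory entry for the same inode

# Both names now refer to the same inode (and thus the same content),
# and the inode's link count reflects the two names.
same_inode = os.stat(a).st_ino == os.stat(b).st_ino
print(same_inode)           # True
print(os.stat(a).st_nlink)  # 2
```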

    There were serious misunderstandings in some answers about what a filesystem can and cannot do.

    Scoring: +10 if correct, +8 if minor issues, +5 if major misconceptions, +3 if any value. Misunderstandings of this kind are (at least) major misconceptions.
  2. Can a file with more than one pathname be accessible via one pathname and not accessible via another? Why?
    Answer: The file has one protection word, which can have only one value no matter how many times the file is hard linked. However, access to a file is governed by the union of the restrictions on the file and on its containing directories. Thus it is possible for a file to be accessible through one path of directories and not through another. For example, consider a file /foo/bar with protection 0644 that is hard linked to /sand/bar. If the protection of /foo is 0755 while the protection of /sand is 0700, then /foo/bar is globally readable while /sand/bar is readable only by the owner of /sand!
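    The /foo/bar versus /sand/bar example can be reproduced in Python (the directory names here just mirror the example; they are not special). Note that the file's one protection word is identical through both paths; only the directory modes differ.

```python
import os
import stat
import tempfile

# Recreate the example: one file, two paths, two differently
# protected containing directories.
root = tempfile.mkdtemp()
foo = os.path.join(root, "foo")
sand = os.path.join(root, "sand")
os.mkdir(foo)
os.mkdir(sand)
os.chmod(foo, 0o755)    # world can traverse /foo
os.chmod(sand, 0o700)   # only the owner can traverse /sand

path1 = os.path.join(foo, "bar")
with open(path1, "w") as f:
    f.write("data")
os.chmod(path1, 0o644)
path2 = os.path.join(sand, "bar")
os.link(path1, path2)   # one inode, one protection word, two paths

# The file's mode is the same through both paths...
assert os.stat(path1).st_mode == os.stat(path2).st_mode
# ...but the directory modes differ, so non-owners can reach the
# file through foo (0755) yet not through sand (0700).
print(oct(stat.S_IMODE(os.stat(foo).st_mode)))   # 0o755
print(oct(stat.S_IMODE(os.stat(sand).st_mode)))  # 0o700
```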
    Scoring: +10 if correct, +8 if minor issues, +5 if major misconceptions, +3 if any value.
  3. Why is there a copy of the superblock in every inode group of a filesystem?
    Answer: If the superblock is lost by any mechanism -- including a scratch on the disk -- we lose track of several important things, including, e.g., the location of the root directory for the filesystem, and we can no longer mount the filesystem. This is why the information is duplicated.

    Major misconceptions included the idea that the superblock is written in several places to avoid seek time. This is false: it is actually written to every replica location, in each pass over the disk, so the copies add seeks rather than avoiding them.
    Scoring: +10 if correct, +8 if minor issues, +5 if major misconceptions, +3 if any value.

  4. Under what conditions can one lose files with journalling in effect? Why?
    Answer: As we discussed in detail in the review session, journalling reduces the window of vulnerability for file loss but does not eliminate it. Files are lost if power is interrupted before or during a journal write; only the information in that write is lost. If the write does not reach a consistency point, all changes between the previous consistency point and the end of the journal are lost.

    There were some major misconceptions in answers to this question.

    Scoring: +10 if correct, +8 if minor issues, +5 if major misconceptions, +3 if any value.
  5. If the average time between entering and exiting a restaurant is two hours, and the average arrival rate is 20 customers per hour, what is the average number of people in the restaurant at a time?
    Answer: L = λ W = 20 customers per hour * 2 hours = 40 customers.
    Scoring: +10 if correct, +8 if minor issues, +5 if major misconceptions, +3 if any value.
  6. If there are an average of 200 people in another restaurant, and the average time between entering and exiting the restaurant is 2 hours, then what is the arrival rate for people coming to the restaurant?
    Answer: λ = L / W = 200 people / 2 hours = 100 people per hour.
    Scoring: +10 if correct, +8 if minor issues, +5 if major misconceptions, +3 if any value.
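    Both restaurant questions are instances of Little's Law, L = λW, solved for a different variable each time. A quick check of the arithmetic:

```python
# Little's Law: L = lambda * W
# (average occupancy = arrival rate * average time in system)

# Question 5: rate and wait known, solve for occupancy.
arrival_rate = 20          # customers per hour
wait = 2                   # hours between entering and exiting
L = arrival_rate * wait
print(L)                   # 40 customers

# Question 6: occupancy and wait known, solve for the rate.
occupancy = 200            # average people in the restaurant
lam = occupancy / wait
print(lam)                 # 100.0 people per hour
```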
  7. Suppose we decide to split up a disk journal into several separated block groups distributed throughout a disk. Is this advantageous? Why or why not?
    Answer: The power of the journal is that most writes to it require one disk operation for the whole write. Distributing the journal in any way makes that less likely and degrades the performance of the journal. So distributing a journal on a spinning disk is undesirable.

    Since the journal must be an ordered ring queue, the time spent seeking is not reduced, because at any one time, only one of the stripes is being used. There is no seek time advantage to distributing the journal.
    Scoring: +10 if correct, +8 if minor issues, +5 if major misconceptions, +3 if any value.

  8. In the command mkfs, there are a lot of options for constructing a filesystem. What happens if block groups in a filesystem are made too small? What if they are made too large?
    Answer: If block groups are too small, they fill completely, and reading one file requires skipping between block groups too often, so files take longer to read. If block groups are too large -- by contrast -- internal fragmentation within each block group increases over time, which also degrades disk read times.
    Scoring: +10 if correct, +8 if minor issues, +5 if major misconceptions, +3 if any value.
  9. One of the grand challenge problems of cloud computing is the zero-latency failover problem: how can one keep two computers so closely synchronized that one can replace the other quickly and completely? This, of course, includes disk contents. Describe how disk journals could help with this problem. How would you use journals, and what would be the limits of the solution?
    Answer: The simplest procedure is to constantly copy the contents of the journal on the live machine to the other machine, where it can be consumed as a set of changes. This only works if the two machines are otherwise exact duplicates, which is usually true. If the second machine refrains from making its own changes to the filesystem until the first machine fails, then failover becomes possible.

    This is usually accomplished by not even mounting the target filesystem on the second machine until failover. Instead, the journal is written to the raw disk with a special driver that is not a true filesystem driver. During failover, this disk is mounted and becomes the main disk.

    This is actually done in production systems, even with main and failover systems that are geographically far apart (east coast vs. west coast). This is part of the general discipline of "six-sigma uptime", which is the practice of running systems so that they are available 99.9999% ("six nines") of the time for critical needs, such as healthcare.

    In like manner, a database is often synchronized with a failover database via a database journal of changes, using the same technique.

    There were a number of interesting (but incorrect) approaches to this problem.

    Scoring: +10 if correct, +8 if minor issues, +5 if major misconceptions, +3 if any value.
  10. There were difficulties in Assignment 3 with setting some limits, because the limits could prevent the child process from running at all. Why?
    Answer: Doing an exec requires more than running the child process. One must also have enough resources to run the dynamic linker, which actually links and loads the target process. The exec itself is subject to whatever limits you set on the child. Thus,
    1. Setting RLIMIT_NOFILE too low keeps one from even opening the executable for the child process!
    2. Setting RLIMIT_DATA (the limit on heap size) too low keeps the dynamic linker from running at all.

    A large number of people talked about preventing the fork. The fork is completely under the control of the a3 programmer, so its failure is just a programming error: limits should be set after the fork and before the exec. The more important case is failure of the exec.
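    The fork/limit/exec ordering can be sketched in Python (a hedged illustration of the pattern, not the a3 code itself; "echo" stands in for the target program, and RLIMIT_CPU is used as a safe limit to demonstrate with). A limit set here that is too tight would make the exec itself fail, which is exactly the failure mode discussed above.

```python
import os
import resource

# Limits belong between fork() and exec(), so they apply only to
# the child -- including to the exec and dynamic linking themselves.
pid = os.fork()
if pid == 0:
    # Child: set a (generous) CPU limit, then exec the target.
    resource.setrlimit(resource.RLIMIT_CPU, (10, 10))
    os.execvp("echo", ["echo", "child ran under limits"])
    os._exit(127)          # reached only if the exec fails
else:
    _, status = os.waitpid(pid, 0)
    exit_code = os.WEXITSTATUS(status)
    print(exit_code)       # 0: the child exec'd and exited normally
```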
    Scoring: +10 if correct, +8 if minor issues, +5 if major misconceptions including preventing the fork rather than the exec, +3 if any value.

  11. (Extra Credit) The library functions fsync and fdatasync ensure that the memory pages for a file are posted to disk, and mark the memory pages as "clean" rather than dirty, but do not de-allocate the pages. Why is this a good design?
    Answer: As we learned in assignment 4, deallocation of clean pages is fast; the only slow part is making them clean. Thus clean pages do not significantly delay LRU recovery. Meanwhile, by the principle of locality, it is likely that the processes manipulating those pages will need them again, so the pages remain available until the memory is needed for something else.
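    The usage pattern is simply write, fsync, keep reading (a minimal sketch using a temporary file of my own choosing). After the fsync, the data is durable on disk, but the pages stay in the cache -- now clean -- so the read below is served from memory.

```python
import os
import tempfile

# Write, then force the dirty pages to disk. fsync marks the pages
# clean but does not free them, so the file stays cached.
fd, path = tempfile.mkstemp()
os.write(fd, b"important data")
os.fsync(fd)          # pages now on disk, marked clean, not freed
os.close(fd)

with open(path, "rb") as f:
    data = f.read()   # served from the (clean) cached pages
print(data)           # b'important data'
```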
    Scoring: +10 if correct, +8 if minor issues, +5 if major misconceptions, +3 if any value.