Comp111: Operating Systems
Classroom Exercise 18 Answers
Storage
Fall 2017

  1. Journals preserve consistency by entering special records called "consistency watermarks" into the journal whenever the filesystem is in a consistent state. How can this be used to restore a filesystem after a crash?
    Answer: Recall that the goal of fsck is to restore consistency. If we play the journal back up to the last consistency watermark and discard any changes recorded after it, the result is consistent by definition. The effect is the same as running fsck.
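    The playback described above can be sketched in a few lines. This is a toy model, not how any real filesystem stores its journal: the list-of-records journal, the WATERMARK marker value, and the name replay_to_last_watermark are all assumptions made for illustration.

```python
WATERMARK = "WATERMARK"  # hypothetical marker record, not a real on-disk format


def replay_to_last_watermark(disk, journal):
    """Apply journal writes up to the last watermark; discard everything after it."""
    last = -1
    for i, record in enumerate(journal):
        if record == WATERMARK:
            last = i
    recovered = dict(disk)
    for record in journal[:last + 1]:
        if record != WATERMARK:
            block, data = record
            recovered[block] = data
    return recovered


journal = [("A", "a1"), ("B", "b1"), WATERMARK, ("A", "a2"), WATERMARK, ("C", "c1")]
recovered = replay_to_last_watermark({}, journal)
print(recovered)  # {'A': 'a2', 'B': 'b1'}
```

    The write to C never reached a watermark, so it is dropped rather than applied.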
  2. Suppose that I recover from a crash by reconciling a journal (to the last consistency watermark) and -- before I'm done -- the system crashes again. How does one recover from the second crash?
    Answer: Reconciling a journal is an idempotent operation. If a reconciliation is interrupted by a second crash, one simply starts reconciling again from the beginning: the second pass re-applies everything the first pass did and then continues. The result is the same as if there had been a single reconciliation, and the filesystem becomes consistent.
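    A small simulation of this argument, assuming each journal record is an absolute write of the form "block X now holds data Y" (which is what makes replay idempotent). The replay function and its crash_after parameter are hypothetical names used only for this sketch.

```python
def replay(disk, records, crash_after=None):
    """Write each (block, data) record in order; stop early to mimic a crash."""
    for i, (block, data) in enumerate(records):
        if crash_after is not None and i >= crash_after:
            return False  # crashed before finishing
        disk[block] = data
    return True


records = [("A", "a1"), ("B", "b1"), ("A", "a2")]

# One uninterrupted recovery.
clean = {"A": "old", "B": "old"}
replay(clean, records)

# A recovery that crashes after two records, then restarts from the beginning.
crashed = {"A": "old", "B": "old"}
replay(crashed, records, crash_after=2)
replay(crashed, records)

print(crashed == clean)  # True
```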
  3. In an EXT3 filesystem (which can be thought of as like EXT2, but with bigger inodes, addresses, and a journal), what happens to block read time when a journal is used? Does it stay the same or increase or decrease? Why?
    Answer: Block read time increases. Every read that is not satisfied from the page cache must first check the journal hash, because the journal may hold a newer version of the block than the one stored at its source location on disk. Only if the journal has no entry for the block is the source disk block read.
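    The extra step on the read path might look like the following sketch. It follows the model used in this answer, with a plain dictionary standing in for the journal hash; read_block and journal_index are hypothetical names.

```python
def read_block(addr, journal_index, disk):
    """Return the newest copy of a block: check the journal first, then the disk."""
    if addr in journal_index:   # extra lookup paid on every uncached read
        return journal_index[addr]
    return disk[addr]


disk = {7: "old contents", 8: "untouched"}
journal_index = {7: "new contents"}  # block 7 was rewritten in the journal

print(read_block(7, journal_index, disk))  # new contents
print(read_block(8, journal_index, disk))  # untouched
```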
  4. In the "pure" journalling approach, in which the whole disk is a journal, what happens as the disk fills up?
    Answer: The pure journalling approach presumes that entries are written to a journal in the most efficient way, while reads are hashed so that they appear to be reading disk blocks. As the disk fills up, there is no place on the disk that does not have internal fragmentation due to deleted files.

    The key to this, however, is that the transaction queue is relocatable. Thus it is possible to defragment the journal by removing transactions that have since been overwritten. This is a matter of starting at the beginning of the queue and copying the queue onto itself, dropping entries that have been superseded by a later write to the same block and are of a specific age. For example, suppose that we have a transaction queue

    A1 B1 C1 A2 D1 A3 C2 E1
    where A, B, C, D, E represent the addresses of physical blocks and the subscript indicates the version of a block. Clearly, at the present time, this transaction queue is equivalent to
    B1 D1 A3 C2 E1
    because the earlier versions A1, A2, C1 no longer matter.
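    The same reduction can be checked mechanically. This sketch keeps only the last entry for each block and preserves queue order; it ignores the age criterion mentioned above, and the tuple representation and the name compact are assumptions made for illustration.

```python
def compact(queue):
    """Keep only the most recent entry for each block, preserving queue order."""
    last_position = {block: i for i, (block, _version) in enumerate(queue)}
    return [entry for i, entry in enumerate(queue) if last_position[entry[0]] == i]


queue = [("A", 1), ("B", 1), ("C", 1), ("A", 2), ("D", 1), ("A", 3), ("C", 2), ("E", 1)]
print(compact(queue))
# [('B', 1), ('D', 1), ('A', 3), ('C', 2), ('E', 1)]
```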
  5. Is a file erased when its directory entry is erased? Why or why not? What does it take to truly "erase" a file?
    Answer: No. The content of a file persists even after it is deleted. Deletion de-allocates its inode and removes its directory entry, but does not overwrite its blocks. Some programs can recover files based upon this. To truly erase a file, its data blocks must themselves be overwritten.
  6. In a RAID 5 filesystem, explain why reads become slower when a disk fails.
    Answer: The file is spread over several disks. When one fails, its contents must be reconstructed from the data and parity blocks on the surviving disks, so reading one missing block requires reading the rest of its stripe. This slows down reads (slightly for hardware RAID; significantly for software RAID).
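    The reconstruction is a byte-wise XOR across the surviving blocks of a stripe. The sketch below shows one stripe with three data blocks and one parity block; the helper name xor_blocks and the sample bytes are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]  # one stripe on three data disks
parity = xor_blocks(data)                       # stored on a fourth disk

# The disk holding data[1] fails: rebuild it from the survivors plus parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True
```

    A healthy array reads one block to return data[1]; the degraded array must read three blocks and XOR them, which is where the extra read time comes from.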
  7. (Advanced) In a flash drive with virtualized blocks, private data cannot be erased efficiently. Why?
    Answer: You are not in control of when blocks are overwritten. In fact, when you try to overwrite a block with garbage, you actually write that garbage into a new block. Thus in order to erase something, you must cycle through writing garbage into all free blocks on the disk, which will significantly shorten the life of the flash.
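    A toy model of the remapping makes the problem concrete. Real flash translation layers are far more involved (wear leveling, garbage collection, erase blocks); the ToyFlash class below is a hypothetical sketch that assumes only that each write goes to a fresh physical block.

```python
class ToyFlash:
    """Toy flash device: every write lands in a fresh physical block."""

    def __init__(self, n_physical):
        self.physical = [None] * n_physical
        self.mapping = {}      # logical block number -> physical block number
        self.next_free = 0

    def write(self, logical, data):
        self.physical[self.next_free] = data
        self.mapping[logical] = self.next_free
        self.next_free += 1

    def read(self, logical):
        return self.physical[self.mapping[logical]]


flash = ToyFlash(4)
flash.write(0, "secret")
flash.write(0, "garbage")          # an attempt to overwrite the secret

print(flash.read(0))               # garbage
print("secret" in flash.physical)  # True
```

    The logical block now reads back as garbage, yet the old data still sits in a physical block that the user can no longer address.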

    This was not covered in lecture due to shortage of time. Sorry.