Comp111: Operating Systems
Classroom Exercise 2 Answers
System Calls
Fall 2017

In class we have discussed many Linux system calls. Let's make sure we understand the important issues.

  1. Can a process's children see what it prints? Why or why not?
    Answer: The process's children get access to the same open files, but not to what has already been printed to those files. The stdio print buffers are, however, duplicated by fork. Thus if output is still sitting in a buffer at the time of the fork, and both parent and child later flush it, that output appears twice.

    But there is a subtle difference between having access to what has been printed before, and getting access to what is printed in the future.

    There is -- in general -- no way to figure out what has been printed to, e.g., a terminal. When printing to files, it is possible to "rewind" the file to see what has been printed, but only for files that have been opened in read/write mode.

    So, in general, it isn't possible for a process's children to see what it prints.

    We will, however, study how to create communications between parent and child, in the next lecture.

  2. Would a call that does a fork and exec as one thing be as useful? Why or why not?
    Answer: The main disadvantage of a fork_exec() call is that it is often desirable to modify one's environment after the fork and before exec'ing a child, e.g., to impose a resource limit on the child alone. This would be impossible if fork and exec were the same call. The library function "system" is in fact a combination of fork, exec (of a shell that runs the command), and wait, without any modification of one's environment. We will study ways to exploit the state between fork and exec in the next lecture.
  3. Why is it not necessarily a good idea to fork a program a large number of times to accomplish many separate tasks?
    Answer: Because of context switching overhead, it takes time to switch between the forked processes. Thus, for compute-bound processes -- e.g., weather forecasting -- forking off more processes than there are CPUs to run them can actually take longer than running the computations in one process! For I/O-bound processes -- e.g., a web server -- forking is quite advantageous, because the I/O operations can be interleaved (via latency hiding) and computation is not the bottleneck.
  4. Why would it be advantageous for an application program to avoid making system calls? What are the disadvantages of implementing something in the kernel rather than in user space?
    Answer: Any time one makes a system call there is a complex switch of context from user to kernel mode and then back, involving changing the whole memory map and protection scheme. This is time-consuming, which is why we tend to minimize system calls. For example, we use buffering in printf to call write a minimum number of times.

    Thus, programs that can be written without system calls tend to be faster than those that are written with system calls. For example, compute-bound programs are faster if one avoids writing results until after the computation, which is -- in essence -- the same strategy that printf uses.

    The kernel is executed with memory protection off; thus any bug or malicious code has the freedom to do more or less anything. There are more subtle limits; the kernel runs in a fixed memory space, and thus cannot easily expand its memory like a process can, so things that need a lot of memory should not be done in the kernel. In fact, there are several parts of the operating system that actually run in user mode in order to use dynamic memory allocation, as we will discuss later.

    (Advanced) We will see some very clever ways of avoiding system calls, including user-mode mutexes. These do not completely avoid system calls, but instead, minimize them and call them only when needed.

  5. Consider the code:
     
    pid_t pid1, pid2; int status; 
    struct rusage usage; 
    if ((pid1=fork())) { // parent
        printf("I am parent %d; child is %d\n",getpid(),pid1); 
        pid2=wait3(&status, 0, &usage); 
        printf("exit code for %d is %d\n", pid2, WEXITSTATUS(status)); // decode the exit code from the raw status
    } else { 
        execl("/bin/cat", "/bin/cat", "/comp/111/news/0001.txt", NULL); 
        printf("we should never get here!\n"); 
    } 
    
    1. Explain exactly when the "if" statement is true.
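      Answer: It is true if the fork() return value (as stored in pid1) is nonzero. On success that value is the pid of the child, so the branch runs in the parent of the parent-child pair. Note that fork() returns -1 on failure, which is also nonzero; the code as written would then take the parent branch with no child to wait for.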
    2. Why is the status an integer rather than something more complicated? What impact would there be for returning a more complicated status to the parent?
      Answer: Remember that anything we can do in one process has to be an option in every process. If the status could be larger, then zombies would take up more memory until they were reaped, and the whole process control block (PCB) that stores status information would have to be larger.
  6. (Advanced) I claim that in this particular case, whether the parent waits for the child is irrelevant and cannot create zombies. Why would that be? Under what precise conditions is it a bad idea for a parent to skip reaping of a child status?
    Answer: This particular example is safe because the parent exits rather than staying running. If the child is still alive at that point, it is re-parented to init, which reaps it even though the original parent is gone.

    It would be bad if the parent kept running without reaping: each exited child would remain a zombie for as long as the parent lives. It is always safe for the parent to exit, but not safe for it to keep running inattentively.