Comp111: Operating Systems
Classroom Exercise 3 Answers
Signals and handlers
Fall 2017

In class we have discussed the concept of a signal, including sending and handling signals. Let's make sure we understand the important issues.

  1. Why can't you block a segmentation fault signal (SIGSEGV)?
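    Answer: You can respond to a segmentation fault in a handler, but you can't usefully block or ignore it, because the fault leaves the process in an unresumable state. There is no reasonable way for the process to continue once it has referenced non-existent memory. POSIX leaves the behavior undefined if a hardware-generated SIGSEGV is blocked or ignored, and on Linux the process is terminated anyway. If a handler for a real fault simply returns, the faulting instruction is typically re-executed and faults again, so the only sensible handlers report the problem and then exit.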
  2. I claimed in class that one can actually stack signals when using obsolete signal() syntax, but if you try that with the keyboard you will not be able to do it. How can one reliably generate stacked signals?
    Answer: The problem is that the signal handlers are too short in duration to stack. The keyboard can only generate about 10 events per second, maximum, while a signal handler can complete in as little as one microsecond, so each handler finishes long before the next keystroke arrives. However, a program that calls kill() in a loop can generate thousands of signals per second, making stacking more likely.
  3. Why is there a separate signal stack for receipt of signals? What would go wrong if the execution stack were used for this purpose?
    Answer: It is important to note that a signal can interrupt the program between any two machine instructions, not just between C statements. The signal stack does not function like the program stack. Rather than keeping state just for the signal handler, it contains complete information for resuming the process at an arbitrary point in its machine code, including the full register contents. This means -- in turn -- that the handler is never called directly, but is always interrupting something, where that something can include initializing the active frame of the process stack.

    Thus, the signal stack is kept separate to avoid confusion between the stacks.

    However, there is a deeper reason that the signal stack is kept separate. The most common bug in a C program is to leave a stack variable uninitialized. Leaving a variable uninitialized means that it gets the value that was last in memory at its stack location. Without signals, this value can be highly predictable and the program can possibly work consistently in the presence of an uninitialized variable.

    But let's suppose that the same stack were used for signals and process. This would leave values around -- in unpredictable places -- from processing signals. Even though this could be made semantically predictable, the simple fact that the contents of the process stack are being changed asynchronously from the process can lead to what we call "Heisenbugs": situations in which the program crashes in irreproducible ways. In Linux, we prevent "Heisenbugs" by ensuring that the stack stays in a predictable state. We will see that Linux also takes steps to prevent Heisenbugs during memory management via similar mechanisms.

  4. Consider the code:
     
    
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <signal.h>
    #include <sys/time.h>
    #include <sys/resource.h>
    #include <sys/wait.h>   // waitpid, WEXITSTATUS
    
    // SIGCHLD handler: reap one child and report its exit status
    void reaper(int sig, siginfo_t *info, void *buf) {
        int status;
        pid_t pid = waitpid(-1, &status, 0); // wait for any child
        printf("pid %d with status %d reaped by parent\n", (int)pid, WEXITSTATUS(status));
    }
    
    int main(void) {
        struct sigaction sa, old_sa;
        sigset_t mask;
        sigemptyset(&mask);
        sa.sa_sigaction = reaper;
        sa.sa_mask = mask;
        sa.sa_flags = SA_SIGINFO;  // enable siginfo_t argument
    
        if (fork()) {
            printf("I'm the parent with PID %d\n", (int)getpid());
            sigaction(SIGCHLD, &sa, &old_sa);  // handler is registered only after the fork
        } else {
            printf("I'm the child with PID %d\n", (int)getpid());
        }
        return 0;
    }
    
    1. What happens if you leave off the WEXITSTATUS and just print status? Why?
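      Answer: I apologize for not getting to this in lecture. It was the next slide! The status variable filled in by waitpid is not a simple exit status, but encodes more information about how the process ended, such as whether it exited normally or was killed by a signal. The macro WEXITSTATUS extracts just the exit status part of this more complex code. On Linux, for example, a child that exits with status 1 produces a raw status of 256. No deduction for not knowing this.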
    2. You observe that sometimes the printout in reaper does not occur. Why does this happen?
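      Answer: The sigaction call occurs after the fork, so the child may already have exited by the time the handler is registered. If no handler is registered when SIGCHLD arrives, the default action applies, which is to ignore the signal. In this program the child has very little to do, so it often finishes before the handler is registered. The parent also returns from main right after calling sigaction, so it can exit before a late SIGCHLD is delivered.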
  5. (Advanced) Some really sneaky signal handlers actually correct for errors. For example, it is possible with extreme cleverness to catch and correct a SIGFPE (floating point exception). But this is tricky. Explain why it's difficult and what you would have to do to correct a divide-by-zero error.
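    Answer: Note that the IEEE floating point standard defines that floating point exceptions such as division by zero produce a special value (infinity, or NaN, "not a number", for 0.0 / 0.0) rather than stopping the program. By default these exceptions are masked: execution is not terminated, no SIGFPE is delivered, and the special value is simply returned. SIGFPE arrives only if floating point traps have been enabled or, on x86, for an integer division by zero, and its default action is to terminate the process.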

    One can correct for this dynamically by manipulating values in global variables. Suppose you want to always replace NaN with zero. To do this, you need to store the result of the operation in a global variable that you can modify in the signal handler. Then -- in the code -- you check whether this global variable has been modified and -- if so -- use its value instead of the result of the operation.

    This is really subtle, though. Remember that when you execute Y = X / 0.0, the right-hand side is evaluated before the assignment to Y. Thus, the SIGFPE occurs before Y is set. Once the handler returns, Y still ends up holding the special value (NaN or infinity). So, you must check after this statement for a global variable indicating that SIGFPE has occurred, and then replace the value with 0.0.

    In practice it is easier to check for NaN after the statement.
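
    The simpler approach in the last sentence can be sketched as follows, using the standard isnan and isinf macros from math.h. The function name div_or_zero and the choice of 0.0 as the replacement value are assumptions made for illustration.

```c
#include <math.h>

/* Divide x by y, replacing a NaN or infinite result with 0.0.
   No signal handler is involved: the result is checked after the fact. */
double div_or_zero(double x, double y) {
    double q = x / y;
    if (isnan(q) || isinf(q)) {
        return 0.0;
    }
    return q;
}
```

    Here div_or_zero(6.0, 3.0) gives 2.0, while div_or_zero(1.0, 0.0) and div_or_zero(0.0, 0.0) both give 0.0.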