Comp150CPA: Clouds and Power-Aware Computing
Classroom Exercise 10b (unassigned)
Map and Reduce
Spring 2011

group member 1: ____________________________ login: ______________

group member 2: ____________________________ login: ______________

group member 3: ____________________________ login: ______________

group member 4: ____________________________ login: ______________

group member 5: ____________________________ login: ______________

In class we have studied how to convert familiar algorithms to the MapReduce framework. Let's explore that in more detail.

  1. Why can the reducer and combiner be the same subroutine in WordCount?

  2. Suppose that rather than counting words, you want to compute the range of frequencies of words, i.e., the lowest and highest word counts over all documents in a set. Describe the mapper input and output necessary to solve this problem. Feel free to use new classes.
  3. Suppose that after mapping, you have four mapper outputs with contents
    Mapper 0
    Mapper 1
    Mapper 2
    Mapper 3
    Suppose that the partioning hash function maps words beginning with "a"-"g" to 0, "h"-"n" to 1, "o"-"u" to 2, and "v"-"z" to 3. Simulate the result of each step of the butterfly sort and describe how many communication steps are necessary to completely sort the data above to appropriate workers.

  4. (Advanced) Suppose that we represent a table of distances between cities in MapReduce as city1 -> < distance, city2 > where city1 and city2 are cities and distance is the shortest distance between them. Give a MapReduce algorithm for determining the shortest distance between two cities X and Y, using intermediate cities when necessary. What does the Map step do, and what does the Reduce step do?