Comp150CPA: Clouds and Power-Aware Computing
Classroom Exercise 9
Transformations and Schemas
Spring 2011

group member 1: ____________________________ login: ______________

group member 2: ____________________________ login: ______________

group member 3: ____________________________ login: ______________

group member 4: ____________________________ login: ______________

group member 5: ____________________________ login: ______________

In class we have studied how Pig statements transform data schemas. Let's explore this in more detail. Suppose that inside Pig's grunt shell we have:

 
grunt> DESCRIBE x; 
x: {name: chararray, bored: int}
grunt> DESCRIBE y; 
y: {name: chararray, joking: int}
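
For reference, relations with the schemas shown above could be produced by LOAD statements along the following lines. This is only a sketch: the file names 'x.txt' and 'y.txt' and the tab delimiter are assumptions made for illustration, since the exercise does not say how x and y were loaded.

```pig
-- Hypothetical input files; any tab-separated text files with two columns would do.
x = LOAD 'x.txt' USING PigStorage('\t') AS (name:chararray, bored:int);
y = LOAD 'y.txt' USING PigStorage('\t') AS (name:chararray, joking:int);

-- DESCRIBE prints the schema Pig has inferred for a relation.
DESCRIBE x;
DESCRIBE y;
```

Running DESCRIBE on each z (and w) below is a convenient way to check your answers after you have worked them out by hand.
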
  1. What schemas result from the following statements?
    1. z = FOREACH x GENERATE name;





    2. z = FILTER x BY name=='Alva';





    3. z = GROUP x BY name;
      





    4. z = JOIN x BY bored, y BY joking;
      





    5. z = GROUP x BY name;
       w = FOREACH z GENERATE group AS name, x.bored AS bored;
      




  2. (Advanced) The Pig cookbook advises that one should use FOREACH-GENERATE to omit unneeded columns as early as possible and as often as possible. Why?