Comp150CPA: Clouds and Power-Aware Computing
Classroom Exercise 9
Transformations and Schemas
Spring 2011
group member 1: ____________________________ login: ______________
group member 2: ____________________________ login: ______________
group member 3: ____________________________ login: ______________
group member 4: ____________________________ login: ______________
group member 5: ____________________________ login: ______________
In class we have studied how Pig statements transform data
schemas. Let's explore this in more detail.
Suppose that inside Pig's grunt shell, we have:
    grunt> DESCRIBE x;
    x: {name: chararray, bored: int}
    grunt> DESCRIBE y;
    y: {name: chararray, joking: int}
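One way to experiment is to load two small test files with these schemas and ask Pig to DESCRIBE the result of a statement. The sketch below is only a suggestion: the file names x.txt and y.txt are hypothetical, the files are assumed to be tab-separated (the default for LOAD), and ORDER was picked only because it is not one of the statements in this exercise.

```pig
-- Hypothetical tab-separated input files; x.txt and y.txt are assumed names.
x = LOAD 'x.txt' AS (name:chararray, bored:int);
y = LOAD 'y.txt' AS (name:chararray, joking:int);

-- Apply a statement, then compare the schemas before and after it.
z = ORDER x BY bored;
DESCRIBE x;
DESCRIBE z;
```

Replacing the ORDER line with any other statement and running DESCRIBE again shows how that statement changes the schema.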
- What schema results from each of the following statements?
    z = FOREACH x GENERATE name;
    z = FILTER x BY name == 'Alva';
    z = GROUP x BY name;
    z = JOIN x BY bored, y BY joking;
  And from this pair of statements, taken together (what is the schema of w)?
    z = GROUP x BY name;
    w = FOREACH z GENERATE group AS name, x.bored AS bored;
- (Advanced) The Pig cookbook advises using FOREACH ... GENERATE to
  drop unneeded columns as early as possible and as often as possible. Why?
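To make the cookbook's advice concrete (what it looks like, not why it helps), here is a minimal sketch. It assumes a hypothetical input file x.txt with the schema of x above and a task that only needs a count of rows per name; the relation names names, by_name, and counts are invented for illustration.

```pig
-- Hypothetical input file; the name x.txt is an assumption.
x = LOAD 'x.txt' AS (name:chararray, bored:int);

-- Project early: only name is needed downstream, so bored is dropped here.
names = FOREACH x GENERATE name;

-- Group and count using the narrowed relation.
by_name = GROUP names BY name;
counts = FOREACH by_name GENERATE group AS name, COUNT(names) AS n;
DUMP counts;
```

The same counts could be computed by grouping x directly; the only difference in this version is the FOREACH that discards bored before the GROUP.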