Comp150CPA Homework Exercise 3: Becoming a Pig

Overview

In groups of 2-5 people, please create the following Pig scripts. I will only give you the schema for data and the schema for output. You have to test with reasonable data.

  1. Please write lookup.pig that simulates looking up the second 20 messages in my Gmail mailbox (i.e., messages numbered 21-40 for mailbox alva.couch), when ordered according to time of arrival. The input schema is:
    mail: {mailbox: chararray, time:long, messageid: long}
    and the output schema should be the same, i.e.,
    out: {mailbox: chararray, time:long, messageid: long}
    but only for mailbox alva.couch and only for the messages that fall in positions 21 through 40 when ordered by time. You may assume that the time is Linux system time, in seconds since the epoch.
  2. Please write lazy.pig that determines which users are forgetting to log out of their accounts. The input schema is:
    log: {time:long, user:chararray, action:chararray, status:chararray, address:chararray}
    The output schema should be:
    out: {user: chararray, times:int}
  3. Please write first.pig that identifies who made the first post in each positive buzz about 'Toyota'. The input schema is:
    posts: {time:long, person:chararray, mentions:chararray, opinion:chararray}
    The output schema should be:
    out: {time:long, person:chararray, posts:int}
    As in the classroom example, feel free to choose 3600 seconds as the sample length for the buzz.
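
Since only schemas are given, one way to get "reasonable data" for lookup.pig is to generate it and compute the expected answer independently. The following is a minimal sketch in Python, not part of the assignment itself. It assumes tab-separated input (the default delimiter for PigStorage); the second mailbox name, the time range, and the message ids are invented for illustration, and arrival times are kept distinct so that the ordering is unambiguous.

```python
import random


def make_mail(n_per_box=50, seed=0):
    """Build (mailbox, time, messageid) rows with distinct arrival times."""
    rng = random.Random(seed)
    # "someone.else" is a made-up second mailbox so the filter has work to do.
    boxes = ["alva.couch", "someone.else"]
    # Distinct epoch-second timestamps from an arbitrary, invented range.
    times = rng.sample(range(1_300_000_000, 1_300_100_000),
                       n_per_box * len(boxes))
    rows = [(boxes[i % len(boxes)], t, 1000 + i) for i, t in enumerate(times)]
    rng.shuffle(rows)  # input order should not matter to the script
    return rows


def expected_lookup(rows, mailbox="alva.couch", first=21, last=40):
    """Rows for one mailbox in positions first..last when ordered by time."""
    mine = sorted((r for r in rows if r[0] == mailbox), key=lambda r: r[1])
    return mine[first - 1:last]


def to_tsv(rows):
    """Render rows as tab-separated lines, one tuple per line."""
    return "".join(f"{m}\t{t}\t{i}\n" for m, t, i in rows)


rows = make_mail()
expected = expected_lookup(rows)
```

Writing to_tsv(rows) to a file gives an input for the script, and to_tsv(expected) gives the 20 lines the script's output can be compared against, for example with diff after sorting both by time.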

Submitting completed assignments

This assignment is a bit more complex to submit because you need to submit machine-readable code.

We will submit this assignment as a set of files. Click here to fill out a form that, when submitted, generates a printable form that can be saved to a file, e.g., hw03.html. Next, submit this file along with your three scripts via provide:

 
    provide comp150cpa hw03submit lookup.pig lazy.pig first.pig hw03.html

Completed assignments will be printed, graded by hand, and scanned back into the system. The grade for hw03submit will be listed as the grade for hw03 in provide, so we can give everyone in the group credit.