Name: ______________________________________________
Login: ______________
Please answer the following questions on these sheets. Please put your name at the top of each sheet. You may use any printed material. Electronic devices are prohibited.
Can a @Persistent object in AppEngine contain an instance of another @Persistent object as a member? Why or why not?
time person activity
where:
time is a timestamp of the event in seconds since Jan 1, 1970 00:00:00 GMT.
person is the name of a person.
activity is the name of the activity that was started at this time, e.g., "sleeping", "running", "walking", "working", etc.
Write a program that outputs the total time George spent sleeping. This is the sum of the differences between each time that George started sleeping and the next time that George started doing something else.
For example, for the data
10001 George sleeping
10006 George walking
10100 George sleeping
10200 George showering
10500 George running
your program should output the number 105, which is 10006 - 10001 + 10200 - 10100. You may assume that every time George starts sleeping, there is another subsequent event when George stops sleeping.
log1 = LOAD 'file.dat' USING PigStorage(' ') AS (stamp:long, name:chararray, activity:chararray);
-- schema is log1:{stamp, name, activity}
We are only interested in George:
log2 = FILTER log1 BY name == 'George';
log3 = FOREACH log2 GENERATE stamp, activity; -- schema is log3:{stamp, activity}
And we are only interested in whether George is sleeping or not:
tmp1 = FILTER log3 BY activity == 'sleeping';
asleep = FOREACH tmp1 GENERATE stamp; -- schema is asleep:{stamp}
tmp2 = FILTER log3 BY activity != 'sleeping';
awake = FOREACH tmp2 GENERATE stamp; -- schema is awake:{stamp}
After this:
asleep is a list of times George went to sleep:
10001
10100
awake is a list of times George started doing something other than sleeping (this includes every time he woke up):
10006
10200
10500
We then pair up the asleep and awake states, using a cross product:
prod = CROSS asleep, awake; -- schema is prod:{asleep::stamp, awake::stamp}
less = FILTER prod BY asleep::stamp < awake::stamp;
grp = GROUP less BY asleep::stamp; -- schema is grp:{group, less:{asleep::stamp, awake::stamp}}
after which we have grp as follows:
10001 {(10001, 10006), (10001, 10200), (10001, 10500)}
10100 {(10100, 10200), (10100, 10500)}
and throw away all but the smallest awake::stamp:
least = FOREACH grp {
    foo1 = ORDER less BY awake::stamp;
    foo2 = LIMIT foo1 1;
    GENERATE FLATTEN(foo2);
} -- schema is least:{asleep::stamp, awake::stamp}
which gives us
least as:
10001 10006
10100 10200
after which we subtract and join things:
sub = FOREACH least GENERATE (awake::stamp - asleep::stamp) AS slept; -- schema is sub:{slept}
every = GROUP sub ALL; -- schema is every:{group, sub:{slept}}
total = FOREACH every GENERATE SUM(sub.slept);
DUMP total;
Of course, there are many other ways to do this.
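As a cross-check of the expected answer outside Pig, the same steps (filter by person, split into asleep and awake times, pair each sleep start with the smallest later awake time, subtract, and sum) can be sketched in Python. The function name total_sleep and the in-memory list of tuples are assumptions made for illustration; they are not part of the exam or of Pig.

```python
def total_sleep(events, person="George"):
    """Sum the sleep durations of one person.

    events is an iterable of (stamp, name, activity) tuples.
    Assumes, as the question does, that every sleep start is
    followed by a later non-sleeping event for the same person.
    """
    # Keep only this person's events (the FILTER ... BY name step).
    mine = [(stamp, activity) for stamp, name, activity in events if name == person]

    # Split into sleep starts and starts of any other activity.
    asleep = [stamp for stamp, activity in mine if activity == "sleeping"]
    awake = [stamp for stamp, activity in mine if activity != "sleeping"]

    total = 0
    for start in asleep:
        # Of all later awake times (the filtered cross product),
        # the smallest one is the moment this sleep ended.
        end = min(stamp for stamp in awake if stamp > start)
        total += end - start
    return total


# The sample data from the question.
events = [
    (10001, "George", "sleeping"),
    (10006, "George", "walking"),
    (10100, "George", "sleeping"),
    (10200, "George", "showering"),
    (10500, "George", "running"),
]

print(total_sleep(events))  # prints 105
```

Like the cross product in the Pig version, this compares every sleep start against every awake time, so it is quadratic in the number of George's events; a single pass over the events sorted by time would also work.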
time stock change
where:
time is a time of day.
stock is a stock code.
change is the change in value (in + or - points).