Comp150CPA: Clouds and Power-Aware Computing
Classroom Exercise 11 Answers
group member 1: ____________________________ login: ______________
group member 2: ____________________________ login: ______________
group member 3: ____________________________ login: ______________
group member 4: ____________________________ login: ______________
group member 5: ____________________________ login: ______________
In class we have studied how to think of a cloud application as a service,
including choices for front-facing cloud datastores.
Let's explore that in more detail.
- What is the CAP class of a database server running on
a single machine? Why?
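It is in at least class CA: there is only one copy of the data, so reads
are consistent, and the server answers whenever it is reachable. It has no
protection against failure, because an outage on the server can lead to
data loss. Arguably, since a single machine cannot be partitioned,
one could consider it to be in class P as well.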
- Is there a reasonable use for cloud datastores in CAP class CA?
What might it be?
A datastore in class CA has consistency and availability but not partition
tolerance. Appropriate uses of such a datastore include
- Best-effort services where transaction resilience is not promised,
e.g., social locator services.
- Situations in which there is no effect of partitioning, including
where a database is replicated precisely on a farm of servers.
The Google search service is in class A and thus class CA. It
doesn't need to worry about consistency, because its returns are all
best-effort. It doesn't need to protect against partitioning, because
it is read-only once it starts, and consists of farms of duplicate
servers.
- I claimed in lecture that LinkedIn stores the results of a Pig
job in a NoSQL datastore. For assignment 4,
what is the key in this datastore, and what is the value?
The key is the identity of one user, and the value is a structure
of potential friends and the friends-in-common.
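As a rough illustration of that key/value shape, here is a minimal Python sketch. It is not the Pig job itself: the function name `build_suggestions`, the input format, and the sample names are all assumptions made for the example, and a plain dict stands in for the NoSQL datastore.

```python
from collections import defaultdict

def build_suggestions(friends):
    """Build a dict standing in for the datastore (hypothetical helper).

    friends: dict mapping each user to the set of that user's friends.
    Returns: dict mapping user (the key) to a structure of potential
    friends, each with the sorted list of friends in common (the value).
    """
    store = {}
    for user, direct in friends.items():
        candidates = defaultdict(set)
        for friend in direct:
            for fof in friends.get(friend, set()):
                # Skip the user and anyone who is already a friend.
                if fof != user and fof not in direct:
                    candidates[fof].add(friend)
        store[user] = {c: sorted(common) for c, common in candidates.items()}
    return store

# Made-up sample data, for illustration only.
friends = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave"},
    "dave": {"bob", "carol"},
}
store = build_suggestions(friends)
print(store["alice"])
```

A lookup at serving time is then a single read by key, such as `store["alice"]`, which is the access pattern a key/value datastore is good at.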
- In Amazon Dynamo,
the datastore key for shopping cart contents
is the content of a cookie stored locally in your browser.
Suppose that one user opens two instances of a
shopping cart in two panes of the same browser, and proceeds to update
each one by deleting a different item. Draw a picture showing why
this is a conflict in the vector clock algorithm. Then describe what
the vector clock algorithm might do to resolve this situation,
to business advantage.
The vector clock algorithm stores both versions of objects and the
version timestamps. In this case, the situation is something like:
Version 1: initial
Version 2: derived from Version 1 at time 2
Version 3: derived from Version 1 at time 3
Which, as a picture, looks like this:
            Version 1
   delete  /         \  delete
    Version 2         \
                    Version 3
The vector clock algorithm resolves this discrepancy according to
business rules, in this case, merging the carts:
            Version 1
   delete  /         \  delete
    Version 2         \
           \        Version 3
            \         /
   Version 4 (merge of Version 2 and Version 3)
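The conflict and the merge can be sketched in Python as follows. This is a simplified illustration, not Dynamo's implementation: the node names "A", "B" and "C", the cart contents, and all function names are assumptions for the example. Merging by set union reflects the business choice that an item is never lost from a cart, at the cost that deleted items can reappear.

```python
def descends(a, b):
    """True if clock a has seen every event recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def concurrent(a, b):
    """Two versions conflict when neither clock descends from the other."""
    return not descends(a, b) and not descends(b, a)

def merge_clocks(a, b):
    """Component-wise maximum of two vector clocks."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}

def merge_carts(x, y, coordinator):
    """Reconcile two conflicting carts by set union (assumed business rule)."""
    clock = merge_clocks(x["clock"], y["clock"])
    clock[coordinator] = clock.get(coordinator, 0) + 1
    return {"items": x["items"] | y["items"], "clock": clock}

# Version 1 is written through hypothetical node A.
v1 = {"items": {"book", "lamp", "mug"}, "clock": {"A": 1}}
# Each browser pane deletes a different item, handled by different nodes.
v2 = {"items": v1["items"] - {"book"}, "clock": {"A": 1, "B": 1}}
v3 = {"items": v1["items"] - {"lamp"}, "clock": {"A": 1, "C": 1}}

print(concurrent(v2["clock"], v3["clock"]))  # neither pane saw the other's delete
v4 = merge_carts(v2, v3, coordinator="A")
print(sorted(v4["items"]), v4["clock"])
```

In this sketch Version 4 contains every item from Version 1 again, because each deleted item still exists in the other version. From the seller's point of view that is the safe direction to err in: the customer may have to delete an item twice, but never loses one.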
- (Advanced) A key feature of Amazon's Dynamo is what is called
"business-logic-based recovery". If a server is lost during a post,
so that the database is in an inconsistent state, consistency is
restored according to business rules rather than computer science
concepts. What are the appropriate business rules for merging
versions of a purchase transaction, or a return authorization? Why?
The business resolution rules must take into account what a customer
expects. In the case of duplicate purchase transactions, the most
likely problem is that a user changed his or her mind. So the
appropriate resolution is to delete the earlier one and act on the
later one.
In the case of a return authorization, the appropriate resolution
is to merge if possible. It is quite possible that two different
return authorization requests were made for the same order.
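The two rules above might be sketched in Python as follows. The record fields ("time", "order", "items") and the function names are hypothetical, chosen only to make the rules concrete. "Merge if possible" is modeled by refusing to merge returns that belong to different orders.

```python
def resolve_purchases(a, b):
    """Duplicate purchases: drop the earlier version, act on the later one."""
    return a if a["time"] >= b["time"] else b

def resolve_returns(a, b):
    """Return authorizations: merge the two requests for the same order."""
    if a["order"] != b["order"]:
        # Merging is not possible here; the requests are unrelated.
        raise ValueError("cannot merge returns for different orders")
    return {"order": a["order"], "items": a["items"] | b["items"]}

# Made-up records, for illustration only.
p_old = {"time": 1, "items": {"lamp"}}
p_new = {"time": 2, "items": {"lamp", "mug"}}
print(resolve_purchases(p_old, p_new))

r1 = {"order": 17, "items": {"lamp"}}
r2 = {"order": 17, "items": {"mug"}}
print(resolve_returns(r1, r2))
```

The asymmetry is deliberate: for a purchase, the later version is taken as the customer's final intent, while for a return every requested item is assumed to be wanted, so nothing is dropped.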