Comp150CPA: Clouds and Power-Aware Computing
Classroom Exercise 1
The Google AppEngine Datastore
Spring 2011

group member 1: ____________________________ login: ______________

group member 2: ____________________________ login: ______________

group member 3: ____________________________ login: ______________

group member 4: ____________________________ login: ______________

group member 5: ____________________________ login: ______________

In class we have studied the Java Datastore in Google App Engine. Let's explore a few aspects of this new concept.

  1. Note that only data is marked as persistent in the persistence manager. Why isn't it necessary to mark functions as persistent or not persistent?
    Answer: Just as in serial Java programming, functions are immutable (and cannot be "stored" or "retrieved"). Or -- to put things more accurately -- a factory can put a function into the namespace once.
  2. Why is it possible to mark some data as not persistent? Give a use case for a private non-persistent data entity, and explain how it might be used.
    Answer: It often happens that something is needed that is expensive to compute, e.g., the sum of all numeric fields in all objects. Once this is computed, it can be stored in a temporary location inside a class and reused. But by nature, such values are useful only for a short time, so after some period they should be computed anew. Thus it is not particularly useful to store such information permanently.

    More generally, non-persistent data can cache anything that is expensive to compute, so that it only has to be computed once, e.g., the MD5 signature of a text string.

  3. The primary key of an object store is analogous to a reference, and allows building of persistent data structures whose elements reference other structures. But using this to build a linked list of objects in the normal way one uses in Java -- while it is possible -- is highly discouraged for Cloud applications. Why?
    Answer: A linked list is emulated by the cloud itself. When one posts data to the cloud, it is always returned in a list format. Querying the cloud is expensive, and traversing a linked list stored in the cloud costs one query per element, which is much slower than making one query that returns the whole list.
  4. One attribute of the Datastore is that it is not sequential, i.e., items are not guaranteed to be retrieved in the order in which they were created. How must you construct objects in order to handle sequential processing in creation order, e.g., for a guestbook?
    Answer: The so-called list can contain any data. To order the list, you must add some attribute to the data that orders it. This can be a sequence number for each element, or a timestamp. Then you can query the cloud for data in a particular order (more about that later).
  5. (Advanced): Why is the persistence manager implemented via a factory, rather than via a regular Java class?
    Answer: A factory builds the persistence manager for you, rather than having you instantiate a fixed class yourself. The object it returns incorporates whatever code the cloud provider writes to enable persistence. If the persistence manager were a regular class that you instantiated directly, you would have to include (or write) that code yourself.

    This code does a number of things. In a PersistenceCapable class, all accesses to Persistent members are instrumented, so that the persistence mechanism knows when something has changed.
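
As a rough illustration of the answer to question 2, the sketch below caches an MD5 signature in a field that is not meant to be stored. It is plain Java with no App Engine dependency. The class name Note and its methods are hypothetical, and the Java transient keyword is used here only as a stand-in for whatever marking the persistence layer uses for non-persistent fields.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical entity: only `text` is meant to be stored.
class Note {
    private final String text;          // data worth persisting
    private transient String md5Cache;  // derived value, recomputed on demand
    private transient int computations; // counts how often the digest ran

    Note(String text) { this.text = text; }

    // Returns the MD5 signature as hex, computing it only on first use.
    String md5() {
        if (md5Cache == null) {
            md5Cache = computeMd5(text);
            computations++;
        }
        return md5Cache;
    }

    int computations() { return computations; }

    private static String computeMd5(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(s.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}

class NoteDemo {
    public static void main(String[] args) {
        Note n = new Note("hello");
        System.out.println(n.md5()); // first call computes the digest
        System.out.println(n.md5()); // second call reuses the cached value
        System.out.println("digest computed " + n.computations() + " time(s)");
    }
}
```

If the object were saved and later reloaded, the cached field would simply start out empty and be recomputed on first use, which is why nothing is lost by leaving it out of the store.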
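
The cost argument in question 3 can be sketched with a toy model. FakeStore below is a hypothetical in-memory stand-in that only counts requests; it is not the Datastore API, and the real cost of a request is assumed rather than measured.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a remote store; it only counts requests.
class FakeStore {
    // Each stored item holds a value and the key of the next item (or null).
    static final class Node {
        final String value;
        final Long nextKey;
        Node(String value, Long nextKey) { this.value = value; this.nextKey = nextKey; }
    }

    private final Map<Long, Node> items = new HashMap<>();
    private int roundTrips = 0;

    void put(long key, String value, Long nextKey) { items.put(key, new Node(value, nextKey)); }

    // One request per call: fetch a single item by key.
    Node getByKey(long key) { roundTrips++; return items.get(key); }

    // One request total: fetch every item's value (in no particular order).
    List<String> queryAll() {
        roundTrips++;
        List<String> out = new ArrayList<>();
        for (Node n : items.values()) out.add(n.value);
        return out;
    }

    int roundTrips() { return roundTrips; }
}

class TraversalDemo {
    // Follows next-key references from headKey; costs one request per element.
    static List<String> walkLinkedList(FakeStore store, long headKey) {
        List<String> out = new ArrayList<>();
        Long key = headKey;
        while (key != null) {
            FakeStore.Node n = store.getByKey(key);
            out.add(n.value);
            key = n.nextKey;
        }
        return out;
    }

    static FakeStore sampleStore() {
        FakeStore s = new FakeStore();
        s.put(1L, "a", 2L);
        s.put(2L, "b", 3L);
        s.put(3L, "c", null);
        return s;
    }

    public static void main(String[] args) {
        FakeStore linked = sampleStore();
        walkLinkedList(linked, 1L);
        System.out.println("linked-list walk: " + linked.roundTrips() + " requests");

        FakeStore batch = sampleStore();
        batch.queryAll();
        System.out.println("single query: " + batch.roundTrips() + " request");
    }
}
```

In this model the walk costs one request per element while the single query costs one request regardless of length, which is the asymmetry the answer describes.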
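
For question 4, a minimal sketch of the ordering attribute might look like the following. The Greeting class and its createdMillis field are hypothetical names, and the list handed to inCreationOrder simulates a store that returns entries in arbitrary order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical guestbook entry carrying an explicit ordering attribute.
class Greeting {
    final long createdMillis; // timestamp assigned at creation
    final String message;

    Greeting(long createdMillis, String message) {
        this.createdMillis = createdMillis;
        this.message = message;
    }
}

class GuestbookDemo {
    // Returns the messages sorted by creation time, oldest first.
    static List<String> inCreationOrder(List<Greeting> retrieved) {
        List<Greeting> sorted = new ArrayList<>(retrieved);
        sorted.sort(Comparator.comparingLong((Greeting g) -> g.createdMillis));
        List<String> out = new ArrayList<>();
        for (Greeting g : sorted) out.add(g.message);
        return out;
    }

    public static void main(String[] args) {
        // Simulates a store returning entries in arbitrary order.
        List<Greeting> retrieved = Arrays.asList(
            new Greeting(3000L, "third"),
            new Greeting(1000L, "first"),
            new Greeting(2000L, "second"));
        System.out.println(inCreationOrder(retrieved));
    }
}
```

Here the sort happens in application code only to keep the sketch self-contained; as the answer notes, the ordering would normally be requested as part of the query. A sequence number would work the same way as the timestamp, provided something guarantees it is unique and increasing.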
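
The factory idea in question 5 can be illustrated generically. The sketch below does not reproduce the App Engine API; Store, InMemoryStore, and StoreFactory are hypothetical names chosen to show how application code can obtain an object without ever naming the class that implements it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface that application code is written against.
interface Store {
    void save(String key, String value);
    String load(String key);
}

// Hypothetical provider-supplied implementation; the application never names it.
class InMemoryStore implements Store {
    private final Map<String, String> data = new HashMap<>();

    public void save(String key, String value) { data.put(key, value); }

    public String load(String key) { return data.get(key); }
}

// Hypothetical factory: the only place that knows which implementation is used.
final class StoreFactory {
    private static final Store INSTANCE = new InMemoryStore();

    private StoreFactory() {}

    // Always hands back the same shared instance.
    static Store get() { return INSTANCE; }
}

class FactoryDemo {
    public static void main(String[] args) {
        Store store = StoreFactory.get();
        store.save("greeting", "hello");
        // A second lookup through the factory sees the same underlying data.
        System.out.println(StoreFactory.get().load("greeting"));
    }
}
```

Because callers depend only on the Store interface, the provider could swap InMemoryStore for a different implementation without any change to application code, which is the benefit the answer attributes to using a factory.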