Comp150CPA: Clouds and Power-Aware Computing
Classroom Exercise 17
XML, XSchemas, XPATH
Spring 2011
group member 1: ____________________________ login: ______________
group member 2: ____________________________ login: ______________
group member 3: ____________________________ login: ______________
group member 4: ____________________________ login: ______________
group member 5: ____________________________ login: ______________
In class we have described XML, XSchemas, and XPATH.
Let's explore these in a bit more detail.
- Consider the code:
<pets>
<cat name=Fred><food amount='1' brand='Friskies'>Tuna</food></cat>
<dog name=George><food amount='1' brand='Purina'>Chow</food>
</pets>
This is not legal XML! Why? Correct it "in place" so that it
becomes well-formed XML.
Answer: attribute values must be quoted, and every opening tag needs a
matching closing tag (here, </dog> is missing), e.g.:
<pets>
<cat name="Fred"><food amount='1' brand='Friskies'>Tuna</food></cat>
<dog name="George"><food amount='1' brand='Purina'>Chow</food> </dog>
</pets>
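A quick way to confirm both problems is to hand each version to an XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the names BROKEN, FIXED, and is_well_formed are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# The original snippet: unquoted attribute values and a missing </dog>.
BROKEN = """<pets>
<cat name=Fred><food amount='1' brand='Friskies'>Tuna</food></cat>
<dog name=George><food amount='1' brand='Purina'>Chow</food>
</pets>"""

# The corrected snippet from the answer.
FIXED = """<pets>
<cat name="Fred"><food amount='1' brand='Friskies'>Tuna</food></cat>
<dog name="George"><food amount='1' brand='Purina'>Chow</food> </dog>
</pets>"""


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(BROKEN))  # False
print(is_well_formed(FIXED))   # True
```

Note that this only checks well-formedness (quoting, nesting, closing tags). It says nothing about whether the document matches a schema.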
- What are the outputs of the following XPATHs when applied to the (corrected) XML above (as nodesets)?
  - dog//food
  - food[@brand='Purina']
  - dog/food[@amount > 1]
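Answer (taking <pets> as the context node unless noted otherwise):
  - <food amount='1' brand='Purina'>Chow</food>
  - <food amount='1' brand='Purina'>Chow</food>
    when the context node is <dog>, or when the path is written as
    //food[@brand='Purina']. The path as given is relative and looks only
    at direct children, so evaluated at <pets> it returns an empty nodeset.
  - Empty nodeset, because the only amount under <dog> is 1, which is
    not greater than 1.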
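The results of relative paths depend on the node they start from, which is easy to see by running them. The sketch below uses Python's standard xml.etree.ElementTree, which supports only a subset of XPath. It handles the first two paths directly, but it has no numeric comparison in predicates, so the third path is approximated with a Python filter. The variable names are made up for this illustration.

```python
import xml.etree.ElementTree as ET

PETS = """<pets>
<cat name="Fred"><food amount='1' brand='Friskies'>Tuna</food></cat>
<dog name="George"><food amount='1' brand='Purina'>Chow</food> </dog>
</pets>"""

pets = ET.fromstring(PETS)   # context node <pets>
dog = pets.find("dog")       # context node <dog>

# dog//food, evaluated at <pets>
first = pets.findall("dog//food")

# food[@brand='Purina'], evaluated at <pets> and then at <dog>
second_from_pets = pets.findall("food[@brand='Purina']")
second_from_dog = dog.findall("food[@brand='Purina']")

# dog/food[@amount > 1]: ElementTree cannot evaluate "@amount > 1",
# so the numeric test is done in Python instead (an approximation).
third = [f for f in pets.findall("dog/food") if float(f.get("amount")) > 1]

print([f.text for f in first])             # ['Chow']
print([f.text for f in second_from_pets])  # []
print([f.text for f in second_from_dog])   # ['Chow']
print(third)                               # []
```

A full XPath 1.0 engine would evaluate the third path directly. The Python filter is only a stand-in for the predicate.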
- Consider the partial XML schema:
<xs:element name="food">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:string">
<xs:attribute name="amount" type="xs:decimal"/>
<xs:attribute name="brand" type="xs:string"/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
Does this describe the (corrected) version of the "food" elements
in the above? Why or why not?
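Answer: Yes. The schema declares "food" as an element with string
content and two optional attributes, a numeric "amount" and a string
"brand", which is exactly the shape of the corrected "food" elements.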
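Python's standard library has no XML Schema validator, so the sketch below is not a real validation. It hand-checks the same constraints the schema expresses: text-only content, no attributes other than amount and brand, and a numeric amount. The function name food_matches and the ALLOWED set are made up for this illustration, and the numeric check is an approximation of a schema numeric type.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal, InvalidOperation

PETS = """<pets>
<cat name="Fred"><food amount='1' brand='Friskies'>Tuna</food></cat>
<dog name="George"><food amount='1' brand='Purina'>Chow</food> </dog>
</pets>"""

ALLOWED = {"amount", "brand"}  # attributes the schema declares


def food_matches(food):
    """Approximate the schema's constraints on a single <food> element."""
    if food.tag != "food":
        return False
    if len(food) > 0:               # simple content: no child elements
        return False
    if set(food.attrib) - ALLOWED:  # no undeclared attributes
        return False
    amount = food.get("amount")
    if amount is not None:
        try:
            Decimal(amount)         # looser than a schema numeric type
        except InvalidOperation:
            return False
    return True


pets = ET.fromstring(PETS)
print(all(food_matches(f) for f in pets.iter("food")))                    # True
print(food_matches(ET.fromstring("<food amount='lots'>Tuna</food>")))     # False
```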
- One hidden "point" of today's lecture is that in order for an
XPATH to make sense, the XML it is applied to has to be valid against
some known schema. Give an XML element and an XPATH that searches for
something reasonable within the XML, and then give an example of a
different XML element for which the XPATH does not function as expected.
Answer:
Consider, e.g., an attribute that can appear in two places with two
different meanings. Suppose an appropriate instance is:
<pets>
<dog name='Fred' foods='2'/>
</pets>
and, e.g., an invalid entry:
<pets foods='3'>
<dog name='Fred'/>
</pets>
And consider what the XPATH
sum(//@foods) > 2
might mean, where //@foods selects every "foods" attribute in the
document.
In the intended case, it tests the sum of the food counts of the
individual pets, but in the strange case it picks up an attribute
on <pets> that has a different meaning. There are infinitely many
examples like this.
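The difference can be made concrete by totalling every "foods" attribute in each document, which is one reading of the XPATH above. This is a minimal sketch using Python's standard xml.etree.ElementTree, which has no sum() function and cannot select attributes, so the sum is done in Python. The names INTENDED, STRANGE, and total_foods are made up for this illustration.

```python
import xml.etree.ElementTree as ET

INTENDED = "<pets><dog name='Fred' foods='2'/></pets>"
STRANGE = "<pets foods='3'><dog name='Fred'/></pets>"


def total_foods(text):
    """Sum every foods attribute found anywhere in the document."""
    root = ET.fromstring(text)
    return sum(
        float(e.get("foods"))
        for e in root.iter()            # root.iter() includes the root itself
        if e.get("foods") is not None
    )


print(total_foods(INTENDED), total_foods(INTENDED) > 2)  # 2.0 False
print(total_foods(STRANGE), total_foods(STRANGE) > 2)    # 3.0 True
```

Both documents are well-formed and both yield a number, so nothing fails visibly. Only a schema that says where "foods" may appear would reveal that the second result is meaningless.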
- (Advanced) Note that string facets are all defined in terms of
regular expressions. Why is this computationally a good
idea? What can't you do in a facet because of this? Hint: consider
the language hierarchy in basic computation theory.
Answer: By Kleene's theorem, every regular expression can be
recognized by a deterministic finite automaton (DFA). A DFA scans its
input once, so matching takes time linear in the length of the string
and only constant working memory; the price is that the DFA itself can
be exponentially larger than the regular expression it was built from.
This means that matching regular patterns is very fast, though building
the matcher is not always space-efficient. It also means that you cannot
match arbitrarily nested parenthetical expressions (e.g., "((X (Y)))")
in facets, because these require a pushdown automaton and a
context-free (rather than a regular) grammar.
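The limitation can be illustrated with a small sketch in Python. The pattern CODE stands for a hypothetical facet-style pattern. FLAT_PARENS is a regular expression that accepts parentheses nested at most one level deep. The function balanced uses a depth counter, playing the role of the pushdown automaton's stack, to handle any depth. All three names are made up for this illustration, and Python's re module is used only as a convenient regular-expression engine.

```python
import re

# Hypothetical facet-style pattern: two capital letters, then three digits.
CODE = re.compile(r"[A-Z]{2}[0-9]{3}")

# A regular expression can be written for any FIXED nesting depth.
# This one accepts parentheses nested at most one level deep.
FLAT_PARENS = re.compile(r"[^()]*(\([^()]*\)[^()]*)*")


def balanced(text):
    """Check balanced parentheses of any depth using a counter (a stack)."""
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:       # a ")" with no matching "("
                return False
    return depth == 0


print(CODE.fullmatch("AB123") is not None)             # True
print(FLAT_PARENS.fullmatch("(X) (Y)") is not None)    # True
print(FLAT_PARENS.fullmatch("((X (Y)))") is not None)  # False
print(balanced("((X (Y)))"))                           # True
print(balanced("((X (Y))"))                            # False
```

The counter in balanced can grow without bound, which is exactly the unbounded memory that a finite automaton does not have.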