Problem Statement |
|
Note: This problem statement includes images that may not appear
if you are using a plugin. For best results, use the Arena editor.
In order to attract users of other weblog management systems, bloggo
offers tools for content migration. Not only is bloggo capable of reading
some of its chief competitors' back-end formats, but it can analyze the
HTML that must by definition emerge at the front end of any weblog.
An HTML document consists of tags, which are angle-bracketed
strings that look like "<b>" or "</li>",
as well as plaintext, which is everything else. Each opening
tag, such as "<h1>", has a similar closing tag, in
this case "</h1>". A substring that begins with an opening
tag and ends with a similar closing tag is called a span. The type
of a span is decided by its first and last tags, so a span that begins
with "<h1>" and ends with "</h1>" is called an "h1" span. We
say that the substring found between the first and last tags of a span is
enclosed by the span. Each span of an HTML document encloses zero
or more plaintext characters interspersed with zero or more further spans.
Thus, the spans of an HTML document form a nested structure
that is described by a unique tag tree, each node of which is
labeled with the type of its corresponding span. The outermost span of
an HTML document is required to be of the "html" type, so the root of
a tag tree is always labeled "html". If a span encloses further spans,
then its corresponding node in the tag tree has children that describe,
in order, each of the immediately enclosed spans. A span that encloses
no further spans corresponds to a leaf of the tag tree. For example,
consider the following document.
<html>
<h1> Snapping Turtles </h1>
<ul>
<li>
<h2> Common Snapping Turtle (<i>Chelydra serpentina</i>) </h2>
<p> With its dark coloring, sinuous neck, and swift lunging
motion, Chelydra serpentina is an adept hunter of nearly all
creatures that live in or near water. A fully grown specimen,
weighing on average 20 pounds with a 12-inch-long carapace,
has no enemy but man. The long lifespan of the adult, reaching
30 years or more, compensates for the low hatchling survival
rate. </p>
<p> The snapping turtle rarely basks, preferring
to lurk at the silty bottom or among underwater
vegetation. Although reclusive in its element, when on
land it ferociously confronts larger beasts that approach
it, for it cannot withdraw into its undersized shell. It
is a solitary creature, declining to fraternize within
its species. </p>
<p> Snapping turtle populations are jeopardized by
trapping, habitat loss, and by automobile traffic crossing
the overland routes of nesting females. </p>
</li>
<li>
<h2> Alligator Snapping Turtle (<i>Macroclemys temminckii</i>) </h2>
<p> Like the common snapping turtle, the alligator snapping
turtle has a massive head and a generally saurian aspect.
Whereas the common snapping turtle's talent for hibernation
lets it range as far north as Canada, the alligator snapping
turtle is restricted to the warm waters flowing into the Gulf
of Mexico. </p>
<p> With an average shell diameter of two feet and
a weight of 150 pounds, this is the largest freshwater
turtle in the world. Macroclemys temminckii's short
neck requires that it lie in wait for prey, which it
dispatches with a sudden blow of its powerful jaws.
A unique feature of the alligator snapping turtle is a
tubular pink appendage, growing from the dark interior
of its mouth, which it wriggles in wormlike fashion to
lure fish to their doom. </p>
</li>
</ul>
</html>
This document has the following tag tree.
For the purpose of content migration, it is useful to know how
the tag trees of a pair of HTML documents relate to each other. One
interesting relationship is that of the intree, which is defined
recursively. First of all, every tree is an intree of itself. In addition,
if A is an intree of B and we remove any leaf from A
to obtain a smaller tree A', then A' is also an intree of
B. If A is an intree of B, then B is called
an outtree of A. If A is both an intree and an
outtree of B, then A and B are equivalent. If
A is neither an intree nor an outtree of B, then A
and B are incompatible.
Given a pair of String[]s, docA and docB,
each containing an HTML document, you are to determine how the tag tree
of docA relates to the tag tree of docB. The elements of
each String[] are concatenated to make up the respective
document. The only tags permitted in these documents are the opening tags
"<html>", "<h1>", "<h2>",
"<h3>", "<ul>", "<ol>",
"<li>", "<p>", "<b>",
"<i>", "<u>", and the similar closing
tags. Tag names are case-sensitive and may be broken over several lines,
but must not otherwise be altered. The outermost tags of a document are
of "html" type and may be padded with spaces on either side. Plaintext
may only contain alphanumeric characters, space characters, and the
punctuation symbols ',', ';', '.', '!', '?', '-', '(', and ')'.
If the two documents are equivalent or incompatible, return the String
"equivalent" or "incompatible", respectively. If
docA is an intree of docB, return "intree X",
where X is the number of nodes in docB that are not present
in docA . If docA is an outtree of docB, return
"outtree X", where X is the number of nodes in docA
that are not present in docB.
|
|
Definition |
| Class: | bloggoDocStructure | Method: | compare | Parameters: | String[], String[] | Returns: | String | Method signature: | String compare(String[] docA, String[] docB) | (be sure your method is public) |
|
|
|
|
Constraints |
- | docA and docB each contain between 1 and 50 elements, inclusive |
- | each element of docA and docB is between 1 and 50 characters long, inclusive |
- | the only characters allowed in docA and docB are '<', '>', '/', 'a' to 'z', 'A' to 'Z', '0' to '9', the space character, ',', ';', '.', '!', '?', '-', '(', and ')' |
- | if the elements of docA or of docB are concatenated into a single string, the result is an HTML document as defined above |
|
Examples |
0) | |
| {"<html> <h1>Snapping Turtles</h1> <ul> <li> <h2>",
"Common Snapping Turtle (<i>Chelydra serpentina</i>",
") </h2> <p> With its dark coloring, sinuous neck,",
"and swift lunging motion, Chelydra serpentina is a",
"n adept hunter. </p><p> The snapping turtle rarely",
"basks. </p><p> Snapping turtle populations are jeo",
"pardized by automobile traffic. </",
"p> </li> <li> <h2> Alligator Snapping Turtle (",
"<i>Macroclemys temminckii</i>) </h2> <p> Like the",
" common snapping turtle, the alligator snapping ",
"turtle has a massive head.</p><p>This is the lar",
"gest freshwater turtle. A tubular pink append",
"age grows from its mouth.</p> <",
"/li> </ul> </html>"} | {" <html> turtles <h1> snapping </h1> <ul> <li> <h2",
"> common <i> chelydra serpentina </i> </h2> <p>",
"hunter </p> (adept?) <p> rarely basks </p> (hmm)",
"<p> jeopardized by traffic </p></li>",
"<li> often confused with... <h2> alligator snapp",
"ing turtle <i>macroclemys temminckii</i> </h2>",
"<p> massive head </p> big! <p>",
"largest freshwater turtle. pink wormlike thing <",
"/p></li></ul></html> "} |
| Returns: "equivalent" | These two documents have the same structure as the example shown above. |
|
|
1) | |
| {" <html> turtles <h1> snapping </h1> <ul> <li> <h2",
"> common <i> chelydra serpentina </i> </h2> <p>",
"hunter </p> (adept?) <p> rarely basks </p> (hmm)",
"<p> jeopardized by traffic </p></li>",
"<li> often confused with... <h2> alligator snapp",
"ing turtle <i>macroclemys temminckii</i> </h2>",
"<p> massive head </p> big! <p>",
"largest freshwater turtle. pink wormlike thing <",
"/p></li></ul></html> "} | {"<html><h1></h1><ul><li><h2><i></i></h2><p></p><p>",
"</p><p></p></li><li><h2><i></i></h2><p></p><p></p",
"></li></ul></html>"} |
| Returns: "equivalent" | The docB here is a minimal document with the same structure. |
|
|
2) | |
| {"<html><h1></h1><ul><li><h2><i></i></h2><p></p><p>",
"</p><p></p></li><li><h2><i></i></h2><p></p><p></p",
"></li></ul></html>"} | {" <html> snapping turtles <ul> <li> common ",
"snapping turtle, chelydra serpentina <p>",
"hunter </p> <p> rarely basks </p> ",
"<p> jeopardized by traffic </p></li>",
"<li> often confused with... <h2> alligator snapp",
"ing turtle <i>macroclemys temminckii</i> </h2>",
"<p> massive head; largest freshwater turtle;",
"pink wormlike appendage lures fish </p>",
"</li></ul></html> "} |
| Returns: "outtree 4" | The tag tree for docB is the following.
It is evident from comparison with the earlier illustration that docA has four nodes not present in docB. |
|
|
3) | |
| {"<html><h1></h1><ul><li><h2><i></i></h2><p></p><p>",
"</p><p></p></li><li><h2><i></i></h2><p></p><p></p",
"></li></ul></html>"} | {" <html> turtles <h1> snapping </h1> <ul> <li> <h2",
"> common <i> chelydra serpentina </i> </h2> <p>",
"hunter </p> <p> rarely basks <h3>",
"<i>almost</i> <b>never</b> </h3> </p>",
"<p> jeopardized by traffic </p></li>",
"<li> often confused with... <h2> alligator snapp",
"ing turtle <i>macroclemys temminckii</i> </h2>",
"<p> massive head </p> big! <p>",
"largest freshwater turtle. </p> <p> <b>pink</b>",
" <b>wormlike</b> lure in mouth </p> <p> imposing",
"sight on land or water </p> </li></ul></html> "} |
| Returns: "intree 7" | We now have the following tag tree for docB.
Here, docB has seven nodes not found in docA.
|
|
|
4) | |
| {"<html><h1></h1><ul><li><h2><i></i></h2><p></p><p>",
"</p><p></p></li><li><h2><i></i></h2><p></p><p></p",
"></li></ul></html>"} | {"<html><ul><li><h2><i></i></h2><p></p><p>",
"</p><p></p></li><li><h2><i></i></h2><p></p><p></p",
"></li></ul><h1></h1></html>"} |
| Returns: "incompatible" | |
|
5) | |
| {"<html></html>"} | {"<html><html><html></html><html></html><ul>",
"</ul><ol></ol></html></html>"} |
| Returns: "intree 5" | |
|
6) | |
| {"<html><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><b></b></html>"} | {"<html><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p><p></p>",
"<p></p><p></p><p></p><p></p><p></p><p></p></html>"} |
| Returns: "incompatible" | |
|