TopCoder problem "RecoverWords" used in TCCC '04 Finals (Division I Level Two)



Problem Statement

    We want to write a program to recover some missing letters from a document. You will be given a sentence, some of whose letters are missing and are represented by '?' (one question mark stands for exactly one missing letter, which cannot be a space). Your goal is to recover the values of all the question marks, given some knowledge about the structure of English.



For the purposes of this problem, we will define a sentence using the following (extremely simplified) grammar:
<sentence> ::= <noun phrase> <predicate> [<preposition>] <noun phrase>
<noun phrase> ::= <article> {<adjective>} <noun>
<predicate> ::= <verb> {<adverb>} 
Elements in curly braces ('{' and '}') may be present zero or more times, and those in square brackets ('[' and ']') are optional. There are only two <article>'s in English: "a" and "the" (we'll be ignoring "an" in this problem). However, there are thousands of <adjective>'s, <noun>'s, <adverb>'s, <preposition>'s, and <verb>'s. Thus, a small subset of these will be given to you as five String[]'s. Each element of each of these String[]'s represents one word and is a sequence of one or more lowercase letters.



Your task is to change all of the question marks in sentence into letters such that the sentence matches the given grammar and return the result. If this is not possible (including cases where there are no question marks, but the given sentence doesn't match the grammar), or if there are multiple ways to change some question marks that are consistent with the grammar, return "".
 

Definition

    
Class:RecoverWords
Method:recover
Parameters:String, String[], String[], String[], String[], String[]
Returns:String
Method signature:String recover(String sentence, String[] nouns, String[] adjectives, String[] verbs, String[] adverbs, String[] prepositions)
(be sure your method is public)
    
 

Notes

-Some words may be serve as multiple parts of speech. This occurs in English with words like "water" which can be both a noun and a verb.
-We are looking to replace the question marks with letters, not understand the semantics of the sentence. Hence, there may be multiple ways to interpret a sentence within the context of the above grammar, which all lead to the question marks being replaced in the same way. See examples 0 and 5.
 

Constraints

-sentence will contain between 1 and 50 characters, inclusive.
-Each character of sentence will be a lowercase letter ('a'-'z'), a question mark ('?') or a space (' ').
-There will be no double spaces, leading spaces, or trailing spaces in sentence.
-Each of the five input String[]'s will contain between 0 and 50 elements, inclusive.
-Each element of each of the five input String[]'s will contain between 1 and 50 lowercase letters ('a'-'z'), inclusive.
-All complete words without question marks in sentence must be in one of the five input String[]'s (except "a" and "the", which may or may not in the input String[]'s).
 

Examples

0)
    
"? ????k b???? f?x ????? ??? ???? dog"
{"fox","fox","dog","dog"}
{"quick", "brown", "lazy"}
{"jumps"}
{}
{}
Returns: "a quick brown fox jumps the lazy dog"
The resulting sentence has the following parts of speech:

<article> <adjective> <adjective> <noun> <verb> <article> <adjective> <noun>
1)
    
"? b?? r?? e????? ???? ??? ?????"
{"rat","boy","field"}
{"big"}
{"ran","eating"}
{"easily"}
{"over"}
Returns: ""
There are two valid interpretations for this sentence:

"a big rat eating over the field"

"a boy ran easily over the field"

2)
    
"? b?? r?? e????? ???? ??? ?????"
{"boy","field"}
{"big"}
{"ran","eating"}
{"easily"}
{"over"}
Returns: "a boy ran easily over the field"
Without the rat, there is only one valid interpretation.
3)
    
"the farmer is ?? sally"
{"farmer","sally"}
{}
{"is"}
{}
{}
Returns: ""
4)
    
"the gardener will ????? with a ????? can" 
{"gardener","water","can"}
{"water","the","a"}
{"will","water"}
{"water"}
{"with"}
Returns: "the gardener will water with a water can"
5)
    
"??? ??? ??? ??? ??? ??? ??? ??? ??? ??? ???"
{"the"}
{"the"}
{"the"}
{"the"}
{"the"}
Returns: "the the the the the the the the the the the"
There are multiple ways to interpret which part of speech each word is. However, every interpretation results in the same sequence of characters.

Problem url:

http://www.topcoder.com/stat?c=problem_statement&pm=2367

Problem stats url:

http://www.topcoder.com/tc?module=ProblemDetail&rd=5080&pm=2367

Writer:

lars2520

Testers:

schveiguy , vorthys

Problem categories:

String Parsing