TopCoder problem "ProbabilisticTranslator" used in SRM 329 (Division II Level Three)



Problem Statement

    

You want to translate a text from a source language to a target language. The text is a sequence of words which you translate one by one using a dictionary, replacing each word in the source text with one of its possible translations. In order to find the most appropriate translation, you have explored a few books in the target language, from which you have calculated expected frequency values for all ordered pairs of consecutive words. To estimate the correctness of your translation, you calculate the sum of the expected frequencies of all the consecutive pairs of words in the translation. This sum is called the fidelity of the translation.

The source text is given as a String[], each element of which contains a non-empty single-space separated list of words. The entire String[] should be considered a single text, so the last word of an element is considered to be adjacent to the first word of the next element.

The dictionary is given as a String[], each element of which is formatted as "word : tr1 tr2 ... trn", where word is a word in the source language, and "tr1 tr2 ... trn" is a single-space separated list of n possible translations of that word in the target language (dots and quotes for clarity only).

The expected frequencies are given as a String[] containing elements of the form "WORD1 WORD2 FREQUENCY" (quotes for clarity only), where FREQUENCY is the expected frequency of WORD1 followed immediately by WORD2. If the frequency of some pair of words is not given in the input, it is assumed to be zero.

Return the maximum possible fidelity of the translation.

 

Definition

    
Class:ProbabilisticTranslator
Method:maximumFidelity
Parameters:String[], String[], String[]
Returns:int
Method signature:int maximumFidelity(String[] text, String[] dictionary, String[] frequencies)
(be sure your method is public)
    
 

Constraints

-text will contain between 1 and 50 elements, inclusive.
-Each element of text will contain between 1 and 50 characters, inclusive.
-Each element of text will contain a list of words separated by single spaces, with no leading or trailing spaces.
-dictionary will contain between 1 and 50 elements, inclusive.
-Each element of dictionary will contain between 1 and 50 characters, inclusive.
-Each element of dictionary will be formatted as "word : tr1 tr2 ... trn" (dots and quotes for clarity only).
-Each word in the source text will have exactly one entry in dictionary.
-frequencies will contain between 1 and 50 elements, inclusive.
-Each element of frequencies will be in the form "WORD1 WORD2 FREQUENCY" (quotes for clarity only).
-In each element of frequencies, FREQUENCY will be an integer between 1 and 999, inclusive, with no leading zeroes.
-No two elements of frequencies will have the same "WORD1 WORD2".
-Each word in text, dictionary, and frequencies will contain between 1 and 20 lowercase letters ('a'-'z'), inclusive.
 

Examples

0)
    
{"a b c"}
{"a : x y", "b : y z", "c : x z"}
{"y z 20", "x y 10", "z x 5"}
Returns: 30
The translation with the maximum possible fidelity is {"x y z"}.
1)
    
{"a b c"}
{"a : x y", "b : p q", "c : x y"}
{"x p 100", "x q 10", "q x 10"}
Returns: 100
Both {"x p x"} and {"x p y"} have a fidelity of 100.
2)
    
{"a b", "c"}
{"a : x y", "b : p q", "c : x y"}
{"x p 100", "x q 97", "q x 97"}
Returns: 194
The translation with the maximum possible fidelity is {"x q x"}.
3)
    
{"a", "bb c"}
{"a : a a", "bb : b", "c : c", "ccc : a"}
{"y z 2", "x y 10", "z x 5"}
Returns: 0

Problem url:

http://www.topcoder.com/stat?c=problem_statement&pm=6010

Problem stats url:

http://www.topcoder.com/tc?module=ProblemDetail&rd=10009&pm=6010

Writer:

dmytro

Testers:

PabloGilberto , brett1479 , Olexiy , lovro

Problem categories:

Dynamic Programming, String Parsing