TopCoder problem "Decipher" used in SRM 253 (Division I Level One , Division II Level Two)



Problem Statement

    

A substitution cipher is a simple method of encoding text in which all occurrences of each letter are replaced by another. For example, consider the following substitution table:

    original: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    encoded:  P X M S A C Z E V G I J D K N F O R L H T U W B Y Q

The message "REGISTER NOW FOR THE TOPCODER OPEN" would be encoded as "RAZVLHAR KNW CNR HEA HNFMNSAR NFAK" by finding each letter in the top row and replacing it with the letter directly beneath.

Substitution ciphers are relatively easy to break. One method takes advantage of the fact that some letters are used more frequently than others. For example, in English text, the letter 'E' occurs more often than any other. It is reasonable to guess that the most common letter in the encoded text represents 'E'.

You will be given the encoded text as a String[] encoded. You will also be given a String frequencyOrder containing all the letters that occur in the unencoded text sorted in decreasing order by frequency. You are to attempt to decipher the encoded text by replacing the most frequent letter in encoded with frequencyOrder[0], the second-most frequent letter in encoded with frequencyOrder[1], and so on. If some letters occur equally frequently in encoded, replace the letter that occurs first alphabetically with the lower-index character in frequencyOrder.

Your method should return a String[] with exactly the same number of elements, and characters in each element, as encoded. Replace only letters, preserving spaces.

 

Definition

    
Class:Decipher
Method:decipher
Parameters:String[], String
Returns:String[]
Method signature:String[] decipher(String[] encoded, String frequencyOrder)
(be sure your method is public)
    
 

Constraints

-encoded will contain between 1 and 50 elements, inclusive.
-Each element of encoded will be between 0 and 50 characters long, inclusive.
-Each element of encoded will contain only uppercase letters ('A'-'Z') and spaces.
-encoded will contain at least one letter.
-frequencyOrder will contain between 1 and 26 characters, inclusive.
-frequencyOrder will contain only uppercase letters ('A'-'Z').
-frequencyOrder will not have any duplicate letters.
-encoded and frequencyOrder will contain the same number of distinct letters.
 

Examples

0)
    
{ "ABBBCC" }
"XYZ"
Returns: {"ZXXXYY" }
The most common letter in the encoded text is 'B', so it gets replaced by 'X'. The second-most common letter is 'C', so it gets replaced by 'Y'. Finally, the lone 'A' gets replaced by 'Z'.
1)
    
{ "RZLW" }
"CEFD"
Returns: {"EDCF" }
All letters occur equally frequently. So, the first letter alphabetically, 'L', gets replaced by the first letter in frequencyOrder, 'C'.
2)
    
{ "XX YYY Z YYY XX",
  "WWWWWZWWWWW" }
"ETMQ"
Returns: {"MM TTT Q TTT MM", "EEEEEQEEEEE" }
3)
    
{ " X ",
  "",
  " ",
  "  ",
  "   " }
"B"
Returns: {" B ", "", " ", "  ", "   " }
4)
    
{ "RAZVLHAR KNW CNR",
  "HEA HNFMNSAR NFAK" }
"EORTPNFHSCDIWG"
Returns: {"REGISTER NOW FOR", "THE TOPCODER OPEN" }

Problem url:

http://www.topcoder.com/stat?c=problem_statement&pm=4674

Problem stats url:

http://www.topcoder.com/tc?module=ProblemDetail&rd=7227&pm=4674

Writer:

legakis

Testers:

PabloGilberto , lbackstrom , brett1479 , radeye

Problem categories:

Sorting, String Manipulation