TopCoder problem "SyllableSorting" used in SRM 374 (Division I Level One , Division II Level Two)



Problem Statement

    

Syllable sorting is a method of sorting words based on their syllabic decompositions. The first step is to decompose each word into syllables. A syllable is defined as a maximal non-empty substring of consonants followed by a maximal non-empty substring of vowels. The only vowels are 'a', 'e', 'i', 'o' and 'u'. All other letters are considered consonants. All words will start with a consonant and end with a vowel.



To compare two words syllabically, first decompose them into sequences of syllables. For example, the words "zcdbadaerfe" and "foubsyudba" decompose as follows:

  • "zcdbadaerfe" = {"zcdba", "dae", "rfe"}
  • "foubsyudba" = {"fou", "bsyu", "dba"}

Then, sort each sequence of syllables alphabetically. In the above example, the sequences become:

  • {"dae", "rfe", "zcdba"}
  • {"bsyu", "dba", "fou"}

Then, compare these sorted sequences lexicographically. A sequence S1 comes before a sequence S2 lexicographically if S1 has an alphabetically earlier element at the first index at which they differ. In the above example, the second sequence comes earlier lexicographically because "bsyu" comes before "dae" alphabetically.



If two sorted sequences are equal, then compare their corresponding unsorted sequences instead. For example, the words "daba" and "bada" decompose into the same sorted sequence {"ba", "da"}. Compare the unsorted sequences {"da", "ba"} and {"ba", "da"} to determine that "bada" comes before "daba".



You are given a String[] words. Sort the words using the method above and return the resulting String[].



 

Definition

    
Class:SyllableSorting
Method:sortWords
Parameters:String[]
Returns:String[]
Method signature:String[] sortWords(String[] words)
(be sure your method is public)
    
 

Constraints

-words will contain between 1 and 50 elements, inclusive.
-Each element of words will contain between 2 and 50 characters, inclusive.
-Each element of words will contain only lowercase letters ('a'-'z').
-The first character of each element of words will be a consonant.
-The last character of each element of words will be a vowel.
 

Examples

0)
    
{"xiaoxiao", "yamagawa", "gawayama"}
Returns: {"gawayama", "yamagawa", "xiaoxiao" }

After decomposing the words into sequences of syllables, we get the following unsorted and sorted sequences of syllables:

   WORD    |    UNSORTED SEQUENCES    |     SORTED SEQUENCES
-----------+--------------------------+--------------------------
"xiaoxiao" | {"xiao", "xiao"}         | {"xiao", "xiao"}
"yamagawa" | {"ya", "ma", "ga", "wa"} | {"ga", "ma", "ya", "wa"}
"gawayama" | {"ga", "wa", "ya", "ma"} | {"ga", "ma", "ya", "wa"}

To compare "xiaoxiao" with the other two words, we use the sorted sequences of syllables. However, to compare "yamagawa" with "gawayama" we have to use the unsorted sequences because the sorted ones are equal.

1)
    
{"bcedba", "dbabce", "zyuxxo"}
Returns: {"bcedba", "dbabce", "zyuxxo" }
2)
    
{"hgnibqqaxeiuteuuvksi", "jxbuzui", "zrotyqeruiydozui",
 "ywuuzkto", "lmopbookoagyco", "vredfvavvexliu"}
Returns: 
{"hgnibqqaxeiuteuuvksi",
"vredfvavvexliu",
"lmopbookoagyco",
"jxbuzui",
"zrotyqeruiydozui",
"ywuuzkto" }
3)
    
{"crazgo", "cwsoygiokiuo", "yueoseeu", "tuadiojvugeoe",
 "naumxditui", "sgukkelyoi", "nrohjuasoia", "mgabmo"}
Returns: 
{"mgabmo",
"crazgo",
"cwsoygiokiuo",
"tuadiojvugeoe",
"nrohjuasoia",
"sgukkelyoi",
"naumxditui",
"yueoseeu" }
4)
    
{"wheewjuguoi", "coutcu", "hqivaa", "sgiibgwi", "ypaqpki",
 "bgyikouapae", "wqakeu", "skolfo", "pzesaa", "ypivhi"}
Returns: 
{"sgiibgwi",
"bgyikouapae",
"coutcu",
"wheewjuguoi",
"hqivaa",
"wqakeu",
"skolfo",
"pzesaa",
"ypaqpki",
"ypivhi" }

Problem url:

http://www.topcoder.com/stat?c=problem_statement&pm=8374

Problem stats url:

http://www.topcoder.com/tc?module=ProblemDetail&rd=10793&pm=8374

Writer:

jbernadas

Testers:

PabloGilberto , Olexiy , Andrew_Lazarev

Problem categories:

Sorting