TopCoder problem "CompressionText" used in SRM 258 (Division I Level Two)



Problem Statement

    

You are part of a team that is working on a piece of software to handle text compression. Your team has designed the compression algorithm as follows:

To compress a given string of text, two strings, each 3 characters in length, will be chosen as compression keys. The strings may contain any combination of capital letters and/or spaces. Then, a compressed string will be generated from the original such that replacing "*1" in the compressed string with the first string and replacing "*2" with the second string will recreate the original text.

For example, if the two compression keys are "FOO" and "BAR", then the compressed string "*2X*1" would decompress to "BARXFOO", and "*1*1 *2" would decompress to "FOOFOO BAR".

You have been tasked with writing a proof of concept implementation to test the effectiveness of this compression scheme. You will be given a String original. Your goal is to optimally select the two compression keys, and generate the compressed text to be as short as possible. You are to return the length of the shortest possible text.

 

Definition

    
Class:CompressionText
Method:shortestLength
Parameters:String
Returns:int
Method signature:int shortestLength(String original)
(be sure your method is public)
    
 

Constraints

-original will contain between 1 and 50 characters, inclusive.
-Each character of original will be an uppercase letter ('A'-'Z') or a space (' ').
 

Examples

0)
    
"BARXFOO"
Returns: 5
The first example from the problem statement.
1)
    
"FOOFOO BAR"
Returns: 7
The second example from the problem statement.
2)
    
"ABCDEFGHIJKLMNOPQRSTUVWXYABCDEFGHIJKLMNOPQRSTUVWXY"
Returns: 46
It's a long string, but no 3-character substring appears more than twice.
3)
    
"QWERTYUIOPASDFGHJKLZXCVBNM"
Returns: 24
Here, no substring repeats itself at all. The best we can do is to pick any two substrings to replace.
4)
    
"BBAAAABBBBAAABABABBBABABAAABABABAAABBAABABBBABAAAB"
Returns: 40

Problem url:

http://www.topcoder.com/stat?c=problem_statement&pm=4703

Problem stats url:

http://www.topcoder.com/tc?module=ProblemDetail&rd=7993&pm=4703

Writer:

timmac

Testers:

PabloGilberto , lbackstrom , brett1479 , Olexiy

Problem categories:

Brute Force, Encryption/Compression, String Manipulation