TopCoder problem "PostfixRLE" used in SRM 327 (Division I Level Three)



Problem Statement

    

Postfix notation (or reverse Polish notation) is a method for writing down arithmetic expressions without brackets, but still maintaining the order of operations. The postfix notation for an expression consisting of a single variable or number consists only of that variable or number. The postfix notation for an expression "A#B" (here A and B are other expressions and '#' is the operator evaluated last in our expression) is obtained by concatenating the postfix notation for A, the postfix notation for B, and '#'.

For example, the postfix notation for "((a+f)*b)*(c*d)" is "af+b*cd**". Another way of looking at the postfix notation is that it is a program for a stack calculator. Every variable or number means "push that variable or number on the stack". Every operator means "pop the two top items from the stack, perform the operation on them, and push the result back on the stack". For example, when evaluating "af+b*cd**", the stack changes as follows: {a}, {a,f}, {a+f}, {a+f,b}, {(a+f)*b}, {(a+f)*b,c}, {(a+f)*b,c,d}, {(a+f)*b,c*d}, {((a+f)*b)*(c*d)}.

You are given an expression in postfix notation, containing only variables and operators. All the operators are binary (take two operands), associative, and commutative. That means for any operator #, expressions A, B, and C, the following properties hold:

  • Associativity: A#(B#C)=(A#B)#C.
  • Commutativity: A#B=B#A.

That allows you to rearrange operands corresponding to the same operator. For example, using both rules one could rearrange the expression shown above:

  • As expression: "((a+f)*b)*(c*d)" can be rearranged to "d*((a+f)*(b*c))".
  • In postfix notation: "af+b*cd**" can be rearranged to "daf+bc***".

Your task is to find a postfix expression that can be obtained by rearranging the initial expression using only the associativity and commutativity rules, and having the least RLE-size. The RLE-size of a string is the number of blocks of consecutive equal characters in it (i.e., the RLE-size of "xx+yy+zz+**" is 7).

The input for this problem is given as a String[] expression. Concatenate all elements of that String[] to get the initial expression in postfix notation. Each variable is denoted by a lowercase letter ('a'-'z'), and the operators are denoted by characters '+', '*', '#', '!', '@', '$', '%' and '^'. You are to return the smallest possible RLE-size of a postfix expression that can be obtained from the initial expression by applying some (possibly none) of the above rules.

 

Definition

    
Class:PostfixRLE
Method:getCompressedSize
Parameters:String[]
Returns:int
Method signature:int getCompressedSize(String[] expression)
(be sure your method is public)
    
 

Notes

-All the quotes in the problem statement are for clarity only.
 

Constraints

-expression will contain between 1 and 50 elements, inclusive.
-Each element of expression will contain between 1 and 50 characters, inclusive.
-Each character in each element of expression will be a lowercase letter ('a'-'z'), '+', '*', '#', '!', '@', '$', '%' or '^'.
-expression will be a valid arithmetic expression in postfix notation.
 

Examples

0)
    
{"af+b*cd**"}
Returns: 7
"daf+bc***" is one of the possible best rearrangements.
1)
    
{"xy*x*y*x*y*"}
Returns: 3
Yields "xxxyyy*****" or "yyyxxx*****".
2)
    
{"abcdefg!@#$%^"}
Returns: 13
No point in changing anything here.
3)
    
{"xy@z@ab@c@yc%%%"}
Returns: 9
4)
    
{"abc++",
"abc++",
"abc++",
"a",
"b",
"c",
"*",
"*",
"*",
"*",
"*"}
Returns: 13

Problem url:

http://www.topcoder.com/stat?c=problem_statement&pm=6834

Problem stats url:

http://www.topcoder.com/tc?module=ProblemDetail&rd=10007&pm=6834

Writer:

Petr

Testers:

PabloGilberto , brett1479 , Olexiy , ged

Problem categories:

Dynamic Programming, Encryption/Compression, Graph Theory, String Parsing