TopCoder problem "ClosestRegex" used in SRM 410 (Division II Level Three)



Problem Statement

    

A regular expression is a pattern describing how a particular string should be formed. For the purposes of this problem, a regular expression consists of atoms. Each atom is either a single lowercase letter (which matches exactly one of that letter), or a single lowercase letter followed by an asterisk ('*'), which matches zero or more of that letter. A string matches a regular expression if it can be partitioned into substrings which match the atoms of the regular expression in the same order. For example, the regular expression ab*c*b is matched by the strings ab, abb, acb and abbbcccb, but not by ba, accbb or babcb.

You have a string text which ought to match the regular expression regex. However, it may have been corrupted. Return the string S that satisfies the following conditions.

  1. S has the same number of characters as text.
  2. S matches regex.
  3. The number of positions in which S differs from text is minimal.
  4. S is lexicographically smallest amongst strings that satisfy the other conditions.

If there is no string that satisfies the above conditions, return the empty string.

 

Definition

    
Class:ClosestRegex
Method:closestString
Parameters:String, String
Returns:String
Method signature:String closestString(String text, String regex)
(be sure your method is public)
    
 

Constraints

-text will contain between 1 and 50 characters, inclusive.
-text will contain only lowercase letters ('a' - 'z').
-regex will contain between 1 and 50 characters, inclusive.
-regex will contain only lowercase letters and '*'s.
-The first character of regex will not be a '*'.
-regex will not contain two consecutive '*'s.
 

Examples

0)
    
"abcd"
"bcdd"
Returns: "bcdd"
Here there are no asterisks, so only one string can match. 'a' must be changed to 'b', 'b' to 'c' and 'c' to 'd'.
1)
    
"topcoder"
"t*px*coa*de*"
Returns: "ttpcodee"
Changing the string to ttpcodee requires two changes.
2)
    
"cmu"
"c*m*fm*u*"
Returns: "cfu"
Any of fmu, cfu and cmf would require one change, but cfu is first lexicographically.
3)
    
"aaaaacccc"
"a*abc*"
Returns: "aaaaabccc"
4)
    
"short"
"lo*ts*of*let*ter*s"
Returns: ""
No 5-letter string matches the regex.

Problem url:

http://www.topcoder.com/stat?c=problem_statement&pm=9727

Problem stats url:

http://www.topcoder.com/tc?module=ProblemDetail&rd=12182&pm=9727

Writer:

bmerry

Testers:

PabloGilberto , legakis , Olexiy , ivan_metelsky

Problem categories:

Dynamic Programming, String Manipulation