TopCoder problem "ScriptLanguage" used in TCCC '04 Round 4 (Division I Level Three)



Problem Statement

    

You are to write the part of a compiler that generates warning messages. The programming language is a simple script language where the only statements are assignments, IF-THEN-ELSE constructions and RETURN. Variable names are limited to single upper case letters, and they're always 32-bit signed integers. Given a syntactically correct function written in this language, generate the appropriate warning messages as described below.

The syntax of the language only allows one statement per line. Lines will not have leading or trailing spaces and tokens will be separated by exactly one space. Formally, the syntax for each line is as follows:

<line>       :: <head> | <assignment> | <if> | ELSE | END IF | <return>
<head>       :: PARAM <paramlist> | PARAM
<assignment> :: <variable> = <rvalue>
<if>         :: IF <variable> <relation> <value> THEN
<return>     :: RETURN <value>
<paramlist>  :: <variable> | <variable> <paramlist>
<rvalue>     :: <value> | <value> <operator> <value>
<value>      :: <variable> | <integer>
<operator>   :: + | - | * | /
<relation>   :: < | = | >
<variable>   :: A | B | ... | Z
<integer>    :: A 32-bit signed integer (without leading zeros)

The first line in the function will always be the function head (PARAM, optionally followed by a space separated list of variables) and the last line will always contain a RETURN statement. No other line may contain a function head, although RETURN statements can occur elsewhere as well.

Furthermore, each IF-statement has a corresponding END IF-statement and an optional ELSE-statement in between. The IF-statements may be nested, but then they will always be properly nested. An IF-statement will always compare a variable with an integer or a different variable.

You should generate two different warnings, where appropriate. These are: (quotes for clarity only)

"Line <line>: unreachable code"
"Line <line>: variable <variable> might not have been initialized"

The first warning should be given on lines which are never executed, no matter how the IF-statements are evaluated. Each such line will only receive this warning, and no other warnings. Lines containing ELSE and END IF statements will never receive this warning (these lines don't render into executable code by the compiler), see example 4.

The second warning should be given on lines where the value of a variable is used, but that variable might not have been assigned a value yet. A variable will have a value assigned to it if it's in the parameter list in the function head, or once it's been assigned a value by an assignment statement.

When determining which warnings to give, you should assume that all IF-statements can evaluate to either true or false, no matter which instructions have previously been executed, see example 2. Line numbers start from 1, which will always be the function head.

The warnings should be sorted primarily by line number. If a line contains several different variables that all might be uninitialized, those warning messages should be sorted alphabetically by the variable name, see example 0.

Create a class ScriptLanguage containing the method warnings that takes a String[] code, a function in the script language and returns a String[] containing the warning messages in the format specified above.

 

Definition

    
Class:ScriptLanguage
Method:warnings
Parameters:String[]
Returns:String[]
Method signature:String[] warnings(String[] code)
(be sure your method is public)
    
 

Notes

-Make sure you don't misspell the warning messages! Copy and paste from the problem statement is recommended.
-If there are no warnings, return a String[] with no elements.
 

Constraints

-code will contain between 2 and 50 elements, inclusive.
-code will be a syntactically correct function.
-Elements in code will not have leading or trailing spaces, and tokens will be separated by exactly one space.
-The first element in code, and no other element, will be a PARAM-statement.
-The variables in the PARAM-statement will all be distinct.
-The last element in code will be a RETURN-statement.
-An IF-statement will compare a variable with an integer or a different variable.
-Integers in code will fit in a 32-bit signed integer and will not have extra leading zeros.
-Each IF-statement will have a matching END IF-statement and optionally one matching ELSE-statement.
 

Examples

0)
    
{"PARAM A B",
 "IF A > 5 THEN",
 "C = B * A",
 "END IF",
 "D = B - C",
 "Z = Y + X",
 "E = T",
 "F = E + E",
 "V = G + G",
 "RETURN F"}
Returns: 
{ "Line 5: variable C might not have been initialized",
 "Line 6: variable X might not have been initialized",
 "Line 6: variable Y might not have been initialized",
 "Line 7: variable T might not have been initialized",
 "Line 9: variable G might not have been initialized" }

If the IF-statement evaluates to false, C will not be initialized, hence the warning message on line 5. Also note that even though T is uninitialized, E and F shouldn't be considered uninitialized on line 8 and 10, even though their values are unknown. Finally note that there should only be one warning mentioning variable G on line 9.

1)
    
{"PARAM G",
 "RETURN G",
 "B = K",
 "RETURN C"}
Returns: { "Line 3: unreachable code",  "Line 4: unreachable code" }
Notice that there should only be the "unreachable code" warning message on lines 3 and 4.
2)
    
{"PARAM T C",
 "B = T",
 "A = 4",
 "IF A < 4 THEN",
 "IF B > 3 THEN",
 "Q = 100 + F",
 "ELSE",
 "IF C = -1111111111 THEN",
 "Q = T - A",
 "IF Q = 0 THEN",
 "V = V - 1",
 "END IF",
 "ELSE",
 "RETURN I",
 "E = A",
 "END IF",
 "END IF",
 "ELSE",
 "Q = 1",
 "END IF",
 "RETURN Q"}
Returns: 
{ "Line 6: variable F might not have been initialized",
 "Line 11: variable V might not have been initialized",
 "Line 14: variable I might not have been initialized",
 "Line 15: unreachable code" }
This function contains several nested IF-statements. Notice that the line "IF A < 4 THEN" should be considered to be able to evaluate to both true and false even though the line before it says "A = 4".
3)
    
{"PARAM",
 "IF A > 0 THEN",
 "ELSE",
 "END IF",
 "IF A > 0 THEN",
 "END IF",
 "IF A > 0 THEN",
 "A = 2",
 "ELSE",
 "IF A > 0 THEN",
 "END IF",
 "A = 3",
 "END IF",
 "IF A < 0 THEN",
 "END IF",
 "RETURN A"}
Returns: 
{ "Line 2: variable A might not have been initialized",
 "Line 5: variable A might not have been initialized",
 "Line 7: variable A might not have been initialized",
 "Line 10: variable A might not have been initialized" }
4)
    
{"PARAM I J K L T",
 "IF I > 10 THEN",
 "IF I < 100 THEN",
 "IF J > 10 THEN",
 "IF J < 100 THEN",
 "IF K > 10 THEN",
 "IF K < 100 THEN",
 "IF L > 10 THEN",
 "IF L < 100 THEN",
 "A = I + J",
 "B = K + L",
 "C = A + B",
 "RETURN C",
 "IF T > 4 THEN",
 "ELSE",
 "END IF",
 "END IF",
 "END IF",
 "END IF",
 "END IF",
 "END IF",
 "END IF",
 "END IF",
 "END IF",
 "RETURN -1"}
Returns: { "Line 14: unreachable code" }
5)
    
{"PARAM A",
 "A = A + A",
 "A = A * A",
 "A = A - A",
 "A = A / A",
 "RETURN A"}
Returns: { }
Finally a program without warnings!

Problem url:

http://www.topcoder.com/stat?c=problem_statement&pm=1956

Problem stats url:

http://www.topcoder.com/tc?module=ProblemDetail&rd=5009&pm=1956

Writer:

Yarin

Testers:

lbackstrom , schveiguy

Problem categories:

String Manipulation, String Parsing