TopCoder problem "TCMinMin" used in SRM 213 (Division I Level Three)



Problem Statement

    

You have been charged with writing the 'type checking' module for a new language called TC--. When coding in TC--, the coder is allowed to omit the types of local variables and parameters, and the return types of functions, only when the compiler is able to deduce those types from their usage. Each valid program has one and only one possible "type set".

Sometimes a variable is used before its type is "known" when reading the program from top to bottom, but that should not deter the type checking module you're writing. Once the type of a variable is known, it is retroactively effective to the start of the function, and proactively effective to the end of the function. This applies to local variables, formal parameters, and function return types.

Informally the grammar can be defined as:

  • - A program is one or more function definitions.
  • - A function definition always includes the name of the function, zero or more parameters with optional type definitions, and an optional return type. Every function will return a value, even if it is not explicitly declared. After the function definition, zero or more local variables will be defined, then zero or more program statements, then exactly one return statement. Example:
    	function a(b:string)
    
  • - Local variable definitions. Variable definitions are optional, but when present, must specify the variable name and type:
    	a:int
    	b:string
    
  • - Function call. There can be zero or more actual arguments, which always match the type and number of formal parameters. All arguments are variables only and never integer or string constants, nor expressions.
    	a=b()
    	c=d(e,f,g)
    
  • - Assignment.
    	a=12345
    	s="this is 'a' string constant!"
    	b=c
    
  • - Expressions: (note, you cannot mix ints and strings, and cannot use constants in expressions)
    	a=b+c
    	a=b-c
    	a=b/c
    	a=b*c
    
  • - Return statement: (you can return a variable, or an int or string constant)
    	return x
    

Function calls, assignments and expressions are all instances of "program statements" and can occur in any order within the function. A return statement will always be the last statement in a function.

Note that there are only two data types in TC--: int and string. You cannot mix ints and strings in an assignment or an expression. Only the + and - operators are valid on strings, whereas all four operators are valid on ints. Both arguments to a binary operator must be variables or parameters of the same type (no constants in expressions).

For example (line numbers and whitespace added for clarity):

	 1  function f(a,b,c)
	 2     e:int
	 3     g=h(a)
	 4     e=b+c
	 5     return c
	 6  
	 7  function h(p:string):string
	 8     return p

The TC-- compiler will deduce the following types:

  • - The type of parameter a is string from the call on line 3 to function h - whose argument must be a string.
  • - The type of parameters b and c is int, since e is declared as an int on line 2 and you can only add ints together, on line 4.
  • - Since c is an int, and function f returns it on line 5, we know that the return type of f is int.
  • - Therefore the complete signature of function f is:
    	function f(a:string,b:int,c:int):int
    
  • - The type of local variable g must be string because on line 3 it calls function h, which returns a string, defined on line 7. (Note: the fact that p is declared as a string on line 7, and returned on line 8, is another way to determine that the return type of h is string.)

Write a method, deduceTypes, which is given a valid TC-- program which follows the grammar defined above, and determines each function's complete signature and the types of all local variables in each function. The program provided will be syntactically valid, and will be guaranteed to have only one unique type assignment. Your method should return a String[] which contains two types of entries:

  1. 1. function signatures, in the order they were defined in program, in the following format:
    	function name(param:type,...):type
    
    The only whitespace should be a single space between the word "function" (quotes added for clarity) and the name of the function. As listed in the constraints section, all function and parameter names are single letters, lower case.



  2. 2. local variable declarations, in alphabetical order within each function, in the following format:
    	var:type
    
    There should be no whitespace. type should be either "string" or "int" (quotes added for clarity). As listed in the constraints section, all variable names are single letters, lower case.

Therefore, for the above example, your output should be:

	function f(a:string,b:int,c:int):int
	e:int
	g:string
	function h(p:string):string
 

Definition

    
Class:TCMinMin
Method:deduceTypes
Parameters:String[]
Returns:String[]
Method signature:String[] deduceTypes(String[] program)
(be sure your method is public)
    
 

Notes

-Function names may be reused as parameter or variable names.
-Variables can be used before they are assigned a value.
-A formal grammar for the input can be found at the end of the problem.
 

Constraints

-There will be at least one function defined in program
-There will be at most one definition for each function (no overloading).
-program will contain between 2 and 50 elements, inclusive.
-Each element in program contain between 3 and 50 characters, inclusive.
-All variable, parameter and function names will be single letters in lower case.
-Integer constants range from zero (inclusive) to positive infinity, subject to line length constraints
-String constants will always start and end with double quotes
-String constants may contain any upper or lower case letter, digit, space, or any of the following characters:
!@#$%^*()_+`~-={}|;:,.?
-Spaces (outside quotes) will only be present in two locations: 1. After the function keyword (exactly one single space will always be present) 2. After the return keyword (exactly one single space will always be present)
-The program provided will be syntactically valid, and will be guaranteed to have only one unique type assignment.
-The number of arguments in each function call will always match the number of formal parameters defined for that function.
-Variable names will not 'shadow' parameter names, or vice versa.
 

Examples

0)
    
{"function f(a,b,c)",
	"e:int",
	"g=h(a)",
	"e=b+c",
	"return c",
"function h(p:string):string",
	"return p"}
Returns: 
{ "function f(a:string,b:int,c:int):int",
 "e:int",
 "g:string",
 "function h(p:string):string" }
This is the example from above. (Note that the indenting was added for clarity only, and is NOT within the strings themselves.)
1)
    
{"function a()",
	"return 0"}
Returns: { "function a():int" }
This is the smallest possible TC-- program.
2)
    
{"function a(b:int):int",
	"c:int",
	"return c"}
Returns: { "function a(b:int):int",  "c:int" }
Already a fully-specified program; the compiler doesn't have much to do.
3)
    
{"function f(a,b,c)",
	"g=a(a)",
	"f=b+c",
	"return f",
"function a(a:string)",
	"return a",
"function b()",
	"a=f(b,a,c)",
	"a=0123456789012345678901234567890",
	"return a"}
Returns: 
{ "function f(a:string,b:int,c:int):int",
 "f:int",
 "g:string",
 "function a(a:string):string",
 "function b():int",
 "a:int",
 "b:string",
 "c:int" }
4)
    
{"function s(s)",
"s=s*s",
"return s",
"function f():int",
"a=f()",
"return b"}
Returns: { "function s(s:int):int",  "function f():int",  "a:int",  "b:int" }
Remember, * and / are only for integers, not strings.

For your reference, the BNF grammar for TC-- is:

	program ::= function | program function
	function ::= functionDef optionalLocalVars optionalStatements returnStmt
	functionDef ::= 'function' var '(' optionalParams ')' | 
			'function' var '(' optionalParams ')'  ':' type
	optionalParams ::= params | e
	params ::= param | param ',' params
	param ::= var | var ':' type
	optionalLocalVars ::= localVars | e
	localVars ::= localVar | localVar localVars
	localVar ::= var ':' type
	type ::= 'int' | 'string'
	optionalStatements ::= statements | e
	statements ::= statement | statement statements
	statement ::= funCall | assign | expression
	funCall ::= var '=' var '(' optionalArgs ')'
	optionalArgs ::= args | e
	args ::= var | var ',' args
	assign ::= var '=' intConstant | var '=' stringConstant | var '=' var
	expression ::= var '=' var op var 
	returnStmt ::= 'return' var | 'return' intConstant | 'return' stringConstant
	op ::= '+' | '-' | '*' | '/'
	intConstant ::= 0..infinity
	stringConstant ::= '"' chars '"'
	chars ::= char | chars char
	char ::= [a-zA-Z0-9!@#$%^*()_+`~-={}|;:,.? ]
	var ::= [a-z]
(Note, 'e' means empty and [] is used for grouping)

Problem url:

http://www.topcoder.com/stat?c=problem_statement&pm=2897

Problem stats url:

http://www.topcoder.com/tc?module=ProblemDetail&rd=5859&pm=2897

Writer:

dplass

Testers:

PabloGilberto , lbackstrom , brett1479

Problem categories:

Simple Search, Iteration, Simulation, String Parsing