TopCoder problem "BinaryCodes" used in SRM 368 (Division I Level Three)

Problem Statement


Given a finite alphabet S, a binary code over this alphabet S is a function that maps each element of S to some (possibly empty) string over the alphabet {0,1}.

An example of such a code for S={a,b,c,d} is the function f defined by f(a)=1, f(b)=1010, f(c)=01, f(d)=10101.

Any binary code can be naturally extended to encode strings over the alphabet S simply by concatenating the codes of the string's letters, in order. For example, using the code mentioned above we can encode cac as f(cac)=01101.

A code is called ambiguous if there are two different strings over S that have the same encoding. Obviously, in practice we want to avoid using an ambiguous code.

A code is called really ambiguous if there are three different strings over S that have the same encoding. For example, the code from the above example is really ambiguous: the strings ba, acc, and d are all encoded to 10101.

You will be given a String[] code containing the strings over {0,1} used to encode letters of some alphabet S. Your method should check whether this code is really ambiguous. If it is really ambiguous, find a shortest string over {0,1} that is an encoding of (at least) three different strings over S, and return its length. If the given code is not really ambiguous, return -1.



Method signature:int ambiguous(String[] code)
(be sure your method is public)


-Your method does not need to know the actual elements of S, and the size of S is obviously equal to the number of elements in code.


-code will contain between 2 and 30 elements, inclusive.
-Each element of code will contain between 0 and 50 characters, inclusive.
-Each element of code will only contain the characters '0' (zero) and '1' (one).


Returns: 5
This is the example from the problem statement, and the string 10101 is the shortest string that can be decoded in three different ways.
Returns: -1
This code is obviously not ambiguous.
Returns: 2
This is clearly a really ambiguous code, as there are three different one letter strings over S that are encoded to 11.
Returns: -1
This code is a prefix code, i.e., no code word is a prefix of another code word. If a code has this property, it is guaranteed that it is not ambiguous, but the other direction is not true.
Returns: -1
This is not a prefix code, but it can easily be shown that this code is not ambiguous.
Returns: 0
Having an empty code word is a great way how to design a really ambiguous code.
Returns: 8
The shortest proof that this code is really ambiguous is 00011011. Note that this string can in fact be decoded in four different ways.

Problem url:

Problem stats url:




PabloGilberto , vorthys , Olexiy , ivan_metelsky

Problem categories:

Graph Theory, Search, String Manipulation