TopCoder problem "PartitionGraph" used in ()

Problem Statement

    Given a simple undirected graph G, your task is to partition it into C disjoint sets of nodes. Your goal is to do this in such a way that as few edges are cut as possible, while each of the sets is relatively large. More specifically, you want to minimize the ratio: (edges cut)/(size of smallest set).

In order to receive any points on a test case, your ratio must be within 0.01% of the smallest ratio of any competitor. Assuming that you achieve this ratio, your score will be based on the time it takes you to find the answer. Your score for a test case will be inversely proportional to the time it takes plus 100 milliseconds, while your total score will simply be the sum of the scores for the test cases. Thus, if you have the smallest ratio on two cases with times 100 and 200 milliseconds, your score will be proportional to 1/(100+100)+1/(200+100).

The graphs will be generated as follows:
  1. C, the number of clusters (and input to you) will be chosen uniformly between 2 and 10, inclusive.
  2. N, the number of nodes in the graph will be chosen uniformly between 100 and 5,000, inclusive.
  3. p, the probability of an edge existing between nodes in different clusters, will be chosen uniformly between 0.05 and 0.25, inclusive.
  4. q, the probability of an edge between nodes in the same cluster, will be chosen uniformly between p+2*sqrt(p*C/N) and p+8*sqrt(p*C/N) (with the upper bound capped at 1).
  5. A graph is generated with N nodes, each node is in one of the C clusters with equal probability. Every undirected edge is then added with probability p or q depending on whether the edge is between different clusters or within one cluster.

For example, consider a graph with two clusters and 100 nodes, where nodes 0-49 are in cluster 0, while nodes 50-99 are in cluster 1. Imagine that p were chosen as 0.1, and q was chosen as 0.2. Then, for every pair of integers (i,j), if i and j were both in the same cluster (both 0-49 or both 50-99), there would be an edge between i and j with probability 0.2. If i and j were in different clusters, then the probability of an edge between i and j would be 0.1. Thus, in this case we would expect each node to have 49*0.2 edges to nodes in the same cluster, and 50*0.1 edges to nodes in the other cluster. If our algorithm found this parition, we would expect it to cut roughly 50*0.1*50=250 edges, and thus have a ratio of 250/50 = 5.

The graph will be given to you as two int[]'s: u and v. Corresponding elements of u and v represent a single undirected edge. For instance, there is an edge between u[0] and v[0]. The number of nodes are given as an input nodes. Your return should be a int[] with nodes elements. Element i of the return should represent the partition (out of C total) that node i is in. Everything is indexed from 0, including the return.


Parameters:int[], int[], int, int
Method signature:int[] partition(int[] u, int[] v, int nodes, int C)
(be sure your method is public)


-The graph is undirected, so edges will only be listed once in the input.
-The time limit is 60 seconds.
-The memory limit is 1024 megabytes.
-A partition of nodes into C disjoint sets means that you should assign an integer between 0 and C-1 to every node.
-An edge is cut if its two end points are assigned different numbers (as described in the previous note).
-The scores you see for individual test cases encode the number of edges you cut and the size of your minimum partition. If the score is written down as an integer, the last 5 digits are your time, the 4 digits before them gives the size of your smallest partition, while the rest of the digits give the number of edges cut. For example, the score 1096 0012 00004 (spaces for clarity) indicates that your time was 4 milliseconds, your smallest cluster was 12 nodes, and you cut 1096 edges. This information will also appear in the "fatal errors" section for examples, though it is clearly not actually a fatal error.
-There are 20 non-example test cases. After the competition is over, all solutions will be re-run on a larger test set.
-The total scores will be normalized such that the highest score is exactly 10,000.
-The zip files for the examples contain one edge per line.


-The graph will be generated as described above.


"This graph has 100 nodes, and 9 clusters.  p = 0.1298495467416172, q = 0.562381664187069.  The input C=9 "

The optimal ratio is less than 60.
"This graph has 100 nodes, and 2 clusters.  p = 0.05512270939535211, q = 0.23170504638351752.  The input C=2 "
"This graph has 100 nodes, and 6 clusters.  p = 0.07393663470458767, q = 0.42640730225653994.  The input C=6 "
"This graph has 2103 nodes, and 4 clusters.  p = 0.23752417679934573, q = 0.3425110757342353.  The input C=4 "
"This graph has 4909 nodes, and 10 clusters.  p = 0.11841855221632161, q = 0.20495237307524405.  The input C=10 "
"This graph has 3064 nodes, and 2 clusters.  p = 0.20617708771524107, q = 0.28020071025728954.  The input C=2 "
"This graph has 390 nodes, and 10 clusters.  p = 0.06041993112425246, q = 0.2733661251252104.  The input C=10 "
"This graph has 2789 nodes, and 4 clusters.  p = 0.16688716607823517, q = 0.2312691747743178.  The input C=4 "
"This graph has 2953 nodes, and 2 clusters.  p = 0.1474143432742407, q = 0.18596635921872035.  The input C=2 "
"This graph has 4324 nodes, and 9 clusters.  p = 0.20187512571848237, q = 0.2749548626937394.  The input C=9 "

Problem url:

Problem stats url:



Problem categories: