In this problem a sample will be a collection of numbers. The mean of a sample is the average of all the numbers it contains. The variance of a sample is the sum of the squares of the distances from each number in the collection to its mean, divided by its number of elements. For example, if the sample is {3,4,7,10} the mean is 6 and the variance is (3^2 + 2^2 + 1^2 + 4^2)/4 = 7.5.
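The two definitions can be checked directly with a minimal Python sketch. The function names mean and variance are chosen here for illustration and are not part of the problem statement.

```python
def mean(sample):
    # Average of all the numbers in the sample (assumed non-empty).
    return sum(sample) / len(sample)


def variance(sample):
    # Sum of squared distances to the mean, divided by the number of elements.
    m = mean(sample)
    return sum((x - m) ** 2 for x in sample) / len(sample)


print(mean([3, 4, 7, 10]))      # 6.0
print(variance([3, 4, 7, 10]))  # 7.5
```

Note that the divisor is the number of elements, not that number minus one, so a sample with a single element has variance 0.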
We have k samples that have been combined into one collection, and we need to split them apart again. The only thing we remember about the samples is that none of them was empty; they may each have had a different number of elements. Your task is to find the partition of mixedSamples, the combined collection, into k non-empty samples that minimizes the sum of the variances of the resulting samples. See examples for further clarification.
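One possible approach, sketched below in Python, is a dynamic program over the sorted collection. It rests on an assumption that the statement does not confirm: that some optimal partition consists of groups that are contiguous once mixedSamples is sorted. The function name min_total_variance and its return value (the minimal sum of variances) are also illustrative choices, not taken from the statement.

```python
from itertools import accumulate


def min_total_variance(mixed_samples, k):
    # ASSUMPTION: an optimal partition uses contiguous runs of the sorted data.
    xs = sorted(mixed_samples)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of elements")

    # Prefix sums of values and of squared values for O(1) segment variance.
    s1 = [0] + list(accumulate(xs))
    s2 = [0] + list(accumulate(x * x for x in xs))

    def seg_var(i, j):
        # Variance of xs[i:j] (i < j), using E[x^2] - (E[x])^2.
        cnt = j - i
        avg = (s1[j] - s1[i]) / cnt
        # Clamp tiny negative values caused by floating-point rounding.
        return max(0.0, (s2[j] - s2[i]) / cnt - avg * avg)

    inf = float("inf")
    # best[g][j]: minimal total variance splitting xs[:j] into g non-empty groups.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for g in range(1, k + 1):
        for j in range(g, n + 1):
            best[g][j] = min(
                best[g - 1][i] + seg_var(i, j) for i in range(g - 1, j)
            )
    return best[k][n]


print(min_total_variance([3, 4, 7, 10], 2))
```

For the collection {3,4,7,10} with k = 2, the sketch picks the split {3,4} and {7,10}, whose variances are 0.25 and 2.25, for a total of 2.5. The running time is on the order of k times n squared for n elements.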