Set Theory Exercise

StarLion · September 24, 2014, 11:36pm

I’m mulling this over in my head, trying to determine a good method code-wise; thoughts appreciated.

Problem: Given a set of possible integer values P, and a target integer value T, determine the smallest set S of values from P that can be summed to reach T. Values of P may be duplicated.

IE:
P = {1,2,3}
T = 6.
S could be {1,1,1,1,1,1}, {3,3}, {2,2,2}, {1,2,2,1} etc.

Thoughts:

The permutations of S are irrelevant (so {1,2,2,1} is the same as {2,2,1,1}). Eliminating them would be difficult though.
Is it possible to specify a P such that S does not exist for a given T? Yes, because P = {2} means all odd T’s are impossible, as are all negative even integers.

AnjanaWijesundara · September 25, 2014, 7:05am

$arr = array(1,2,3);
$result = array();
$total = sum($arr);

function combinations($arr, $level, &$result, $curr=array()) {
    for($i = 0; $i < count($arr); $i++) {
        $new = array_merge($curr, array($arr[$i]));
        if($level == 1) {
            sort($new);
            if (!in_array($new, $result)) {
                $result[] = $new;          
            }
        } else {
            combinations($arr, $level - 1, $result, $new);
        }
    }
}
for ($i = 0; $i<count($arr); $i++) {
    combinations($arr, $i+1, $result);
}


foreach ($result as $res) {
   
      if ($total == sum($res)){
            echo join(" ", $res) . '<br>';
      }
}

rpkamp · September 25, 2014, 7:51am

The thing to note here is that if the more high numbers you have, the less numbers you need to reach T, thus minimising the number of elements in S.
So the thing to do is to start filling S with as many high numbers as possible, and when they don’t fit any more go the second highest number, then the third highest number, etc. If you find that you can’t reach T, then you don’t start with the highest number, but with the second highest number, and do the whole thing again.

In code:

<?php

$p = [1000, 400, 10, 5, 1];
$t = 102810;

rsort($p);

$initialIdx = 0;
$pCount = count($p);

for ($i = $initialIdx; $i < count($p); $i++) {
    $j = $i;
    $sum = 0;
    $s = [];
    while ($sum < $t) {
        if ($p[$j] <= $t - $sum) {
            array_push($s, $p[$j]);
            $sum += $p[$j];
        } else {
            if (++$j > $pCount - 1) {
                break;
            }
        }
        echo "i = $i , j = $j , p[j] = {$p[$j]} , t = $t , sum = $sum\n";
    }
    if ($t === $sum) {
        var_dump($s);
        exit;
    }

    if (++$initialIdx > $pCount - 1) {
        throw new Exception('Could not create S');
    }
}

tested!

Note that this script stops at the first answer it finds and although there may exist other answers, in none of them will S contain less elements then in the given answer.

Mittineague · September 25, 2014, 7:54am

I’m too tired to think deeply about this ATM, but I’m thinking a check for prime numbers, or a check for non-prime numbers should be in there somewhere.

StarLion · September 25, 2014, 10:07am

A good theory, but it does not take into account one thing:

I did not specify that P contains only positive numbers. It can contain negatives; so

P = {17,4,-1}
T = 16

Your result: {4,4,4,4}
Actual best result: {17,-1}

Perhaps a modification of the code to do Minimum Distance instead? But… that would depend on there being balancing in P, which there isnt guaranteed to be… Hrm.

StarLion · September 25, 2014, 10:28am

I understand the principle of recursive combinations, but you run up the probability of hitting the memory limit because the tree you’re creating is theoretically limitless.

for ($i = 0; $i<count($arr); $i++) {

Is not valid. The solution set S can be larger than the number of items in P.
IE:
P = {2,19}
T = 16

Your code wouldl stop at level = 2; meaning there would be no more than 2 ellements in S. However, the correct answer for this situation is
S = {2,2,2,2,2,2,2,2}

tpunt · September 26, 2014, 8:40pm

Initially, I did something similar to the above replies (since I forgot to take into account the ability to have negative numbers in the p set):

<?php

$p = [17, 4, 3];
$t = 15;
$s = [];

sort($p);

for($i = count($p) - 1; $i > -1; --$i) {
    if($p[$i] === 0) continue; // prevents infinite looping from 0 values

    while($t - $p[$i] >= 0) {
        $t -= $p[$i];
        array_push($s, $p[$i]);
    }
}

if($t !== 0) die('No solution found.');

die('Solution: {'.implode(',', $s).'}');

But in order to cater for negative numbers in the p set, you have two options (from what I can see).

The first (and less efficient method) is to just try every combination and choose the shortest combination of them. This, of course, forces you to have to loop through n^2 times, and for large values on n, this isn’t practical.

The second approach is to get as close to the solution as possible in the least possible steps (by adding together the largest possible integers), and then begin trying all test cases (of adding positive to negative numbers) to reach your target. In order to do this solution, you’ll need to have one array of positive numbers only and another array of negative numbers only. This will mean that at worst you will have to loop through n^2 times (meaning no combination was found or that it was the last combination tried), or at best you’ll only need to loop through 1 time (meaning that the first positive n tested equalled your t value).

Here’s the above code snippet adapted to solution #2:

<?php

$log = 0;

$p = [17, -1, 3, -2];
$t = 19;
$s = [];

sort($p);

// sort the values in positive ($p1) and negative ($p2) sets
$p1 = $p2 = [];

foreach($p as $v) {
    if($v === 0) continue; // remove 0 if it's in there...

    if($v > 0)
        array_push($p1, $v);
    else
        array_push($p2, $v);
}

for($i = count($p1) - 1; $i > -1; --$i) {
    while($t - $p1[$i] >= 0) {
    	++$log;
        $t -= $p1[$i];
        array_push($s, $p1[$i]);
    }

    if($t === 0) break; // if we have our answer, don't continue

    if(empty($p2)) continue; // no need to loop through negative values if there is none

    // if the largest value is too big to take away, try it with a combination of negatives
    for($j = count($p2) - 1; $j > -1; --$j) {
    	++$log;
        if($t - ($p1[$i] + $p2[$j]) >= 0) {
            $t -= $p1[$i] + $p2[$j];
            array_push($s, $p1[$i], $p2[$j]);
            break;
        }

        if($t === 0) break; // if we have our answer, don't continue
    }
}

if($t !== 0) die('No solution found.');

die('Solution: {'.implode(',', $s).'}<br />Found in: '.$log.' move(s)');

At least that’s the conclusion I came to I’d love to hear if anyone else has a more efficient solution!

EDIT: It wasn’t necessary, but I added in logging to count the number of iterations required to find the shortest combination.