[gameprogrammer] Re: Let's play... Name That Algorithm! :)

  • From: "Mike Gillissie" <Niyoto@xxxxxxxxxx>
  • To: <gameprogrammer@xxxxxxxxxxxxx>
  • Date: Sat, 5 Sep 2009 05:31:12 -0400

FYI: Kevin's solution (or rather, his second-most-recent solution) ran in 1.46 seconds on my dev box. Processing all combinations of a >5000 x >5000 array. And he has since multi-threaded it. I think he did alright... ;)


----- Original Message ----- From: "Kevin Jenkins" <gameprogrammer@xxxxxxxxxx>
To: <gameprogrammer@xxxxxxxxxxxxx>
Sent: Saturday, September 05, 2009 12:56 AM
Subject: [gameprogrammer] Re: Let's play... Name That Algorithm! :)

Probably nobody cares, but I have to apologize to Mike. I wasn't able to get it under 1 second in the time I was willing to put in. This is what I wrote in the end, which was basically a brute force approach


Mike wanted every possible fully connected mesh that could be generated from an adjacency matrix. Originally, I thought he wanted the minimum spanning set of fully connected meshes.

Kevin Jenkins wrote:
If you give me a file with your input set, I'll write a C function that will do it in less than 1 second.

Mike Gillissie wrote:
Hey folks,

Just an unsolicited follow-up - after trying a lot of different approaches, I ended up sticking with the recursive model I had already, but worked in Bob's momoization concept.

Warning - I might go a little long on this one, so read only if you're curious, bored, masochistic or find humour in the way I attempt to explain what passes for "thinking" in my world. ;)

My main concern - even with mid-range input quantities (say, 3000 units) my function executes at least 10s of billions of times (I never bothered going long int to get an actual count). This means that everything I do in this function can have a drastic effect on performance - even using non-synchronized collections costs too many cycles.

So, as my way of caching results, I perform a pre-check on the adjacency table, determining which rows of bools are subsets of others (or are identical, meaning they're compatible with the same sets of other units). Essentialy I determine which unit compatibility sets are subsets of others, and store this information in an array - unit index, subset-of index, delta (number of values that are different) and dependancy.

Quick note - in the following example, 1 represents an incompatibility. Thus 0 represents compatibility.

1)01110 (incompatible with 2, 3, 4)
2)10011 (incompatible with 1, 4, 5)
3)10001 etc...

In the above, I find that 1 and 5 are equal, and 3 is a subset of 4 with a delta of 1 (representing unit 2 which is compatible with 4 but not with 3).

Since 1 and 5 are identical, I need only calculate for Unit 1. I can then take all of the resulting combinations for Unit 1 and replace the 1 with a 5.

Since 3 is a subset of 4, I calculate for Unit 3 first (which has LESS incompatibilities than Unit 4), and then simply perform the same "search and replace" to create the combinations for Unit 4, only I exclude any that contain the "Delta" element (in this case, Unit 2).

The same subset approach would be done to calculate Unit 2, since Unit 3 is also a subset of 2.

As Bob suggests, the success of this approach depends strongly on the make up of the data itself - there are always going to be situations where you just don't have enough subsets to justify the overhead of searching for them. But in the data set I'm currently working on, my execution time (which had previously gone from 3.6 hours to just over 1 hour) dropped to 4-8 seconds. And this can surely be fine-tuned further.

A further look at the problem told me that, while some compatibility sets will just not have any subsets to exploit, I could always measure the Similarity between two sets. In the above, Unit 1 and Unit 2 have a similarity of 1 (representing their shared compatibility with Unit 4). So if I was to create a "virtual" compatibility set containing Unit 4, it would become a subset of both Unit 1 and Unit 2, meaning I would only have to recursively calculate that combination once, and then use the above-approach for both Units 1 and 2. The challenge here is in finding the optimal grouping that minimizes the number of recursive calls without overpaying in overhead cycles.

It's definitely an interesting problem for me... :)

To unsubscribe go to http://gameprogrammer.com/mailinglist.html

To unsubscribe go to http://gameprogrammer.com/mailinglist.html

Other related posts: