[gameprogrammer] Re: Let's play... Name That Algorithm! :)
- From: "Mike Gillissie" <Niyoto@xxxxxxxxxx>
- To: <gameprogrammer@xxxxxxxxxxxxx>
- Date: Sat, 5 Sep 2009 05:31:12 -0400
FYI: Kevin's solution (or rather, his second-most-recent solution) ran in
1.46 seconds on my dev box. Processing all combinations of a >5000 x >5000
array. And he has since multi-threaded it. I think he did alright... ;)
-Mike
----- Original Message -----
From: "Kevin Jenkins" <gameprogrammer@xxxxxxxxxx>
To: <gameprogrammer@xxxxxxxxxxxxx>
Sent: Saturday, September 05, 2009 12:56 AM
Subject: [gameprogrammer] Re: Let's play... Name That Algorithm! :)
Probably nobody cares, but I have to apologize to Mike. I wasn't able to
get it under 1 second in the time I was willing to put in. This is what I
wrote in the end, which was basically a brute force approach
http://www.rakkar.org/sourcecode/FCMSpanningCombinationSet.cpp
Mike wanted every possible fully connected mesh that could be generated
from an adjacency matrix. Originally, I thought he wanted the minimum
spanning set of fully connected meshes.
Kevin Jenkins wrote:
If you give me a file with your input set, I'll write a C function that
will do it in less than 1 second.
Mike Gillissie wrote:
Hey folks,
Just an unsolicited follow-up - after trying a lot of different
approaches, I ended up sticking with the recursive model I had already,
but worked in Bob's momoization concept.
Warning - I might go a little long on this one, so read only if you're
curious, bored, masochistic or find humour in the way I attempt to
explain what passes for "thinking" in my world. ;)
My main concern - even with mid-range input quantities (say, 3000 units)
my function executes at least 10s of billions of times (I never bothered
going long int to get an actual count). This means that everything I do
in this function can have a drastic effect on performance - even using
non-synchronized collections costs too many cycles.
So, as my way of caching results, I perform a pre-check on the adjacency
table, determining which rows of bools are subsets of others (or are
identical, meaning they're compatible with the same sets of other
units). Essentialy I determine which unit compatibility sets are subsets
of others, and store this information in an array - unit index,
subset-of index, delta (number of values that are different) and
dependancy.
Quick note - in the following example, 1 represents an incompatibility.
Thus 0 represents compatibility.
1)01110 (incompatible with 2, 3, 4)
2)10011 (incompatible with 1, 4, 5)
3)10001 etc...
4)11001
5)01110
In the above, I find that 1 and 5 are equal, and 3 is a subset of 4 with
a delta of 1 (representing unit 2 which is compatible with 4 but not
with 3).
Since 1 and 5 are identical, I need only calculate for Unit 1. I can
then take all of the resulting combinations for Unit 1 and replace the 1
with a 5.
Since 3 is a subset of 4, I calculate for Unit 3 first (which has LESS
incompatibilities than Unit 4), and then simply perform the same "search
and replace" to create the combinations for Unit 4, only I exclude any
that contain the "Delta" element (in this case, Unit 2).
The same subset approach would be done to calculate Unit 2, since Unit 3
is also a subset of 2.
As Bob suggests, the success of this approach depends strongly on the
make up of the data itself - there are always going to be situations
where you just don't have enough subsets to justify the overhead of
searching for them. But in the data set I'm currently working on, my
execution time (which had previously gone from 3.6 hours to just over 1
hour) dropped to 4-8 seconds. And this can surely be fine-tuned further.
A further look at the problem told me that, while some compatibility
sets will just not have any subsets to exploit, I could always measure
the Similarity between two sets. In the above, Unit 1 and Unit 2 have a
similarity of 1 (representing their shared compatibility with Unit 4).
So if I was to create a "virtual" compatibility set containing Unit 4,
it would become a subset of both Unit 1 and Unit 2, meaning I would only
have to recursively calculate that combination once, and then use the
above-approach for both Units 1 and 2. The challenge here is in finding
the optimal grouping that minimizes the number of recursive calls
without overpaying in overhead cycles.
It's definitely an interesting problem for me... :)
-Mike
---------------------
To unsubscribe go to http://gameprogrammer.com/mailinglist.html
---------------------
To unsubscribe go to http://gameprogrammer.com/mailinglist.html
Other related posts: