[openbeos] Re: app_server: MMX/SSE help wanted

  • From: Adi Oanca <adioanca@xxxxxxxxxxxxxx>
  • To: openbeos@xxxxxxxxxxxxx
  • Date: Tue, 10 Aug 2004 11:02:24 +0300

Scott MacMaster wrote:

It seems to me that you misunderstood what Mat said about improving
algorithms before making smaller optimizations.

You seem to be talking about the choice of algorithm.  A software based
algorithm compared to a hardware based algorithm (MMX/SSE).  Anyone who
knows anything about the difference between the two would know the hardware
algorithm would be better.

Mat appears to be talking about higher level 'control' algorithms.  If you
had a really bad algorithm for determining when windows need to update, what
parts of the screen to update, and how often to update, it won't matter much
how fast your alpha_blending algorithm is.

Example:
    First Revision of code has bad control algorithm that calls
alpha_blending algorithm 10,000 times a second
        - optimizing of alpha_blending algorithm (possibly by switching to
MMX) results in 10% improvement
        - rewriting of control algorithm (so it now calls alpha_blending
algorithm 100 times a second) results in 100% improvement.

The example is somewhat contrived but I feel it's an accurate example of my
point.  Basically, use MMX/SSE but I (and others) highly recommend trying to
optimize your control algorithm first.

I got what Mat said. It seems like you are missing the point too. (IMO)

When you have large amounts of data to process, it is clear to me that besides choosing an algorithm (which doesn't have to be the best!), in the end you still have a 'for' inside a 'for'. And that means you are bandwidth-dependent. Now compare 64-bit processing (not even talking about 128-bit in the case of SSE) against 32-bit. I see a 2x to 3x difference, which is more than enough motivation not to rack my brain for the best possible algorithm when I already have a really nice one in place that uses 80-90% of the *specially engineered* resources.
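
As a rough illustration of the "for in for" and the 32-bit versus 64-bit point, here is a portable sketch. It is not MMX and not the app_server code: it only mimics the idea by packing two pixels into one 64-bit integer so that each multiply works on four 8-bit channels at once. The assumptions are mine: a single constant alpha in the range 0..256, division by 256 done with a shift, rows stored contiguously, and all function names hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

using std::uint8_t;
using std::uint32_t;
using std::uint64_t;

// Scalar reference: blend one 32-bit pixel one channel at a time.
// alpha is in [0, 256]; 256 means "all source", 0 means "all destination".
uint32_t blend_pixel_scalar(uint32_t src, uint32_t dst, uint32_t alpha) {
    uint32_t out = 0;
    for (int shift = 0; shift < 32; shift += 8) {
        uint32_t s = (src >> shift) & 0xFF;
        uint32_t d = (dst >> shift) & 0xFF;
        out |= (((s * alpha + d * (256 - alpha)) >> 8) & 0xFF) << shift;
    }
    return out;
}

// Packed version: two pixels (eight channels) in one 64-bit word.
// Every other byte is spread into its own 16-bit lane. A lane never
// exceeds 255 * 256 = 65280, so nothing carries into the next lane.
uint64_t blend_two_pixels_packed(uint64_t src, uint64_t dst, uint64_t alpha) {
    const uint64_t kMask = 0x00FF00FF00FF00FFULL;
    const uint64_t inv = 256 - alpha;
    uint64_t even = (((src & kMask) * alpha + (dst & kMask) * inv) >> 8) & kMask;
    uint64_t odd = ((((src >> 8) & kMask) * alpha +
                     ((dst >> 8) & kMask) * inv) >> 8) & kMask;
    return even | (odd << 8);
}

// The 'for' in 'for': blend src over dst in place, two pixels per step.
void blend_bitmap(const uint32_t* src, uint32_t* dst,
                  std::size_t width, std::size_t height, uint32_t alpha) {
    for (std::size_t y = 0; y < height; ++y) {
        const uint32_t* s = src + y * width;
        uint32_t* d = dst + y * width;
        std::size_t x = 0;
        for (; x + 1 < width; x += 2) {
            uint64_t sp = (static_cast<uint64_t>(s[x + 1]) << 32) | s[x];
            uint64_t dp = (static_cast<uint64_t>(d[x + 1]) << 32) | d[x];
            uint64_t out = blend_two_pixels_packed(sp, dp, alpha);
            d[x] = static_cast<uint32_t>(out);
            d[x + 1] = static_cast<uint32_t>(out >> 32);
        }
        // Odd width: finish the last pixel with the scalar path.
        if (x < width) d[x] = blend_pixel_scalar(s[x], d[x], alpha);
    }
}
```

The packed path gives the same result as the scalar one with fewer multiplies per pixel. Whether that shows up as a real speedup depends on the compiler and the memory bandwidth, so it would have to be measured. Real MMX code would also load and store 64 bits directly instead of packing by hand.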

One more thing, to make sure you have this right: my algorithm is not poorly designed; it is what I have thought best for some time. There is room for optimization, but nowhere near as much as 100%; no way! If there were, it would be a poorly designed one.



Adi.
