[gameprogrammer] Re: RES: Re: C# vs C++

On Tuesday 24 February 2009 00:33:46 Bob Pendleton wrote:
> On Sat, Feb 21, 2009 at 6:45 PM, Garrett Gaston <garrett85@xxxxxxxxxxx> wrote:
> > I am a very VERY novice programmer, what do you mean when you call
> > certain programming languages slow and fast?
>
> When they say that a language is fast or slow they are saying that
> code produced with that language takes less time or more time to run.
> So, generally C code runs faster than Java code. But, the actual speed
> of code execution is only somewhat related to the language.

...Snip...

> So, really, when someone says that language X is faster than language
> Y they are not saying what most people think they are saying.

Also worth mentioning is that modern compilers are getting much better at 
optimizing, so in most cases it may be better to write what looks like crazy 
code to an old hand: skip the hand optimizations that previously were well 
worth doing and leave them to the compiler.

There is of course still room for improvement on the compiler side, but that 
area is under heavy development at the moment, which will improve the produced 
code even further. This compact style of coding is, at least for me as an 
older programmer, quite hard to keep in mind. When we started coding, the real 
optimizations came from the programmer, not from the compiler.


PS. Here is a small example to illustrate this.

It shows how the situation has changed.

--------------------- foobar.c -----------------------
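/* Only the first element of each array is given explicitly;
   the remaining elements are zero-initialized. */
float a[4] = {1.0f};
float b[4] = {1.2f};

/* Hand-unrolled: every element is squared by its own statement. */
void
bar(void) {
    a[0] = b[0]*b[0];
    a[1] = b[1]*b[1];
    a[2] = b[2]*b[2];
    a[3] = b[3]*b[3];
}

/* The same computation written as a plain loop. */
void
foo(void) {
    int i;

    for( i = 0; i<4; i++ )
        a[i] = b[i]*b[i];
}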
---EOF------------------ foobar.c --------------------EOF---

Here foo and bar do exactly the same thing.

Which one is faster depends on the compiler: it is either foo or bar, or they 
are equally fast.
With older compilers foo could not outperform bar, whatever you did.
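
To convince yourself that the two variants really compute the same thing, a 
small check along these lines might help. This is only a sketch: the names 
in, out_unrolled, out_loop, square_unrolled, square_loop and results_match are 
made up for this illustration and are not part of foobar.c, and the input 
values are arbitrary ones whose squares are exactly representable as floats.

```c
#define N 4

/* Hypothetical test data, not taken from foobar.c. */
float in[N] = {1.5f, -2.0f, 0.25f, 3.0f};
float out_unrolled[N];
float out_loop[N];

/* Hand-unrolled variant, written the way bar() is. */
void square_unrolled(void) {
    out_unrolled[0] = in[0] * in[0];
    out_unrolled[1] = in[1] * in[1];
    out_unrolled[2] = in[2] * in[2];
    out_unrolled[3] = in[3] * in[3];
}

/* Plain loop variant, written the way foo() is. */
void square_loop(void) {
    int i;

    for (i = 0; i < N; i++)
        out_loop[i] = in[i] * in[i];
}

/* Runs both variants and returns 1 if every element matches, 0 otherwise. */
int results_match(void) {
    int i;

    square_unrolled();
    square_loop();
    for (i = 0; i < N; i++)
        if (out_unrolled[i] != out_loop[i])
            return 0;
    return 1;
}
```

Calling results_match() from a main() and printing the result is enough to 
use it. The check says nothing about speed; it only confirms that the choice 
between the two styles is purely a performance question.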

So here are the different cases using different optimizations:
1) bar is faster if the compiler doesn't do any loop unrolling.
2) With loop unrolling the two functions become equal.
3) With tree vectorization foo will be much faster.

Some assembler produced by gcc-4.3.3 using -march=core2 under a 64-bit OS for 
function foo.

First with -O2 -funroll-loops
( In this case the code is equal to the code produced for bar)

foo:
.LFB3:
        movss   b(%rip), %xmm3
        movss   b+4(%rip), %xmm2
        mulss   %xmm3, %xmm3
        mulss   %xmm2, %xmm2
        movss   b+8(%rip), %xmm1
        movss   b+12(%rip), %xmm0
        mulss   %xmm1, %xmm1
        mulss   %xmm0, %xmm0
        movss   %xmm3, a(%rip)
        movss   %xmm2, a+4(%rip)
        movss   %xmm1, a+8(%rip)
        movss   %xmm0, a+12(%rip)
        ret
.LFE3:
        .size   foo, .-foo

And now with -ftree-vectorize the for loop has been reduced to 3 assembler 
instructions.

foo:
.LFB3:
        movaps  b(%rip), %xmm0
        mulps   %xmm0, %xmm0
        movaps  %xmm0, a(%rip)
        ret
.LFE3:

