[gameprogrammer] Re: RES: Re: C# vs C++
- From: Sami Näätänen <sn.ml@xxxxxxxxxxxxxxx>
- To: gameprogrammer@xxxxxxxxxxxxx
- Date: Wed, 25 Feb 2009 22:16:32 +0200
On Tuesday 24 February 2009 00:33:46 Bob Pendleton wrote:
> On Sat, Feb 21, 2009 at 6:45 PM, Garrett Gaston <garrett85@xxxxxxxxxxx>
> wrote:
> > I am a very VERY novice programmer, what do you mean when you call
> > certain programming languages slow and fast?
>
> When they say that a language is fast or slow they are saying that
> code produced with that language takes less time or more time to run.
> So, generally C code runs faster than Java code. But, the actual speed
> of code execution is only somewhat related to the language.
...Snip...
> So, really, when some one says that language X is faster than language
> Y they are not saying what most people think they are saying.
Also worth mentioning is that modern compilers have become much better at
optimizing, so in most cases it may now be better to write code that looks
crazy to an old hand: code that skips the manual optimizations that previously
were well worth doing.
There is of course still room for improvement on the compiler side, but that
area is under heavy development at the moment, which will improve the produced
code even further. Writing compact code like this is, at least for me as an
older programmer, quite hard to keep in mind. When we started coding, the real
optimizations came from the programmer, not from the compiler.
PS. Here is a small example to illustrate this.
It shows how the situation has changed.
--------------------- foobar.c -----------------------
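float a[4] = {1.0f};
float b[4] = {1.2f};

/* Hand-unrolled: one statement per element. */
void
bar() {
    a[0] = b[0]*b[0];
    a[1] = b[1]*b[1];
    a[2] = b[2]*b[2];
    a[3] = b[3]*b[3];
}

/* The same computation written as a plain loop. */
void
foo() {
    int i;
    for( i = 0; i<4; i++ )
        a[i] = b[i]*b[i];
}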
---EOF------------------ foobar.c --------------------EOF---
Here foo and bar do exactly the same thing.
Either one of them is faster than the other, or they are equally fast.
With older compilers foo could not outperform bar, whatever you did.
So here are the different cases using different optimizations:
1) bar is faster if the compiler doesn't do any loop unrolling.
2) With loop unrolling the two functions become equal.
3) With tree vectorization foo will be much faster.
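Before comparing speed it is worth confirming that the two functions really compute the same thing. The following is only a minimal sketch of such a check: foo and bar are copied from foobar.c, while foo_matches_bar and check_foo_matches_bar are hypothetical helper names that are not part of the original listing, and the input values are arbitrary.

```c
#include <assert.h>
#include <string.h>

float a[4] = {1.0f};
float b[4] = {1.2f};

/* Hand-unrolled version from foobar.c. */
void bar(void) {
    a[0] = b[0] * b[0];
    a[1] = b[1] * b[1];
    a[2] = b[2] * b[2];
    a[3] = b[3] * b[3];
}

/* Loop version from foobar.c. */
void foo(void) {
    int i;
    for (i = 0; i < 4; i++)
        a[i] = b[i] * b[i];
}

/* Hypothetical helper: returns 1 if foo and bar leave identical bytes in a[]. */
int foo_matches_bar(void) {
    float from_foo[4], from_bar[4];

    /* Arbitrary test inputs (an assumption, not from the original post). */
    b[0] = 1.2f; b[1] = 2.0f; b[2] = -3.5f; b[3] = 0.25f;

    foo();
    memcpy(from_foo, a, sizeof a);

    memset(a, 0, sizeof a);   /* make sure bar really rewrites a[] */
    bar();
    memcpy(from_bar, a, sizeof a);

    return memcmp(from_foo, from_bar, sizeof a) == 0;
}

/* Hypothetical self-check: aborts via assert() if the two versions disagree. */
void check_foo_matches_bar(void) {
    assert(foo_matches_bar());
}
```

Calling check_foo_matches_bar() from a small main and rebuilding with each set of optimization flags should give the same answer every time; only the generated code changes.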
Here is some assembler produced by gcc-4.3.3 using -march=core2 under a 64-bit
OS for function foo.
First with -O2 -funroll-loops
(in this case the code is identical to the code produced for bar):
foo:
.LFB3:
movss b(%rip), %xmm3
movss b+4(%rip), %xmm2
mulss %xmm3, %xmm3
mulss %xmm2, %xmm2
movss b+8(%rip), %xmm1
movss b+12(%rip), %xmm0
mulss %xmm1, %xmm1
mulss %xmm0, %xmm0
movss %xmm3, a(%rip)
movss %xmm2, a+4(%rip)
movss %xmm1, a+8(%rip)
movss %xmm0, a+12(%rip)
ret
.LFE3:
.size foo, .-foo
And now with -ftree-vectorize the for loop has been reduced to 3 assembler
instructions:
foo:
.LFB3:
movaps b(%rip), %xmm0
mulps %xmm0, %xmm0
movaps %xmm0, a(%rip)
ret
.LFE3:
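For comparison, this is roughly what the programmer would have had to write by hand to get the same three instructions. It is only a sketch using the SSE intrinsics from xmmintrin.h. The name foo_sse is hypothetical, the aligned attribute is a GCC extension, and the scalar fallback is my own addition so that the file still builds on targets without SSE.

```c
#include <assert.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* The aligned load and store below (movaps) need 16-byte aligned arrays.
   __attribute__((aligned(16))) is a GCC extension. */
float a[4] __attribute__((aligned(16))) = {1.0f};
float b[4] __attribute__((aligned(16))) = {1.2f};

/* Hypothetical hand-vectorized counterpart of foo. */
void foo_sse(void) {
#if defined(__SSE__)
    __m128 v = _mm_load_ps(b);   /* aligned load of b[0..3]   (movaps) */
    v = _mm_mul_ps(v, v);        /* four multiplies at once   (mulps)  */
    _mm_store_ps(a, v);          /* aligned store to a[0..3]  (movaps) */
#else
    /* Fallback for targets without SSE: the plain loop. */
    int i;
    for (i = 0; i < 4; i++)
        a[i] = b[i] * b[i];
#endif
}
```

The point of the example in this post is that the plain loop in foo now gets compiled to the same thing, so the intrinsics version mostly costs readability and portability.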