Hi,

I don't know about other compilers, but gcc (for ARM and MIPS) does a very good job of optimizing (when -O3 or -Os is used).
But the patch I've done here is a very simple but effective one:

    for (j = b->vblitsize; j--; ) { ... }

Here the test condition is j--, so it's just: is j non-zero, decrement, and loop.

    for (j = 0; j < b->vblitsize; j++) { ... }

But here it has to increment j, compare it against b->vblitsize (a subtract that only sets the flags), and loop if they still differ.
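As a rough sketch of the two loop shapes above: the struct and function names here are made up for illustration, only vblitsize comes from the real code. Both forms run the body the same number of times, as long as vblitsize is not negative and the body does not depend on j counting upwards.

```c
#include <assert.h>

/* Hypothetical stand-in for the blitter state; only vblitsize is from the real code. */
struct loop_sketch { int vblitsize; };

/* Count-up form: needs a compare against vblitsize on every pass. */
static int rows_count_up(const struct loop_sketch *b)
{
    int j, rows = 0;
    for (j = 0; j < b->vblitsize; j++)
        rows++;                 /* stand-in for the real per-row work */
    return rows;
}

/* Count-down form: the loop test is just "is j non-zero", then decrement. */
static int rows_count_down(const struct loop_sketch *b)
{
    int j, rows = 0;
    assert(b->vblitsize >= 0);  /* a negative count would not terminate sensibly */
    for (j = b->vblitsize; j--; )
        rows++;                 /* stand-in for the real per-row work */
    return rows;
}
```

Note that inside the count-down body j runs from vblitsize-1 down to 0, so this only works as a drop-in replacement when the body does not use j as an ascending index.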
In assembly it's more like:

    AMIGA:
            move.l  b->vblitsize,d0
    .mylop: ....
            dbf     d0,.mylop

    PC:
            mov     ecx, b->vblitsize
    .mylop: ...
            loop    .mylop

vs.

    AMIGA:
            move.l  #0,d0
    .mylop: ....
            add.l   #1,d0
            cmp     b->vblitsize,d0
            bne     .mylop

    PC:
            mov     ecx,0
    .mylop: ...
            add     ecx,1
            cmp     ecx,b->vblitsize
            jne     .mylop

My other patch was to remove the unused srcc (uae_u32 srcc = b->bltcdat). Normally the compiler warns about unused variables, but the thing is, e.g.:

    void blitdofast_0 {
        uae_u32 srcc = b->bltcdat;  // srcc is loaded
        ..
        b->bltcdat = srcc;          // and stored back
    }

The srcc value does not change here, and because of the (b->bltcdat = srcc) line, the compiler thinks this variable is used and leaves it there.
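A minimal sketch of that round trip, assuming a cut-down state struct: bltc_sketch, rows_done and both function names are hypothetical, only srcc and bltcdat come from the real code. The two versions leave the state identical, which is why the load and store can simply be dropped from the generated function.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t uae_u32;

/* Hypothetical cut-down blitter state; only bltcdat is from the real code. */
struct bltc_sketch {
    uae_u32 bltcdat;
    uae_u32 rows_done;   /* made-up field standing in for the real blit work */
};

/* Shape of the function before the patch: srcc is loaded and stored back unchanged. */
static void blit_with_roundtrip(struct bltc_sketch *b)
{
    uae_u32 srcc = b->bltcdat;  /* srcc is loaded */
    b->rows_done++;             /* the work in between never touches srcc */
    b->bltcdat = srcc;          /* and stored back, so no "unused variable" warning */
}

/* Shape after the patch: the pointless load/store pair is gone. */
static void blit_without_roundtrip(struct bltc_sketch *b)
{
    b->rows_done++;
}
```

Whether a given compiler removes the pair by itself depends on what it can prove about the code in between, so I would not rely on it.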
These may not make a speed difference on high-end PCs, but they make a lot of difference on the 266 MHz PSP and the 200 MHz GP2X.

-GnoStiC/bronx

Toni Wilen wrote:
> > genblitter diff for wip4.. this should be even faster..
>
> Shouldn't it be the C compiler's job to do these kinds of optimizations? (I thought all modern compilers handle these easily)