The gain here is due to the way the branch predictor does its optimizations. There are actually a lot of useless jumps in this code, and unfortunately they will be taken, optimization or not. There is also repeated code.

Also, Ken, unless Tyler runs an optimizer on this, it will not get optimized out, because the operations he is using change the state of the flags and so are considered not atomic. For example, the code will actually be measurably faster, in a loop of course, if he removes the 4 or 5 jumps I talked about in my last email. There is some talk about whether Intel employs macrocode optimization, but from my memory of the stuff, that's more of a cache-oriented process than anything else.

Also, there's a lot of byte dereferencing here, so one possibility is simply to combine some of these operations into words. Furthermore, I haven't taken a close enough look, but I think I can remove every single jump in this code simply by translating the dereferenced bytes, using ebx as a pointer into the data segment via the xlat opcode.

Take care,
Sina

-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx [mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Ken Perry
Sent: Sunday, January 16, 2011 10:53 PM
To: programmingblind@xxxxxxxxxxxxx
Subject: RE: code optomization:any way to do this better?

I don't think there is a really good way to speed this up. The fact is most assemblers will turn this into some really optimized machine code, and chips nowadays can look ahead across multiple if statements, so this would almost happen in one pulse of a chip's timing. I think Sina might be able to speak to the speed of this better than I can, but I can tell you the code is not bad. You have to understand that asm is different from most languages, because it's normal to have some repetitive code.
You could make a block that tests for null and returns to a passed-in location. That way, every time you want to test for null, you jump to your test block; if it's null, you jump to the end, otherwise you jump back to the location in the code where you wanted to be. But I don't see any gain in doing that.

Ken

-----Original Message-----
From: programmingblind-bounce@xxxxxxxxxxxxx [mailto:programmingblind-bounce@xxxxxxxxxxxxx] On Behalf Of Littlefield, Tyler
Sent: Sunday, January 16, 2011 9:56 PM
To: programmingblind@xxxxxxxxxxxxx
Subject: code optomization:any way to do this better?

So I've been playing with assembly a lot lately, and was curious if there was a better way to do this. Most importantly, the whole three-branched if check (null, not null).

section .text
global _strcmp
_strcmp:
    enter 0,0
    ;we copy our arguments to EBX and ECX
    mov EBX, [EBP+8]
    mov ECX, [EBP+12]
.loop:
    ;we need one value in a register
    mov EDX, [ECX]
    ;check for null termination
    cmp byte [EBX], 0
    je .null
    jne .notnull
;we have a null termination.
;if the other string is null terminated, we jump to success. otherwise it fails because they obviously aren't equal.
.null:
    cmp byte [ECX], 0
    je .success
    jne .fail
;byte wasn't null, now we check for null on the other byte.
;if one is null, it's a fail because again they aren't equal. If it is not null, we do another check.
.notnull:
    cmp byte [ECX], 0
    ;not equal, we check for equality between the two now.
    jne .check
    je .fail
;we check for equality between the two bytes here.
.check:
    cmp [EBX], EDX
    je .next
    jne .fail
;here we increase pointers and jump back up to the top of the loop.
.next:
    inc EBX
    inc ECX
    jmp .loop
;strings compared fully
.success:
    mov EAX, 1
    jmp .finish
;strings did not compare fully.
.fail:
    mov EAX, 0
;code cleanup.
;no need for a jmp, it just falls through.
.finish:
    leave
    ret

--
Thanks,
Ty

__________
View the list's information and change your settings at //www.freelists.org/list/programmingblind
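
A minimal sketch of the jump reduction discussed in this thread, not code from any of the posters. It assumes 32-bit NASM syntax and the cdecl calling convention, keeps the original convention of returning 1 in EAX for equal strings and 0 otherwise, and uses the hypothetical name _streq so it does not clash with the C library's strcmp. The idea is that comparing the two bytes first makes the separate null checks unnecessary: if only one string has ended, the bytes already differ. Whether this is measurably faster on a given CPU would still need to be timed.

```nasm
section .text
global _streq                ; hypothetical name, chosen for this sketch only

; int _streq(const char *a, const char *b)   (assumed: 32-bit cdecl)
; Returns 1 in EAX if the strings are equal, 0 otherwise.
_streq:
    mov  ecx, [esp+4]        ; ECX = a (first argument)
    mov  edx, [esp+8]        ; EDX = b (second argument)
.loop:
    mov  al, [ecx]           ; load one byte of a
    cmp  al, [edx]           ; compare it with the matching byte of b
    jne  .fail               ; bytes differ, including "only one string ended"
    inc  ecx                 ; advance both pointers
    inc  edx
    test al, al              ; was the byte we just matched the terminator?
    jnz  .loop               ; no: compare the next pair of bytes
    mov  eax, 1              ; both strings ended on the same byte: equal
    ret
.fail:
    xor  eax, eax            ; return 0: not equal
    ret
```

Only EAX, ECX and EDX are used, which cdecl treats as caller-saved, so this sketch needs no stack frame and no register saving. Each iteration has two conditional jumps and no unconditional ones.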