On 26 March 2012 21:37, Axel Dörfler <axeld@xxxxxxxxxxxxxxxx> wrote:
> On 26.03.2012 21:15, Clemens Zeidler wrote:
>> shouldn't memcpy use the fastest method by itself? or is that not
>> efficiently possible?
>
> It already does so at boot time. However, it might not be the optimal method
> for your machine (in which case the general mechanism should be improved) or
> your use case.
> In case of the app_server, I'm not sure whether a dedicated memcpy() is the
> correct solution. I guess we should have a benchmark for this, so that we
> can decide which memcpy() versions end up in our kernel.

At the moment we use the standard rep movs[l,b] copying. As far as I
know, gcc doesn't try to replace this by default, because it trusts
glibc to do it fast.

The interesting thing is that we have a cpu_module that allows setting
other implementations of memcpy and memset depending on CPU detection,
at least for x86.

The default memcpy is here:
http://haiku.it.su.se:8180/source/xref/src/system/kernel/arch/x86/arch_x86.S#179
And the check for an optimized version for the CPU is here:
http://haiku.it.su.se:8180/source/xref/src/system/kernel/arch/x86/arch_cpu.cpp#839

I've played around with this a bit, but since my asm optimization
knowledge dates from the 486/Pentium days, I need to learn how to write
fast assembly for modern CPUs.

If anyone is interested in playing with this, libmicro can help with
testing and give you an idea of the performance of your code. I used a
script that called a few of its tests for my experimentation.

/Fredrik Holmqvist, TQH
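
For anyone who wants to experiment, the "pick an implementation after CPU detection" idea above can be sketched in plain userland C. This is only an illustration of the pattern, not Haiku's actual cpu_module interface: the names memcpy_fn, copy_bytes, copy_optimized, select_memcpy and dispatch_memcpy are all made up here, and the cpu_has_fast_copy flag stands in for whatever real CPU detection would report.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Signature shared by every copy routine that can be installed.
 * (Hypothetical type name, chosen for this sketch.) */
typedef void *(*memcpy_fn)(void *dest, const void *src, size_t count);

/* Baseline: a plain byte loop, standing in for the default
 * rep movs based routine. */
static void *copy_bytes(void *dest, const void *src, size_t count)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    while (count-- > 0)
        *d++ = *s++;
    return dest;
}

/* Stand-in for a CPU-specific routine. It simply forwards to the C
 * library so the sketch stays portable; a real variant would be
 * hand-tuned code for one CPU family. */
static void *copy_optimized(void *dest, const void *src, size_t count)
{
    return memcpy(dest, src, count);
}

/* The routine currently in use; starts out as the safe default. */
static memcpy_fn active_memcpy = copy_bytes;

/* Called once at startup. The flag is an assumption of this sketch
 * and replaces real CPU feature detection. */
void select_memcpy(int cpu_has_fast_copy)
{
    active_memcpy = cpu_has_fast_copy ? copy_optimized : copy_bytes;
}

/* All callers go through this entry point, so swapping the
 * implementation does not require touching them. */
void *dispatch_memcpy(void *dest, const void *src, size_t count)
{
    assert(active_memcpy != NULL);
    return active_memcpy(dest, src, count);
}
```

The indirection costs one extra pointer load per call, which is part of why a benchmark is needed before deciding whether a given variant is worth installing.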