Ithamar R. Adema <ithamar@xxxxxxxxx> wrote:

> Don't you think that getting a version *running* and then optimizing it
> heavily might be a better plan of attack? Ok, we need to take this into
> account while coding (not knowingful leave performance holes in the code)
> but this is pretty easy with many coders on the same project, we'll correct
> each other with the relativant experience we have, so making better code
> and *learning how to code better* as an extra added bonus :)

and what if the base kernel was initially so far from the targeted
performance that no later 'heavy optimizations' could bring it up to that
target?

mind you, mimicking an existing system sets _particular_ performance
requirements which need to be met, otherwise you risk breaking all existing
performance-intensive/time-critical apps (and that could be multi-media apps
just as well). and AFAIK BeOS is top-notch wrt kernel performance!

btw, in my practice i've been through numerous cases where a given 'piece' of
code (as big as the whole TnL pipe + the rasterizer) reaches its maximal
performance level, no further sane optimization can squeeze any tangible bit
of performance out of it, and still the targeted performance is not met. in
such cases one scraps that code and comes up with a new solution, designed
right from the start with the required performance in mind. so how many
different kernels do you think this project can afford to come up with?

> I can tell you looking at the disasm of the kernel that they use a special
> instruction (not even recognized by the disasm unit of objdump) for doing a
> fast ring 0 gate return back into userspace for some Intel/AMD chips....
> You can even see this looking at the symbol dump that objdump can give
> you.... They're pretty bleeding obviously called fast_ xxx slow_xxxx
> xxxx_amd, or any other variant you can think of :) Using objdump helps,
> really :)

didn't know about that optimization, though it'd be logical for it to be
there. gotta do my share of disasm, i guess ;)

> I meant that we need to capture that info and other information we find out
> while coding in a good format, and close to the relevant code.... Using a
> documentation system that integrates code and doc helps, trust me, I've
> seen it work :)

ok, i see your point, but there are things that had better be known a priori,
and those are still not in the bebook. i still think a healthy body needs to
take care of that.

Michael Noisternig <michael.noisternig@xxxxxxxxx> wrote:

>I do not quite understand what you mean with kernel timing thing. I
>don't think it's too slow (without having tested it). MMX/3DNow code is
>hardly useful here (except in special cases like memcpy - see below).
>Performance issues come last, we would all be happy to get *any* working
>kernel at first.

various kernel timings off the top of my head:

* ISR entry latency
* ISR-entry-to-IRQ-line-clear latency
* IRQ-to-device-driver latency
* scheduler quantum overhead (min/max at various levels of system
  thread-load)
* scheduler latency running a 'realtime' unblocked thread (at various levels
  of system thread-load)
* scheduler arbitration latency when running a 'regular' thread
  (----``-----)
* etc, etc (gotta go to lunch already)

finally, i disagree that '*any* working kernel' would be of much good if
you're after binary compatibility with an os such as BeOS (see earlier
paragraph), and IMHO binary compatibility is not a viable direction to take
in the context of this project.

>> I can tell you looking at the disasm of the kernel that they use a special
>> instruction (not even recognized by the disasm unit of objdump) for doing a
>> fast ring 0 gate return back into userspace for some Intel/AMD chips....
>> You can even see this looking at the symbol dump that objdump can give
>> you.... They're pretty bleeding obviously called fast_ xxx slow_xxxx
>> xxxx_amd, or any other variant you can think of :) Using objdump helps,
>> really :)
>
>These fastest version functions can be found in all MMX/3DNow developer
>manuals and can be easily integrated in our kernel.

no doubt, but i believe the original poster's point was just to give an
example.

-blu