[openbeos] Re: couple o'things

  • From: "Martin Krastev" <blue@xxxxxxxxx>
  • To: <openbeos@xxxxxxxxxxxxx>
  • Date: Mon, 27 Aug 2001 13:03:01 +0300

Ithamar R. Adema <ithamar@xxxxxxxxx> wrote:

> Don't you think that getting a version *running* and then optimizing it 
> heavily might be a better plan of attack? Ok, we need to take this into 
> account while coding (not knowingly leave performance holes in the code) 
> but this is pretty easy with many coders on the same project, we'll correct 
> each other with the relevant experience we have, so making better code 
> and *learning how to code better* as an extra added bonus :)

and what if the base kernel was initially so far from the targeted performance
that no later 'heavy optimizations' could bring it up to that target? mind you,
mimicking an existing system sets _particular_ performance requirements which
need to be met, otherwise you risk breaking all existing performance-intensive/
time-critical apps (and that could be multi-media apps just as well). and AFAIK
BeOS is top-notch wrt kernel performance! btw, in my practice i've been through
numerous cases where a given 'piece' of code (as big as the whole TnL pipe +
the rasterizer) reaches its maximal performance level and no further sane
optimizations could squeeze out any tangible bit of performance, and still the
targeted performance is not met. in such cases one scraps that code and comes up
with a new solution, designed right from the start with the required performance
in mind. so how many different kernels do you think this project can afford
coming up with?

> I can tell you looking at the disasm of the kernel that they use a special 
> instruction (not even recognized by the disasm unit of objdump) for doing a 
> fast ring 0 gate return back into userspace for some Intel/AMD chips.... 
> You can even see this looking at the symbol dump that objdump can give 
> you.... They're pretty bleeding obviously called fast_xxx slow_xxxx 
> xxxx_amd, or any other variant you can think of :) Using objdump helps, 
> really :)

didn't know about that optimization, though it'd be logical to be there. gotta
do my share of disasm, i guess ;)
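
for anyone else wanting to try the symbol-dump route, here's a minimal sketch of
filtering `objdump -t` output (the symbol table) for names that follow the
pattern described in the quote above. everything concrete in it is an
assumption: the sample lines and symbol names are made up for illustration and
are not taken from the real kernel, and the regex only encodes the
fast_/slow_/_amd naming hint.

```python
import re

# Naming pattern taken from the quoted description (fast_xxx, slow_xxx, xxx_amd).
# This is a guess at what to look for, not a list of real kernel symbols.
VARIANT_RE = re.compile(r"^(?:fast_|slow_)|_amd$")
HEX_RE = re.compile(r"[0-9a-fA-F]+")


def variant_symbols(objdump_output):
    """Pick CPU-variant-looking symbol names out of `objdump -t` text."""
    names = []
    for line in objdump_output.splitlines():
        fields = line.split()
        # Symbol lines start with a hex address; skip headers and blank lines.
        if len(fields) < 4 or not HEX_RE.fullmatch(fields[0]):
            continue
        name = fields[-1]  # the symbol name is the last column
        if VARIANT_RE.search(name):
            names.append(name)
    return names


# Made-up sample in the usual `objdump -t` column layout (hypothetical names).
SAMPLE = """\
SYMBOL TABLE:
00100000 g     F .text  00000040 fast_syscall_return
00100040 g     F .text  00000060 slow_syscall_return
001000a0 g     F .text  00000050 syscall_return_amd
001000f0 g     F .text  00000020 memcpy
"""

if __name__ == "__main__":
    for name in variant_symbols(SAMPLE):
        print(name)
```

in practice one would feed it the captured text of `objdump -t` run on the
binary in question instead of SAMPLE.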

> I meant that we need to capture that info and other information we find out 
> while coding in a good format, and close to the relevant code.... Using a 
> documentation system that integrates code and doc helps, trust me, I've 
> seen it work :)

ok, i see your point, but there are things that had better be known a priori,
and those are still not in the bebook. i still think a healthy body needs to
take care of that.

Michael Noisternig <michael.noisternig@xxxxxxxxx> wrote:

>I do not quite understand what you mean with kernel timing thing. I
>don't think it's too slow (without having tested it). MMX/3DNow code is
>hardly useful here (except in special cases like memcpy - see below).
>Performance issues come last, we would all be happy to get *any* working
>kernel at first.

various kernel timings off the top of my head:

* ISR entry latency
* ISR-entry-to-IRQ-line-clear latency
* IRQ-to-device-driver latency
* scheduler quantum overhead (min./max. at various levels of system thread-load)
* scheduler latency running a 'realtime' unblocked thread (at various levels of
  system thread-load)
* scheduler arbitration latency when running a 'regular' thread (at various
  levels of system thread-load)
* etc, etc (gotta go to lunch already)
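
to make the scheduler items a bit more concrete, here's a rough user-space
sketch of one such measurement: how much later than requested a sleeping thread
actually gets the cpu back. it is only an approximation under stated
assumptions: it can't see ISR or IRQ latencies at all (those need in-kernel
instrumentation), the function names and default values are mine, and the
numbers include interpreter overhead on top of the scheduler's share.

```python
import statistics
import time


def measure_wakeup_overshoot(sleep_s=0.001, samples=200):
    """Return per-sample wake-up overshoot in microseconds.

    Overshoot = measured sleep duration minus the requested duration.
    """
    overshoots_us = []
    for _ in range(samples):
        start_ns = time.perf_counter_ns()
        time.sleep(sleep_s)
        elapsed_ns = time.perf_counter_ns() - start_ns
        overshoots_us.append((elapsed_ns - sleep_s * 1e9) / 1e3)
    return overshoots_us


def summarize(values):
    """Reduce a list of samples to min / median / max."""
    ordered = sorted(values)
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "max": ordered[-1],
    }


if __name__ == "__main__":
    stats = summarize(measure_wakeup_overshoot())
    for key, value in stats.items():
        print(f"{key:>6}: {value:10.1f} us")
```

running it once on an idle box and again with a bunch of busy threads in the
background gives the 'at various levels of system thread-load' comparison; the
max figure is usually the interesting one for time-critical apps.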

finally, i disagree that '*any* working kernel' would be of much good if you're
after binary compatibility with an os such as BeOS (see earlier paragraph) and
IMHO, binary compatibility is not a viable direction to take in the context of
this project.

>> I can tell you looking at the disasm of the kernel that they use a special
>> instruction (not even recognized by the disasm unit of objdump) for doing a
>> fast ring 0 gate return back into userspace for some Intel/AMD chips....
>> You can even see this looking at the symbol dump that objdump can give
>> you.... They're pretty bleeding obviously called fast_xxx slow_xxxx
>> xxxx_amd, or any other variant you can think of :) Using objdump helps,
>> really :)
>
>These fastest version functions can be found in all MMX/3DNow developer
>manuals and can be easily integrated in our kernel.

no doubt, but i believe the original poster's point was just to give an example.

-blu

