Hi,

Cool article: I never saw that! (suddenly more pieces of the puzzle fit together :)

Answering 'out-of-order':

> Much More Parallel
>
> Some people with old S3 and Cirrus Logic video cards experienced hard
> system freezes with earlier versions of the R4 beta. These freezes prove
> that the incredibly buff new R4 graphics driver architecture (designed
> and implemented by our own Trey "Ball-Buster" Boudreau) is working
> correctly -- too well, in the case of these cards. Prior to R4, the
> locking done when a view was drawing was coarse-grained; no two threads
> could draw to the frame buffer at the same time.
>
> In R4, any number of threads can draw to the frame buffer simultaneously.

Yep. Correct, about R4 and later (of course :). But they kept in mind (and I love that about them!) that the new architecture should also be compatible with those older 'non-compatible' cards: to (not) serialize framebuffer access, we have a special flag used in the accelerants:

B_PARALLEL_ACCESS

If this flag is set, you may access the framebuffer in parallel. If not, serialize! (ProposeMode within the accelerant is required to set or clear the flag to what it needs/wants; the R5/Dano app_server nicely adheres to it.)

Also note my comment in the accelerant:

/* BTW: B_PARALLEL_ACCESS in combination with a hardcursor enables
 * BDirectWindow windowed modes. */

Which is only logical now :)

> The only resource locked for exclusive access is the acceleration
> engine, and that is locked only for the time required to feed the
> rendering commands through the FIFO;

Indeed. Be as quick as possible. Release the engine ASAP after issuing a command. Do not wait for it to finish commands if not absolutely needed. (The app_server 'hang' mentioned by Gabe is a nice example :)

So, please understand that issuing a command to the engine _does not mean_ it is executed immediately.
The command is placed in (one of) the engine's FIFO(s), which means your command is placed at the rear of the queue, waiting to be executed. The 'front' of this queue is being served by execution of the requested command.

> synchronizing with the engine is intelligent and is done only when
> absolutely necessary.

(and the question Adi asked:)

> What is/means engine synchronization?

Engine synchronisation is the synchronisation between:
1. the engine drawing, and
2. the app_server (or app) drawing directly in the framebuffer.

As I said, issuing a command does not mean that the command is executed immediately. There may be other commands, issued before this one, also still waiting to be executed. But sometimes you will need to know when a certain command you issued has actually been executed completely. For instance, if you want to move a window, and after that draw something unaccelerated inside it at the new location.

I don't know. You'll have to tell me. What you _should_ do is _prevent_ having to sync to the engine too much. If you need to sync, please see what other stuff you can do while waiting, so you are not actually waiting yet. Only after you have done all you could do, sync to the engine. You'll need to see for yourself what you want or need to do with this.

OK, now let's talk about _how_ you synchronize with the engine. There are two ways:

1. The 'dumb' way: wait until the engine is completely idle.

hook: WAIT_ENGINE_IDLE()

This hook can be called before or _after_ you release the engine. If you want to force the engine to become idle, you should probably _not_ release the engine beforehand, as other threads could then constantly issue new commands, so the engine never becomes fully idle. OTOH, you have to realize that not releasing means other threads will (probably) be waiting until you do.

2.
The intelligent way: wait until the engine has completed your (list of) commands, while it may still be busy doing other commands issued after the ones you are waiting for. You can easily issue more commands before waiting for a certain earlier-given command to complete. This means the engine does _not_ become completely idle when your command in question is finished, as the command after it is already starting to execute. Working this way can potentially speed up the process as a whole (or it would not have been invented, I guess :).

Using this method requires you to give the accelerant a token it can sync to. You do that by creating an empty token, and then giving a pointer to it to the accelerant while releasing the engine with the hook:

status_t RELEASE_ENGINE(engine_token *et, sync_token *st)

or alternatively you can get a token without releasing the engine just yet by calling:

status_t GET_SYNC_TOKEN(engine_token *et, sync_token *st)

The accelerant will check if you gave a token pointer, and if you did, it will provide you with one. This token is a kind of time-stamp that the accelerant knows how to interpret, giving you the option to wait until your specific (list of) command(s) is executed while the engine keeps executing later-issued commands.

So, you don't do anything with that token yourself; you just give it back to the accelerant when you want to synchronize. You do that when you re-acquire the engine with:

status_t ACQUIRE_ENGINE(uint32 capabilities, uint32 max_wait, sync_token *st, engine_token **et)

Note that you cannot use this if you did NOT get the token before! (i.e. you have to have acquired the engine earlier in order to get a token you can now pass along)

Alternatively, you can sync to the engine _without_ having acquired the engine at this time by calling:

status_t SYNC_TO_TOKEN(sync_token *st)

-------

OK, I hope the setup is clear now.
If you have a look at my drivers, you will see that if you use the sync_token stuff, it won't make any difference, as I just call an internal function to let you wait until the engine is totally idle. Why do I do that? Well, the answer is both simple and painful: lack of specs!! Probably the ATI driver has this stuff in place though, as Thomas has the kind of setup in place that I cannot do, that supports this function. ATI was the best company so far for getting the most detailed info about (some of) their cards.

So, how could this potentially work internally in the driver? There are two ways that I know of:

1. The accelerant sets up a FIFO in memory (circular buffer). You have a begin and end pointer to this buffer that indicate where the front and tail of the waiting commands are. If someone requests a sync_token, the accelerant will place the tail pointer in it, indicating where the last command was dumped that you want to wait on to complete. If you sync to the token, the accelerant will wait for the front pointer to go beyond that (now) old tail pointer, indicating that your (last) command(s) have been executed.

Technical detail: the engine has to be told where the FIFO is, and how big it is. The engine will need to give me access to the front and back pointers it keeps. This requires me to have the info needed to know how to set this up, which currently I don't have.

2. The accelerant simply issues an extra acceleration engine command. This command will however not do something onscreen, but instead set or clear some variable somewhere (cardRAM, internal registers?) which belongs to the token given to the user. If syncing has to be done, the accelerant will simply wait until this variable is modified, in effect telling you that your (last) command(s) have been executed.

Technical info: this command should be some special command, I guess: I haven't yet thought too much about it.
As a workaround I could possibly set up, for instance, a rect_invert() command that inverts a variable I place in 'reserved' offscreen memory. As long as no apps mess up this memory by writing outside their allocated areas, this setup should work. For now, I won't bother however. I am first going to have a decent look at, for instance, the nvidia opensource 3D driver in utahGLX to see if (and if so, how) the sync_to_token scheme has been set up in there. I can imagine that this setup becomes important once hardware 3D acceleration is set up, and not before.

======

That's it! Hope it helps.

Greetings,

Rudolf.