2013/12/17 François Revol <revol@xxxxxxx>:
> On 17/12/2013 05:45, pdziepak-github.scheduler wrote:
>> added 3 changesets to branch 'refs/remotes/pdziepak-github/scheduler'
>> old head: b2cd1dba1e8df965e967fc185921375c2fed535c
>> new head: b3723fdada15080f9d8cad90d57c7689bb029bed
>> overview: https://github.com/pdziepak/Haiku/compare/b2cd1db...b3723fd
>>
>> ----------------------------------------------------------------------------
>>
>> 4c1853d: x86: Let each CPU have its own GDT
>>
>> 2ceb9d5: x86: Store pointer to the current thread in gs:0
>>
>> Apparently, reading from dr3 is slower than reading from memory
>> with cache hit.
>>
>> Also, depending on hypervisor configuration, accessing dr3 may cause
>> a VM exit (and, at least on kvm, it does), what makes it much slower
>> than a memory access even when there is a cache miss.
...
> Although won't using gs collide with anything else?
> IIRC gcc4 can use it for stack overflow detection on Linux (actually
> some place at offsets like 40), not sure it's part of an official ABI
> though...

It's actually a bit more complicated than that. GCC expects the canary
value to be stored either in TLS at a certain offset (e.g. %gs:0x14 on
x86 platforms using glibc) or in the global variable __stack_chk_guard.
It chooses the former if there is 'gnu' somewhere in the target triple
or if GCC knows by some other means that the resulting code will be
linked against glibc. Since Linux can be built with the native compiler
(no cross-compiler needed), they also have to support the TLS scheme in
the kernel. In all other cases the latter option is used, hence we don't
have to worry about it.

However, if we want to support stack protectors with per-thread canaries
in userland as well as in the kernel (that would obviously require
updating the GCC configuration), this patch actually makes it easier. On
x86 only the first 4 bytes of the segment at gs are used to store the
pointer to the thread, and on x86_64 gs points to architecture-dependent
thread data.
GCC expects to find the canary value at %gs:0x14 for x86, %gs:0x18 for
x32 and %gs:0x28 for x86_64, so there would be no problem putting it
there.

Summing it up, as long as we use the same canary value for all threads
in a team, this patch is completely unrelated. When we want to start
using a different canary value for each thread, we only need to adjust
the structures gs points to (and update the canary value at each task
switch).

Paweł
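
For illustration only, here is a minimal sketch of what "adjusting the
structures gs points to" could look like. The struct names, the field
names and the set_canary helper are all hypothetical and are not taken
from the Haiku tree; only the offsets (0x14 for x86, 0x28 for x86_64)
and the 4-byte thread pointer at gs:0 come from the discussion above.
Fixed-width integers stand in for pointers so that the offsets can be
checked on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Offsets at which GCC looks for the canary, as quoted in this thread. */
#define CANARY_OFFSET_X86     0x14
#define CANARY_OFFSET_X86_64  0x28

/* HYPOTHETICAL layout of the block that gs points to on 32-bit x86. */
struct gs_block_x86 {
    uint32_t thread;        /* 0x00: pointer to the current thread */
    uint32_t reserved[4];   /* 0x04-0x13: unused space before the canary */
    uint32_t stack_canary;  /* 0x14: slot GCC reads as %gs:0x14 */
};

/* HYPOTHETICAL layout of the arch-dependent thread data on x86_64. */
struct gs_block_x86_64 {
    uint64_t thread_data[5]; /* 0x00-0x27: existing per-thread fields */
    uint64_t stack_canary;   /* 0x28: slot GCC reads as %gs:0x28 */
};

/* Fail the build if a layout change ever moves the canary slot. */
static_assert(offsetof(struct gs_block_x86, stack_canary)
                  == CANARY_OFFSET_X86,
              "x86 canary must sit at gs:0x14");
static_assert(offsetof(struct gs_block_x86_64, stack_canary)
                  == CANARY_OFFSET_X86_64,
              "x86_64 canary must sit at gs:0x28");

/* HYPOTHETICAL hook: store the incoming thread's canary in the block.
 * A real kernel would call something like this at each task switch. */
void gs_block_x86_set_canary(struct gs_block_x86 *block, uint32_t canary)
{
    block->stack_canary = canary;
}
```

The static assertions are the point of the sketch: the compiler hard-codes
the offset, so the layout has to be pinned at build time rather than
discovered at run time.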