[haiku-development] Re: Haiku SSE status? (fixed?) RFC kernel patch.

  • From: Alexander von Gluck <kallisti5@xxxxxxxxxxx>
  • To: <haiku-development@xxxxxxxxxxxxx>
  • Date: Fri, 20 Jan 2012 13:10:16 -0600

On 20.01.2012 12:15, Alexander von Gluck wrote:
On 20.01.2012 12:00, Alexander von Gluck wrote:
On 20.01.2012 00:18, Fredrik Holmqvist wrote:
2012/1/20 Ingo Weinhold <ingo_weinhold@xxxxxx>:
Alexander von Gluck wrote:
On 19.01.2012 17:26, Ingo Weinhold wrote:
> Alexander von Gluck wrote:
>> 1) You can download the following (simple) sse2 test c program:
>> http://pub.haikufire.com/sse.c
>>
>> 2) compile it via: gcc sse.c -msse2 -o sse
>>
>> 3) run it to have it crash as soon as it attempts the sse code.
>
> Like on executing the first SSE instruction? And how does it crash (which
> exception)?

"General Protection Fault"

If it happens when executing the first SSE instruction, then it's very likely something that we're not setting up (correctly). A look at the IA-32 specs should help find out what our setup is still missing.
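
For anyone without access to sse.c, here is a minimal stand-in for that kind of test. It is NOT the program at the URL above, and the function name add4_i32 is made up for illustration. It uses unaligned loads and stores on purpose, so a misaligned buffer cannot be what faults; if this crashes on its first SSE2 instruction, the CPU/OS setup is the likelier suspect. When built without SSE2 it falls back to plain C.

```c
#include <stdio.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics */
#endif

/* Hypothetical helper: adds four 32-bit integers element-wise. */
static void add4_i32(const int *a, const int *b, int *out)
{
#if defined(__SSE2__)
    /* Unaligned load/store variants: no 16-byte alignment required. */
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
#else
    /* Scalar fallback when the compiler is not targeting SSE2. */
    for (int i = 0; i < 4; i++)
        out[i] = a[i] + b[i];
#endif
}
```

Calling add4_i32 from main() and printing the result, built with "gcc -msse2", should be enough to tell whether the first SSE2 instruction executes at all.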

I have the same issue with MMX. I haven't used MMX much before so I
thought it was just me. I've tried w/o alignment, saving/restoring
fpu, emms, sfence and so on. But just doing a 'movd %eax, %mm0' hangs.

From the x86 docs I've read, fxsave has to be called before a task switch, and fxrstor has to be called after a task
switch to keep the FPU in a valid state with SSE.

I see references to us calling fxsave in kernel/arch/x86/arch_thread.cpp in arch_thread_init, however no calls to
fxrstor.
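
To make the save/restore pairing concrete, here is a rough user-space sketch. This is not Haiku's kernel code; fx_state, fx_save and fx_restore are names invented for this mail. The one hard fact it relies on is that the fxsave/fxrstor memory operand is 512 bytes and must be 16-byte aligned, otherwise the CPU raises a general protection fault. On non-x86 targets the helpers only perform the alignment check.

```c
#include <stdint.h>

/* Hypothetical container for the 512-byte FXSAVE image. */
typedef struct {
    _Alignas(16) uint8_t bytes[512];
} fx_state;

/* Returns 1 if ptr can be used as an fxsave/fxrstor operand. */
static int fx_area_is_aligned(const void *ptr)
{
    return ((uintptr_t)ptr & 15u) == 0;
}

/* Save the current x87/MMX/SSE state (done before switching away). */
static int fx_save(void *area)
{
    if (!fx_area_is_aligned(area))
        return -1;  /* the real instruction would fault here */
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile("fxsave %0" : "=m"(*(fx_state *)area) : : "memory");
#endif
    return 0;
}

/* Reload a previously saved state (done after switching back). */
static int fx_restore(const void *area)
{
    if (!fx_area_is_aligned(area))
        return -1;
#if defined(__i386__) || defined(__x86_64__)
    __asm__ volatile("fxrstor %0" : : "m"(*(const fx_state *)area) : "memory");
#endif
    return 0;
}
```

Only restore an image that fxsave produced earlier; loading arbitrary bytes can itself fault or leave FPU exceptions unmasked.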



OK, so it seems I've fixed the sse.c application...

After making the following change within the kernel, the sse.c application runs and shows faster sse results without the general protection fault...


diff --git a/src/system/kernel/arch/x86/arch_cpu.cpp b/src/system/kernel/arch/x86/arch_cpu.cpp
index 035c352..e40ccfe 100644
--- a/src/system/kernel/arch/x86/arch_cpu.cpp
+++ b/src/system/kernel/arch/x86/arch_cpu.cpp
@@ -658,7 +658,7 @@ x86_double_fault_get_cpu(void)
 status_t
 arch_cpu_preboot_init_percpu(kernel_args *args, int cpu)
 {
-       x86_write_cr0(x86_read_cr0() & ~(CR0_FPU_EMULATION | CR0_MONITOR_FPU));
+       // Set initial non-sse FPU swap call before vm init
        gX86SwapFPUFunc = i386_fnsave_swap;

// On SMP system we want to synchronize the CPUs' TSCs, so system_time()


Do the powers that be know *why* this works?

We enable the FPU early on (without SSE), then call init_sse when each CPU is spun up (after VM init), which sets cr0 and cr4 and switches gX86SwapFPUFunc from i386_fnsave_swap to i386_fxsave_swap *if* SSE2 is available.

Removing this early, pre-boot per-CPU write to the FPU bits in cr0 *seems* to fix SSE (and maybe MMX as well).
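
As a reference for the control-register side, here is a small sketch of how cr0 and cr4 decide whether an SSE instruction may execute, going by the IA-32 manuals. The bit positions are architectural; CR0_FPU_EMULATION and CR0_MONITOR_FPU reuse the names from the patch, but the values are written out here rather than copied from Haiku's headers, and the other names and the function are made up for illustration. Note that neither outcome modelled here is the general protection fault reported earlier, so this alone does not explain the crash.

```c
#include <stdint.h>

#define CR0_MONITOR_FPU    (1u << 1)  /* CR0.MP, cleared by the removed line */
#define CR0_FPU_EMULATION  (1u << 2)  /* CR0.EM, cleared by the removed line */
#define CR0_TASK_SWITCHED  (1u << 3)  /* CR0.TS (hypothetical name) */
#define CR4_OS_FXSR        (1u << 9)  /* CR4.OSFXSR (hypothetical name) */

typedef enum {
    SSE_OK,                    /* instruction executes */
    SSE_INVALID_OPCODE,        /* #UD */
    SSE_DEVICE_NOT_AVAILABLE   /* #NM */
} sse_result;

/* Hypothetical helper: predicts what an SSE instruction does for the
   given control register values. */
static sse_result sse_instruction_outcome(uint32_t cr0, uint32_t cr4)
{
    /* SSE needs emulation off and OS support for fxsave/fxrstor on. */
    if ((cr0 & CR0_FPU_EMULATION) != 0 || (cr4 & CR4_OS_FXSR) == 0)
        return SSE_INVALID_OPCODE;
    /* With TS set, the first FPU/SSE use traps so state can be swapped. */
    if ((cr0 & CR0_TASK_SWITCHED) != 0)
        return SSE_DEVICE_NOT_AVAILABLE;
    return SSE_OK;
}
```

For example, sse_instruction_outcome(0, CR4_OS_FXSR) yields SSE_OK, while leaving CR4.OSFXSR clear yields SSE_INVALID_OPCODE regardless of cr0.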

Thanks!
 -- Alex
