[openbeos] Re: app_server: MMX/SSE help wanted

  • From: "Marcus Overhagen" <ml@xxxxxxxxxxxx>
  • To: <openbeos@xxxxxxxxxxxxx>
  • Date: Wed, 11 Aug 2004 09:41:15 +0200

Christian Packmann Christian.Packmann@xxxxxx wrote:

> About data alignment issues (short version):
> 
> Most CPUs prefer if they can perform reads/writes on natural alignment 
> borders, i.e. a boundary of 2 bytes for a word (int16), 4 bytes for double 
> word (int32), 8 bytes for quad word (abstract MMX datatype), etc.
> 
> When doing unaligned accesses, it'll take the CPU some extra cycles to 
[...]
> But even if you're 'only' losing a few cycles during each mem access that 
> can hurt performance quite badly. 
That is all correct, and it can be a real pain.

> The problem with current BeOS is that it doesn't provide any kind of 
> control over alignment when allocating bitmaps, they'll only be aligned to 
> 4-byte boundaries. In order to get optimum memory throughput, I have to 
> write special code which reads 32bit values until I reach a well-aligned 
> address (8/16 bytes for MMX/SSE), then do full-width accesses until there's
> only a 'remainder' of data left which has to be read in 32bit chunks again.
> 
> The resulting code is messy.
That's right. But as long as you cannot guarantee that all the data you process
is properly aligned (e.g. to 64 bytes), you do have to implement the additional
32-bit accesses.
I'm not even sure that the minimum alignment is 32 bits.
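
The head/body/tail handling described above can be sketched roughly as
follows. This is only an illustration, not code from BeOS or Haiku: the
names AlignedSplit and SplitRun are made up here, and the split is computed
in bytes so that it makes no assumption about the minimum alignment of the
start address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper type: how a run of bytes divides into an unaligned
// head, a body of full aligned blocks, and a leftover tail.
struct AlignedSplit {
    std::size_t head;  // bytes before the first aligned address
    std::size_t body;  // bytes that can be handled in full aligned blocks
    std::size_t tail;  // bytes left over after the last full block
};

// Hypothetical helper: 'alignment' is in bytes and must be a power of two
// (8 for MMX, 16 for SSE in the case discussed here).
inline AlignedSplit SplitRun(const void* start, std::size_t length,
                             std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    std::uintptr_t address = reinterpret_cast<std::uintptr_t>(start);
    std::size_t misalignment = address & (alignment - 1);

    // Bytes needed to reach the next aligned address (0 if already aligned).
    std::size_t head = (alignment - misalignment) & (alignment - 1);
    if (head > length)
        head = length;  // the run ends before an aligned address is reached

    std::size_t remaining = length - head;
    std::size_t tail = remaining & (alignment - 1);
    return { head, remaining - tail, tail };
}
```

A SIMD routine would then process 'head' and 'tail' with narrow accesses and
only 'body' with full-width ones. This bookkeeping is what becomes
unnecessary when the allocator guarantees the alignment.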

> So from the SIMD coders perspective, it would be very good if Haiku would 
> offer some control over data alignment for bitmap allocations.
> This includes not only the base address of the bitmap, but should ideally 
> extend to each bitmap row, in cases where each row has to be processed 
> separately (e.g. by blur routines:). 
Yes, it would be good if you could assume a specific alignment for SIMD 
processing. Unfortunately, this isn't possible in BeOS.
Perhaps in Haiku, BBitmap can be modified to *always* align the start address
to some value (like 64 bytes or 4 kB), and to always align row starts to 
something well chosen. I think BytesPerRow() is used by almost every app,
because BeOS already did some alignment, so calculating with width*pixelsize
already doesn't work reliably.
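
To illustrate the difference between width*pixelsize and a padded row
length, here is a minimal sketch. PaddedBytesPerRow and RowStart are
hypothetical helpers written for this example and are not part of the
BBitmap API; an application would use the value returned by BytesPerRow()
as the stride instead of computing one itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: round the raw row length up to the next multiple of
// 'alignment' (in bytes, must be a power of two).
inline std::size_t PaddedBytesPerRow(std::size_t width,
                                     std::size_t bytesPerPixel,
                                     std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    std::size_t raw = width * bytesPerPixel;
    return (raw + alignment - 1) & ~(alignment - 1);
}

// Hypothetical helper: address of a row, computed from the stride rather
// than from width * bytesPerPixel.
inline std::uint8_t* RowStart(std::uint8_t* bits, std::size_t bytesPerRow,
                              std::size_t row)
{
    return bits + row * bytesPerRow;
}
```

For a row of 101 pixels at 4 bytes each, the raw length is 404 bytes, but
with 16-byte row alignment the stride becomes 416. Code that steps through
the bitmap by 404 bytes per row would read the wrong data from the second
row onwards.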

> Of course there'd be some 'waste', but I don't think this would matter too 
> much on modern systems. Binary compatibility shouldn't be a problem either,
> as the BeBook already says that BytesPerRow() are decisive on determining a 
> bitmap's actual size.
right

> This might be implemented by an additional constructor without too much 
> fuss, I'd guess. It would be great if you could implement this, as this 
> would make SIMD coding much easier and less bug-prone.
While adding an additional constructor might work for an application
that implements its own SIMD processing, it won't help you with the
coding of the operating system, as you might still get unaligned data.

I just noticed that BBitmap already has a constructor that is:

BBitmap::BBitmap(BRect bounds, uint32 flags, color_space colorSpace,
     int32 bytesPerRow, screen_id screenID)

which allows you to specify bytesPerRow. Unfortunately, this prevents the 
operating system from forcing a specific row alignment, so all
bitmap-manipulating functions have to be able to work with arbitrary row
alignments.
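
One way a bitmap function could cope with arbitrary row alignments is to
check at run time whether a given bitmap happens to qualify for the aligned
code path, and to fall back to the general path otherwise. The sketch below
only shows that check. CanUseAlignedPath is a hypothetical name, not an
existing app_server function.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical check: every row starts on an aligned address only if both
// the base address and the row stride are multiples of 'alignment' (bytes).
inline bool CanUseAlignedPath(const void* bits, std::size_t bytesPerRow,
                              std::size_t alignment)
{
    if (alignment == 0)
        return false;
    std::uintptr_t address = reinterpret_cast<std::uintptr_t>(bits);
    return address % alignment == 0 && bytesPerRow % alignment == 0;
}
```

Bitmaps created with a caller-supplied bytesPerRow would simply fail the
check and take the slower path.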

And how are you going to deal with non-32-bit RGB data? Some applications
will still pass in BBitmaps that contain 256-color data with a palette, etc.

regards
Marcus



