> I'm going to start tinkering with the shader processor at:
> https://github.com/hermanhermitage/videocoreiv-qpu
>
> It's in a separate repo in case there are any copyright issues - basically
> I'm going to document it based on a differential analysis, feeding the
> blob different inputs and capturing the outputs.
>
> My understanding is the outputs of a computer program are generally not
> copyrightable, as a program can't be considered an author of an artistic
> work.
>
> For anyone interested in contributing, patent "US20110227920 Method and
> System for a Shader Processor With Closely-Coupled Peripherals" is a good
> starting point.

[1] I've added a simple tool at
https://github.com/hermanhermitage/videocoreiv-qpu/tree/master/qpu-sniff

It uses /opt/vc/bin/vcdbg to walk relocatable memory allocations on the
VideoCore side, searching for ones marked as GL related.

While an OpenGL program is active, run it as:

  $ ./qpu-scan --qpuscan
  type = 'mem_strdup' size = 108
  type = 'GL20_PROGRAM_T.uniform_data' size = 20
  .......? 00000000 3f800000 0 1
  .......@ bf800000 40000000 -1 2
  ...?.... 3f000000 8000000b 0.5 -1.541e-44
  'shader code':
  00000000: 009e7000 100009e7 ra=39, rb=39, adda=0, addb=0, mula=0, mulb=0, wa=39, wb=39, F=0, X=0, packbits=0x00; addop00<cc0> io39, A0, A0; mulop00<cc0> io39, A0, A0; op01
  00000002: 009e7000 400009e7 ra=39, rb=39, adda=0, addb=0, mula=0, mulb=0, wa=39, wb=39, F=0, X=0, packbits=0x00; addop00<cc0> io39, A0, A0; mulop00<cc0> io39, A0, A0; op04
  00000004: 15827d80 10020ba7 ra=32, rb=39, adda=6, addb=6, mula=0, mulb=0, wa=46, wb=39, F=0, X=0, packbits=0x00; addop21<cc1> io46, io32, io32; mulop00<cc0> io39, A0, A0; op01
  00000006: 009e7000 300009e7 ra=39, rb=39, adda=0, addb=0, mula=0, mulb=0, wa=39, wb=39, F=0, X=0, packbits=0x00; addop00<cc0> io39, A0, A0; mulop00<cc0> io39, A0, A0; op03
  00000008: 009e7000 100009e7 ra=39, rb=39, adda=0, addb=0, mula=0, mulb=0, wa=39, wb=39, F=0, X=0, packbits=0x00; addop00<cc0> io39, A0, A0; mulop00<cc0> io39, A0, A0; op01
  0000000a: 009e7000 500009e7 ra=39, rb=39, adda=0, addb=0, mula=0, mulb=0, wa=39, wb=39, F=0, X=0, packbits=0x00; addop00<cc0> io39, A0, A0; mulop00<cc0> io39, A0, A0; op05
  ...

[2] Some background to aid understanding:

- My understanding is there are 3 "slices". Each slice has 4 QPUs.
- Each QPU is a 4 way SIMD unit with an add ALU and a multiply ALU.
- 3 slices * 4 QPUs * (4 adds + 4 multiplies per cycle) * 250MHz -> 24 GFLOPS
- Each QPU has two register banks, ra and rb, with limitations on read/write ports and latencies.
- The first 32 entries, ra0..ra31 and rb0..rb31, are normal registers.
- The second 32 entries are actually references to units such as exp, log, reciprocal, reciprocal square root, and the 3D pipeline registers.
- Each QPU has 4 or more accumulators (these are high speed registers) for back to back access.
- The split of slices and QPUs is due to balancing of other shared units for the 3D pipeline (texturing, tiling, etc).
- Bit encodings can be deduced from parts of the blob (e.g. the shader emitter).
- The main fragments in the blob seem related to OpenVG, whereas the majority of the OpenGL ES stuff is generated dynamically from dataflow graphs of the user-supplied shaders.

Examining a compiled shader fragment:

  void main(void) {
    gl_FragColor = vec4(1,1,0,0.5);
  }

Compiles to:

  # addop; mulop; controlop;
  addop00<cc0> io39, A0, A0; mulop00<cc0> io39, A0, A0; op01
  addop00<cc0> io39, A0, A0; mulop00<cc0> io39, A0, A0; op04
  addop21<cc1> io46, io32, io32; mulop00<cc0> io39, A0, A0; op01
  addop00<cc0> io39, A0, A0; mulop00<cc0> io39, A0, A0; op03
  addop00<cc0> io39, A0, A0; mulop00<cc0> io39, A0, A0; op01
  addop00<cc0> io39, A0, A0; mulop00<cc0> io39, A0, A0; op05

Assuming:

  <cc0> = never
  <cc1> = always
  io32 = fetch from uniform memory
  io39 = discard/ignore
  io46 = gl_FragColor
  addop21 = one of mov, or, and

This gives:

  nop; nop; op01
  nop; nop; op04
  mov gl_FragColor, uniform; nop; op01
  nop; nop; op03
  nop; nop; op01
  nop; nop; op05

By playing around with different fragment inputs I think I have a handle on the operations for add, sub, mul, exp, log, min, max, etc. Also gl_FragCoord[.xyzw], gl_FrontFacing, etc.

Will try and post the OpenGL shader sample program soon.

Cheers
HH.
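
One way to sanity-check the annotations in the qpu-scan dump above is to re-derive them from the raw words. The following is only a rough Python sketch and is not part of qpu-sniff. The helper names (word_to_float, bits, decode) are made up for illustration, and the bit positions are an assumption, picked because they reproduce the ra/rb/adda/addb/wa/wb/addop/cc/opNN values printed for the six sample instructions. Fields that are zero in every sample (mulop, mula, mulb, the mul condition code) cannot be confirmed from this dump alone, and F, X and packbits are left out for the same reason.

```python
import struct


def word_to_float(word):
    """Reinterpret a 32-bit word as an IEEE-754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", word))[0]


def bits(value, hi, lo):
    """Return bits hi..lo (inclusive) of value as an integer."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode(first, second):
    """Split one instruction into fields.

    first/second are the two 32-bit words in the order the dump prints them.
    ASSUMPTION: these bit positions are a guess that happens to match the
    dump; they are not taken from any documentation in this post.
    """
    return {
        "mulop": bits(first, 31, 29),   # unconfirmed: always 0 in the samples
        "addop": bits(first, 28, 24),
        "ra": bits(first, 23, 18),
        "rb": bits(first, 17, 12),
        "adda": bits(first, 11, 9),
        "addb": bits(first, 8, 6),
        "mula": bits(first, 5, 3),      # unconfirmed: always 0 in the samples
        "mulb": bits(first, 2, 0),      # unconfirmed: always 0 in the samples
        "op": bits(second, 31, 28),     # the trailing opNN control field
        "addcc": bits(second, 19, 17),
        "mulcc": bits(second, 16, 14),  # unconfirmed: always 0 in the samples
        "wa": bits(second, 11, 6),
        "wb": bits(second, 5, 0),
    }


if __name__ == "__main__":
    # The "mov gl_FragColor, uniform" instruction at offset 00000004.
    print(decode(0x15827D80, 0x10020BA7))
    # The uniform_data words, shown as floats with the same precision as the dump.
    for w in (0x00000000, 0x3F800000, 0xBF800000, 0x40000000, 0x3F000000, 0x8000000B):
        print("%08x %.4g" % (w, word_to_float(w)))
```

Running decode over the other five word pairs should give ra=rb=wa=wb=39 with addop 0, differing only in the op field, which is consistent with reading them as nops that carry a control signal.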