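We already know that a vertex shader can replace fixed-function vertex processing. A pixel shader likewise replaces fixed-function pixel processing. The multitexture cascade of texture stages is replaced entirely by the pixel shader. Specular addition, fog, and frame buffer processing are not affected by the pixel shader.
With fixed-function pixel processing, the application must set state for each texture stage and try to express per-pixel processing within the restricted data flow of the multitexture cascade. With a pixel shader, you write a small program that samples textures, combines the sampled values into a new color, and passes that color on to the frame buffer for further processing.
Direct3D in DirectX 9.0c supports several pixel shader architecture versions. We will look at the register model, instruction model, and instruction set of each version. As with vertex shaders, a pixel shader is stored as an array of tokens that encode its instructions. Most applications do not build the token array directly; they use D3DX to compile text instructions into a token array.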
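As a small illustration of the token encoding, the first token of a pixel shader is its version token. The sketch below mirrors the layout of the `D3DPS_VERSION` macro from the Direct3D 9 headers; the function names are illustrative helpers, not SDK functions.

```cpp
#include <cstdint>

// Same bit layout as the D3DPS_VERSION macro: the high word 0xFFFF marks a
// pixel shader, the low word holds the major and minor version numbers.
constexpr std::uint32_t psVersionToken(std::uint32_t major, std::uint32_t minor) {
    return 0xFFFF0000u | (major << 8) | minor;
}

// Illustrative helpers (hypothetical names) that unpack a version token.
constexpr bool isPixelShaderToken(std::uint32_t token) {
    return (token >> 16) == 0xFFFFu;
}
constexpr std::uint32_t tokenMajor(std::uint32_t token) {
    return (token >> 8) & 0xFFu;
}
constexpr std::uint32_t tokenMinor(std::uint32_t token) {
    return token & 0xFFu;
}
```

For example, `psVersionToken(1, 4)` yields `0xFFFF0104`, the token that a compiled `ps.1.4` shader starts with.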
Figure: pixel shader architecture
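The pixel shader architecture is broadly similar in all versions; the instruction set and the register file differ. A pixel shader must perform two tasks: describe how textures are addressed and sampled, and describe how the sampled values are combined to produce a new color value.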
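The two tasks can be pictured with a CPU-side sketch. This is not how the hardware works; it only assumes a nearest-texel lookup with clamped coordinates for the sampling step and a component-wise multiply for the combining step. All type and function names here are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <vector>

using Color = std::array<float, 4>;  // r, g, b, a

struct Texture {
    std::size_t width;
    std::size_t height;
    std::vector<Color> texels;  // row-major, width * height entries
};

// Task 1: address and sample. Nearest-texel lookup with clamped coordinates.
Color sampleNearest(const Texture& tex, float u, float v) {
    u = std::min(std::max(u, 0.0f), 1.0f);
    v = std::min(std::max(v, 0.0f), 1.0f);
    std::size_t x = std::min(static_cast<std::size_t>(u * tex.width), tex.width - 1);
    std::size_t y = std::min(static_cast<std::size_t>(v * tex.height), tex.height - 1);
    return tex.texels[y * tex.width + x];
}

// Task 2: combine sampled values into a new color (component-wise multiply).
Color modulate(const Color& a, const Color& b) {
    Color d{};
    for (std::size_t i = 0; i < 4; ++i) d[i] = a[i] * b[i];
    return d;
}

// A "pixel shader" in miniature: sample one texture, modulate by a color.
Color shadePixel(const Texture& tex, float u, float v, const Color& diffuse) {
    return modulate(sampleNearest(tex, u, v), diffuse);
}
```

The color returned by `shadePixel` corresponds to the value a real shader would hand on to specular addition, fog, and frame buffer processing.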
Figure: pixel shader instruction execution flow
Feature differences between versions:
Feature | 1.0 | 1.1 | 1.2 | 1.3 | 1.4 Phase 1 | 1.4 Phase 2 |
Arithmetic instructions | 8 | 8 | 8 | 8 | 8 | 8 |
Texture address instructions | 4 | 4 | 4 | 4 | 6 | 6 |
Total instructions | 8 | 12 | 12 | 12 | 14 | 14 |
Constant registers (cn) | 8 | 8 | 8 | 8 | 8 | 8 |
Temporary registers (rn) | 2 | 2 | 2 | 2 | 6 | 6 |
Texture registers (tn) | 4 | 4 | 4 | 4 | 6 | 6 |
Color registers (vn) | 2 | 2 | 2 | 2 | 0 | 2 |
Instructions available in each version:
Instruction | Syntax | Meaning | 1.0 | 1.1 | 1.2 | 1.3 | 1.4 |
add | d,s0,s1 | component addition | YES | YES | YES | YES | YES |
bem | d.rg,s0,s1 | bump environment map | | | | | YES |
cmp | d,s0,s1,s2 | compare to 0.0 | | | YES | YES | YES |
cnd | d,s0,s1,s2 | compare to 0.5 | YES | YES | YES | YES | YES |
def | d,v0,v1,v2,v3 | constant definition | YES | YES | YES | YES | YES |
dp3 | d,s0,s1 | dot product | YES | YES | YES | YES | YES |
dp4 | d,s0,s1 | dot product | | | YES | YES | YES |
lrp | d,s0,s1,s2 | linear interpolation | YES | YES | YES | YES | YES |
mad | d,s0,s1,s2 | multiply and add | YES | YES | YES | YES | YES |
mov | d,s | register copy | YES | YES | YES | YES | YES |
mul | d,s0,s1 | component multiply | YES | YES | YES | YES | YES |
nop | | no operation | YES | YES | YES | YES | YES |
phase | | instruction phase | | | | | YES |
ps | .major.minor | shader version | YES | YES | YES | YES | YES |
sub | d,s0,s1 | component subtraction | YES | YES | YES | YES | YES |
tex | d | sample texture | YES | YES | YES | YES | |
texbem | d,s | bump environment map | YES | YES | YES | YES | |
texbeml | d,s | bump map with luminance | YES | YES | YES | YES | |
texcoord | d | texture is coordinate | YES | YES | YES | YES | |
texcrd | d,s | texture is coordinate | | | | | YES |
texdepth | d | compute pixel depth | | | | | YES |
texdp3 | d,s | texture dot product | | | YES | YES | |
texdp3tex | d,s | dot product with lookup | | | YES | YES | |
texkill | s | kill source pixel | YES | YES | YES | YES | YES |
texld | d,s | sample from register | | | | | YES |
texm3x2depth | d,s | compute pixel depth | | | | YES | |
texm3x2pad | d,s | partial matrix product | YES | YES | YES | YES | |
texm3x2tex | d,s | final matrix product | YES | YES | YES | YES | |
texm3x3 | d,s | final matrix product | | | YES | YES | |
texm3x3pad | d,s | partial matrix product | YES | YES | YES | YES | |
texm3x3spec | d,s0,s1 | reflection vector lookup | YES | YES | YES | YES | |
texm3x3tex | d,s | final matrix product | YES | YES | YES | YES | |
texm3x3vspec | d,s | variable reflection lookup | YES | YES | YES | YES | |
texreg2ar | d,s | dependent texture lookup | YES | YES | YES | YES | |
texreg2gb | d,s | dependent texture lookup | YES | YES | YES | YES | |
texreg2rgb | d,s | dependent texture lookup | | | YES | YES | |
Pixel Shader 1.0 Register Restrictions
Instruction | Operand | cn | rn | tn | vn |
add d,s0,s1 | d | | * | | |
add d,s0,s1 | s0,s1 | * | * | * | * |
cnd d,s0,s1,s2 | d | | * | | |
cnd d,s0,s1,s2 | s0 | | r0.a | | |
cnd d,s0,s1,s2 | s1,s2 | * | * | * | * |
dp3 d,s0,s1 | d | | * | | |
dp3 d,s0,s1 | s0,s1 | * | * | * | * |
lrp d,s0,s1,s2 | d | | * | | |
lrp d,s0,s1,s2 | s0,s1,s2 | * | * | * | * |
mad d,s0,s1,s2 | d | | * | | |
mad d,s0,s1,s2 | s0,s1,s2 | * | * | * | * |
mov d,s | d | | * | | |
mov d,s | s | * | * | * | * |
mul d,s0,s1 | d | | * | | |
mul d,s0,s1 | s0,s1 | * | * | * | * |
sub d,s0,s1 | d | | * | | |
sub d,s0,s1 | s0,s1 | * | * | * | * |
tex d | d | | | * | |
texbem d,s | d,s | | | * | |
texbeml d,s | d,s | | | * | |
texcoord d | d | | | * | |
texkill s | s | | | * | |
texm3x2pad d,s | d,s | | | * | |
texm3x2tex d,s | d,s | | | * | |
texm3x3pad d,s | d,s | | | * | |
texm3x3spec d,s0,s1 | d,s0 | | | * | |
texm3x3spec d,s0,s1 | s1 | * | | | |
texm3x3tex d,s | d,s | | | * | |
texm3x3vspec d,s | d,s | | | * | |
texreg2ar d,s | d,s | | | * | |
texreg2gb d,s | d,s | | | * | |
Pixel Shader 1.1 Register Restrictions
Instruction | Operand | cn | rn | tn | vn |
add d,s0,s1 | d | | * | * | |
add d,s0,s1 | s0,s1 | * | * | * | * |
cnd d,s0,s1,s2 | d | | * | * | |
cnd d,s0,s1,s2 | s0 | | r0.a | | |
cnd d,s0,s1,s2 | s1,s2 | * | * | * | * |
dp3 d,s0,s1 | d | | * | * | |
dp3 d,s0,s1 | s0,s1 | * | * | * | * |
lrp d,s0,s1,s2 | d | | * | * | |
lrp d,s0,s1,s2 | s0,s1,s2 | * | * | * | * |
mad d,s0,s1,s2 | d | | * | * | |
mad d,s0,s1,s2 | s0,s1,s2 | * | * | * | * |
mov d,s | d | | * | * | |
mov d,s | s | * | * | * | * |
mul d,s0,s1 | d | | * | * | |
mul d,s0,s1 | s0,s1 | * | * | * | * |
sub d,s0,s1 | d | | * | * | |
sub d,s0,s1 | s0,s1 | * | * | * | * |
tex d | d | | | * | |
texbem d,s | d,s | | | * | |
texbeml d,s | d,s | | | * | |
texcoord d | d | | | * | |
texkill s | s | | | * | |
texm3x2pad d,s | d,s | | | * | |
texm3x2tex d,s | d,s | | | * | |
texm3x3pad d,s | d,s | | | * | |
texm3x3spec d,s0,s1 | d,s0 | | | * | |
texm3x3spec d,s0,s1 | s1 | * | | | |
texm3x3tex d,s | d,s | | | * | |
texm3x3vspec d,s | d,s | | | * | |
texreg2ar d,s | d,s | | | * | |
texreg2gb d,s | d,s | | | * | |
Pixel Shader 1.2 Register Restrictions
Instruction | Operand | cn | rn | tn | vn |
add d,s0,s1 | d | | * | * | |
add d,s0,s1 | s0,s1 | * | * | * | * |
cmp d,s0,s1,s2 | d | | * | * | |
cmp d,s0,s1,s2 | s0,s1,s2 | * | * | * | * |
cnd d,s0,s1,s2 | d | | * | * | |
cnd d,s0,s1,s2 | s0 | | r0.a | | |
cnd d,s0,s1,s2 | s1,s2 | * | * | * | * |
dp3 d,s0,s1 | d | | * | * | |
dp3 d,s0,s1 | s0,s1 | * | * | * | * |
dp4 d,s0,s1 | d | | * | * | |
dp4 d,s0,s1 | s0,s1 | * | * | * | * |
lrp d,s0,s1,s2 | d | | * | * | |
lrp d,s0,s1,s2 | s0,s1,s2 | * | * | * | * |
mad d,s0,s1,s2 | d | | * | * | |
mad d,s0,s1,s2 | s0,s1,s2 | * | * | * | * |
mov d,s | d | | * | * | |
mov d,s | s | * | * | * | * |
mul d,s0,s1 | d | | * | * | |
mul d,s0,s1 | s0,s1 | * | * | * | * |
sub d,s0,s1 | d | | * | * | |
sub d,s0,s1 | s0,s1 | * | * | * | * |
tex d | d | | | * | |
texbem d,s | d,s | | | * | |
texbeml d,s | d,s | | | * | |
texcoord d | d | | | * | |
texkill s | s | | | * | |
texm3x2pad d,s | d,s | | | * | |
texm3x2tex d,s | d,s | | | * | |
texm3x3pad d,s | d,s | | | * | |
texm3x3spec d,s0,s1 | d,s0 | | | * | |
texm3x3spec d,s0,s1 | s1 | * | | | |
texm3x3tex d,s | d,s | | | * | |
texm3x3vspec d,s | d,s | | | * | |
texreg2ar d,s | d,s | | | * | |
texreg2gb d,s | d,s | | | * | |
texreg2rgb d,s | d,s | | | * | |
Pixel Shader 1.3 Register Restrictions
Instruction | Operand | cn | rn | tn | vn |
add d,s0,s1 | d | | * | * | |
add d,s0,s1 | s0,s1 | * | * | * | * |
cmp d,s0,s1,s2 | d | | * | * | |
cmp d,s0,s1,s2 | s0,s1,s2 | * | * | * | * |
cnd d,s0,s1,s2 | d | | * | * | |
cnd d,s0,s1,s2 | s0 | | r0.a | | |
cnd d,s0,s1,s2 | s1,s2 | * | * | * | * |
dp3 d,s0,s1 | d | | * | * | |
dp3 d,s0,s1 | s0,s1 | * | * | * | * |
dp4 d,s0,s1 | d | | * | * | |
dp4 d,s0,s1 | s0,s1 | * | * | * | * |
lrp d,s0,s1,s2 | d | | * | * | |
lrp d,s0,s1,s2 | s0,s1,s2 | * | * | * | * |
mad d,s0,s1,s2 | d | | * | * | |
mad d,s0,s1,s2 | s0,s1,s2 | * | * | * | * |
mov d,s | d | | * | * | |
mov d,s | s | * | * | * | * |
mul d,s0,s1 | d | | * | * | |
mul d,s0,s1 | s0,s1 | * | * | * | * |
sub d,s0,s1 | d | | * | * | |
sub d,s0,s1 | s0,s1 | * | * | * | * |
tex d | d | | | * | |
texbem d,s | d,s | | | * | |
texbeml d,s | d,s | | | * | |
texcoord d | d | | | * | |
texdp3 d,s | d,s | | | * | |
texdp3tex d,s | d,s | | | * | |
texkill s | s | | | * | |
texm3x2depth d,s | d,s | | | * | |
texm3x2pad d,s | d,s | | | * | |
texm3x2tex d,s | d,s | | | * | |
texm3x3pad d,s | d,s | | | * | |
texm3x3spec d,s0,s1 | d,s0 | | | * | |
texm3x3spec d,s0,s1 | s1 | * | | | |
texm3x3tex d,s | d,s | | | * | |
texm3x3vspec d,s | d,s | | | * | |
texreg2ar d,s | d,s | | | * | |
texreg2gb d,s | d,s | | | * | |
texreg2rgb d,s | d,s | | | * | |
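Pixel Shader 1.4 Register Restrictions: Phase 1
Instruction | Operand | cn | rn | tn | vn |
add d,s0,s1 | d | | * | | |
add d,s0,s1 | s0,s1 | * | * | | |
bem d.rg,s0,s1 | d | | * | | |
bem d.rg,s0,s1 | s0 | * | * | | |
bem d.rg,s0,s1 | s1 | | * | | |
cmp d,s0,s1,s2 | d | | * | | |
cmp d,s0,s1,s2 | s0,s1,s2 | * | * | | |
cnd d,s0,s1,s2 | d | | * | | |
cnd d,s0,s1,s2 | s0,s1,s2 | * | * | | |
dp3 d,s0,s1 | d | | * | | |
dp3 d,s0,s1 | s0,s1 | * | * | | |
dp4 d,s0,s1 | d | | * | | |
dp4 d,s0,s1 | s0,s1 | * | * | | |
lrp d,s0,s1,s2 | d | | * | | |
lrp d,s0,s1,s2 | s0,s1,s2 | * | * | | |
mad d,s0,s1,s2 | d | | * | | |
mad d,s0,s1,s2 | s0,s1,s2 | * | * | | |
mov d,s | d | | * | | |
mov d,s | s | * | * | | |
mul d,s0,s1 | d | | * | | |
mul d,s0,s1 | s0,s1 | * | * | | |
sub d,s0,s1 | d | | * | | |
sub d,s0,s1 | s0,s1 | * | * | | |
tex d | d | | | * | |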
Pixel Shader 1.4 Register Restrictions: Phase 2
Instruction | Operand | cn | rn | tn | vn |
add d,s0,s1 | d | | * | | |
add d,s0,s1 | s0,s1 | * | * | | * |
cmp d,s0,s1,s2 | d | | * | | |
cmp d,s0,s1,s2 | s0,s1,s2 | * | * | | * |
cnd d,s0,s1,s2 | d | | * | | |
cnd d,s0,s1,s2 | s0,s1,s2 | * | * | | * |
dp3 d,s0,s1 | d | | * | | |
dp3 d,s0,s1 | s0,s1 | * | * | | * |
dp4 d,s0,s1 | d | | * | | |
dp4 d,s0,s1 | s0,s1 | * | * | | * |
lrp d,s0,s1,s2 | d | | * | | |
lrp d,s0,s1,s2 | s0,s1,s2 | * | * | | * |
mad d,s0,s1,s2 | d | | * | | |
mad d,s0,s1,s2 | s0,s1,s2 | * | * | | * |
mov d,s | d | | * | | |
mov d,s | s | * | * | | * |
mul d,s0,s1 | d | | * | | |
mul d,s0,s1 | s0,s1 | * | * | | * |
sub d,s0,s1 | d | | * | | |
sub d,s0,s1 | s0,s1 | * | * | | * |
texcrd d,s | d | | * | | |
texcrd d,s | s | | | * | |
texdepth d | d | | r5 | | |
texkill s | s | | * | * | |
texld d,s | d | | * | | |
texld d,s | s | | * | * | |
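Version 1.0 contains a set of arithmetic and texture addressing instructions. In general, an arithmetic instruction can use any register as a source operand and any temporary register as its destination operand. The exception is cnd, which accepts only r0.a as its first source operand. The texture addressing instructions, whose names begin with tex, perform the texture read operations.
Version 1.1 extends the 1.0 architecture to allow more instructions. The destination operand of an arithmetic instruction can also be a texture register tn.
Version 1.2 adds the instructions cmp, dp4, texdp3, texdp3tex, texm3x3, and texreg2rgb.
Version 1.3 adds the instruction texm3x2depth.
Version 1.4 generalizes texture addressing and dependent texture reads by means of the phase instruction. The phase instruction marks the boundary between two phases, so that texture reads in the second phase can depend on values computed in the first. A PS 1.4 shader is laid out as: address, math, phase, address, math. The registers that can be used differ between the two phases.
In phase 1, arithmetic instructions can use constant and temporary registers as source operands and temporary registers as destination operands. Texture addressing instructions can use texture registers as source operands and temporary registers as destination operands.
In phase 2, arithmetic instructions can use color registers as source operands in addition to temporary and constant registers; the destination operand of an arithmetic instruction must be a temporary register. Texture addressing instructions in phase 2 can use temporary registers as source operands in addition to texture registers.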
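The phase rules above can be written down as a small lookup. The following is only a sketch of those two paragraphs, with hypothetical enum and function names; it is not part of any Direct3D API.

```cpp
enum class RegisterType { Constant, Temporary, Texture, Color };
enum class InstructionKind { Arithmetic, TextureAddress };

// Returns true if a PS 1.4 instruction of the given kind may read from the
// given register type in the given phase (1 or 2), per the rules above.
bool isValidSourcePs14(InstructionKind kind, int phase, RegisterType reg) {
    if (phase != 1 && phase != 2) return false;
    if (kind == InstructionKind::Arithmetic) {
        if (reg == RegisterType::Constant || reg == RegisterType::Temporary) return true;
        // Color registers become readable only in phase 2.
        return phase == 2 && reg == RegisterType::Color;
    }
    // Texture addressing: texture registers in both phases,
    // temporary registers additionally in phase 2.
    if (reg == RegisterType::Texture) return true;
    return phase == 2 && reg == RegisterType::Temporary;
}

// In both phases the destination must be a temporary register.
bool isValidDestinationPs14(RegisterType reg) {
    return reg == RegisterType::Temporary;
}
```

A check such as `isValidSourcePs14(InstructionKind::Arithmetic, 1, RegisterType::Color)` returns false, which matches the 0 color registers listed for phase 1.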
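Every pixel shader must declare its architecture version with the ps instruction, which must be the first instruction in the shader. The major and minor fields must be the major and minor version numbers of the architecture. When SetPixelShader binds a pixel shader, the constant registers it defines are bound as well. The def instruction defines a constant register. def instructions must appear after the version instruction and before any computation instructions.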
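To make the ordering rule concrete, here is a short ps.1.1 shader held in a C++ string, together with a hypothetical checker for the rule just described. The checker is an illustration only: it looks at the first word of each line and is slightly stricter than the text, since it requires def to precede every other instruction. It is in no way a replacement for the D3DX assembler.

```cpp
#include <sstream>
#include <string>

// Sample shader: sample texture 0, modulate by the diffuse color, add a bias.
const char* const kSampleShader =
    "ps.1.1\n"
    "def c0, 0.5, 0.5, 0.5, 1.0\n"
    "tex t0\n"
    "mul r0, t0, v0\n"
    "add r0, r0, c0\n";

// Hypothetical helper: returns true if the version instruction comes first
// and every def appears before any other instruction.
bool hasValidPrologue(const std::string& source) {
    std::istringstream in(source);
    std::string line;
    bool seenVersion = false;
    bool seenOther = false;
    while (std::getline(in, line)) {
        std::istringstream words(line);
        std::string op;
        if (!(words >> op)) continue;  // skip blank lines
        if (!seenVersion) {
            if (op.rfind("ps.", 0) != 0) return false;  // first instruction must be ps
            seenVersion = true;
        } else if (op == "def") {
            if (seenOther) return false;  // def after another instruction
        } else {
            seenOther = true;
        }
    }
    return seenVersion;
}
```

Calling `hasValidPrologue(kSampleShader)` returns true, while moving the def line below the mul line makes it return false.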
mov d, s
add d, s0, s1
sub d, s0, s1
mul d, s0, s1
mad d, s0, s1, s2
lrp d, s0, s1, s2
dp3 d, s0, s1
dp4 d, s0, s1
cmp d, s0, s1, s2
d = (f(s0.x, s1.x, s2.x), f(s0.y, s1.y, s2.y), f(s0.z, s1.z, s2.z), f(s0.w, s1.w, s2.w)), where f(a, b, c) = b if a >= 0, c if a < 0
cnd d, s0, s1, s2
d = s1 if s0.a > 0.5, s2 if s0.a <= 0.5
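A CPU-side sketch of a few of these arithmetic instructions may help. cmp and cnd follow the formulas given above. The text lists mad, lrp, and dp3 without formulas, so their bodies below use the standard Direct3D definitions (multiply-add, linear interpolation, and a three-component dot product replicated to all channels). The C++ names are hypothetical.

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<float, 4>;  // x/r, y/g, z/b, w/a

// mad d, s0, s1, s2: d = s0 * s1 + s2, per component.
Vec4 mad(const Vec4& s0, const Vec4& s1, const Vec4& s2) {
    Vec4 d{};
    for (std::size_t i = 0; i < 4; ++i) d[i] = s0[i] * s1[i] + s2[i];
    return d;
}

// lrp d, s0, s1, s2: d = s0 * s1 + (1 - s0) * s2, per component.
Vec4 lrp(const Vec4& s0, const Vec4& s1, const Vec4& s2) {
    Vec4 d{};
    for (std::size_t i = 0; i < 4; ++i) d[i] = s0[i] * s1[i] + (1.0f - s0[i]) * s2[i];
    return d;
}

// dp3 d, s0, s1: three-component dot product written to every channel.
Vec4 dp3(const Vec4& s0, const Vec4& s1) {
    float f = s0[0] * s1[0] + s0[1] * s1[1] + s0[2] * s1[2];
    return {f, f, f, f};
}

// cmp d, s0, s1, s2: per component, s1 where s0 >= 0, otherwise s2.
Vec4 cmp(const Vec4& s0, const Vec4& s1, const Vec4& s2) {
    Vec4 d{};
    for (std::size_t i = 0; i < 4; ++i) d[i] = (s0[i] >= 0.0f) ? s1[i] : s2[i];
    return d;
}

// cnd d, s0, s1, s2: s1 if s0.a > 0.5, otherwise s2.
Vec4 cnd(const Vec4& s0, const Vec4& s1, const Vec4& s2) {
    return (s0[3] > 0.5f) ? s1 : s2;
}
```

Note that cmp selects per component, while cnd as written here makes one decision for the whole register based on the alpha channel of s0.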
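In the formulas below, T[d] denotes a sample from the texture bound to stage d, (u[d], v[d], w[d], q[d]) are the texture coordinates of stage d, and s.r, s.g, s.b, s.a are the components of the source register s.
tex d
d = T[d](u[d], v[d], w[d], q[d])
texcoord d
d = (f(u[d]), f(v[d]), f(w[d]), 1), where f(x) = 0 if x < 0, x if 0 <= x <= 1, 1 if x > 1
texcrd d, s
d = (u[s], v[s], w[s], *)
texld d, s
d = T[d](u[s], v[s], w[s])
texdp3 d, s
f = (u[d], v[d], w[d]) · (s.r, s.g, s.b)
d = (f, f, f, f)
texdp3tex d, s
u' = (u[d], v[d], w[d]) · (s.r, s.g, s.b)
d = T[d](u', 0, 0)
texkill s
abort the pixel if u[s] < 0 or v[s] < 0 or w[s] < 0
texm3x2pad d, s
d = (u[d], v[d], w[d]) · (s.r, s.g, s.b)
texm3x2tex d, s
u' = (u[d-1], v[d-1], w[d-1]) · (s.r, s.g, s.b)
v' = (u[d], v[d], w[d]) · (s.r, s.g, s.b)
d = T[d](u', v')
texm3x2depth d, s
z = (u[d-1], v[d-1], w[d-1]) · (s.r, s.g, s.b)
w = (u[d], v[d], w[d]) · (s.r, s.g, s.b)
f = 1 if w = 0, z/w if w <> 0
d = (f, f, f, f)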
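The texm3x2 family amounts to two dot products between texture coordinate rows and the color of the source register. The sketch below computes the lookup coordinates of texm3x2tex and the depth value of texm3x2depth from the formulas above. The names are hypothetical, and any range clamping done by real hardware is deliberately left out.

```cpp
#include <array>

using Vec3 = std::array<float, 3>;

float dot3(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// texm3x2pad + texm3x2tex: rowPrev holds the coordinates of stage d-1,
// rowCur those of stage d, and s the (r, g, b) of the source register.
std::array<float, 2> texm3x2Coords(const Vec3& rowPrev, const Vec3& rowCur, const Vec3& s) {
    return {dot3(rowPrev, s), dot3(rowCur, s)};
}

// texm3x2pad + texm3x2depth: z / w, or 1 when w is zero.
float texm3x2Depth(const Vec3& rowPrev, const Vec3& rowCur, const Vec3& s) {
    float z = dot3(rowPrev, s);
    float w = dot3(rowCur, s);
    return (w == 0.0f) ? 1.0f : z / w;
}
```

The pair returned by `texm3x2Coords` is what texm3x2tex would use as (u', v') for the lookup in the texture of stage d.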
texm3x3pad d, s
d = (u[d], v[d], w[d]) · (s.r, s.g, s.b)
texm3x3 d, s
u' = (u[d-2], v[d-2], w[d-2]) · (s.r, s.g, s.b)
v' = (u[d-1], v[d-1], w[d-1]) · (s.r, s.g, s.b)
w' = (u[d], v[d], w[d]) · (s.r, s.g, s.b)
d = (u', v', w', 1)
texm3x3tex d, s
u' = (u[d-2], v[d-2], w[d-2]) · (s.r, s.g, s.b)
v' = (u[d-1], v[d-1], w[d-1]) · (s.r, s.g, s.b)
w' = (u[d], v[d], w[d]) · (s.r, s.g, s.b)
d = T[d](u', v', w')
texm3x3spec d, s0, s1
u' = (u[d-2], v[d-2], w[d-2]) · (s0.r, s0.g, s0.b)
v' = (u[d-1], v[d-1], w[d-1]) · (s0.r, s0.g, s0.b)
w' = (u[d], v[d], w[d]) · (s0.r, s0.g, s0.b)
n = (u', v', w')
(u'', v'', w'') = 2 ((n · s1) / (n · n)) n - s1
d = T[d](u'', v'', w'')
texm3x3vspec d, s
u' = (u[d-2], v[d-2], w[d-2]) · (s.r, s.g, s.b)
v' = (u[d-1], v[d-1], w[d-1]) · (s.r, s.g, s.b)
w' = (u[d], v[d], w[d]) · (s.r, s.g, s.b)
n = (u', v', w')
e = (q[d-2], q[d-1], q[d])
(u'', v'', w'') = 2 ((n · e) / (n · n)) n - e
d = T[d](u'', v'', w'')
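The reflection lookups boil down to a 3x3 matrix product followed by a reflection of the eye vector about the transformed normal. A minimal sketch of that arithmetic follows, with hypothetical names and under the assumption that the normal is non-zero (otherwise the division is undefined).

```cpp
#include <array>

using Vec3 = std::array<float, 3>;

float dot3(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// The three rows are the (u, v, w) coordinates of stages d-2, d-1, and d;
// s is the (r, g, b) of the source register. Returns n = (u', v', w').
Vec3 transformNormal(const Vec3& row0, const Vec3& row1, const Vec3& row2, const Vec3& s) {
    return {dot3(row0, s), dot3(row1, s), dot3(row2, s)};
}

// Reflection vector 2 ((n . e) / (n . n)) n - e. Assumes n is not the zero vector.
Vec3 reflectAbout(const Vec3& n, const Vec3& e) {
    float k = 2.0f * dot3(n, e) / dot3(n, n);
    return {k * n[0] - e[0], k * n[1] - e[1], k * n[2] - e[2]};
}
```

For texm3x3spec the eye vector e comes from the constant register s1; for texm3x3vspec it is assembled from the q coordinates of the three stages. Because of the division by n · n, the normal does not need to be normalized first.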
texreg2ar d, s
d = T[d](s.a, s.r)
texreg2gb d, s
d = T[d](s.g, s.b)
texreg2rgb d, s
d = T[d](s.r, s.g, s.b)
texdepth d
z = d.r / d.g if d.g <> 0, 1 if d.g = 0
bem d, s0, s1
d.r = s0.r + b00[d] s1.r + b10[d] s1.g
d.g = s0.g + b01[d] s1.r + b11[d] s1.g
texbem d, s
u' = u[d] + b00[d] s.r + b10[d] s.g
v' = v[d] + b01[d] s.r + b11[d] s.g
d = T[d](u', v')
texbeml d, s
u' = u[d] + b00[d] s.r + b10[d] s.g
v' = v[d] + b01[d] s.r + b11[d] s.g
d = T[d](u', v') (s.b l[d] + o[d])
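The bump environment map instructions perturb a pair of texture coordinates with a 2x2 matrix applied to the red and green channels of the source, and texbeml additionally scales the sampled color by a luminance factor. In the sketch below the four matrix coefficients are named by role rather than by index so that the example does not depend on an index convention. All names are hypothetical.

```cpp
struct BumpEnv {
    float uFromR;  // contribution of source red to u'
    float uFromG;  // contribution of source green to u'
    float vFromR;  // contribution of source red to v'
    float vFromG;  // contribution of source green to v'
    float luminanceScale;   // l in the texbeml formula
    float luminanceOffset;  // o in the texbeml formula
};

struct TexCoord {
    float u;
    float v;
};

// texbem: perturbed lookup coordinates (u', v').
TexCoord perturb(const TexCoord& c, float sr, float sg, const BumpEnv& env) {
    return {c.u + env.uFromR * sr + env.uFromG * sg,
            c.v + env.vFromR * sr + env.vFromG * sg};
}

// texbeml: factor applied to the sampled color, computed from source blue.
float luminanceFactor(float sb, const BumpEnv& env) {
    return sb * env.luminanceScale + env.luminanceOffset;
}
```

With an identity matrix the perturbation simply adds the red and green channels to u and v.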
abs d, s
crs d, s0, s1
dcl d
dcl_textureType d
dp2add d, s0, s1, s2
exp d, sc
frc d, s
log d, sc
m3x2 d, s0, s1
m3x3 d, s0, s1
m3x4 d, s0, s1
m4x3 d, s0, s1
m4x4 d, s0, s1
max d, s0, s1
min d, s0, s1
nrm d, s
pow d, s0c, s1c
rcp d, sc
rsq d, sc
sincos d, s0c, s1, s2
texld d, s0, s1
texldb d, s0, s1
texldp d, s0, s1
defb
defi
label
call
callnz
callnz_pred
ret
dsx
dsy
if
else
endif
if_comp
if_pred
rep
endrep
break
break_comp
break_pred
setp
texldd
dcl_usage
loop
endloop
texldl
The pixel shader property of the device is managed with SetPixelShader and GetPixelShader. The shader property is a DWORD handle that refers to an array of DWORD tokens encoding the shader's instructions.
HRESULT GetPixelShader(DWORD * value);
HRESULT SetPixelShader(DWORD value);
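The association between a token array and a handle is created by CreatePixelShader. To destroy the association between the pixel shader handle and its tokens and to free the token memory on the device, release all references to the pixel shader object. You can retrieve the token array with GetPixelShaderFunction.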
HRESULT CreatePixelShader(const DWORD * function, DWORD * result);
HRESULT GetPixelShaderFunction(DWORD handle,void * value, DWORD * size);
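To obtain the token array of a pixel shader, first call the method with value set to NULL; this returns the size of the token array. Then pass a pointer to a buffer of that size as value, and the call returns the token array of the pixel shader.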
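The two-call pattern can be sketched without a real device. `MockDevice` below is a hypothetical stand-in that only imitates the size-query convention described above (a null buffer means "report the size in bytes"). It is not the Direct3D interface and it returns bool rather than HRESULT.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a device that holds one shader's token array.
struct MockDevice {
    std::vector<std::uint32_t> stored;

    // Null value: write the required byte count to *size.
    // Non-null value: copy the tokens if *size is large enough.
    bool getPixelShaderFunction(void* value, std::uint32_t* size) const {
        if (size == nullptr) return false;
        const std::uint32_t needed =
            static_cast<std::uint32_t>(stored.size() * sizeof(std::uint32_t));
        if (value == nullptr) {
            *size = needed;
            return true;
        }
        if (*size < needed) return false;
        if (needed > 0) std::memcpy(value, stored.data(), needed);
        *size = needed;
        return true;
    }
};

// First call queries the size, second call fetches the tokens.
std::vector<std::uint32_t> readTokens(const MockDevice& device) {
    std::uint32_t bytes = 0;
    if (!device.getPixelShaderFunction(nullptr, &bytes)) return {};
    std::vector<std::uint32_t> tokens(bytes / sizeof(std::uint32_t));
    if (tokens.empty()) return tokens;
    if (!device.getPixelShaderFunction(tokens.data(), &bytes)) return {};
    return tokens;
}
```

With a real device the same shape applies: one call with a null buffer to learn the size, an allocation, then a second call to fill the buffer.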
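The GetPixelShaderConstantF and SetPixelShaderConstantF methods manage the constant registers.
A pixel shader replaces the fixed-function pipeline functionality controlled by the TSS color op and TSS alpha op, but it does not replace the texture addressing and sampling mechanisms.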