Forum Discussion
lazydodo
13 years ago · Honored Guest
Shader Optimization.
I was playing with the shader and wondered if it could be optimized.
Let's start with the standard shader:
// Distortion constants supplied by the application (Oculus SDK parameters).
Texture2D Texture : register(t0);
SamplerState Linear : register(s0);
float2 LensCenter;
float2 ScreenCenter;
float2 Scale;
float2 ScaleIn;
float4 HmdWarpParam;

float2 HmdWarp(float2 in01)
{
    float2 theta = (in01 - LensCenter) * ScaleIn; // Scales to [-1, 1]
    float rSq = theta.x * theta.x + theta.y * theta.y;
    float2 theta1 = theta * (HmdWarpParam.x + HmdWarpParam.y * rSq +
        HmdWarpParam.z * rSq * rSq + HmdWarpParam.w * rSq * rSq * rSq);
    return LensCenter + Scale * theta1;
}

float4 main(float2 texCoord : TEXCOORD0, float4 color : COLOR0) : SV_Target
{
    float2 tc = HmdWarp(texCoord);
    // Black out anything that warps outside the per-eye viewport.
    if (any(clamp(tc, ScreenCenter - float2(0.25, 0.5), ScreenCenter + float2(0.25, 0.5)) - tc))
        return 0;
    return Texture.Sample(Linear, tc);
}
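The warp is just a polynomial radial distortion, so it's easy to poke at on the CPU. Here's a Python port of HmdWarp() for checking the math; the constant values below are made-up placeholders, not real Rift calibration data:

```python
# Python port of the HmdWarp() HLSL function above.
# NOTE: these constants are placeholder guesses, NOT real Rift values.
LENS_CENTER = (0.5, 0.5)
SCALE = (0.3, 0.3)
SCALE_IN = (2.0, 2.0)
HMD_WARP_PARAM = (1.0, 0.22, 0.24, 0.0)  # k0, k1, k2, k3

def hmd_warp(u, v):
    """Scale into [-1, 1] around the lens centre, weight by a polynomial
    in r^2, then scale back into texture space."""
    tx = (u - LENS_CENTER[0]) * SCALE_IN[0]
    ty = (v - LENS_CENTER[1]) * SCALE_IN[1]
    r_sq = tx * tx + ty * ty
    k0, k1, k2, k3 = HMD_WARP_PARAM
    w = k0 + k1 * r_sq + k2 * r_sq * r_sq + k3 * r_sq * r_sq * r_sq
    return (LENS_CENTER[0] + SCALE[0] * tx * w,
            LENS_CENTER[1] + SCALE[1] * ty * w)
```

A point at the lens centre maps to itself, and the distortion is symmetric about the centre, which makes for quick sanity checks.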
That's roughly 11 multiplies, 6 additions and 3 subtractions per pixel (eyeballed, so the counts could be off by one or two). On my GTX 670 it takes 0.1447264 milliseconds to execute (averaged over 100 frames of just the draw with the shader).
Which could be good, could be bad; with nothing to compare it to, let's compare it against a shader that does no warping and just samples the texture:
float4 pixelShader(float2 texCoord : TEXCOORD0, float4 color : COLOR0) : SV_Target
{
    float4 c = Texture.Sample(Linear, texCoord);
    return c;
}
which takes 0.0272857 milliseconds to execute (again GTX 670, 100 frames).
So the warping shader is about 5.3 times slower than bare sampling.
There must be a better way. Why are we recalculating the distortion map every single frame? It seems rather wasteful. Let's render the distortion map to a texture instead. We'll use the red channel for the X distortion and the green channel for the Y distortion; a texture of format R32G32_FLOAT fits our needs nicely. And the upside is we only have to do the nasty math once!
Our shader for baking the distortion map now looks like this:
float2 HmdWarp(float2 in01)
{
    // Same as above, cut to keep the post size down.
}

float4 main(float2 texCoord : TEXCOORD0, float4 color : COLOR0) : SV_Target
{
    float2 tc = HmdWarp(texCoord);
    // Black out anything that warps outside the per-eye viewport.
    if (any(clamp(tc, ScreenCenter - float2(0.25, 0.5), ScreenCenter + float2(0.25, 0.5)) - tc))
        return 0;
    // Write the warped coordinates into the R and G channels.
    return float4(tc.x, tc.y, 0, 0);
}
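Since the bake only happens once, it doesn't even have to be a render pass; the same table can be filled on the CPU. A NumPy sketch of the bake (same placeholder parameters as before, not real Rift values; the clamp-to-black step from the shader is omitted for brevity):

```python
import numpy as np

# Placeholder distortion parameters -- guesses, NOT real Rift values.
LENS_CENTER = np.array([0.5, 0.5])
SCALE = np.array([0.3, 0.3])
SCALE_IN = np.array([2.0, 2.0])
K = np.array([1.0, 0.22, 0.24, 0.0])  # HmdWarpParam k0..k3

def bake_warp_lut(width, height):
    """Build an (height, width, 2) float32 array: channel 0 holds the
    warped x coordinate (R), channel 1 the warped y coordinate (G),
    matching an R32G32_FLOAT texture."""
    # Texel-centre UV coordinates for every pixel.
    u = (np.arange(width) + 0.5) / width
    v = (np.arange(height) + 0.5) / height
    uu, vv = np.meshgrid(u, v)
    theta = np.stack([(uu - LENS_CENTER[0]) * SCALE_IN[0],
                      (vv - LENS_CENTER[1]) * SCALE_IN[1]], axis=-1)
    r_sq = (theta ** 2).sum(axis=-1, keepdims=True)
    w = K[0] + K[1] * r_sq + K[2] * r_sq ** 2 + K[3] * r_sq ** 3
    return (LENS_CENTER + SCALE * theta * w).astype(np.float32)

lut = bake_warp_lut(64, 80)
```

The resulting array has the same layout as an R32G32_FLOAT texture, so it could just as well be uploaded as the lookup texture's initial data.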
The resulting texture will look kind of like this (I used bad parameters and had to guess since I don't have a Rift yet, so the horizontal mapping is somewhat off):

Sweet, almost in business! Let's update the warping shader to make use of the new texture. First of all, the linear sampler has to go; a MinMagMipPoint sampler fits our needs better.
SamplerState PointSampler : register(s0);
Texture2D Texture : register(t0);
Texture2D TextureWarp : register(t1);

float4 main(float2 texCoord : TEXCOORD0, float4 color : COLOR0) : SV_Target
{
    // Fetch the precomputed warped coordinates, then sample the scene there.
    float4 c = TextureWarp.Sample(PointSampler, texCoord);
    float2 RealCoord = float2(c.r, c.g);
    return Texture.Sample(PointSampler, RealCoord);
}
Ahh, much simpler now! New execution time 0.0598547 milliseconds and no visual difference, about a 2.4x speedup over the original shader. Not bad for 20 minutes of tinkering.
80 Replies
- Rayoque (Honored Guest): This is some really good work! I love seeing progress towards making an existing product better. :D I wonder if the Rift's firmware is easily flashable, and if Oculus could build and push out an official update at some point that incorporates upgrades like this? Or is everything implemented via the SDK?
- atavener (Adventurer): Nicely done, lazydodo. Back before programmable shaders it was all about using textures as functions (look-up tables). There's still value to such techniques. :)
- Entroper (Honored Guest): Nice job!
I was thinking earlier that since HmdWarpParam.w was always zero, you could leave out the r^6 term in the equation, and save yourself another two multiplies and an add (or four multiplies depending on how good the compiler is at optimization and scheduling). But this looks like it beats the pants off that little tweak anyway. :)
I wonder if you could generalize this further and avoid the dependent texture lookup. What if, instead of just drawing a single big quad, you drew a mesh of smaller quads with the right tex coords baked into the vertices? It would be an approximation, but I bet that bilinear filtering would be close enough even with a fairly sparse quad mesh. You could even pre-warp the quad mesh so that you aren't drawing the black area at all. Any thoughts?
- owenwp (Expert Protege): The distortion is not a linear function, so you will get swimming artifacts if you do it that way.
- lazydodo (Honored Guest): Sounds like a fun test, I'll try it out over the weekend.
- Entroper (Honored Guest):
"owenwp" wrote:
The distortion is not a linear function, so you will get swimming artifacts if you do it that way.
The question is, how dense do you need to make the mesh to avoid this? If you can get the error down to half a pixel or less, it should look okay. If you did 32x40 quads, each quad would span 20x20 pixels pre-warp. And the larger error amounts will probably be in your peripheral vision anyway.
- Anonymous: Nice work. I tried to do this in OpenGL yesterday but somehow managed to slow things down. It's good to know the idea was sound and I'm merely a terrible programmer.
Would distorting the left and right eyes at the same time speed things up even more? Halving the number of times you sample the distortion map seems like a good thing, but I've no idea if any gains would be negated by the extra stuff that would need doing.
I'll give it a try tomorrow, but considering my previous attempt I'd be hesitant to draw any kind of conclusions based on the results.
- cybereality (Grand Champion): This is interesting. I can talk to the team to see if this is something we'd want to add to the SDK.
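Entroper's mesh-of-quads question is easy to sanity-check numerically: bake the warp only at sparse mesh vertices, bilinearly interpolate between them (which is roughly what the rasterizer would do), and measure the worst-case deviation from the exact warp in pixels. A NumPy sketch, again with made-up distortion coefficients rather than real Rift calibration:

```python
import numpy as np

# Placeholder distortion parameters -- guesses, NOT real Rift values.
LENS_CENTER = np.array([0.5, 0.5])
SCALE = np.array([0.3, 0.3])
SCALE_IN = np.array([2.0, 2.0])
K = np.array([1.0, 0.22, 0.24, 0.0])  # HmdWarpParam k0..k3

def warp(uv):
    """Exact HmdWarp over an (..., 2) array of texture coordinates."""
    theta = (uv - LENS_CENTER) * SCALE_IN
    r_sq = (theta ** 2).sum(axis=-1, keepdims=True)
    w = K[0] + K[1] * r_sq + K[2] * r_sq ** 2 + K[3] * r_sq ** 3
    return LENS_CENTER + SCALE * theta * w

def max_mesh_error_px(quads_x, quads_y, screen_w=640, screen_h=800):
    """Worst-case |bilinear mesh warp - exact warp| over all pixels, in pixels."""
    # Exact warp at every pixel centre.
    u = (np.arange(screen_w) + 0.5) / screen_w
    v = (np.arange(screen_h) + 0.5) / screen_h
    uu, vv = np.meshgrid(u, v)
    exact = warp(np.stack([uu, vv], axis=-1))

    # Warp evaluated only at the mesh vertices.
    gx = np.linspace(0.0, 1.0, quads_x + 1)
    gy = np.linspace(0.0, 1.0, quads_y + 1)
    guu, gvv = np.meshgrid(gx, gy)
    verts = warp(np.stack([guu, gvv], axis=-1))

    # Bilinearly interpolate the vertex warps at every pixel.
    fx = uu * quads_x
    fy = vv * quads_y
    ix = np.minimum(fx.astype(int), quads_x - 1)
    iy = np.minimum(fy.astype(int), quads_y - 1)
    tx = (fx - ix)[..., None]
    ty = (fy - iy)[..., None]
    approx = ((1 - tx) * (1 - ty) * verts[iy, ix]
              + tx * (1 - ty) * verts[iy, ix + 1]
              + (1 - tx) * ty * verts[iy + 1, ix]
              + tx * ty * verts[iy + 1, ix + 1])

    err_uv = np.abs(approx - exact)
    return float((err_uv * np.array([screen_w, screen_h])).max())
```

Since bilinear interpolation error falls off roughly quadratically with mesh density, a modestly denser mesh buys a lot of accuracy; the actual pixel error depends on the real distortion coefficients.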
- lazydodo (Honored Guest): I ran the tests on my office box (GeForce 210, low-end $35 hardware). The speedup wasn't quite as big as on the 670, but still well worth it: given you run the shader twice, it got the frame time down by almost a full millisecond.
warpShader : 2.094713
Just Sample : 0.792888
WarpLookup : 1.597678
- gallantpigeon (Honored Guest): Correct me if I am wrong, but doesn't the Oculus SDK use a Maclaurin series expansion to approximate the full barrel distortion? Wikipedia describes the full expansion: http://en.wikipedia.org/wiki/Distortion_(optics).
Assuming this is correct, we can use more than the four terms the SDK uses, r' = r(k0 + k1*r^2 + k2*r^4 + k3*r^6), to get higher distortion accuracy, since the math is only done at startup. I'm not sure whether adding more terms will make any noticeable difference in quality; the error may be negligible.
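Since the polynomial is only evaluated at bake time, extending it past four terms costs nothing per frame. A small sketch of the general n-coefficient form, evaluated with Horner's rule in r^2 (the coefficient values in the usage below are arbitrary examples, not calibration data):

```python
def barrel_distort_radius(r, ks):
    """Return r * (ks[0] + ks[1]*r^2 + ks[2]*r^4 + ...) for any number
    of coefficients, evaluated with Horner's rule in r^2."""
    r_sq = r * r
    w = 0.0
    for k in reversed(ks):
        w = w * r_sq + k
    return r * w
```

With ks = [k0, k1, k2, k3] this reproduces the SDK's four-term form; a longer list simply adds the higher-order Maclaurin terms.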