Forum Discussion
tlopes
Honored Guest · 12 years ago
Dev Feedback on OVR SDK 0.3.1 Preview
Hey guys, it looks like you're cramming a whole lot of features into this release! Here are my first impressions of the new Distortion Renderer code in the new SDK:
(Disclaimer: I mostly looked at the D3D9 part of the Distortion Renderer code)
- I love the changes in the Distortion Renderer to drastically simplify the pixel shaders by rendering a large mesh. These fragment shaders are *way* faster than the previous ones which tried to compute chromatic correction in the pixel shader.
- It looks to me like the CAPI D3D9 Distortion Renderer code is reeeeeeeeally barebones.
- The WaitUntilGpuIdle and WaitTillTimeAndFlushGpu functions are #ifdef'd out so that they don't do anything.
- The distortion mesh's vertex declaration may have compatibility issues on older video cards. Instead of using three D3DDECLUSAGE_POSITION elements, I'd suggest placing the TimeWarpFactor and VignetteFactor elements of the ovrDistortionVertex struct into the decl as TEXCOORD elements rather than POSITION (see the sketch after this list).
- I like that Oculus is trying to be a good middleware citizen by recording all device state before rendering and restoring it afterwards. One issue with DistortionRenderer::RecordAndSetState is that IDirect3DDevice9::Get* calls are not supported on D3DCREATE_PURE devices. Since many graphics engines cache render state at the engine level, I think a better alternative here would be two callback functions the application can supply, so the renderer can pull render states from the engine's state cache (two functions because you may need one for D3DSAMPLERSTATETYPE values and one for D3DRENDERSTATETYPE values).
- Many device state changes are not being saved and restored at the end. These include the Viewport, StreamSource, Vertex Declaration, Index Buffer, Vertex and Pixel Shaders and their Constants, and Textures.
- Since you guys are rendering a 2D grid for the Distortion Mesh, I'm almost entirely certain that you could be using a TRIANGLESTRIP (8194 indices required, not counting strip/row restarts) rather than a TRIANGLELIST (24576 indices required). This saves a massive amount of index buffer space and is likely faster to render the distortion mesh.
- Using D3DPOOL_DEFAULT and D3DUSAGE_WRITEONLY on the distortion mesh vertex buffer and index buffer may result in a performance boost in D3D9 (also shown in the sketch after this list).
- How did you guys arrive at the conclusion that 8192 triangles per eye was an optimal number for the distortion renderer? Wouldn't larger numbers of triangles result in lower linear interpolation stretch error? Did it just hit the point of diminishing returns as 8192 "looks good enough"?
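For the vertex declaration and buffer creation points above, here's a rough, untested sketch of what I mean. The offsets assume the ovrDistortionVertex layout is a 2-float position, TimeWarpFactor, VignetteFactor, and then three 2-float chroma UV pairs, and numVerts/numIndices are just placeholders, so double-check all of it against the actual struct:
// TimeWarpFactor and VignetteFactor packed into a single TEXCOORD element instead of extra POSITION usages.
static const D3DVERTEXELEMENT9 distortionVertexDecl[] =
{
    { 0,  0, D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_POSITION, 0 }, // screen position (NDC)
    { 0,  8, D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_TEXCOORD, 0 }, // TimeWarpFactor, VignetteFactor
    { 0, 16, D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_TEXCOORD, 1 }, // red channel UV
    { 0, 24, D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_TEXCOORD, 2 }, // green channel UV
    { 0, 32, D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_TEXCOORD, 3 }, // blue channel UV
    D3DDECL_END()
};
IDirect3DVertexDeclaration9* decl = NULL;
device->CreateVertexDeclaration(distortionVertexDecl, &decl);

// Static mesh buffers created write-only in the default pool
// (note: D3DPOOL_DEFAULT resources have to be recreated after a device reset).
IDirect3DVertexBuffer9* vb = NULL;
IDirect3DIndexBuffer9*  ib = NULL;
device->CreateVertexBuffer(numVerts * sizeof(ovrDistortionVertex),
                           D3DUSAGE_WRITEONLY, 0 /* no FVF, using a declaration */,
                           D3DPOOL_DEFAULT, &vb, NULL);
device->CreateIndexBuffer(numIndices * sizeof(UINT16),
                          D3DUSAGE_WRITEONLY, D3DFMT_INDEX16,
                          D3DPOOL_DEFAULT, &ib, NULL);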
I think that the D3D9 version of WaitUntilGpuIdle() should look like this:
// Note this is untested code, also I've gotta check the documentation to be sure I'm using the right flags...
LPDIRECT3DQUERY9 waitQuery = NULL;
if (device->CreateQuery(D3DQUERYTYPE_EVENT, &waitQuery) == S_OK && waitQuery != NULL)
{
    BOOL done = FALSE;
    waitQuery->Issue(D3DISSUE_END);
    // GetData returns S_OK once the event has been signalled and S_FALSE while it's still pending;
    // D3DGETDATA_FLUSH makes sure the command buffer actually gets submitted.
    // Exit on failure to avoid an infinite loop.
    while (!done &&
           !FAILED(waitQuery->GetData(&done, sizeof(BOOL), D3DGETDATA_FLUSH)))
    {
    }
    waitQuery->Release();
}
7 Replies
- jherico (Adventurer)
"tlopes" wrote:
Since you guys are rendering a 2D grid for the Distortion Mesh, I'm almost entirely certain that you could be using a TRIANGLESTRIP (8194 indices required, not counting strip/row restarts) rather than a TRIANGLELIST (24576 indices required). This saves a massive amount of index buffer space and is likely faster to render the distortion mesh.
They may be trying to minimize the difference between OpenGL and DirectX. On GL you can't use glPrimitiveRestartIndex unless you're working with 3.1 or higher.
"tlopes" wrote:
How did you guys arrive at the conclusion that 8192 triangles per eye was an optimal number for the distortion renderer? Wouldn't larger numbers of triangles result in lower linear interpolation stretch error? Did it just hit the point of diminishing returns as 8192 "looks good enough"?
LibOVR/Src/Util/Util_Render_Stereo.cpp contains some text about this
//-----------------------------------------------------------------------------------
// ***** Distortion Mesh Rendering
// Pow2 for the Morton order to work!
// 4 is too low - it is easy to see the "wobbles" in the HMD.
// 5 is realllly close but you can see pixel differences with even/odd frame checking.
// 6 is indistinguishable on a monitor on even/odd frames.
static const int DMA_GridSizeLog2 = 6;
static const int DMA_GridSize = 1<<DMA_GridSizeLog2;
static const int DMA_NumVertsPerEye = (DMA_GridSize+1)*(DMA_GridSize+1);
static const int DMA_NumTrisPerEye = (DMA_GridSize)*(DMA_GridSize)*2;
It looks like they tested by alternating frames between a per-pixel calculation and the mesh calculation, and found the two indistinguishable at a grid size of 2^6, i.e. a 64x64 grid of cells (65x65 vertices), which comes out to 8192 triangles per eye. The code in that file also shows that they alternate the direction of the triangles depending on the quadrant being rendered, to minimize how far the interpolated mesh values drift from the exact distortion values between vertices, i.e. the long edge of each triangle stays orthogonal to the distortion radius.
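Just to make the quadrant idea concrete, here's a toy index generator along those lines. This is my own sketch, not the SDK's code (the real thing also walks the cells in Morton order, and presumably measures the quadrant from the lens centre rather than the grid centre as I do here):
#include <vector>
#include <cstdint>

std::vector<uint16_t> BuildDistortionIndices(int gridSize)
{
    std::vector<uint16_t> indices;
    indices.reserve(gridSize * gridSize * 6);
    const int stride = gridSize + 1; // vertices per row

    for (int y = 0; y < gridSize; ++y)
    {
        for (int x = 0; x < gridSize; ++x)
        {
            uint16_t i00 = (uint16_t)(y * stride + x); // top-left corner of the cell
            uint16_t i10 = (uint16_t)(i00 + 1);
            uint16_t i01 = (uint16_t)(i00 + stride);
            uint16_t i11 = (uint16_t)(i01 + 1);

            // In the upper-left and lower-right quadrants the radius runs roughly
            // along the (1,1) direction, so the i10->i01 diagonal is the one
            // orthogonal to it; in the other two quadrants it's i00->i11.
            bool leftHalf  = x < gridSize / 2;
            bool upperHalf = y < gridSize / 2;
            if (leftHalf == upperHalf)
            {
                const uint16_t tris[6] = { i00, i10, i01,  i10, i11, i01 };
                indices.insert(indices.end(), tris, tris + 6);
            }
            else
            {
                const uint16_t tris[6] = { i00, i10, i11,  i00, i11, i01 };
                indices.insert(indices.end(), tris, tris + 6);
            }
        }
    }
    return indices; // gridSize = 64 -> 8192 triangles / 24576 indices per eye
}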
Interesting stuff.
- tlopes (Honored Guest)
"jherico" wrote:
That's true for D3D as well. Strip restarts are an undocumented feature (that is not supposed to be used) in D3D8 and 9, and they're a real feature in D3D10 and 11. But even without official strip restarts, you can get nice triangle-stripping out of a grid. Even if you have to use degenerate triangles it's much more efficient to use a strip than a list:"tlopes" wrote:
Since you guys are rendering a 2D grid for the Distortion Mesh, I'm almost entirely certain that you could be using a TRIANGLESTRIP (8194 indices required, not counting strip/row restarts) rather than a TRIANGLELIST (24576 indices required). This saves a massive amount of index buffer space and is likely faster to render the distortion mesh.
They may be trying to minimize the difference between OpenGL and DirectX. On GL you can't use glPrimitiveRestartIndex unless you're working with 3.1 or higher.
One perfect strip 8192 polygons long: 8192 + 2 indices (actually I think this may be 8191 + 2, but still...)
Strip + two degenerate polygons to stitch every row together: 8192 + 2 + 63 * 2 = 8318 indices
Triangle list 8192 polygons long: 8192 * 3 = 24576 indices"jherico" wrote:
"tlopes" wrote:
How did you guys arrive at the conclusion that 8192 triangles per eye was an optimal number for the distortion renderer? Wouldn't larger numbers of triangles result in lower linear interpolation stretch error? Did it just hit the point of diminishing returns as 8192 "looks good enough"?
LibOVR/Src/Util/Util_Render_Stereo.cpp contains some text about this
//-----------------------------------------------------------------------------------
// ***** Distortion Mesh Rendering
// Pow2 for the Morton order to work!
// 4 is too low - it is easy to see the "wobbles" in the HMD.
// 5 is realllly close but you can see pixel differences with even/odd frame checking.
// 6 is indistinguishable on a monitor on even/odd frames.
static const int DMA_GridSizeLog2 = 6;
static const int DMA_GridSize = 1<<DMA_GridSizeLog2;
static const int DMA_NumVertsPerEye = (DMA_GridSize+1)*(DMA_GridSize+1);
static const int DMA_NumTrisPerEye = (DMA_GridSize)*(DMA_GridSize)*2;
It looks like they tested by alternating frames between a per-pixel calculation and the mesh calculation, and found the two indistinguishable at a grid size of 2^6, i.e. a 64x64 grid of cells (65x65 vertices), which comes out to 8192 triangles per eye. The code in that file also shows that they alternate the direction of the triangles depending on the quadrant being rendered, to minimize how far the interpolated mesh values drift from the exact distortion values between vertices, i.e. the long edge of each triangle stays orthogonal to the distortion radius.
Interesting stuff.
That's a very cool find, I'm reading into that now. Thanks! :)
- jherico (Adventurer)
"tlopes" wrote:
Even if you have to use degenerate triangles it's much more efficient to use a strip than a list:
One perfect strip 8192 polygons long: 8192 + 2 indices (actually I think this may be 8191 + 2, but still...)
I'm super-wary of degenerate triangles, since I've seen instances where they would choke the OpenGL rendering pipeline on some drivers, causing an entire triangle strip to disappear, or only the triangles before the degenerate to be rendered. Of course this was about 10 years ago, so maybe drivers aren't as wonky anymore.
In general, I agree that rendering with strips would be a better idea.
- bcoyle (Adventurer)
I just tried the Unity integration from v0.3.1 - I'm not sure if it's just me, but I get really strange warping of the image when I look left/right, and even stranger warping/squishing when I tilt my head while looking straight ahead, left, or right. Any ideas how I might remedy this?
Loading the older Unity integration fixes it.
"tlopes" wrote:
Even if you have to use degenerate triangles it's much more efficient to use a strip than a list:
It's true the list version will have more indices. However, a well-optimised triangle list will use the vertex cache far better than a strip, resulting in fewer vertex shader invocations.
Here's an article on grid rendering: http://www.ludicon.com/castano/blog/2009/02/optimal-grid-rendering
From there (ACMR = average cache miss ratio, ATVR = average transform to vertex ratio):
Method           ACMR   ATVR
Scanline         1.062  1.882
NVTriStrip       0.818  1.450
Morton           0.719  1.273
K-Cache-Reorder  0.711  1.260
Hilbert          0.699  1.239
Forsyth          0.666  1.180
Tipsy            0.658  1.166
Optimal          0.564  1.000
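If you want to sanity-check an index ordering yourself, a rough way to estimate ACMR is to simulate a small FIFO post-transform cache and count the misses per triangle (the cache size here is just a guess, real GPUs vary):
#include <algorithm>
#include <cstdint>
#include <deque>
#include <vector>

double ComputeACMR(const std::vector<uint16_t>& indices, size_t cacheSize = 16)
{
    std::deque<uint16_t> cache;
    size_t misses = 0;
    for (uint16_t idx : indices)
    {
        if (std::find(cache.begin(), cache.end(), idx) == cache.end())
        {
            ++misses;                  // vertex not in the cache -> it has to be transformed
            cache.push_back(idx);
            if (cache.size() > cacheSize)
                cache.pop_front();     // FIFO eviction
        }
    }
    return double(misses) / double(indices.size() / 3); // transformed vertices per triangle
}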
The Forsyth method in that table (third best) is by Tom Forsyth, who works at Oculus now. :)
http://home.comcast.net/~tom_forsyth/papers/fast_vert_cache_opt.html
- RayJay (Honored Guest)
"bcoyle" wrote:
I just tried the Unity integration from v0.3.1 - I'm not sure if it's just me but I get really strange warping of the image when I look left/right, and even stranger warping/squishing when I tilt my head looking straight, left/right. Any ideas how I might remedy this?
Loading the older Unity integration fixes it.
That's the first thing I noticed after I downloaded the SDK yesterday; even after running the new OculusConfigUtil and updating the settings, it is still out of whack in the Unity demo. The Tuscany C++ WorldDemo warps correctly.
To be honest, I don't like the new warping method yet; I get better results with the old pixel shader and an increased back buffer.
If I increase the pixel density (back buffer) in the Tuscany C++ demo to get less blur, the warping falls apart; above 1.5 the warp grid explodes into a mess of triangles. :?
- tomf (Explorer)
"tlopes" wrote:
Since you guys are rendering a 2D grid for the Distortion Mesh, I'm almost entirely certain that you could be using a TRIANGLESTRIP (8194 indices required, not counting strip/row restarts) rather than a TRIANGLELIST (24576 indices required). This saves a massive amount of index buffer space and is likely faster to render the distortion mesh.
We render the triangles in a Morton (zig-zag) ordering to keep the texture caches warm. This is extremely important - using a naive "scanline" ordering (which would be strip-friendly) adds 25% to the time taken to render the distortion. The Morton order also helps the vertex cache reuse as noted, though it's probably not a big factor in the performance.
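"Morton order" here means walking the grid cells along a Z-order curve, which you get by interleaving the bits of the x and y cell coordinates. Here is a generic sketch of the technique (not the SDK's code); it only works for power-of-two grid sizes, hence the "Pow2 for the Morton order to work!" comment quoted above:
#include <cstdint>

// Split a Morton index into its x (even bits) and y (odd bits) coordinates.
static void MortonToXY(uint32_t morton, uint32_t& x, uint32_t& y)
{
    x = 0;
    y = 0;
    for (uint32_t bit = 0; bit < 16; ++bit)
    {
        x |= ((morton >> (2 * bit))     & 1u) << bit;
        y |= ((morton >> (2 * bit + 1)) & 1u) << bit;
    }
}

// Usage: visit every cell of a 64x64 grid in Z-curve order, emitting each
// cell's two triangles as you go so consecutive triangles stay spatially
// close (which is what keeps the texture cache warm).
// for (uint32_t i = 0; i < 64 * 64; ++i)
// {
//     uint32_t cx, cy;
//     MortonToXY(i, cx, cy);
//     // ...emit the two triangles for cell (cx, cy)...
// }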