Forum Discussion
msat
13 years ago · Honored Guest
Partially/fully pre-rendered ray tracing...
I'm sure you're all aware of the image quality that can be had from ray tracing. While pulling it off with impressive results in real time is becoming more of a reality, it's still some way off, and even then it's not quite on par with the quality we see in big-budget movies today. But what if we could view pre-rendered scenes with limited 6DOF and in 3D? Well, that's the whole purpose of this post! :D
I have been interested in the concept of pre-rendered still and animated stereoscopic "panoramas" for the past several months now, and have been thinking about how it could be accomplished. Well, I figured I would share what has been on my mind. It's my understanding that there are no implementations of panorama viewers that allow for both 6DOF and stereoscopic viewing. As far as I know, the method I'm going to describe has not been done before, and while I unfortunately don't have the skills to implement an example myself, I hope someone might find these thoughts interesting and useful enough to give it a shot. :)
Off the top of my head, the applications where this could be useful range from something as simple as sitting on an extremely detailed beach, to interactive media with limited or no real-time dynamic visuals such as certain types of adventure games, to being a "fly on the wall" in a Pixar movie. The primary benefit is the quality of visuals that can be achieved without having to render in real time. Well, at least not having to render the entire scene in real time.
The most appropriate and descriptive name for the method that I can think of is 'light-field cube ray mapping'. Maybe that sounds like nonsense, but bear with me for a moment. Let's say you wish to view a scene from the vantage point of a person sitting on a stool in the middle of a CG room. Now imagine enclosing that person's head in a virtual glass box that's big enough to allow for comfortable but limited head movement in all directions (rotation and position). This virtual box forms the basis of both the light-field cube camera during the pre-rendering of the scene, as well as the area where we will need to perform ray lookups at run-time.
There's no single way that pre-rendering the scene and capturing the data into the light-field cube has to be done, but I'll describe the way I had in mind. Each face of the cube contains a finite array of elements somewhat similar to pixels, but instead of recording just a single color value, each element captures the angular data of the light rays entering it as well as their color information. Each element would most likely need to be able to "capture" more than one ray (though in practice it may sometimes capture none). What you end up with is 6 faces of the cube (you don't necessarily have to do all 6) with all the various light rays that entered during pre-rendering mapped across the array of surface elements (ray maps). I just want to point out that one consequence of this approach is that the ray-tracing engine for the pre-render phase would need to start from the light sources, rather than the more common method of starting from the camera and tracing back to the source. As you can probably imagine, capturing video would essentially create constantly changing ray maps.
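To make that a little more concrete, here's a rough Python sketch of what one face's ray map might store during capture. Everything here (the class name, the bin counts, the local-frame convention) is made up purely to illustrate the idea:

```python
import math
from collections import defaultdict

class RayMapFace:
    """One face of the light-field cube: a grid of elements, each of which
    bins the rays (direction + color) that crossed it during pre-rendering."""

    def __init__(self, resolution, angular_bins=32):
        self.resolution = resolution        # elements per side of the face
        self.angular_bins = angular_bins    # how finely incoming directions are bucketed
        # (element_x, element_y, theta_bin, phi_bin) -> list of RGB colors
        self.bins = defaultdict(list)

    def record_ray(self, u, v, direction, color):
        """Store a ray that crossed the face at coordinates (u, v) in [0, 1)^2.
        'direction' is a normalized 3-vector in the face's local frame,
        with +z pointing into the cube."""
        ex = min(int(u * self.resolution), self.resolution - 1)
        ey = min(int(v * self.resolution), self.resolution - 1)
        # Bucket the incoming direction by its spherical angles.
        theta = math.acos(max(-1.0, min(1.0, direction[2])))  # angle from the face normal
        phi = math.atan2(direction[1], direction[0]) % (2 * math.pi)
        tb = min(int(theta / (math.pi / 2) * self.angular_bins), self.angular_bins - 1)
        pb = int(phi / (2 * math.pi) * self.angular_bins) % self.angular_bins
        self.bins[(ex, ey, tb, pb)].append(color)
```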
In order to view the scene at run-time, we start with a typical ray tracer, where a ray extends outward from the camera viewport, but it only ever intersects a single object - an element on the inside surface of the cube - and performs a lookup for the stored ray of the matching angle. The performance of this method will depend heavily on the efficiency of the lookup algorithm, but an optimized system should be substantially faster than a typical ray tracer for a given level of detail. Of course, the drawback is that dynamically drawn elements are pretty much impossible unless you also incorporate aspects of a traditional 3D engine to render certain elements in real time.
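And the run-time side of the idea, again only as an illustrative sketch: shoot a ray from the eye, find which cube face element it exits through, then ask that face's ray map for the stored ray whose direction best matches. The `lookup_nearest` call is hypothetical; it stands in for whatever angular lookup or interpolation scheme you settle on.

```python
def intersect_unit_cube(origin, direction):
    """Return ((axis, sign), u, v): which face of the axis-aligned cube
    [-1, 1]^3 a ray leaving 'origin' (inside the cube) exits through,
    and where on that face it lands, as (u, v) in [0, 1]^2."""
    best_t, hit = float("inf"), None
    for axis in range(3):
        for sign in (-1.0, 1.0):
            d = direction[axis]
            if abs(d) < 1e-9:
                continue
            t = (sign - origin[axis]) / d
            if t <= 0.0 or t >= best_t:
                continue
            p = [origin[i] + t * direction[i] for i in range(3)]
            a, b = [i for i in range(3) if i != axis]
            if abs(p[a]) <= 1.0 and abs(p[b]) <= 1.0:
                best_t = t
                hit = ((axis, sign), (p[a] + 1.0) / 2.0, (p[b] + 1.0) / 2.0)
    return hit

def shade_eye_ray(face_maps, origin, direction):
    """Color one eye ray by looking up the closest pre-rendered ray.
    'face_maps' maps (axis, sign) -> a ray-map object; 'lookup_nearest' is a
    hypothetical method (conversion to the face's local frame is omitted)."""
    face_id, u, v = intersect_unit_cube(origin, direction)
    return face_maps[face_id].lookup_nearest(u, v, direction)
```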
You can take this concept a step further and fill an entire scene (or at least the areas the viewpoint can reach) with light-field cube cameras, and traverse from one light-field cube to the next in real time. This would also make rendering dynamic elements more feasible.
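If the cubes tile the scene on a regular grid, working out which cube the viewer is currently inside is just an index calculation each frame (a hypothetical sketch; `grid_origin` and `cube_size` are whatever your scene layout uses):

```python
def active_cube_index(eye_pos, grid_origin, cube_size):
    """Return the (i, j, k) grid index of the light-field cube containing
    the viewpoint, assuming the cubes tile space on a regular grid."""
    return tuple(int((eye_pos[i] - grid_origin[i]) // cube_size) for i in range(3))
```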
For content that wouldn't be affected by these limitations, the visual quality it could produce at a given performance level might be hard to achieve any other way.
25 Replies
- cybereality (Grand Champion): Interesting concept. Not sure how feasible it is, but it sounds cool in theory.
- usb420 (Honored Guest): So you wouldn't be able to move through the environment, only look around?
That sounds similar to Google Maps. Couldn't you just render the view through a fisheye lens, so you wouldn't need a lookup table? Or am I missing something?
- jojon (Honored Guest):
"usb420" wrote:
So you wouldn't be able to move through the environment, only look around?
That sounds similar to Google Maps. Couldn't you just render the view through a fisheye lens, so you wouldn't need a lookup table? Or am I missing something?
I believe it's a matter of getting proper stereoscopic views from any viewing angle, out of one panoramic image --think an environment cube map, only the image on each face is a light field, instead of the usual flat projection. This would allow him to calculate an image for the offset of each eye, "after the fact", so to speak. (EDIT: ...one could probably save on file size, by culling anything that falls outside a certain window.)
The OP might want to try some real-world experiments with one of these: https://www.lytro.com/camera, before going through all the work of adapting renderers... :7
- msat (Honored Guest):
"usb420" wrote:
So you wouldn't be able to move through the environment, only look around?
That sounds similar to Google Maps. Couldn't you just render the view through a fisheye lens, so you wouldn't need a lookup table? Or am I missing something?
You would only be able to move within the confines of the cube, but as I mentioned, you could have multiple cubes butted up against each other, so you could move from one cube to the next. Theoretically, you could fill the entire area with these cubes if you wanted, which in essence is a pre-rendered ray-traced scene (though of course the result would be data-intensive). As jojon pointed out, it allows for real stereoscopic viewing at any position and angle within the cube, which standard panoramas cannot do. Again, it's useful for content that can work within the constraints of the method, and is not meant to replace every other rendering technique.
@jojon
It was the Lytro camera that first brought the concept of light fields to my attention (even though the concept itself is quite old!) :) Rudimentary tests could be performed with one, but it has quite low spatial resolution. Might work if you're simulating a head the size of an ant's.
I didn't mention it as my post was verging on tl;dr territory already, but with suitable light-field cameras, real scenes could also be captured. Technically, when all is said and done, they're really no different from pre-rendered scenes except in the way they're captured. The content is still viewed the same way.
I agree that certain data could be culled (such as rays with really small angles relative to the surfaces of the cube walls). Compression is also a must - particularly if the scene is animated or you have a high cube count.
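In code terms, that culling test could be as simple as thresholding the angle between a ray and the face it crosses. Just a sketch; the 5-degree threshold is an arbitrary number picked for illustration:

```python
import math

def keep_ray(direction, face_normal, min_grazing_deg=5.0):
    """Discard rays that cross a cube face at a near-grazing angle, since
    they contribute little to views inside the cube. Both vectors are
    assumed normalized."""
    cos_to_normal = abs(sum(d * n for d, n in zip(direction, face_normal)))
    angle_to_face = 90.0 - math.degrees(math.acos(min(1.0, cos_to_normal)))
    return angle_to_face >= min_grazing_deg
```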
"jojon" wrote:
"usb420" wrote:
I believe it's a matter of getting proper stereoscopic views from any viewing angle, out of one panoramic image --think an environment cube map, only the image on each face is a light field, instead of the usual flat projection. This would allow him to calculate an image for the offset of each eye, "after the fact", so to speak. (EDIT: ...one could probably save on file size, by culling anything that falls outside a certain window.)
I agree that pre-rendered or real imagery is interesting in VR.
What if you were to just render the stereo views? Something like this:
http://www.stereomaker.net/panorama/ana/cherry01.htm
I'm just trying to understand the benefits of using the light-field method.
- msat (Honored Guest):
"usb420" wrote:
"jojon" wrote:
"usb420" wrote:
I believe it's a matter of getting proper stereoscopic views from any viewing angle, out of one panoramic image --think an environment cube map, only the image on each face is a light field, instead of the usual flat projection. This would allow him to calculate an image for the offset of each eye, "after the fact", so to speak. (EDIT: ...one could probably save on file size, by culling anything that falls outside a certain window.)
I agree that pre-rendered or real imagery is interesting in VR.
What if you were to just render the stereo views? Something like this:
http://www.stereomaker.net/panorama/ana/cherry01.htm
I'm just trying to understand the benefits of using the light-field method.
Because at best those solutions can only offer 3DOF (if that). 6DOF is not possible, or at least not in a practical way.
- geekmaster (Protege):
"msat" wrote:
It's my understanding that there are no implementations of panorama viewers that allow for both 6DOF and stereoscopic viewing. As far as I know, the method I'm going to describe has not been done before, and while I unfortunately don't have the skills to implement an example myself, I hope someone might find these thoughts interesting and useful enough to give it a shot. :)
You can film 6DOF video with 12 cameras (minimum), but you need to interpolate between any adjacent three of them to provide all of roll, pitch, and yaw. You can even provide for small amounts of translation by interpolating on the x/y/z axes, limited by your inter-camera distance. Stereoscopic views are created from the overlapping portions of the right side of a left-most camera and the left side of a right-most camera, with camera selection determined by viewpoint. Using triangular overlap from three cameras allows you to fill in (or remove) occlusions for each eye. I am not aware of any specific implementations, but I have put plenty of thought into it and I am certain that I could accomplish this given sufficient time and resources. But based on past experience, somebody I am not yet aware of has probably already done this or is working on it now.
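One way to picture that three-camera interpolation (this is only a sketch of the general idea, not any particular rig's implementation): weight the three surrounding cameras by the viewpoint's barycentric coordinates on the triangle they form, then blend their overlapping pixels with those weights.

```python
def barycentric_weights(p, a, b, c):
    """Barycentric weights of viewpoint 'p' relative to the triangle formed by
    camera positions a, b, c (p is effectively projected onto the triangle's
    plane; the triangle is assumed non-degenerate)."""
    def sub(u, v): return [u[i] - v[i] for i in range(3)]
    def dot(u, v): return sum(u[i] * v[i] for i in range(3))
    v0, v1, v2 = sub(b, a), sub(c, a), sub(p, a)
    d00, d01, d11 = dot(v0, v0), dot(v0, v1), dot(v1, v1)
    d20, d21 = dot(v2, v0), dot(v2, v1)
    denom = d00 * d11 - d01 * d01
    w_b = (d11 * d20 - d01 * d21) / denom
    w_c = (d00 * d21 - d01 * d20) / denom
    return 1.0 - w_b - w_c, w_b, w_c

def blend_pixel(colors, weights):
    """Blend the same pixel taken from the three cameras' images."""
    return tuple(sum(w * c[i] for w, c in zip(weights, colors)) for i in range(3))
```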
Regarding light-field rendering, that sounds very interesting. Light-field cameras typically employ many more "cameras" (usually implemented as a multi-facet "fly-eye" lens over a large image sensor).
My experience with pre-rendered ray tracing was only to give surface color to voxel-based graphics. I love using pre-rendered ray-traced voxel objects, but they lack realism if they get too close to a distinct neighboring object that does not get reflected or refracted by them. However, for non-reflective and non-transparent textures, where you can adjust their surface color a bit, they can look spectacular, depending on how you apply them. The Euclideon stuff keeps fascinating me, but it will never achieve true realism due to its lack of environmental responsiveness.
It is nice to see others who share common interests, and who provide extra interesting ideas such as light-field rendering. :D
EDIT: Actually, you can use six cameras to film a "cube map" image, or you can even get away with as few as two cameras if they have 180-degree fisheye lenses on them. But that limits resolution in the overlap areas. More cameras also allow you to extract stereoscopic information from the outer areas of non-adjacent cameras.
- Pyry (Honored Guest): I've been working on exactly this, except with a spherical light-field surface rather than a cube. The points on the sphere are sampled by just placing and rendering normal cameras around the surface of a sphere (a normal perspective camera samples angularly, so by placing normal cameras in space you can sample the light field both spatially and angularly). The benefit of using this representation is that the view-synthesis side of things can be written efficiently into shaders, even with a large number of views.
Then, as you say, viewpoints within the sphere can be synthesized. The difficulty is that you need a lot of data to get reasonable results.
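For anyone trying to picture the basic lookup, here is a very rough CPU-side sketch of one way such view synthesis could work (an illustrative guess, not the actual shader implementation): extend each output ray to the sphere, find the sampled cameras nearest the exit point, and blend what they recorded along that direction. The `cam.sample(direction)` call and the camera objects are hypothetical.

```python
import math

def dist(a, b):
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(3)))

def ray_sphere_exit(origin, direction, radius):
    """Point where a ray starting inside the sphere (|origin| < radius)
    exits it; 'direction' is assumed normalized."""
    b = sum(o * d for o, d in zip(origin, direction))
    c = sum(o * o for o in origin) - radius * radius
    t = -b + math.sqrt(b * b - c)   # positive root, since the origin is inside
    return [o + t * d for o, d in zip(origin, direction)]

def synthesize_ray(origin, direction, cameras, radius, k=3):
    """Blend the k sampled cameras nearest the ray's exit point on the sphere.
    'cameras' is a hypothetical list of objects with .position and .sample()."""
    exit_point = ray_sphere_exit(origin, direction, radius)
    nearest = sorted(cameras, key=lambda cam: dist(cam.position, exit_point))[:k]
    weights = [1.0 / (dist(cam.position, exit_point) + 1e-6) for cam in nearest]
    total = sum(weights)
    color = [0.0, 0.0, 0.0]
    for cam, w in zip(nearest, weights):
        sample = cam.sample(direction)   # hypothetical: the pixel this camera
        for i in range(3):               # recorded along this direction
            color[i] += (w / total) * sample[i]
    return color
```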
Here's an example of rendering with 600 sampled images: http://www.youtube.com/watch?v=mPIcP5AW1J0
If you have some information about the scene geometry, such as a coarse sphere map of the depth, you can do better.
Here's 320 views + depth information: http://www.youtube.com/watch?v=zAeNJ3WBc1w
80 views + depth information: http://www.youtube.com/watch?v=Q_GnSh_AVYQ
Note that the depthless version will work for any possible scene, including scenes with reflections, semi-transparent materials, and refractive materials. The version that uses depth information will have artifacts on those types of surfaces, but is much better looking on diffuse surfaces.
Also note that you can roll the camera as well; I just didn't show it off in the videos.
- mdk (Honored Guest): Maybe a bit off-topic, but are you familiar with the Brigade game engine? Brigade does real-time path tracing.
It's pretty impressive what it can do, but it's still not quite there yet. I've been thinking that in VR a path-tracing solution might work a bit better if we had eye tracking. That way it's possible to pump more samples into the areas the player is actually looking at. Brigade is also unbiased, meaning that it doesn't fake things much. I would rather have something that was 90% as good, but biased and way faster. Maybe in a few years, when Nvidia launches its Volta series cards, we might have enough horsepower to run things like this without noise.
http://raytracey.blogspot.co.nz/
http://www.youtube.com/watch?feature=player_embedded&v=VtLuStcTRXQ#!
- msat (Honored Guest): First of all, great work, Pyry!! I seem to vaguely recall that scene, so maybe I had been aware of your work before and forgot? :oops:
I'm a bit surprised that view synthesis is more efficient on a spherical surface than on a flat one, particularly when the view is not at the center. I'll take your word for it though. :)
I imagine capturing the light field of a volume directly during the rendering process would be ideal, but you're just making do with the tools available to you, correct? I'm still trying to wrap my head around the concept and the inherent difficulties of using cameras to capture the light field like this. Do you set the projection plane for each capture to be along the surface of the sphere, and take progressive snapshots at varying FoV?
I'm curious about what you mean by 'depth information'. In what way is this information used for view synthesis? Are you calculating the light-field from the images in real-time, or did you compute a map ahead of time and use that for view synthesis? Also, what kind of frame rate are you achieving?
Lastly, do you have a blog or maybe an ongoing forum thread on this? Of course, feel free to discuss it here if you'd like. I'd love to keep up on your progress and perhaps pick your brain on the topic a little. This is fascinating stuff! Maybe I'm overestimating this technique, but I think it could have some major implications.
Cheers!