Inigo Quilez

Intro



Over the years I've written a few demos and engines that work in VR (mostly in the mid 2000s and again now in the mid 2010s). Since it's a hot topic again, I have had to explain to different people, in different emails and to different degrees of detail, how I design my VR demos and engines. So I thought it might be a good idea to stop doing so and instead put down here some basic advice and pseudocode that make the points I usually try to get across. Now, this is how _I_ do VR, not necessarily how my employer or anybody other than me recommends it be done. Also, I won't be discussing advanced topics such as multi-GPU and multi-context rendering (obsolete since the death of SGI), late latching or foveated rendering.

Now, let's start!


Basics



Before I describe my engine design, some basic things you should take into account that you might miss if you come from a traditional engine design:



Design



OK, now we are ready to have a look at how I have been organizing my VR engines and demos so far:

    ComputeFrame( k )
    {
        DoProcessFrame( k );                      // Compute animation, sound, LOD computation, simulations
        DoRenderFrame( k );                       // Render reflection maps, clouds layers, LUTs, global shadow maps

        for( int j=0; j < numDisplays; j++ )      // numDisplays = 5 for a CAVE, 1 for HMD (RIFT, Vive)
        {
            DoProcessEye( k, j );                 // Do frustum culling, compile draw command/indirect buffers
            DoRenderEyeCommon( k, j );            // Render eye-shared shadow maps, and distant geometry with no parallax

            for( int i=0; i < 2; i++ )            // stereo
            {
                DoRenderEye( k, j, i );           // Render regular geometry and postpro effects with parallax
            }
        }
    }

This is the basic structure. The three contexts (frame, display and eye) are represented by k, j and i. The tasks that happen in each one of them are not totally rigid, and depend on the specifics of the target configuration. For example, in VR based on Head Mounted Displays, LOD or frustum culling can be done at frame granularity, while in a CAVE or other systems with multiple displays (disjoint frustums) it might be worth doing it at a per-display level. Also, depending on your engine, you might decide to do your culling on the GPU rather than the CPU. But overall, the structure of the pseudocode above is pretty useful.
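To make the nesting of the three contexts concrete, here is a minimal C++ sketch of the loop above. It is only an illustration, not the actual engine code: the `CallLog` type and the stub bodies are assumptions introduced here so that the order of the calls can be inspected, and `numDisplays` is passed as a parameter rather than read from engine state.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical call log, used only to make the execution order visible.
using CallLog = std::vector<std::string>;

// Stubs standing in for the real per-frame, per-display and per-eye work.
static void DoProcessFrame(CallLog& log, int k)
{
    log.push_back("ProcessFrame(" + std::to_string(k) + ")");
}
static void DoRenderFrame(CallLog& log, int k)
{
    log.push_back("RenderFrame(" + std::to_string(k) + ")");
}
static void DoProcessEye(CallLog& log, int k, int j)
{
    log.push_back("ProcessEye(" + std::to_string(k) + "," + std::to_string(j) + ")");
}
static void DoRenderEyeCommon(CallLog& log, int k, int j)
{
    log.push_back("RenderEyeCommon(" + std::to_string(k) + "," + std::to_string(j) + ")");
}
static void DoRenderEye(CallLog& log, int k, int j, int i)
{
    log.push_back("RenderEye(" + std::to_string(k) + "," + std::to_string(j) + "," +
                  std::to_string(i) + ")");
}

// Runs one frame k over numDisplays displays and returns the ordered call list.
CallLog ComputeFrame(int k, int numDisplays)
{
    assert(numDisplays > 0);
    CallLog log;
    DoProcessFrame(log, k);                    // once per frame
    DoRenderFrame(log, k);                     // once per frame
    for (int j = 0; j < numDisplays; j++)      // once per display
    {
        DoProcessEye(log, k, j);
        DoRenderEyeCommon(log, k, j);
        for (int i = 0; i < 2; i++)            // once per eye (stereo)
        {
            DoRenderEye(log, k, j, i);
        }
    }
    return log;
}
```

With this sketch, a single HMD (`numDisplays = 1`) issues 6 calls per frame and a five-wall CAVE issues 22, which shows how much work is saved by anything that can be moved from the eye level up to the display or frame level.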




You might be thinking that rendering objects at both the Display (j) level and the Eye (i) level could double your shader permutations, since to do proper lighting the shaders would have to pull data from different uniform blocks (one holding the camera position for the display, somewhere near the center of both eyes, and the other holding that of the actual eye). You can probably solve this with an extra uniform block that aliases one or the other.
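One possible reading of the aliasing idea is sketched below in plain C++, with no graphics API involved. Everything in it is an assumption made for illustration: the `CameraBlock` layout, the three-slot `CameraBuffer`, the `Bind` and `Current` helpers, and the simplification that the eyes are offset from the center along the x axis only. The point is that shaders always read from a single "camera" binding, and the engine decides per pass which block that binding refers to, so no extra permutation is needed.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

struct Vec3 { float x, y, z; };

// Hypothetical layout of the camera data a shader reads for lighting.
struct CameraBlock { Vec3 position; };

// Three camera blocks per display: the shared center and the two eyes.
enum CameraSlot : std::size_t { kCenter = 0, kLeftEye = 1, kRightEye = 2 };

struct CameraBuffer
{
    std::array<CameraBlock, 3> blocks;
    std::size_t bound = kCenter;  // block currently aliased by the one "camera" binding

    // Stand-in for rebinding a buffer range to the shader's binding point
    // (in OpenGL this role could be played by glBindBufferRange).
    void Bind(CameraSlot slot) { bound = slot; }

    // What every shader sees, regardless of which pass is running.
    const CameraBlock& Current() const { return blocks[bound]; }
};

// Builds the three blocks from a center position and an eye separation.
// Assumption: the eyes are displaced along the x axis only.
CameraBuffer MakeCameraBuffer(Vec3 center, float eyeSeparation)
{
    assert(eyeSeparation >= 0.0f);
    const float h = 0.5f * eyeSeparation;
    CameraBuffer buf;
    buf.blocks[kCenter]   = { center };
    buf.blocks[kLeftEye]  = { { center.x - h, center.y, center.z } };
    buf.blocks[kRightEye] = { { center.x + h, center.y, center.z } };
    return buf;
}
```

In terms of the loop above, the display-level pass (DoRenderEyeCommon) would call `Bind(kCenter)` before drawing, and each eye-level pass (DoRenderEye) would call `Bind(kLeftEye)` or `Bind(kRightEye)`, while the shaders themselves stay unchanged.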

Conclusions



I cannot go into much more detail in this article, but if you want to see the lower-level implementation details that make this work fast, check this article on fast stereo rendering. Otherwise, hopefully you still got a big-picture idea of the changes you might need to apply to your old mono-single-display-non-vr-boring-2d engine :)