Wednesday, April 23, 2014

vogl Windows port, new regression test system, new vogl_chroot repo

Windows Port Progress


John McDonald has officially begun the Windows port of vogl. The voglcore lib and voglgen (our code generator tool) are now running on Windows as of this morning!


New Regression Test System


vogl has a shiny new tracing/replaying/trimming regression and smoke test system written by Mike Sartain that runs the following steps on a library of traces:
  • Plays back either an apitrace or a vogl trace, captures its output using the libvogltrace SO, and records the backbuffer CRC's (or per-component checksums on traces with multisampling) to a text file.
  • Plays back this trace and diffs the backbuffer CRC's vs. the CRC's seen during tracing.
  • Finally, we trim the test trace, then play back the trimmed trace and compare the backbuffer CRC's vs. the original trace's CRC's. Trimming involves playing back the test trace up to a predetermined point, capturing the entire GL state vector to memory and serializing it out, so we get a lot of good coverage in this step.
The system is located in the test directory of vogl's chroot repository, here. The script that runs the test is run_tests.sh. (This little script actually compiles and launches a small .C file that contains the entire test system.) The file tests.json configures which traces are tested and the parameters to the various test steps. 

Currently, only our smallest traces (from the g-truc 3.x suite) are pushed up to vogl_chroot. We also have many GB's of game traces (please drop me a message if you would like these traces). It's pretty easy to add your own traces - I'll be documenting how on vogl's wiki this afternoon.

Here are some shots of it in action on our dual Xeon (20 core/40 HW thread) test machine, using "/run_tests.sh -j 6" - to spawn up to 6 parallel processes at a time vs. the default 4:



Interestingly, the limiting scaling factor on this system seems to primarily be GPU video memory, not raw CPU or GPU performance. Metro Last Light, TF2, and DotA2 each can use ~1 GB of VRAM (and we only have a 3GB 780 Ti on this system). We don't try to order the trace replay order in any particular way to optimize overall throughput, which would be a nice addition.

vogl_src and vogl_chroot repos

Thanks to Carl Worth (Intel OTC) and Sir Anthony for submitting some patches to help us break up the previously huge vogl repo into two smaller repos. The primary one on github contains only the (buildable) source:

And vogl_chroot is the optional portion we use internally to simplify building and testing vogl:

You don't strictly need vogl_chroot, but beware you'll need to manually figure out the build dependencies if you don't. Building both 32-bit and 64-bit vogl without using the chroot approach can be a huge pain due to sometimes unresolvable/obscure i386 vs. amd64 system dependency issues. (If you disagree, I claim you haven't tried to actually do it. And no, gcc-multilib is not enough.)

Next Steps


We've been supplied with more test traces from various teams working on titles that will be released later this year on Steam Linux. (Hey - if you're working on a new GL game or port, feel free to send us more traces!) Also, Rad Game Tools just provided us with a fresh drop of Bink video, which now supports using compute shaders to massively accelerate video decoding. I'll be adding support for its GL 4.x callstream next week.

Friday, April 4, 2014

vogl support for Unreal Engine 4

We're extremely excited that Epic is porting Unreal Engine 4 to Linux -- see the official announcement or some press here and here. Once we heard UE4 Linux was coming we pretty much dropped everything to ensure vogl can handle UE4 callstreams. The latest code on github now supports full-stream tracing/replaying and trimming of UE4 callstreams in either GL3 or GL4 mode. UI support for UE4 is still in the early stages, but now that we can snapshot/restore UE4 and continue to play back the callstream without diverging it's only matter of time before the UI comes up to speed.

UE4's OpenGL renderer is the most advanced we've worked with so far. It has provided us with valuable real-world test cases for several modern GPU features we've not had traces to validate our code against, such as compute shaders and cubemap arrays. We'll be making UE4 GL callstreams part of our regression test suite going forward.

Here are some shots of a trace of UE4's test game being replayed in voglreplay64's --interactive mode (which relies on state snapshotting/restoring):




Here's a trimmed trace loaded in the editor:


Known problems:

  • UI: Peter Lohrmann just added a dropdown that lets you select which context's state to view. This code is hot off the presses and is a bit fiddly at the moment. Also, UE4 uses several texture formats that the vogl UI can't display right now (LunarG is helping us fix this, see below.)
  • Snapshotting UE4 during tracing is currently unsupported (but snapshotting during replaying works), because the tracer can't snapshot state while any buffers are mapped. (We also have this problem with the Steam Big Picture renderer.) We have a fix in the works.
  • We're seeing several query related warnings/errors while snapshotting and replaying UE4 callstreams. (This problem is in vogl's replayer, not UE4.) These need to be investigated, but they don't seem to cause the replayer to diverge.
  • There are several "zombie" buffer objects that have been deleted on one context but remain bound on another, which causes the snapshot system to report handle remapping errors on these objects during snapshotting. These buffers don't appear to be actually referenced after they are deleted, so this doesn't cause the replay to diverge. We've got some ideas on how to improve vogl's handling of this scenario (which is unfortunately very easy to do by accident in GL).

Other news:

LunarG has provided us with the first drop of their universal OpenGL texture format converter/transformer module, which will be going open source soon. This module allows us to convert any type of OpenGL/KTX texture data to various canonical formats (such as 8-bit or float RGBA) in a driver independent manner, with the optional transforms we need to build a good texture/framebuffer viewer UI. The current vogl UI uses some temporary and very incomplete stand-in code to convert textures to formats Qt accepts, so we're really looking forward to switching to LunarG's solution.

Finally, John McDonald recently joined Valve and the SteamOS team and is currently getting up to speed on the vogl codebase.