Monday, 24 September 2012

Microstutter Exposed - all hail 'Inside the second'

Earlier this year I was fortunate enough to pick up a Dell U3011. This wonderful monitor was great in every way except one – its huge resolution was simply too much for a single GTX 570 to handle, especially in Battlefield 3. Not being able to afford a 7970 or a GTX 670/680, I picked up a pair of watercooled GTX 480s. The 1.5GB of framebuffer and a combined 960 CUDA cores provide a massive amount of power for very little outlay, and once watercooled the beastly heat output is easily tamed.

With this setup (and a 2500K @ 4GHz) Battlefield 3 skips along at High at a steady 60fps (Adaptive VSync active), and can even be run at Ultra at 60fps 99% of the time. I know this because I always run the MSI Afterburner overlay whilst playing, and regularly run Fraps benchmarks.

Previously, playing at High, the game had been very smooth, with the occasional bit of random stutter, but nothing to detract from the (fantastic) gameplay, and average framerates were always 60fps (capped by VSync, remember). In Ultra mode the framerate again averages 60fps, but in a lot of scenarios it just doesn't feel smooth. With the very minor visual difference between High and Ultra, I was happy to settle for High.

That all changed last week with the release of the Armoured Kill DLC. Perhaps this is purely down to the size of the levels in the DLC, but performance at High has plummeted to what I consider unplayable. However, this was in no way reflected in the FPS. A 120s run on Bandar Desert (the worst offender) showed a Min/Max/Avg of 55/61/60. vRAM use stayed at around the 1.2GB level – so well within limits.

Clearly pure FPS benchmarks weren’t telling the whole story.

The basic premise is that there are 1000ms in 1 second, so at 60fps one frame is generated every 16.67ms (and, for example, at 30fps one frame is generated every 33.33ms).
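That conversion is just division, and it's handy to have it to hand when reading frametime graphs. A quick sketch (the function names are my own, not from any benchmarking tool):

```python
# Convert between frames-per-second and per-frame time in milliseconds.
# There are 1000 ms in a second, so the two are simply reciprocals.
def fps_to_frametime_ms(fps):
    return 1000.0 / fps

def frametime_ms_to_fps(ms):
    return 1000.0 / ms

print(round(fps_to_frametime_ms(60), 2))   # 16.67 ms per frame at 60fps
print(round(fps_to_frametime_ms(30), 2))   # 33.33 ms per frame at 30fps
print(frametime_ms_to_fps(25))             # a 25 ms frame is equivalent to 40.0 fps
```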

In a perfect setup, in one second 60 frames would be generated, one frame exactly every 16.67ms, and then delivered to the monitor. In real life this never happens, as some frames (large explosions for example) take longer than others to be drawn. If we take a 1 second snapshot of a game that is perceived by the user to be running smoothly, we might see 60 frames delivered in this time (60fps), with frametimes ranging between, say, 8ms and 25ms. These figures are within the limits of what we'd consider tolerable, as long as the larger frametimes don't occur too often: frames are being delivered with sufficient speed (even 25ms, the slowest frame, is still equivalent to 40fps) and with approximate regularity.

So, to summarise, for smooth gameplay we are now looking at two factors. 1 – how many frames are delivered per second, and 2 – the regularity of frame delivery.

In powerful single card scenarios this isn't usually a problem. With SLI, however, where two cards employ Alternate Frame Rendering to render those 60fps (one card does the odd frames, the other the even), we have an issue well known to most hardware enthusiasts – microstutter. This occurs because, for one reason or another, one of the cards takes a long time to render a frame, and the other card must wait for this to happen before it can deliver the next frame. Simple benchmarks will tell you that you are achieving 60fps, but because within a single second you may have to wait 50ms for one frame to render (equivalent to 20fps) and then have the next frame delivered within 8ms (equivalent to 125fps), what the user actually perceives is a giant lag while we wait for the 50ms frame, followed by a smooth patch while the 8ms frames render. If this happens enough (say, several times a second) we get a high average frame rate, but a stuttery visual experience.
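You can see the deception in the averages with a toy example. The numbers below are made up for illustration (55 fast frames plus 5 hitches, chosen so the total is exactly one second), not measured from my rig:

```python
# Hypothetical one-second capture: 55 fast frames (14 ms each) and
# 5 slow hitches (46 ms each). Total = 770 + 230 = 1000 ms, so a naive
# benchmark reports a flat 60 fps average.
frametimes_ms = [14.0] * 55 + [46.0] * 5

total_ms = sum(frametimes_ms)
avg_fps = len(frametimes_ms) / (total_ms / 1000.0)
worst_instantaneous_fps = 1000.0 / max(frametimes_ms)

print(avg_fps)                            # 60.0 - looks perfect on paper
print(round(worst_instantaneous_fps, 1))  # 21.7 - what the hitches actually feel like
```

The average says 60fps; the slowest frames say otherwise, and it's the slowest frames you notice.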

Let’s take a look at this in a way that is easier to digest. Below is a graph recording 120 seconds of gameplay on the Bandar Desert map. In this time the SLI 480s rendered 7141 frames – equivalent to 59.51fps (7141/120). Sounds like it would be smooth, doesn’t it?

It didn’t feel like it, and the graph shows us why. 

We can see that most of the frames are rendered in between about 15ms and 23ms – and in this range we have nice smooth gameplay thanks to 1. a high number of frames delivered per second and 2. fairly regular delivery of those frames. Looking closer, though, we have 574 frames that took longer than 25ms to render, 223 frames that took more than 30ms, 120 frames that took more than 35ms, and even 74 frames that took more than 40ms. This jitter in the regularity of frame delivery is what causes the choppy gameplay I’ve been experiencing, and for whatever reason, in the latest DLC, the issue is far more pronounced.
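If you want to pull the same threshold counts out of your own captures, it only takes a few lines. Here I generate synthetic per-frame times as a stand-in for a real log (Fraps can record per-frame data, which you'd load into the list instead of the `random.gauss` placeholder):

```python
import random

random.seed(1)

# Synthetic stand-in for a per-frame time log in milliseconds: 7141 frames
# clustered around ~17 ms with some spread. Replace this list with real
# logged frametimes for an actual analysis.
frametimes_ms = [max(5.0, random.gauss(17.0, 6.0)) for _ in range(7141)]

# Count how many frames exceeded each "slow frame" threshold.
for threshold in (25, 30, 35, 40):
    slow = sum(1 for t in frametimes_ms if t > threshold)
    print(f"frames over {threshold} ms: {slow}")
```

Counts at increasing thresholds shrink, and the shape of that tail – not the average – is what tells you whether a run will feel smooth.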

Ultimately, as Scott Wasson has shown, irregularity of frametime delivery is something none of us can escape. However, the best way to mitigate it is to invest in a single fast GPU rather than two mid-range cards – both setups might achieve 60fps, but only one will feel and look like 60fps.
