DLSS vs FSR vs XeSS Input Latency TESTED
Upscaling tech has become incredibly common to find in games – so much so that all three GPU manufacturers offer their own solutions to do just that. Generally speaking, NVIDIA is the front-runner here with their closed source, proprietary option called DLSS (Deep Learning Super Sampling). It’s in its third generation, with the big new feature being “Frame Generation”. I won’t bore you with the details, but in short it basically looks at the last frame and the next frame, then creates an intermediary frame to quite literally double your framerate. The catch is that it has to know what the next frame looks like, which means it has to hold back that new frame, generate its intermediary frame, display that, THEN display the actual new frame. That adds latency, and that’s what I want to test here. Now AMD does have their own upscaling tech, called FSR or FidelityFX Super Resolution, which is open source and works on any GPU – including the RTX 4090 I’m testing with here. It doesn’t use any type of AI magic, which helps make it more accessible, at the cost of some quality in certain scenarios. Intel too now offers XeSS, or Xe Super Sampling, which does use a neural network, but it too is open source.
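To make that hold-back cost a bit more concrete, here’s a minimal timing sketch in Python. It’s a toy model, not how DLSS actually schedules frames internally – the render time and interpolation cost are made-up numbers – but it shows why presenting an interpolated frame first pushes every real frame later onto the screen.

```python
# Toy model of why frame interpolation adds latency.
# RENDER_MS and INTERP_MS are made-up illustrative numbers, not DLSS measurements.

RENDER_MS = 10.0   # assumed time for the engine to render one real frame
INTERP_MS = 1.5    # assumed time to generate one intermediary frame

def native_present_times(n_frames):
    """Without Frame Generation: each real frame is shown as soon as it's rendered."""
    return [(i + 1) * RENDER_MS for i in range(n_frames)]

def framegen_present_times(n_frames):
    """With Frame Generation: real frame N+1 is held back - the intermediary
    frame (built from frames N and N+1) goes to the screen first, then the
    real frame N+1 follows roughly half a frame later to keep pacing even."""
    times = []
    for i in range(1, n_frames):
        rendered = (i + 1) * RENDER_MS             # real frame i+1 finishes here
        interp_shown = rendered + INTERP_MS        # intermediary frame shown first
        real_shown = interp_shown + RENDER_MS / 2  # real frame lands later than it would natively
        times.append((interp_shown, real_shown))
    return times

print("native present times (ms):     ", native_present_times(4))
print("frame-gen (interp, real) in ms:", framegen_present_times(4))
# Twice as many frames hit the screen, but each *real* frame - the only one
# that reflects your inputs - now appears later than in the native pipeline.
# That delay is the extra input latency Frame Generation brings.
```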
Before jumping into the results, I should explain why I don’t have any results for DLSS Frame Generation in Cyberpunk 2077. I’ll just play the clip – I think you’ll understand. Basically, with Frame Generation on, the game became unplayable. It seemed to render frames in small batches, then halt, then render another batch, and the whole game engine seemed to lock up too, as inputs were completely missed. If it’s not obvious, that meant I couldn’t get the game to register any clicks, and therefore couldn’t measure the latency. I don’t know whether this is a problem isolated to my test system; reinstalling both the game and the latest GeForce drivers didn’t fix it, so I can’t say for sure.
But for the results I could get with Cyberpunk, how did they fare? Well, at stock with no upscaling tech running, on my test system and using a Gigabyte M32Q at 165Hz, LDAT reported 35.76 ms of total system input latency. Sticking DLSS on Auto drops that down to 26.14 ms, and DLSS on Ultra Performance drops it further to 24.65 ms. AMD’s FSR on Auto pretty much matches DLSS on Auto at a slightly higher 26.7 ms, but interestingly, with FSR on Ultra Performance the latency actually increased to 30.38 ms. That’s still over 5 ms faster than with no upscaling tech at all, and for what it’s worth I am testing with an NVIDIA card. Still, those results might not be quite what you expected to see – I mean, the whole point of upscaling tech is that it takes a lower resolution frame, then upscales it once it’s finished rendering. That sounds an awful lot like added latency, and it is, but the key point is that rendering the native resolution frame takes longer than rendering the lower resolution frame and upscaling it combined.
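If that sounds counter-intuitive, a back-of-the-envelope sketch helps. The frame times below are purely illustrative placeholders, not measured values from Cyberpunk, but they show the trade-off: the upscaling pass adds time, yet the frame still finishes sooner overall as long as the render savings outweigh it.

```python
# Purely illustrative frame times - real values depend on the game, GPU and scene.
NATIVE_RENDER_MS = 9.0   # assumed cost to render a full native-resolution frame
LOWRES_RENDER_MS = 4.5   # assumed cost to render the lower internal-resolution frame
UPSCALE_MS = 1.0         # assumed cost of the upscaling pass itself

native_frame = NATIVE_RENDER_MS
upscaled_frame = LOWRES_RENDER_MS + UPSCALE_MS

print(f"native:            {native_frame:.1f} ms per frame")
print(f"render + upscale:  {upscaled_frame:.1f} ms per frame")
# The upscaling pass genuinely adds time, but as long as
# LOWRES_RENDER_MS + UPSCALE_MS < NATIVE_RENDER_MS the finished frame
# still reaches the screen sooner, so total input latency drops.
# If the upscaler itself is relatively expensive, the gain shrinks -
# possibly part of what's happening with FSR's Ultra Performance result here.
```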
So that’s Cyberpunk, but what about a game with all three technologies, including working Frame Generation? Look no further than Hitman 3! This one was a bit of a challenge to measure latency on; I ended up having to use my old methodology of a 1000 FPS camera and a mouse with an LED soldered to the left click switch, but I kept a consistent measurement pattern, firing the entire clip of the Silverballer each run. So, how did they perform? Well, rather interestingly, all except DLSS Frame Generation offered lower-than-stock latency, though some were more impressive than others. FSR on the Quality mode actually offered the best latency by far at just 45 ms, down from 60 ms without any upscaling. That’s a sizable improvement – albeit in a game that isn’t exactly a twitch shooter.
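For anyone wanting to try that camera method themselves, here’s a rough sketch of the analysis step in Python with OpenCV. It assumes you’ve already got a 1000 FPS recording and know roughly where the mouse LED and the on-screen reaction sit in the frame – the regions, thresholds and file name are all placeholders, not the exact values I use.

```python
import cv2

# Placeholder regions, thresholds and file name - tune these for your own recording.
LED_ROI = (slice(400, 420), slice(100, 120))     # y-range, x-range covering the mouse LED
SCREEN_ROI = (slice(200, 260), slice(600, 700))  # area where the on-screen reaction appears
LED_THRESHOLD = 200          # mean brightness that counts as "LED on" (click registered)
SCREEN_DELTA_THRESHOLD = 15  # brightness change that counts as "screen reacted"
CAMERA_FPS = 1000            # one frame per millisecond at 1000 FPS

def click_to_photon_ms(video_path):
    """Return the delay (ms) between the click LED lighting up and the screen reacting."""
    cap = cv2.VideoCapture(video_path)
    baseline = None
    led_frame = None
    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        led_level = gray[LED_ROI].mean()
        screen_level = gray[SCREEN_ROI].mean()
        if baseline is None:
            baseline = screen_level              # brightness before anything happens
        if led_frame is None and led_level > LED_THRESHOLD:
            led_frame = frame_idx                # first frame where the click LED is lit
        elif led_frame is not None and abs(screen_level - baseline) > SCREEN_DELTA_THRESHOLD:
            cap.release()
            return (frame_idx - led_frame) * 1000.0 / CAMERA_FPS
        frame_idx += 1
    cap.release()
    return None  # no on-screen reaction found

# Example (hypothetical file name):
# print(click_to_photon_ms("hitman_run_01.mp4"))
```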
Intel’s XeSS – which they do explicitly list as working ‘better on Intel GPUs’ – is the worst performer here for upscaling alone. The “Ultra Quality” mode, the equivalent of “Quality” on both FSR and DLSS, only drops the average latency by 3 ms, and the most performant mode, “Performance”, only drops it by another 1 ms. Compare that to DLSS, whose Quality mode ran 2.5 ms faster than XeSS Performance, with DLSS Ultra Performance a further 2 ms faster still. Interestingly, FSR’s “Ultra Performance” mode once again actually increased the latency compared to the Quality preset – still well below stock, XeSS and even DLSS’s Quality mode, but considerably slower than FSR Quality.
The really interesting one for me is DLSS Ultra Performance with Frame Generation enabled. This was a lot closer to stock than I was expecting – only 2 ms slower on average. Admittedly that is 10 ms slower than just running DLSS Ultra Performance on its own, but going from around 100 FPS at stock to over 200 FPS with it enabled certainly provides a smoother visual experience. There is one major catch to Frame Generation – beyond the latency disadvantage – which is that it isn’t input-aware. If you click your mouse to shoot, the generated frames won’t know, because they don’t come from the game engine; they are just interpolated frames between the last full frame and the next. That means the latency is likely to be higher, and it can be made worse depending on how the game engine handles inputs. It’s far from a perfect solution, for sure.
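To put rough numbers on that decoupling between frame rate and latency, using the approximate Hitman 3 figures above:

```python
# Back-of-the-envelope using the approximate Hitman 3 figures from the text above.
stock_fps, framegen_fps = 100, 200              # roughly 100 FPS stock -> over 200 FPS with Frame Generation
stock_latency_ms, framegen_latency_ms = 60, 62  # 60 ms stock, about 2 ms slower with Frame Generation

print(f"frame time:      {1000/stock_fps:.0f} ms -> {1000/framegen_fps:.0f} ms")
print(f"click-to-photon: {stock_latency_ms} ms -> {framegen_latency_ms} ms")
# The displayed frame time halves, so motion looks much smoother, but the
# click-to-photon latency ticks up slightly - the generated frames simply
# can't show an input the game engine hasn't rendered yet.
```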
Of course, neither of the games I could get working here is exactly latency-intensive, but they do echo results from testing I’ve done with older versions of DLSS, and I think they give a bit of an insight into how the different technologies operate. It’s clear that DLSS is a fast process: rendering at a lower resolution and upscaling took the same or less time than rendering at a higher one, which suggests the cost of the upscaling pass isn’t tied to how much upscaling it has to do. FSR, by comparison, seems to be a more intensive task, as in both titles the latency went up when using the more aggressive Ultra Performance mode. Still, seeing FSR Quality offer such a low result in Hitman – even on NVIDIA hardware – is incredibly promising considering how open their solution is by comparison. Intel’s XeSS didn’t fare quite as well, although as I mentioned, it does specify it’s for Intel GPUs, with a fallback option for AMD and NVIDIA cards, so its results here might not be its absolute best showing. But considering how many people actually have Intel Arc cards, this fallback path might very well be the experience you can expect.
As a final note, if you want to be able to test stuff like this yourself and you don’t happen to have NVIDIA’s LDAT tool, I’m building a fully open source latency testing tool to complement my open source response time tool. It’s still a little while off yet, but if that’s something you’d be interested in, head over to OSRTT.com and drop your email in the mailing list box at the bottom of the page. I don’t share your email with anyone, and you will only hear from me when I actually have something to share rather than some pointless weekly update.