Sunday, January 8, 2012

Is Your Gaming Laptop's RAM Slowing It Down?


We look at the effect of memory bandwidth and clock speed on gaming performance.

The idea of adding RAM to a system to “increase performance” is often misunderstood by the average person. Most think that if their seven-year-old Windows XP build is getting slow, doubling the RAM from 2GB to 4GB will speed it up. Any PC tech worth his Pringles knows that won’t do much for Windows XP performance. Generally, it’s very easy to hit the point of diminishing returns with system RAM. But there’s one bad pattern we’ve been seeing lately in many notebooks with integrated graphics: configuring the RAM for the minimum system bandwidth.
If you’re a browser jockey, that’s not a huge issue, but if you play games that lean on the graphics hardware, that configuration can hobble your performance. To see how bad the situation is, we took a typical modern notebook and measured the impact of system bandwidth on gaming. Read on.

A short history of integrated graphics

The issue at hand is how integrated graphics access RAM vs. a traditional discrete card. A discrete card in a notebook has its own dedicated pool of RAM. Besides offering far higher data rates from using GDDR5, a discrete GPU’s RAM usually runs at much higher speeds, since it is soldered directly to the board the GPU rides on and the wires run straight to a dedicated, very wide, high-speed memory controller. While the width of the memory controller and the speed of the RAM have changed, the basic design of discrete GPUs has pretty much stayed the same.
Integrated graphics, on the other hand, have changed quite a bit since their introduction. Initially “integrated graphics” meant the graphics core was a discrete chip, with its RAM soldered to the motherboard, connected to the CPU via PCI. The graphics core eventually moved directly into the core logic chipset itself, with chipsets from SiS and VIA and Intel’s 810 “Whitney.” Instead of relying on dedicated RAM on the motherboard, core logic chipset-based graphics mostly use main system memory, which is far cheaper to implement. We say mostly because both AMD and Nvidia have tried to ameliorate memory bandwidth and size issues by adding internal cache to the integrated graphics component. Those solutions have mostly been outside the mainstream, though. Integrated graphics have always been about making graphics as cheap as possible.
It doesn’t get any cheaper than integrating graphics directly into the CPU itself. This theoretically lowers the cost of the chipset and the overall cost of the system, and conserves power too. Intel’s Clarkdale/Clarksfield and Sandy Bridge CPUs take this approach, and so do AMD’s Llano and Brazos APUs. Despite this technological step forward, integrated graphics still suffer greatly in one area: memory bandwidth. A dual-channel DDR3/1333 setup, for example, offers a theoretical bandwidth of 21.3GB/s. Compare this to a stock-clocked GeForce GTX 560 Ti, which has 128GB/s of bandwidth on tap, and the top-end GeForce GTX 580, which takes it to 192.4GB/s. Mobile GPUs don’t offer quite the same amount of bandwidth, but the GeForce GTX 580M mobile part still moves 96GB/s. It’s not always the case, but generally discrete parts offer boatloads more memory bandwidth.
Memory bandwidth isn’t everything in the graphics equation, but it does matter quite a bit. So when we started seeing notebooks with integrated graphics and two memory slots, only one of them populated, we scratched our heads and wondered how much it hurt performance.
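The 21.3GB/s figure follows from simple arithmetic: transfers per second, times bytes per transfer, times the number of channels. Here is a minimal sketch of that calculation. The function name is our own, and it assumes the standard 64-bit (8-byte) data path per DDR3 channel and treats "DDR3/1333" as a nominal 1,333 million transfers per second.

```python
def peak_bandwidth_gbs(transfer_rate_mts, channels, bus_width_bits=64):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10^9 bytes).

    transfer_rate_mts: nominal data rate in mega-transfers per second
                       (1333 for DDR3/1333, 1866 for DDR3/1866).
    channels:          number of populated memory channels (1 or 2 here).
    bus_width_bits:    width of one channel; 64 bits is assumed for DDR3.
    """
    bytes_per_transfer = bus_width_bits / 8
    bytes_per_second = transfer_rate_mts * 1e6 * bytes_per_transfer * channels
    return bytes_per_second / 1e9


if __name__ == "__main__":
    for label, rate, channels in [
        ("Single-channel DDR3/1333", 1333, 1),
        ("Dual-channel DDR3/1333", 1333, 2),
        ("Dual-channel DDR3/1866", 1866, 2),
    ]:
        print(f"{label}: {peak_bandwidth_gbs(rate, channels):.1f} GB/s")
```

These are ceilings, not measurements; real-world throughput lands below them.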
To find out, we took a Toshiba Portege R830, which was equipped with two SO-DIMM slots but only one Samsung 4GB DDR3/1333 SO-DIMM running in single-channel mode. Making our test even more interesting, the notebook oddly was running 32-bit Windows 7 Professional, so it couldn’t address more than about 3.5GB anyway. We ran the Portege in three different configurations. The first was the stock 4GB of single-channel DDR3/1333. The second was a standard Corsair DDR3/1333 kit of two 4GB SO-DIMMs in dual-channel mode. The third was a newer arrival in notebooks: overclocked modules. Unlike performance desktops, which give you control over the frequency your RAM runs at, the vast majority of notebooks have no such BIOS control; they rely solely on what the SPD (serial presence detect) chip on the memory module reports to set the speed. Overclocked RAM, such as the Kingston HyperX DDR3/1866 modules we used for our test, tells Sandy Bridge-based notebooks to run the RAM at DDR3/1866 even if you have no way to set it in the BIOS (the Portege, for example, offers no such setting).
For our test, we reached into the dust bin for several older benchmarks, including Quake III, Quake IV and 3DMark 2006. We also used some newer benchmarks such as Resident Evil 5 and Dirt 2. To measure the memory bandwidth actually available, we ran SiSoft Sandra 2012 as well.

The upshot:

We saw worthwhile performance increases going from single-channel DDR3 to dual-channel DDR3. We have to reiterate that even though there is a memory size difference here, it has minimal impact since we are running 32-bit Windows 7. The extra RAM adds nothing; it’s really about the memory bandwidth.
In far more dated 3D workloads such as Quake III, where the bottleneck isn’t the graphics core itself, we saw a very significant performance gain of 29 percent going to dual-channel. Going to DDR3/1866 pushed that to 38 percent over the single-channel configuration. As we move from Quake III to Quake IV, the frame rates from the feeble integrated graphics plummet, but the performance spread from adding bandwidth is about the same.
With more of a graphics load from 3DMark 2006, we saw the spread drop a bit but still maintain a healthy 21 percent and 33 percent difference from adding more bandwidth. That’s not bad, but this is 2012.
But once you get to something far more modern, such as 2009’s Dirt 2, the performance gain from single-channel to dual-channel shrinks to about 3 percent. We didn’t expect it, but moving to the DDR3/1866 modules gave the game a pretty substantial bump of about 18 percent over single-channel. That really isn’t bad, but it’s certainly not magical. You’re basically looking at 35 fps vs. 42 fps with a more modern workload. It just reiterates that you can’t magically make an integrated graphics part twice as fast by adding more memory bandwidth when running modern workloads.
Again, it’s very much about what is holding you back: the graphics core or the memory bandwidth. To illustrate our point, we ran 2009’s Resident Evil 5 at an Xbox “HD” resolution of 1280x720 in DX9 mode with the textures set to high. With the integrated Intel “HD” Graphics 3000 core in the 2.7GHz Core i7-2620M, it isn’t hard to swamp it. We still see about 12.5 percent more frames with the dual-channel configuration and a 24.2 percent bump running the DDR3/1866 modules. That’s 27 fps in single-channel vs. 34 fps with the DDR3/1866 modules. With a few tweaks, though, we can get the frame rates up. Running at 1024x768 with the texture level set to low, we see the frame rates pop up nicely, with a 26.4 percent bump going from single-channel to dual-channel and the overclocked RAM giving us a very nice 37.7 percent increase, or about 44 fps in single-channel vs. 60 fps with the DDR3/1866 modules. Since you’ll likely have to crank down the image quality levels anyway, that frame rate bump can help in gaming.
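For readers who want to check the percentages, the sketch below recomputes the gain of each configuration over the single-channel baseline. The scores are copied from the benchmark table at the end of this article; the dictionary layout, the shortened labels and the helper name are our own.

```python
# Scores from the benchmark table, in the order:
# (single-channel DDR3/1333, dual-channel DDR3/1333, dual-channel DDR3/1866)
RESULTS = {
    "Quake III": (231, 298, 320),
    "Quake IV": (40.1, 50.8, 57.6),
    "3DMark 2006": (3819, 4648, 5083),
    "Dirt 2": (35.5, 36.6, 41.8),
    "Resident Evil 5 (high textures)": (27.3, 30.7, 33.9),
    "Resident Evil 5 (low settings)": (43.5, 55.0, 59.9),
}


def percent_gain(baseline, value):
    """Percentage improvement of value over baseline."""
    return (value - baseline) / baseline * 100


if __name__ == "__main__":
    for name, (single, dual, dual_1866) in RESULTS.items():
        print(
            f"{name}: dual-channel +{percent_gain(single, dual):.1f}%, "
            f"DDR3/1866 +{percent_gain(single, dual_1866):.1f}%"
        )
```

Both gains are measured against the single-channel result, which is how the figures in the text are expressed.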

What about system bandwidth?

To find out, we ran the synthetic memory bandwidth test in SiSoft Sandra 2012. Dual channel gave us – no surprise – nearly a 100 percent increase over single-channel. Those hot DDR3/1866 modules opened it up to a 167 percent increase in available memory bandwidth.
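It can also be useful to see how close the measured Sandra numbers come to the theoretical ceiling. The following is only a rough sketch: it assumes a 64-bit data path per channel, nominal data rates of 1,333 and 1,866 million transfers per second, and that the DDR3/1866 modules really did run at their rated speed. The function and variable names are ours.

```python
def theoretical_gbs(transfer_rate_mts, channels, bus_width_bits=64):
    """Theoretical peak bandwidth in GB/s, assuming a 64-bit channel."""
    return transfer_rate_mts * 1e6 * (bus_width_bits / 8) * channels / 1e9


def percent_increase(baseline, value):
    """Percentage increase of value over baseline."""
    return (value - baseline) / baseline * 100


def efficiency(measured_gbs, transfer_rate_mts, channels):
    """Measured bandwidth as a percentage of the theoretical peak."""
    return measured_gbs / theoretical_gbs(transfer_rate_mts, channels) * 100


# Sandra results from the benchmark table:
# (label, measured GB/s, nominal MT/s, channels)
SANDRA = [
    ("Single-channel DDR3/1333", 9.2, 1333, 1),
    ("Dual-channel DDR3/1333", 18.1, 1333, 2),
    ("Dual-channel DDR3/1866", 24.6, 1866, 2),
]

if __name__ == "__main__":
    baseline = SANDRA[0][1]
    for label, measured, rate, channels in SANDRA:
        print(
            f"{label}: {measured} GB/s measured, "
            f"+{percent_increase(baseline, measured):.0f}% vs. single-channel, "
            f"{efficiency(measured, rate, channels):.0f}% of theoretical peak"
        )
```

Treat the efficiency figure as a ballpark only, since it depends entirely on the assumptions listed above.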
In the final analysis, we think it’s well worth running your new laptop in dual-channel mode if you are chasing 3D performance. The Corsair kit is essentially a steal today, with an 8GB SO-DIMM kit going for $35 after rebate (and if your notebook already has a single SO-DIMM in it, you only need one more module, which cuts the price even further). For the overclocked RAM you’ll have to think a bit harder. The Kingston HyperX kit we used fetches about $120 online. You’ll have to justify its use at that price, but you will definitely see a frame rate advantage from it. How much depends on the graphics load. Then again, maybe it would have been a better idea to get a notebook with a discrete graphics part in it. But if you can’t, and you’re unsatisfied with the gaming performance, increasing the memory bandwidth is definitely a route worth exploring.





Benchmarks

| Benchmark | 4GB (1x4GB) DDR3/1333 | 8GB (2x4GB) DDR3/1333 | 8GB (2x4GB) DDR3/1866 |
| Memory Mode | Single Channel | Dual Channel | Dual Channel |
| Quake III "High-Quality" (fps) | 231 | 298 | 320 |
| Quake IV "High-Quality" (fps) | 40.1 | 50.8 | 57.6 |
| 3DMark 2006 (score) | 3,819 | 4,648 | 5,083 |
| Dirt 2, 10x7, Ultra Low (fps) | 35.5 | 36.6 | 41.8 |
| Resident Evil 5, 12x7, DX9, AA Off, Motion Blur Off, High Textures, Variable Benchmark (fps) | 27.3 | 30.7 | 33.9 |
| Resident Evil 5, 10x7, DX9, AA Off, Motion Blur Off, Shadow, Texture and Overall set to Low, Variable Benchmark (fps) | 43.5 | 55.0 | 59.9 |
| SiSoft Sandra (memory bandwidth) | 9.2GB/s | 18.1GB/s | 24.6GB/s |