It's that time of year again. A time I always dread, when familiar hardware and disk images are replaced with the frightening introduction of a completely new beast. New motherboards, new CPUs, new hard drives, and new and scary drivers. I never know what's going to happen. Earlier this year we butted heads with Skulltrail and eventually lost, going back to what we were using before. Hopefully this time around will be a bit different.
In any case, while I'm in the middle of the changeover, I figured I would write a little something about our graphics test beds, what we look for in one, and how we set them up. It's always controversial and debated in many of our articles, so maybe it will make for some good discussion (or flame wars) here.
First off, in doing graphics tests for the purpose of comparing graphics hardware, we always use the highest end desktop system we can build. By using the fastest processors and memory, we eliminate bottlenecks in the rest of the system and reveal the maximum potential of any given video card. Looking at relative performance in this light will always provide us with better and more reliable information on which card is capable of higher performance. Adding in artificial performance limiters like lower end CPUs and RAM compresses our data and makes it more difficult to see which graphics solution is more desirable.
Even if the CPU in my home system is something low end, I'm still going to want to install the best graphics card I can afford: the performance leader in the games I like at my target price when choosing between brands and manufacturers. There are a lot of reasons for this, but a couple stand out to me. With higher graphics performance I should see less choppiness and higher minimums even if my CPU limits average frame rate. I would also have more headroom for higher visual quality settings, and the higher performance part (even when CPU limited) should be more capable of playing near-term future games that might be more graphics limited than CPU limited, even on a lower end CPU.
This is absolutely not to say that CPU and RAM aren't important considerations. There is definitely a place for tests that look at the performance of games on certain combinations of CPU and GPU hardware. But that is not something for a graphics hardware review.
Currently, what we do with independent CPU and GPU testing allows people to see where the limits would cross. Imagine I test a bunch of CPUs on the absolute highest end graphics card and see a range between 40 and 60 frames per second for MadeUpTestGame. Then, imagine I take a bunch of GPUs and test them on the absolute highest end CPU and see a performance range between 20 and 60 frames per second with the same MadeUpTestGame benchmark. If I know what CPU and GPU I have, I can tell what framerate I should expect to represent my absolute maximum potential performance. It is the lower of two scores: my CPU's result when tested with a high end GPU, and my GPU's result when tested with a high end CPU.
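To make the "where the limits cross" idea concrete, here is a minimal Python sketch. The part names and scores are invented, just like MadeUpTestGame, and the model is deliberately simple: it assumes the slower of the two components sets the ceiling and ignores everything else.

```python
# Hypothetical MadeUpTestGame results in frames per second.
# All part names and numbers are invented for illustration.
# CPU scores were measured with the fastest GPU; GPU scores with the fastest CPU.
cpu_scores = {"low_end_cpu": 40.0, "mid_cpu": 50.0, "high_end_cpu": 60.0}
gpu_scores = {"low_end_gpu": 20.0, "mid_gpu": 45.0, "high_end_gpu": 60.0}


def expected_ceiling(cpu: str, gpu: str) -> float:
    """Return the best frame rate this pairing could reach: the lower of the two scores."""
    return min(cpu_scores[cpu], gpu_scores[gpu])


# What each card could deliver at most when paired with the slowest CPU.
for gpu in gpu_scores:
    print(gpu, expected_ceiling("low_end_cpu", gpu))
```

On the low end CPU, the mid range and high end cards both come out at 40 frames per second. That is the compression problem in miniature: the difference between the two cards is real, but this system can't show it.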
Now, I might be able to get more accurate information if I actually tested every combination of CPU and GPU, but that's a little out of the scope of a simple GPU launch article. If I only test with a lower end CPU, I will see a lot of the performance numbers get compressed and I will have a harder time extracting information that is useful for comparison purposes. If I test with a high-end CPU, someone with a lower end CPU can find performance information for that CPU and decide if the graphics cards will be overkill or will be a good fit. But that's a different issue than assessing the relative performance of graphics hardware.
So there's that. But what about building the test bed?
Switching hardware and software platforms can often lead to dealing with a lot of new problems. With the old hardware I've been testing on, I know what to expect: which problems point to a system issue and which are probably a product issue. Even if my system isn't as reliable as I would like it to be, knowing what the issues are really helps in dealing with testing issues. So the first problem I run into is that I don't know what can and will go wrong. This makes troubleshooting take a bit longer than it should, but it's got to be done eventually.
Choosing components is simple: find the fastest thing we've got and shove it in a system. In this current case, that means I'm changing over to an as-of-yet unreleased motherboard and CPU, which makes the potential for problems even larger. The RAM and drive we will be using for graphics testing going forward are things we've already tested though: high performance OCZ DDR3 and an Intel SSD. Yes, the limited size of the Intel SSD will make it tough to get a lot of games on there, but the increase in boot speed and system responsiveness goes a long way toward making testing easier and better, and it should also minimize the impact of random hits to the disk while benchmarking.
As for setting up the system, after we install the 64-bit version of Vista (I really wish there were some other platform on which to game), we set about disabling all sorts of things to get the computer to a state that will allow for consistent testing. Turning features off isn't really so much about gaining performance as it is about ensuring consistency. With the number of things happening in the background with Vista, we see more fluctuations in benchmark performance from run to run. To get a fair comparison without having to run everything 10 times and average performance, we perform the following steps.
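For anyone who wants to check how consistent their own runs are, a rough sketch of one way to do it follows. This is not our internal tooling; the function name and the frame rates are made up, and it simply expresses the run-to-run spread as a percentage of the average (the coefficient of variation).

```python
from statistics import mean, stdev


def run_spread(fps_runs: list[float]) -> float:
    """Return run-to-run spread as a percentage of the mean (coefficient of variation)."""
    return 100.0 * stdev(fps_runs) / mean(fps_runs)


# Invented numbers for illustration only.
noisy_runs = [58.0, 61.5, 55.2, 60.1, 57.3]  # lots of background activity
quiet_runs = [59.0, 59.4, 58.8, 59.1, 59.2]  # background features disabled

print(f"noisy: {run_spread(noisy_runs):.1f}%")
print(f"quiet: {run_spread(quiet_runs):.1f}%")
```

The smaller the spread, the fewer runs are needed before two cards can be compared with any confidence.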
First off, we turn off and disable the side bar. Next, we open the security center where automatic updating and security center alerts are disabled. Then we disable user account control.
After a quick reboot (and disabling the welcome screen), we head to advanced system settings and disable system protection (system restore) and remote assistance. While there, we adjust performance settings (in the advanced tab) to best performance and we set the virtual memory page file to a fixed size (custom size with initial == maximum) of 1.5x the amount of RAM in the system (though this time, with the limited size of the SSD and the vast amount of RAM in the system, our page file is set to RAM + 512MB).
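The page file arithmetic is simple enough to sketch in a few lines of Python. The function name and the `small_disk` flag are hypothetical, and the 4096MB figure is only an example, not the amount of RAM in the new test bed.

```python
def page_file_mb(ram_mb: int, small_disk: bool = False) -> int:
    """Fixed page file size in MB, used for both the initial and maximum values.

    Usual rule: 1.5x installed RAM. On a small disk: RAM + 512 MB.
    """
    if small_disk:
        return ram_mb + 512
    return ram_mb * 3 // 2


# Example with an assumed 4096 MB of RAM.
print(page_file_mb(4096))                   # 6144
print(page_file_mb(4096, small_disk=True))  # 4608
```

The more RAM in the system, the more disk space the second rule saves, which matters when the drive is a small SSD.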
Once done with that, we reboot and begin disabling the search indexing (by deselecting the folders that are indexed) and the screen saver, moving on from there to power settings. We select High Performance mode and further adjust these to not turn off the hard drive for 40 minutes and to turn off the display after 2 hours. I also like my start menu power button to turn the computer off rather than make it sleep, but that's personal preference.
At this point, any service packs are installed, then chipset drivers, then graphics drivers, then any other system drivers that are needed. After the billion reboots there and removing any backup files left from the service pack install (if we aren't using a slipstreamed disc), we get back to the process at hand: un-Vistaing Vista.
In no particular order, moving files to the recycle bin on delete is disabled, scheduled defragmentation is disabled, the desktop resolution is set to the max, and folder options are changed to show all hidden files. We even prevent the notification area from hiding unused icons and disable the start menu highlighting of new programs. Then it's on to a couple services we disable as well. SuperFetch and ReadyBoost are both disabled, SuperFetch because app launch times don't matter and we use multiple runs to get tests loaded into memory, and ReadyBoost because we are using an SSD and don't need it.
We used to also disable audio, but there are some games that don't run without audio support. Enabling and disabling audio is more trouble than it's worth. In games that have the ability to disable sound during testing, we do so, but if there is no option we do nothing.
Our desktop features shortcuts to batch files that delete the contents of the prefetch directory and run ProcessIdleTasks. However, with an SSD it isn't really necessary or desirable to run ProcessIdleTasks, because one of the idle tasks is defrag (which you don't want to run on an SSD anyway).
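We use batch files for this, but the same two steps can be sketched in Python for illustration. The prefetch path is the default Windows location and is an assumption here, the function names are hypothetical, and deleting from that folder needs an administrator account.

```python
import subprocess
from pathlib import Path

# Default prefetch location; assumed here, adjust if Windows lives elsewhere.
PREFETCH_DIR = Path(r"C:\Windows\Prefetch")

# Command that asks Windows to run its pending idle tasks right away.
IDLE_TASKS_CMD = ["rundll32.exe", "advapi32.dll,ProcessIdleTasks"]


def clear_directory(directory: Path) -> int:
    """Delete the files directly inside `directory` and return how many were removed.

    Subdirectories are left alone.
    """
    removed = 0
    for entry in directory.iterdir():
        if entry.is_file():
            entry.unlink()
            removed += 1
    return removed


def prepare_for_run(on_ssd: bool, prefetch_dir: Path = PREFETCH_DIR) -> None:
    """Clear prefetch data, then run idle tasks unless the system drive is an SSD."""
    clear_directory(prefetch_dir)
    if not on_ssd:
        # Skipped on an SSD because the idle tasks include defrag.
        subprocess.run(IDLE_TASKS_CMD, check=True)
```

Calling `prepare_for_run(on_ssd=True)` before a benchmark session clears the prefetch data and leaves the idle tasks alone.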
So that's about it as far as system setup goes. Well, after installing games and all that good stuff anyway. Right now we are also looking at updating our game suite. On the short list are: Far Cry 2, Crysis Warhead, Fallout 3, S.T.A.L.K.E.R. Clear Sky, Call of Duty World at War, and Brothers in Arms Hell's Highway. While I'm not sure if we will actually be able to incorporate all these games into our next round of graphics card testing, the first games we drop will be ones that are superseded by these new ones: Fallout 3 will replace Oblivion and Crysis Warhead will replace Crysis.
I'd love to be able to test 20 games for every graphics hardware review, but it's just not possible to do that kind of testing under normal circumstances. We will do our best to evaluate games and pick the ones that make the most sense going forward.
Oh, and I can't wait until I can talk more about what is actually in this new graphics test bed. It's pretty freaking sweet :-)
33 Comments
Sc4freak - Tuesday, October 28, 2008
I have to agree. It was particularly nice to see games like Devil May Cry 4 and Mass Effect being tested in the laptop reviews - they're modern games with awesome graphics.
JarredWalton - Tuesday, October 28, 2008
Heh... well, I can't say I care for DMC4, but since it has a built-in test it's easy to run. :-) I did try for a nice cross-section of gaming.
erikejw - Thursday, October 30, 2008
Maybe you guys should do 2 reviews of every graphics card. One FPS review like the current one and then one Game review and try to include all kinds of games.
That will both make you as a reviewer happy (FPS review galore) and us (Game review) who will actually read it.