44 Comments
rkchary - Tuesday, June 16, 2009 - link
We have a customer who is interested in upgrading to Nehalem. He's running Windows with an Oracle database for SAP Enterprise Portals. Could you kindly let us know your recommendations, please?
The approximate concurrent users would be around 3000 Portal users.
Keenly looking forward to your response. If you could also point to any instances of Nehalem installed in an SAP environment for production usage, that would be a great deal of help.
Regards,
Chary
Adun - Thursday, April 9, 2009 - link
Hello, I understand the PHP not-enough-threads explanation as to why the dual X5570 doesn't scale up.
But can anyone please explain why, when you add another AMD Opteron 2384, the score increases from 42.9 to 63.9, while adding another Xeon X5570 brings no such increase?
Thank you for the article,
Adun.
stimudent - Thursday, April 2, 2009 - link
Was it really too much effort to clean off the processor before posting a picture of it? Or were they trying to show that it was used, tested?
LizVD - Friday, April 3, 2009 - link
Would you perhaps like us to draw a smiley face on it as well? ;-)
GazzaF - Wednesday, April 1, 2009 - link
Well done on an excellent review using as many real-world tests as possible. The VMware test is a real eye opener and shows how the 55xx can match double the number of CPUs from the last generation of Xeons *AND* crucially save $$$$ on licensing for Windows, MS SQL and other per-socket licensed software, plus the power saving, which is again a financial saving if you hire rack space in a datacentre.
I eagerly await your own in-house VM tests. Please consider also testing with Windows 2008 Hyper-V, which I think doesn't have the 55xx optimisations that the latest release of VMware has (and might not have until R2?).
Thanks for the time you put into running the endless tests. The results make a brilliant business case for anyone wanting to upgrade their servers. You must have had the chips a good week before Intel officially launched them. :-) I do feel sorry for AMD though. I'm sure they have plenty of motivation to come back with a vengeance like they did a few years ago.
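A rough illustration of the per-socket licensing saving mentioned above; the socket counts follow the comment, but the license price is a hypothetical placeholder, not a real list price:

```python
# Rough consolidation math for per-socket licensed software.
# The license price below is a hypothetical placeholder, not a vendor list price.
old_sockets = 4            # previous-generation Xeon box being replaced
new_sockets = 2            # dual Xeon 55xx box matching its performance
per_socket_license = 7000  # hypothetical cost of one per-socket license (USD)

saving = (old_sockets - new_sockets) * per_socket_license
print(f"Licensing saving per consolidated server: ${saving}")
```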
JohanAnandtech - Thursday, April 2, 2009 - link
Thanks! Good to hear from another professional. I believe the current Hyper-V R2 beta already has some form of support for EPT.
Our virtualization testing is well under way. I'll give an update soon on our blog page.
Lifted - Wednesday, April 1, 2009 - link
You mention octal servers from Sun and HP for VM's, but does anybody really use these systems for VM's? I can't imagine why anybody would, since you are paying a serious premium for 8 sockets vs. 2 x 4 socket servers, or even 4 x 2 socket servers. Then the redundancy options are much lower when running only a few 8 socket servers vs many 2 or 4 socket servers when utilizing v-motion, and the expansion options are obviously far less w/ NIC's and HBA's. From what I've seen, most 8 socket systems are for DB's.
Veteran - Wednesday, April 1, 2009 - link
What I noticed after reading the review is that there are very few benchmarks that favor AMD even a little bit.
For example, there is only one 3ds Max test (so not very useful); at least two are needed.
Only 1 virtualization benchmark, which is really a shame....
Virtualization is becoming so important and you guys only throw in one test?
Besides that, the review feels a bit biased towards Intel, but I will check some other reviews of the Xeon 5570.
duploxxx - Wednesday, April 1, 2009 - link
The virtualization benchmarks come from the official VMmark scores. However, there is something really strange going on in the results...
HP ProLiant DL370 G6: VMware ESX Build #148783, VMmark v1.1, 23.96 @ 16 tiles (2 sockets, 8 cores, 16 threads, 03/30/09)
Dell PowerEdge R710: VMware ESX Build #150817, VMmark v1.1, 23.55 @ 16 tiles (2 sockets, 8 cores, 16 threads, 03/30/09)
Inspur NF5280: VMware ESX Build #148592, VMmark v1.1, 23.45 @ 17 tiles (2 sockets, 8 cores, 16 threads, 03/30/09)
Intel Supermicro 6026-NTR+: VMware ESX v3.5.0 Update 4, VMmark v1.1, 14.22 @ 10 tiles (2 sockets, 8 cores, 16 threads, 03/30/09)
So let's see: all the pre-builds of ESX 3.5 Update 4 get a really high score of 16 tiles, almost as much as a 4S Shanghai, while the VMware performance team themselves stated that we should never count an HT core as a real CPU in VMware (even with the new code for HT), and yet the benchmark shows a large performance increase. And no, it is not, as AnandTech states, due to the extra available memory and its bandwidth; those VMmark runs are not memory starved. Now look at the official Intel benchmark with ESX Update 4: it provides 10 tiles and a healthy increase, which from a technical point of view seems much more realistic. All the other marketing stuff like switching times etc. is nice, but then again it is in the same range as the current Shanghai.
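For reference, the quoted results can be normalized to a score-per-tile figure, which makes the comparison in this thread easier to follow; a minimal sketch using only the numbers listed above:

```python
# Score-per-tile for the VMmark results quoted above.
# Assumption: the figures are exactly as listed; this is only a rough normalization.
results = {
    "HP ProLiant DL370 G6":       (23.96, 16),
    "Dell PowerEdge R710":        (23.55, 16),
    "Inspur NF5280":              (23.45, 17),
    "Intel Supermicro 6026-NTR+": (14.22, 10),  # ESX 3.5 Update 4
}

for system, (score, tiles) in results.items():
    print(f"{system}: {score / tiles:.2f} points per tile over {tiles} tiles")
```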
JohanAnandtech - Wednesday, April 1, 2009 - link
What kind of tests are you looking for? The techreport guys have a lot of HPC tests; we are focusing on the business apps.
"very few benchmarks that favor AMD even a little bit"
That is a really weird statement. First of all, what is a test favored by AMD?
Secondly, this kind of OLTP/OLAP testing was introduced in the Shanghai review, and it really showed IMHO that there was a completely wrong perception about Harpertown vs. Shanghai, because Shanghai won in the tests that mattered the most to the market. Meanwhile, many tests (including Intel's own) were emphasizing purely CPU-intensive stuff like Black-Scholes, rendering and HPC. That is a very small percentage of the market, and it created the impression that Intel was on average faster, which was absolutely not the case.
"Only 1 virtualization benchmark, which is really a shame..."
Repeat that again in a few weeks :-). We have just successfully concluded our testing on Nehalem.
Personally, I am a bit shocked by the "not enough tests" remark :-), because any professional knows how hard these OLTP/OLAP tests are to set up and how much time they take. But they might not appeal to the enthusiast, I am not sure.
Veteran - Wednesday, April 1, 2009 - link
I didn't mean to offend you, because I can imagine how much time it takes to test hardware properly. And I personally think that OLTP/OLAP testing is very innovative and needed, because otherwise people would have no idea what to buy for servers. You cannot let your server purchase be influenced by benchmarks that are meaningless for servers, like 3DMark 2006/Vantage/FPS tests etc.
You guys always do a great job of testing any piece of hardware, but it just feels too biased towards Intel. For example, on the last page of this review you get a link to the Intel Resource Center (in the same place as the next button). If you have things like that, you are not (trying to be) objective IMO.
JohanAnandtech - Wednesday, April 1, 2009 - link
Thank you for clarifying in a very constructive way.
"on the last page of this review you get a link to the Intel Resource Center"
I can't say I am happy with that link, as it creates the wrong impression. But the deal is: editors don't get involved in ad management, and ad sales people don't get involved when it comes to content.
So all I can say is: judge our content, not our ads. And like I said, it didn't stop us from claiming that Shanghai was by far the best server CPU a few months ago, and that was not a conclusion you found on many sites.
Veteran - Wednesday, April 1, 2009 - link
Thanks for clarifying this matter.
But the ad sales people should know this creates the wrong impression. A review site (for me at least) is all about objectivity and credibility. When you place a link to Intel's Resource Center at the end of every review, it feels weird. People on forums already call AnandTech "Inteltech", and I don't think this is what you guys want.
I have always liked AnandTech, ever since I was a kid, and I still do. You guys always have some of the most in-depth reviews (especially on the very technical side) and I like that. But you guys are gaining some very negative publicity on the net.
BaronMatrix - Tuesday, March 31, 2009 - link
Unfortunately, I don't buy from or recommend criminals.
carniver - Wednesday, April 1, 2009 - link
AMDZone is the biggest joke on the internet. I just went there to see how zealots like abinstein are still doing their damage control; just like before, he went on rambling about how Penryn is still weak against Shanghai, along with the old and tired excuses like how, if people had all bought AMD, they could do drop-in upgrades, etc. ZootyGray is the biggest joke on AMDZone. None of them have the mental capacity to accept that AMD has been DEFEATED, which is disappointing but funny, to say the least.
It's not just AMDZone; you are just the opposite. It's like in Woodcrest and Conroe times: it's not because the high-end CPU is the best of all that the rest of the CPUs in the line are by default better. It's all about the price/performance ratio. Like the many who bought the low end and thought they had bought the better system: well, wrong bet.
As mentioned before, why not test the mid range? That is where the sales will be. Time to test the 5520-5530 against the 2380-82; after all, those have the same price.
carniver - Wednesday, April 1, 2009 - link
Your argument is valid; however, it just so happens that for low-end 1S systems the Penryns are doing just fine against the Shanghais, while for higher-end 2S systems they used to be limited by memory bandwidth and AMD pulled ahead. That is no longer the case: Intel now beats AMD in their own territory.
CHADBOGA - Tuesday, March 31, 2009 - link
You probably also can't afford to buy a computer, so I doubt that Intel will be too concerned with your AMDZone insanity. LOL!!!!
smilingcrow - Tuesday, March 31, 2009 - link
Those grapes you are chewing on sure sound sour to me. Try listening to a few tracks by The Fun Loving Criminals to help take away the bad taste.
cjcoats - Tuesday, March 31, 2009 - link
There's more to HPC applications than you indicate: environmental modeling apps, particularly, tend to be dominated by memory access patterns rather than by I/O or pure computation. Give me a ring if you'd like some help with that -- I'm local for you, in fact...gwolfman - Tuesday, March 31, 2009 - link
Why was this article pulled yesterday after it was first posted?
JohanAnandtech - Tuesday, March 31, 2009 - link
Because the NDA time was noon in the Pacific time zone, not CET. We were slightly too early...
yasbane - Tuesday, March 31, 2009 - link
Hi Johan,
Any chance of some more comprehensive Linux benchmarks? Haven't seen any on IT AnandTech for a while.
cheers
JohanAnandtech - Tuesday, March 31, 2009 - link
Yes, we are working on that. Our first Oracle testing is finished on the AMD platform, but we are still working on the rest.
Mind you, all our articles so far have included Linux benchmarking: all the MySQL testing, for example, plus Stream, SPECjbb and Linpack.
Exar3342 - Monday, March 30, 2009 - link
Thanks for the extremely informative and interesting review Johan. I am definitely looking forward to more server reviews; are the 4-way CPUs out later this year? That will be interesting as well.
Exar3342 - Monday, March 30, 2009 - link
Forgot to mention that I was surprised HT had the impact that it did in some of the benches. It made some huge differences in certain applications, and slightly hindered performance in others. Overall, I can see why Intel wanted to bring back SMT for the Nehalem architecture.
duploxxx - Monday, March 30, 2009 - link
Awesome performance, but I would like to see how the Intel 5510/20/30 fare against the AMD 2378/80/82; after all, that is the same price range.
It was the same with the Woodcrest and Conroe launch: everybody saw a huge performance lead but then only bought the very slow versions... so the question is what still offers the best value in performance/price/power.
Istanbul had better come sooner for AMD; the way it looks now, with decent 45nm power consumption it will be able to bring some battle to the high-end 55xx versions.
eryco - Tuesday, April 14, 2009 - link
Very informative article... I would also be interested in seeing how any of the midrange 5520/30 Xeons compare to the 2382/84 Opterons. Especially now that some vendors are giving discounts on the AMD-based servers, the premium for a server with X5550/60/70s is even bigger. It would be interesting to see how the performance scales for the Nehalem Xeons, and how it compares to Shanghai Opterons in the same price range. We're looking to acquire some new servers and we can afford 2P systems with 2384s, but on the Intel side we can only go as far as E5530s. Unfortunately there's no performance data for midrange Xeons anywhere online that we could use to make a comparison.
haplo602 - Monday, March 30, 2009 - link
I only skimmed the graphs, but how about some consistency? Some of the graphs feature only dual-core Opterons, some have a mix of dual and quad core... the pricing chart also features only dual-core Opterons...
Looking just at the graphs, I cannot draw any conclusions...
TA152H - Monday, March 30, 2009 - link
Part of the problem with the 54xx CPUs is not the CPUs themselves, but the FB-DIMMs. Part of the big improvement for Nehalem in the server world is because Intel saddled their 54xx platform, for reasons that escape most people, with FB-DIMMs. But that's really not mentioned except with regard to power. If the IMC (which is not an AMD innovation, by the way; it was done many times before they did it, even on x86 by NexGen, a company they later bought) is so important, then surely the FB-DIMMs are too. They are both related to the same issue: memory latency.
It's not really important though, since that's what you'd get if you bought the Intel 54xx; it's more of an academic complaint. But I'd like to see Nehalem tested with dual-channel memory, which is a real issue. The reason being, it has lower latency while only using two channels, and for some benchmarks, certainly not all or even the majority, you might see better performance by using two (or maybe it never happens). If you're running a specific application that runs better with dual channel, it would be good to know.
Overall, though, a very good article. The first thing I mentioned is a nitpick; the second may not even matter if three-channel performance is always better.
snakeoil - Monday, March 30, 2009 - link
Oops, it seems that hyperthreading is not scaling very well. Too bad for Intel.
eva2000 - Tuesday, March 31, 2009 - link
Bloody awesome results for the new 55xx series. Can't wait to see some of the larger vBulletin forums online benefiting from these monsters :)
ssj4Gogeta - Monday, March 30, 2009 - link
huh?
ltcommanderdata - Monday, March 30, 2009 - link
I was wondering if you got any feeling for whether Hyper-Threading scales better on Nehalem than on NetBurst? And if so, do you think this is due to improvements made to HT itself in Nehalem, just due to Nehalem's 4+1 instruction decoders and more execution units, or because software is better optimized for multithreading/hyperthreading now? Maybe I'm thinking mostly desktop, but HT had kind of a hit-or-miss reputation on NetBurst, and it'd be interesting to see if it just came before its time.
TA152H - Monday, March 30, 2009 - link
Well, for one, Nehalem is wider than the Pentium 4, so that's a big issue there. On the negative side (with respect to the HT increase, but really a positive) you have better scheduling with Nehalem, in particular memory disambiguation. The weaker the scheduler, the better the performance increase from HT, in general.
I'd say it's both. Clearly, the width of Nehalem would help a lot more than the minor tweaks. Also, you have better memory bandwidth, and in particular a larger L1 cache. I have to believe it was fairly difficult for the Pentium 4 to keep feeding two threads with such a small L1 cache, and then you have the additional L2 latency vis-a-vis Nehalem.
So, clearly the Nehalem is much better designed for it, and I think it's equally clear software has adjusted to the reality of more computers having multiple processors.
On top of this, these are server applications they are running, not mainstream desktop apps, which might show a different profile with regards to Hyper-threading improvements.
It would have to be a combination.
JohanAnandtech - Monday, March 30, 2009 - link
The L1 cache and the way the Pentium 4 decoded were an important (maybe even the most important) factor in the mediocre SMT performance. Whenever the trace cache missed (and it was quite small, something like the equivalent of 16 KB), the Pentium 4 had only one real decoder. This means that you have to feed two threads with one decoder. In other words, whenever you got a miss in the trace cache, HT did more harm than good on the Pentium 4. That is clearly not the case in Nehalem, with its excellent decoding capabilities and larger L1.
And I fully agree with your comments, although I don't think memory disambiguation has a huge impact on the "usefulness" of SMT. After all, there are lots of reasons why the ample execution resources are not fully used: branches, L2 cache misses, etc.
IntelUser2000 - Tuesday, March 31, 2009 - link
Not only that, the Pentium 4 had the Replay feature to try to make up for having such a long pipeline. When Replay went wrong, it would use resources that would hinder the second thread.
The Core microarchitecture has no such weaknesses.
SilentSin - Monday, March 30, 2009 - link
Wow... it's just ridiculous how much improvement was made; gg Intel. Can't wait to see how the 8-core EXs do; if this launch is any indication, that will change the server landscape overnight.
However, one thing I would like to see compared, or slightly modified, is the power consumption figures. Instead of an average amount of power used at idle or load, how about a total consumption figure over the length of a fixed benchmark (i.e. how much power was used while running SPECint)? I think that would be a good metric to illustrate very plainly how much power is saved by the greater performance under a given load. I saw the chart on the power/performance improvement on the Bottom Line page, but it's not quite as digestible or as easy to compare as a straight kWh-per-benchmark figure would be. Perhaps give it the same time range as the slowest competing part completes the benchmark in. This would give you the ability to make a conclusion like "In the same amount of time the Opteron 8384 used to complete this benchmark, the 5570 used x watts less, and spent x seconds in idle". Since servers are rarely at 100% load at all times, it would be nice to see how much faster it is and how much power it is using once it does get something to chew on.
Anyway, as usual that was an extremely well done write-up; it covered mostly everything I wanted to see.
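A minimal sketch of the fixed-time-window comparison proposed above; every power and timing figure here is a hypothetical placeholder, not a measured value from the review:

```python
# Fixed-time energy comparison as proposed above: give the faster part the same
# time window the slower part needs, and count its idle consumption too.
# All figures are hypothetical placeholders.
slow_runtime_s, slow_load_w = 900.0, 230.0                       # slower part
fast_runtime_s, fast_load_w, fast_idle_w = 600.0, 250.0, 120.0   # faster part

window_s = slow_runtime_s                              # common time window
slow_energy_wh = slow_load_w * window_s / 3600.0
fast_energy_wh = (fast_load_w * fast_runtime_s +
                  fast_idle_w * (window_s - fast_runtime_s)) / 3600.0

print(f"In {window_s:.0f}s the slower part uses {slow_energy_wh:.1f} Wh; "
      f"the faster part uses {fast_energy_wh:.1f} Wh and idles for "
      f"{window_s - fast_runtime_s:.0f}s.")
```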
7Enigma - Wednesday, April 1, 2009 - link
I think that is a very good method for determining total power consumption. Obviously this doesn't show CPU power consumption, but more importantly the overall consumption for a given unit of work.
Nice thinking.
JohanAnandtech - Wednesday, April 1, 2009 - link
I am trying hard, but I do not see the difference from our power numbers. This is the average power consumption of one CPU during 10 minutes of DVD Store OLTP activity. As readers have the performance numbers, you can perfectly well calculate performance per watt or per kWh. Per server would be even better (instead of per CPU), but our servers were too different.
Or am I missing something?
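A minimal sketch of that calculation; the power and throughput values below are placeholders, not the article's measured figures:

```python
# Energy and performance-per-watt from an average power reading over a timed run.
# All numbers here are hypothetical placeholders.
avg_power_w = 95.0   # average CPU power during the 10-minute OLTP run (watts)
duration_s = 600.0   # length of the measured run (seconds)
throughput = 1500.0  # performance metric, e.g. orders per second

energy_kwh = avg_power_w * duration_s / 3600.0 / 1000.0  # watt-seconds -> kWh
perf_per_watt = throughput / avg_power_w                 # work rate per watt
work_per_kwh = throughput * duration_s / energy_kwh      # total operations per kWh

print(f"{energy_kwh:.4f} kWh consumed, {perf_per_watt:.1f} ops/s per watt, "
      f"{work_per_kwh:.0f} ops per kWh")
```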
usamaah - Monday, March 30, 2009 - link
Is it me or is page 2 of this article missing some information? The title of that 2nd page is "What Intel and AMD are Offering," but in the body of the text there are only descriptions of Intel's Xeon chips? Perhaps a new title to reflect the body, or add AMD info?
JohanAnandtech - Monday, March 30, 2009 - link
I moved the AMD vs. Intel pricing data to the back of the article, as the pricing info is more interesting once you have seen the results. But I forgot to change the title... fixed. Thanks.
usamaah - Monday, March 30, 2009 - link
Cool, thank you. Next time I'll finish reading the article before I make a comment, sorry ;-) Anyway, wonderful article.
Ipatinga - Monday, March 30, 2009 - link
Very nice to see a comparison across several generations of the Xeon platform, including the new one (yet to be released).
I would like to see a new article with Core i7 vs. Xeon 5500... to check whether my Core i7 @ 3.7GHz is good enough in Maya 2009 (Windows XP 64-bit, 12GB DDR3), or if a Xeon 5500 (each at 2.4GHz, for instance) in a dual-processor configuration would be a much better buy.