Intel Woodcrest: the Birth of a New King
by Jason Clark & Ross Whitehead on July 13, 2006 12:05 AM EST - Posted in IT Computing
Multiple Load Points
For AnandTech database benchmarks, we have always focused on "real world" benchmarks. To achieve this, we have used real applications with loads that drive CPU utilization to 80-90%. Recently we discussed how most enterprise database servers do not average 80-90% CPU utilization, but rather something closer to the 30-60% range. We thought it would make more sense to show performance where the server is most likely to be used, as well as the saturation numbers for situations where the CPU is maxed out. We feel this is consistent with how GPUs are reviewed, and with how you might test drive a car. GPUs are tested at varying resolutions and anti-aliasing levels. With a car, you don't just hit the highway and see what the top end is.
We settled on six load points for testing. These load points are consistent across all platforms and are throttled from the client, independent of the platform being measured. We chose these load points because they split the load range into six roughly equal parts and allow us to interpolate between the points. The last and highest load point is a "saturation plus" load point, used to verify that we tested up to the capability of the CPUs.
For any given load point, there is a defined number of threads. Each test is 20 minutes in duration, which includes an 8-minute warm-up period followed by a 12-minute measured period. For a given load point, the client submits requests to the DB server as fast as the DB server will respond. The rate at which the client is able to submit requests is measured during the final 12 minutes of the test and averaged to determine the Orders/Minute for Dell and Transactions/Minute for Forums. After much blood, sweat, and almost tears, we were able to produce repeatable loads with an average deviation of 1.6%.
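To make the measurement window concrete, here is a minimal Python sketch of how a per-minute rate could be derived from a 20-minute run. It is not our actual test harness: the function name and the idea of logging one completion timestamp per request (in seconds from the start of the run) are assumptions for illustration only.

```python
def throughput_per_minute(completion_times_s, warmup_s=8 * 60, measured_s=12 * 60):
    """Average completed requests per minute during the measured window.

    completion_times_s: hypothetical log of request completion times,
    in seconds since the start of the run.
    """
    window_start = warmup_s
    window_end = warmup_s + measured_s
    # Discard everything completed during warm-up; keep only the measured period.
    measured = [t for t in completion_times_s if window_start <= t < window_end]
    return len(measured) / (measured_s / 60)


# Example: a client completing exactly one request per second for 20 minutes.
rate = throughput_per_minute(list(range(20 * 60)))
print(rate)
```

With one request per second, 720 requests fall inside the 12-minute measured window, which works out to 60 per minute.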
For each platform we ran the test 5 times at each load point and then averaged the 5 results. This was repeated for all loads, all tests, on all platforms... that is 300 test executions! (We won't even get into the debugging issues we had to deal with prior to the final results.) Thankfully, we managed to automate the process as much as possible when implementing the throttling mechanism for the load points.
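The averaging and the run count can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the scripts we used: the function names are hypothetical, the sample numbers are made up, and "average deviation" is interpreted here as the mean absolute deviation from the mean, expressed as a percentage of the mean.

```python
from statistics import mean


def summarize_runs(results):
    """Return (average, average deviation in percent) for repeated runs."""
    avg = mean(results)
    # Assumed definition: mean absolute deviation relative to the mean.
    avg_dev_pct = mean(abs(r - avg) for r in results) / avg * 100
    return avg, avg_dev_pct


def implied_platforms(total_executions, runs, load_points, tests):
    """Number of platforms implied by the total execution count."""
    return total_executions / (runs * load_points * tests)


# Made-up results for five runs at one load point.
avg, dev = summarize_runs([100, 102, 98, 101, 99])
print(avg, round(dev, 2))

# 5 runs x 6 load points x 2 tests (Dell and Forums) per platform.
print(implied_platforms(300, runs=5, load_points=6, tests=2))
```

If Dell and Forums are the only two tests counted, 300 executions corresponds to 60 executions per platform.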
59 Comments
ashyanbhog - Tuesday, July 18, 2006 - link
Quite shocking to see Anand perform such a biased benchmark and get away with it so easily. Is it a coincidence that Dell has not sold AMD chips in its machines to date, and benchmarks from Dell show Intel chips perform better?
Can we say tuned or skewed?
photoguy99 - Thursday, July 13, 2006 - link
It's just killing fan boys like Kiijibari that Intel is the best 2-way server out there now and they have to craft these elaborate scenarios to somehow justify how AMD is still great.
Man give it up - Is it not enough almost every hardware site on the net has crowned Woodcrest the new 2-way champ over AMD? How much more evidence do you want?
As I've posted before I own an FX-60 now so I don't feel great that Intel will soon be selling at Wal-mart a chip that will kick ass on my carefully overclocked FX system.
But so what? It is what it is. Sure AMD are planning new things, and when and if they are benchmarked to be superior, then you can have your day again.
For now Intel *owns* AMD except a couple niche segments - get used to it.
duploxxx - Friday, July 14, 2006 - link
i think you mean conroe... not woodcrest (the server chip). you can count the reviews that were made on 1 hand... two of them were from anand here, which are still in a large discussion about comparison etc... and this one is coming straight from intel...
nice to see the "king" is announced by intel themselves
vaystrem - Thursday, July 13, 2006 - link
That prevented Intel's Woodcrest computers from being considered for government bids?
US government unit throws Intel out over RAID problems (http://www.theinquirer.net/default.aspx?article=32...)
or
Conroe shows dodgy RAID performance anomalies (http://theinquirer.net/default.aspx?article=32818)
I know it's 'The Inq' but since this is a server test it would be nice to see some confirmation or exploration of this issue.
drwho9437 - Thursday, July 13, 2006 - link
The charts are almost totally inscrutable for the red-green color blind population, which is something like 5% of males. Learn to use a decent color scheme or incorporate symbol shapes as well as colors. Map makers know this...
forPPP - Thursday, July 13, 2006 - link
All of you ranting about comparing 3.0 GHz Woodcrest to 2.6 GHz Opteron, look here: http://www.behardware.com/articles/623-1/intel-cor... and see how much better Core 2 architecture is. Core 2 Duo 2.13 GHz beats Athlon FX-62 2.8 GHz in most benchmarks. Of course architecture is not everything, especially in the enterprise market. Opteron has an advantage thanks to HyperTransport, and the advantage of Woodcrest is diminished because of the FSB. But the main battle will occur with desktops, and here Core 2 Duo shines. Let's hope AMD will show something interesting soon, not only price drops. All in all, we consumers will benefit from this battle.
duploxxx - Friday, July 14, 2006 - link
you are looking at desktop... don't compare desktop with server... desktop is more the mass and low profit.... server and laptop are for the profit.
Locutus465 - Thursday, July 13, 2006 - link
AMD will manage to come out with a decent competitor in the next little while and we'll have real competition in the CPU space again. I'm sure this sucks for AMD right now, and if AMD is able to rebound and deliver a competitor in the relatively near-term future, it will for Intel too. But for consumers, competition is beautiful; already we can look towards dirt cheap A64's for your low to mid-range computing needs.
FesterOZ - Thursday, July 13, 2006 - link
I tried skimming back through the article, but is Anand just measuring the CPU wattage or the overall wattage draw for the whole platform (i.e. CPU, northbridge, DIMMs)?
Jason Clark - Thursday, July 13, 2006 - link
Wall, folks. Sorry that wasn't clearer. We'll ensure we include power measurement information in future articles. We use the same procedure for power as in previous articles: an Extech device, and we log power throughout the test duration.
Cheers.