AMD Operating Systems BSD

FreeBSD on the Athlon64 in 64bit vs Pentium4 3.2E 74

veliath writes "Came by a comparison from about three weeks ago, between two systems running FreeBSD. One is an Athlon64 running FreeBSD in 64bit mode and the other a Pentium4 3.2E running FreeBSD in 32bit mode."
This discussion has been archived. No new comments can be posted.

  • Old news... (Score:2, Informative)

    by hmallett ( 531047 )
    This is the same article as was linked to from the FreeBSD site a few weeks ago. Everyone's probably read this already. Basically, the Athlon64 is faster.
  • HT & threads (Score:2, Insightful)

    The article says that Intel's HT doesn't improve performance much. Isn't this expected, considering that IIRC FreeBSD's kernel threads still suck and most of the programs are single threaded anyway?
    • Re:HT & threads (Score:5, Interesting)

      by Anonymous Coward on Monday April 05, 2004 @11:48AM (#8769812)
      While I don't know about FreeBSD's threads sucking, as far as I could tell none of the tests would've stressed the threading system.

      The tests didn't really play to hyper-threading's strengths. Take the builds with multiple jobs running at the same time. That's more about running separate applications as separate processes, and that's not where hyper-threading's advantage lies, because they aren't separate threads at all.

      HT is more for true multithreaded applications like Photoshop or something and none of the benchmarks were anything like that.
      • Re:HT & threads (Score:5, Informative)

        by ratboy666 ( 104074 ) <fred_weigel@[ ]mail.com ['hot' in gap]> on Monday April 05, 2004 @03:06PM (#8771930) Journal
        What HyperThreading is...

        Out of order execution takes the processor to a particular level of performance. Unfortunately, (and especially with the X86 IA), we run out of steam rather quickly, and the processor blocks waiting on registers or memory. The idea behind HT is that the processor's execution elements can then be reassigned to something else waiting in cache.

        Of course, this means we need a big fat cache, and something else to execute. Could be another thread or process, but the important thing is that the second job be independent.

        This can increase the utilization of the processor's compute elements.

        So, yes, the "builds with multiple jobs running at the same time" test makes sense.

        I would like to see a benchmark with CPU stalls and utilization summarized at the end. Can't do it myself, because I am far too cheap to replace my current system (and yes, it is an MP box - dual 200Mhz PPRO - and it still does quite nicely).

        Anyway, it does look like the Intel took a hit in this benchmark; too bad for them. I looked over the methodology -- and it looked reasonable given the scope of the project.

        Ratboy.
        • I would like to see a benchmark with CPU stalls and utilization summarized at the end. Can't do it myself, because I am far too cheap to replace my current system (and yes, it is an MP box - dual 200Mhz PPRO - and it still does quite nicely).

          If it were possible, I would have done this and measured temperatures as well... but I couldn't find a way to measure these things without interfering with the tests. It was really important to me to prevent anything from altering the test results. Of course if you h

    • Re:HT & threads (Score:5, Interesting)

      by aminorex ( 141494 ) on Monday April 05, 2004 @01:15PM (#8770714) Homepage Journal
      HT does wonders for the P4 in the bandwidth tests, because they are not taxing the execution core; they are only stressing the limits of those parts of the CPU which are replicated. In fact, I can go a step further and say that they aren't even taxing those parts in any meaningful way, because the P4 just plain has fat pipes. Forthcoming dual-channel revisions of the Athlon64 will do another leap-frog, and put that architecture's bandwidth in the lead for a while, but it hasn't happened yet.

      The real-world apps demonstrate that the 5% of die space spent on HT doesn't result in much more leveraging of the execution core, in practice. I can't imagine why anyone would care what the P4 numbers were without HT, since no one will ever run it that way now that OSen are supporting it.

      As regards FreeBSD's kernel threads, the answer is "not really" since the overwhelming bulk of the benchmarks was spent in userspace (less so for the compile benchmarks than for the crypto ones). Notice that the user time numbers favored the Athlon64 no less than did the wall time numbers.

      I think it's interesting that the synthetic benchmarks all favored the P4 (a highly academic design) while the user load tests all favored the 64.


      • FWIW, my personal simplistic non-HT vs. HT test on Windows XP showed that HT does make a difference. While, er, backing up a DVD to an XviD AVI file using AutoGK, second encode pass, non-HT took 142 minutes on my 3GHz P4 with dual channel DDR400 memory. After turning HT on, it took only 122 minutes...
        • I think the dual channel memory is what is making the difference in the case you describe. You're getting a 16% bump instead of a 5 or 6% bump because the memory channels are being used more effectively, not just the core. In a single-channel memory system, I think the HT would only make a 5-6% difference. (This is an hypothesis, derived from my mental model of HT CPU operation, which can be used to confirm or disconfirm the model.)
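
The encode times quoted above (142 minutes without HT, 122 minutes with HT) can be read two ways, and a quick calculation shows which reading the "16% bump" corresponds to. This is only a sketch of the arithmetic. The function names are made up for illustration, and the only inputs are the two timings from the post.

```python
def throughput_gain_percent(baseline_minutes, improved_minutes):
    """Percent more work done per unit of time by the faster run."""
    return (baseline_minutes / improved_minutes - 1.0) * 100.0


def time_saved_percent(baseline_minutes, improved_minutes):
    """Percent of wall-clock time saved relative to the slower run."""
    return (baseline_minutes - improved_minutes) / baseline_minutes * 100.0


# Timings reported in the post for the second encode pass.
non_ht_minutes = 142
ht_minutes = 122

print(f"throughput gain: {throughput_gain_percent(non_ht_minutes, ht_minutes):.1f}%")
print(f"time saved:      {time_saved_percent(non_ht_minutes, ht_minutes):.1f}%")
```

So the "16%" figure matches the throughput reading, while the wall-clock saving is closer to 14%.
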
  • Nice comparison, but what about dual or quad processor systems? I have recently installed both FreeBSD 4.9 and 5.2.1 on (almost) identical dual-Xeon servers. Both are operating as if they had 4 processors (due to HTT). How would the Athlon, etc. stack up with this setup (seriously, I'd like to know)? Maybe HTT really shines on multiple CPU systems, not just mono-processor? Maybe.

    BTW- FreeBSD (either version) on a brand new Dell rack-mount server, with hardware RAID, 2GB RAM, dual processor (of course) makes for a very fast server! I have them configured mostly as web servers, a number of Perl generated dynamic pages (ad serving mostly), rsync, CVS repository, Cyrus and Sendmail (w/SASL AUTH and TLS/SSL), MySQL, and a custom rsync staging/production environment. When I run top, it sure is nice to every now and then see 2 processors at almost 100% utilization, yet also show 50% idle. I have no benchmarks to report, alas these are production machines in use.

    • by Homology ( 639438 ) on Monday April 05, 2004 @01:02PM (#8770597)
      When I run top, it sure is nice to every now and then see 2 processors at almost 100% utilization, yet also show 50% idle.

      It shows that you have capacity left over for starting other processes. It also shows that your system is slower than it could be. Some food for thought relating to the uses of hyperthreading.

    • by Agent Green ( 231202 ) * on Tuesday April 06, 2004 @03:31PM (#8783477)
      HTT offers little to zero benefit for properly optimized MP systems like FreeBSD. It helps with scheduling...not by giving you 4 processors of power.

      Now, if you're running 100% on 2 "processors" which happen to be the same chip on HTT, you're really not using the full potential of the machine.

      And to quote Chris Rock, "Turn that shit off!"
    • Simple answer is that an SMP Opteron ('cos only 940-pin Opterons work in SMP) would kick ass because they don't share memory bandwidth between processors. In fact available memory size and bandwidth scales linearly as you add processors and all available benchmarks show that overall performance scales at least twice as fast on Opteron systems as Xeon systems.

      For a great article comparing SMP systems, check out this article at Ace's Hardware [aceshardware.com]. I know that it's not a BSD based comparison, but it should give you s
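
The `top` observation earlier in this thread (two processors near 100% while 50% idle is reported) is what you would expect if the idle figure is averaged over all four logical CPUs that a dual-Xeon box with HTT exposes. The sketch below illustrates that arithmetic only. The per-CPU numbers are hypothetical, and the averaging is an assumption about how the summary line is computed, not a description of `top`'s internals.

```python
def aggregate_idle_percent(per_cpu_busy):
    """Average idle percentage across logical CPUs.

    per_cpu_busy holds one busy percentage (0-100) per logical CPU.
    """
    if not per_cpu_busy:
        raise ValueError("need at least one CPU")
    return 100.0 - sum(per_cpu_busy) / len(per_cpu_busy)


# Hypothetical load: two logical CPUs saturated, two untouched.
logical_cpus = [100.0, 100.0, 0.0, 0.0]
print(f"{aggregate_idle_percent(logical_cpus):.0f}% idle")
```

Whether the two busy logical CPUs sit on the same physical chip or on different ones is not visible in this summary figure, which is the concern raised in the replies.
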
  • by forkazoo ( 138186 ) <wrosecrans@@@gmail...com> on Monday April 05, 2004 @12:03PM (#8769990) Homepage
    Wow, coupled with the ATI Radeon 9600ASC, I'd be the ultimate in cool, whilst getting my Nethack on.

    I mean, don't get me wrong. I'm all about benchmarks. I love fast kit. I own an Athlon64, so seeing it win even makes me feel good about myself. OTOH, the performance differences tend not to be huge, and Athlon64 doesn't win every benchmark. Wake me up when I can afford 8 GB of RAM. That's when Athlon 64 will really matter.
    • by phoenix_rizzen ( 256998 ) on Monday April 05, 2004 @02:21PM (#8771461)
      You're forgetting something very crucial here ... the Athlon64 is clocked almost 1 GHz slower than the P4 ... yet the performance difference is virtually nil. That says a lot more about the performance of the Athlon64 than anything.

      That's not a "ho-hum" benchmark to me. That's an "Intel has royally fubar'd themselves. Here's hoping their Pentium-M strategy brings them back on track."
      • by Henry V .009 ( 518000 ) on Monday April 05, 2004 @07:55PM (#8774929) Journal
        So sad to see that the parent is yet another victim of the megahertz myth.

        Imagine for a moment that a CPU maker created a chip that performed 10 times the number of operations per cycle that either Intel or AMD could achieve. But also imagine that because of the complexity, they could only get the chip to run at 50MHz. Not very useful, huh?

        Intel has gone with a design that allows them to ramp up clock speed. AMD has gone with a design that allows them to use clock cycles more efficiently.

        Both of those approaches are a perfectly good way to do things. All that matters is how fast the user's applications run in the end.
        • by obeythefist ( 719316 ) on Monday April 05, 2004 @10:17PM (#8775902) Journal
          Interesting point, but surely, Intel will be running into physics problems way faster than AMD will, because Intel are running much closer to the raw speed edge.

          Megahurtz myths aside, frequency is still frequency and there is an upper limit. The first one to hit the wall loses, by the way. So the frequency/performance aspect of Intel processors is definitely worth keeping in mind. This is why the Pentium-M is becoming the forefront processor: more IPC than the PIV architecture. Perhaps Intel has hit the wall already?

          Likewise, one could reason that many of the tricks that Intel are using to increase frequency could be applied to AMD's architectures in the future, giving AMD much more room for growth, as Intel has already exhausted many of the available technologies.
          • All chips are designed to run "at the edge" of the frequency upper limit and so AMD doesn't have an inherent advantage because they do more work per clock cycle and Intel does less work per cycle but has a higher frequency. All chip-makers hit the same physical limitations at about the same time and neither has the advantage because they run at a higher or lower frequency today.

            The primary determination of clock-speed (besides process technology of course) is the largest number of transistors and the leng
        • Both of those approaches are a perfectly good way to do things. All that matters is how fast the user's applications run in the end.

          Yes, the fact that Intel's approach will melt your heatsink means nothing at all, and should be completely disregarded.
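
The megahertz argument in this sub-thread boils down to throughput being roughly operations per cycle multiplied by clock frequency. Here is a minimal sketch of the 50MHz thought experiment. The baseline of 1 operation per cycle at 3.2GHz is an arbitrary normalization chosen for illustration, not a measured figure for any real chip.

```python
def ops_per_second(ops_per_cycle, clock_hz):
    """Idealized throughput: work per cycle times cycles per second."""
    return ops_per_cycle * clock_hz


# Hypothetical baseline: 1 op/cycle at 3.2 GHz (normalization only).
baseline = ops_per_second(1.0, 3.2e9)

# The thought experiment: 10x the work per cycle, but only 50 MHz.
wide_but_slow = ops_per_second(10.0, 50e6)

ratio = wide_but_slow / baseline
print(f"wide-but-slow chip delivers {ratio:.1%} of the baseline throughput")
```

Under these assumed numbers the 10x-per-cycle chip still lands far behind, which is the point being made: neither clock speed nor work per cycle means much on its own.
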
    • actually it would matter at anything over 4 Gigs of ram.
      • Doesn't it start to matter once you go past 2GB of RAM - e.g. 3GB, 4GB?

        My limited understanding is that for many 32 bit O/Ses the kernel wants part of the address space for itself and the rest of the address space is for the currently running process.

        If you have a 2GB:2GB address space split in some cases 2GB of kernel space isn't enough, in other cases 2GB of user space isn't enough. You can do 3:1 or 1:3 splits for some O/S but what gets squished into the 1GB space?

        So it seems address space starts to m
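
The kernel/user split described above is simple arithmetic on a 32-bit address space, sketched below. The function name and the set of splits are illustrative assumptions. Which splits a particular OS actually supports is not something this snippet claims.

```python
GIB = 2**30  # bytes in one gibibyte


def split_address_space(address_bits, kernel_gib):
    """Return (user_gib, kernel_gib) for a given virtual address width."""
    total_bytes = 2**address_bits
    kernel_bytes = kernel_gib * GIB
    if kernel_bytes > total_bytes:
        raise ValueError("kernel share exceeds the address space")
    return (total_bytes - kernel_bytes) // GIB, kernel_gib


# Illustrative splits of a 32-bit (4 GiB) virtual address space.
for kernel_share in (1, 2, 3):
    user, kernel = split_address_space(32, kernel_share)
    print(f"user {user} GiB : kernel {kernel} GiB")
```

Every split sums to 4 GiB, so giving one side more room always squeezes the other. That is why the pressure can show up well before physical RAM reaches 4GB.
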
    • Wake me up when I can afford 8 GB of RAM. That's when Athlon 64 will really matter.

      But you are forgetting a great many things.

      First off, there is heat and power... The Athlon 64 runs cooler, and requires less power than a regular Athlon/Pentium. And that's only the PEAK specs.

      Dig deeper and you will find that the Athlon 64 is even better than that. It throttles down the MHz when the CPU isn't being fully utilized, so you are using even less power and creating even less heat, because surely you aren't go

  • by BrookHarty ( 9119 ) on Monday April 05, 2004 @01:04PM (#8770618) Journal
    I noticed they used the AMD64 3200+, but the AMD64 3200+ only has 1/2 the cache compared to the 3400+; that extra cache should boost the build process even more.

    Tom's Hardware has a nice review [tomshardware.com] and benchmarks for the 3400 vs the P4 3.4.

    Also, did anyone notice that in both articles, P4s clean house on synthetic benchmarks, but in the real world (build process) the AMD cleans house?
  • by FFFish ( 7567 ) on Monday April 05, 2004 @01:05PM (#8770631) Homepage
    One page, no annoying Flash advertisements, no tedious space-filling fluff, solid information.

    It's the antithesis of a Tom's review!
  • 64 bit is faster (Score:3, Informative)

    by Anonymous Coward on Monday April 05, 2004 @03:43PM (#8772382)
    In the end I think the initial point is made with this review though, and that is that 64-bit does make a difference to the "average user" as well as the power user or administrator, but that performance advantage may not be evident in all situations. When under heavy load or dealing with large blocks of data, the Athlon64 (and we can assume that the Opteron and Athlon64-FX also apply) in 64-bit mode achieves superior performance to the same machine in IA32 (x86) mode. This is not so much because of the 64-bit addressing as it is the fact that there are twice as many general-purpose registers available.
  • by Chris_Jefferson ( 581445 ) on Tuesday April 06, 2004 @08:37AM (#8778483) Homepage
    Personally I feel the much more important part of these results is not the Athlon64 vs. Pentium 4, but the Athlon64 on 32-bit vs. 64-bit code. This is a set of benchmarks I've been trying to find for some time.

    If we ignore the cases where the 32-bit code has been optimised via ASM, it looks like the Athlon64 is noticeably faster on 64-bit code, and often much faster. This backs up what a number of people had been saying, that even if 64-bit code takes up more space the extra registers are a bonus (I'm thinking it's quite likely that gcc hasn't got around to using the various new instructions available yet).

  • I just got fed up reading articles where Athlon64s are compared in 32bit mode on Windows. OK, I know that the 64bit version of Windows XP is not finished yet, and that most "average Bob users" are going to use Windows. But why has no benchmarking site (like Tom's Hardware) ever tried to run some {Linux64,BSD64,whatever-64} benchmark, just to show us what benefits of the x86_64 architecture are already measurable?
    • Re:Finally !!! (Score:3, Interesting)

      by bandrzej ( 688764 )
      Actually, there is a beta version of Windows 64 bit [microsoft.com] out for AMD and Intel users to test out. It costs nothing to download, and you can get a CD in the mail if you need to. I had problems burning from their ISO, so I opted for the CD in the mail. The biggest advantage of Win 64 bit is you get past a great deal of the memory limitations in Windows XP and 2000. I have noticed a great difference in speed between running the same AMD 64 3200+ machine in Windows XP and Windows 64 bit on a dual boot.
      • 1. No comparison has ever been made between Windows and Windows 64, AFAIK
        2. This crappy beta's installer doesn't boot on my machine. And we're not the only ones having problems getting it to work.
  • If that's what a standard Athlon64 does to an EE P4, I'd love to see what, say, the AthlonFX-53 would be capable of...

    *salivate* If only I could afford one of the damn things...

    • If that's what a standard Athlon64 does to an EE P4, I'd love to see what, say, the AthlonFX-53 would be capable of...

      IIRC, the Athlon FX is basically an Opteron and the P4 EE is a Xeon, so it says something about the server market too.
    • Re:Holy Crap! (Score:2, Informative)

      by mobby_6kl ( 668092 )
      I can't RTFA, but from the article summary it is a regular Prescott, not ExtremeEdition. IIRC, "E" stands for Prescott, "C" would be for Northwood core.
