Understanding the POWER6 power, price and performance proposition

Illustration by Ryan Etter

I'm an IBM System i programmer and systems guy. I write RPG/IV, work in data communications, integrate systems - things like that. And, of course, I work with the System i hardware. After all, my code runs on System i hardware and I connect System i LAN cards to hubs and Ethernet switches. But I don't have to get involved with the processor running System i architecture. It's just ... there. It computes when needed and moves data in and out of memory when needed. The processor isn't something I think about, which is actually a testament to the i5/OS architecture and developers. These developers work hard to maintain the operating system's ease of use, which means users can access the system without a second thought about the underlying processor.

I wanted to learn more about the POWER6 technology after I read the July announcement. You can read about the announcement - and the next release of i5/OS - in the August 2007 cover story. I was interested in understanding how IBM engineers had increased the performance while lowering the actual and potential costs of the chip. I was also curious about the similarities between POWER5 and POWER6 chips, and about their differences - what made the POWER6 chip the successor to the POWER5 chip. I want to thank the engineers - they developed a marvelous chip and are justly proud of the advantages they've added to it. Here's some of the information I gathered talking with those technically proficient folks in Rochester, Minn.

Cycle Time

One of the driving forces in the development of the POWER6 chip was a considerable improvement in the cycle time. The frequency of the currently shipped POWER6 chip is 4.7 GHz, and follow-on POWER6 chips will be even faster. Of course, the higher the frequency, the hotter the chip runs. A higher frequency also requires more power, which can be costly (more about that later).

One industry technique for combating the increase in chip heat is to put more processor cores on a chip. Using simultaneous multithreading (SMT), a feature of both POWER5 and POWER6 processors, each core executes the instruction streams of multiple tasks, so considerably more work can be accomplished with each chip's real estate. SMT is a technology in which a core can fetch instructions from multiple threads simultaneously and schedule the execution of those threads concurrently. The SMT performance efficiency of POWER6 technology has improved relative to POWER5 technology.

As you can imagine (or the processor designers wouldn't have been doing it since the beginning of time), improving the cycle time improves performance. But as you'll see, cycle time isn't the only game in town. Cycle time dictates the rate at which instructions flow through the execution units in the core. For instructions to execute full out at cycle-time speeds, though, the instruction stream and the data it accesses must be present in the L1 cache. When they aren't, the processor waits to some extent. And the effect? It's quite possible that an improvement in the cycle time by a factor of two might only be perceived as, for example, a 30-percent improvement. Cache, then, is critical to POWER6 systems' performance for most applications.
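The arithmetic behind that 30-percent figure can be sketched with a simple back-of-the-envelope model. This is my own illustration, not something the engineers supplied: it assumes run time splits into a part that scales with the clock and a part spent waiting on memory that doesn't scale at all. The function name and the fractions used are hypothetical.

```python
def perceived_speedup(core_bound_fraction, clock_factor):
    """Estimate overall speedup when only part of the run time scales with the clock.

    core_bound_fraction: share of the original run time spent executing
        instructions out of cache (assumed to scale with the clock).
    clock_factor: how much faster the clock is (2.0 means cycle time is halved).
    The remaining time is assumed to be spent waiting on memory and to stay
    unchanged, which is a simplification.
    """
    if not 0.0 <= core_bound_fraction <= 1.0:
        raise ValueError("core_bound_fraction must be between 0 and 1")
    if clock_factor <= 0:
        raise ValueError("clock_factor must be positive")
    # New run time, expressed as a fraction of the original run time.
    new_time = core_bound_fraction / clock_factor + (1.0 - core_bound_fraction)
    return 1.0 / new_time


if __name__ == "__main__":
    # Illustrative fractions only; none of these are measurements.
    for fraction in (1.0, 0.75, 6 / 13, 0.25):
        speedup = perceived_speedup(fraction, 2.0)
        print(f"core-bound share {fraction:.2f}: speedup {speedup:.2f}x")
```

Under these assumptions, a workload that spends a little under half its time executing from cache (6/13 of it) sees a twofold cycle-time improvement as roughly a 1.3x gain, while a workload that never leaves the cache sees the full 2x.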

Another consideration is that main-storage access times haven't been improving at anywhere close to the rate of processor cycle times. So, while it doesn't take more time to access main storage, it does take more cycles. While this has been true for years, with the POWER6 chip's blazing cycle time, it becomes a significant factor in performance. If your application needs to spend most of its time accessing main storage, the improvement in cycle time won't be as evident. This is part of the reason that the POWER6 SMT support is more efficient than on POWER5 technology. SMT provides much of its improvement because during the time that one hardware thread is waiting on a cache miss from main storage, the other thread is able to execute its instructions at full speed; when you increase the latency of a cache miss, you increase the time that an SMT thread can be executing alone. So, in a manner of speaking, SMT is a way of hiding the extra time required to complete a cache miss by reducing the processor's idle time.
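To make the "same time, more cycles" point concrete, here is a rough sketch. The 100-nanosecond main-storage latency, the 2.0 GHz comparison clock and the 300 cycles of work between cache misses are numbers I made up for illustration; only the 4.7 GHz figure comes from the article. The utilization function is a deliberately idealized model (one thread runs at full speed while the other waits), not a description of how the POWER6 core actually schedules threads.

```python
def miss_cost_cycles(latency_ns, clock_ghz):
    """Convert a fixed memory latency in nanoseconds into processor cycles."""
    # A clock of N GHz completes N cycles per nanosecond.
    return latency_ns * clock_ghz


def core_utilization(compute_cycles, miss_cycles, threads=1):
    """Idealized share of time the core is busy.

    Each thread is assumed to alternate between compute_cycles of work and a
    cache miss lasting miss_cycles. With more than one thread, another thread
    is assumed to run at full speed during a miss, so the busy time adds up
    until the core is saturated at 1.0.
    """
    period = compute_cycles + miss_cycles
    return min(1.0, threads * compute_cycles / period)


if __name__ == "__main__":
    LATENCY_NS = 100.0     # assumed main-storage latency, not a measured value
    WORK_CYCLES = 300.0    # assumed cycles of useful work between misses
    for clock_ghz in (2.0, 4.7):  # 2.0 GHz is a hypothetical slower chip
        miss = miss_cost_cycles(LATENCY_NS, clock_ghz)
        one = core_utilization(WORK_CYCLES, miss, threads=1)
        two = core_utilization(WORK_CYCLES, miss, threads=2)
        print(f"{clock_ghz} GHz: miss costs {miss:.0f} cycles, "
              f"busy {one:.0%} with one thread, {two:.0%} with two")
```

In this toy model the same miss costs more than twice as many cycles on the faster clock, so a single thread leaves the core idle for a larger share of the time, and a second thread has more idle time to fill. That is the sense in which a longer miss, measured in cycles, gives SMT more to hide.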

How POWER6 Works

Systems are always getting faster. But faster can mean many things. Techniques used to add more capacity to systems include:

  • Faster cycle times
  • More instruction execution pipes and more complex pipes
  • Additional processor cores per chip
  • More threads per core (i.e., SMT)
  • Additional, and multiple levels of, cache along with more intelligent cache controls
  • More processor chips
  • Increased main storage

Michael Ryan is a technical editor with IBM Systems Magazine. Michael can be reached at


