
Game Changer

Linux on POWER
Illustration by Charles Williams

By enabling never-before-possible speed of insights while expanding capabilities and flexibility, the IBM POWER9* processor represents game-changing innovation for organizations deploying Linux* workloads. The newest POWER* processor is designed with data in mind and removes system and cluster bottlenecks to ensure superior data movement for a wide range of applications and devices, especially cognitive, big data and high-performance computing (HPC) workloads.

The POWER9 processor delivers value to Linux users in several ways, says Dylan J. Boday, offering manager, Cognitive Systems Infrastructures. Linux users can build on the innovations delivered within the architecture, including industry-leading interfaces. Released in late 2017, the IBM Power Systems* AC922 is the first system to deliver the POWER9 processor to the market. With 2x better per-core performance than x86, the POWER9 processor drives computing efficiency. “The marketplace is becoming very impatient with the lack of innovation from alternative architectures,” says Boday.

Faster I/O

The POWER9 processor is the first in the market to deliver PCI-Express 4.0 (PCIe Gen4), which, along with the NVLink and OpenCAPI buses, gives a significant boost in data transfer speeds to I/O interfaces. Compared to PCIe Gen3, PCIe Gen4 doubles the rate at which data can move between the processor and PCIe Gen4 devices such as InfiniBand adapters. Alternative architectures like x86 continue to support PCIe Gen3. For InfiniBand network connectivity, IBM works with Mellanox, which is developing PCIe Gen4 value-add devices and adapters. The combination of PCIe Gen4 and InfiniBand provides clients with the best network technology for scaling workloads, Boday notes.
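The doubling from PCIe Gen3 to Gen4 can be checked with a little arithmetic. The following is a minimal sketch, not an IBM tool: it uses the per-lane transfer rates and 128b/130b encoding from the PCIe specifications, assumes a x16 slot, and ignores packet and protocol overhead, so real-world throughput will be lower. The function and constant names are illustrative only.

```python
# Per-lane transfer rates in gigatransfers per second (PCIe 3.0 and 4.0 specifications).
GT_PER_S = {"gen3": 8.0, "gen4": 16.0}

# Both generations use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128 / 130


def pcie_bandwidth_gb_per_s(generation: str, lanes: int = 16) -> float:
    """Approximate one-direction bandwidth in GB/s, ignoring protocol overhead.

    The x16 default is an assumption for illustration, not a statement
    about any particular adapter.
    """
    gigabits_per_s = GT_PER_S[generation] * ENCODING_EFFICIENCY * lanes
    return gigabits_per_s / 8  # convert bits to bytes


if __name__ == "__main__":
    gen3 = pcie_bandwidth_gb_per_s("gen3")
    gen4 = pcie_bandwidth_gb_per_s("gen4")
    print(f"Gen3 x16: {gen3:.2f} GB/s per direction")
    print(f"Gen4 x16: {gen4:.2f} GB/s per direction")
    print(f"Ratio:    {gen4 / gen3:.1f}x")
```

Because the encoding and lane count are the same in both cases, the ratio depends only on the per-lane transfer rate.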

Meanwhile, AC922 users can run artificial intelligence (AI) and HPC workloads faster by employing NVIDIA’s NVLink. The POWER9 processor is the only one on the market to have NVLink embedded directly in the processor. This provides 5.6x the bandwidth from CPUs to NVIDIA GPUs compared to PCIe Gen3. The larger bandwidth is critical for Linux users who are tackling large, complex projects. These advanced I/O interfaces on the POWER9 processor deliver nearly 4x the performance of x86 for deep-learning frameworks, Boday says.

Accessing data from memory is critical in today’s environment, where speed of insights can be linked to an architecture’s memory bandwidth. This applies to workloads such as in-memory databases (including SAP HANA) and large, complex AI models whose size has outgrown the accelerator’s local memory capacity. The superior memory bandwidth of the POWER9 processor, which is nearly 2x that of x86, is a key differentiator, creating a balanced system and enabling data to move freely within a server or out to a network. The POWER9 processor has a maximum memory bandwidth of 230 GB/s in scale-up systems and 170 GB/s in scale-out systems.

Beyond data movement advantages, the POWER9 processor delivers more computing capability than alternative architectures like x86, with up to 4x the threads per core. Combined with the balanced system design of I/O and memory bandwidth, this architecture yields better results for Linux workloads. “Large, complex database and cognitive workloads need a different strategy, and commodity systems no longer meet those requirements,” Boday says. “In the post-processor-only era, data movement to value-add devices is going to be critical in meeting the market needs.”

CAPI and OpenCAPI Speed Communication

The POWER9 processor uses the latest CAPI and OpenCAPI bus technologies to remove performance roadblocks. CAPI allows the processor to communicate coherently with devices attached to CAPI buses, providing fast, smooth transitions with low latency and reduced compute overhead.

OpenCAPI, a consortium advancing the CAPI protocol, works to enable higher bandwidth from the processor to value-added devices. Like the OpenPOWER consortium, OpenCAPI partners develop new ways to get data in and out of the processor to meet market demands. A monolithic development ecosystem no longer meets these needs. Many large system providers and individual device manufacturers are producing offerings that leverage OpenCAPI. “IBM wants people to innovate around the POWER architecture to synergistically deliver solutions that meet clients’ expectations,” Boday says.

In an industry first, the POWER9 technology delivers a coherent interface to the processor, which eliminates the protocol overhead normally associated with moving data from one device to and through the processor. This can significantly reduce latency (i.e., the time it takes to transfer data between two points) for a broad range of applications, and it is accomplished with the 25 Gbps-per-lane high-speed bus. This bus offers the flexibility to use NVLink protocols, for the best platform for GPUs, or OpenCAPI protocols, which provide a high-bandwidth, low-latency bus for other value-add devices such as networking and storage.

The POWER9 processor can deliver up to 300 GB/s of bandwidth to GPUs or 200 GB/s of bandwidth to OpenCAPI devices. “As organizations explore innovations to meet their market challenges, they will find the POWER9 processor to be a breath of fresh air, which allows them to be creative in solving these challenges,” says Boday. “Clients will be very interested in the potential 7-10x bandwidth that the POWER9 processor delivers as they deploy new memory architectures, storage and various accelerators.”
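The GPU and OpenCAPI figures can be related to the 25 Gb bus with simple arithmetic. The sketch below is illustrative only and rests on several assumptions that the article does not state: that the 25 Gb figure is a per-lane signaling rate in gigabits per second, that the totals are gigabytes per second counted in both directions, and that 48 lanes serve GPUs while 32 serve OpenCAPI devices. It uses raw signaling rates and ignores encoding and protocol overhead. All names are hypothetical.

```python
# Assumption: the article's "25 Gb" is a per-lane signaling rate in Gb/s.
LANE_GBPS = 25.0

# Assumptions for illustration only; the article gives no lane counts.
GPU_LANES = 48
OPENCAPI_LANES = 32


def aggregate_gb_per_s(lanes: int, bidirectional: bool = True) -> float:
    """Raw aggregate bandwidth in GB/s for a group of 25 Gb/s lanes."""
    per_direction = lanes * LANE_GBPS / 8  # convert bits to bytes
    return per_direction * 2 if bidirectional else per_direction


if __name__ == "__main__":
    print(f"GPU links:      {aggregate_gb_per_s(GPU_LANES):.0f} GB/s")
    print(f"OpenCAPI links: {aggregate_gb_per_s(OPENCAPI_LANES):.0f} GB/s")
```

Under these assumptions the results line up with the 300 and 200 figures quoted above. With a different lane count or one-direction counting, the totals change accordingly.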

This potent combination lets data move freely within the system as well as out to the network over the advanced buses. Additionally, the POWER9 architecture creates flexibility so clients can use a mix of value-added devices like GPUs and OpenCAPI devices. “This opens up some very interesting combinations, as you can imagine, with the data sitting closer to accelerators allowing faster speed of insights for AI workloads,” says Boday.

NVLink and AI on POWER9

For clients seeking to delve deeper into AI applications, second-generation NVLink running on the POWER9 processor yields nearly 4x the performance of those workloads on x86. NVLink enables data scientists working on AI projects to train models faster. Data scientists can also run more iterations for deep-learning projects and update them more frequently with the most relevant information. “Those are very tangible things that end users can grasp,” Boday notes. “I often refer to the speed of insights which the Power Systems portfolio delivers to the analogy of our children going to school. If we can enable AI models to iterate faster, they will learn more at a faster rate. It’s analogous to going to school once a week versus learning continuously. We all want our children to learn constantly; we should also want our AI models to do the same,” he says.

Improved TCO

The POWER9 processor provides superior total cost of ownership (TCO) due to its processor innovation, its scalability, and its ability to move data in and out to value-added devices (e.g., storage, memory and accelerators). That drives efficiency for the data center. “Leveraging data with industry-leading speed of insights is key to our clients today as this drives competitive advantages within their markets and helps them grow their business,” Boday says. “As the world evolves from the CPU-only era, these advanced I/O interfaces will be critical for Linux users to meet their expectations as well as those placed on them by the market.”

Clients running Linux on older IBM Power Systems servers can take advantage of innovations such as NVLink 2.0, PCIe Gen4 and OpenCAPI by upgrading to POWER9. The POWER9 processor transforms clients’ data centers to a modern architecture, whether they are currently running x86 or POWER8* processor-based infrastructure.

Today’s business environment constantly requires organizations to up their game. And the game-changing technology of the POWER9 processor makes it easier to stay on top.

Shirley S. Savage is a Maine-based freelance writer.


