
Run Hadoop and Spark on Power Systems to Unlock Business Value

Big Data Workloads
Photo by Paul Price
 

Big data implementations have become increasingly important for businesses in a variety of industries. Organizations in manufacturing, social media, finance, healthcare and municipal government are all using big data to streamline functions, gain insights and boost efficiency. Many enterprises are choosing to run their big data workloads on IBM Power Systems* servers, which offer several advantages for Apache Hadoop and Apache Spark applications.

“Hadoop and Spark benefit from being run on Power Systems because of the POWER* architecture’s ecosystem, ease of use, system performance and price performance,” says Raj B. Krishnamurthy, chief architect for analytics stack design and performance, IBM Systems. “POWER technology is designed for big data, letting clients do more with less.”

IBM saw the potential for Hadoop and Spark to add value to business processes at the outset of the big data revolution. Today, enterprises use Hadoop and Spark to generate new types of analytics to solve business problems and pursue new opportunities. IBM has kept pace with client demand through its hardware and integrated software offerings, says Steve Roberts, big data offering manager, Power Systems.

IBM built initial reference architectures and integrated solutions optimized for Hadoop with the release of POWER7* servers. POWER8* technology was designed with big data in mind, particularly in its memory bandwidth, memory cache and eight-threads-per-core processing, Roberts says. With it, IBM offers fully assembled clusters with a preloaded Hadoop and Spark software stack.

The addition of Linux* OS-centric Power Systems servers with the S822L and S812L offerings in 2014 continued the big data revolution. These scale-up servers are especially cost-effective for running big data workloads, Roberts says.

Realizing that clients also needed a storage-dense model to run Hadoop and Spark, IBM delivered the LC line in 2015. The S812LC has all of the processor advantages of POWER8 servers in terms of memory, threads and cache as well as the capability to hold 14 large form factor hard disk drives or SSDs.

As Spark grows in popularity, IBM is focusing on building more optimized solutions that include larger memory configurations and higher speed disk access through SSD flash storage to support Spark’s intensive in-memory applications, Roberts says.
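To make the link between threads, memory and Spark concrete, here is a minimal sizing sketch. It assumes a hypothetical node with 10 cores and 256 GB of memory (figures not taken from the article) and uses the eight threads per core mentioned above. Treating each hardware thread as one Spark task slot and reserving 8 GB for the operating system are simplifying assumptions for illustration, not IBM tuning guidance. The function name `spark_executor_settings` is also hypothetical; the two property names are standard Spark configuration keys.

```python
def spark_executor_settings(cores_per_node, threads_per_core, node_memory_gb,
                            executors_per_node, reserved_memory_gb=8):
    """Derive per-executor Spark settings from one node's resources.

    Assumption: every hardware thread is offered to Spark as a task slot,
    and reserved_memory_gb is held back for the OS and other daemons.
    """
    if executors_per_node < 1:
        raise ValueError("executors_per_node must be at least 1")

    hardware_threads = cores_per_node * threads_per_core
    usable_memory_gb = node_memory_gb - reserved_memory_gb
    if usable_memory_gb < executors_per_node:
        raise ValueError("not enough memory for the requested executors")

    return {
        # Task slots per executor: threads split evenly across executors.
        "spark.executor.cores": str(hardware_threads // executors_per_node),
        # Heap per executor, in whole gigabytes.
        "spark.executor.memory": f"{usable_memory_gb // executors_per_node}g",
    }


# Hypothetical node: 10 cores x 8 threads, 256 GB RAM, 4 executors.
settings = spark_executor_settings(
    cores_per_node=10,
    threads_per_core=8,
    node_memory_gb=256,
    executors_per_node=4,
)
for key, value in settings.items():
    print(f"{key}={value}")
```

In practice the right split depends on the workload; memory-heavy Spark jobs may favor fewer, larger executors than this even division suggests.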

The POWER Advantage

Many big data systems run on clusters, requiring efficient networking and disk I/O subsystems. The combination of compute processing, network and I/O capabilities allows Power Systems servers to run Hadoop and Spark workloads more efficiently than x86, Krishnamurthy says.

Higher main memory bandwidth and larger caches on POWER servers mean workloads are processed faster. “With 4x more cache, Power Systems can yield higher performance than x86,” Krishnamurthy says. Further, Java* Development Kits (JDKs), including IBM JDK and OpenJDK, are optimized for POWER technology, allowing Hadoop and Spark to operate faster.

Shirley S. Savage is a Maine-based freelance writer. Shirley can be reached at savage.shirley@comcast.net.


