

The Right Formula

Testing proves deduplication for mainframes is worthwhile

Illustration by Dustin Miller

This past spring, many vendors and customers assembled at the SHARE user group conference in Anaheim for a roundtable discussion on the value of deduplication in a z/OS* environment. Their mission was to share information and relate experiences in four areas (see “Deduplication Considerations”).

The group concluded that results will vary, but adding a deduplication backstore to a mainframe virtual-tape solution is a winner.

Virtual Tape a Prerequisite

Introducing deduplication technology in a z/OS mainframe environment implies employing a Virtual Tape Library (VTL). Considerations about a VTL may include: How often are you constrained by a limited number of physical tape drives? Do you have to run many of your backup jobs serially? How often do you have to wait to copy data from one tape to another? What’s the advantage of running all of your backups in parallel? Employing VTL technology has its pros and cons, especially as a backup repository. Consequently, this discussion is limited to deduplication and data reduction related to mainframe sites that already employ a VTL.

Data Reduction for Backup

While some industry analysts are legitimately trying to set appropriate customer expectations, others arbitrarily question the value of mainframe deduplication, especially in conjunction with mainframe backup. Fortunately, arbitrary arguments that deduplication doesn’t help reduce z/OS backup storage don’t hold up under scrutiny of the facts, and there really aren’t any valid arguments against z/OS mainframe users applying deduplication technology.

When comparing backup-reduction percentages, deduplication alone might not demonstrate the same dramatic reductions on the mainframe as it does in environments using less-sophisticated, simple full-volume backup solutions. Typically, z/OS data-protection solutions already offer sophisticated data-reduction techniques. They may, for example, select only files that change (incremental backup) to reduce their backstore requirement, or copy only changed files from disk while carrying forward unchanging data from prior backups (merge backup) to virtualize the creation of new full-volume images.
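The two selection techniques described above can be sketched in a few lines of Python. This is a hedged illustration only — the function names and the in-memory file map are invented for the example and do not represent any vendor's backup utility:

```python
def incremental_backup(files, last_backup_time):
    """Select only files modified since the last backup (incremental backup)."""
    return {name: data for name, (mtime, data) in files.items()
            if mtime > last_backup_time}

def merge_backup(prior_full, incremental):
    """Build a synthetic full image: changed files come from disk,
    unchanging data is carried forward from the prior full backup."""
    merged = dict(prior_full)    # start from the prior full image
    merged.update(incremental)   # overlay only the changed files
    return merged

# Two files on disk; only "payroll" changed since the backup taken at time 5.
disk = {"payroll": (10, b"payroll v2"), "ledger": (3, b"ledger v1")}
prior_full = {"payroll": b"payroll v1", "ledger": b"ledger v1"}

inc = incremental_backup(disk, last_backup_time=5)
new_full = merge_backup(prior_full, inc)
```

The incremental pass reads only the one changed file, yet the merge still yields a complete, restorable full-volume image — which is why these techniques already reduce backstore requirements before deduplication is ever applied.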

Additionally, once they create disk master files and databases, z/OS applications don’t copy and rewrite whole master files. They only change portions of the data in a disk file or database at any one time. Considering this, critics then ask, “When mainframe incremental backups are only selecting new and changing data, how useful can deduplication be if it only deals with pre-existing and repeating data?” No matter how much of the data in a file is actually new, general-purpose mainframe backup utilities always copy an entire file; they back up what’s new in files along with plenty of data that didn’t change. Consequently, if a deduplication solution can recognize chunks of data within a file that it’s previously seen, it has the opportunity to reduce an incremental backup by only writing out the new chunks.
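The chunk-level mechanism this paragraph describes can be sketched as follows. This is a minimal, hypothetical illustration — the fixed-size 4 KB chunks and the SHA-256-keyed in-memory store are assumptions of the sketch; production deduplication products typically use content-defined chunk boundaries and persistent indexes:

```python
import hashlib
import random

CHUNK_SIZE = 4096  # fixed-size chunks; real products often chunk by content

class DedupStore:
    """Toy backstore that keeps one physical copy of each unique chunk."""

    def __init__(self):
        self.chunks = {}       # chunk hash -> chunk bytes
        self.bytes_in = 0      # logical bytes presented for backup
        self.bytes_stored = 0  # physical bytes actually written

    def backup(self, data):
        """Write only previously unseen chunks; return the recipe of
        hashes needed to reassemble the file on restore."""
        recipe = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.bytes_in += len(chunk)
            if digest not in self.chunks:
                self.chunks[digest] = chunk
                self.bytes_stored += len(chunk)
            recipe.append(digest)
        return recipe

    def restore(self, recipe):
        return b"".join(self.chunks[h] for h in recipe)

# A 100-chunk "master file" backed up twice; the second backup changes
# only one chunk, so 99 of its 100 chunks deduplicate against the first.
random.seed(42)
original = bytes(random.getrandbits(8) for _ in range(CHUNK_SIZE * 100))
modified = (original[:CHUNK_SIZE * 50] + bytes(CHUNK_SIZE)
            + original[CHUNK_SIZE * 51:])

store = DedupStore()
recipe1 = store.backup(original)   # all 100 chunks are new
recipe2 = store.backup(modified)   # only the 1 changed chunk is written
```

Even though the second backup presents the entire file — just as a general-purpose backup utility copies the whole file — the backstore grows by only a single chunk, which is the reduction effect the roundtable participants reported at scale.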

Mainframe customers report 20-to-1 reductions for z/OS backup and better than 10.5-to-1 for distributed data backup.

Thomas J. Meehan is vice president of advancing technology at INNOVATION Data Processing.


