Deduplication Defined

“Data deduplication examines the data that’s in a typical backup set, where there’s often redundant data, and finds ways to reduce it,” says Victor Nemechek, ProtecTIER* deduplication offering manager at IBM. He offers the example of a PowerPoint presentation that’s been emailed to 20 people. Without deduplication, the backup application stores all 20 copies found on the system. Data deduplication identifies data that’s already been stored and, rather than using extra disk space, stores one copy and points to it for the other 19 instances.
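
The PowerPoint example can be sketched in a few lines of Python. This is a minimal illustration only, not a description of how ProtecTIER works internally: the DedupStore class, its method names, the file names, and the choice of SHA-256 as the content fingerprint are all assumptions made for the example.

```python
import hashlib


class DedupStore:
    """Toy whole-object deduplicating store (hypothetical, for illustration)."""

    def __init__(self):
        self.blobs = {}    # fingerprint -> the single stored copy of the data
        self.catalog = {}  # backup item name -> fingerprint (the "pointer")

    def put(self, name, data):
        fingerprint = hashlib.sha256(data).hexdigest()
        # Keep the bytes only if this content has not been seen before.
        self.blobs.setdefault(fingerprint, data)
        self.catalog[name] = fingerprint

    def get(self, name):
        # Follow the pointer back to the one stored copy.
        return self.blobs[self.catalog[name]]

    def stored_bytes(self):
        return sum(len(blob) for blob in self.blobs.values())


store = DedupStore()
presentation = b"quarterly results deck " * 1000  # stand-in for the attachment

# The same presentation appears in 20 mailboxes in the backup set.
for i in range(20):
    store.put(f"mailbox{i:02d}/results.ppt", presentation)

logical_bytes = 20 * len(presentation)
print(len(store.catalog), "items cataloged,", len(store.blobs), "copy stored")
print(logical_bytes, "bytes backed up,", store.stored_bytes(), "bytes on disk")
```

Every mailbox entry can still be restored in full, because each catalog entry points at the same stored copy.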

Data deduplication technology also works at the byte level. “It’s any data that’s stored multiple times,” Nemechek says. “For example, let’s say there’s a corporate logo. And you might have that logo in PowerPoints, Word documents, on the Web—it’s everywhere and it doesn’t change based on where it is. Deduplication technology recognizes that it’s seen this chunk of data before and won’t write it again.”
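
The logo example can be illustrated the same way at a finer grain. The sketch below is a simplified assumption, not IBM's method: it splits each document into fixed 64-byte chunks and fingerprints them with SHA-256. Fixed-size chunking only spots a repeat when it falls on the same chunk boundaries, so the sample documents are deliberately padded to make the logo line up; real data offers no such guarantee. The names dedup_write, rebuild, and CHUNK_SIZE are hypothetical.

```python
import hashlib

CHUNK_SIZE = 64  # assumed chunk size, kept small for the demo


def dedup_write(data, chunk_store):
    """Split data into fixed-size chunks and store each unique chunk once.

    Returns the "recipe": the ordered fingerprints needed to rebuild data.
    """
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        chunk_store.setdefault(fingerprint, chunk)  # seen before: not rewritten
        recipe.append(fingerprint)
    return recipe


def rebuild(recipe, chunk_store):
    """Reassemble the original bytes from a recipe."""
    return b"".join(chunk_store[fp] for fp in recipe)


# A 512-byte stand-in for the corporate logo (8 distinct chunks).
logo = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(16))

# Three different documents that all embed the same logo, chunk-aligned.
docs = {
    "slides.ppt": b"S" * 64 + logo + b"slide text".ljust(64, b"."),
    "memo.doc": b"memo header".ljust(128, b".") + logo,
    "index.html": logo + b"<html>".ljust(64, b"."),
}

chunk_store = {}
recipes = {name: dedup_write(data, chunk_store) for name, data in docs.items()}

logical_bytes = sum(len(data) for data in docs.values())
stored_bytes = sum(len(chunk) for chunk in chunk_store.values())
print(logical_bytes, "bytes written by the backup,", stored_bytes, "bytes stored")
```

The logo's chunks are written once, when the first document is processed; the other two documents only add references to them in their recipes.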

—T.D.
