

Database Recovery - DBA Best Practices

Supporting mission-critical applications is the primary job function of the database administrator (DBA). What are the most important tasks? How do you set priorities? If there were one thing you could do today that could increase productivity, reliability and application performance, would you find the time to implement it?

DBAs need to reset their expectations and methods of database recovery by correctly setting priorities and developing and implementing best practices.

DBA Priorities

To organize your thinking, it helps to list your priority categories. This list can be used as a tool or sounding board to help determine which of the tasks clamoring for the DBA’s attention should be addressed first. “The Laws of Database Administration,” as I outline them, require that data be:

  • Recoverable
  • Available
  • Secure, and
  • Easily accessible

Security and performance concerns can certainly be urgent, but they are less important than the first two requirements. Recoverability is a strategic concern. If you can’t recover your data, little else matters.

Knowing this, DBAs usually implement standard database backup processes. In DB2 for z/OS, the most common method is the tablespace image copy utility. This backup-first approach is backward thinking. Instead, DBAs should focus on recovery processes, not backups.

Taking an image copy provides a false sense of security. Ask yourself: If you depend solely upon database backups, and recovering your application data from these backups takes several days, will your enterprise fail?

Recovery Comes First

The most critical metric for data recovery is the recovery time objective (RTO). It specifies the maximum acceptable length of time to recover a set of application data to a consistent point in time.

The RTO is a consideration early in application development and database design. The DBA must coordinate the data model design, the application-processing needs and the recovery processes in order to meet the RTO. For example, a financial application might require its data be fully available within an hour of a disaster. This might force the DBA to consider alternative backup and recovery strategies. I strongly recommend using a best-practices methodology to develop proper recovery processes.

Best of the Best

To be successful, DBAs need to focus on how to prevent, detect and correct service disruptions; minimize and mitigate risk; and use clearly written, well-documented and centrally located processes. Here are some of the “best of the best” strategies to accomplish those goals.

Database Recovery Processes

Most shops have standard, cookie-cutter backup procedures. Few have tested the corresponding recovery processes, and fewer still have measured the processes to ensure that recovery meets the RTO. Begin by researching potential backup tactics, such as:

  • Both tablespace and index image copies
  • Data replication
  • DASD mirroring
  • DB2 concurrent copy
  • Backup System

Next, consider standards of use. Implement boilerplate Job Control Language (JCL) procs and parms, and gather metrics on execution times and resource requirements. (One shop I know implemented a parallel recovery process from tape backups, only to have it fail when staff tried to recover more objects than it had tape drives available.) Document these as standards and include guidelines for use.
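The tape-drive failure described above can be guarded against with a simple planning step. The following is a minimal sketch in Python, not a DB2 interface: the object names and the `plan_recovery_waves` helper are hypothetical, and it assumes each object's backup sits on its own tape and occupies one drive for the duration of its recovery.

```python
from typing import List


def plan_recovery_waves(objects: List[str], tape_drives: int) -> List[List[str]]:
    """Split objects into waves so that no wave needs more drives than exist.

    Assumption: one object = one tape = one drive while it is being recovered.
    """
    if tape_drives < 1:
        raise ValueError("at least one tape drive is required")
    return [objects[i:i + tape_drives] for i in range(0, len(objects), tape_drives)]


# Hypothetical tablespace names, for illustration only.
waves = plan_recovery_waves(
    ["TS_CUST", "TS_ORDER", "TS_ITEM", "TS_HIST", "TS_AUDIT"], tape_drives=2
)
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {wave}")
```

Each wave could then be submitted as one parallel recovery job, with the next wave starting only after the previous one has released its drives.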

Once implemented, regularly review and test your recovery processes. Recovery times might change due to several factors, including computer hardware upgrades, installation of new peripherals such as virtual tape drives, growth in data volumes, etc.

Frequently forgotten are regularly scheduled partition backups of partitioned tablespaces. For example, consider a tablespace with monthly partitions. A partition containing 7-year-old data (and backed up 7 years ago) might not seem like an issue. However, if it is backed up on physical tape, is the tape media still readable? If you execute the RECOVER utility, DB2 must still review its logs and internal tables to determine if any updates were applied to the partition in the past 7 years. Without prior planning, this could be a disaster in the making.
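One way to keep old partitions from being forgotten is a periodic audit of backup ages. The sketch below assumes the date of each partition's most recent image copy has already been extracted into a plain dictionary; how that is done is site-specific and not shown. The partition names, dates, and the `stale_backups` helper are hypothetical.

```python
from datetime import date
from typing import Dict, List


def stale_backups(last_copy: Dict[str, date], today: date, max_age_days: int) -> List[str]:
    """Return the partitions whose most recent image copy is older than max_age_days."""
    return sorted(
        name for name, copied in last_copy.items()
        if (today - copied).days > max_age_days
    )


# Hypothetical inventory: partition name -> date of its last image copy.
last_copy = {
    "TS_SALES.P001": date(2012, 1, 31),
    "TS_SALES.P084": date(2018, 12, 31),
    "TS_SALES.P085": date(2019, 1, 31),
}
stale = stale_backups(last_copy, today=date(2019, 2, 15), max_age_days=365)
print(stale)  # ['TS_SALES.P001']
```

Partitions flagged this way are candidates for a fresh backup onto current media, which also shortens the span of log history a later recovery would have to consider.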

Recovery Tactics

Be sure to implement basic backup and recovery rules. These include:

  • Ensuring production data changes are logged by the DBMS
  • Retaining sufficient active and archive logs on appropriate media
  • Considering staggered or phased recovery (i.e., preset the table/index recovery order so “customer-intensive” objects are recovered first)
  • Considering options for incremental image copies and index image copies; test and time the options for possible inclusion in recovery plans.
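The staggered recovery rule in the list above can be sketched as a preset ordering. This Python example is illustrative only: the object names, phase numbers and timings are hypothetical, and it assumes objects are recovered one after another, with the quickest objects first within a phase.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RecoveryObject:
    name: str
    phase: int          # 1 = "customer-intensive", recovered first
    est_minutes: float  # measured recovery time from an earlier test


def recovery_order(objects: List[RecoveryObject]) -> List[RecoveryObject]:
    """Order by phase, then by shortest measured recovery time within a phase."""
    return sorted(objects, key=lambda o: (o.phase, o.est_minutes))


# Hypothetical objects and timings.
plan = recovery_order([
    RecoveryObject("TS_HIST", phase=3, est_minutes=90),
    RecoveryObject("TS_CUST", phase=1, est_minutes=20),
    RecoveryObject("TS_ORDER", phase=2, est_minutes=30),
    RecoveryObject("TS_ACCT", phase=1, est_minutes=10),
])

elapsed = 0.0
for obj in plan:
    elapsed += obj.est_minutes  # serial recovery assumed
    print(f"{obj.name}: phase {obj.phase}, available after about {elapsed:.0f} min")
```

The running total shows when each phase becomes available, which is the figure to compare against the RTO for that group of objects.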

Regularly measure recovery times for critical objects and save this data in a form that can be analyzed. Forecast the recovery time of the data as data volume grows, and compare the forecast with the RTO.
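As a rough illustration of such a forecast, the Python sketch below fits a straight line to saved measurements and estimates the data volume at which recovery would no longer meet the RTO. The measurements are invented, and the linear growth model is an assumption; real recovery times may scale differently, so the result should be treated as a prompt for testing, not a guarantee.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired measurements")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all volumes are identical; cannot fit a trend")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def volume_at_rto(slope: float, intercept: float, rto_minutes: float) -> float:
    """Data volume at which the fitted recovery time reaches the RTO."""
    if slope <= 0:
        raise ValueError("recovery time does not grow with volume in this fit")
    return (rto_minutes - intercept) / slope


# Hypothetical measurements: data volume (GB) and measured recovery time (minutes).
volumes_gb = [100, 200, 300, 400]
recovery_min = [15, 25, 35, 45]

slope, intercept = fit_line(volumes_gb, recovery_min)
limit_gb = volume_at_rto(slope, intercept, rto_minutes=60)
print(f"Recovery is forecast to exceed a 60-minute RTO at about {limit_gb:.0f} GB")
```

With these sample numbers the fitted line crosses the one-hour mark at roughly 550 GB, so the recovery process would need to be revisited well before the data reaches that size.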

Critical Steps

While many tactical choices define a DBA’s daily life, these strategies are crucial because they support:

  • All aspects of the enterprise’s data
  • Offloading simple and repetitive work to the machine, freeing the DBA for more skill-based tasks
  • Industry best practices such as recoverability, reliability and data availability

Ask yourself: Do you have standard processes in place? Are they documented? Is the documentation centrally located and easily accessible? Have you done enough in each area to add real value to your organization? Answers to these questions will help you deliver on your recoverability promise.

References

For more information on quality improvement, see the IT Service Capability Maturity Model website: www.itservicecmm.org.

 

Lockwood Lyon is an IBM Champion and DB2 for z/OS systems and database performance specialist. He has more than 20 years of experience in IT as an IMS and DB2 database administrator, systems analyst, manager, and consultant.


