Introduction
This article is intended to cover one of the most critical responsibilities of database administrators in order to maintain data integrity. I will explain with some examples the main concepts of data backup and recovery in different database management systems.
We can find several scenarios of database failures such as media failures that occur when a server component fails and the content of data files cannot be read or it's corrupted. Then, the database administrator must perform the recovery process in order to restore the corrupted data files. We can find non-media failures such as process failure, instance failure and user error.
Before developing a recovery plan, we need to answer ourself some questions-
- What's the most important data?
- How much data can we lose? The less data we can afford to lose, the more often we should back up the data. For example, if you afford to lose data in a period of 24 hours; then, depending of database size, an acceptable solution is to do full backup every day. But, if you can not afford to lose data in a period of 20 minutes, then you must do transaction log backups every 20 minutes. You can consider other fault-tolerance solutions such as log shipping, database mirroring, RAID mirroring, archive the data files off-site to other network or tape device.
- How much downtime is acceptable to your business? How much money does your business lose for each hour of downtime? How much time is acceptable from the downtime to everything is running again? The solution is to have redundant data across several database systems using techniques such as replication, log shipping, database mirroring and failover clustering.
Now let's talk about the different methods of backup in SQL Server 2005. We can find three methods: full, transaction log and differential.
The full backup makes a full copy of the database including all changes and transaction logs as of the point of time when the backup operation ends. It's a time-consuming process.
Transaction log backup saves the transaction log's activities that have occurred since the last full or transaction log backups. After that, the transaction log's inactive portion is truncated in order to remain the transaction log files' size as small as possible.
Differential backup saves all the data and transaction logs that have changed since the last full backup. This type of backup generates files smaller than the full backup.
In order to perform a backup, you need to use the backup database statement. Please, refer to SQL Books Online.Let's do a full database backup of the AdventureWorks database to a file (see Listing 1).
backup database AdventureWorks to disk = 'c:\temp\AdventureWorks_2008_Feb.bak'
with description = 'Full database backup to AdventureWorks',
name = 'Backup for Febrary, 2008, Week 2';
Listing 1.
If you have a very large database, you can use up to 64 backup devices (tapes, files) in the backup operation in order to stripe the backup stream across the devices by written to the devices proportionally and simultaneous (see Listing 2).
backup database AdventureWorks
to disk = 'c:\temp\AdventureWorks_2008_Feb.bak',
disk = 'd:\temp\AdventureWorks_2008_Feb.bak'
with description = 'Full database backup to AdventureWorks',
name = 'Backup for Febrary, 2008, Week 2';
Listing 2.
You can also use logical devices which are references to physical backup devices. The sys.backup_devices catalog view lists all the logical devices of your database system.
To add a new backup device definition (see Listing 3)
exec sp_addumpdevice 'disk', 'AdventureWorks_Backup', 'c:\temp\AdventureWorks_2008_Feb.bak';
Listing 3.
Now, let's backup the AdventureWorks database to the AdventureWorks_Backup logical device (see Listing 4).
backup database AdventureWorks
to AdventureWorks_Backup;
Listing 4.
If you want to delete the logical device, use the stored procedure sp_dropdevice (see Listing 5).
exec sp_dropdevice 'AdventureWorks_Backup';;
Listing 5.
If you want to delete physically the .bak file as well, you need to set the DELFILE option.
In SQL Server 2005, you can also mirror a database, transaction log, files and filegroups by creating redundant copies of these objects.
Transaction logs are very important as well as the data files because they record the transaction operations done over the database system. Thus, it's very important to backup the transaction logs. Notice that a transaction log backup cannot be executed until the database has a full database backup (the database must be set to FULL or BULK_LOGGED recovery model). When the process is over, then the inactive portion of the transaction log is automatically truncated (see Listing 6).
backup log AdventureWorks
to disk = 'c:\temp\AdventureWorks_TransactionLogs_2008_Feb.bak';
Listing 6.
Finally, let's explain how to do a differential backup of the AdventureWorks database. A differential backup allows to back up all the data changes from the last full database backup (unless transaction log backup that back up data changes from the last transaction log backup or full database backup).
In order to perform this type of backup operation, we use the same backup database statement but we add the differential keyword.
Finally, we're going to restore the database from a backup using the restore statement. Unlike the backup operation, restore operation is not only (all users must be disconnected before the restore process begins). It makes sense because this process is executed when the system fails. For more information about the restore statement, please refer to the SQL Books Online. This statement has a lot of options.
Let's do some changes to the AdventureWorks database and the restore the database using the full database backup (Listing 7).
update Person.Address
set AddressLine1='2009 Madison St.'
where AddressID=1;
use master;
restore database AdventureWorks
from disk = 'c:\temp\AdventureWorks_2008_Feb.bak',
disk = 'd:\temp\AdventureWorks_2008_Feb.bak'
with file = 1, replace;
Listing 7.
After you have applied a full database backup (the database is norecovery state allowing to apply transaction log backups and differential backups), you may apply for transaction log backups in chronological order. If you want to recover the database from several transaction log backups, you to set the norecovery option. The last statement must set the recovery option to bring the database online again (see Listing 8).
restore log AdventureWorks
from disk = 'c:\temp\AdventureWorks_TransactionLogs_2008_Feb.bak'
with recovery, replace;
Listing 8.
Recovering a database from a differential backup follows the same concepts explained before. Remember to use the norecovery option.
Now let's talk about the different methods of backup in Oracle database. Unlike SQL Server, we have two type of backup in Oracle: user backup and database admin backup.
Users and developers can use the Export and Import utilities to save and retrieve objects in an external file. For example, we want to save the data associated to the table emp on the schema scott. You can run the exp command without any parameter, and then the command prompts you to enter the parameters interactively or you can run the exp command and specify the actions (see Listing 9).
exp scott/tiger tables=(emp) file=emp.dmp
Listing 9.
The imp command reads a binary dump file produced by the exp command and restores the tables and any associated objects. It follows the same philosophy of the exp command (see Listing 10).
imp scott/tiger tables=(emp) file=emp.dmp show=y
Listing 10.
Database administrators have a lot of tools to perform backup and recovery operations. You can do a cold backup such as making copies of the data files, control files, and initialization files while the database is shutdown. Another option is to realize a hot backup performed while the database is open and available to users.
In Oracle, hot backups are performed on one tablespace at a time. Thus, you need to know the name of the data files associated to the tablespace you are backing up. Let's illustrate these concepts by backing up the users tablespace. First execute the following SQL statement to retrieve the associated data files to users tablespace (see Listing 11).
select df.name "Data files"
from v$tablespace ts, v$datafile df
where ts.ts# = df.ts# and ts.name = 'USERS';
Listing 11.
Now bring the users tablespace to backup state (see Listing 12). The transaction executed against this tablespace will be applied when the backup completes correctly.
alter tablespace users begin backup;
Listing 12.
Then copy the underlying data files to a backup device and finish the hot backup (see Listing 13).
Finally, we have the Recovery Manager (RMAN) a set of tools to perform backup and recovery of database. The biggest advantages of RMAN are that it will back up used space in the database and it doesn't bring the tablespaces to backup mode instead RMAN reads the database blocks until it gets a consistent state of the database.
RMAN can do the following tasks:
- Backup all the database objects. Backup data files, control files and log files.
- Reports what objects have been backed up and what days are kept.
- Perform full backup as well as incremental backup.
- Create a duplicate database.
- Test the recovery process.
RMAN is command utility installed as part of the database installation. You can invoke this command from the Command Windows (see Listing 13).
C:\> rman target / RMAN>
Listing 13.
The last command indicates connect to the default database (orcl in our case), then RMAN indicates that is ready to receive command by the command prompt RMAN>.
RMAN can run in two modes: catalog and nocatalog. In catalog mode, the configuration information is stored in a separated database known as the RMAN catalog. The drawback of this mode is that we need to do maintenance tasks on this new database. In nocatalog mode, the configuration information is stored in the target database control file.
In RMAN, there are two major categories of backup: full backup and incremental backup.
Full backup reads the whole database and copies it into the backup device, skipping data blocks that have never been used.
Incremental backup reads the database changes that have changed since a previous backup.
In order to see the configuration information, use the show all RMAN command (see Listing 14).
RMAN> show all;
RMAN configuration parameters are:
CONFIGURE RETENTION POLICY TO REDUNDANCY 2;
CONFIGURE BACKUP OPTIMIZATION OFF; # default
CONFIGURE DEFAULT DEVICE TYPE TO DISK; # default
CONFIGURE CONTROLFILE AUTOBACKUP ON;
CONFIGURE CONTROLFILE AUTOBACKUP FORMAT FOR DEVICE TYPE DISK TO 'e:\backup\ctl_sp_bak_%F';
CONFIGURE DEVICE TYPE DISK PARALLELISM 2;
CONFIGURE DATAFILE BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE ARCHIVELOG BACKUP COPIES FOR DEVICE TYPE DISK TO 1; # default
CONFIGURE CHANNEL 1 DEVICE TYPE DISK FORMAT 'e:\backup\%U.bak' MAXPIECESIZE 4G;
CONFIGURE CHANNEL 2 DEVICE TYPE DISK FORMAT 'e:\backup\%U.bak' MAXPIECESIZE 4G;
CONFIGURE MAXSETSIZE TO UNLIMITED; # default
CONFIGURE SNAPSHOT CONTROLFILE NAME TO 'C:\ORACLE\ORA92\DATABASE\SNCFORCL.ORA'; # default
RMAN>
Listing 14.
Now let's define a full database backup using RMAN commands (see Listing 15).
RMAN> report need backup;
RMAN> RUN
{
ALLOCATE CHANNEL d1 DEVICE TYPE DISK FORMAT 'c:\temp\backup\%U';
# AS COPY is default when backing up to disk
BACKUP DATABASE PLUS ARCHIVELOG;
}
RMAN> report need backup;
RMAN> list backup summary;
Listing 15.
Conclusion
This article has covered the principles of database backup and recovery and explained them through several example written to the most important database management systems today: Microsoft SQL Server and Oracle database.