High-Availability describes the procedure you take to make your server farm resilient to hardware failures. In other words, High-Availability means that users can continue to access SharePoint if a single component in the farm fails. By contrast, disaster recovery refers to the processes you put in place to avoid loss of data when a hardware or software failure occurs. When designing High-Availability and disaster recovery strategies for a SharePoint farm, it is important to understand the various approaches required by each logical tier in the farm. High-Availability for the database tier requires understanding how SQL Server provides high availability and the associated requirements.
You can protect against the loss of database servers by adding more servers running SQL Server. You can configure multiple SQL Servers to provide high availability for databases by using SQL mirroring, SQL Server failover clustering, or AlwaysOn Availability Groups. We can see each of these options in detail now.
Using SQL Mirroring for High-Availability
SQL database mirroring is an option for making databases highly available in SQL Server 2008 R2 (in SQL Server 2012, high availability is handled through AlwaysOn Availability Groups. Database mirroring provides a redundant copy of a database, called a mirror, on an alternate SQL Server; this mirrored database is automatically kept up to date with changes occurring in the principal copy of the database. The mirrored database server is also known as the failover database server. Mirroring sends transaction log information from the server holding the principal copy of the database to the server holding the mirror, where the logs are continuously replayed, so that the mirror copy reflects the latest changes.
With SQL mirroring, only the principal SQL Server can write changes to the database at any given time.
SQL mirroring supports only one mirror to provide two copies of the database in total. SQL Mirroring supports the following three availability modes.
High-Availability mode. This mode uses synchronous mirroring so that transactions on the principal Database are only committed after the same transactions have been successfully copied to the mirror. This mode requires the use of an additional SQL Server as a witness server, that monitors both the Principal server and the mirror server, if you want to provide optional automatic failover capabilities.
High safety mode. This mode also uses synchronous mirroring, but it does not require the use of a witness server. This mode guarantees the data on the mirror, but cannot provide automatic failover and should be considered a warm standby option. This means that the solution cannot automatically take over in the event of a failure at the primary server and some reconfiguration work is required to bring the solution online.
High performance mode. This mode uses asynchronous mirroring, that allows the principal to commit transactions as soon as possible, without waiting to confirm that the mirror has received copies of the transactions. This means that the mirror may not be fully up to date with the principal, so some data may be lost at failover. However, this mode allows for a higher throughput of transactions on the principal. This mode does not require a witness server because failover is a manual process.
Failover database server
SharePoint supports automatic failover with SQL Server mirroring by using the failover database server option provided when you create new web applications, new content databases, or new service applications. If you specify a failover database server and SharePoint is unable to contact the principal SQL Server after a number of retry operations, SharePoint will automatically attempt to connect to the same database on the failover database server, that would be the mirror. SQL synchronous mirroring can be configured on SQL Server Standard edition and has no special requirements for Windows Server. You can only configure SQL asynchronous mirroring by using SQL Server Enterprise or Datacenter editions.
Using Failover Clustering for SQL High-Availability
SQL Server failover clustering is based on Windows Server Failover Clustering (WSFC). Clustering protects against the failure of the actual database server hardware, whereas mirroring protects against the failure of the data itself. Using WSFC and a shared storage platform, such as iSCSI or fiber channel-based storage, enables you to create a High-Availability instance of SQL Server. A High-Availability instance of SQL Server includes elements such as the instance network name and IP addresses, in addition to the databases running on that instance. This means that applications, such as SharePoint, that are running on the databases do not need configuration of a failover server setting. If a server in the cluster fails, the entire SQL instance, including storage access, network name, IP address and supporting Windows services, fail over to another node in the cluster and the application is able to access the databases without reconfiguration. Because of the requirement for the WSFC feature, SQL failover clustering requires that SQL Server be installed on Windows Server Enterprise or Datacenter editions. You can use SQL Server Standard edition to create a two-node failover cluster. You can use SQL Server Enterprise edition to create a failover cluster using the maximum number of nodes supported by the operating system. Windows Server 2008 R2 supports clusters up to 16 nodes; Windows Server 2012 supports clusters up to 64 nodes.
Using AlwaysOn Availability Groups for High-Availability
SQL Server 2012 offers a new high-availability mechanism called AlwaysOn Availability Groups. AlwaysOn Availability Groups provide high-availability protection for databases by managing multiple copies of a database in a manner similar to database mirroring, but significantly enhanced. When you create an AlwaysOn Availability Group, you configure an instance of SQL Server 2012 as the primary replica for that group. The primary replica hosts the read/write copies of the databases that you want to keep highly available. You can then configure between one and four SQL Server 2012 instances as secondary replicas. Each secondary replica hosts a copy of each of the databases from the primary replica that are included in the availability group. If the primary replica becomes unavailable, you can manually or automatically pass the role of primary replica to one of the secondary replicas in a process called failover. The databases that you want to make highly available in the availability group must use the full recovery model. The primary replica keeps secondary replicas up-to-date by copying transactions across to secondary copies. AlwaysOn Availability Groups support the following two availability modes.
- Synchronous-commit mode. In synchronous-commit mode, when a client updates a database, the primary replica writes the change to its local log and copies the change to its secondary replicas. The primary replica then waits for the secondary replicas to confirm that they have written the transaction to their logs before committing the transaction and sending an acknowledgement to the client. Synchronous-commit mode ensures that the databases that are part of the availability group on the primary and secondary replicas are synchronized at all times to ensure no data loss in the event of a failover. However, this process results in reduced performance while the primary replica waits for confirmation.
- Asynchronous-commit mode. In asynchronous-commit mode, when a database receives an update, the primary replica writes the change to its log and sends the updates to its secondary replicas. This mode does not wait for confirmation from the secondary replicas and that results in improved performance. However, if a failover occurs, the secondary replicas may not have all the transactions from the primary and some data may be lost. Clients connect to the databases in an availability group using an availability group listener that consists of a DNS name, a TCP port number and one or more IP addresses. Using a listener removes the need to reconfigure clients to connect to a new primary replica in the event of a failover. AlwaysOn Availability Groups support various types of failover that can be automatic or planned and may or may not guarantee zero data loss depending on whether the availability databases are running in synchronous-commit mode or asynchronous-commit mode. Automatic failover is only available in synchronous-commit mode.
Using Log Shipping for High-Availability
In SQL Server, log shipping provides a simple way to implement a warm standby solution to aid high-availability and disaster recovery scenarios. Warm standby means that the solution cannot automatically take over in the event of a failure at the primary server and some reconfiguration work is required to bring the solution online. However, a warm standby solution is much quicker to implement than restoring databases from backup and it is typically cheaper to implement than a hot standby solution, such as mirroring, that can automatically fail over. Log shipping consists of periodically backing up the transaction log and copying the backup to a secondary server that also holds a copy of the database.
The secondary server restores the log backup so that the logs are ready to process when you want to start the secondary copy. This backup, copy and restore process is automated by a series of SQL Server Agent jobs. You can specify how frequently log backups are performed and transferred to secondary servers to determine how far behind the secondary copy is. However, it is important to note that with log shipping, there is always a risk of data loss, because there is no way to guarantee transactional consistency between the primary and secondary copies. Log shipping is useful because there is not the same requirement for transactional consistency and no clustering requirements, so log shipping can be used across various subnets and often between SQL Servers in various physical locations, such as a disaster recovery site. You can also configure secondary copies to have a load delay, that prevents the secondary database from including changes within a recent time window, such as eight hours. This means that the secondary copy will always be eight hours behind the primary copy. This configuration can be useful where you want to protect the database against user error or logical corruption. In such a case, the administrator can choose to switch to the secondary copy that has not yet processed the action you want to avoid. Log shipping requires that you use the full or bulk-logged recovery models on the database. You can use log shipping with SQL Server Standard and Enterprise editions running on Windows Server Standard edition. There is no specific requirement for Windows Server Enterprise edition to configure log shipping.
High-Availability Support for SharePoint Databases
Although SQL Server supports several high-availability technologies, not all of these Technologies are suitable for SharePoint databases. Before planning your high-availability and disaster Recovery strategy for SharePoint, it is important to understand which high-availability technologies are suitable for use with your farm.