Read Windows Server 2008 R2 Unleashed Online
Authors: Noel Morimoto
service or application.
The following are best practices from this chapter:
. Purchase quality server and network hardware, shared storage devices, and HBAs that
are certified for Windows Server 2008 R2 when deployed on failover clusters.
. Deploy cluster node operating systems on fault-tolerant disk arrays.
. Deploy only services and applications that are certified to work on Windows Server
2008 R2 failover clusters or NLB clusters whenever possible.
. Thoroughly understand the application that will be used before determining which
clustering technology to use.
. Use Windows Server 2008 R2 failover clusters to provide system-level fault-tolerance
for mission-critical applications, such as enterprise messaging, databases, file and
print services, and other networking services.
. If iSCSI is used for shared storage, ensure that any network adapters used for iSCSI
communication are excluded from any cluster usage.
CHAPTER 29
System-Level Fault Tolerance (Clustering/Network Load Balancing)
. Use NLB to provide connectivity to TCP/IP-based services, such as Remote Desktop
Services, websites, VPN services, SMTP gateways, and streaming media services.
. Rename and clearly label all network adapters on each cluster node and configure
static IPv4 and if necessary IPv6 addresses.
. Configure the appropriate power management settings for the system and network
adapters on all cluster nodes.
. Configure the cluster quorum model that fits the deployment, which in most cases
should be the recommended model.
. Use multiple network cards in each node so that one card can be dedicated to
internal cluster communication (heartbeat network) while the other can be used only for
client connectivity and cluster communication.
. If failback is required, configure the failback schedule to allow failback only during
nonpeak times or after hours to reduce the chance of having a group failing back to
a node during regular business hours.
. Thoroughly test failover and failback mechanisms.
. Be sure that a majority of the nodes remain running to keep the cluster in a working
state if you’re removing a node from a cluster that leverages the Node Majority
Quorum model.
. Carefully consider backing up and restoring a cluster and do not deploy any clusters
until a tested and documented backup and recovery plan exists.
. For NLB clusters, create a port rule that allows only specific ports to the clustered IP
address and an additional rule blocking all other ports and ranges.
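The Node Majority guidance above comes down to simple arithmetic: the cluster keeps running only while more than half of its voting nodes are up. The following is a minimal sketch of that check, not a cluster API; the function names are hypothetical, and it assumes every node holds exactly one vote with no disk or file share witness.

```python
def votes_needed(total_nodes: int) -> int:
    """Smallest number of running nodes that forms a strict majority.

    Assumption: one vote per node and no witness (Node Majority model).
    """
    if total_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return total_nodes // 2 + 1


def keeps_quorum(total_nodes: int, running_nodes: int) -> bool:
    """Return True if the running nodes still hold a majority of the votes."""
    return running_nodes >= votes_needed(total_nodes)


if __name__ == "__main__":
    # Hypothetical five-node cluster: check how many nodes can be offline.
    for running in range(5, 0, -1):
        print(running, "of 5 running, quorum kept:", keeps_quorum(5, running))
```

Under these assumptions an even number of nodes tolerates no more failures than the next smaller odd number, which is one reason to review the quorum model before taking a node out of service.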
Backing Up the Windows Server 2008 R2 Environment
IN THIS CHAPTER
. Understanding Your Backup and Recovery Needs and Options
. Creating the Disaster Recovery Solution
. Documenting the Enterprise
. Developing a Backup Strategy
. Windows Server Backup Overview
. Using Windows Server Backup
. Managing Backups Using the Command-Line Utility wbadmin.exe and PowerShell Cmdlets
. Backing Up Windows Server 2008 R2 Role Services
. Volume Shadow Copy Service (VSS)
. Windows Server 2008 R2 Startup Options
Windows Server 2008 R2 is a very powerful and feature-rich operating system that can
provide many organizations with the tools they require from their computer and network
infrastructure. Some of the functions a Windows Server 2008 R2 system can provide include
centralized directory services, email services, file and print services, web services,
networking and VPN services, streaming media services, and many more. Of course, before
any new system, service, or application is deployed in an organization’s computer and
network infrastructure, the responsible parties should understand how to set up, optimize,
administer, and properly back up and recover data and functionality in the event of a
failure.
In many organizations, however, new services, servers, or applications are deployed
before a valid backup and recovery plan to support them has been created or tested. As a
result, some organizations are simply not prepared when a critical business system
unexpectedly fails or when disaster strikes. Lack of backup and recovery planning can
result in the unrecoverable loss of data, employees who are unable to perform their
jobs, and even the loss of revenue or customers.
To avoid this, information technology (IT) managers and
administrators who are responsible for the different aspects
of the computer and network infrastructure should create a
backup and disaster recovery plan.
This chapter provides IT decision makers and their techni-
cal staff with the information they require to start planning
and implementing viable backup strategies for a Windows
Server 2008 R2 infrastructure.
CHAPTER 30
Backing Up the Windows Server 2008 R2 Environment
Understanding Your Backup and Recovery Needs and Options
A key to creating a valuable backup and recovery plan is to have a clear understanding of
how the computer and network infrastructure is configured, as well as having an under-
standing of how the business operates and utilizes the infrastructure. This discovery
process involves mapping out both the computer and network systems in place and also
documenting and understanding the business processes that depend on the infrastructure.
For example, an organization might process incoming orders from field sales representa-
tives via fax transmissions of contracts that are accepted by a Windows Server 2008 R2 fax
service. If the fax service is not available, no orders are processed. This is just a simple
example of when downtime of a Windows server can directly affect business operations.
Understanding which systems and services are most important to the business helps IT
staff prioritize which systems will be recovered first in the event of a large-scale
disaster.
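One way to record the outcome of this discovery process is a simple inventory in which each service carries a business-impact rank, so the recovery order can be derived rather than debated during an outage. The sketch below is illustrative only; the `Service` class, the ranking scale, and the sample entries are assumptions, not part of any Windows tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    business_impact: int  # assumed scale: 1 = business stops, larger = less urgent


def recovery_order(services):
    """Return service names sorted by impact rank, then by name for stable ties."""
    ranked = sorted(services, key=lambda s: (s.business_impact, s.name))
    return [s.name for s in ranked]


if __name__ == "__main__":
    # Hypothetical inventory; the fax service mirrors the example in the text.
    inventory = [
        Service("Intranet web site", 3),
        Service("Fax service (order intake)", 1),
        Service("File and print services", 2),
    ]
    for position, name in enumerate(recovery_order(inventory), start=1):
        print(position, name)
```

The ranks themselves should come from the business owners; the code only makes the agreed order explicit and repeatable.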
Identifying the Different Services and Technologies
Each deployed role, role service, feature, or application provided by a Windows Server
2008 R2 system provides a key system function, which in many cases is critical to the
organization. Each application, role service, role, and feature installed on a Windows
Server 2008 R2 system should be identified and documented so the IT group can have a
clear view of the complexity of the environment as backup and recovery plans are being
developed. It is very common for server and web-based applications to require special
backup and restore procedures, and these are especially important to identify for disaster
recovery purposes.
Identifying Single Points of Failure
A single point of failure is a device, application, or service on a computer and networking
infrastructure that provides an exclusive function with no redundancy. A common single
point of failure in smaller organizations is a network switch that provides the connectivity
between all of the servers, client workstations, firewalls, wireless access points, and routers
on a network. Within a Windows Server 2008 R2 Active Directory infrastructure as an
example, Active Directory Domain Services (AD DS) inherently comes with its own set of
single points of failure, with its Flexible Single Master Operations (FSMO) roles. These
roles provide an exclusive function to the entire Active Directory forest or to a single
domain, and if the domain controller hosting a role fails, that role becomes
unavailable. Even though the FSMO roles are single points of failure,
recovering a domain controller can be very simple and painless if proper backup and
recovery planning is performed. For more information on FSMO roles, refer to Chapter 7,
“Active Directory Infrastructure.”
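The definition above suggests a simple check that can be run against an infrastructure inventory: any function provided by exactly one device or server is a single point of failure. The following is a minimal sketch under that assumption; the function name and the sample inventory (server and switch names included) are hypothetical.

```python
def single_points_of_failure(providers_by_function):
    """Return the functions that have exactly one distinct provider.

    providers_by_function maps a function name to the list of devices
    or servers that can provide it.
    """
    return sorted(
        function
        for function, providers in providers_by_function.items()
        if len(set(providers)) == 1
    )


if __name__ == "__main__":
    # Hypothetical inventory for a small office.
    inventory = {
        "Core network switch": ["SW1"],
        "DNS": ["DC1", "DC2"],
        "PDC emulator FSMO role": ["DC1"],
        "File services": ["FS1", "FS2"],
    }
    for function in single_points_of_failure(inventory):
        print("No redundancy:", function)
```

A FSMO role will always appear in such a list because only one domain controller holds it at a time, which is why the recovery plan for that domain controller matters more than adding a second provider.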
Evaluating Different Disaster Scenarios
Before a backup and disaster recovery plan can be formulated, IT managers and adminis-
trators should meet with the business owners to discuss and decide on which types of fail-
ures or disasters should be planned for. This section of the chapter provides a high-level
description of common disaster scenarios to consider. Of course, planning for every disas-
ter scenario is nearly impossible or, more commonly, will exceed an organization’s backup
and recovery budget, but discussing the likelihood of each scenario and evaluating how
the scenario can impact the business is necessary.
Physical Disaster
A physical disaster is anything that can keep employees or customers from reaching their
desired office or store location. Examples include natural disasters such as floods, fires,
earthquakes, hurricanes, or tornadoes that can destroy an office. A physical disaster can
also be a physical limitation, such as a damaged bridge or highway blockage caused by a
car accident. When only physical access is limited or restricted, a remote access solution
could reestablish connectivity between users and the corporate network. Refer to Chapter
24, “Server-to-Client Remote Access and DirectAccess,” for more information in this area.
Power Outage or Rolling Blackouts
Power outages can occur unexpectedly at any time. Some power outages are caused by bad
weather and other natural disasters, but other times they can be caused by high power
consumption that causes system overloads. When power systems are overloaded, rolling
blackouts may occur. A rolling blackout is when a power company shuts off power to
certain power subscribers or areas of service, so that it maintains power to critical services,
such as fire departments, police departments, hospitals, and traffic lights. The rolling part
of rolling blackouts is that the blackout is managed; after a predetermined amount of
time, the power company will shut down a different power grid and restore power to a
previously shut-down grid. Of course, during power outages, many businesses are unable to
function because the core of their work is conducted on computers or even telephone
systems that require power to function.
Network Outage
Organizations that share data and applications between multiple offices and require access
to the Internet as part of their daily business operations are susceptible to network outages
that can cause severe loss of employee productivity and possibly revenue. Network outages
can affect just a single computer, the entire office, or multiple offices depending on the
cause of the outage. IT staff must take network outages into consideration when creating
the backup and recovery plans.
Hardware Failures
Hardware failures seem to be the most common disaster encountered and, not coincidentally,
are the most common type of problem organizations plan for. Server hardware failures
include failed motherboards, processors, memory, network interface cards, network cables,
fiber cables, disk and HBA controllers, power supplies, and, of course, the hard disks in the
local server or in a storage area network (SAN). Each of these failures can be dealt with
differently, but to provide system- or server-level redundancy, key services should be
deployed in a redundant cluster configuration, such as the one provided by Windows Server
2008 R2 Enterprise Edition failover clustering or by Network Load Balancing (NLB).
Hard Drive Failure
Hard drives are indeed the most common type of computer- and network-related hardware
failure organizations have to deal with. Windows Server 2008 R2 supports hot-swappable
hard drives and two types of disks: basic disks, which provide backward
compatibility, and dynamic disks, which allow software-level disk arrays to be configured
without a separate hardware-based disk array controller. Also, both basic and dynamic
disks, when used as data disks, can be moved to other servers easily to provide data or disk
capacity elsewhere if a system hardware failure occurs and the data on these disks needs to
be made available as soon as possible. Windows Server 2008 R2 also contains tools to
provision, connect, and configure storage located on a SAN and can easily mount VHD
files as operating system disks using Disk Management or diskpart.
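As an illustration of the diskpart route, the sketch below builds the text of a small diskpart script that selects a VHD file and attaches it. The `select vdisk` and `attach vdisk` commands are diskpart commands; the helper function and the file path are hypothetical, and the script is only generated here, not executed.

```python
def build_attach_script(vhd_path: str, read_only: bool = False) -> str:
    """Return diskpart script text that selects and attaches a VHD file."""
    if not vhd_path:
        raise ValueError("vhd_path must not be empty")
    attach = "attach vdisk readonly" if read_only else "attach vdisk"
    lines = [f'select vdisk file="{vhd_path}"', attach]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical path; substitute the VHD that needs to be mounted.
    print(build_attach_script(r"D:\VHDs\data01.vhd", read_only=True), end="")
```

The generated text can be saved to a file and passed to diskpart with its /s option from an elevated command prompt. Attaching read-only is a cautious default when the goal is only to copy data off a disk image.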
NOTE
If hardware-level RAID is configured, the controller card stores the disk array
configuration, and the manufacturer should be contacted to provide the tools or
documentation necessary to back up, restore, rebuild, or re-create the configuration should
a controller failure occur or if the disks need to be moved to a different machine with
the same type of controller.
Software Corruption
Software corruption can occur at many different levels. Operating system files could be
corrupted, antivirus software can interfere with the writing of a file or database causing