Oracle Essentials Oracle Database 11g (51 page)

In Oracle releases prior to Version 7.3, the database wasn’t available until all uncommitted transactions rolled back. Although a DBA could control the checkpoint frequency and therefore control the time required for the rollforward phase of instance recovery, the number of uncommitted transactions at the time of the crash varied tremendously so the time needed for rollback could not really be accurately controlled or predicted. In a busy OLTP system, there are typically a fair number of in-flight transactions requiring rollback after a crash. This situation led to variable and unpredictable times for crash recovery.

The solution to this problem,
deferred rollback
, was introduced in Oracle 7.3. Oracle opens the database after the rollforward phase of recovery and performs the rollback of uncommitted transactions in the background. This process reduces database downtime and helps to reduce the variability of recovery times by deferring the rollback phase.

But what if a user’s transaction begins working in a database block that contains some changes left behind by an uncommitted transaction? If this happens, the user’s transaction will trigger a foreground rollback to undo the changes and will then proceed when rollback is complete. This action is transparent to the user—he doesn’t receive error messages or have to resubmit the transaction.

Oracle8
i
further optimized the deferred rollback process by limiting the rollback triggered by a user transaction to the block in which the transaction is interested. For example, suppose there is a large uncommitted transaction that affected 500 database blocks. Prior to Oracle8
i
, the first user transaction that touched one of those 500 blocks would trigger a foreground rollback and absorb the overhead of rolling back the entire transaction. Leveraging fast-start rollback, the user’s transaction would roll back only the changes to the block in which it was interested. New transactions would not have to wait for the complete rollback of large uncommitted transactions.

System Failure

|

261

Today, rollback management is made simpler by automated features in the database. For example, as of Oracle Database 10
g
, automatic undo retention tuning occurs controlling the amount of undo information held in rollbacks. The Redo Logfile Size Advisor determines the optimal smallest redo logfile size based on the FAST_START_MTTR_TARGET setting and database statistics gathered.

Protecting Against System Failure

There are a variety of approaches youcan take to help protect your system against the ill effects of system crashes and failures, including the following:

• Providing component redundancy

• Using Real Application Clusters

• Using Transparent Application Failover software services

Component Redundancy

As basic protection, the various hardware components that make up the database server itself must be fault-tolerant.
Fault-tolerance
, as the name implies, allows the overall hardware system to continue to operate even if one of its components fails.

This, in turn, implies redundant components and the ability to detect component failure and seamlessly integrate the failed component’s replacement. The major system components that should be fault-tolerant include the following:

• Disk drives

• Disk controllers

• CPUs

• Power supplies

• Cooling fans

• Network cards

• System buses

Disk failure is the largest area of exposure for hardware failure, since disks have the shortest mean times to failure of any of the components in a computer system. Disks also present the greatest variety of redundant solutions, so discussing that type of failure in detail should provide the best example of how high availability can be implemented with hardware.

Disk redundancy

Although the mean time to failure of an individual disk drive is very high, the ever-increasing number of disks used for today’s very large databases results in more frequent failures. Protection from disk failure is usually accomplished using RAID

262

|

Chapter 11: Oracle and High Availability

(Redundant Array of Inexpensive Disks) technology. The use of redundant storage has become common for systems of all sizes and types for two primary reasons: the real threat of disk failure and the proliferation of packaged, relatively affordable RAID solutions.

RAID technology uses one of two concepts to achieve redundancy:
Mirroring

The actual data is duplicated on another disk in the system.

Striping with parity

Data is striped on multiple disks, but instead of duplicating the data itself for redundancy, a mathematical calculation termed
parity
is performed on the data and the result is stored on another disk. You can think of parity as the sum of the striped data. If one of the disks is lost, youcan reconstruct the data on that disk using the surviving disks and the parity data. The lost data represents the only unknown variable in the equation and can be derived. You can conceptual-ize this as a simple formula:

A + B + C + D = E

in which A–D are data striped across four disks and E is the parity data on a fifth disk. If youlose any of the disks, youcan solve the equation to identify the missing component. For example, if youlose the B drive youcan solve the formula as:

B = E – A – C – D

There are a number of different disk configurations or types of RAID technology, which are formally termed
levels
. The basics of RAID technology were introduced in

Chapter 7,
but
Table 11-2
summarizes the most relevant levels of RAID in a bit more detail, in terms of their cost, high availability, and the way Oracle uses each RAID

level.

Which RAID Levels Should You Use with Oracle?

Some people say that you should never use RAID-5 for an Oracle database because of the degraded write performance of this level of RAID. RAID-1 and RAID-0+1 offer better performance, but at double the cost of disk storage. RAID-5 offers a cheaper and reasonable solution, provided that you can meet performance requirements despite the extra write overhead for maintaining parity data. Use these generic guidelines to help determine the appropriate uses of different RAID levels:

• Use RAID-1 for redo log files.

• Use RAID-5 for database files, provided that the write overhead is acceptable and adequate I/O is available.

• Use RAID-1 or RAID-0+1 for database files if RAID-5 write overhead is unacceptable.

Protecting Against System Failure

|

263

Table 11-2. RAID levels relevant to high availability

Level

Disk configuration

Cost

Comments

Oracle usage

0

Simple striping, no

Same cost as unpro-

The term RAID-0 is used

Striping simplifies administra-

redundancy

tected storage.

to describe striping,

tion for Oracle datafiles. Suitable

which increases read

for all types of data for which

and write throughput.

redundancy isn’t required.

However, this is not

really RAID, as there is

no actual redundancy.

1

Mirroring

Twice the cost of

Same write perfor-

Lack of striping adds complexity

unprotected storage.

mance as a single disk.

of managing a larger number of

Read performance may

devices for Oracle. Often used for

improve through servic-

redo logs, since the I/O for redo

ing reads from both

is typically relatively small

copies.

sequential writes. Striped arrays

are more suited to large I/Os or

to multiple smaller, random I/Os.

0+1

Striping and mir-

Twice the cost of

Best of both worlds—

Same usage as RAID-0, but

roring

unprotected storage.

striping increases read

provides protection from disk

and write performance

failure.

and mirroring for

redundancy avoids

“read-modify-write”

overhead of RAID-5.

5

Striping with rotat-

Storage capacity is

Parity data is spread

Cost-effective solution for all Ora-

ing or distributed

reduced by 1/
N
,

across all disks, avoiding

cle data except redo logs.

parity

where
N
is the num-

the potential bottleneck

Degraded write performance

ber of disks in the

found in some other

must be taken into account. Pop-

array. For example,

types of RAID arrays.

ular for reads where adequate I/O

the storage is

Striping increases read

is provided. Write penalties may

reduced by 20%, or

performance. Maintain-

slow loads and index builds.

1/5 of the total disk

ing parity data adds

Often avoided for high-volume

storage, for a 5-disk

additional I/O, decreas-

OLTP due to write penalties.

array.

ing write performance.

Some storage vendors, such as

For each write, the asso-

EMC, have proprietary solutions

ciated parity data must

(RAID-S) to minimize parity over-

be read, modified, and

head on writes.

written back to disk. This

is referred to as the

“read-modify-write”

penalty.

264

|

Chapter 11: Oracle and High Availability

Figure 11-3 illustrates the disk configurations for various RAID levels.

RAID-O: Simple Striping, No Redundancy

DATA

DATA

DATA

DATA

RAID-1: Simple Mirroring

DATA

DATA

RAID-0+1: Striping and Mirroring

64 KB

64 KB

64 KB

64 KB

64 KB

64 KB

64 KB

64 KB

RAID-5: Striping with Distributed Parity

A

B

C

D

PARITY

PARITY

A

B

C

D

D

PARITY

A

B

C

C

D

PARITY

A

B

Figure 11-3. RAID levels commonly used with an Oracle database
Automatic Storage Management

Oracle Database 10
g
and more recent database releases include Automatic Storage Management (ASM). We introduced ASM in
Chapter 5
and described its manageability considerations. ASM enables youto create a pool of storage on disk groups and then manages the placement of database files on the storage. ASM features enable it to replace non-Oracle file systems and logical volume managers for files managed by the Oracle database. An ASM instance manages each of the disks in the disk group, and one ASM instance is provided for each database node in a RAC

environment.

ASM provides “Striping and Mirroring Everything” (SAME) for many types of disks, including “Just a Bunch of Disks” (JBOD) arrays. You can specify groups of disks, and designate a failure group to be used in the result of a disk failure. Mirroring can also be set up on a per-file basis, and you can specify one or two mirrors. ASM

includes the ability to detect disk “hot spots” and redistribute data to avoid disk bottlenecks, as well as the capability of adding disks to a disk group without any interruption in service. DBAs add the disks to disk groups or remove disks from disk groups using Oracle Enterprise Manager.

Protecting Against System Failure

|

265

Stored data is automatically rebalanced when disks are added or removed. When a drive fails, remirroring to remaining drives is automatic. These features make ASM

ideal for managing a database storage grid and allow youto use cheaper disk systems while obtaining higher levels of availability and performance.

Oracle Database 11
g
introduced a fast mirror resynchronization capability enabling faster recovery from transient failures. ASM can now be set to only resynchronize changed ASM disk extents for limited duration failures.

Simple Hardware Failover

Oracle recovers automatically from a system crash. The automatic recovery protects data integrity, critical in a relational database, but also results in downtime as the database recovers from a crash. When a hardware failure occurs, the ability to quickly detect a system crash and initiate recovery is crucial to minimizing the associated downtime.

When an individual server fails, the instance running on that node fails as well.

Depending on the cause, the failed node may not return to service quickly or be noticed immediately. Either way, companies that wish to protect their systems from the failure of a node typically employ a cluster of machines to achieve simple hardware
failover
. Failover is the ability of a surviving node in a cluster to assume the responsibilities of a failed node. Although failover doesn’t directly address the issue of the reliability of the underlying hardware, automated failover can reduce the downtime from hardware failure.

The concept is very simple: a combination of software and hardware “watches” over the cluster. Typically, this monitoring is done by regularly checking a
heartbeat
, which is a message sent between machines in the cluster. If Machine A fails, Machine B will detect the failure through the loss of the heartbeat and will execute scripts to take over control of the disks, assume Machine A’s network address, and restart the processes that failed with Machine A. From an Oracle database perspective, the entire set of events is identical to an instance crash followed by an instance recovery.

Other books

Tangled Passion by Stanley Ejingiri
Tasting Notes by Cate Ashwood
Shatnerquake by Burk, Jeff
Zoo II by James Patterson
The Sea of Adventure by Enid Blyton
Fool Me Twice by Brandman, Michael
Such Is My Beloved by Morley Callaghan
Katie and the Mustang #1 by Kathleen Duey