
Shortly after the initial release of Oracle Database 11g, Oracle exposed additional storage management capabilities in the database. These features are especially useful in configuring storage for Oracle's Information Appliances, described in Chapter 8.

How Oracle I/O and Striped Arrays Interact

In almost all large databases, disk striping increases disk I/O rates without adding too heavy an administrative burden for managing a large number of datafiles across many individual disks. The disks may be organized into RAID arrays using a volume manager on the database server, a dedicated I/O subsystem, or a combination of both.

If you are using an Oracle release without ASM, when you set up striped disk arrays, you can set the chunk size used to stripe across the disks. The chunk size is the amount of data written to one disk before moving to the next disk in the array.

Understanding the interaction between different stripe chunk sizes and the two sizes of Oracle I/O is critical in maximizing your I/O performance.

Consider an Oracle database with an 8 KB data block size and the DB_FILE_MULTIBLOCK_READ_COUNT initialization parameter set to 32. There will be two sizes of I/O by Oracle: a single 8 KB data block and a 256 KB multiblock read (32 times 8 KB). Suppose you configure a four-disk array for use by Oracle with a chunk size of 64 KB so that the 256 KB of data will be spread across the four drives, with 64 KB on each.
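
As a quick check of these two I/O sizes on your own instance, you can read the relevant initialization parameters from V$PARAMETER; this is a minimal sketch, and the 8 KB and 32 values in the comment simply restate the example above.

    -- Read the block size and multiblock read count for the instance.
    SELECT name, value
      FROM v$parameter
     WHERE name IN ('db_block_size', 'db_file_multiblock_read_count');

    -- With db_block_size = 8192 (8 KB) and db_file_multiblock_read_count = 32,
    -- the largest multiblock read is 8 KB * 32 = 256 KB, as in the example above.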


Each 8 KB I/O will hit one spindle, as the 8 KB will lie within one 64 KB chunk.*

Striping can increase performance for small I/Os by maximizing concurrency: each disk can service a different I/O. The multiblock I/Os of 256 KB may hit all four disks.

If the chunk size were 256 KB instead of 64 KB, on average each 256 KB I/O call would hit only one disk; with the larger chunk size, the multiblock read requires fewer I/O calls to the disks. In either case, a single disk will clearly satisfy single-data-block I/O calls. Striping can increase I/O rates for large reads by driving multiple disks with a single I/O call, as illustrated with a 64 KB chunk size and a 256 KB multiblock I/O.

Figure 7-4 illustrates the interaction of different-sized Oracle I/Os with arrays striped using different chunk sizes.

Figure 7-4. Oracle I/O and chunk size interaction (with a 64 KB chunk size, one Oracle 256-KB I/O call drives all four disks; with a 256 KB chunk size, it hits one disk; each Oracle 8-KB I/O call hits one disk)

Oracle and Parallelism

The ability to parallelize operations is one of the most important features of the Very Large Database (VLDB). Database servers with multiple CPUs, called symmetric multiprocessing (SMP) machines, are the norm today, and the ability to perform operations in parallel also works well with multicore CPU chips. As performance demands increase and data volumes continue to grow, you will increasingly need to use multiple processors, cores, and disks to reduce the time needed to complete a given task.

* It’s difficult to say exactly what will occur due to the alignment of the stripe-chunk boundaries with Oracle data blocks, but to illustrate the single versus multiple disk point, let’s assume the simple case—they line up!

For a more detailed discussion of striping issues, see the document “Configuring Oracle Server for VLDB,” by Cary Millsap, formerly of Oracle Corporation and now with Hotsos (see Appendix B). Anyone who is tempted is welcome to perform detailed testing for all the permutations of stripe chunk size and Oracle I/O. If you happen to perform this extensive testing, please tell all the rest of us what you find!


Oracle supports parallelism within a single SMP server and parallelism across multiple nodes using Oracle Real Application Clusters. Executing a SQL statement in parallel consumes more of the machine's resources (CPU, memory, and disk I/O) but completes the overall task faster.

Parallelism affects the amount of memory and CPU resources used to execute a given task in a fairly linear fashion—the more parallel processes used, the more resources consumed for the composite task. Each parallel execution process has a Program Global Area (PGA) that consumes memory and performs work. Each parallel execution process takes its own slice of CPU, but more parallel processes can reduce the total amount of time spent on disk I/O, which is the place in which bottlenecks can most readily appear.
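
If you want to see the memory side of this trade-off, instance-wide PGA consumption is visible in the V$PGASTAT view; this is a minimal sketch, and the statistic names in the WHERE clause are the commonly documented ones.

    -- Instance-wide PGA memory usage; each PE process contributes to these totals.
    SELECT name, value
      FROM v$pgastat
     WHERE name IN ('aggregate PGA target parameter',
                    'total PGA allocated',
                    'total PGA inuse');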

Two types of parallelism are possible within an Oracle database:

Block-range parallelism
    Driven by ranges of database blocks

Partition-based parallelism
    Driven by the number of partitions or subpartitions involved in the operation

The following sections describe these types of parallelism.

Block-Range Parallelism

In 1994, Oracle 7.1 introduced the ability to dynamically parallelize table scans and a variety of scan-based functions. This parallelism was based on the notion of block ranges, in which the Oracle server would understand that each table contained a set of data blocks that spanned a defined range of data. Oracle7 implemented block-range parallelism by dynamically breaking a table into pieces, each of which was a range of blocks, and then used multiple processes to work on these pieces in parallel. Oracle's implementation of block-range parallelism was unique in that it didn't require physically partitioned tables, described in Chapter 4, to achieve parallelism.

With block-range parallelism, the client session that issued the SQL statement transparently becomes the parallel execution coordinator, dynamically determining block ranges and assigning them to a set of parallel execution (PE) processes. Once a PE process has completed an assigned block range, it returns to the coordinator for more work. Not all I/O occurs at the same rate, so some PE processes may process more blocks than others. This notion of “stealing work” allows all processes to participate fully in the task, providing maximum leverage of the machine resources.
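
You can observe this coordinator-and-PE arrangement while a parallel statement runs; the sketch below assumes a hypothetical SALES table and the usual columns of the V$PX_SESSION view.

    -- Run a parallel scan in one session (SALES is a hypothetical table).
    SELECT /*+ PARALLEL(s, 4) */ COUNT(*) FROM sales s;

    -- From another session, list the PE processes working for each
    -- query coordinator (QCSID is the coordinator's session ID).
    SELECT qcsid, sid, server_group, server_set, degree
      FROM v$px_session
     ORDER BY qcsid, server_group, server_set;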

Block-range parallelism scales linearly based on the number of PE processes, provided you have adequate hardware resources. The key to achieving scalability with parallelism lies in hardware basics. Each PE process runs on a CPU and requests I/O to a device. If you have enough CPUs reading enough disks, parallelism will scale. If the system encounters a bottleneck on one of these resources, scalability will suffer.


For example, four CPUs reading two disks will not scale much beyond the two-way scalability of the disks and may even sink below this level if the additional CPUs cause contention for the disks. Similarly, 2 CPUs reading 20 disks will not scale to a 20-fold performance improvement. The system hardware must be balanced for parallelism to scale.

Most large systems have far more disks than CPUs. In these systems, parallelism results in a randomization of I/O across the I/O subsystem. This is useful for concurrent access to data as PE processes for different users read from different disks at different times, resulting in I/O that is distributed across the available disks.

A useful analogy for dynamic parallelism is eating a pie. The pie is the set of blocks to be read for the operation, and the goal is to eat the pie as quickly as possible using a certain number of people. Oracle serves the pie in helpings, and when a person finishes his first helping, he can come back for more. Not everyone eats at the same rate, so some people will consume more pie than others. While this approach in the real world is somewhat unfair, it’s a good model for parallelism because if everyone is eating all the time, the pie will be consumed more quickly. The alternative is to give each person an equal serving and wait for the slower eaters to finish.

Figure 7-5 illustrates the splitting of a set of blocks into ranges.

Figure 7-5. Dynamic block-range parallelism (one pie served in helpings to PE processes; each section is a range of blocks; result: everyone eats all the time)

Parallelism for Tables and Partitions of Tables

With partitioned tables, introduced in Oracle8, an operation may involve one, some, or all of the partitions of a partitioned table. There is essentially no difference in how block-range parallelism dynamically splits the set of blocks to be read for a regular table as opposed to a partitioned table. Once the optimizer has determined which partitions should be accessed for the operation, all the blocks of all partitions involved are treated as a pool to be broken into ranges.

This behavior leads to a key consideration for using parallelism with partitioned tables. The degree of parallelism (i.e., the number of parallel execution processes used for the table as a whole) is applied to the set of partitions that will be used for an operation. The optimizer will eliminate the use of partitions that do not contain data the operation will use. For instance, if one of the partitions for a table contains ID numbers below 1,000, and a query requests ID numbers between 1,100 and 5,000, the optimizer understands that the query will not access this partition.
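
Here is a small sketch of that pruning in SQL, using a hypothetical ORDERS table range-partitioned on its ID column; the table, column, and partition names are illustrative rather than taken from the text.

    -- Hypothetical range-partitioned table; p_low holds IDs below 1,000.
    CREATE TABLE orders (
        order_id    NUMBER,
        order_date  DATE,
        amount      NUMBER
    )
    PARTITION BY RANGE (order_id) (
        PARTITION p_low  VALUES LESS THAN (1000),
        PARTITION p_mid  VALUES LESS THAN (10000),
        PARTITION p_high VALUES LESS THAN (MAXVALUE)
    )
    PARALLEL 4;

    -- The optimizer prunes p_low and p_high; only p_mid is scanned, and the
    -- degree of parallelism (4) is applied to the blocks of that partition.
    SELECT /*+ PARALLEL(o, 4) */ COUNT(*)
      FROM orders o
     WHERE order_id BETWEEN 1100 AND 5000;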

Since Oracle9i, you can also partition tables based on a list of specific values, although this type of partitioning is typically used to partition tables for maintenance operations. As explained in Chapter 4, Oracle has continued to add more choices in ways to implement partitioning.

If you expect that your queries will use partition elimination or pruning and you plan on using parallelism, you should stripe each partition over a sufficient number of drives to scale effectively. This will ensure scalability regardless of the number of partitions accessed. This striping can be achieved manually through the use of multiple datafiles on multiple disks, through the use of striped arrays, or through a combination of both approaches.
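
For the manual approach, one way to spread a partition's data over several drives is to build its tablespace from multiple datafiles, one per drive; the datafile paths below are hypothetical mount points, and the table is the illustrative ORDERS table from the earlier sketch.

    -- Tablespace spread over four datafiles on four separate drives
    -- (hypothetical paths); a partition stored here spans all four files.
    CREATE TABLESPACE orders_mid_ts
        DATAFILE '/u01/oradata/orcl/orders_mid01.dbf' SIZE 4G,
                 '/u02/oradata/orcl/orders_mid02.dbf' SIZE 4G,
                 '/u03/oradata/orcl/orders_mid03.dbf' SIZE 4G,
                 '/u04/oradata/orcl/orders_mid04.dbf' SIZE 4G;

    -- Assign the partition to that tablespace.
    ALTER TABLE orders MOVE PARTITION p_mid TABLESPACE orders_mid_ts;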

What Can Be Parallelized?

Oracle can parallelize far more than simple queries. The list of operations that can be parallelized using block-range parallelism includes the following:

• Tablespace creation

• Index creation and rebuilds

• Online index reorganizations and rebuilds

• Index-organized table reorganizations and movements

• Table creation, such as summary creation using CREATE TABLE...AS SELECT

• Partition-maintenance operations, such as moving and splitting partitions

• Data loading

• Imposing integrity constraints

• Statistics gathering (automatically gathered since Oracle Database 10g)

• Backups and restores (including very large files in Oracle Database 11g)

• DML operations (INSERT, UPDATE, DELETE)

• Query processing operations

• OLAP aggregation (as of Oracle Database 10g)
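
As an illustration of a few of the operations in this list, many of them accept an explicit PARALLEL clause directly in their DDL; the object names below are hypothetical.

    -- Parallel index creation and a parallel rebuild.
    CREATE INDEX orders_date_ix ON orders (order_date) PARALLEL 8;
    ALTER INDEX orders_date_ix REBUILD PARALLEL 8;

    -- Parallel summary-table creation with CREATE TABLE...AS SELECT.
    CREATE TABLE order_totals PARALLEL 8 AS
        SELECT order_date, SUM(amount) AS total_amount
          FROM orders
         GROUP BY order_date;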


Oracle can also provide the benefits of parallelism to individual processing steps for queries. The specific features of query processing that may be parallelized include:

• Table scans

• Nested loops

• Sort merge joins

• Hash joins

• Bitmap star joins

• Index scans

• Partition-wise joins

• Anti-joins (NOT IN)

• SELECT DISTINCT

• UNION and UNION ALL

• ORDER BY

• GROUP BY

• Aggregations

• Import

• User-defined functions
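
To see which of these steps Oracle actually parallelizes for a particular statement, you can inspect its execution plan; this sketch reuses the hypothetical ORDERS table, and the PX row sources mentioned in the comment are the ones Oracle normally reports for parallel plans.

    -- Generate and display the execution plan for a parallel aggregation.
    EXPLAIN PLAN FOR
        SELECT /*+ PARALLEL(o, 4) */ order_date, SUM(amount)
          FROM orders o
         GROUP BY order_date;

    -- Parallel steps appear as PX COORDINATOR, PX SEND, PX RECEIVE,
    -- and PX BLOCK ITERATOR row sources in the plan output.
    SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);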

Degree of parallelism

An Oracle instance has a pool of parallel execution (PE) processes that are available to the database users. Controlling the number of active PE processes was an important task in older Oracle database releases; too many PE processes would overload the machine, leading to resource bottlenecks and performance degradation. A high degree of parallelism will also force full-table scans, which may or may not be appropriate.
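
The degree of parallelism can be declared on the object or requested per statement with a hint; a minimal sketch, again using the hypothetical ORDERS table.

    -- Set a default degree of parallelism on the table...
    ALTER TABLE orders PARALLEL 8;

    -- ...or turn it off again and rely on statement-level hints instead.
    ALTER TABLE orders NOPARALLEL;

    SELECT /*+ PARALLEL(o, 8) */ COUNT(*) FROM orders o;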

Figure 7-6 illustrates transparent parallelism within and between sets of PE processes.

Figure 7-6. Intra-operation and inter-operation parallelism (the coordinator allocates PE processes and divides the task into subtasks; each set of PE processes performs a different task, such as sorting or joining; results are pipelined from one set of PE processes to the next)

Determining the optimal degree of parallelism in the presence of multiple users and varying workloads proved challenging. For example, a degree of 8 for a query would provide excellent performance for 1 or 2 users, but what if 20 users queried the same table? This scenario called for 160 PE processes (8 PEs for each of the 20 users), which could overload the machine.
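
Oracle bounds and adapts the PE pool through initialization parameters; the sketch below shows two parameters relevant to this 20-user scenario, with illustrative values rather than recommendations.

    -- Cap the total PE pool well below the 160 processes the naive
    -- arithmetic (20 users x degree 8) would demand.
    ALTER SYSTEM SET parallel_max_servers = 64;

    -- Let Oracle reduce each statement's degree automatically as the
    -- number of concurrent parallel users grows.
    ALTER SYSTEM SET parallel_adaptive_multi_user = TRUE;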
