Pro Oracle Database 11g Administration (48 page)

Read Pro Oracle Database 11g Administration Online

Authors: Darl Kuhn

Tags: #Oracle DBA

LOBs provide a way to manage very large files. Oracle has another feature, partitioning, which allows you to manage very large tables and indexes. Partitioning is covered in detail in the next chapter.

268

CHAPTER 11 ■ PARTITIONING: DIVIDE AND CONQUER

Term Meaning

Partition pruning

Elimination of unnecessary partitions. Oracle detects which partitions need to be accessed by a SQL statement and removes (prunes) any partitions that aren’t needed.

Partition-wise join

Join executed in partition-sized pieces to improve performance by executing many smaller tasks in parallel rather than one, large task in sequence.

Local partitioned index

Index that uses the same partition key as its table.

Global partitioned index

Index that doesn’t use the same partition key as its table.

Global nonpartitioned

Regular index created on a partitioned table. The index itself isn’t

indexed

partitioned.

If you work with mainly small online transaction processing (OLTP) databases, you probably don’t need to create partitioned tables and indexes. However, if you work with large OLTP databases or in data-warehouse environments, you can most likely benefit from partitioning. Partitioning is a key to designing and building scalable and highly available database systems.

What Tables Should Be Partitioned?

Following are some rules of thumb for determining whether to partition a table. In general, you should consider partitioning for

• Tables that are over 2GB in size.

• Tables that have more than 10 million rows, when SQL operations are getting slower as more data is added.

• Tables you know will grow large. It’s better to create a table as partitioned rather than rebuild it as partitioned after performance begins to suffer as the table grows.

• Tables that have rows that can be divided in a way that facilitates parallel operations like loading, retrieval, or backup and recovery.

• Tables for which you want to archive the oldest partition on a periodic basis, and tables from which you want to drop the oldest partition regularly as data becomes stale.

One rule is that any table over 2GB in size is a potential candidate for partitioning. Run this query to show the top space-consuming objects in your database:

select * from (

select

owner

,segment_name

,segment_type

,partition_name

,sum(extents) num_ext

270

CHAPTER 11 ■ PARTITIONING: DIVIDE AND CONQUER

,sum(bytes)/1024/1024 meg_tot

from dba_segments

group by owner, segment_name, segment_type, partition_name

order by sum(extents) desc)

where rownum <= 10;

Here’s a snippet of the output from the query:

OWNER SEGMENT_NAME SEGMENT_TYPE PARTITION_NAME NUM_EXT MEG_TOT

------- -------------------- ---------------- -------------- -------- --------

REP_MV REG_QUEUE_REP_ARCH TABLE 29,218 29,218

REP_MV REG_QUEUE_REP TABLE 15,257 15,257

STAR2 F_INSTALLATIONS TABLE PARTITION INST_P_6 4,092 4,092

This output shows that a few large tables in this database may benefit from partitioning. For this database, if there are performance issues with these large objects, then partitioning may help.

If you’re running the previous query from SQL*Plus, you need to apply some formatting to the columns to reasonably display the output within the limited width of your terminal: set lines 132

col owner form a10

col segment_name form a20

col partition_name form a15

In addition to looking at the size of objects, if you can divide your data so that it facilitates operations such as loading data, querying, backups, archiving, and deleting, you should consider using partitioning. For example, if you work with a large table that contains data that is often accessed by a particular time range—such as by day, week, month, or year—it makes sense to consider partitioning.

A large table size combined with a good business reason means you should consider partitioning.

Keep in mind that there is more setup work and maintenance when you partition a table. However, as mentioned earlier, it’s much easier to partition a table during setup than it is to convert it after it’s grown to an unwieldy size.

■
Note
Partitioning is an extra-cost option that is available only with the Oracle Enterprise Edition. You have to decide based on your business requirements whether partitioning is worth the cost.

Creating Partitioned Tables

Oracle provides a robust set of methods for dividing tables and indexes into smaller subsets. For example, you can divide a table’s data by date ranges, such as by month or year. Table 12–2 gives an overview of the partitioning strategies available.

271

CHAPTER 11 ■ PARTITIONING: DIVIDE AND CONQUER

Table 12–2.
Partitioning Strategies

Partition Type

Description

Range

Allows partitioning based on ranges of dates, numbers, or characters.

List

Useful when the partitions fit nicely into a list of values, like state or region codes.

Hash

Allows even distribution of rows when there is no obvious partitioning key.

Composite

Allows combinations of other partitioning strategies.

Interval

Extends range partitioning by automatically allocating new partitions when new partition key values exceed the existing high range.

Reference

Useful for partitioning a child table based on a parent table column.

Virtual

Allows partitioning on a virtual column.

System

Allows the application inserting the data to determine which partition should be used.

The following subsections show examples of each partitioning strategy. In addition, you learn how to place partitions into separate tablespaces; to take advantage of all the benefits of partitioning, you need to understand how to assign a partition to its own tablespace.

Partitioning by Range

Range partitioning is frequently used. This strategy instructs Oracle to place rows in partitions based on ranges of values such as dates or numbers. As data is inserted into a range-partitioned table, Oracle determines which partition to place a row in based on the lower and upper bound of each range partition.

The range-based partition key is defined by the PARTITION BY RANGE clause in the CREATE TABLE

statement. This determines which column is used to determine which partition a row belongs in. You’ll see some examples shortly.

Each range partition requires a VALUES LESS THAN clause that identifies the non-inclusive value of the upper bound of the range. The first partition defined for a range has no lower bound. Any value less than the first partition's VALUES LESS THAN clause are inserted into the first partition. For partitions other than the first partition, the lower bound of a range is determined by the upper bound of the previous partition.

Optionally, you can create a range-partitioned table’s highest partition with the MAXVALUE clause.

Any row inserted that doesn’t have a partition key that falls in any lower ranges is inserted into this topmost MAXVALUE partition.

272

CHAPTER 11 ■ PARTITIONING: DIVIDE AND CONQUER

Using a NUMBER for the Partition Key Column

Let’s look at an example to illustrate the previous concepts. Suppose you’re working in a data-warehouse environment where you typically have a fact table that stores information about an event (such as registrations, sales, downloads, and so forth). In this scenario the fact table contains a number column that represents a date. For example, the value 20110101 represents January 1, 2011. You want to partition the fact table based on this number column. This SQL statement creates a table with three partitions based on a range of numbers:

create table f_regs

(reg_count number

,d_date_id number

)

partition by range (d_date_id)(

partition p_2010 values less than (20110101),

partition p_2011 values less than (20120101),

partition p_max values less than (maxvalue)

);

When creating a range-partitioned table, you don’t have to specify a MAXVALUE partition. However, if you don’t specify a partition with the MAXVALUE clause, and you attempt to insert a row that doesn’t fall in any other defined ranges, you receive an error such as

ORA-14400: inserted partition key does not map to any partition

When you see that error, you have to add a partition that accommodates the partition-key value being inserted or add a partition with the MAXVALUE clause.

■
Tip
If you’re using Oracle Database 11g or higher, consider using an interval partitioning strategy in which partitions are automatically added by Oracle when the high range value is exceeded. This topic is covered a bit later in the section “Creating Partitions on Demand.”

You can view information about the partitioned table you just created by running the following query:

select

table_name

,partitioning_type

,def_tablespace_name

from user_part_tables

where table_name='F_REGS';

Here’s a snippet of the output:

TABLE_NAME PARTITION DEF_TABLESPACE_NAME

-------------------- --------- -------------------------

F_REGS RANGE USERS

273

CHAPTER 11 ■ PARTITIONING: DIVIDE AND CONQUER

EOF

if [ $? -ne 0 ]; then

mailx -s "Partition range issue: f_regs" [email protected] <

check f_regs high range.

EOF

else

echo "f_regs ok"

exit 0

Ensure that you don’t inadvertently add data to a production table with a script like this. You have to modify this script carefully to match your table and high-range partition-key column.

Using a TIMESTAMP for the Partition Key Column

You may have noticed that the example in the previous section created the column D_DATE_ID as a NUMBER data type instead of a DATE data type for the F_REGS table. One technique sometimes employed in data-warehouse environments uses an intelligent surrogate key for the primary key of the D_DATES

dimension. It’s intelligent because the key number represents a date. This lets you partition the fact table (F_REGS) on a range of numbers that always represent a date. Sometimes data-warehouse architects find this type of partitioning easier to work with than DATE- or TIMESTAMP-based fields.

To illustrate the previous point, the following example creates the F_REGS table with a TIMESTAMP

data type for the D_DATE_DTT column:

create table f_regs

(reg_count number

,d_date_dtt timestamp

)

partition by range (d_date_dtt)(

partition p_2010 values less than (to_date('01-jan-2011','dd-mon-yyyy')), partition p_2011 values less than (to_date('01-jan-2012','dd-mon-yyyy')), partition p_max values less than (maxvalue)

);

As shown in this code, I recommend that you use the TO_DATE function with a format mask to be sure there is no ambiguity about how the date should be interpreted. This technique is every bit as valid as using a NUMBER field for the partition key. Just keep in mind that whoever designs the data-warehouse tables may have a strong opinion about which technique to use.

One slight variation on this example leaves out the TO_DATE function. You must then ensure that the date string you use matches a date format recognized by Oracle. For example: create table f_regs

(reg_count number

,d_date_dtt timestamp

)

partition by range (d_date_dtt)(

partition p_2010 values less than ('01-jan-11'),

partition p_2011 values less than ('01-jan-12'),

partition p_max values less than (maxvalue)

);

275