Read Oracle Essentials Oracle Database 11g Online
Authors: Rick Greenwald
When a user wants to write data back to a distributed database, the issue becomes a bit more complicated. As we’ve mentioned before, a transaction is an atomic logical unit of work that typically contains one or more SQL statements. These statements write data to a database and must either be committed or rolled back as a unit. Distributed transactions can take place across multiple database servers. When distributed transactions are committed via the SQL COMMIT statement, Oracle uses a two-phase commit protocol to ensure transaction integrity and consistency across multiple sys-
tems. This protocol is further described in this chapter’s “Two-Phase Commits” section.
Access to and from Non-Oracle Databases
Oracle’s Transparent Gateways (illustrated in
Figure 13-1)
are Oracle software products that provide users with access to non-Oracle databases via Oracle SQL. Oracle SQL is automatically translated into the SQL of the target database, allowing applications developed for Oracle to be used against non-Oracle databases. You can also use native SQL syntax for the target database, which can be sent directly to the target without translation. Oracle datatypes such as NUMBER, CHAR, and DATE
306
|
Chapter 13: Oracle Distributed Databases and Distributed Data
are converted into the datatypes of the target. Oracle data dictionary views are provided for target data store objects. As with Oracle databases, heterogeneous databases can be linked to Oracle through database links to create a distributed database.
The gateways can be deployed in a two-tier architecture in the Oracle database or in a middle tier (Oracle Application Server).
Results
Oracle SQL
Other Database SQL
Client
Oracle Database
Other Database
with Transparent
Server
Gateway
Figure 13-1. Typical configuration and use of Transparent Gateways
There are four basic types of database connectivity provided:
Open Database Connectivity
Generic ODBC and OLE DB interfaces are free and are bundled with the Oracle database. Open Systems Gateways provide access to Informix, Microsoft SQL
Server, Sybase, and other databases on Unix and Windows platforms. These interfaces and gateways leverage Heterogeneous Services included in the Oracle database, which determine optimal SQL strategies for the remote site. Additionally, starting with Oracle Database 10
g
, Oracle’s OLAP Option provides OLE
DB for OLAP (ODBO) enabling access from a variety of analysis tools.
Transparent Gateways
Transparent Gateways exist for dozens of non-Oracle data stores. Mainframe Integration Gateways provide access to DB2 on mainframes. Enterprise Integration Gateways provide access to IBM AS/400 and via IBM Distributed Relational Database Architecture (DRDA) connections. Finally, Oracle offers the EDA/SQL
Gateways for a number of other sources. Transparent Gateway performance improved in Oracle8 by moving Heterogeneous Services from the Transparent Gateway layer into the database kernel. Performance was further improved in the Oracle8
i
release with the introduction of multithreading for these services, in Oracle9
i
with multithreaded agent support, and in Oracle Database 11
g
with parallel data retrieval from non-Oracle databases. Oracle Database 10
g
added support for remote functions in non-Oracle databases embedded in SELECT
statements. Oracle Database 11
g
adds new Gateways to connect to Adabas, IMS, and VSAM data stores.
Procedural Gateways
Procedural Gateways implement remote procedure calls (RPCs) to applications built on non-Oracle data stores. The Gateway for APPC, the standard IBM protocol for RPCs, is used when Oracle applications need procedural access to
Accessing Multiple Databases As a Single Entity
|
307
applications built on CICS, DB2, IMS, VSAM, and other data stores on the mainframe and applications that use SNA LU6.2 to communicate to the mainframe. The Oracle Procedural Gateway for IBM MQSeries allows Oracle-based applications to exchange messages with applications that communicate via MQSeries message queues. Both are included with the Oracle Enterprise Integration Gateways.
Access Manager
An access manager provides access to Oracle from non-Oracle based applications. The Oracle Access Manager for AS/400 resides on the AS/400 and provides AS/400 applications written in RPG, C, or COBOL access to Oracle running on any platform. You can access Oracle from these applications through ANSI-standard SQL or through Oracle DML or DDL. Because PL/SQL is also supported, AS/400 applications can call Oracle stored procedures. TCP/IP and LU6.2 are supported for connectivity (via Oracle Net). The Oracle Access Manager for AS/400 is included with the Oracle Enterprise Integration Gateways.
Two-Phase Commits
One of the biggest issues associated with the use of distributed databases is the difficulty of guaranteeing the same level of data integrity for updates to distributed databases. Because a transaction that writes data to multiple databases must depend on a network for the transmission of information, it is inherently more susceptible to lost information than a single Oracle instance on a single machine. And since a transaction must guarantee that all writes occur, this increased instability could adversely affect data integrity.
The standard solution for this problem is to use two message-passing phases as part of a transaction commit; hence, the protocol used is referred to as a
two-phase commit
. The main database first polls each of the participants to determine if they are ready; if they are, the transactional updates are tentatively sent to them. In the second phase, if all the participants are in agreement that the messages have properly been received, the changes are committed. If any of the nodes involved in the transaction cannot verify receipt of the changes, the transactions are rolled back to their original state.
For example, if a transaction is to span databases on machines A, B, and C, in the first phase of the commit operation, each of the databases is sent the appropriate transactional update. If each of these machines acknowledges that it has received the update, the second phase of the update executes the COMMIT command. By separating the transmission of the data for the update from the actual COMMIT
operation, a two-phase commit greatly decreases the possibility of distributed data losing its integrity.
Youcan compare this approach to a single-phase update in which the COMMIT
command is sent along with the transactional update information. There is no way of knowing whether the update ever reached all the machines, so any sort of interruption in the delivery of the update to any of the machines would cause the data to be in
308
|
Chapter 13: Oracle Distributed Databases and Distributed Data
an inconsistent state. When a transaction involves more than one machine, the possibility of the loss of an update to one of the machines increases greatly, which, in turn, mandates the use of the two-phase commit protocol. Of course, since the two-phase commit protocol requires more messaging to be passed between machines, a two-phase commit can take longer than a standard commit; however, the corresponding gain in all-important data integrity more than makes up for the decrease in performance.
Transaction Processing Monitors
In 1991, the X/Open standards group defined an open systems standard interface through which transaction processing (TP) monitors could communicate with XA-compliant resource managers, such as the Oracle RDBMS and other XA-compliant databases. Several popular TP monitors that support XA are in production, including BEA Tuxedo and IBM’s CICS and Encina.
Oracle added an Oracle Manager for Microsoft Transaction Server (MTS) to Oracle8
i
for Windows NT. Since that time, Microsoft has superseded its COM architecture with the .NET architecture. Release 2 of Oracle9
i
added .NET support enabling .NET transactional applications to use Oracle as a resource manager.
We have mentioned TP monitors in previous chapters in connection with their role in online transaction processing. Among their other duties, TP monitors assure that transactions between multiple applications and resources complete properly. As noted previously, Oracle provides its own two-phase commit protocol for distributed transactions, a capability once available only with a TP monitor. Standalone TP
monitors are also used less frequently today for workload management (see
Figure 13-2), as this capability is now built into middle-tier applications.
Application Server
with Transaction
Processing Monitor
Clients
Database
Servers
Figure 13-2. Application server with TP monitor
Accessing Multiple Databases As a Single Entity
|
309
If you are still considering the use of TP monitors, you likely have one of these scenarios:
• Migration of legacy applications (usually originally written using CICS and COBOL for a mainframe) to CICS on Unix or Windows NT
• Need for two-phase commits between Oracle and other XA-compliant databases
Moving Data Between Distributed Systems
The previous section discussed the use of multiple database servers acting together as part of a single logical database for users. The following situations call for the contents of a database to be duplicated and moved between systems:
• When data available locally eliminates network bandwidth issues or contention for system resources
• When mobile database users can take their databases with them and operate disconnected from the network
• When redundant databases can help to deliver higher levels of reliability, as each database can be used as a backup for other databases
In many grid implementations, the ability to share resources across the grid can also require data to be replicated to multiple servers within the grid.
The biggest issue facing users of multiple identical or similar databases is how to keep the data on all of the servers in sync as the data is changed over time. As a user inserts, updates, or deletes data on one database, you need to have some way to get this new data to the other databases. In addition, youwill have to deal with the possible data-integrity issues that can crop up if the changes introduced by distributed users contend with each other.
Oracle offers a number of strategies to address this situation. With Oracle9
i
Release 2, these strategies were rolled into a single component, Oracle Streams. However, the different strategies within Streams still have their own characteristics, which are discussed separately in the following sections.
Advanced Replication
The copying and maintaining of database tables among multiple Oracle databases on distributed systems is known as
replication
. Changes that are applied at any local site are propagated automatically to all of the remote sites. These changes can include updates to data or changes to the database schema. Replication is frequently implemented to provide faster access for local users at remote sites or to provide a disaster-recovery site in the event of loss of a primary site. Oracle’s Advanced Replication features support both asynchronous replication and synchronous replication.
Oracle also supports heterogeneous replication to DB2 through its Replication Services, bundled in the Mainframe Integration Gateways.
310
|
Chapter 13: Oracle Distributed Databases and Distributed Data
Replication services have been in the Oracle database for a long time, but have been continually evolving. Oracle8 moved execution of replication triggers to the database kernel and enabled automatic parallelization of data replication to improve performance. Oracle8
i
added replication triggered by changes to selected rows or columns of a table. Oracle9
i
replication added support for object datatypes and multitier updateable materialized views. Release 2 of Oracle9
i
added log-based replication via Oracle Streams. Although Oracle continued to support the previous generation Advanced Replication in newer database releases, we recommend that for new implementations youuse Streams for replication. Nevertheless, we’ll describe replication basics here and some of the features of Advanced Replication for completeness before we cover Streams.
Asynchronous replication
is the storage of changes locally for subsequent forwarding to a remote site. Some types of asynchronous replication include read-only snapshots replicated from a single updateable master table and updateable snapshots that, though disconnected, can also be updated.
In the Standard Edition of Oracle, youcan have only one master site, which replicates changes to its child sites. In the Enterprise Edition, multiple master sites can exist and updates can take place at any of these sites. The updates to these sites must be
synchronized
, meaning that an update is not completed until all of the target sites have been updated; otherwise, conflicts can remain unresolved.
Conflicts can occur when more than one site updates the same data element during the same replication interval. Changes are propagated using deferred remote procedure calls (RPCs) based on events or at points in time when connectivity is available or communications costs are minimal.
Several conflict-resolution routines provided with Enterprise Edition can be automatically used to resolve replication conflicts. An administrator can simply choose which conflict-resolution strategy he wishes to use for a particular replication. For updates that may affect a column or groups of columns, standard resolution choices include the following: