Oracle Clusterware processes for 10g
Cluster Synchronization Services (ocssd) — Manages cluster node membership and runs as the oracle user; failure of this process results in cluster restart.
Cluster Ready Services (crsd) — The CRS process manages cluster resources (which could be a database, an instance, a service, a listener, a virtual IP (VIP) address, an application process, and so on) based on the resource's configuration information stored in the OCR. This includes start, stop, monitor, and failover operations. This process runs as the root user.
Event manager daemon (evmd) — A background process that publishes the events that CRS creates.
Process Monitor Daemon (OPROCD) — This process monitors the cluster and provides I/O fencing. OPROCD performs its check, then sleeps; if it wakes up later than the expected time, it resets the processor and reboots the node. An OPROCD failure results in Oracle Clusterware restarting the node. OPROCD uses the hangcheck timer on Linux platforms.
RACG (racgmain, racgimon) — Extends clusterware to support Oracle-specific requirements and complex resources. Runs server callout scripts when FAN events occur.
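The OPROCD check described above (sleep, wake, compare the wake-up time with the expected time) can be sketched as a small watchdog cycle. This is only an illustration of the idea, not OPROCD's actual implementation; the function names, the `margin` parameter, and the returned strings are hypothetical.

```python
import time


def hang_detected(expected_wakeup, actual_wakeup, margin):
    """Return True when the wake-up came later than expected plus the allowed margin."""
    return (actual_wakeup - expected_wakeup) > margin


def watchdog_cycle(interval, margin, clock=time.monotonic, sleep=time.sleep):
    """Run one check: sleep for `interval`, then see how late the wake-up was.

    `clock` and `sleep` are injectable so the logic can be exercised without
    really waiting. The return value is a label only; nothing is rebooted.
    """
    start = clock()
    sleep(interval)
    woke = clock()
    if hang_detected(start + interval, woke, margin):
        return "reset-node"
    return "ok"
```

A late wake-up means the node was stalled for longer than the margin, which is the condition under which the node is fenced by a reboot.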
Oracle Clusterware Components
Voting Disk — Oracle RAC uses the voting disk to manage cluster membership by way of a health check and arbitrates cluster ownership among the instances in case of network failures. The voting disk must reside on shared disk.
Oracle Cluster Registry (OCR) — Maintains cluster configuration information as well as configuration information about any cluster database within the cluster. The OCR must reside on shared disk that is accessible by all of the nodes in your cluster.
Oracle database background processes specific to RAC
•LMS—Global Cache Service Process
•LMD—Global Enqueue Service Daemon
•LMON—Global Enqueue Service Monitor
•LCK0—Instance Enqueue Process
* Lock monitor (LMON) process: The LMON process
monitors all instances in a cluster to detect the failure of an instance. It
then facilitates the recovery of the global locks held by the failed instance.
It is also responsible for reconfiguring locks and other resources when
instances leave or are added to the cluster (as they fail and come back online,
or as new instances are added to the cluster in real time).
* Lock manager daemon (LMD) process: The LMD process handles lock manager service requests for the global cache service (keeping the block buffers consistent between instances). It works primarily as a broker sending requests for resources to a queue that is handled by the LMSn processes. The LMD handles global deadlock detection/resolution and monitors for lock timeouts in the global environment.
* Lock manager server (LMSn) process: In a RAC environment, each instance of Oracle runs on a different machine in the cluster, and all instances access the same set of database files in read-write fashion. To achieve this, the SGA block buffer caches must be kept consistent with one another. This is one of the main goals of the LMSn process. In earlier releases, under Oracle Parallel Server (OPS), this was accomplished via a ping: if a node in the cluster needed a read-consistent view of a block that was locked in exclusive mode by another node, the data was exchanged via a disk flush (the block was pinged). This was a very expensive operation just to read data. Now, with LMSn, the exchange is a very fast cache-to-cache transfer over the cluster's high-speed connection. You may have up to ten LMSn processes per instance.
The primary job of LMSn is to transport blocks across the nodes for cache-fusion requests. For a consistent-read request, the LMS process rolls back the block, makes a consistent-read image of it, and then ships this image across the HSI (High Speed Interconnect) to the requesting process on the remote node.
* Lock (LCK0) process: This process is very similar in functionality to the LMD process described earlier, but it handles requests for all global resources other than database block buffers.
* Diagnosability daemon (DIAG) process: The DIAG process is used exclusively in a RAC environment. It is responsible for monitoring the overall ‘health’ of the instance, and it captures information needed in the processing of instance failures.
To ensure that each Oracle RAC database instance obtains the block that it needs to satisfy a query or transaction, Oracle RAC instances use two processes, the Global Cache Service (GCS) and the Global Enqueue Service (GES). The GCS and GES maintain records of the statuses of each data file and each cached block using a Global Resource Directory (GRD). The GRD contents are distributed across all of the active instances.
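The way a directory of cached blocks lets an instance fetch a block from a peer's cache instead of from disk can be sketched with a toy model. This is a simplified illustration, not Oracle's implementation: the `Instance` and `Cluster` classes, the single shared `directory` dictionary, and the instance names are all hypothetical, and locking modes are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    name: str
    cache: dict = field(default_factory=dict)  # block_id -> block contents


class Cluster:
    def __init__(self, instances, disk):
        self.instances = {inst.name: inst for inst in instances}
        self.disk = disk      # block_id -> block contents on shared storage
        self.directory = {}   # toy GRD: block_id -> name of an instance caching it

    def read_block(self, requester, block_id):
        """Return (contents, source) where source says where the block came from."""
        inst = self.instances[requester]
        if block_id in inst.cache:
            return inst.cache[block_id], "local cache"
        holder = self.directory.get(block_id)
        if holder is not None:
            # Another instance already caches the block: copy it cache-to-cache.
            data = self.instances[holder].cache[block_id]
            source = f"cache of {holder}"
        else:
            # Nobody caches it yet: read from shared disk and record the holder.
            data = self.disk[block_id]
            source = "disk"
            self.directory[block_id] = requester
        inst.cache[block_id] = data
        return data, source
```

With two instances, the first read of a block comes from disk, the second instance's read is served from the first instance's cache, and later reads on either instance are local.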
Private Interconnect
Clusterware uses the private interconnect for cluster synchronization (network heartbeat) and daemon communication between the clustered nodes. This communication is based on the TCP protocol.
RAC uses the interconnect for cache fusion (UDP) and inter-process communication (TCP). Cache Fusion is the remote memory mapping of Oracle buffers, shared between the caches of participating nodes in the cluster.
Virtual IP (VIP) in Oracle RAC
Without VIPs or FAN, clients connected to a node that has died will often wait for a TCP timeout period (which can be up to 10 minutes) before getting an error. As a result, you do not really have a good HA solution without VIPs.
When a node fails, its VIP is automatically failed over to another node, and the new node re-ARPs, announcing a new MAC address for the IP. Subsequent packets sent to the VIP go to the new node, which sends error RST packets back to the clients. The clients therefore get errors immediately.
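Because a relocated VIP answers with an immediate error rather than leaving the client to time out, a client can move on to the next address straight away. The sketch below illustrates that client-side loop under stated assumptions: `connect_with_failover` and the injected `connect` callable are hypothetical names, and no real network call is made.

```python
def connect_with_failover(addresses, connect):
    """Try each address in order and return the first successful connection.

    `connect` is any callable that returns a connection object or raises
    ConnectionError (ConnectionRefusedError and ConnectionResetError are
    subclasses of it), which is the kind of fast failure a failed-over VIP
    produces.
    """
    errors = []
    for addr in addresses:
        try:
            return connect(addr)
        except ConnectionError as exc:
            errors.append((addr, exc))
    raise ConnectionError(f"all addresses failed: {errors}")
```

The loop is only useful when failures arrive quickly; against a dead node without a VIP, each attempt could block for the full TCP timeout before the next address is tried.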
Number of nodes supported in a RAC database
Oracle 10g Release 2 supports up to 100 nodes in a cluster using Oracle Clusterware, and up to 100 instances in a RAC database.
The following processes are unique to a RAC environment; you will not see them otherwise.
The additional RAC-centric processes are the DIAG, LCK, LMON, LMDn, and LMSn processes. A brief description of each, and of how they interact in a RAC environment, follows.
DIAG: This is the diagnostic daemon. It constantly monitors the health of the instances across the RAC and watches for possible failures. There is one per instance.
LCK: This lock process manages requests that are not cache-fusion requests, such as row cache requests and library cache requests. Only a single LCK process is allowed for each instance.
LMD: The Lock Manager Daemon. This is also sometimes referred to as the GES (Global Enqueue Service) daemon, since its job is to manage global enqueue and global resource access. It also detects deadlocks and monitors lock conversion timeouts.
LMON: The Lock Monitor Process. It is the GES monitor. It reconfigures the lock resources when nodes are added or removed, and it generates a trace file every time a node reconfiguration takes place. It also monitors the RAC cluster-wide, detects a node's demise, and triggers a quick reconfiguration.
LMS: This is the Lock Manager Server process, sometimes also called the GCS (Global Cache Services) process. Its primary job is to transport blocks across the nodes for cache-fusion requests. For a consistent-read request, the LMS process rolls back the block, makes a consistent-read image of it, and then ships this image across the HSI (High Speed Interconnect) to the requesting process on the remote node. LMS must also check constantly with the LMD background process (the GES process) to get the lock requests placed by the LMD process. Up to 10 such processes can be generated dynamically.
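The "roll back the block, make a consistent-read image" step can be illustrated with a toy model in which each change to a block leaves an undo record holding the previous value. This is a sketch under assumptions, not Oracle's block or undo format: the dictionary layout, the field names, and the use of a numeric `scn` to order changes are introduced here for illustration only.

```python
def consistent_read_image(block, undo_records, query_scn):
    """Build a copy of the block as it looked at query_scn.

    block:        {"rows": {row_id: value}} holding the current contents
    undo_records: list of {"scn", "row", "old_value"}; old_value is None
                  when the change inserted the row
    The current block is left untouched; only the copy is rolled back.
    """
    image = dict(block["rows"])
    # Undo the newest changes first, stopping at changes the query may see.
    for rec in sorted(undo_records, key=lambda r: r["scn"], reverse=True):
        if rec["scn"] <= query_scn:
            break
        if rec["old_value"] is None:
            image.pop(rec["row"], None)   # the row did not exist yet
        else:
            image[rec["row"]] = rec["old_value"]
    return image
```

The resulting image, rather than the current block, is what would be shipped to the requesting instance in this model.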
A Real Application Clusters database has the same processes as a single-instance Oracle database, such as process monitor (PMON), database writer (DBWRn), log writer (LGWR), and so on. There are also additional Real Application Clusters-specific processes, as shown in Figure 3-1. The exact names of these processes and the trace files that they create are platform-dependent.
- Global Cache Service Processes (LMSn), where n ranges from 0 to 9 depending on the amount of messaging traffic, control the flow of messages to remote instances and manage global data block access. LMSn processes also transmit block images between the buffer caches of different instances. This processing is part of the Cache Fusion feature.
- The Global Enqueue Service Monitor (LMON) monitors global enqueues and resources across the cluster and performs global enqueue recovery operations. Enqueues are shared memory structures that serialize row updates.
- The Global Enqueue Service Daemon (LMD) manages global enqueue and global resource access. Within each instance, the LMD process manages incoming remote resource requests.
- The Lock Process (LCK) manages non-Cache Fusion resource requests such as library and row cache requests.
- The Diagnosability Daemon (DIAG) captures diagnostic data about process failures within instances. The operation of this daemon is automated, and it updates an alert log file to record the activity that it performs.
The ONS Daemon Explained in RAC/CRS Environment
=====================================
Purpose of the ons daemon
The Oracle Notification Service (ONS) daemon is a daemon started by the CRS clusterware as part of the nodeapps. There is one ons daemon started per clustered node.
The Oracle Notification Service daemon receives a subset of published clusterware events via the local evmd and racgimon clusterware daemons and forwards those events to application subscribers and to the local listeners.
This facilitates:
a. FAN (Fast Application Notification), the feature that allows applications to respond to database state changes.
b. the 10gR2 Load Balancing Advisory, the feature that permits load balancing across the RAC nodes according to the load on each node. The RDBMS MMON process creates an advisory for the distribution of work every 30 seconds and forwards it via racgimon and ONS to listeners and applications.
Launching the ons daemon
The ons daemon is started as part of the nodeapps from the $ORA_CRS_HOME environment as the oracle user, for example:
crs_stat -p ora.<hostname>.ons | grep ACTION_SCRIPT
ACTION_SCRIPT=/u01/app/oracle/product/crs/bin/racgwrap
crs_getperm ora.hostname.ons
Name: ora.hostname.ons
owner:oracle:rwx,pgrp:dba:r-x,other::r--,
The commands used by the clusterware to start, stop, and ping the ons daemon are 'onsctl start', 'onsctl stop', and 'onsctl ping'.
It is possible to start/stop the ons daemon on one node via the clusterware commands:
crs_start ora.<hostname>.ons
crs_stop ora.<hostname>.ons
for debugging purposes.
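When checking a resource profile such as the ACTION_SCRIPT line shown above, it can help to turn the NAME=VALUE lines into a dictionary rather than grepping one attribute at a time. The sketch below parses text that has already been captured; `parse_profile` is a hypothetical helper, and the sample text contains only the kind of lines shown in this section plus an illustrative NAME line.

```python
def parse_profile(text):
    """Parse KEY=VALUE lines (as printed for a resource profile) into a dict.

    Lines without an '=' are skipped. Only the first '=' splits the line,
    so values that themselves contain '=' are preserved.
    """
    attrs = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep:
            attrs[key] = value
    return attrs


# Illustrative sample; real output contains many more attributes.
sample = """NAME=ora.hostname.ons
ACTION_SCRIPT=/u01/app/oracle/product/crs/bin/racgwrap
"""
profile = parse_profile(sample)
```

The text to parse would normally be the saved output of the crs_stat command shown earlier; running that command from Python is left out here so the sketch stays self-contained.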
The Global Services Daemon
The Global Services Daemon (GSD) runs on each node, with one GSD process per node. The GSD coordinates with the cluster manager to receive requests from clients such as the DBCA, EM, and the SRVCTL utility to execute administrative tasks such as instance startup or shutdown. The GSD is not an Oracle instance background process and is therefore not started with the Oracle instance.
Global Resource Directory with Distributed Architecture
The GCS and GES maintain a Global Resource Directory to record information about resources. The Global Resource Directory resides in memory, is distributed throughout the cluster, and is available to all active instances. In this distributed architecture, each node participates in the management of information in the directory. This distributed scheme provides fault tolerance and enhanced runtime performance.
The GCS and GES ensure the integrity of the Global Resource Directory even if multiple nodes fail. The shared database is always accessible if at least one instance is active after recovery is completed. The fault tolerance of the resource directory also enables Real Application Clusters instances to start and stop at any time, in any order.
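One way to picture a directory that is spread over the active instances, and that survives instances leaving, is to assign each resource a mastering instance and recompute the assignment whenever membership changes. The sketch below uses simple modulo hashing purely as an illustrative assumption; it is not Oracle's actual mastering scheme, and `master_of`, `build_directory`, and the resource names are hypothetical.

```python
import zlib


def master_of(resource, instances):
    """Pick the mastering instance for a resource among the active instances."""
    ordered = sorted(instances)
    # crc32 is used because it is stable across runs, unlike the built-in hash().
    return ordered[zlib.crc32(resource.encode()) % len(ordered)]


def build_directory(resources, instances):
    """Map every resource to a master chosen from the active instances."""
    return {res: master_of(res, instances) for res in resources}


resources = [f"block-{n}" for n in range(20)]
before = build_directory(resources, {"rac1", "rac2", "rac3"})
# rac2 leaves the cluster: rebuild so every resource has a surviving master.
after = build_directory(resources, {"rac1", "rac3"})
```

Rebuilding from the surviving membership is what keeps every resource mastered somewhere, even down to a single remaining instance.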
What is GRD?
GRD stands for Global Resource Directory. The GES and GCS maintain records of the status of each datafile and each cached block using the Global Resource Directory. This bookkeeping supports Cache Fusion and helps preserve data integrity.
Give Details on Cache Fusion:-
Oracle RAC is composed of two or more instances. When one instance has read a block of data from a datafile and another instance needs the same block, it is faster to get the block image from the instance that already holds it in its SGA than to read it from disk. To enable this inter-instance communication, Oracle RAC makes use of interconnects. The Global Cache Service (GCS), through the LMSn processes, manages Cache Fusion block transfers, while the Global Enqueue Service (GES) and the instance enqueue process (LCK0) handle the remaining, non-Cache Fusion resources.
Components in RAC that must reside in shared storage
All datafiles, controlfiles, SPFILEs, and redo log files must reside on cluster-aware shared storage.
Interconnect network
An interconnect network is a private network that connects all of the servers in a cluster. The interconnect network uses a switch/multiple switches that only the nodes in the cluster can access.
Cache Fusion uses the cluster interconnect for inter-instance communication.
FAN
Fast Application Notification (FAN) covers events related to instances, services, and nodes. It is a notification mechanism that Oracle RAC uses to notify other processes about configuration and service-level information, including service status changes such as UP or DOWN events. Applications can respond to FAN events and take immediate action.
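How an application might react to UP and DOWN events can be sketched with a small parser and dispatcher. The payload layout used here (an event type followed by space-separated name=value pairs) is an assumption made for illustration and should be checked against the events your release actually emits; `parse_fan_event`, `react`, and the sample values are hypothetical.

```python
def parse_fan_event(payload):
    """Split an event string into (event_type, {name: value}).

    Assumes the first token is the event type and the rest are name=value
    pairs without embedded spaces; tokens without '=' are ignored.
    """
    event_type, _, rest = payload.partition(" ")
    fields = {}
    for token in rest.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return event_type, fields


def react(payload):
    """Return a description of the action an application could take."""
    _, fields = parse_fan_event(payload)
    target = fields.get("instance") or fields.get("host") or "unknown"
    status = fields.get("status")
    if status == "down":
        return f"drop connections to {target}"
    if status == "up":
        return f"open connections to {target}"
    return "ignore"
```

A connection pool could use such a handler to discard sessions to a failed instance immediately instead of waiting for each one to time out.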