Oracle Environment

Jump to: navigation, search

Kernel Parameters

Oracle Global Customer Support officially recommends a maximum for SHMMAX of "1/2 of physical RAM".

The maximum size of a shared memory segment is limited by the size of the available user address space. On 64-bit systems, this is a theoretical 264bytes. So the theoretical limit for SHMMAX is the amount of physical RAM that you have. However, to actually attempt to use such a value could potentially lead to a situation where no system memory is available for anything else. Therefore a more realistic physical limit for SHMMAX would probably be "physical RAM - 2G".

In an Oracle RDBMS application, this physical limit still leaves inadequate system memory for other necessary functions. Therefore, the common Oracle maximum for SHMMAX that you will often see is "1/2 of physical RAM". Operators may erroneously think that that setting the SHMMAX as recommended limits the total SGA, which is untrue. Setting the SHMMAX as recommended only causes a few more shared memory segments to be used for whatever total SGA that you subsequently configure in Oracle.

Modify your kernel settings in /etc/sysctl.conf as follows. If the current value for any parameter is higher than the value listed in this table, do not change the value of that parameter. Range values (such as net.ipv4.ip_local_port_range) must match exactly.

kernel.shmall                = physical RAM size / pagesize
kernel.shmmax                = 1/2 of physical RAM.
kernel.shmmni                = 4096
kernel.sem                   = 250 32000 100 128
fs.file-max                  = 512 x processes (for example 6815744 for 13312 processes)
fs.aio-max-nr                = 1048576
net.ipv4.ip_local_port_range = 9000 65500
net.core.rmem_default        = 262144
net.core.rmem_max            = 4194304
net.core.wmem_default        = 262144
net.core.wmem_max            = 1048576

Set shmall equal to the sum of all the SGAs on the system, divided by the page size. The SGA values can be calculated with a one line script:

# su - oracle
$ SGA=`echo "show sga"|sqlplus -s / as sysdba|grep "^Total System"|awk '{print $5}'`; PAGE=`getconf PAGE_SIZE`; echo "$SGA/$PAGE" | bc

Userspace Setup

Oracle groups and user(s): normally two groups, 'oinstall' and 'dba' and one user, 'oracle'). The 'oracle' user has a hefty custom environment configured for all the variables needed. This is why we 'su - oracle' and not just a 'su oracle' when needing to run sqlplus - a full login shell with all variables initialized is required.

Add the following settings to /etc/security/limits.conf for the 'oracle' user (adjust as needed):

oracle soft nproc 2047
oracle hard nproc 16384
oracle soft nofile 1024
oracle hard nofile 65536
oracle soft stack 10240

Automatic Storage Management (ASM)

Oracle supplies it's own kernel module for ASM use which the DBA team will install; ASM can be summarized by three main points:

  • Direct I/O to storage (bypasses kernel buffering)
  • Solves 4k block size limitation of ext3
  • Cluster aware filesystem on raw devices (RAC)

Unlike other kernel modules using DKMS, Oracle provides pre-compiled binaries for very specific versions of the Red Hat Enterprise kernel and can be found via their landing page below. They have a great intro article on learning more about how it actually works.

Real Application Cluster (RAC)

An Oracle RAC is typically 2+ machines in a cluster with shared storage, although it is possible to configure a single-node RAC.

RAC Networking

Oracle RAC needs a private network between servers to use; in some cases using Jumbo Frames and/or 10G switches. The Oracle RAC nodes use this private network link for inter-node communication of large amounts of data using UDP, terabytes of both RX and TX traffic per month is not uncommon for highly active cluster nodes. It is common for the RAC nodes to use 8k UDP packets on this private network as they pass table data back and forth to stay in sync.

Server A

  • bond0 (eth0 / eth1) -
  • bond1 (eth2 / eth3) - (RAC Interconnect)

Server B

  • bond0 (eth0 / eth1) -
  • bond1 (eth2 / eth3) - (RAC Interconnect)

Shared Storage

The shared storage should have a minimum of 7 LUNs presented to the servers; 5x 1G control and 2+ xxxG data. More may be used to further spread out the data for better performance; RAID-10 is the suggested design across as many spindles as possible.

  • 5x 1G Raw Control LUNs
    • 2x LUNs for OCR
      • Oracle Cluster Registry: The OCR stores the details of the cluster configuration including the names and current status of the database, associated instances, services, and node applications such as the listener process.
    • 3x LUNs for Voting
      • CSS Voting Disks are used to determine which nodes are currently available within the cluster. An odd number is always used.
  • 2+ xxxG ASM Data LUNs
    • 1 LUN has one set of data, control, redo
    • 1 LUN has one set of data, control, redo, archivelog

It is common the Server Parameter File (SPFILE) is stored on the ASM disks and there is no Flash Recovery Area (FRA) unless specifically requested. The FRA is typically twice as large as the Data LUNs.

Virtual IP Setup

A RAC requires 5 additional IP addresses from the same subnet as the NAT IPs to be used by Oracle; they are not configured on the servers in the traditional fashion.

  • The two VIP addresses are considered legacy for 11gR2; their use is deprecated. It's possible that an Oracle client may have an older JDBC driver that talks only "VIP" style.
  • The three SCAN addresses are round-robin returned by a standard NS resolve on the client end; one node listener ("TNS LIstener") has two of the IPs configured, the other node has one IP. When an Oracle client does a NS lookup on the SCAN DNS name they connect to the TNS Listener on that IP; the TNS Listener is a load balancer and may actually communicate with another node in the cluster (so not always his local node).
  • The placement of how the client resolves the listener IPs should be done in such a way that we don't insert a point of failure. For instance it's a bad idea to host the DNS record for SCAN across a site-to-site VPN link; if that link goes down then no clients can connect! It is a best practice that the DNS record for the SCAN be hosted in such a way that it's redundant and reachable via multiple paths.

During configuration the IPs needs to be in DNS:

  • Forward and reverse list the Linux server hostnames in DNS with their primary public IPs.
  • DNS list the SCAN name with all three SCANx IP addresses; you're in effect creating a round-robin lookup in DNS for the same single-named record, not creating unique records for each IP. IN A IN A IN A