Monthly Archives: May 2011

How to configure IPMI to work with Oracle RAC

The CSS (Cluster Synchronization Services) component of Oracle RAC has several kinds of kill/eviction:
instance kill -> node member kill
The relationship between these two kinds of kill in CSS is member kill escalation. For example, the database LMON process may request CSS to remove an instance from the cluster via the instance eviction mechanism. If this request times out, it can escalate to a node kill.

If even the node kill hangs in some situations, how does Oracle RAC handle it? This brings us to the topic I would like to introduce today: IPMI (Intelligent Platform Management Interface).
Since Oracle Database 11gR2, IPMI is integrated with Oracle RAC. With the configuration below, you can make it work with Oracle RAC and trigger node eviction when needed.
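As a rough mental model of the escalation chain described above (instance kill, then node kill, then IPMI), here is a minimal Python sketch. It is not Oracle's implementation: the stage labels, the `attempt` callback, and the function `escalate` are hypothetical and only illustrate the idea of moving to the next, stronger mechanism when the previous one hangs or times out.

```python
from typing import Callable, Optional, Sequence

# Stage labels follow the text; they are for illustration only.
STAGES = ("instance kill", "node kill", "IPMI power control")


def escalate(attempt: Callable[[str], bool],
             stages: Sequence[str] = STAGES) -> Optional[str]:
    """Try each stage in order and return the first one that succeeds.

    `attempt(stage)` is a hypothetical callback that returns True when the
    stage completed and False when it hung or timed out.
    """
    for stage in stages:
        if attempt(stage):
            return stage
    return None  # every mechanism failed


# Example: the instance kill and the node kill both hang, so IPMI is used.
hung = {"instance kill", "node kill"}
print(escalate(lambda stage: stage not in hung))
```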

  • How to configure IPMI

    1) Log in as root.

    2) Verify that ipmitool can communicate with the BMC using the IPMI driver by using the command bmc info, and looking for a device ID in the output. For example:

    # ipmitool bmc info
    Device ID : 32

    If ipmitool is not communicating with the BMC, then check the BMC configuration and ensure that the IPMI driver is running.

    3) Enable IPMI over LAN using the following procedure

    Determine the channel number for the channel used for IPMI over LAN. Beginning with channel 1, run the following command until you find the channel that displays LAN attributes (for example, the IP address):

    # ipmitool lan print 1

    IP Address Source : 0x01
    IP Address : 140.87.155.89

    Turn on LAN access for the channel found. For example, where the channel is 1:

    # ipmitool -I bmc lan set 1 access on

    4) Configure IP address settings for IPMI using the static IP addressing procedure:

    Using static IP Addressing

    If the BMC shares a network connection with ILOM, then the IP address must be on the same subnet. You must set not only the IP address, but also the proper values for netmask, and the default gateway. For example, assuming the channel is 1:

    # ipmitool -I bmc lan set 1 ipaddr 192.168.0.55
    # ipmitool -I bmc lan set 1 netmask 255.255.255.0
    # ipmitool -I bmc lan set 1 defgw ipaddr 192.168.0.1
    

    Note that the specified address (192.168.0.55) will be associated only with the BMC, and will not respond to normal pings.

    5) Establish an administration account with a username and password, using the following procedure (assuming the channel is 1):

    Set BMC to require password authentication for ADMIN access over LAN. For example:

    # ipmitool -I bmc lan set 1 auth ADMIN MD5,PASSWORD

    List the account slots on the BMC, and identify an unused slot (a User ID with an empty user name field). For example:

    # ipmitool channel getaccess 1
    . . . 
    User ID              : 4
    User Name            :
    Fixed Name           : No
    Access Available     : call-in / callback
    Link Authentication  : disabled
    IPMI Messaging       : disabled
    Privilege Level      : NO ACCESS
    . . .
    

    Assign the desired administrator user name and password and enable messaging for the identified slot. (Note that for IPMI v1.5 the user name and password can be at most 16 characters). Also, set the privilege level for that slot when accessed over LAN (channel 1) to ADMIN (level 4). For example, where username is the administrative user name, and password is the password:

    
    # ipmitool user set name 4 username
    # ipmitool user set password 4 password
    # ipmitool user enable 4
    # ipmitool channel setaccess 1 4 privilege=4
    # ipmitool channel setaccess 1 4 link=on
    # ipmitool channel setaccess 1 4 ipmi=on
    

    6) Verify the setup using the command lan print 1. The output should appear similar to the following. It should show the settings made in the preceding configuration steps; comments or alternative options are indicated within brackets []:

    # ipmitool lan print 1
    Set in Progress         : Set Complete
    Auth Type Support       : NONE MD2 MD5 PASSWORD
    Auth Type Enable        : Callback : MD2 MD5
                            : User     : MD2 MD5
                            : Operator : MD2 MD5
                            : Admin    : MD5 PASSWORD
                            : OEM      : MD2 MD5
    IP Address Source       : DHCP Address [or Static Address]
    IP Address              : 192.168.0.55
    Subnet Mask             : 255.255.255.0
    MAC Address             : 00:14:22:23:fa:f9
    SNMP Community String   : public
    IP Header               : TTL=0x40 Flags=0x40 Precedence=… 
    Default Gateway IP      : 192.168.0.1
    Default Gateway MAC     : 00:00:00:00:00:00
    .
    .
    .
    # ipmitool channel getaccess 1 4
    Maximum User IDs     : 10
    Enabled User IDs     : 2
     
    User ID              : 4
    User Name            : username [This is the administration user]
    Fixed Name           : No
    Access Available     : call-in / callback
    Link Authentication  : enabled
    IPMI Messaging       : enabled
    Privilege Level      : ADMINISTRATOR
    

    Verify that the BMC is accessible and controllable from a remote node in your cluster, for example by using the lan print command. If node2-ipmi is the network host name assigned to the IP address of node2's BMC, then to verify the BMC on node2 from node1 with the administrator account username, enter the following command on node1:

    $ ipmitool -H node2-ipmi -U username lan print 1

    You are prompted for a password. Provide the IPMI password.

    If the BMC is correctly configured, then you should see information about the BMC on the remote node. If you see an error message, such as Error: Unable to establish LAN session, then you must check the BMC configuration on the remote node.

    Repeat this process for each cluster member node.
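Because the same commands have to be repeated on every node, it can help to generate them from a few per-node values. The sketch below only builds the command strings from steps 3 to 5 and does not run anything. The helper name `bmc_setup_commands` is my own, the example values are the ones used in the steps above, and the password is deliberately left as the literal placeholder `password`.

```python
from typing import List


def bmc_setup_commands(channel: int, ipaddr: str, netmask: str,
                       gateway: str, slot: int, username: str) -> List[str]:
    """Return the ipmitool commands from steps 3-5 for one node.

    The password command keeps the placeholder `password`, as in the
    text, so that no real secret ends up in a script.
    """
    lan = f"ipmitool -I bmc lan set {channel}"
    acc = f"ipmitool channel setaccess {channel} {slot}"
    return [
        f"{lan} access on",
        f"{lan} ipaddr {ipaddr}",
        f"{lan} netmask {netmask}",
        f"{lan} defgw ipaddr {gateway}",
        f"{lan} auth ADMIN MD5,PASSWORD",
        f"ipmitool user set name {slot} {username}",
        f"ipmitool user set password {slot} password",
        f"ipmitool user enable {slot}",
        f"{acc} privilege=4",
        f"{acc} link=on",
        f"{acc} ipmi=on",
    ]


# Print the commands for one node so they can be reviewed before use.
for cmd in bmc_setup_commands(1, "192.168.0.55", "255.255.255.0",
                              "192.168.0.1", 4, "username"):
    print("#", cmd)
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; review the output and run it as root on each node.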

    Below is a demo in my environment on how to set it up.

    
    # ipmitool bmc info
    Device ID                 : 32
    Device Revision           : 1
    Firmware Revision         : 3.0
    IPMI Version              : 2.0
    Manufacturer ID           : 42
    Manufacturer Name         : Sun Microsystems
    Product ID                : 18177 (0x4701)
    Device Available          : yes
    Provides Device SDRs      : no
    Additional Device Support :
        Sensor Device
        SDR Repository Device
        SEL Device
        FRU Inventory Device
        IPMB Event Receiver
        IPMB Event Generator
        Chassis Device
    Aux Firmware Rev Info     : 
        0x03
        0x20
        0x00
        0x00
    # ipmitool -I bmc lan set 1 ipaddr 10.137.17.12
    Setting LAN IP Address to 10.137.17.12
    # ipmitool -I bmc lan set 1  netmask 255.255.252.0
    Setting LAN Subnet Mask to 255.255.252.0
    # ipmitool -I bmc lan set 1  defgw ipaddr 10.137.16.1
    Setting LAN Default Gateway IP to 10.137.16.1
    # ipmitool -I bmc lan set 1 auth ADMIN MD5,PASSWORD
    # ipmitool channel getaccess 1
    Get User Name (id 1) failed: Invalid data field in request
    # ipmitool user set name  5 crsusr
    Set User Name command failed (user 5, name crsusr): Unknown (0x5)
    # ipmitool user set password 5 cdcora
    # ipmitool user enable 5
    # ipmitool channel setaccess 1 5 privilege=4
    # ipmitool channel setaccess 1 5 link=on
    # ipmitool channel setaccess 1 5 ipmi=on
    # ipmitool lan print 1
    Set in Progress         : Set Complete
    Auth Type Support       : NONE MD2 MD5 PASSWORD 
    Auth Type Enable        : Callback : MD2 MD5 PASSWORD 
                            : User     : MD2 MD5 PASSWORD 
                            : Operator : MD2 MD5 PASSWORD 
                            : Admin    : MD5 PASSWORD 
                            : OEM      : 
    IP Address Source       : Static Address
    IP Address              : 10.137.17.12
    Subnet Mask             : 255.255.252.0
    MAC Address             : 00:21:28:11:bd:0f
    SNMP Community String   : public
    IP Header               : TTL=0x00 Flags=0x00 Precedence=0x00 TOS=0x00
    BMC ARP Control         : ARP Responses Disabled, Gratuitous ARP Disabled
    Gratituous ARP Intrvl   : 5.0 seconds
    Default Gateway IP      : 10.137.16.1
    Default Gateway MAC     : 00:00:00:00:00:00
    Backup Gateway IP       : 0.0.0.0
    Backup Gateway MAC      : 00:00:00:00:00:00
    802.1q VLAN ID          : Disabled
    802.1q VLAN Priority    : 0
    RMCP+ Cipher Suites     : 2,3,0
    Cipher Suite Priv Max   : XXXXXXXXXXXXXXX
                            :     X=Cipher Suite Unused
                            :     c=CALLBACK
                            :     u=USER
                            :     o=OPERATOR
                            :     a=ADMIN
                            :     O=OEM
    # ipmitool channel getaccess 1 5
    Maximum User IDs     : 20
    Enabled User IDs     : 10
    
    User ID              : 5
    User Name            : crsusr
    Fixed Name           : No
    Access Available     : call-in / callback
    Link Authentication  : enabled
    IPMI Messaging       : enabled
    Privilege Level      : ADMINISTRATOR
    # ipmitool -H  10.137.17.12 -U crsusr lan print 1
    Password: 
    Set in Progress         : Set Complete
    Auth Type Support       : NONE MD2 MD5 PASSWORD 
    Auth Type Enable        : Callback : MD2 MD5 PASSWORD 
                            : User     : MD2 MD5 PASSWORD 
                            : Operator : MD2 MD5 PASSWORD 
                            : Admin    : MD5 PASSWORD 
                            : OEM      : 
    IP Address Source       : Static Address
    IP Address              : 10.137.17.12
    Subnet Mask             : 255.255.252.0
    MAC Address             : 00:21:28:11:bd:0f
    SNMP Community String   : public
    IP Header               : TTL=0x00 Flags=0x00 Precedence=0x00 TOS=0x00
    BMC ARP Control         : ARP Responses Disabled, Gratuitous ARP Disabled
    Gratituous ARP Intrvl   : 5.0 seconds
    Default Gateway IP      : 10.137.16.1
    Default Gateway MAC     : 00:00:00:00:00:00
    Backup Gateway IP       : 0.0.0.0
    Backup Gateway MAC      : 00:00:00:00:00:00
    802.1q VLAN ID          : Disabled
    802.1q VLAN Priority    : 0
    RMCP+ Cipher Suites     : 2,3,0
    Cipher Suite Priv Max   : XXXXXXXXXXXXXXX
                            :     X=Cipher Suite Unused
                            :     c=CALLBACK
                            :     u=USER
                            :     o=OPERATOR
                            :     a=ADMIN
                            :     O=OEM
    # cd /u01/app/11.2.0/grid/bin
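
The lan print output in the demo can also be checked with a short script. The following Python sketch parses the `field : value` lines into a dictionary and confirms that the BMC address and the default gateway are in the same subnet. The function names are my own, and the sample text is copied from the demo output above.

```python
import ipaddress
from typing import Dict


def parse_lan_print(text: str) -> Dict[str, str]:
    """Parse `ipmitool lan print` output into {field: value}.

    Continuation lines (those with an empty field name) are skipped.
    """
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key:
            fields[key] = value.strip()
    return fields


def same_subnet(ip: str, netmask: str, gateway: str) -> bool:
    """Return True if `gateway` lies in the network defined by ip/netmask."""
    network = ipaddress.ip_network(f"{ip}/{netmask}", strict=False)
    return ipaddress.ip_address(gateway) in network


sample = """\
IP Address Source       : Static Address
IP Address              : 10.137.17.12
Subnet Mask             : 255.255.252.0
MAC Address             : 00:21:28:11:bd:0f
Default Gateway IP      : 10.137.16.1
"""

lan = parse_lan_print(sample)
print(lan["IP Address"],
      same_subnet(lan["IP Address"], lan["Subnet Mask"],
                  lan["Default Gateway IP"]))
```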
    

    After this, use the crsctl command to set the corresponding IPMI address (ipmiaddr) and administrator user (ipmiadmin):

    
    [Thu May 26 07:00:18][crsusr@05:~]
    $ cd /u01/app/11.2.0/grid/bin
    [Thu May 26 07:00:40][crsusr@05:/u01/app/11.2.0/grid/bin]
    $ crsctl set css ipmiaddr 10.137.17.12
    CRS-4229: The IPMI information change was successful
    $ 
    [Thu May 26 07:00:26][crsusr@05:/u01/app/11.2.0/grid/bin]
    $ crsctl set css ipmiadmin crsusr
    IPMI BMC password:
    CRS-4229: The IPMI information change was successful
    $
    

    At this point, you can check ocssd.log to verify that it really works. Here is an example from my environment.

    
    2011-05-26 07:00:43.703: [    CSSD][5]clssnmSendIPMIReq: clssnmAuthSendReqThread spawned successfully, for the first time - nmreq 101825530
    2011-05-26 07:00:43.703: [    CSSD][70]clssscUpdateEventValue: IPMIInfo State  val 0, changes 8
    2011-05-26 07:00:43.777: [    CSSD][70]clssscUpdateEventValue: IPMIInfo State  val 0, changes 9
    2011-05-26 07:00:43.777: [    CSSD][70]clssnmnodeTest: IPMI Admin Node selected is 2 and my nodenum is 1
    2011-05-26 07:00:43.777: [    CSSD][70]clssscUpdateEventValue: IPMIInfo State  val 1, changes 10
    2011-05-26 07:00:44.069: [    CSSD][48]clssscUpdateEventValue: IPMIInfo State  val 2, changes 11
    2011-05-26 07:00:44.069: [    CSSD][70]clssscWaitChangeEventValue: ev(IPMIInfo State) changed to 2 from 1
    2011-05-26 07:00:44.069: [    CSSD][70]clssnmAuthSendReqThread: IPMI Cookie Validation succeeds and in success event change for nmreq 101825530
    2011-05-26 07:00:44.069: [    CSSD][70]clssscUpdateEventValue: IPMIInfo State  val 6, changes 12
    2011-05-26 07:00:44.085: [    CSSD][48]clssscUpdateEventValue: IPMIInfo State  val 7, changes 13
    2011-05-26 07:00:44.085: [    CSSD][70]clssscWaitOnEventValue: after IPMIInfo State  val 7, eval 7 waited 16
    2011-05-26 07:00:44.086: [    CSSD][70]clssscUpdateEventValue: IPMIInfo State  val 0, changes 14
    2011-05-26 07:01:07.489: [    CSSD][72]clssnkipmiPing:      00001:Sent IPMI ping msg, max RT timeout=250 msec
    2011-05-26 07:01:07.492: [    CSSD][72]clssnkipmiPing:      00004:IPMI pong message successfully recvd
    2011-05-26 07:01:07.774: [    CSSD][72]clssnmAuthHandleReqThread: IPMI Cookie Validation succeeds for request from node 3 named dnagad08
    2011-05-26 07:01:11.314: [    CSSD][74]clssnkipmiPing:      00000:Sent IPMI ping msg, max RT timeout=250 msec
    2011-05-26 07:01:11.325: [    CSSD][74]clssnkipmiPing:      00012:IPMI pong message successfully recvd
    2011-05-26 07:01:11.728: [    CSSD][74]clssnkipmiTalkToBMC: IPMI outbound SSN too low, discarding
    2011-05-26 07:01:11.929: [    CSSD][74]clssnmAuthHandleReqThread: IPMI Cookie Validation succeeds for request from node 3 named dnagad08
    $ 
    
    
    2011-05-26 07:01:08.006: [    CSSD][25]clssnkipmiTrMsg:     06 00 ff 07 02 4e 5f 00 00 1e bf 5d 23 bf 44 1a fb 9d 09 ed 6e b6 28 e7 80 ec 01 df 7c 09 81 1c 63 20 10 3b 00 04 91
    2011-05-26 07:01:08.006: [    CSSD][25]clssnkipmiTrMsgApp:  00216:4e 5f 00 00:RSP:MD5 :0010:SETSESPRIVLVL:00:
    2011-05-26 07:01:08.006: [    CSSD][25]clssnkipmiDestroySes: Start
    2011-05-26 07:01:08.006: [    CSSD][25]clssnkipmiTrMsg:     06 00 ff 07 02 01 00 00 00 1e bf 5d 23 41 da 08 41 05 a5 dd d2 f1 f2 48 89 39 5d 60 a9 0b 20 18 c8 81 14 3c 1e bf 5d 23 d2
    2011-05-26 07:01:08.006: [    CSSD][25]clssnkipmiTrMsgApp:  00216:01 00 00 00:REQ:MD5 :0014:SESCLOSE     :  :
    2011-05-26 07:01:08.094: [    CSSD][25]clssnkipmiTrMsg:     06 00 ff 07 02 4f 5f 00 00 1e bf 5d 23 80 dd c0 a5 9e ad 8c 7d 82 72 81 7c c5 9e bd d2 08 81 1c 63 20 14 3c 00 90
    2011-05-26 07:01:08.094: [    CSSD][25]clssnkipmiTrMsgApp:  00304:4f 5f 00 00:RSP:MD5 :0014:SESCLOSE     :00:
    2011-05-26 07:01:08.095: [    CSSD][25]clssnkipmiTermCtx:   Start
    2011-05-26 07:01:08.095: [    CSSD][25]clssnkipmiValidate:  Successful validate using method 2
    2011-05-26 07:01:11.931: [    CSSD][25]clssnmRcfgMgrThread: initiating reconfig  for modified ipmi cookie distribution 
    [Thu May 26 07:04:13][crsusr@dnagad08:/u01/app/11.2.0/grid/log/dnagad08/cssd]
    $ 
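
To see how quickly the BMC answers, the ping and pong lines in ocssd.log can be paired up. The sketch below assumes the log line layout shown above (timestamp, `[CSSD]`, thread id, function name, message); the regular expression and the function name `ipmi_ping_latencies` are my own and are not part of any Oracle tool. The sample lines are copied from the log excerpt.

```python
import re
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

# Layout assumed from the excerpt: "<timestamp>: [    CSSD][<thread>]<func>: <msg>"
LINE_RE = re.compile(
    r"^\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"\[\s*CSSD\]\[(\d+)\](\w+):\s*(.*)$"
)


def ipmi_ping_latencies(log_text: str) -> List[Tuple[str, float]]:
    """Return (thread id, round trip in ms) for each ping/pong pair."""
    pending: Dict[str, datetime] = {}  # thread id -> time the ping was sent
    results: List[Tuple[str, float]] = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line)
        if not m or m.group(3) != "clssnkipmiPing":
            continue
        stamp = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
        thread, message = m.group(2), m.group(4)
        if "Sent IPMI ping msg" in message:
            pending[thread] = stamp
        elif "pong message successfully recvd" in message and thread in pending:
            elapsed = stamp - pending.pop(thread)
            results.append((thread, elapsed / timedelta(milliseconds=1)))
    return results


sample = """\
2011-05-26 07:01:07.489: [    CSSD][72]clssnkipmiPing:      00001:Sent IPMI ping msg, max RT timeout=250 msec
2011-05-26 07:01:07.492: [    CSSD][72]clssnkipmiPing:      00004:IPMI pong message successfully recvd
2011-05-26 07:01:11.314: [    CSSD][74]clssnkipmiPing:      00000:Sent IPMI ping msg, max RT timeout=250 msec
2011-05-26 07:01:11.325: [    CSSD][74]clssnkipmiPing:      00012:IPMI pong message successfully recvd
"""

for thread, ms in ipmi_ping_latencies(sample):
    print(f"thread {thread}: {ms} ms")
```

For the two pairs in the excerpt this gives 3 ms and 11 ms, both well below the 250 msec maximum round-trip timeout reported in the log.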
    
    
    

    I ran a test by hanging ocssd.bin, cssdagent, and cssdmonitor to check that IPMI works as expected, and found the following in the log:

    
    
    
    2011-05-26 07:05:16.013: [    CSSD][43]clssnkipmiKillNode:  Power off detected. Powering on BMC at IP address 10.137.17.13
    2011-05-26 07:05:16.013: [    CSSD][43]clssnkipmiPwrOn:     Start
    
    

    This indicates that IPMI is working: after the misscount interval expired, the node was evicted through IPMI.
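
When repeating this test, it is convenient to search ocssd.log for the IPMI kill entries rather than reading the file by eye. The sketch below is a minimal example that assumes the line layout shown in the excerpt above; the function name `find_ipmi_kills` and the regular expressions are my own.

```python
import re
from typing import List, Optional, Tuple

# Matches only lines logged by clssnkipmiKillNode, as seen in the excerpt.
KILL_RE = re.compile(
    r"^\s*(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"\[\s*CSSD\]\[\d+\]clssnkipmiKillNode:\s*(.*)$"
)
IP_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def find_ipmi_kills(log_text: str) -> List[Tuple[str, Optional[str]]]:
    """Return (timestamp, BMC IP or None) for each clssnkipmiKillNode line."""
    hits: List[Tuple[str, Optional[str]]] = []
    for line in log_text.splitlines():
        m = KILL_RE.match(line)
        if m:
            ip = IP_RE.search(m.group(2))
            hits.append((m.group(1), ip.group(0) if ip else None))
    return hits


sample = """\
2011-05-26 07:05:16.013: [    CSSD][43]clssnkipmiKillNode:  Power off detected. Powering on BMC at IP address 10.137.17.13
2011-05-26 07:05:16.013: [    CSSD][43]clssnkipmiPwrOn:     Start
"""

print(find_ipmi_kills(sample))
```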

    Thanks

    How to enable flash cache in 11.2

    The following are my settings, for your reference.

    alter system set db_flash_cache_file = '/oraTB/orahome/dbs/flash_file01.dbf' sid='bh1' scope=spfile;
    alter system set db_flash_cache_file = '/oraTB/orahome/dbs/flash_file02.dbf' sid='bh2' scope=spfile;
    alter system set db_flash_cache_file = '/oraTB/orahome/dbs/flash_file03.dbf' sid='bh3' scope=spfile;
    alter system set db_flash_cache_file = '/oraTB/orahome/dbs/flash_file04.dbf' sid='bh4' scope=spfile;

    alter system set db_flash_cache_size = 10240M sid='bh1' scope=spfile;
    alter system set db_flash_cache_size = 512M sid='bh2' scope=spfile;
    alter system set db_flash_cache_size = 20480M sid='bh3' scope=spfile;
    alter system set db_flash_cache_size = 1024M sid='bh4' scope=spfile;
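
Since these statements differ only in SID, file name, and size, they can be generated from a small table. The following Python sketch does that; the dictionary holds the values from my settings above, and the function name `flash_cache_statements` is only an illustration. It prints the SQL and does not connect to a database.

```python
from typing import Dict, List, Tuple

# SID -> (flash cache file, flash cache size), copied from the settings above.
FLASH_CACHE: Dict[str, Tuple[str, str]] = {
    "bh1": ("/oraTB/orahome/dbs/flash_file01.dbf", "10240M"),
    "bh2": ("/oraTB/orahome/dbs/flash_file02.dbf", "512M"),
    "bh3": ("/oraTB/orahome/dbs/flash_file03.dbf", "20480M"),
    "bh4": ("/oraTB/orahome/dbs/flash_file04.dbf", "1024M"),
}


def flash_cache_statements(settings: Dict[str, Tuple[str, str]]) -> List[str]:
    """Build the ALTER SYSTEM statements for every instance."""
    statements: List[str] = []
    for sid, (path, size) in settings.items():
        statements.append(
            f"alter system set db_flash_cache_file = '{path}' "
            f"sid='{sid}' scope=spfile;"
        )
        statements.append(
            f"alter system set db_flash_cache_size = {size} "
            f"sid='{sid}' scope=spfile;"
        )
    return statements


for statement in flash_cache_statements(FLASH_CACHE):
    print(statement)
```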