Archive

Posts Tagged ‘rac’

RAC Team Hiring (Beijing)

October 13th, 2009 ricky.zhu No comments

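RAC Team Hiring: 2 openings
One position is for ACFS and the other for virtualization. Candidates with Linux/Unix administration experience and virtualization experience are preferred.
Referrals and self-nominations are welcome.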

Email: winston.huang@oracle.com

1) (Senior)Member of Technical Staff-RAC
Product Development

* Beijing, China

KEY OBJECTIVE:

Responsible for implementing, maintaining, and enhancing test scripts, plans, and methodologies that ensure exhaustive testing of all assigned software areas, safeguarding software quality by exposing defects and verifying resolutions.

This is a golden opportunity to learn Oracle from within Oracle. This challenging position provides training that lays a foundation in RAC and database High Availability. Engineers will be exposed to the latest Oracle technologies on various platforms. That experience is extremely valuable for building a solid foundation in RAC and High Availability systems for future system architecture integration and consulting work.

SPECIFIC RESPONSIBILITIES:

1. Review new features, design basic functional test cases, and write automated test scripts.
2. Work closely with development teams to report, analyze, and verify bugs.
3. Run integration tests on different platforms, including AIX, Solaris, HPUX, and Windows.
4. Design destructive tests and long-running stress test cases and scripts.

KNOWLEDGE & SKILLS REQUIREMENTS:

Professional background:
The candidate must have either a minimum of 2 years in the enterprise software industry or a master's degree in Computer Science or a related field. Prior experience as a member of a Customer Support, Technical Consulting, Product Development, or QA team is preferred.

Technical Background:
1. Excellent scripting language skills (shell, Perl).
2. Must have ample experience on at least one of the following platforms: Solaris 64, IBM/AIX, HPUX.
3. Knowledge of cluster file systems is a plus.
4. Knowledge of storage is a plus.
5. Experience designing and executing destructive or performance tests is a plus.
6. Experience with software development lifecycle or software testing is a plus.

2)(Senior)Member of Technical Staff-RAC
Product Development

* Beijing, China

KEY OBJECTIVE:

Responsible for implementing, maintaining, and enhancing test scripts, plans, and methodologies that ensure the Oracle RAC product works well in virtualization environments.

This is a golden opportunity to learn Oracle from within Oracle. This challenging position provides training that lays a foundation in RAC and database High Availability. Engineers will be exposed to the latest Oracle technologies on various platforms. That experience is extremely valuable for building a solid foundation in RAC and High Availability systems for future system architecture integration and consulting work.

SPECIFIC RESPONSIBILITIES:

1. Review the certification test plan for each virtualization product.
2. Design test plans and test cases for the different virtualization products.
3. Design destructive tests and long-running stress test cases and scripts.
4. Work closely with development teams to report, analyze, and verify bugs.
5. Verify that the RAC product works well in different virtualization environments. The project will focus on the different virtualization products.
6. Review all test results.

KNOWLEDGE & SKILLS REQUIREMENTS:

Professional background:
The candidate must have either a minimum of 2 years in the enterprise software industry or a master's degree in Computer Science or a related field. Prior experience as a member of a Customer Support, Technical Consulting, Product Development, or QA team is preferred.

Technical Background:
1. Excellent scripting language skills (shell, Perl).
2. Must have ample experience on at least one of the following platforms: Linux, Solaris 64, IBM/AIX, HPUX, and Windows.
3. Must know at least one virtualization product well.
4. Knowledge of storage is a plus.
5. Knowledge of Oracle RAC is a plus.
6. Experience designing and executing destructive or performance tests is a plus.
7. Experience with product certification lifecycle is a strong plus.

Categories: Database Tags:

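Oracle Database 11gR2 Logs: CRS Logs

September 30th, 2009 ricky.zhu No comments

Earlier posts covered the background processes and resource management in 11gR2. Today's post gives an overview of 11gR2 logs and how to locate problems.

After several major releases and continuous improvement, from 9iR2, 10gR1, 10gR2 and 11gR1 to the recent 11gR2, Oracle Clusterware and RAC have become increasingly mature, and that includes more consistent logging. With so many resources and background processes in 11gR2, where do their logs live, and how do you usually go about locating a problem? This topic is covered in three parts:

First, CRS-related problems.

In 11.2, the CRS-related logs are fairly centralized: they are all under CRS_HOME/log/nodename.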

bash-2.05$ pwd
/u01/app/cluster/crs/log/node1
bash-2.05$ tree
|-------admin
|-------agent
|       |-------crsd
|       |       |-------ora_oc4j_type_crsusr
|       |       |       `-------ora_oc4j_type_crsusr.log
|       |       |-------oraagent_crsusr
|       |       |       |-------oraagent_crsusr.l01
|       |       |       |-------oraagent_crsusr.l02
|       |       |       |-------oraagent_crsusr.l03
|       |       |       |-------oraagent_crsusr.l04
|       |       |       |-------oraagent_crsusr.l05
|       |       |       |-------oraagent_crsusr.l06
|       |       |       |-------oraagent_crsusr.l07
|       |       |       |-------oraagent_crsusr.l08
|       |       |       |-------oraagent_crsusr.l09
|       |       |       |-------oraagent_crsusr.l10
|       |       |       |-------oraagent_crsusr.log
|       |       |       |-------oraagent_crsusr.pid
|       |       |       `-------oraagent_crsusrOUT.log
|       |       `-------orarootagent_root
|       |               |-------orarootagent_root.l01
|       |               |-------orarootagent_root.l02
|       |               |-------orarootagent_root.l03
|       |               |-------orarootagent_root.l04
|       |               |-------orarootagent_root.l05
|       |               |-------orarootagent_root.l06
|       |               |-------orarootagent_root.l07
|       |               |-------orarootagent_root.l08
|       |               |-------orarootagent_root.l09
|       |               |-------orarootagent_root.l10
|       |               |-------orarootagent_root.log
|       |               |-------orarootagent_root.pid
|       |               |-------orarootagent_root.trc
|       |               `-------orarootagent_rootOUT.log
|       `-------ohasd
|               |-------oraagent_crsusr
|               |       |-------oraagent_crsusr.l01
|               |       |-------oraagent_crsusr.l02
|               |       |-------oraagent_crsusr.l03
|               |       |-------oraagent_crsusr.l04
|               |       |-------oraagent_crsusr.l05
|               |       |-------oraagent_crsusr.l06
|               |       |-------oraagent_crsusr.log
|               |       |-------oraagent_crsusr.pid
|               |       `-------oraagent_crsusrOUT.log
|               |-------oracssdagent_root
|               |       `-------oracssdagent_root.log
|               |-------oracssdmonitor_root
|               |       `-------oracssdmonitor_root.log
|               `-------orarootagent_root
|                       |-------orarootagent_root.log
|                       |-------orarootagent_root.pid
|                       `-------orarootagent_rootOUT.log
|-------client
|       |-------clscfg.log
|       |-------crsctl.log
|       |-------crsctl.trc
|       |-------gpnp_6718.log
|       |-------gpnp_6718.trc
|       |-------gpnptool_6431.log
|       |-------gpnptool_6431.trc
|       |-------gpnptool_6437.log
|       |-------gpnptool_6437.trc
|       |-------oclskd.log
|       |-------ocrcheck_11679.log
|       |-------ocrcheck_11679.trc
|       |-------ocrcheck_12475.log
|       |-------ocrcheck_27081.log
|       |-------ocrcheck_27194.log
|       |-------ocrcheck_27231.log
|       |-------ocrcheck_27869.log
|       |-------ocrcheck_28042.log
|       |-------ocrcheck_28090.log
|       |-------ocrcheck_28253.log
|       |-------ocrcheck_28287.log
|       |-------ocrcheck_564.log
|       |-------ocrconfig_11772.log
|       |-------ocrconfig_12493.log
|       |-------ocrconfig_27045.log
|       |-------ocrconfig_6259.log
|       |-------ocrconfig_6807.log
|       |-------ocrconfig_7963.log
|       |-------ocrdump_11617.log
|       |-------ocrdump_11617.trc
|       |-------ocrdump_12442.log
|       |-------oifcfg.log
|       |-------oifcfg.trc
|       |-------oifcfg1.trc
|       |-------olsnodes.log
|       `-------olsnodes.trc
|-------crsd
|       |-------core
|       |-------crsd.l01
|       |-------crsd.l02
|       |-------crsd.l03
|       |-------crsd.l04
|       |-------crsd.l05
|       |-------crsd.log
|       |-------crsd.trc
|       `-------crsdOUT.log
|-------cssd
|       |-------cssdOUT.log
|       |-------ocssd.l01
|       |-------ocssd.log
|       `-------ocssd.trc
|-------ctssd
|       |-------octssd.l01
|       |-------octssd.l02
|       |-------octssd.log
|       `-------octssd.trc
|-------diskmon
|       |-------diskmon.log
|       `-------diskmonOUT.log
|-------evmd
|       |-------evmd.log
|       |-------evmd.trc
|       `-------evmdOUT.log
|-------gipcd
|       `-------gipcd.log
|-------gnsd
|       |-------gnsd.log
|       |-------gnsd.trc
|       `-------gnsdOUT.log
|-------gpnpd
|       |-------gpnpd.log
|       |-------gpnpd.trc
|       |-------gpnpdOUT.log
|       `-------sun880-1.pid
|-------mdnsd
|       `-------mdnsd.log
|-------ohasd
|       |-------ohasd.log
|       |-------ohasd.trc
|       `-------ohasdOUT.log
|-------racg
|       |-------racgeut
|       |-------racgevtf
|       |-------racgmain
|       `-------evtf.log
|-------srvm
|       |-------eonsOUT.log
|       |-------eonsd.trc.10
|       |-------eonsd.trc.11
|       |-------eonsd.trc.12
|       |-------eonsdOUT.log
|       |-------eonsd_0.log
|       |-------eonsd_0.log.1
|       |-------eonsd_0.log.1.lck
|       `-------eonsd_0.log.lck
`-------alertnode1.log
bash-2.05$

 

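The subdirectory that deserves special attention here is agent. It holds the logs for all resources managed by HAS, covering both crsd and ohasd resources. Resources that need root privileges are managed separately, which is why there are four subdirectories.

In 11.2, every operation on a resource is carried out indirectly through the agent framework (you will see the tag AGFW in the logs). To find the start/stop/check entries for a resource, take the relevant timestamp from the crsd log and look up the matching agent log.

For example, the following excerpt, grepped from the crsd oraagent log, shows a database being stopped: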

2009-09-30 03:05:30.923: [    AGFW][140] Executing command: stop for resource: ora.orcldb.db 1 1
2009-09-30 03:05:30.925: [ora.orcldb.db][140] [stop] clsn_agent::stop {
2009-09-30 03:05:30.925: [ora.orcldb.db][140] [stop] InstAgent::stop {
2009-09-30 03:05:30.926: [ora.orcldb.db][140] [stop] Agent::flagUsrOraOpiIsSet(false)
2009-09-30 03:05:30.926: [ora.orcldb.db][140] [stop] Agent::valueOfAttribIs attrib: REASON compare value: dependency attribute value: user
2009-09-30 03:05:30.927: [ora.orcldb.db][140] [stop] Agent::valueOfAttribIs returns 0
2009-09-30 03:05:30.928: [ora.orcldb.db][140] [stop] Gimh::check condition (GIMH_NEXT_NUM) 9 exists
2009-09-30 03:05:30.928: [ora.orcldb.db][140] [stop] InstAgent::stop  shutdown mode: 3
2009-09-30 03:05:30.928: [ora.orcldb.db][140] [stop] DbAgent::preStopCbk {
2009-09-30 03:05:30.929: [ USRTHRD][140] Thread:[RLB:orcldb] stop {
2009-09-30 03:05:30.929: [ USRTHRD][140] Thread:[RLB:orcldb] stop {
2009-09-30 03:05:31.304: [ USRTHRD][38] Thread:[RLB:orcldb] DbAgent::Rlb::run stopping
2009-09-30 03:05:31.346: [ USRTHRD][140] Thread:[RLB:orcldb] stop }
2009-09-30 03:05:31.346: [ USRTHRD][140] Thread:[RLB:orcldb] stop }
2009-09-30 03:05:31.346: [ USRTHRD][140] Thread:[RLB:orcldb] stop {
2009-09-30 03:05:31.346: [ USRTHRD][140] Thread:[RLB:orcldb] stop {
2009-09-30 03:05:31.346: [ USRTHRD][140] Thread:[RLB:orcldb] stop }
2009-09-30 03:05:31.346: [ USRTHRD][140] Thread:[RLB:orcldb] stop }
2009-09-30 03:05:31.346: [ USRTHRD][140] Thread:[RLB:orcldb] stop {
2009-09-30 03:05:31.346: [ USRTHRD][140] Thread:[RLB:orcldb] stop }
2009-09-30 03:05:31.352: [ USRTHRD][140] Thread:[EonsSub FAN] stop {
2009-09-30 03:05:32.010: [ USRTHRD][140] Thread:[EonsSub FAN] stop }
2009-09-30 03:05:32.010: [ USRTHRD][140] Thread:[EonsSub FAN] stop {
2009-09-30 03:05:32.010: [ USRTHRD][140] Thread:[EonsSub FAN] stop }
2009-09-30 03:05:32.012: [ora.orcldb.db][140] [stop] DbAgent::preStopCbk }
2009-09-30 03:05:32.014: [ora.orcldb.db][140] [stop] makeConnectStr = (DESCRIPTION=(ADDRESS=(PROTOCOL=beq)(PROGRAM=/u01/app/base/product/11g/bin/oracle)(ARGV0=oracleorcldb1)(ENVS='ORACLE_HOME=/u01/app/base/product/11g,ORACLE_SID=orcldb1')(ARGS='(DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))'))(CONNECT_DATA=(SID=orcldb1)))
2009-09-30 03:05:32.016: [ora.orcldb.db][140] [stop] InstConnection::connectInt: server not attached
2009-09-30 03:05:32.445: [ora.orcldb.db][140] [stop] connect successful
2009-09-30 03:05:38.319: [ora.orcldb.db][140] [stop] DbAgent::stopCbk {
2009-09-30 03:05:49.056: [ora.orcldb.db][140] [stop] DbAgent::stopCbk }
2009-09-30 03:06:00.456: [ora.orcldb.db][140] [stop] InstAgent::stop: }
2009-09-30 03:06:00.457: [ora.orcldb.db][140] [stop] clsn_agent::stop }
2009-09-30 03:06:00.457: [    AGFW][140] Command: stop for resource: ora.orcldb.db 1 1 completed with status: SUCCESS
bash-2.05$
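
To see how long an agent command took, the two AGFW lines that bracket it can be paired up. The following is a minimal Python sketch, assuming the line layout shown in the excerpt above; the function name `agfw_durations` and the regular expression are illustrative and not part of any Oracle tool.

```python
import re
from datetime import datetime

# Matches agent framework lines such as:
# 2009-09-30 03:05:30.923: [    AGFW][140] Executing command: stop for resource: ora.orcldb.db 1 1
# (layout assumed from the excerpt above)
AGFW_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): \[\s*AGFW\]\[\d+\] "
    r"(?:Executing command|Command): (?P<cmd>\w+) for resource: (?P<res>\S+)"
    r"(?P<rest>.*)$"
)

def agfw_durations(lines):
    """Return (command, resource, seconds, status) for each completed AGFW command."""
    started = {}
    results = []
    for line in lines:
        match = AGFW_LINE.match(line.strip())
        if not match:
            continue  # not an AGFW start/completion line
        ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
        key = (match.group("cmd"), match.group("res"))
        done = re.search(r"completed with status: (\w+)", match.group("rest"))
        if done is None:
            started[key] = ts  # "Executing command" line
        elif key in started:
            elapsed = (ts - started.pop(key)).total_seconds()
            results.append((key[0], key[1], elapsed, done.group(1)))
    return results

sample = [
    "2009-09-30 03:05:30.923: [    AGFW][140] Executing command: stop for resource: ora.orcldb.db 1 1",
    "2009-09-30 03:05:38.319: [ora.orcldb.db][140] [stop] DbAgent::stopCbk {",
    "2009-09-30 03:06:00.457: [    AGFW][140] Command: stop for resource: ora.orcldb.db 1 1 completed with status: SUCCESS",
]
for cmd, res, secs, status in agfw_durations(sample):
    print(f"{cmd} {res}: {secs:.3f}s {status}")
```

For the excerpt above, the stop of ora.orcldb.db runs from 03:05:30.923 to 03:06:00.457, roughly 29.5 seconds.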

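The daemons newly introduced in 11.2 (mdnsd, gipcd, gpnpd, gnsd, ctssd and so on) each log to their own directory. The remaining logs are much the same as in earlier releases, so I won't cover them in detail.

To see the complete startup sequence of the ohasd stack, look at ohasd.log. It is the entry point for all the logs and records how each agent is started.

Categories: Database Tags:

Oracle Database 11gR2 Clusterware: Resources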

September 15th, 2009 ricky.zhu No comments

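The previous post briefly introduced the background processes new in 11.2. Today's post covers the resources in 11.2 CRS.

Compared with 11.1 and 10g, 11.2 adds quite a few resources. To begin with, 11.2 divides resources into two classes: HAS resources and CRS resources. CRS itself is a HAS resource. The nodeapps described for 11.1 (vip, ons and gsd), along with listener, asm, rdbms and so on, are all CRS resources in 11.2. So which resources does the newly introduced HAS layer contain?

First, a quick look at the commands for listing ohasd and crsd resources: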

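$ crsctl stat res -init -t
--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------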
ora.asm
      1        ONLINE  ONLINE       staig07                  Started            
ora.crsd
      1        ONLINE  ONLINE       staig07                                      
ora.cssd
      1        ONLINE  ONLINE       staig07                                      
ora.cssdmonitor
      1        ONLINE  ONLINE       staig07                                      
ora.ctssd
      1        ONLINE  ONLINE       staig07                  ACTIVE:0            
ora.diskmon
      1        ONLINE  ONLINE       staig07                                      
ora.drivers.acfs
      1        ONLINE  ONLINE       staig07                                      
ora.evmd
      1        ONLINE  ONLINE       staig07                                      
ora.gipcd
      1        ONLINE  ONLINE       staig07                                      
ora.gpnpd
      1        ONLINE  ONLINE       staig07                                      
ora.mdnsd
      1        ONLINE  ONLINE       staig07                                      
[crsusr@node1 sshsetup]$
 

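A brief note on a few of these:
ctssd is the time synchronization resource and process added in 11.2; it is an enhancement over the NTP service.
acfs is the resource for the much-discussed ASM file system (ACFS).
gpnpd is the resource for Grid Plug and Play, the most important feature in 11.2.
mdnsd is the resource associated with SCAN and GNS.

Compared with ohasd, crsd manages far more resources. Here is a quick look:

$ crsctl stat res -t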
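--------------------------------------------------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------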
ora.DATA1.dg
               ONLINE  ONLINE       staig07                                      
               ONLINE  ONLINE       staig10                                      
               ONLINE  ONLINE       staig12                                      
               ONLINE  ONLINE       staig13                                      
ora.DATA2.dg
               ONLINE  ONLINE       staig07                                      
               ONLINE  ONLINE       staig10                                      
               ONLINE  ONLINE       staig12                                      
               ONLINE  ONLINE       staig13                                      
ora.LISTENER.lsnr
               ONLINE  ONLINE       staig07                                      
               ONLINE  ONLINE       staig10                                      
               ONLINE  ONLINE       staig12                                      
               ONLINE  ONLINE       staig13                                      
ora.asm
               ONLINE  ONLINE       staig07                  Started            
               ONLINE  ONLINE       staig10                  Started            
               ONLINE  ONLINE       staig12                  Started            
               ONLINE  ONLINE       staig13                  Started            
ora.eons
               ONLINE  ONLINE       staig07                                      
               ONLINE  ONLINE       staig10                                      
               ONLINE  ONLINE       staig12                                      
               ONLINE  ONLINE       staig13                                      
ora.gsd
               OFFLINE OFFLINE      staig07                                      
               OFFLINE OFFLINE      staig10                                      
               OFFLINE OFFLINE      staig12                                      
               OFFLINE OFFLINE      staig13                                      
ora.net1.network
               ONLINE  ONLINE       staig07                                      
               ONLINE  ONLINE       staig10                                      
               ONLINE  ONLINE       staig12                                      
               ONLINE  ONLINE       staig13                                      
ora.ons
               ONLINE  ONLINE       staig07                                      
               ONLINE  ONLINE       staig10                                      
               ONLINE  ONLINE       staig12                                      
               ONLINE  ONLINE       staig13                                      
ora.registry.acfs
               ONLINE  ONLINE       staig07                                      
               ONLINE  ONLINE       staig10                                      
               ONLINE  ONLINE       staig12                                      
               ONLINE  ONLINE       staig13                                      
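--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------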
ora.LISTENER_SCAN1.lsnr
      1        ONLINE  ONLINE       staig10                                      
ora.LISTENER_SCAN2.lsnr
      1        ONLINE  ONLINE       staig12                                      
ora.LISTENER_SCAN3.lsnr
      1        ONLINE  ONLINE       staig07                                      
ora.oc4j
      1        ONLINE  ONLINE       staig13                                      
ora.orcldb.db
      1        ONLINE  ONLINE       staig07                  Open                
      2        ONLINE  ONLINE       staig10                  Open                
      3        ONLINE  ONLINE       staig12                  Open                
      4        ONLINE  ONLINE       staig13                  Open                
ora.scan1.vip
      1        ONLINE  ONLINE       staig10                                      
ora.scan2.vip
      1        ONLINE  ONLINE       staig12                                      
ora.scan3.vip
      1        ONLINE  ONLINE       staig07                                      
ora.staig07.vip
      1        ONLINE  ONLINE       staig07                                      
ora.staig10.vip
      1        ONLINE  ONLINE       staig10                                      
ora.staig12.vip
      1        ONLINE  ONLINE       staig12                                      
ora.staig13.vip
      1        ONLINE  ONLINE       staig13                                      
[crsusr@staig07 sshsetup]$
 

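CRSD resources have also been reclassified in 11.2. Some are cluster-wide, such as SCAN, the SCAN listeners, and the VIPs. Others are local (node-wide), such as network (note that 11.2 treats the network as a resource too), eons, asm, and diskgroup (11.2 also treats an ASM disk group as a resource).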
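
The local versus cluster-wide split can be read straight out of the `crsctl stat res -t` listing. Here is a rough Python sketch that groups resources by the section they appear under, assuming the tabular layout shown above; `parse_stat_res` is a made-up helper name, and the sample is a shortened copy of the listing.

```python
def parse_stat_res(text):
    """Group 'crsctl stat res -t' output by resource name.

    Assumes resource names start in column 0 and their state rows are indented,
    as in the listing above.
    """
    resources = {}
    section = None
    current = None
    for raw in text.splitlines():
        line = raw.rstrip()
        # Skip blanks, dashed separators, and the column header.
        if not line or set(line) == {"-"} or line.startswith("NAME "):
            continue
        if line in ("Local Resources", "Cluster Resources"):
            section = line
        elif not line[0].isspace():
            current = line
            resources[current] = {"section": section, "rows": []}
        elif current is not None:
            resources[current]["rows"].append(line.split())
    return resources

sample = """\
----------------------------------------
NAME           TARGET  STATE        SERVER                   STATE_DETAILS
----------------------------------------
Local Resources
----------------------------------------
ora.net1.network
               ONLINE  ONLINE       staig07
               ONLINE  ONLINE       staig10
----------------------------------------
Cluster Resources
----------------------------------------
ora.scan1.vip
      1        ONLINE  ONLINE       staig10
ora.orcldb.db
      1        ONLINE  ONLINE       staig07                  Open
"""

for name, info in parse_stat_res(sample).items():
    print(f"{info['section']}: {name} ({len(info['rows'])} row(s))")
```

Shell prompts or other stray lines in captured output would be mistaken for resource names, so the input should be trimmed to the table itself.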

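These resources also depend on one another. For example, a DG resource depends on ASM, and a VIP depends on the network. You can see this in a resource's detailed attributes, which the following command lists:

$ crsctl stat res ora.DATA2.dg -p
NAME=ora.DATA2.dg
TYPE=ora.diskgroup.type
ACL=owner:crsusr:rwx,pgrp:oinstall:rwx,other::r--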
ACTION_FAILURE_TEMPLATE=
ACTION_SCRIPT=
AGENT_FILENAME=%CRS_HOME%/bin/oraagent%CRS_EXE_SUFFIX%
ALIAS_NAME=
AUTO_START=never
CHECK_INTERVAL=300
CHECK_TIMEOUT=600
DEFAULT_TEMPLATE=
DEGREE=1
DESCRIPTION=CRS resource type definition for ASM disk group resource
ENABLED=1
LOAD=1
LOGGING_LEVEL=1
NLS_LANG=
NOT_RESTARTING_TEMPLATE=
OFFLINE_CHECK_INTERVAL=0
PROFILE_CHANGE_TEMPLATE=
RESTART_ATTEMPTS=5
SCRIPT_TIMEOUT=60
START_DEPENDENCIES=hard(ora.asm) pullup(ora.asm)
START_TIMEOUT=900
STATE_CHANGE_TEMPLATE=
STOP_DEPENDENCIES=hard(intermediate:ora.asm)
STOP_TIMEOUT=180
UPTIME_THRESHOLD=1d
USR_ORA_ENV=
USR_ORA_OPI=false
USR_ORA_STOP_MODE=
VERSION=11.2.0.1.0

[crsusr@staig07 sshsetup]$
 

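Here you can see that the DATA2 disk group has a hard dependency on asm as well as a pullup relationship. Start and stop dependencies and resource attributes will be covered in a later post.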
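
The dependency attributes in the profile above are simple enough to pull apart mechanically. The sketch below is illustrative only: it assumes the `NAME=VALUE` layout printed by `crsctl stat res ... -p`, assumes multiple targets inside one pair of parentheses are comma-separated, and does not handle nested parentheses. The function names are hypothetical.

```python
import re

def parse_profile(text):
    """Parse NAME=VALUE lines into a dict."""
    attrs = {}
    for line in text.splitlines():
        if "=" in line:
            name, _, value = line.partition("=")
            attrs[name.strip()] = value.strip()
    return attrs

def parse_dependencies(value):
    """Turn 'hard(ora.asm) pullup(ora.asm)' into {'hard': [...], 'pullup': [...]}."""
    deps = {}
    for kind, body in re.findall(r"(\w+)\(([^)]*)\)", value):
        targets = [t.strip() for t in body.split(",") if t.strip()]
        deps.setdefault(kind, []).extend(targets)
    return deps

sample = """\
NAME=ora.DATA2.dg
TYPE=ora.diskgroup.type
START_DEPENDENCIES=hard(ora.asm) pullup(ora.asm)
STOP_DEPENDENCIES=hard(intermediate:ora.asm)
"""

attrs = parse_profile(sample)
print(parse_dependencies(attrs["START_DEPENDENCIES"]))
print(parse_dependencies(attrs["STOP_DEPENDENCIES"]))
```

Modifiers such as `intermediate:` are kept as part of the target string rather than interpreted.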

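Categories: Database Tags:

Oracle Database 11gR2 Clusterware: Background Processes

September 9th, 2009 ricky.zhu No comments

Oracle Database 11gR2 has been out for a week now, and I expect many of you have already tried it. This release introduces a number of new features, and Clusterware is now delivered, together with ASM, as Grid Infrastructure. Compared with 10gR2 and 11gR1, the Clusterware architecture has also been substantially improved and changed, including the introduction of GPnP (Grid Plug and Play) and SCAN (Single Client Access Name). This post takes a brief look at the Clusterware processes.

The environment is OEL5. First, a look at what is running: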

[~]$ ps -cafe |grep d.bin  | grep -v grep
root     27807     1 TS   21 Aug27 ?        00:08:18 /u01/app/cluster/crs/bin/ohasd.bin reboot
crsusr   30643     1 TS   24 Aug27 ?        03:41:53 /u01/app/cluster/crs/bin/mdnsd.bin
crsusr   30654     1 TS   21 Aug27 ?        00:00:28 /u01/app/cluster/crs/bin/gipcd.bin
crsusr   30667     1 TS   24 Aug27 ?        00:01:05 /u01/app/cluster/crs/bin/gpnpd.bin
crsusr   30718     1 RR  139 Aug27 ?        01:16:51 /u01/app/cluster/crs/bin/ocssd.bin
root     30950     1 TS   21 Aug27 ?        00:04:32 /u01/app/cluster/crs/bin/octssd.bin
crsusr   31065     1 TS   24 Aug27 ?        00:00:24 /u01/app/cluster/crs/bin/oclskd.bin
root     31084     1 TS   21 Aug27 ?        00:08:22 /u01/app/cluster/crs/bin/crsd.bin reboot
crsusr   31104     1 TS   24 Aug27 ?        00:01:30 /u01/app/cluster/crs/bin/evmd.bin
root     31128     1 TS   24 Aug27 ?        00:00:23 /u01/app/cluster/crs/bin/oclskd.bin
[~]$
 

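As you can see, there are quite a few more processes than in earlier releases.

Those familiar with 10g and 11gR1 will notice that in 11gR2 the familiar crsd.bin, ocssd.bin and evmd.bin are still there, but a new ohasd.bin has been added. It is the new entry point, and /etc/inittab now has a single entry instead of the previous three: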

h1:35:respawn:/etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null
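
As a small illustration, the entry above follows the standard sysvinit `id:runlevels:action:process` layout and can be split into its four fields. This Python sketch uses a hypothetical `InittabEntry` type purely for readability.

```python
from typing import NamedTuple

class InittabEntry(NamedTuple):
    id: str
    runlevels: str
    action: str
    process: str

def parse_inittab_line(line):
    """Split one inittab entry; only the first three colons are field separators."""
    ident, runlevels, action, process = line.split(":", 3)
    return InittabEntry(ident, runlevels, action, process)

entry = parse_inittab_line(
    "h1:35:respawn:/etc/init.d/init.ohasd run >/dev/null 2>&1 </dev/null"
)
print(entry.id, entry.runlevels, entry.action)
print(entry.process)
```

Here the entry is active in runlevels 3 and 5, and `respawn` tells init to restart init.ohasd whenever it exits.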

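Other important new processes include gpnpd.bin, mdnsd.bin and gipcd.bin, plus octssd.bin (ctssd), which handles time synchronization, and two oclskd.bin processes.

All of these processes are spawned by ohasd.bin and then reparented so that their parent PID becomes 1. Attentive readers may have noticed that ocssd.bin is still a real-time process (scheduling class RR in the listing above), just as in earlier versions. This shows that ocssd.bin is still a fatal process: if it dies, the node reboots, so keeping it healthy is critical.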
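
The real-time processes can be picked out of the `ps -cafe` output by looking at the CLS column. A minimal Python sketch follows, assuming the column order shown in the listings in this post (UID PID PPID CLS PRI STIME TTY TIME CMD); the helper name `realtime_processes` is illustrative.

```python
# RR (SCHED_RR) and FF (SCHED_FIFO) are the real-time classes printed by Linux ps.
REALTIME_CLASSES = {"RR", "FF"}

def realtime_processes(ps_output):
    """Return (pid, class, command) for real-time processes in 'ps -cafe' output."""
    found = []
    for line in ps_output.splitlines():
        fields = line.split(None, 8)
        # Skip headers and anything that does not have all nine columns.
        if len(fields) < 9 or not fields[1].isdigit():
            continue
        if fields[3] in REALTIME_CLASSES:
            found.append((int(fields[1]), fields[3], fields[8]))
    return found

sample = """\
root     27807     1 TS   21 Aug27 ?        00:08:18 /u01/app/cluster/crs/bin/ohasd.bin reboot
crsusr   30718     1 RR  139 Aug27 ?        01:16:51 /u01/app/cluster/crs/bin/ocssd.bin
root     30682     1 RR  139 Aug27 ?        00:04:38 /u01/app/cluster/crs/bin/cssdmonitor
"""

for pid, cls, cmd in realtime_processes(sample):
    print(pid, cls, cmd)
```

The parser relies on STIME being a single token such as `Aug27`; a start time printed with an embedded space would shift the columns.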

Besides these d.bin processes, there are some other new processes as well:

[ ~]$ ps -cafe |grep "u01" | grep -v grep | grep -v d.bin
crsusr   26134     1 TS   23 Aug27 ?        00:01:30 /u01/app/cluster/crs/bin/tnslsnr LISTENER -inherit
root     29493     1 TS   21 Aug27 ?        00:07:55 /u01/app/cluster/crs/bin/orarootagent.bin
crsusr   30627     1 TS   21 Aug27 ?        00:51:52 /u01/app/cluster/crs/bin/oraagent.bin
root     30682     1 RR  139 Aug27 ?        00:04:38 /u01/app/cluster/crs/bin/cssdmonitor
root     30697     1 RR  139 Aug27 ?        00:04:45 /u01/app/cluster/crs/bin/cssdagent
crsusr   30699     1 TS   21 Aug27 ?        00:00:29 /u01/app/cluster/crs/bin/diskmon.bin -d -f
crsusr   31200 31104 TS   24 Aug27 ?        00:00:00 /u01/app/cluster/crs/bin/evmlogger.bin -o /u01/app/cluster/crs/evm/log/evmlogger.info -l /u01/app/cluster/crs/evm/log/evmlogger.log
crsusr   31446     1 TS   21 Aug27 ?        01:10:59 /u01/app/cluster/crs/bin/oraagent.bin
root     31654     1 TS   21 Aug27 ?        02:22:44 /u01/app/cluster/crs/bin/orarootagent.bin
crsusr   31689     1 TS   21 Aug27 ?        00:00:00 /u01/app/cluster/crs/opmn/bin/ons -d
crsusr   31690 31689 TS   22 Aug27 ?        00:00:20 /u01/app/cluster/crs/opmn/bin/ons -d
 

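Here you can see several agent processes, such as oraagent, orarootagent and cssdagent. Each agent is responsible for its own resources and runs the start/stop/check/clean actions for them, much like the action scripts in earlier releases. cssdagent and cssdmonitor are the ones responsible for the ocssd.bin process mentioned above.

For more information, see the Oracle Database 11gR2 documentation.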

Categories: Database Tags:

Troubleshoot CRS 10.2.0.4 on EL5 (1)

August 4th, 2009 ricky.zhu 1 comment

Yesterday I lost most of the day upgrading 10.2.0.1 to 10.2.0.4. The platform is RHEL5; kernel information:
Linux xxx 2.6.18-8.0.0.4.1.el5 #1 SMP Tue Jun 5 23:09:11 EDT 2007 x86_64 x86_64 x86_64 GNU/Linux

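Since many people probably need to upgrade from 10201 to 10204 or a later patchset, here is a summary of the problems I ran into yesterday and how I troubleshot them:

Install 10201 Clusterware in /u01/app/cluster/crs,
then the 10201 RAC software in /u01/app/base/product/11g.

Upgrade the cluster to 10204 first. At the end, the installer prompts you to run the following on each node, in order: crsctl stop crs; $CRS_HOME/install/root102.sh

This is where the problem lies. Sometimes root102.sh fails on some nodes for one reason or another, most commonly a timeout. At that point cssd is up but crsd cannot start normally. The state is then as follows:
On nodes where root102.sh has already run, the upgrade is complete: both the CRS software version (crsctl query crs softwareversion) and the active version (crsctl query crs activeversion) are already 10204. On the other nodes, the software version is 10204 but the active version is still 10201.

After a failure, you cannot simply run root102.sh again on the failed node. You first need to restore the node to the state it was in before root102.sh ran, by making the following changes:
1) On a node where root102.sh has not yet run, archive the $CRS_HOME/install/patch102 directory and place it under $CRS_HOME/install on the failed node. (Once root102.sh has run, this directory disappears. To be safe, back it up before running root102.sh on the last node; otherwise you will have nowhere to get it from.)
2) Change the ownership of this directory to :oinstall. Use chown -Rf so that the subdirectories are changed as well.
3) Rename or delete the prepatch10204 directory, make.log and files10204.log generated under CRS_HOME/install. These are all intermediate products of the root102.sh run.
4) Restore the ownership of the files under $CRS_HOME/install to the previous user and group.
5) Kill all Clusterware-related processes. You can list them with ps -ef | grep -e d.bin.
6) Run root102.sh again.

If root102.sh still fails after these steps, stop the CRS stack on the remaining nodes with crsctl stop crs, then repeat the steps above and run it again until it succeeds.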
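
Before re-running root102.sh, it can help to check that the install directory really is back in its pre-run state. The following read-only Python sketch checks only the items named in steps 1 and 3 (the patch102 directory and the three leftovers); it does not check ownership or running processes, and `rerun_blockers` is a hypothetical helper, not an Oracle utility.

```python
import os
import tempfile

# Intermediate products of a previous root102.sh run (step 3 above).
LEFTOVERS = ("prepatch10204", "make.log", "files10204.log")

def rerun_blockers(install_dir):
    """List what still stands in the way of re-running root102.sh in install_dir."""
    problems = []
    if not os.path.isdir(os.path.join(install_dir, "patch102")):
        problems.append("patch102 directory is missing")
    for name in LEFTOVERS:
        if os.path.exists(os.path.join(install_dir, name)):
            problems.append(f"leftover from previous run: {name}")
    return problems

# Demonstration against a throwaway directory that mimics a failed node.
with tempfile.TemporaryDirectory() as demo:
    os.mkdir(os.path.join(demo, "prepatch10204"))
    open(os.path.join(demo, "make.log"), "w").close()
    for problem in rerun_blockers(demo):
        print(problem)
```

On a real node the argument would be the $CRS_HOME/install path, for example /u01/app/cluster/crs/install in this environment.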

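A patchset cannot be rolled back, so if root102.sh never succeeds, the last resort is to uninstall the existing 10201+10204 installation (both CRS and RAC) and install again. That is the outcome we least want, so try the procedure above a few more times; it should succeed. If you are unlucky and a node reboots in the middle of this, don't worry: once the node is back, follow the same procedure. In the end, the upgrade succeeded: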

[root@xxxxxx install]# pwd
/u01/app/cluster/crs/install
[root@xxxxxx install]# ls -lrt
total 260
-rwxr-xr-x  1 ractest oinstall     0 Feb 23  2005 install.incl
-rwxr-xr-x  1 ractest oinstall    38 Apr 20  2005 install.excl
-rw-rw-r--  1 ractest oinstall  2808 Jul 14  2005 templocal
-rwxr-xr-x  1 ractest oinstall  4408 Apr 20  2006 rootaddnode.sbs
-rwxr-xr-x  1 ractest oinstall  1119 Oct 10  2007 cmdllroot.sh
-rw-rw----  1 ractest oinstall   651 Aug  3 06:09 paramfile.crs
-rw-rw----  1 ractest oinstall    42 Aug  3 06:10 cluster.ini
-rw-rw----  1 ractest oinstall   179 Aug  3 06:10 envVars.properties
-rwxr-xr-x  1 ractest oinstall 17916 Aug  3 06:59 rootupgrade
-rwxr-xr-x  1 ractest oinstall  3642 Aug  3 06:59 rootinstall
-rwxr-xr-x  1 ractest oinstall 12842 Aug  3 06:59 rootdelete.sh
-rwxr-xr-x  1 ractest oinstall  3963 Aug  3 06:59 rootdeletenode.sh
-rwxr-xr-x  1 ractest oinstall  8261 Aug  3 06:59 rootdeinstall.sh
-rwxr-xr-x  1 ractest oinstall 32954 Aug  3 06:59 rootconfig
-rwxr-xr-x  1 ractest oinstall 24798 Aug  3 06:59 root102.sh
-rwxr-xr-x  1 ractest oinstall  5668 Aug  3 06:59 preupdate.sh
-rw-rw-r--  1 ractest oinstall 10019 Aug  3 06:59 rootlocaladd
drwxr-xr-x 32 ractest oinstall  4096 Aug  3 08:54 prepatch10204
-rw-r--r--  1 root    root     67039 Aug  3 08:54 files10204.log
-rw-rw----  1 ractest oinstall  8025 Aug  3 08:55 make.log
drwxrwx---  2 ractest oinstall  4096 Aug  4 01:47 checkpoints
[root@xxxxxx install]#

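Categories: Database Tags:

How to Install Oracle RAC 10g on RHEL5

May 26th, 2009 ricky.zhu 6 comments

A while ago I ran into quite a few problems installing Oracle Clusterware 10.2.0.1 on RHEL5 and OEL5. The main reason is that RHEL5 had not been released when Oracle RAC 10.2.0.1 shipped; Red Hat was still at RHEL4. The same problems occur on SUSE Linux SLES10.

The problems show up at the very start of the installation and when root.sh runs on the last node. There are three main issues: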

Issue#1: To install 10gR2, you must first install the base release, which is 10.2.0.1. As these OS versions are newer, you should use the following command to invoke the installer:

$ runInstaller -ignoreSysPrereqs   # bypasses the OS check

Issue#2: At end of root.sh on the last node vipca will fail to run with the following error:


Oracle CRS stack installed and running under init(1M)
Running vipca(silent) for configuring nodeapps
/home/oracle/crs/oracle/product/10/crs/jdk/jre//bin/java: error while loading
shared libraries: libpthread.so.0: cannot open shared object file:
No such file or directory

Issue#3: After working around Issue#2 above, vipca will fail to run with the following error if the VIP IPs are in a non-routable range [10.x.x.x, 172.(16-31).x.x or 192.168.x.x]:

# vipca
Error 0(Native: listNetInterfaces:[3])
[Error 0(Native: listNetInterfaces:[3])]
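
To tell in advance whether a planned VIP falls into one of the three non-routable ranges listed under Issue#3, Python's standard `ipaddress` module is enough. This is a small sketch; the function name `is_non_routable` and the sample addresses are illustrative.

```python
import ipaddress

# The three ranges named in Issue#3: 10.x.x.x, 172.(16-31).x.x and 192.168.x.x.
NON_ROUTABLE = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def is_non_routable(ip):
    """True if ip lies in one of the private ranges that trip up vipca."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in NON_ROUTABLE)

for candidate in ("192.168.1.10", "10.10.10.1", "172.32.0.1"):
    print(candidate, is_non_routable(candidate))
```

Note that 172.16.0.0/12 covers exactly 172.16.x.x through 172.31.x.x, so 172.32.0.1 is outside it.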

The cause is as follows:
These releases of the Linux kernel fix an old bug in Linux threading that Oracle had worked around with LD_ASSUME_KERNEL settings in both vipca and srvctl. That workaround is no longer valid on OEL5, RHEL5 or SLES10, hence the failures.

Issue 1 is easy to resolve: just have runInstaller skip the checks.
The fix for issue 2 is:

To work around Issue#2 above, edit vipca (in the CRS bin directory on all nodes) to undo the setting of LD_ASSUME_KERNEL. After the IF statement around line 120, add an unset command to ensure LD_ASSUME_KERNEL is not set, as follows:


if [ "$arch" = "i686" -o "$arch" = "ia64" -o "$arch" = "x86_64" ]
then
  LD_ASSUME_KERNEL=2.4.19
  export LD_ASSUME_KERNEL
fi
unset LD_ASSUME_KERNEL   # <<<== line to be added

The fix for issue 3:

To work around Issue#3 (vipca failing on non-routable VIP IP ranges, whether run manually or during root.sh): if you still have the OUI window open, click OK and it will create the "oifcfg" information. cluvfy will then fail because vipca did not complete successfully; skip ahead in this note, run vipca manually, then return to the installer, and cluvfy will succeed. Otherwise, you can configure the interfaces for RAC manually using the oifcfg command as root, as in the following example (from any node):

/bin # ./oifcfg setif -global eth0/192.168.1.0:public
/bin # ./oifcfg setif -global eth1/10.10.10.0:cluster_interconnect
/bin # ./oifcfg getif
eth0 192.168.1.0 global public
eth1 10.10.10.0 global cluster_interconnect
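
A quick way to confirm that both interface roles are registered is to parse the `oifcfg getif` output. The Python sketch below assumes the four-column layout shown above (interface, subnet, scope, role); `parse_getif` is a hypothetical helper name.

```python
def parse_getif(output):
    """Map interface name to (subnet, scope, role) from 'oifcfg getif' output."""
    interfaces = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) == 4:  # ignore prompts, blanks and anything unexpected
            name, subnet, scope, role = fields
            interfaces[name] = (subnet, scope, role)
    return interfaces

sample = """\
eth0 192.168.1.0 global public
eth1 10.10.10.0 global cluster_interconnect
"""

configured = parse_getif(sample)
roles = {role for _, _, role in configured.values()}
print(configured)
print("both roles present:", {"public", "cluster_interconnect"} <= roles)
```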

Then run vipca manually to add the nodeapps resources.

The details are documented in Oracle note 414163.1.

Categories: Database Tags:

Oracle CRS/RAC Utilities-Deinstall tool

March 11th, 2009 ricky.zhu 1 comment

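Today's post introduces another Clusterware/RAC tool: Deinstall.
Download link


The deinstallation tool, also called the clusterdeconfig tool, completely removes Clusterware and RAC. Whether the previous installation succeeded or failed, the tool can clean it out entirely. It removes not only the CRS and RAC software from every node in the cluster but also shared files, data files, and the OCR. On Windows, it even removes the related registry entries. It is a must-have for every DBA.

That said, a tool this good probably won't get much use in production environments.
The official documentation describes the deinstallation tool as follows: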

The clusterdeconfig tool removes and deconfigures all of the software and shared files that are associated with an Oracle Clusterware or Oracle RAC Database installation. The clusterdeconfig tool removes the software and shared files from all of the nodes in a cluster.

Use the clusterdeconfig tool to prepare a cluster to reinstall Oracle Clusterware and Oracle Database software after a successful or failed installation. The tool removes software, clusterware and database files, and the global configuration across all of the nodes in a cluster environment that could hinder a subsequent installation. On Windows-based systems, the tool removes Windows Registry entries. The clusterdeconfig tool also removes Oracle Clusterware that was installed to support Oracle RAC or to provide failover capabilities for third-party software.

The clusterdeconfig tool restores your cluster to its state prior to the installation, enabling you to perform a new installation. You can also use Oracle Cluster Verification Utility (CVU) to determine the cause of any problems that may have occurred during an installation so that you can correct the errors.


The clusterdeconfig tool will not remove third-party software that depends on Oracle Clusterware. In addition, the clusterdeconfig tool does not warn you about third-party software dependencies on Oracle Clusterware or Oracle Database homes prior to removing the respective homes.

Categories: Database Tags:

Oracle CRS/RAC Utility-OSTool

March 10th, 2009 ricky.zhu 1 comment

Oracle provides a standalone tool named IPD-OSTool:
Oracle Instantaneous Problem Detection - OS Tool (IPD/OS)

This tool is designed to detect and analyze operating system (OS) and cluster resource related degradation and failures, in order to bring more explanatory power to many issues, such as node eviction, that occur in clusters where Oracle Clusterware and Oracle RAC are running. It continuously tracks OS resource consumption at the node, process, and device level. It collects and analyzes the cluster-wide data. In real-time mode, when thresholds are hit, an alert is shown to the operator. For root cause analysis, historical data can be replayed to understand what was happening at the time of failure.

At present the tool is available for download only for 32-bit and 64-bit Linux. It must be installed on every node in the cluster. It offers both a command-line interface (CLI) and a graphical interface (GUI). The GUI may not be as flashy as the Spotlight tool that eygle has written about, but it is Oracle's own product and, most importantly, it can be downloaded free from OTN. Download link


User manual

Update, 2009-4-10: IPD/OS is now available for Windows, in both 32-bit and 64-bit versions.

Download link

Categories: Database Tags: