Archive

Posts Tagged ‘Solaris’

11.2.0.1.0 on Solaris.Sparc64 and Solaris.X64 released

November 10th, 2009 ricky.zhu No comments

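Oracle Database 11.2.0.1.0 for Solaris has been released and can now be downloaded from OTN. A little over two months have passed since the Linux release, and shipping the Solaris.Sparc platform first this time shows how much importance Oracle attaches to Solaris.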

In fact, the Oracle Database Machine V2 that Larry announced at OOW2009 last month is also built on Sun hardware.

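11.2.0.1.0 is now available for Linux X86, Linux X64 and Solaris.Sparc64. The Solaris.Sparc64 download is about 2.3 GB.

Download link

2009-11-26: Oracle Database 11gR2 (11.2.0.1.0) for Solaris.X64 released. This is another important release after 11gR1 (11.1.0.6.0). 11gR1 did not support the Solaris.X64 platform, whereas 11gR2 now ships for Solaris.X64 ahead of mainstream platforms such as AIX and HPI, which again shows how much importance Oracle attaches to Solaris. Download link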

Categories: Database Tags:

Solaris Run States Introduction

October 27th, 2008 ricky.zhu No comments

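A Solaris run level represents the operating state of the system. Which services and processes run at each level is determined by the scripts under the /etc/rc#.d directories. For example, in a RAC environment with SunCluster, upgrading Oracle UDLM (ORCLudlm) requires entering single-user mode first, removing the old ORCLudlm, and then installing the new version. That is when boot -s is needed.
The default run levels are listed below: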

* 0: The system is at the PROM monitor (ok>) or security monitor (>) prompt. It is safe to shut down the system when it is at this init state.
* 1, s or S: This state is known as “single-user” or “system administrator” mode. Root is the only user on the system, and only basic kernel functions are enabled. A limited number of filesystems (usually only root and /usr) are mounted. This init state is often used for sensitive functions (such as kernel libc patches) or while troubleshooting a problem that is keeping the system from booting into multiuser mode.
* 2: Multiple users can log in. Most system services (except for NFS server and printer resource sharing) are enabled.
* 3: Normal operating state. NFS and printer sharing are enabled, where appropriate.
* 4: Usually undefined.
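* 5: Power-down state. On current Solaris releases the system is shut down and, where the hardware supports it, powered off. Older descriptions associate this state with the boot -a command, where the system is taken to init 0 and an interactive boot is started.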
* 6: Reboot. This state takes the system to init state 0 and then to the default init state (usually 3, but can be redefined in the /etc/inittab file).
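To check which of these states a machine is currently in, who -r reports the current run level. The sketch below pulls the level out of that output; the function names parse_runlevel and current_runlevel are made up for illustration, and the parsing assumes the usual "run-level N" wording in the who -r output.

```shell
#!/bin/sh
# parse_runlevel: hypothetical helper. Reads `who -r` style output on stdin
# and prints the token that follows the word "run-level".
parse_runlevel() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "run-level") { print $(i + 1); exit }
        }
    }'
}

# current_runlevel: hypothetical wrapper around the real `who -r` command.
current_runlevel() {
    who -r | parse_runlevel
}
```

Calling current_runlevel on a system in its normal operating state would be expected to print 3.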
Read more…

Categories: Hosts Tags:

Solaris rsh connection refused resolved

September 22nd, 2008 ricky.zhu No comments

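This problem had been bothering me for several months, and today I finally solved it.

The setup is a Solaris 10 cluster with four nodes, referred to here as 1, 2, 3 and 4. ssh and rsh work between all nodes, except that rsh fails for 1-1, 2-1, 3-1 and 4-1. "Works" here means being able to reach the other node without entering a password. For 1-2, for example, running rsh 2 date on node 1 prints the current time on node 2.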
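The check described above (running date on each target over rsh) can be looped over all nodes. This is only a sketch: check_rsh and the RSH variable are made-up names, and it assumes that a refused or unauthenticated connection makes rsh exit with a non-zero status.

```shell
#!/bin/sh
# RSH can be overridden, for example to test against ssh instead.
RSH=${RSH:-rsh}

# check_rsh: hypothetical helper. Tries `rsh -n <node> date` for every node
# given and reports which targets are reachable from the local node.
check_rsh() {
    failed=0
    for node in "$@"; do
        if "$RSH" -n "$node" date >/dev/null 2>&1; then
            echo "$node: ok"
        else
            echo "$node: FAILED"
            failed=1
        fi
    done
    return $failed
}
```

Running check_rsh 1 2 3 4 on each node in turn gives the full matrix of which pairs work.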

Setting up rsh from node 1 to node 2 actually requires a few steps, listed briefly below:
Read more…

Categories: Hosts Tags:

Service Management Facility Quick Start

August 26th, 2008 ricky.zhu No comments

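While working on the scstat problem described earlier, I read Sun's official documentation carefully, which deepened my understanding of Solaris services. On Solaris, the svcs and svcadm commands are used to view, modify, and restart services. Below I repost the Service Management Facility quick start guide.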

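Introduction

Historically, the UNIX operating system included a set of services: software programs not associated with any interactive user login that listen for and respond to requests to perform specific tasks (such as delivering e-mail, responding to ftp requests, or allowing remote command execution). These traditional services were usually separate applications that ran as a single process, started at system boot, and kept running while the system was up, handling any requests they received.

Today, administrators must manage a range of services that go well beyond this original model. Sun introduced the Service Management Facility (SMF) to simplify the management of these system services. SMF is a new feature of the Solaris operating system that creates a supported, unified model for services and service management on every Solaris system. It is a core part of the Predictive Self-Healing technology in Solaris 10, which provides automatic recovery from software and hardware failures as well as administrative errors.

In this guide we introduce the features and benefits of SMF, point out what has changed significantly in Solaris, and show how to use SMF for typical administrative tasks. A detailed guide to SMF and Predictive Self-Healing can be found on Sun's BigAdmin web site.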

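Features

The Service Management Facility improves several aspects of the Solaris administration model. Some of the most significant updates include:

* Services are represented as first-class objects that can be viewed (with the new svcs(1) command) and managed (with svcadm(1M) and svccfg(1M)).
* Failed services are automatically restarted in dependency order, whether they failed because of administrator error, a software bug, or an uncorrectable hardware error.
* Detailed information is available about misconfigured or misbehaving services, including an explanation of why a service is not running (using “svcs -x”) and a separate persistent log file for each service.
* Boot problems are easier to debug, because boot verbosity can be controlled, service startup messages are logged, and console access is more reliable during startup failures.
* Snapshots of service configuration are taken automatically, making it easier to back up, restore, and undo changes to services.
* Services can be enabled and disabled with a supported tool (svcadm(1M)), so that changes persist across upgrades and patches.
* Administrators can more easily and securely delegate tasks to non-root users, including configuring, starting, stopping, or restarting services (as described in the smf_security(5) man page).
* Large systems boot faster because services are started in parallel according to their dependencies.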
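As a rough illustration of the svcs and svcadm commands named in the list, the sketch below wraps two common tasks in shell functions so that nothing runs until they are called. The function names smf_restart and smf_explain are made up; only svcs, svcs -x and svcadm restart are real Solaris invocations, and the service name in the usage note is just an example.

```shell
#!/bin/sh
# smf_restart: hypothetical helper. Restarts one service through svcadm and
# then prints its current state line with svcs.
smf_restart() {
    svcadm restart "$1" && svcs "$1"
}

# smf_explain: hypothetical helper. Asks SMF why a service is not running,
# or lists every service with a problem when no FMRI is given.
smf_explain() {
    if [ $# -gt 0 ]; then
        svcs -x "$1"
    else
        svcs -x
    fi
}
```

For example, smf_restart network/nfs/client restarts the NFS client service, and smf_explain with no argument shows every service that is not running as expected.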

Read more…

Categories: Hosts Tags:

scstat unexpected error: problem and fix

August 26th, 2008 ricky.zhu 7 comments

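I have rarely updated this blog lately. Friends who know me know that I have been busy with an important release: 11.1.0.7, the first patchset for Oracle Database 11g. It should be released soon, so stay tuned.

While testing today, a Solaris node ran into trouble yet again. The servers have had one problem after another recently: first a DLM problem, then a QFS problem, and now services failing to start because of dependency problems. The SunCluster command scstat returns an abnormal result: unexpected error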

A Google search led me to a similar problem on Sun's official web site.

Following the hint there, I checked the output of svcs -x:
bash-2.05$ svcs -x
svc:/network/nfs/client:default (NFS client)
State: offline since August 25, 2008 10:33:46 PM PDT
Reason: Start method is running.
See: http://sun.com/msg/SMF-8000-C4
See: mount_nfs(1M)
See: /var/svc/log/network-nfs-client:default.log
Impact: 18 dependent services are not running. (Use -v for list.)

svc:/application/print/server:default (LP print server)
State: disabled since August 25, 2008 10:31:10 PM PDT
Reason: Disabled by an administrator.
See: http://sun.com/msg/SMF-8000-05
See: lpsched(1M)
Impact: 2 dependent services are not running. (Use -v for list.)

svc:/system/cluster/cl-svc-cluster-milestone:default (Synchronizing the cluster userland services)
State: disabled since August 25, 2008 10:32:38 PM PDT
Reason: Temporarily disabled by an administrator.
See: http://sun.com/msg/SMF-8000-1S
Impact: 1 dependent service is not running. (Use -v for list.)

svc:/application/stosreg:default (Service Tag OS Registry Inserter)
State: maintenance since August 25, 2008 10:33:39 PM PDT
Reason: Method failed.
See: http://sun.com/msg/SMF-8000-8Q
See: stclient(1M)
See: /var/svc/log/application-stosreg:default.log
Impact: This service is not running.

svc:/network/stdiscover:default (Service Tag discovery probe)
State: maintenance since August 25, 2008 10:33:45 PM PDT
Reason: Restarter svc:/network/inetd:default gave no explanation.
See: http://sun.com/msg/SMF-8000-9C
See: in.stdiscover(1M)
Impact: This service is not running.

svc:/network/stlisten:default (Service Tag Discovery Listener)
State: maintenance since August 25, 2008 10:33:45 PM PDT
Reason: Restarter svc:/network/inetd:default gave no explanation.
See: http://sun.com/msg/SMF-8000-9C
See: in.stlisten(1M)
Impact: This service is not running.
bash-2.05$

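The service dependencies turned out to be wrong. I opened the console, entered single-user mode, disabled and re-enabled a few services, rebooted, and to my surprise everything was fine.
Noting it down here.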
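When svcs -x output is as long as the listing above, it can help to pull out just the services stuck in maintenance. The sketch below does that with awk; maintenance_fmris is a made-up name, and the parsing assumes the "svc:/..." and "State:" line layout shown in the listing.

```shell
#!/bin/sh
# maintenance_fmris: hypothetical helper. Reads `svcs -x` output on stdin and
# prints the FMRI of every service whose State line says "maintenance".
maintenance_fmris() {
    awk '
        /^svc:/ { fmri = $1 }
        $1 == "State:" && $2 == "maintenance" { print fmri }
    '
}
```

A possible use is svcs -x | maintenance_fmris to get the list, followed by svcadm clear on each FMRI once the underlying cause is fixed. That is not necessarily what resolved the problem in this post, where the services were disabled and enabled again by hand.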

Categories: Hosts Tags:

Resolving a SunCluster ucmmd problem

March 7th, 2008 ricky.zhu No comments

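Recently I have often run into a problem where ucmm will not start on one node of a SunCluster. The symptom is that the output of scstat -g shows ucmmd is not running, which is very frustrating.
The folks at STIT helped fix it several times without really knowing the cause. This time it broke again right after being fixed, which was infuriating.

I searched Sun's web site, found the usage of scswitch, read it through carefully, and then fixed the problem with two commands.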

1. First, use ucmmd to restart the ucmm process:

# /usr/cluster/lib/ucmm/ucmmd -r /usr/cluster/lib/ucmm/ucmm_reconf

2. Then use scswitch to take the related resource group offline and online again, after which everything was OK:

# /usr/cluster/bin/scswitch -R -h xxx -g rac-framework-rg
# xxx is the node name
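The two steps can be kept together in a small function so they are always run in the same order. This is a sketch only: recover_rac_framework and the DRY_RUN switch are made-up names, while the two commands and the rac-framework-rg group name are the ones used in this post.

```shell
#!/bin/sh
UCMMD=/usr/cluster/lib/ucmm/ucmmd
SCSWITCH=/usr/cluster/bin/scswitch

# recover_rac_framework: hypothetical helper taking the node name as $1.
# With DRY_RUN set to a non-empty value it only prints the commands.
recover_rac_framework() {
    node=$1
    if [ -z "$node" ]; then
        echo "usage: recover_rac_framework <node>" >&2
        return 2
    fi
    run=""
    if [ -n "${DRY_RUN:-}" ]; then
        run=echo
    fi
    # Step 1: restart the ucmm process.
    $run "$UCMMD" -r /usr/cluster/lib/ucmm/ucmm_reconf || return 1
    # Step 2: restart the resource group on the given node.
    $run "$SCSWITCH" -R -h "$node" -g rac-framework-rg
}
```

Setting DRY_RUN=1 before calling recover_rac_framework with a node name prints the two command lines without touching the cluster, which is a cheap way to double-check the node name first.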

Recording the scswitch usage here for future reference:

scswitch(1M)

scswitch - perform ownership and state change of resource groups and disk device groups in Sun Cluster configurations

SYNOPSIS

scswitch -c -h node[,...] -j resource[,...] -f flag-name
scswitch {-e| -n} [-M] -j resource[,...]
scswitch -F {-g resource-grp[,...]| -D device-group[,...]}
scswitch -m -D device-group[,...]
scswitch -Q [ -g resource-grp[,...]]
scswitch -R -h node[,...] -g resource-grp[,...]
scswitch -S -h from-node [ -K continue_evac]
scswitch {-u| -o} -g resource-grp[,...]
scswitch -z -g resource-grp[,...] -h node[,...]
scswitch -z -g resource-grp[,...]
scswitch -z
scswitch -z -D device-group[,...] -h node
scswitch -Z [-g resource-grp[,...]]
Read more…

Categories: Hosts Tags:

High availability cluster

February 19th, 2008 ricky.zhu No comments

Excerpted from Wikipedia

High-availability cluster
From Wikipedia, the free encyclopedia

High-availability clusters (also known as HA Clusters or Failover Clusters) are computer clusters that are implemented primarily for the purpose of improving the availability of services which the cluster provides. They operate by having redundant computers or nodes which are then used to provide service when system components fail. Normally, if a server with a particular application crashes, the application will be unavailable until someone fixes the crashed server. HA clustering remedies this situation by detecting hardware/software faults, and immediately restarting the application on another system without requiring administrative intervention, a process known as Failover. As part of this process, clustering software may configure the node before starting the application on it. For example, appropriate filesystems may need to be imported and mounted, network hardware may have to be configured, and some supporting applications may need to be running as well.

HA clusters are often used for critical databases, file sharing on a network, business applications, and customer services such as electronic commerce websites.

HA cluster implementations attempt to build redundancy into a cluster to eliminate single points of failure, including multiple network connections and data storage which is multiply connected via Storage area networks.

HA clusters usually use a heartbeat private network connection which is used to monitor the health and status of each node in the cluster. One subtle, but serious condition every clustering software must be able to handle is split-brain. Split-brain occurs when all of the private links go down simultaneously, but the cluster nodes are still running. If that happens, each node in the cluster may mistakenly decide that every other node has gone down and attempt to start services that other nodes are still running. Having duplicate instances of services may cause data corruption on the shared storage.
Node configurations
Read more…

Categories: Daily life Tags:

A close look at the Solaris boot process

February 14th, 2008 ricky.zhu No comments

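Of the three major UNIX platforms (Solaris, AIX, HPUX), Solaris is a comparatively stable one. Its boot process is broadly the same as that of Linux, but there are some differences in the details. I am reposting an article that analyses the Solaris boot process, to help myself learn and understand this excellent platform better. The article has four parts: a brief introduction, booting, processes, and inetd.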

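On the SPARC platform, a Solaris system has a firmware program similar to a PC BIOS (the OpenBoot EEPROM) that is responsible for identifying partitions and file systems and for loading the kernel. In releases after Solaris 2.6, the default kernel file is stored at /platform/`arch -k`/kernel/unix, where `arch -k` reports the kernel architecture of the hardware, currently usually i86pc (Intel IA32) or sun4u (Sun UltraSPARC).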
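The kernel location described above can be computed with a one-line helper. This is a sketch: kernel_path is a made-up name, and it assumes that uname -m reports the same architecture name (for example sun4u or i86pc) that appears in the /platform path.

```shell
#!/bin/sh
# kernel_path: hypothetical helper. Prints the default kernel location for
# the given architecture name, or for the local machine when none is given.
kernel_path() {
    machine=${1:-$(uname -m)}
    echo "/platform/$machine/kernel/unix"
}
```

For example, kernel_path sun4u prints /platform/sun4u/kernel/unix.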
  
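  On Intel systems there is no EEPROM firmware, so the system provides a boot program that emulates the EEPROM and is responsible for locating and loading the kernel. This program runs in real mode, and the system must give it a boot partition in FAT12/16 format. Once the system has booted, its configuration files can be found under /boot/solaris. (Solaris IA uses the default kernel /kernel/unix.)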
  
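  The overall boot process is as follows:
  =====================================
  init 0 OpenBoot mode -> (boot the kernel, load hardware drivers) can optionally boot from cdrom into maintenance mode
  |
  V
  init 1 single-user mode -> (mount the / partition) log in for maintenance mode, or press Ctrl+D to continue to multi-user mode
  |
  V
  init 2 network workstation mode -> (bring up the network, run network workstation services) runs the /etc/rc2 script to bring up the network
  | |
  | ->-> starts the S69inet service and runs some of the inetd network services
  V
  init 3 network server mode -> (run the various network services) runs the /etc/rc3 script to start the network servers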
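Start scripts such as S69inet live in the per-level rc directories and run in the order given by the number after the S. The sketch below lists them in that order; list_start_scripts is a made-up name, and the optional second argument (defaulting to /etc) exists only so the function can be pointed at a test directory.

```shell
#!/bin/sh
# list_start_scripts: hypothetical helper. Prints the S* start scripts of
# run level $1 in the sorted order in which the shell expands the glob.
list_start_scripts() {
    level=$1
    root=${2:-/etc}
    for script in "$root/rc$level.d"/S*; do
        if [ -e "$script" ]; then
            basename "$script"
        fi
    done
}
```

Calling list_start_scripts 2 shows what is started on the way into init 2, and list_start_scripts 3 does the same for init 3.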
  
  
   Read more…

Categories: Hosts Tags: