博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
Oracle RAC内部错误:ORA-00600[keltnfy-ldmInit]一例
阅读量:7223 次
发布时间:2019-06-29

本文共 9920 字,大约阅读时间需要 33 分钟。

一套SUNOS上的2节点10.2.0.2 RAC系统日前出现ORA-00600: internal error code, arguments: [keltnfy-ldmInit], [46], [1], [], [], [], [], []内部错误,错误发生时系统操作人员误使用hostname命令修改了1号主机的主机名,之后陆续出现以上ora-00600错误,同时操作系统日志显示RAC CSS进程意外终止,具体日志如下:
================== OS Message=====================Jan 10 11:15:10 cupd25k-a root: [ID 702911 user.error] Cluster Ready Services completed waiting on dependencies.Jan 10 11:15:16 cupd25k-a root: [ID 702911 user.error] Duplicate Oracle CLSMON found. Killing and restarting it.Jan 10 11:15:16 cupd25k-a root: [ID 702911 user.error] Oracle CSS daemon failed to start up. Check CRS logs for diagnostics.Jan 10 11:15:16 cupd25k-a root: [ID 702911 user.error] Oracle CLSMON terminated with unexpected status 137. Respawning/* 这里的Duplicate Oracle CLSMON found 因该指的是OCLSMON进程,"In Oracle 10.2.0.2 and above there is an additional process called OCLSOMONwhich monitors the CSS daemon for hangs or scheduling issues and can reboot anode if there is a perceived hang. OCLSOMON is spawned in init.cssd and runsas the Oracle user."   oclsmon进程在10.2.0.2以后版本被引入,用以监视css进程,   若发生hang或操作系统调度问题时该进程可能会reboot节点,   oclsmon进程会被init.cssd脚本spawned.  */==================oclsmon.log======================2011-01-10 11:15:11.376unspecified member number is (1)Member 1 group OCLSMON_ in use. Is oclsmon already up?2011-01-10 11:15:11.479Internal Error Information:  Category: 8  Operation: skgxnreg: the member number is i  Location: skgxnreg_7  Other:  Dep: 12011-01-10 11:15:11.737unspecified member number is (1)Member 1 group OCLSMON_ in use. Is oclsmon already up?2011-01-10 11:15:11.751Internal Error Information:  Category: 8  Operation: skgxnreg: the member number is i  Location: skgxnreg_7  Other:  Dep: 12011-01-10 11:15:12.006unspecified member number is (1)Member 1 group OCLSMON_ in use. Is oclsmon already up?2011-01-10 11:15:12.023Internal Error Information:  Category: 8  Operation: skgxnreg: the member number is i  Location: skgxnreg_7  Other:  Dep: 12011-01-10 11:15:12.278unspecified member number is (1)Member 1 group OCLSMON_ in use. Is oclsmon already up?2011-01-10 11:15:12.293Internal Error Information:  Category: 8  Operation: skgxnreg: the member number is i  Location: skgxnreg_7  Other:  Dep: 1/*  skgxn是Oracle Clusterware用以监视skgxn事件(即第三方CLUSTERWARE相关的事宜,他们应该有用sun的cluster);    似乎是修改hostname导致了Oracle CSS出现了fatal error,并启动了一个以上的OCLSMON进程(Duplicate Oracle CLSMON found),    最后"Oracle CSS daemon failed to start up. Check CRS logs for diagnostics",    在Oracle instance启动的情况下25k-a节点的CSS进程意外终止,    可能导致该节点上的所有实例的LMD(global Enqueue Service daemon)、LMON无法正常工作而导致实例hang住。*/==========================alert.log====================Errors in file /oracle/oracle/admin/BOCPCS/udump/bocpcs1_ora_12320.trc:ORA-00600: internal error code, arguments: [keltnfy-ldmInit], [46], [1], [], [], [], [], []=========================part of trace file===============*** 2011-01-10 11:11:02.957ksedmp: internal or fatal errorORA-00600: internal error code, arguments: [keltnfy-ldmInit], [46], [1], [], [], [], [], []Current SQL information unavailable - no session.----- Call Stack Trace -----calling              call     entry                argument values in hexlocation             type     point                (? means dubious value)-------------------- -------- -------------------- ----------------------------ksedmp()+716         CALL     ksedst()             FFFFFFFF7FFF9D40 ?                                                   000000000 ? 0FFFFFFFF ?                                                   FFFFFFFF7FFF8EE8 ?                                                   FFFFFFFF7FFFA640 ?                                                   000000008 ?kgerinv()+200        PTR_CALL 0000000000000000     000000002 ? 10638A1CC ?                                                   000000001 ? 000000000 ?                                                   10638A000 ? 10638A1CC ?kgeasnmierr()+28     CALL     kgerinv()            106384B98 ? 000000000 ?                                                   105D3B940 ? 000000002 ?                                                   FFFFFFFF7FFFDFF0 ?                                                   000001430 ?keltnfy()+784        CALL     kgeasnmierr()        106384B98 ? 1064DCBF0 ?                                                   105D3B940 ? 000000002 ?                                                   000000000 ? 00000002E ?kscnfy()+552         PTR_CALL 0000000000000000     10639B498 ? 38001E7A8 ?                                                   1055AC5D0 ? 10639B498 ?                                                   000102C00 ? 10638A1C0 ?ksucrp()+2436        CALL     kscnfy()             000008000 ? 000808214 ?                                                   100C4C220 ? 1055C6680 ?                                                   00000000F ? 000000001 ?opiino()+2056        CALL     ksucrp()             000106387 ? 380007608 ?                                                   000000000 ? 000380000 ?                                                   000106000 ? 106387618 ?opiodr()+1488        PTR_CALL 0000000000000000     10555A000 ?                                                   FFFFFFFF7FFFF1C8 ?                                                   00010555A ? 000106000 ?                                                   105C83000 ? 000000001 ?opidrv()+828         CALL     opiodr()             106391000 ? 000000000 ?                                                   106390DD8 ? 106390000 ?                                                   106391BD0 ? 000106000 ?sou2o()+80           CALL     opidrv()             106394358 ? 000000001 ?                                                   00000003C ? 000000000 ?                                                   00000003C ? 000106000 ?opimai_real()+124    CALL     sou2o()              FFFFFFFF7FFFF788 ?                                                   00000003C ? 000000004 ?                                                   FFFFFFFF7FFFF7B0 ?                                                   105C82000 ? 000105C82 ?main()+152           CALL     opimai_real()        000000002 ?                                                   FFFFFFFF7FFFF888 ?                                                   103F1BBCC ? 10632DB10 ?                                                   002411E44 ? 000014400 ?_start()+380         CALL     main()               000000002 ? 000000008 ?                                                   000000000 ?                                                   FFFFFFFF7FFFF898 ?                                                   FFFFFFFF7FFFF9A8 ?                                                   FFFFFFFF7C700200 ?/* 可以看到以上trace文件指出了no session,    在服务进程启动阶段遭遇了该keltnfy-ldmInit内部错误*/metalink文档Startup Database Produces Ora-00600: [Keltnfy-Ldminit] [ID 336447.1]介绍了该内部错误一般由主机上的不当网络配置引起,很显然使用hostname命令修改了一个无法解析的主机名时可能引发该ORA-00600[keltnfy-ldmInit]内部错误。Applies to:Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 10.2.0.3 - Release: 10.2 to 10.2Information in this document applies to any platform.***Checked for relevance on 09-Jun-2010***SymptomsAn startup nomount on Oracle 10g Release 2 database produces the following exception in alert logStarting up ORACLE RDBMS Version: 10.2.0.1.0.Errors in file /opt/oracle/10.2/admin/ORCL/udump/ORCL_ora_535.trc:ORA-00600: internal error code, arguments: [keltnfy-ldmInit], [46], [1], [], [], [], [], []USER: terminating instance due to error 600Instance terminated by USER, pid = 535CauseThe problem is related to getting host information.In this case, ldmInit()/sldmInit() is failing with error 46 : LDMERR_HOST_NOT_FOUNDThe following exception may also occur :LDMERR_SOSD_INIT         OSD init failed to be specific in these OSD failures LDMERR_BAD_ADDR         bad address when system call gethostname failed LDMERR_HOST_NOT_FOUND   gethostbyname system call fails LDMERR_NO_SUPPORT       when specific address type is not supportedDevelopment has fixed two bugs so far regarding this issueBug:5438154 - Abstract: ORA-600[KELTNFY-LDMINIT]  STARTING THE DBRelease Notes:ldmInit returned LDMERR_HOST_NOT_FOUND for the machine huge alias list/address listWorkaround:reduce the alais list of the machineBug:5486074 - Abstract: ORA-600 [KELTNFY-LDMINIT] WHEN DNS IS NOT AVAILABLERelease Notes:Internal error is raised by the Server Generated Alert subsystem when it can not determine Host Name orNetwork Address. This can be caused by DNS server being unaavilable. SolutionThe fix for 5486074 will not fix any underlying error from gethostbyname(), it just change the internal error to a warning message : "Warning: keltnfy call to ldmInit failed with error 46"You will still need to fix the network config issue.  These are the check you can do verify the host information       Check permission on /etc/hosts $ ls -l /etc/hosts-rw-r--r--  2 root root 194 Oct 17  2006 /etc/hosts      Check if /etc/hosts file is correctly configured              ( all of this on one line ). Check the hostname:$ hostname$ ping `hostname`Make sure you are able to ping the hostname      Check if /etc/nodename is correctly configuredIf you have DNS setup, ping is not a tool to diagnose DNS problem. A better tool to use is nslookup, dnsquery, or dig.$ nslookup$ nslookup$ nslookup The forward and reverse lookup should succeed and return consistent address/info.   Check nsswitch.conf$ more nsswitch.confhosts:      files dnsMake sure host lookup is also done through the /etc/hosts file and not just dns.  It is recommended that FILES come first before DNS.Also, check the resolv.conf. This makes sure that the DNS is working properly.
显然在生产主机上使用hostname命令是危险的,因为你很难保证你在打字的时候不会因为同事的一下拍击而输错,有人说在生产环境中rm命令因该被禁用,那么这种特殊待遇对hostname命令也适用,我们可以用什么来代替hostname查看主机名呢?选择可以有非常多,这里我推荐一种:
-bash-3.00$ oslevel -r 5300-07-bash-3.00$ hostnameoracledatabase12g.com-bash-3.00$ uname -noracledatabase12g.com/* uname -n完全可以满足你的需要! */That's great!

转载地址:http://dyzfm.baihongyu.com/

你可能感兴趣的文章
linux监控对象及重要性
查看>>
walle-web自动化部署配置
查看>>
opencv轮廓提取、轮廓识别相关要点
查看>>
BOOST.ASIO源码剖析(一)
查看>>
过滤squidlog中各个链接的大小
查看>>
我的友情链接
查看>>
使用AnyChat如何实现任意两用户之间的音视频交互
查看>>
【个人小结】项目公共js的配置,解决不同页面多个配置修改的问题
查看>>
XAMP安装Apacher无法启动
查看>>
mongodb user
查看>>
ip地址子网划分
查看>>
Linux下快速搭建ntp时间同步服务器
查看>>
TouchEvent的传递过程学习笔记
查看>>
Android笔记--TCP Scoket(字符串收发)
查看>>
我的友情链接
查看>>
Hunt framework 2.0.0 发布,简单且高性能的 Web 服务框架
查看>>
数据库原理及应用(SQL Server 2016数据处理)【上海精品视频课程】
查看>>
MaxCompute表设计最佳实践
查看>>
Percona-Server-5.5.15源码安装
查看>>
容器安全拾遗 - Rootless Container初探
查看>>