Environment: two p650 hosts: AIX 5200-3; disk array S4300 (8*73G; RAID 5); application: Domino 6.5.3 (hosts a and b each run 2 partitions); HACMP 5.1
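Symptoms: 1. One day the application on one Domino partition failed and went down. Because a custom application monitor script is deployed, HACMP should have detected that this Domino partition was down and restarted the application. However, HACMP only detected that the partition was down (the monitor script wrote log output) and did not run the stop/start scripts to restart the partition.
2. Afterwards we tested HACMP monitoring by simulating the Domino partition outage that the application monitor script checks for. For every partition tested (one Domino partition on host a and one on host b), the monitor script did not detect the outage: it wrote no log output, and Domino was not restarted.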
Relevant logs: 1. /tmp/clstrmgr.debug
...(starting from the 13th, there are many entries like the following)
Fri May 19 17:59:13 PollAliasEvents: State not STABLE/RP_RUNNING or ibcasts, return
Fri May 19 17:59:43 PollAliasEvents: State not STABLE/RP_RUNNING or ibcasts, return
Fri May 19 18:00:13 PollAliasEvents: State not STABLE/RP_RUNNING or ibcasts, return
...
2. /tmp/hacmp.out
...(present since very early on, there are many entries like the following)
WARNING: Cluster gsmsscluster has been running recovery program '/usr/es/sbin/cluster/events/server_restart.rp' for 8004000 seconds. Please check cluster status.
WARNING: Cluster gsmsscluster has been running recovery program '/usr/es/sbin/cluster/events/server_restart.rp' for 8007600 seconds. Please check cluster status.
WARNING: Cluster gsmsscluster has been running recovery program '/usr/es/sbin/cluster/events/server_restart.rp' for 8011200 seconds. Please check cluster status.
...
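To put the numbers in these warnings in perspective, here is a small Python sketch that extracts the elapsed seconds and converts them to days. It is not part of HACMP; the regular expression and the function names `parse_stuck_warnings` and `seconds_to_days` are my own assumptions, written only against the three sample lines above. 8004000 seconds is roughly 92.6 days, and successive warnings are 3600 seconds apart, which suggests that server_restart.rp had already been stuck for about three months.

```python
import re

# Assumed pattern, derived only from the hacmp.out warning lines quoted above.
WARNING_RE = re.compile(
    r"has been running recovery program '(?P<program>[^']+)' "
    r"for (?P<seconds>\d+) seconds"
)


def parse_stuck_warnings(lines):
    """Return a list of (program, seconds) for each matching warning line."""
    results = []
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            results.append((match.group("program"), int(match.group("seconds"))))
    return results


def seconds_to_days(seconds):
    """Convert a duration in seconds to days."""
    return seconds / 86400


# The three warnings quoted from /tmp/hacmp.out.
sample = [
    "WARNING: Cluster gsmsscluster has been running recovery program "
    "'/usr/es/sbin/cluster/events/server_restart.rp' for 8004000 seconds. "
    "Please check cluster status.",
    "WARNING: Cluster gsmsscluster has been running recovery program "
    "'/usr/es/sbin/cluster/events/server_restart.rp' for 8007600 seconds. "
    "Please check cluster status.",
    "WARNING: Cluster gsmsscluster has been running recovery program "
    "'/usr/es/sbin/cluster/events/server_restart.rp' for 8011200 seconds. "
    "Please check cluster status.",
]

warnings = parse_stuck_warnings(sample)
for program, seconds in warnings:
    print(f"{program}: {seconds} s = {seconds_to_days(seconds):.1f} days")
```

To run it against the real log, the lines could be read from the file, for example `parse_stuck_warnings(open("/tmp/hacmp.out"))`.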
Could anyone help take a look? Thanks.