Last edited by PinkOrient on 2012-11-05 09:52
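I found that when rgmanager performs a restart, it actually runs the script's stop and then its start, which is not quite what I expected. Why doesn't it simply call the script with the restart argument?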
The configuration is as follows:
- <service autostart="1" domain="xxx_dm" name="xxx_server" recovery="restart" max_restarts="3" restart_expire_time="60">
- <ip address="139.122.10.187" monitor_link="1">
- <script ref="xxx_server"/>
- </ip>
- </service>
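The xxx_server script monitors n xxx processes. If any one of them is missing, the script's status returns 1. If the script's restart/start function is called at that point, the other n-1 healthy xxx processes are unaffected; only the one that stopped is brought back up.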
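To make that behavior concrete, here is a minimal sketch of the kind of init script described above. It is not the actual xxx_serverd script: the pid-file layout under `PIDDIR`, the `WORKERS` list, the helper names, and the `sleep` placeholder standing in for a real xxx process are all assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of an init script like /etc/init.d/xxx_serverd.
# PIDDIR, WORKERS and every function name below are assumptions.
PIDDIR="${PIDDIR:-/var/run/xxx}"   # assumed directory holding one pid file per worker
WORKERS="${WORKERS:-1 2 3}"        # assumed identifiers of the n xxx processes

is_running() {
    # A worker counts as running if its pid file names a live process.
    pidfile="$PIDDIR/xxx.$1.pid"
    [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null
}

start_worker() {
    # Placeholder launch: a real script would start the actual xxx binary here.
    sleep 3600 >/dev/null 2>&1 &
    echo $! > "$PIDDIR/xxx.$1.pid"
}

do_start() {
    # Start only the workers that are missing; healthy ones are left alone.
    mkdir -p "$PIDDIR"
    for w in $WORKERS; do
        is_running "$w" || start_worker "$w"
    done
}

do_status() {
    # Return 1 as soon as any single worker is missing, 0 otherwise.
    for w in $WORKERS; do
        is_running "$w" || return 1
    done
    return 0
}

do_stop() {
    # Stop every worker, healthy or not.
    for w in $WORKERS; do
        if is_running "$w"; then kill "$(cat "$PIDDIR/xxx.$w.pid")"; fi
        rm -f "$PIDDIR/xxx.$w.pid"
    done
}

case "${1:-}" in
    start|restart) do_start ;;   # restart behaves like start: only dead workers are revived
    status)        do_status ;;
    stop)          do_stop ;;
esac
```

With a script shaped like this, calling restart after one worker dies would revive only that worker, whereas a stop followed by a start takes down all n of them first.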
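I tried killing one of the xxx_server processes, expecting rgmanager to call service xxx_serverd restart once on the local host, which would bring the dead process back up without affecting the ones still running.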
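What actually happens is this: once the cluster sees a non-zero status, it stops the whole service and withdraws the resources, then re-registers the resources and starts the service again. This kills the healthy xxx processes as well, and the whole cycle takes about 18 seconds.
- Nov 2 17:03:52 ServerNode01 xxx_serverd[29499]: status ... [OK]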
- Nov 2 17:04:25 ServerNode01 xxx_serverd[30222]: status ... [OK]
- Nov 2 17:04:58 ServerNode01 xxx_serverd[30842]: status ... [Failed] # one process found dead, status abnormal
- Nov 2 17:04:58 ServerNode01 clurgmgrd: [23683]: <err> script:xxx_server: status of /etc/init.d/xxx_serverd failed (returned 1)
- Nov 2 17:04:58 ServerNode01 clurgmgrd[23683]: <notice> status on script "xxx_server" returned 1 (generic error)
- Nov 2 17:04:58 ServerNode01 clurgmgrd[23683]: <notice> Stopping service service:xxx_server # the service is stopped, so the other processes exit too
- Nov 2 17:04:58 ServerNode01 xxx_serverd[30985]: stop ... [OK]
- Nov 2 17:04:58 ServerNode01 avahi-daemon[6987]: Withdrawing address record for 139.122.10.187 on bond0. # the VIP is withdrawn as well
- Nov 2 17:05:09 ServerNode01 clurgmgrd[23683]: <notice> Service service:xxx_server is recovering
- Nov 2 17:05:09 ServerNode01 clurgmgrd[23683]: <notice> Recovering failed service service:xxx_server
- Nov 2 17:05:11 ServerNode01 avahi-daemon[6987]: Registering new address record for 139.122.10.187 on bond0.
- Nov 2 17:05:16 ServerNode01 xxx_serverd[31550]: start ... [OK]
- Nov 2 17:05:16 ServerNode01 clurgmgrd[23683]: <notice> Service service:xxx_server started # resources reassigned and startup complete
- Nov 2 17:05:49 ServerNode01 xxx_serverd[32390]: status ... [OK]
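As a quick check of the roughly 18-second figure, the gap between the "Stopping service" line and the "started" line in the log can be computed directly. This is only a sketch: `to_seconds` is a hypothetical helper, and the two timestamps are copied by hand from the log above.

```shell
# Convert an HH:MM:SS timestamp to seconds since midnight.
to_seconds() {
    echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

# Timestamps taken from the "Stopping service" and "started" log lines.
stop_at=$(to_seconds 17:04:58)
started_at=$(to_seconds 17:05:16)

# Time during which the whole service, including the healthy processes, was down.
echo "outage: $((started_at - stop_at))s"
```

Most of that interval falls between the stop at 17:04:58 and the "is recovering" message at 17:05:09.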