[Linux-HA] pgsql OCF resource agent and other questions

Andrew Beekhof beekhof at gmail.com
Tue Feb 12 10:57:49 MST 2008


On Feb 12, 2008, at 4:59 PM, Zoltan Boszormenyi wrote:

> Hi,
>
> Serge Dubrouski wrote:
>> pgsql OCF RA doesn't support multistate configuration so I don't think
>> that creating a clone would be a good idea.
>>
>
> Thanks for the information.
>
> Some other questions.
>
> According to http://linux-ha.org/v2/faq/resource_too_active
> the monitor action should return 0 for running, 7 ($OCF_NOT_RUNNING)
> for downed resources and anything else for failed ones.
> Either this documentation is buggy,

no

> or heartbeat doesn't conform to its own docs.

also no

>
> Here's the scenario: londiste creates a pidfile and deletes it when it
> quits correctly. However, if I kill it manually then the pidfile stays.
> What should my script return when it detects that the process with the
> indicated PID is no longer there? It's not a "downed" resource, it's a
> failed one. So I returned $OCF_ERR_GENERIC. But after some time heartbeat
> says that my resource became "unmanaged".

i'm guessing (because you've not included anything on which to comment  
properly) that the stop action failed
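
One way a stop can fail with a pidfile-based agent: the stop waits for the pidfile to vanish, but the process was killed by hand and left the pidfile behind, so the wait never ends and the operation times out. A minimal sketch of a stop that treats a stale pidfile as "already stopped", assuming plain POSIX sh. `stop_by_pidfile` and its timeout argument are hypothetical names for illustration, not part of heartbeat or any shipped RA:

```shell
#!/bin/sh
# Hypothetical helper, not taken from heartbeat or any shipped resource agent.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1

# stop_by_pidfile PIDFILE [TIMEOUT_SECONDS]
stop_by_pidfile() {
    pidfile=$1
    timeout=${2:-10}

    # No pidfile: nothing to stop, which counts as a successful stop.
    [ -f "$pidfile" ] || return $OCF_SUCCESS

    pid=$(cat "$pidfile" 2>/dev/null)

    # Stale or empty pidfile: the process is already gone, so just clean up.
    if [ -z "$pid" ] || ! kill -0 "$pid" 2>/dev/null; then
        rm -f "$pidfile"
        return $OCF_SUCCESS
    fi

    # Ask the process to exit, then wait a bounded time for it to go away.
    kill -TERM "$pid" 2>/dev/null
    waited=0
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return $OCF_ERR_GENERIC
        fi
        sleep 1
        waited=$((waited + 1))
    done
    rm -f "$pidfile"
    return $OCF_SUCCESS
}
```

The wait is bounded inside the function so the agent reports its own failure instead of relying on the cluster's operation timeout.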

>
> In contrast to this, the pgsql OCF RA does it differently. It always
> returns 7 when it finds that there's no postmaster process. Which is the
> right behaviour?

it depends what you want to happen.
if you want a stop to be sent, use OCF_ERR_GENERIC.
if the resource is stateless and doesn't need any cleaning up, use
OCF_NOT_RUNNING
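
A rough sketch of that choice for a pidfile-based monitor, assuming POSIX sh. The numeric values 0, 1 and 7 are the standard OCF return codes, defined here only so the snippet runs on its own; `monitor_by_pidfile` and its `stateless` flag are hypothetical names, not something heartbeat provides:

```shell
#!/bin/sh
# Standard OCF return codes, defined locally so the sketch is self-contained.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1
OCF_NOT_RUNNING=7

# monitor_by_pidfile PIDFILE [STATELESS]
# STATELESS=yes means a dead process needs no cleanup (hypothetical flag).
monitor_by_pidfile() {
    pidfile=$1
    stateless=${2:-no}

    # No pidfile at all: the resource was stopped cleanly.
    [ -f "$pidfile" ] || return $OCF_NOT_RUNNING

    pid=$(cat "$pidfile" 2>/dev/null)
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        return $OCF_SUCCESS
    fi

    # The pidfile exists but the process is gone.
    if [ "$stateless" = yes ]; then
        # Nothing to clean up: report it as simply not running.
        return $OCF_NOT_RUNNING
    fi
    # Report a failure so the cluster sends a stop before recovering.
    return $OCF_ERR_GENERIC
}
```

With `stateless` left at its default, a stale pidfile is reported as a failure, so the stop action must be able to cope with that state.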

> We use heartbeat 2.0.8, I haven't said it in my first mail.

which was arguably the worst release that ever went out - please get  
something newer

>
>
>> On Feb 8, 2008 2:43 PM, Zoltan Boszormenyi <zb at cybertec.at> wrote:
>>
>>> ...
>>> But I noticed that somehow IP takeover doesn't take place
>>> if I pull the plug on the virtual ethernet card(s).
>>> I tried it with two Fedora 6 systems inside VMWare.
>>> I have set up pingd and my host machine as the ping node
>>> and services stop on the machine separated from the network,
>>> the virtual IP isn't started on the still connected machine.
>>>
>
> The above isn't true. I just didn't wait long enough. However, there is
> a problem. I have set up pingd according to the docs, so the ping
> attribute is 100, and when the node doesn't have the ping attribute the
> resource is stopped. I have set a static preference score of 20 points
> to start the virtual IP on the master node and a resource_stickiness of
> 40 for the virtual IP. So:
>
> 1. both nodes are up, the preference score makes the virtual IP start
>    on the master.
>    (master: 120 points, slave: 100 points)
> 2. both nodes are up, virtual IP is already running on master
>    (master: 160 points, slave: 100 points)
> 3. I pull the ethernet out of master, after some time master notices
>    that the ping host is gone, and the slave notices the master is gone.
>    (master: 60 points, slave: 100 points)
> 4. At this point, slave starts virtual IP, the master stops its own.
>    (master: 20 points, slave: 140 points)
> 5. I have put the plug back into the master, soon it notices the
>    ping node is back.
>    The slave also notices that the master is back.
>    (master: 120 points, slave: 140 points)
>
> Despite the 140 points on the slave for having the pingd score (100)
> and already running the virtual IP (resource stickiness), the master
> takes back the virtual IP. This happens regardless of auto_failback being
> on or off. Why?
>
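
The point totals in the five steps above can be reproduced with a few lines of shell. This only restates the poster's arithmetic (pingd value + static preference + stickiness, simply added per node). Whether the CRM in heartbeat 2.0.8 combines the scores this way is an assumption, and `score` is a hypothetical helper:

```shell
#!/bin/sh
# Values taken from the post; the additive model itself is an assumption.
PING=100        # pingd attribute value while the ping node is reachable
PREFERENCE=20   # static location preference for the master node
STICKINESS=40   # resource_stickiness of the virtual IP

# score HAS_PING IS_PREFERRED IS_RUNNING   (each argument is 0 or 1)
score() {
    echo $(( $1 * $PING + $2 * $PREFERENCE + $3 * $STICKINESS ))
}

echo "step 1: master=$(score 1 1 0) slave=$(score 1 0 0)"
echo "step 2: master=$(score 1 1 1) slave=$(score 1 0 0)"
echo "step 3: master=$(score 0 1 1) slave=$(score 1 0 0)"
echo "step 4: master=$(score 0 1 0) slave=$(score 1 0 1)"
echo "step 5: master=$(score 1 1 0) slave=$(score 1 0 1)"
```

Under this model the slave still leads 140 to 120 in step 5, which is why the observed failback is surprising.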
>>> Attached are my ha.cf, cib.xml and the referenced extra scripts.
>>>
>>> Thanks in advance and best regards,
>>> Zoltán Böszörményi
>>>
>>> --
>>> ----------------------------------
>>> Zoltán Böszörményi
>>> Cybertec Schönig & Schönig GmbH
>>> http://www.postgresql.at/
>>>
>>>
>>> #
>>> #       There are lots of options in this file.  All you have to have is a set
>>> #       of nodes listed {"node ...} one of {serial, bcast, mcast, or ucast},
>>> #       and a value for "auto_failback".
>>> #
>>> #       ATTENTION: As the configuration file is read line by line,
>>> #                  THE ORDER OF DIRECTIVES MATTERS!
>>> #
>>> #       In particular, make sure that the udpport, serial baud rate
>>> #       etc. are set before the heartbeat media are defined!
>>> #       debug and log file directives go into effect when they
>>> #       are encountered.
>>> #
>>> #       All will be fine if you keep them ordered as in this  
>>> example.
>>> #
>>> #
>>> #       Note on logging:
>>> #       If any of debugfile, logfile and logfacility are defined then they
>>> #       will be used. If debugfile and/or logfile are not defined and
>>> #       logfacility is defined then the respective logging and debug
>>> #       messages will be logged to syslog. If logfacility is not defined
>>> #       then debugfile and logfile will be used to log messages. If
>>> #       logfacility is not defined and debugfile and/or logfile are not
>>> #       defined then defaults will be used for debugfile and logfile as
>>> #       required and messages will be sent there.
>>> #
>>> #       File to write debug messages to
>>> #debugfile /var/log/ha-debug
>>> #
>>> #
>>> #       File to write other messages to
>>> #
>>> logfile /var/log/ha-log
>>> #
>>> #
>>> #       Facility to use for syslog()/logger
>>> #
>>> #logfacility    local0
>>> #
>>> #
>>> #       A note on specifying "how long" times below...
>>> #
>>> #       The default time unit is seconds
>>> #               10 means ten seconds
>>> #
>>> #       You can also specify them in milliseconds
>>> #               1500ms means 1.5 seconds
>>> #
>>> #
>>> #       keepalive: how long between heartbeats?
>>> #
>>> #keepalive 2
>>> #
>>> #       deadtime: how long-to-declare-host-dead?
>>> #
>>> #               If you set this too low you will get the problematic
>>> #               split-brain (or cluster partition) problem.
>>> #               See the FAQ for how to use warntime to tune deadtime.
>>> #
>>> #deadtime 30
>>> #
>>> #       warntime: how long before issuing "late heartbeat" warning?
>>> #       See the FAQ for how to use warntime to tune deadtime.
>>> #
>>> #warntime 10
>>> #
>>> #
>>> #       Very first dead time (initdead)
>>> #
>>> #       On some machines/OSes, etc. the network takes a while to come up
>>> #       and start working right after you've been rebooted.  As a result
>>> #       we have a separate dead time for when things first come up.
>>> #       It should be at least twice the normal dead time.
>>> #
>>> #initdead 120
>>> #
>>> #
>>> #       What UDP port to use for bcast/ucast communication?
>>> #
>>> #udpport        694
>>> #
>>> #       Baud rate for serial ports...
>>> #
>>> #baud   19200
>>> #
>>> #       serial  serialportname ...
>>> #serial /dev/ttyS0      # Linux
>>> #serial /dev/cuaa0      # FreeBSD
>>> #serial /dev/cuad0      # FreeBSD 6.x
>>> #serial /dev/cua/a      # Solaris
>>> #
>>> #
>>> #       What interfaces to broadcast heartbeats over?
>>> #
>>> #bcast  eth0            # Linux
>>> #bcast  eth1 eth2       # Linux
>>> #bcast  le0             # Solaris
>>> #bcast  le1 le2         # Solaris
>>> #
>>> #       Set up a multicast heartbeat medium
>>> #       mcast [dev] [mcast group] [port] [ttl] [loop]
>>> #
>>> #       [dev]           device to send/rcv heartbeats on
>>> #       [mcast group]   multicast group to join (class D multicast address
>>> #                       224.0.0.0 - 239.255.255.255)
>>> #       [port]          udp port to sendto/rcvfrom (set this value to the
>>> #                       same value as "udpport" above)
>>> #       [ttl]           the ttl value for outbound heartbeats.  this affects
>>> #                       how far the multicast packet will propagate.  (0-255)
>>> #                       Must be greater than zero.
>>> #       [loop]          toggles loopback for outbound multicast heartbeats.
>>> #                       if enabled, an outbound packet will be looped back and
>>> #                       received by the interface it was sent on. (0 or 1)
>>> #                       Set this value to zero.
>>> #
>>> #
>>> #mcast eth0 225.0.0.1 694 1 0
>>> #
>>> #       Set up a unicast / udp heartbeat medium
>>> #       ucast [dev] [peer-ip-addr]
>>> #
>>> #       [dev]           device to send/rcv heartbeats on
>>> #       [peer-ip-addr]  IP address of peer to send packets to
>>> #
>>> #ucast eth0 192.168.1.2
>>> #
>>> #
>>> #       About boolean values...
>>> #
>>> #       Any of the following case-insensitive values will work for true:
>>> #               true, on, yes, y, 1
>>> #       Any of the following case-insensitive values will work for false:
>>> #               false, off, no, n, 0
>>> #
>>> #
>>> #
>>> #       auto_failback:  determines whether a resource will
>>> #       automatically fail back to its "primary" node, or remain
>>> #       on whatever node is serving it until that node fails, or
>>> #       an administrator intervenes.
>>> #
>>> #       The possible values for auto_failback are:
>>> #               on      - enable automatic failbacks
>>> #               off     - disable automatic failbacks
>>> #               legacy  - enable automatic failbacks in systems
>>> #                       where all nodes do not yet support
>>> #                       the auto_failback option.
>>> #
>>> #       auto_failback "on" and "off" are backwards compatible with the old
>>> #               "nice_failback on" setting.
>>> #
>>> #       See the FAQ for information on how to convert
>>> #               from "legacy" to "on" without a flash cut.
>>> #               (i.e., using a "rolling upgrade" process)
>>> #
>>> #       The default value for auto_failback is "legacy", which
>>> #       will issue a warning at startup.  So, make sure you put
>>> #       an auto_failback directive in your ha.cf file.
>>> #       (note: auto_failback can be any boolean or "legacy")
>>> #
>>> #auto_failback on
>>> #
>>> #
>>> #       Basic STONITH support
>>> #       Using this directive assumes that there is one stonith
>>> #       device in the cluster.  Parameters to this device are
>>> #       read from a configuration file. The format of this line is:
>>> #
>>> #         stonith <stonith_type> <configfile>
>>> #
>>> #       NOTE: it is up to you to maintain this file on each node in the
>>> #       cluster!
>>> #
>>> #stonith baytech /etc/ha.d/conf/stonith.baytech
>>> #
>>> #       STONITH support
>>> #       You can configure multiple stonith devices using this directive.
>>> #       The format of the line is:
>>> #         stonith_host <hostfrom> <stonith_type> <params...>
>>> #         <hostfrom> is the machine the stonith device is attached
>>> #              to or * to mean it is accessible from any host.
>>> #         <stonith_type> is the type of stonith device (a list of
>>> #              supported drivers is in /usr/lib/stonith.)
>>> #         <params...> are driver specific parameters.  To see the
>>> #              format for a particular device, run:
>>> #           stonith -l -t <stonith_type>
>>> #
>>> #
>>> #       Note that if you put your stonith device access information in
>>> #       here, and you make this file publicly readable, you're asking
>>> #       for a denial of service attack ;-)
>>> #
>>> #       To get a list of supported stonith devices, run
>>> #               stonith -L
>>> #       For detailed information on which stonith devices are supported
>>> #       and their detailed configuration options, run this command:
>>> #               stonith -h
>>> #
>>> #stonith_host *     baytech 10.0.0.3 mylogin mysecretpassword
>>> #stonith_host ken3  rps10 /dev/ttyS1 kathy 0
>>> #stonith_host kathy rps10 /dev/ttyS1 ken3 0
>>> #
>>> #       Watchdog is the watchdog timer.  If our own heart doesn't beat for
>>> #       a minute, then our machine will reboot.
>>> #       NOTE: If you are using the software watchdog, you very likely
>>> #       wish to load the module with the parameter "nowayout=0" or
>>> #       compile it without CONFIG_WATCHDOG_NOWAYOUT set. Otherwise even
>>> #       an orderly shutdown of heartbeat will trigger a reboot, which is
>>> #       very likely NOT what you want.
>>> #
>>> #watchdog /dev/watchdog
>>> #
>>> #       Tell what machines are in the cluster
>>> #       node    nodename ...    -- must match uname -n
>>> #node   ken3
>>> #node   kathy
>>> #
>>> #       Less common options...
>>> #
>>> #       Treats 10.10.10.254 as a pseudo-cluster-member
>>> #       Used together with ipfail below...
>>> #       note: don't use a cluster node as ping node
>>> #
>>> #ping 10.10.10.254
>>> #
>>> #       Treats 10.10.10.254 and 10.10.10.253 as a pseudo-cluster-member
>>> #       called group1. If either 10.10.10.254 or 10.10.10.253 are up
>>> #       then group1 is up
>>> #       Used together with ipfail below...
>>> #
>>> #ping_group group1 10.10.10.254 10.10.10.253
>>> #
>>> #       HBA ping directive for Fiber Channel
>>> #       Treats fc-card-name as pseudo-cluster-member
>>> #       used with ipfail below ...
>>> #
>>> #       You can obtain HBAAPI from http://hbaapi.sourceforge.net.  You need
>>> #       to get the library specific to your HBA directly from the vendor.
>>> #       To install HBAAPI stuff, all you need to do is to compile the common
>>> #       part you obtained from the sourceforge. This will produce libHBAAPI.so
>>> #       which you need to copy to /usr/lib. You also need to copy hbaapi.h to
>>> #       /usr/include.
>>> #
>>> #       The fc-card-name is the name obtained from the hbaapitest program
>>> #       that is part of the hbaapi package. Running hbaapitest will produce
>>> #       a verbose output. One of the first lines is similar to:
>>> #               Apapter number 0 is named: qlogic-qla2200-0
>>> #       Here fc-card-name is qlogic-qla2200-0.
>>> #
>>> #hbaping fc-card-name
>>> #
>>> #
>>> #       Processes started and stopped with heartbeat.  Restarted unless
>>> #               they exit with rc=100
>>> #
>>> #respawn userid /path/name/to/run
>>> #respawn hacluster /usr/lib/heartbeat/ipfail
>>> #
>>> #       Access control for client api
>>> #               default is no access
>>> #
>>> #apiauth client-name gid=gidlist uid=uidlist
>>> #apiauth ipfail gid=haclient uid=hacluster
>>>
>>> ###########################
>>> #
>>> #       Unusual options.
>>> #
>>> ###########################
>>> #
>>> #       hopfudge maximum hop count minus number of nodes in config
>>> #hopfudge 1
>>> #
>>> #       deadping - dead time for ping nodes
>>> #deadping 30
>>> #
>>> #       hbgenmethod - Heartbeat generation number creation method
>>> #               Normally these are stored on disk and incremented as needed.
>>> #hbgenmethod time
>>> #
>>> #       realtime - enable/disable realtime execution (high priority, etc.)
>>> #               defaults to on
>>> #realtime off
>>> #
>>> #       debug - set debug level
>>> #               defaults to zero
>>> debug 1
>>> #
>>> #       API Authentication - replaces the fifo-permissions-based system of the past
>>> #
>>> #
>>> #       You can put a uid list and/or a gid list.
>>> #       If you put both, then a process is authorized if it qualifies under either
>>> #       the uid list, or under the gid list.
>>> #
>>> #       The groupname "default" has special meaning.  If it is specified, then
>>> #       this will be used for authorizing groupless clients, and any client groups
>>> #       not otherwise specified.
>>> #
>>> #       There is a subtle exception to this.  "default" will never be used in the
>>> #       following cases (actual default auth directives noted in brackets)
>>> #                 ipfail        (uid=HA_CCMUSER)
>>> #                 ccm           (uid=HA_CCMUSER)
>>> #                 ping          (gid=HA_APIGROUP)
>>> #                 cl_status     (gid=HA_APIGROUP)
>>> #
>>> #       This is done to avoid creating a gaping security hole and matches the most
>>> #       likely desired configuration.
>>> #
>>> #apiauth ipfail uid=hacluster
>>> #apiauth ccm uid=hacluster
>>> #apiauth cms uid=hacluster
>>> #apiauth ping gid=haclient uid=alanr,root
>>> #apiauth default gid=haclient
>>>
>>> #       message format in the wire, it can be classic or netstring,
>>> #       default: classic
>>> #msgfmt  classic/netstring
>>>
>>> #       Do we use logging daemon?
>>> #       If logging daemon is used, logfile/debugfile/logfacility in this file
>>> #       are not meaningful any longer. You should check the config file for logging
>>> #       daemon (the default is /etc/logd.cf)
>>> #       more information can be found in http://www.linux-ha.org/ha_2ecf_2fUseLogdDirective
>>> #       Setting use_logd to "yes" is recommended
>>> #
>>> # use_logd yes/no
>>> #
>>> #       the interval we  reconnect to logging daemon if the previous connection failed
>>> #       default: 60 seconds
>>> #conn_logd_time 60
>>> #
>>> #
>>> #       Configure compression module
>>> #       It could be zlib or bz2, depending on whether you have the corresponding
>>> #       library in the system.
>>> #compression    bz2
>>> #
>>> #       Configure compression threshold
>>> #       This value determines the threshold to compress a message,
>>> #       e.g. if the threshold is 1, then any message with size greater than 1 KB
>>> #       will be compressed, the default is 2 (KB)
>>> #compression_threshold 2
>>>
>>> node ws232.ltsp ws231.ltsp
>>> #bcast bond0
>>> #ucast bond0 157.177.2.31
>>> crm on
>>> ping 192.168.0.1
>>> #ping 157.177.6.210
>>> respawn root /usr/lib64/heartbeat/pingd -m 100 -d 5s
>>> #mcast bond0 224.0.0.1 694 1 0
>>> bcast eth0
>>> bcast eth1
>>>
>>> #!/bin/sh
>>> #
>>> #
>>> #       OCF RA for monitoring Londiste Replay process
>>> #
>>> #               based on:
>>> #
>>> #       Dummy OCF RA. Does nothing but wait a few seconds, can be
>>> #       configured to fail occasionally.
>>> #
>>> # Copyright (c) 2004 SUSE LINUX AG, Lars Marowsky-Brée
>>> #                    All Rights Reserved.
>>> #
>>> # This program is free software; you can redistribute it and/or modify
>>> # it under the terms of version 2 of the GNU General Public License as
>>> # published by the Free Software Foundation.
>>> #
>>> # This program is distributed in the hope that it would be useful, but
>>> # WITHOUT ANY WARRANTY; without even the implied warranty of
>>> # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>>> #
>>> # Further, this software is distributed without any warranty that it is
>>> # free of the rightful claim of any third person regarding infringement
>>> # or the like.  Any license provided herein, whether implied or
>>> # otherwise, applies only to this software file.  Patent licenses, if
>>> # any, provided herein do not apply to combinations of this program with
>>> # other software, or any other product whatsoever.
>>> #
>>> # You should have received a copy of the GNU General Public License
>>> # along with this program; if not, write the Free Software Foundation,
>>> # Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
>>> #
>>>
>>> #######################################################################
>>> # Initialization:
>>>
>>> if [ -f ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs ]
>>> then
>>>        . ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
>>> else
>>>        if [ -f /usr/lib/heartbeat/ocf-shellfuncs ]
>>>        then
>>>                . /usr/lib/heartbeat/ocf-shellfuncs
>>>        else
>>>                if [ -f /usr/lib64/heartbeat/ocf-shellfuncs ]
>>>                then
>>>                        . /usr/lib64/heartbeat/ocf-shellfuncs
>>>                else
>>>                        exit 6   # OCF_ERR_CONFIGURED; the variable is unset because ocf-shellfuncs was not found
>>>                fi
>>>        fi
>>> fi
>>>
>>> #######################################################################
>>>
>>> meta_data() {
>>>        cat <<END
>>> <?xml version="1.0"?>
>>> <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
>>> <resource-agent name="LondisteReplay" version="1.0">
>>> <version>1.0</version>
>>>
>>> <longdesc lang="en">
>>> This is a LondisteReplay Resource Agent. It starts/stops/monitors
>>> londiste.py's replay process.
>>> </longdesc>
>>> <shortdesc lang="en">LondisteReplay resource agent</shortdesc>
>>>
>>> <parameters>
>>> <parameter name="configdir" unique="0">
>>> <longdesc lang="en">
>>> This is where londiste.ini is located.
>>> </longdesc>
>>> <shortdesc lang="en">Configuration directory</shortdesc>
>>> <content type="string" default="/etc/cluster" />
>>> </parameter>
>>>
>>> <parameter name="pidfile" unique="0">
>>> <longdesc lang="en">
>>> This is the pidfile londiste uses to indicate its started state.
>>> </longdesc>
>>> <shortdesc lang="en">Pidfile</shortdesc>
>>> <content type="string" default="/etc/cluster/londiste.pid" />
>>> </parameter>
>>>
>>> </parameters>
>>>
>>>
>>> <actions>
>>> <action name="start"        timeout="90" />
>>> <action name="stop"         timeout="100" />
>>> <action name="monitor"      timeout="20" interval="10" depth="0" start-delay="0" />
>>> <action name="meta-data"    timeout="5" />
>>> <action name="validate-all" timeout="30" />
>>> </actions>
>>> </resource-agent>
>>> END
>>> }
>>>
>>> #######################################################################
>>>
>>> # don't exit on TERM, to test that lrmd makes sure that we do exit
>>> trap sigterm_handler TERM
>>> sigterm_handler() {
>>>        ocf_log info "They use TERM to bring us down. No such luck."
>>>        return
>>> }
>>>
>>> dummy_usage() {
>>>        cat <<END
>>> usage: $0 {start|stop|monitor|validate-all|meta-data}
>>>
>>> Expects to have a fully populated OCF RA-compliant environment set.
>>> END
>>> }
>>>
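>>> dummy_monitor() {
>>>        # Reset PID so a value from an earlier call is never reused.
>>>        PID=""
>>>        if [ -f "$PIDFILE" ]; then
>>>                PID=$(cat "$PIDFILE" 2>/dev/null)
>>>        fi
>>>        if [ -z "$PID" ]; then
>>>                return $OCF_NOT_RUNNING
>>>        fi
>>>        # A pidfile without a matching process means the process died
>>>        # without cleaning up: report a failure, not a clean stop.
>>>        PGQPROCFILE="/proc/$PID/cmdline"
>>>        if [ ! -f "$PGQPROCFILE" ]; then
>>>                return $OCF_ERR_GENERIC
>>>        fi
>>>        # The PID may have been reused by an unrelated process.
>>>        if ! grep -qia londiste "$PGQPROCFILE" 2>/dev/null; then
>>>                return $OCF_ERR_GENERIC
>>>        fi
>>>        return $OCF_SUCCESS
>>> }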
>>>
>>> dummy_start() {
>>>        dummy_monitor
>>>        MONRET=$?
>>>        if [ $MONRET =  $OCF_SUCCESS ]; then
>>>                return $OCF_SUCCESS
>>>        fi
>>>        if [ $MONRET = $OCF_NOT_RUNNING ]; then
>>>                londiste.py $CONFIGDIR/londiste.ini replay &
>>>                return $OCF_SUCCESS
>>>        fi
>>>        return $OCF_ERR_GENERIC
>>> }
>>>
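>>> dummy_stop() {
>>>        dummy_monitor
>>>        if [ $? = $OCF_SUCCESS ]; then
>>>                kill -TERM "$PID"
>>>        else
>>>                # Nothing is running. Remove any stale pidfile so the
>>>                # wait loop below cannot spin until the stop times out.
>>>                rm -f "$PIDFILE"
>>>        fi
>>>        USLEEP="`which usleep 2>/dev/null`"
>>>        while [ -f "$PIDFILE" ]; do
>>>                # An empty $USLEEP would make a bare [ -x ] test succeed.
>>>                if [ -n "$USLEEP" ] && [ -x "$USLEEP" ]; then
>>>                        "$USLEEP" 20
>>>                        continue
>>>                fi
>>>                sleep 1
>>>        done
>>>        return $OCF_SUCCESS
>>> }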
>>>
>>> dummy_validate() {
>>>        exit $OCF_ERR_UNIMPLEMENTED
>>> }
>>>
>>> CONFIGDIR=${OCF_RESKEY_configdir:-/etc/cluster}
>>> PIDFILE=${OCF_RESKEY_pidfile:-/etc/cluster/londiste.pid}
>>>
>>> case $__OCF_ACTION in
>>> meta-data)      meta_data
>>>                exit $OCF_SUCCESS
>>>                ;;
>>> start)          dummy_start
>>>                ;;
>>> stop)           dummy_stop
>>>                ;;
>>> monitor)        dummy_monitor
>>>                ;;
>>> validate-all)   dummy_validate;;
>>> usage|help)     dummy_usage
>>>                exit $OCF_SUCCESS
>>>                ;;
>>> *)              dummy_usage
>>>                exit $OCF_ERR_UNIMPLEMENTED
>>>                ;;
>>> esac
>>> rc=$?
>>> ocf_log debug "${OCF_RESOURCE_INSTANCE} $__OCF_ACTION : $rc"
>>> exit $rc
>>>
>>>
>>> #!/bin/sh
>>> #
>>> #
>>> #       OCF RA for monitoring Londiste Ticker process
>>> #
>>> #               based on:
>>> #
>>> #       Dummy OCF RA. Does nothing but wait a few seconds, can be
>>> #       configured to fail occasionally.
>>> #
>>> # Copyright (c) 2004 SUSE LINUX AG, Lars Marowsky-Brée
>>> #                    All Rights Reserved.
>>> #
>>> # This program is free software; you can redistribute it and/or modify
>>> # it under the terms of version 2 of the GNU General Public License as
>>> # published by the Free Software Foundation.
>>> #
>>> # This program is distributed in the hope that it would be useful, but
>>> # WITHOUT ANY WARRANTY; without even the implied warranty of
>>> # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>>> #
>>> # Further, this software is distributed without any warranty that it is
>>> # free of the rightful claim of any third person regarding infringement
>>> # or the like.  Any license provided herein, whether implied or
>>> # otherwise, applies only to this software file.  Patent licenses, if
>>> # any, provided herein do not apply to combinations of this program with
>>> # other software, or any other product whatsoever.
>>> #
>>> # You should have received a copy of the GNU General Public License
>>> # along with this program; if not, write the Free Software Foundation,
>>> # Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
>>> #
>>>
>>> #######################################################################
>>> # Initialization:
>>>
>>> if [ -f ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs ]
>>> then
>>>        . ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
>>> else
>>>        if [ -f /usr/lib/heartbeat/ocf-shellfuncs ]
>>>        then
>>>                . /usr/lib/heartbeat/ocf-shellfuncs
>>>        else
>>>                if [ -f /usr/lib64/heartbeat/ocf-shellfuncs ]
>>>                then
>>>                        . /usr/lib64/heartbeat/ocf-shellfuncs
>>>                else
>>>                        exit 6   # OCF_ERR_CONFIGURED; the variable is unset because ocf-shellfuncs was not found
>>>                fi
>>>        fi
>>> fi
>>>
>>> #######################################################################
>>>
>>> meta_data() {
>>>        cat <<END
>>> <?xml version="1.0"?>
>>> <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
>>> <resource-agent name="LondisteTicker" version="0.9">
>>> <version>1.0</version>
>>>
>>> <longdesc lang="en">
>>> This is a LondisteTicker Resource Agent. It starts/stops/monitors
>>> pgqadm.py's ticker process.
>>> </longdesc>
>>> <shortdesc lang="en">LondisteTicker resource agent</shortdesc>
>>>
>>> <parameters>
>>> <parameter name="configdir" unique="0">
>>> <longdesc lang="en">
>>> This is where pgqadm.ini is located.
>>> </longdesc>
>>> <shortdesc lang="en">Configuration directory</shortdesc>
>>> <content type="string" default="/etc/cluster" />
>>> </parameter>
>>>
>>> <parameter name="pidfile" unique="0">
>>> <longdesc lang="en">
>>> This is the pidfile PGQADM uses to indicate its started state.
>>> </longdesc>
>>> <shortdesc lang="en">Pidfile</shortdesc>
>>> <content type="string" default="/etc/cluster/pgqadm.pid" />
>>> </parameter>
>>>
>>> </parameters>
>>>
>>> <actions>
>>> <action name="start"        timeout="90" />
>>> <action name="stop"         timeout="100" />
>>> <action name="monitor"      timeout="20" interval="10" depth="0" start-delay="5" />
>>> <action name="meta-data"    timeout="5" />
>>> <action name="validate-all" timeout="30" />
>>> </actions>
>>> </resource-agent>
>>> END
>>> }
>>>
>>> #######################################################################
>>>
>>> # don't exit on TERM, to test that lrmd makes sure that we do exit
>>> trap sigterm_handler TERM
>>> sigterm_handler() {
>>>        ocf_log info "They use TERM to bring us down. No such luck."
>>>        return
>>> }
>>>
>>> dummy_usage() {
>>>        cat <<END
>>> usage: $0 {start|stop|monitor|validate-all|meta-data}
>>>
>>> Expects to have a fully populated OCF RA-compliant environment set.
>>> END
>>> }
>>>
>>> dummy_validate() {
>>>        exit $OCF_ERR_UNIMPLEMENTED
>>> }
>>>
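>>> dummy_monitor() {
>>>        # Reset PID so a value from an earlier call is never reused.
>>>        PID=""
>>>        if [ -f "$PIDFILE" ]; then
>>>                PID=$(cat "$PIDFILE" 2>/dev/null)
>>>        fi
>>>        if [ -z "$PID" ]; then
>>>                return $OCF_NOT_RUNNING
>>>        fi
>>>        # A pidfile without a matching process means the process died
>>>        # without cleaning up: report a failure, not a clean stop.
>>>        PGQPROCFILE="/proc/$PID/cmdline"
>>>        if [ ! -f "$PGQPROCFILE" ]; then
>>>                return $OCF_ERR_GENERIC
>>>        fi
>>>        # The PID may have been reused by an unrelated process.
>>>        if ! grep -qia pgqadm "$PGQPROCFILE" 2>/dev/null; then
>>>                return $OCF_ERR_GENERIC
>>>        fi
>>>        return $OCF_SUCCESS
>>> }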
>>>
>>> dummy_start() {
>>>        dummy_monitor
>>>        MONRET=$?
>>>        if [ $MONRET =  $OCF_SUCCESS ]; then
>>>                return $OCF_SUCCESS
>>>        fi
>>>        if [ $MONRET = $OCF_NOT_RUNNING ]; then
>>>                pgqadm.py $CONFIGDIR/pgqadm.ini ticker &
>>>                return $OCF_SUCCESS
>>>        fi
>>>        return $OCF_ERR_GENERIC
>>> }
>>>
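>>> dummy_stop() {
>>>        dummy_monitor
>>>        if [ $? = $OCF_SUCCESS ]; then
>>>                kill -TERM "$PID"
>>>        else
>>>                # Nothing is running. Remove any stale pidfile so the
>>>                # wait loop below cannot spin until the stop times out.
>>>                rm -f "$PIDFILE"
>>>        fi
>>>        USLEEP="`which usleep 2>/dev/null`"
>>>        while [ -f "$PIDFILE" ]; do
>>>                # An empty $USLEEP would make a bare [ -x ] test succeed.
>>>                if [ -n "$USLEEP" ] && [ -x "$USLEEP" ]; then
>>>                        "$USLEEP" 20
>>>                        continue
>>>                fi
>>>                sleep 1
>>>        done
>>>        return $OCF_SUCCESS
>>> }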
>>>
>>> CONFIGDIR=${OCF_RESKEY_configdir:-/etc/cluster}
>>> PIDFILE=${OCF_RESKEY_pidfile:-/etc/cluster/pgqadm.pid}
>>>
>>> case $__OCF_ACTION in
>>> meta-data)      meta_data
>>>                ;;
>>> start)          dummy_start
>>>                ;;
>>> stop)           dummy_stop
>>>                ;;
>>> monitor)        dummy_monitor
>>>                ;;
>>> validate-all)   dummy_validate;;
>>> usage|help)     dummy_usage
>>>                exit $OCF_SUCCESS
>>>                ;;
>>> *)              dummy_usage
>>>                exit $OCF_ERR_UNIMPLEMENTED
>>>                ;;
>>> esac
>>> rc=$?
>>> ocf_log debug "${OCF_RESOURCE_INSTANCE} $__OCF_ACTION : $rc"
>>> exit $rc
>>>
>>>
>>> #!/bin/sh
>>> #
>>> #
>>> #       SlaveMigration OCF RA. Sets up the slave PostgreSQL pg_hba.conf
>>> #       when migrating to/from lardb04
>>> #
>>> #               based on
>>> #
>>> #       Dummy OCF RA. Does nothing but wait a few seconds, can be
>>> #       configured to fail occasionally.
>>> #
>>> # Copyright (c) 2004 SUSE LINUX AG, Lars Marowsky-Brée
>>> #                    All Rights Reserved.
>>> #
>>> # This program is free software; you can redistribute it and/or modify
>>> # it under the terms of version 2 of the GNU General Public License as
>>> # published by the Free Software Foundation.
>>> #
>>> # This program is distributed in the hope that it would be useful, but
>>> # WITHOUT ANY WARRANTY; without even the implied warranty of
>>> # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>>> #
>>> # Further, this software is distributed without any warranty that it is
>>> # free of the rightful claim of any third person regarding infringement
>>> # or the like.  Any license provided herein, whether implied or
>>> # otherwise, applies only to this software file.  Patent licenses, if
>>> # any, provided herein do not apply to combinations of this program with
>>> # other software, or any other product whatsoever.
>>> #
>>> # You should have received a copy of the GNU General Public License
>>> # along with this program; if not, write the Free Software Foundation,
>>> # Inc., 59 Temple Place - Suite 330, Boston MA 02111-1307, USA.
>>> #
>>>
>>> #######################################################################
>>> # Initialization:
>>>
>>> if [ -f ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs ]
>>> then
>>>        . ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
>>> else
>>>        if [ -f /usr/lib/heartbeat/ocf-shellfuncs ]
>>>        then
>>>                . /usr/lib/heartbeat/ocf-shellfuncs
>>>        else
>>>                if [ -f /usr/lib64/heartbeat/ocf-shellfuncs ]
>>>                then
>>>                        . /usr/lib64/heartbeat/ocf-shellfuncs
>>>                else
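>>>                        # ocf-shellfuncs was not found, so $OCF_ERR_CONFIGURED is
>>>                        # undefined here; use its numeric value (6) directly.
>>>                        exit 6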
>>>                fi
>>>        fi
>>> fi
>>>
>>> #. ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
>>> #. /usr/lib64/heartbeat/ocf-shellfuncs
>>> #. /usr/share/ocf/resource.d/heartbeat/.ocf-shellfuncs
>>>
>>> #######################################################################
>>>
>>> meta_data() {
>>>        cat <<END
>>> <?xml version="1.0"?>
>>> <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
>>> <resource-agent name="SlaveMigration" version="0.9">
>>> <version>1.0</version>
>>>
>>> <longdesc lang="en">
>>> This is the SlaveMigration Resource Agent. It sets up the slave PostgreSQL
>>> authentication so pgqadm/londiste cannot accidentally start replicating from
>>> the master PostgreSQL server.
>>> </longdesc>
>>> <shortdesc lang="en">SlaveMigration resource agent</shortdesc>
>>>
>>> <parameters>
>>>
>>> <parameter name="masterip" unique="0" required="1">
>>> <longdesc lang="en">
>>> This is the space-separated list of IPs the master server lives at.
>>> </longdesc>
>>> <shortdesc lang="en">IP addresses of the master </shortdesc>
>>> <content type="string" default="" />
>>> </parameter>
>>>
>>> <parameter name="masterhostname" unique="0" required="0">
>>> <longdesc lang="en">
>>> This is the short form of the hostname of the master server
>>> </longdesc>
>>> <shortdesc lang="en">Master short hostname</shortdesc>
>>> <content type="string" default="" />
>>> </parameter>
>>>
>>> <parameter name="slavehostname" unique="0" required="1">
>>> <longdesc lang="en">
>>> This is the short form of the hostname of the slave server
>>> </longdesc>
>>> <shortdesc lang="en">Slave short hostname</shortdesc>
>>> <content type="string" default="" />
>>> </parameter>
>>>
>>> <parameter name="psql" unique="0" required="0">
>>> <longdesc lang="en">
>>> Path to psql command.
>>> </longdesc>
>>> <shortdesc lang="en">psql</shortdesc>
>>> <content type="string" default="/usr/bin/psql" />
>>> </parameter>
>>>
>>> <parameter name="pgport" unique="0">
>>> <longdesc lang="en">
>>> This is the port PostgreSQL listens on.
>>> </longdesc>
>>> <shortdesc lang="en">PostgreSQL service port</shortdesc>
>>> <content type="string" default="" />
>>> </parameter>
>>>
>>> <parameter name="pgdata" unique="0" required="1">
>>> <longdesc lang="en">
>>> Path to PostgreSQL data directory.
>>> </longdesc>
>>> <shortdesc lang="en">pgdata</shortdesc>
>>> <content type="string" default="/var/lib/pgsql/data" />
>>> </parameter>
>>>
>>> <parameter name="pghba_ok" unique="0" required="1">
>>> <longdesc lang="en">
>>> Path to normal pg_hba.conf
>>> </longdesc>
>>> <shortdesc lang="en">pghba_ok</shortdesc>
>>> <content type="string" default="" />
>>> </parameter>
>>>
>>> <parameter name="pghba_failed" unique="0" required="1">
>>> <longdesc lang="en">
>>> Path to the pg_hba.conf used in the fenced case to disallow connections from the master server.
>>> </longdesc>
>>> <shortdesc lang="en">pghba_failed</shortdesc>
>>> <content type="string" default="" />
>>> </parameter>
>>>
>>> </parameters>
>>>
>>> <actions>
>>> <action name="start"        timeout="90" />
>>> <action name="stop"         timeout="100" />
>>> <action name="monitor"      timeout="20" interval="10" depth="0" start-delay="1" />
>>> <action name="reload"       timeout="90" />
>>> <action name="migrate_to"   timeout="100" />
>>> <action name="migrate_from" timeout="90" />
>>> <action name="meta-data"    timeout="5" />
>>> <action name="validate-all" timeout="30" />
>>> </actions>
>>> </resource-agent>
>>> END
>>> }
>>>
>>> #######################################################################
>>>
>>> # don't exit on TERM, to test that lrmd makes sure that we do exit
>>> trap sigterm_handler TERM
>>> sigterm_handler() {
>>>        ocf_log info "They use TERM to bring us down. No such luck."
>>>        return
>>> }
>>>
>>> dummy_usage() {
>>>        cat <<END
>>> usage: $0 {start|stop|monitor|migrate_to|migrate_from|validate-all|meta-data}
>>>
>>> Expects to have a fully populated OCF RA-compliant environment set.
>>> END
>>> }
>>>
>>> dummy_validate() {
>>>        return $OCF_ERR_UNIMPLEMENTED
>>> }
>>>
>>> dummy_monitor() {
>>>        if [ -f /var/lock/subsys/SlaveMigration ]
>>>        then
>>>                return $OCF_SUCCESS
>>>        fi
>>>        return $OCF_NOT_RUNNING
>>> }
>>>
>>>
>>> slave_start() {
>>>        dummy_monitor
>>>        if [ $? = $OCF_SUCCESS ]; then
>>>                return $OCF_SUCCESS
>>>        fi
>>>
>>>        touch /var/lock/subsys/SlaveMigration
>>>
>>>        check_for_slavehostname
>>>        check_for_masterip
>>>
>>>        if [ "`hostname -s`" = "$OCF_RESKEY_slavehostname" ]
>>>        then
>>>                check_for_pg_hbas
>>>                ln -sf "$OCF_RESKEY_pghba_failed" "${OCF_RESKEY_pgdata}/pg_hba.conf"
>>>                reload_pg_conf
>>>                slave_kill_pg_from_master
>>>        fi
>>>
>>>        return $OCF_SUCCESS
>>> }
>>>
>>> slave_stop() {
>>>        dummy_monitor
>>>        if [ $? = $OCF_NOT_RUNNING ]; then
>>>                return $OCF_SUCCESS
>>>        fi
>>>
>>>        check_for_slavehostname
>>>
>>>        if [ "`hostname -s`" = "$OCF_RESKEY_slavehostname" ]
>>>        then
>>>                check_for_pg_hbas
>>>                ln -sf "$OCF_RESKEY_pghba_ok" "${OCF_RESKEY_pgdata}/pg_hba.conf"
>>>                reload_pg_conf
>>>        fi
>>>
>>>        rm -f /var/lock/subsys/SlaveMigration
>>>        return $OCF_SUCCESS
>>> }
>>>
>>> slave_kill_pg_from_master() {
>>>        PGPARAM="-A -t -U postgres -h localhost"
>>>        if [ "$OCF_RESKEY_pgport" != "" ]
>>>        then
>>>                PGPARAM="$PGPARAM -p $OCF_RESKEY_pgport"
>>>        fi
>>>        for ip in $OCF_RESKEY_masterip ; do
>>> #               echo -- $PGPARAM -c "\"select procpid from pg_stat_activity where client_addr='"${ip}"'\""
>>>                $OCF_RESKEY_psql $PGPARAM -c "select procpid from pg_stat_activity where client_addr='"${ip}"'" | \
>>>                while read pid ; do
>>>                        kill -TERM $pid
>>>                done
>>>        done
>>> }
>>>
>>> reload_pg_conf() {
>>>        PGPARAM="-U postgres -h localhost"
>>>        if [ "$OCF_RESKEY_pgport" != "" ]
>>>        then
>>>                PGPARAM="$PGPARAM -p $OCF_RESKEY_pgport"
>>>        fi
>>>        $OCF_RESKEY_psql $PGPARAM -c "select pg_reload_conf()" 1>/dev/null 2>/dev/null
>>> }
>>>
>>> check_for_slavehostname() {
>>>        if [ "$OCF_RESKEY_slavehostname" = "" ]
>>>        then
>>>                ocf_log debug "${OCF_RESOURCE_INSTANCE} No slave hostname given"
>>>                exit $OCF_ERR_GENERIC
>>>        fi
>>> }
>>>
>>> check_for_masterhostname() {
>>>        if [ "$OCF_RESKEY_masterhostname" = "" ]
>>>        then
>>>                ocf_log debug "${OCF_RESOURCE_INSTANCE} No master hostname given"
>>>                exit $OCF_ERR_GENERIC
>>>        fi
>>> }
>>>
>>> check_for_masterip() {
>>>        if [ "$OCF_RESKEY_masterip" = "" ]
>>>        then
>>>                ocf_log debug "${OCF_RESOURCE_INSTANCE} No master IP given"
>>>                exit $OCF_ERR_GENERIC
>>>        fi
>>> }
>>>
>>> check_for_pg_hbas() {
>>>        if [ "$OCF_RESKEY_pghba_ok" = "" ]
>>>        then
>>>                echo OCF_RESKEY_pghba_ok
>>>                ocf_log debug "${OCF_RESOURCE_INSTANCE} No normal pg_hba.conf given"
>>>                exit $OCF_ERR_GENERIC
>>>        fi
>>>        if [ ! -f "$OCF_RESKEY_pghba_ok" ]
>>>        then
>>>                echo -- -f OCF_RESKEY_pghba_ok
>>>                ocf_log debug "${OCF_RESOURCE_INSTANCE} pg_hba.conf file for normal operation does not exist"
>>>                exit $OCF_ERR_GENERIC
>>>        fi
>>>        if [ "$OCF_RESKEY_pghba_failed" = "" ]
>>>        then
>>>                echo OCF_RESKEY_pghba_failed
>>>                ocf_log debug "${OCF_RESOURCE_INSTANCE} No failed pg_hba.conf given"
>>>                exit $OCF_ERR_GENERIC
>>>        fi
>>>        if [ ! -f "$OCF_RESKEY_pghba_failed" ]
>>>        then
>>>                echo -- -f OCF_RESKEY_pghba_failed
>>>                ocf_log debug "${OCF_RESOURCE_INSTANCE} pg_hba.conf file for fenced operation does not exist"
>>>                exit $OCF_ERR_GENERIC
>>>        fi
>>>        if [ ! -d "$OCF_RESKEY_pgdata" ]
>>>        then
>>>                echo -- -d OCF_RESKEY_pgdata
>>>                ocf_log debug "${OCF_RESOURCE_INSTANCE} PGDATA directory does not exist"
>>>                exit $OCF_ERR_GENERIC
>>>        fi
>>> }
>>>
>>> : ${OCF_RESKEY_psql=/usr/bin/psql}
>>>
>>> case $__OCF_ACTION in
>>> meta-data)      meta_data
>>>                exit $OCF_SUCCESS
>>>                ;;
>>> start)          slave_start
>>>                ;;
>>> stop)           slave_stop
>>>                ;;
>>> monitor)        dummy_monitor
>>>                ;;
>>> migrate_to)     ocf_log info "Migrating ${OCF_RESOURCE_INSTANCE} to ${OCF_RESKEY_CRM_meta_migrate_to}."
>>>                slave_stop
>>>                ;;
>>> migrate_from)   ocf_log info "Migrating ${OCF_RESOURCE_INSTANCE} from ${OCF_RESKEY_CRM_meta_migrated_from}."
>>>                slave_start
>>>                ;;
>>> reload)         ocf_log err "Reloading..."
>>>                slave_start
>>>                ;;
>>> validate-all)   dummy_validate;;
>>> usage|help)     dummy_usage
>>>                exit $OCF_SUCCESS
>>>                ;;
>>> *)              dummy_usage
>>>                exit $OCF_ERR_UNIMPLEMENTED
>>>                ;;
>>> esac
>>> rc=$?
>>> ocf_log debug "${OCF_RESOURCE_INSTANCE} $__OCF_ACTION : $rc"
>>> exit $rc
>>>
>>>
>>> _______________________________________________
>>> Linux-HA mailing list
>>> Linux-HA at lists.linux-ha.org
>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha
>>> See also: http://linux-ha.org/ReportingProblems
>>>
>>>
>>
>>
>>
>>
>
>
> -- 
> ----------------------------------
> Zoltán Böszörményi
> Cybertec Schönig & Schönig GmbH
> http://www.postgresql.at/
>
>

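For the stale-pidfile case discussed above, a monitor along these lines might be a starting point. This is only a minimal sketch, not taken from the pgsql RA or from the quoted script: pidfile_monitor is a hypothetical helper name, and the return codes are defaulted to the standard OCF numeric values so the snippet runs without ocf-shellfuncs. It assumes the agent runs as a user allowed to signal the daemon (normally root). No pidfile is reported as OCF_NOT_RUNNING (clean stop), a leftover pidfile whose process is gone as OCF_ERR_GENERIC (failed), anything else as OCF_SUCCESS.

```shell
#!/bin/sh
# OCF return codes; normally supplied by ocf-shellfuncs. Defaulted here
# (assumption) so the sketch can run on its own.
: "${OCF_SUCCESS:=0}"
: "${OCF_ERR_GENERIC:=1}"
: "${OCF_NOT_RUNNING:=7}"

# pidfile_monitor PIDFILE
# Hypothetical helper, not part of any shipped resource agent.
pidfile_monitor() {
    pidfile=$1

    # No pidfile: the daemon exited cleanly or was never started.
    [ -f "$pidfile" ] || return "$OCF_NOT_RUNNING"

    pid=$(cat "$pidfile" 2>/dev/null)

    # Empty or non-numeric content: treat the resource as failed.
    case $pid in
        ''|*[!0-9]*) return "$OCF_ERR_GENERIC" ;;
    esac

    # kill -0 delivers no signal; it only checks that the PID exists
    # and that we are allowed to signal it.
    if kill -0 "$pid" 2>/dev/null; then
        return "$OCF_SUCCESS"
    fi

    # Pidfile left behind but the process is gone: failed, not stopped.
    return "$OCF_ERR_GENERIC"
}
```

With this kind of monitor, the stop action still has to return success when the process is already gone (removing the stale pidfile on the way), in the same way slave_stop above returns OCF_SUCCESS when monitor reports OCF_NOT_RUNNING. Otherwise the stop that follows the failed monitor can itself fail and leave the resource unmanaged.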