Agents running in a clustered environment

Here at Admin, I gave a session on "troubleshooting agents for
administrators". At the end of the session, I was asked how to deal with
applications that have scheduled agents running on clustered servers. The
scenario is as follows:

You have an application on both servers, and scheduled agents exist on both
servers. You want the agent to run only once (i.e., not on both servers),
BUT you also want the agent to be able to execute on the other server if
needed (i.e., if the primary server goes down). There is a View article entitled
  "Failover Support for Background Agents on Clustered
Domino Servers — A Solution for Your Lotus Workflow and Domino Applications"

The basic concept is to use a token
that grants a server permission to run agents.  Agents are enabled
to run on any server, but only one server at a time will have the token.
 The flow chart in Figure 1 delineates the code logic.  Here’s
how the Agent Failover Support solution works:

1. When an agent is triggered in your application, code that you have added
   to the application (provided in the download that accompanies the article)
   checks to see whether Agent Failover Support has been enabled (by means
   of a checkbox on an administrative setup document also added to your
   application).

2. If agent failover is enabled, the code next checks to see whether the
   server on which the agent has been triggered (let’s call it the Current
   server) is listed in the ClusterServers field in the administrative setup
   document. This field provides a means to restrict agents to certain
   servers in the cluster. If the server is not listed there, the execution
   of the agent is terminated.

3. Now the server checks to see whether there is a server in the cluster
   that possesses the token to run agents (let’s call it the Token server).
   If there is a Token server, execution proceeds to step 4; if there is no
   Token server, execution proceeds to step 6.

4. A Token server exists, so the Current server checks to see whether it is
   the Token server. If it is, the Current server runs the agent. If the
   Current server is not the Token server, execution proceeds to step 5.

5. The Current server is not the Token server, so the Current server checks
   to see whether the Token server is timed out. If the answer is yes (the
   Token server is timed out), proceed to step 6. If the answer is no,
   execution of the agent on the Current server is terminated.

6. If there is no Token server, or the Token server is timed out, the
   Current server claims the token and runs the agent.
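
To make the decision logic above easier to follow, here is a minimal sketch in Python. It is not the code from the article's download and does not use any Domino API. The `SetupDocument` class, its field names, the 30-minute timeout, and the server names are all hypothetical stand-ins for the administrative setup document. The source does not say what happens when failover support is switched off, so the sketch assumes the agent simply runs as normal, and it also assumes the token holder refreshes its timestamp on every run.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class SetupDocument:
    """Hypothetical in-memory stand-in for the administrative setup document."""
    failover_enabled: bool
    cluster_servers: List[str] = field(default_factory=list)
    token_server: Optional[str] = None          # server currently holding the token
    token_refreshed: Optional[datetime] = None  # last time the holder ran an agent
    token_timeout: timedelta = timedelta(minutes=30)  # assumed value

    def token_timed_out(self, now: datetime) -> bool:
        # A token with no timestamp is treated as timed out (assumption).
        if self.token_refreshed is None:
            return True
        return now - self.token_refreshed > self.token_timeout


def should_run_agent(setup: SetupDocument, current_server: str, now: datetime) -> bool:
    """Return True if the agent should run on current_server."""
    # Failover support disabled: assumed to mean "run as normal".
    if not setup.failover_enabled:
        return True
    # Servers not listed in ClusterServers never run the agent.
    if current_server not in setup.cluster_servers:
        return False
    if setup.token_server is not None:
        # The Token server itself runs the agent and refreshes its timestamp.
        if setup.token_server == current_server:
            setup.token_refreshed = now
            return True
        # Another server holds a token that is still valid: stop here.
        if not setup.token_timed_out(now):
            return False
    # No Token server, or it timed out: claim the token and run.
    setup.token_server = current_server
    setup.token_refreshed = now
    return True
```

With two hypothetical servers, the first one to trigger claims the token, the second is turned away while the token is fresh, and the second takes over once the first has been silent for longer than the timeout.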


[Figure 1: flow chart of the agent failover logic, image not reproduced]


  1. Greg Walrath Said,

    June 5, 2007 @ 4:13 pm

    Paul – do you know which issue of The View that this is in?

  2. Christian Henseler Said,

    June 5, 2007 @ 4:44 pm

    It’s May/June 2002, Volume 8, Issue 3, by Luciano Resende

  3. Kevin Pettitt Said,

    June 5, 2007 @ 5:24 pm

    That’s a great tip. Years ago I came across a similar technique that I’ve since adopted that allows non-designers to adjust agent execution times.

    Instead of setting a daily schedule for an agent, you set it to hourly, then use a configuration document to specify which hour (or hours) it should actually run. The agent kicks off every hour like you’d expect, but has to get past the “ShouldIRunNow” logic, which compares the current hour to the config document. If it shouldn’t run, the agent terminates. Another advantage of this approach is that you can schedule multiple runs of the agent throughout the day which may not be evenly spaced.

    The downside is that if you use any of the tools I discuss here { Link } to audit all your server agents, the schedule information listed will be misleading.
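
The hourly gate Kevin describes in comment 3 could look roughly like the Python sketch below. The function name follows his “ShouldIRunNow” label, but the signature and the idea of storing the allowed hours as a list of integers (0 to 23) are assumptions for illustration, not his actual implementation.

```python
from datetime import datetime
from typing import Iterable


def should_i_run_now(run_hours: Iterable[int], now: datetime) -> bool:
    """Return True if the current hour is one of the configured run hours.

    run_hours stands in for the value read from the configuration document
    (hypothetical format: hours of the day, 0-23).
    """
    return now.hour in set(run_hours)


def hourly_agent(run_hours: Iterable[int], now: datetime) -> str:
    # The agent is scheduled hourly but exits early outside the configured hours.
    if not should_i_run_now(run_hours, now):
        return "skipped"
    return "ran"
```

Because the hours are just a list, unevenly spaced runs such as 2, 9, and 17 need no change to the agent's schedule, only to the configuration value.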
