Puzzled with this Up/Down story

Posted 12/16/2004 11:19:22 AM


Time Traveler


Group: WhatsUp Gold Expert
Last Login: Today @ 12:33:43 PM
Posts: 1,437, Visits: 3,945
Hi,

I am really puzzled by this Up/Down story, that is, the fact that there is no "trigger" in WUP like there was in WUG.

Something I really DON'T understand, and I would really appreciate an explanation from Ipswitch:

There is normally a polling cycle, and by default it runs every 60 seconds. But with my WUP test installation, it seems that as soon as I connect or disconnect a device, WUP sees it and triggers a notification.

Basically I am trying to find a workaround for "up" alarms arriving without a preceding "down", and what I did was:

* Create a "myping" monitor
* Set the timeout on this monitor to 30 seconds
* Set the retries to 4

My idea is that the polling engine will wait up to 30 × 4 = 120 seconds before declaring the device down, thus avoiding "false downs" (due, for instance, to a lazy device)... and of course "false ups".

But it doesn't work! I added a test icon and set it to poll by DNS name. Just to test, I changed the real DNS name to a fake one. Less than 2 seconds after the change I got the "up" and the "down"!

Besides, I don't know if setting such a high timeout on the monitor is a good idea (if WUP polling is asynchronous it will poll the next device while waiting for an answer, but if it is not... then polling cycles could get very long).

Somebody help, please!
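The arithmetic behind that idea can be sketched as follows. This is only an illustration, assuming each attempt waits the full timeout and attempts run back to back; the function name is made up for the example and says nothing about how WUP actually schedules retries.

```python
def worst_case_detection_delay(timeout_s, attempts):
    """Upper bound, in seconds, before an unreachable device is declared down.

    Assumption (not confirmed for WUP): every attempt waits the full
    timeout and the attempts follow each other immediately.
    """
    if timeout_s <= 0 or attempts < 1:
        raise ValueError("timeout must be positive and attempts at least 1")
    return timeout_s * attempts


# The setting described above: 30-second timeout, 4 attempts.
print(worst_case_detection_delay(30, 4))
# A shorter alternative: 15-second timeout, 2 attempts.
print(worst_case_detection_delay(15, 2))
```

Under these assumptions, any outage shorter than the returned value would never be reported as down.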


Reading, writing and arithmetic - If you need to choose, please take option 1.
Post #3329
Posted 12/16/2004 11:27:58 AM


Time Traveler


Group: WhatsUp Gold Expert
Last Login: Today @ 12:33:43 PM
Posts: 1,437, Visits: 3,945
Folks,

A quick update on my previous post. It seems changing the DNS name was not a relevant test (maybe WUP uses the Windows DNS cache and so gets the lookup result immediately).

If I really unplug the device from the network, I do get the result I want.

It remains to be seen whether putting a 30-second timeout and 4 retries on a monitor will be a serious problem. I can believe it will be as soon as you use dependencies...

For my own current use maybe I could lower this to 15 seconds and 2 retries, but I don't think that covers all cases.

Ipswitch input appreciated here! This really concerns the inner workings of the poll engine, so I guess they can answer best!


Post #3331
Posted 12/16/2004 6:12:31 PM


Time Traveler


Group: Administrators
Last Login: 10/10/2008 4:43:07 PM
Posts: 359, Visits: 293

Sergio,

1. The polling engine will go as low as 30 seconds. 

2. What is different for triggers in this release is that you can go into Configure | Program Options and choose Device States, and you can define your device state based on the number of minutes you want.

When you assign an alert (or Action), you choose the state in which you want the action processed.  Let's say you defined a new 'device state', or edited an existing one, to 15 minutes.  Then when assigning the Action (alert) you choose the 15-minute device state.  This takes care of the trigger value: we have combined the trigger value and the device states into a single setting.

3. The UP alert is a different issue.  This is broken and we will address this in Service Pack 1.   Currently the Up alert is not tied to a preceding Down alert.
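To make the idea of "state plus action" concrete, here is a small sketch. It is only a model of the concept described above, not WhatsUp's configuration format or API; the class names, fields, and the `actions_due` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceState:
    """Hypothetical model of a device state defined by minutes spent down."""
    name: str
    down_minutes: int


@dataclass(frozen=True)
class Action:
    """Hypothetical model of an action (alert) bound to one device state."""
    name: str
    trigger_state: DeviceState


def actions_due(actions, minutes_down_before, minutes_down_now):
    """Return the actions whose state threshold was crossed between two checks."""
    return [
        action
        for action in actions
        if minutes_down_before < action.trigger_state.down_minutes <= minutes_down_now
    ]


down_15 = DeviceState(name="Down at least 15 min", down_minutes=15)
email_alert = Action(name="email the on-call engineer", trigger_state=down_15)

# The action fires once, when the device passes from 14 to 15 minutes down.
for action in actions_due([email_alert], 14, 15):
    print(action.name)
```

In this model the state's minute threshold plays the role the separate trigger value used to play.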



Mark Singh
Ipswitch, Inc.
Post #3360
Posted 12/17/2004 11:38:26 AM


Time Traveler


Group: WhatsUp Gold Expert
Last Login: Today @ 12:33:43 PM
Posts: 1,437, Visits: 3,945
Hi Mark,

I think I didn't explain my problem correctly. I WAS talking about the Up/Down issue (Up not being linked to Down). Since I really HAVE to use WUP now, I was looking for a workaround for this problem.

The idea is the following:

* The device goes down, but it's a very temporary condition lasting 1 minute (let's say the device is next to a slow WAN link and a ping gets lost from time to time).

* Because I don't want to be flooded with alarms, I configured WUP to raise an alarm only if the device is down for 2 minutes.

* In this case I don't get the down but I do get the up... which is EXACTLY the issue we're talking about.

My idea was that, by putting a higher timeout/retry count on my ping monitor, I would PREVENT WUP from seeing the device as down... unless, after 4 × 30 seconds, the device REALLY didn't answer of course, but in that case I DO want a down and then an up notification!

So the question remains: what is the nasty side effect, if any, of putting a 30-second timeout and 4 retries on a ping monitor? Will the engine "wait" for 2 minutes before polling the next device, or will it somehow "queue" this request, go on to the next device, and handle the previous request when the answer comes back?
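The scenario in the three points above can be reproduced with a toy simulation. This is a sketch under simplifying assumptions (one poll per minute, a down alarm after a fixed number of consecutive missed polls); the `alerts` function and its parameters are invented for illustration and do not describe WUP's internals.

```python
def alerts(poll_results, down_after, gate_up_on_down):
    """Return (minute, "DOWN" or "UP") events for a list of per-minute poll results.

    poll_results: list of bools, True when the device answered that minute.
    down_after: consecutive missed polls needed before a DOWN alarm is sent.
    gate_up_on_down: when True, UP is only sent if DOWN was sent first.
    """
    events = []
    consecutive_missed = 0
    down_alert_sent = False
    for minute, answered in enumerate(poll_results):
        if not answered:
            consecutive_missed += 1
            if consecutive_missed == down_after:
                events.append((minute, "DOWN"))
                down_alert_sent = True
        else:
            recovering = consecutive_missed > 0
            if recovering and (down_alert_sent or not gate_up_on_down):
                events.append((minute, "UP"))
            consecutive_missed = 0
            down_alert_sent = False
    return events


one_minute_blip = [True, False, True]

# Ungated: the blip produces an UP with no DOWN before it.
print(alerts(one_minute_blip, down_after=2, gate_up_on_down=False))
# Gated: the blip produces no alarm at all.
print(alerts(one_minute_blip, down_after=2, gate_up_on_down=True))
```

A real outage of two minutes or more yields a DOWN followed by an UP in both modes, which is the behaviour wanted here.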

Thanks,

Sergio


Post #3383
Posted 12/17/2004 12:35:09 PM


Time Traveler


Group: Administrators
Last Login: 10/10/2008 4:43:07 PM
Posts: 359, Visits: 293

Sergio,

With that setting, the effect is that you delay the status change of the device if it's actually down, just as you suspected.

It will continue to monitor other devices, but on the ones where a missed poll was detected, it will wait for 30 seconds and then retry that device according to the number of retries you specified.
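For readers who want to picture this, here is a minimal sketch of non-blocking polling with per-device retries, written with Python's asyncio. It is an illustration only, not the WhatsUp polling engine: `fake_probe` is a hypothetical stand-in for a real ping, and the short timeouts are chosen so the example finishes quickly.

```python
import asyncio


async def fake_probe(host):
    # Hypothetical stand-in for a ping: hosts named "dead-..." never answer in time.
    await asyncio.sleep(10 if host.startswith("dead") else 0.01)


async def poll_with_retries(probe, host, timeout_s, retries):
    """Try once, then retry up to `retries` times; each try waits at most timeout_s."""
    for _ in range(1 + retries):
        try:
            await asyncio.wait_for(probe(host), timeout=timeout_s)
            return True  # the device answered
        except asyncio.TimeoutError:
            continue  # missed poll: try this device again
    return False  # every attempt timed out


async def poll_all(probe, hosts, timeout_s, retries):
    """Poll all hosts concurrently, so a slow host does not hold up the others."""
    results = await asyncio.gather(
        *(poll_with_retries(probe, host, timeout_s, retries) for host in hosts)
    )
    return dict(zip(hosts, results))
```

Calling `asyncio.run(poll_all(fake_probe, ["ok-1", "dead-1", "ok-2"], timeout_s=0.05, retries=2))` reports the two healthy hosts as up without waiting for the dead one to exhaust its retries first; only the final result for the dead host is delayed.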



Mark Singh
Ipswitch, Inc.
Post #3387
Posted 12/17/2004 1:01:39 PM


Time Traveler


Group: WhatsUp Gold Expert
Last Login: Today @ 12:33:43 PM
Posts: 1,437, Visits: 3,945
Thanks Mark. I guess this answers my question. I will play with that a little and let you know if I see any big drawback in the polling sequence.

Post #3390
Posted 12/17/2004 1:12:11 PM


Time Traveler


Group: Administrators
Last Login: 10/10/2008 4:43:07 PM
Posts: 359, Visits: 293
That is cool, Sergio.  It is a way around the instant UP for a missed packet, so if you are satisfied with that configuration and do not see too many drawbacks, you may want to post it to the email forum for others to see.

Mark Singh
Ipswitch, Inc.
Post #3391
Posted 12/22/2004 9:59:09 AM


Forum Member


Group: Forum Members
Last Login: 10/5/2006 10:23:00 AM
Posts: 35, Visits: 1

Mark,

For #3, where you mention that the UP is a bug that will be fixed in SP1, are you referring to getting UP messages without a device first going into a DOWN state?  I've been getting that a lot where we are alerting DOWN at 5 minutes and UP at 2 minutes.  It seems that if a device misses a few pings, but not enough to reach the 5-minute down state, and then answers pings again, we get UP messages with no DOWNs.  Is that the bug?  When is SP1 estimated to be out?

Thanks!

al



Senior Network Engineer

Fox Chase Cancer Center

Post #3482
Posted 12/30/2004 9:17:28 AM


Time Traveler
