| | | 
Time Traveler
       
Group: WhatsUp Gold Expert Last Login: Today @ 12:33:43 PM Posts: 1,437, Visits: 3,945 |
| Hi,
I am really puzzled by this Up/Down story, and by the fact that there is no "trigger" in WUP like there was in WUG.
There is something I really DON'T understand, and I would really appreciate an explanation from Ipswitch:
There is normally a polling cycle, and by default it runs every 60 seconds. But with my WUP test installation, it seems that as soon as I connect or disconnect a device, WUP sees it and triggers a notification.
Basically I am trying to find a workaround for "up" without "down" alarms, and what I did was:
* Create a "myping" monitor
* Set the timeout on this monitor to 30 seconds
* Set the retries to 4
My idea is that the polling engine will wait up to 30*4 seconds (2 minutes) before declaring the device down, thus avoiding "false downs" (due, for instance, to a device that is slow to answer)... and of course "false ups".
But it doesn't work! I added a test icon and set it to poll by DNS name. Just to test, I changed the real DNS name to a fake one. Less than 2 seconds after the change I got both the "up" and the "down"!
Besides, I don't know if setting such a high timeout on the monitor is a good idea (if WUP polling is asynchronous, it will poll the next device while waiting for an answer, but if it is not, polling cycles could get very long).
Somebody help, please!
Reading, writing and arithmetic - If you need to choose, please take option 1. |
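Editor's note: the "30*4 seconds" figure in the post above is a worst-case bound, which can be written down as a one-line calculation. This is only a sketch of the arithmetic, not of WUP's poll engine. It assumes that "retries = 4" means four attempts in total and that attempts run back to back, each waiting for the full timeout; the function name is hypothetical.

```python
def worst_case_detection_delay(timeout_s: float, attempts: int) -> float:
    """Upper bound on the time before a monitor can declare a device down.

    Assumption (not confirmed by the thread): every attempt waits for the
    full timeout and the attempts follow each other with no extra pause.
    """
    if timeout_s <= 0 or attempts < 1:
        raise ValueError("timeout must be positive and attempts at least 1")
    return timeout_s * attempts


if __name__ == "__main__":
    # Settings from the post: 30-second timeout, 4 attempts.
    print(worst_case_detection_delay(30, 4))
    # The lower setting considered later in the thread: 15 seconds, 2 attempts.
    print(worst_case_detection_delay(15, 2))
```

Under these assumptions the first setting hides any outage shorter than 120 seconds, and the second hides outages shorter than 30 seconds.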
| | | | 
Time Traveler
       
Group: WhatsUp Gold Expert Last Login: Today @ 12:33:43 PM Posts: 1,437, Visits: 3,945 |
| Folks,
A quick update on my previous post. It seems changing the DNS name was not a relevant test (maybe WUP uses the Windows DNS cache to look up the name immediately).
If I actually unplug the device from the network, I do get the result I want.
What remains to be seen is whether putting a 30-second timeout and 4 retries on a monitor will be a serious problem. I can believe it will be as soon as you use dependencies...
For my own current use maybe I could lower this to 15 seconds and 2 retries, but I don't think that covers all cases.
Ipswitch input appreciated here! This really concerns the inner workings of the poll engine, so I guess they can answer best!
Reading, writing and arithmetic - If you need to choose, please take option 1. |
| | | | 
Time Traveler
       
Group: Administrators Last Login: 10/10/2008 4:43:07 PM Posts: 359, Visits: 293 |
| Sergio,
1. The polling engine will go as low as 30 seconds.
2. What is different for triggers in this release is that you can go into Configure | Program Options, choose Device States, and define a device state based on the number of minutes you want. When you assign an alert (or Action), you choose the state in which you want the action processed. Let's say you defined a new 'device state', or edited an existing one, to 15 minutes. Then, when assigning the Action (alert), you choose the 15-minute device state. This takes care of the trigger value: we have combined the trigger value and the device states into a single setting.
3. The UP alert is a different issue. This is broken and we will address it in Service Pack 1. Currently the UP alert is not tied to a preceding Down alert.
Mark Singh Ipswitch, Inc. |
| | | | 
Time Traveler
       
Group: WhatsUp Gold Expert Last Login: Today @ 12:33:43 PM Posts: 1,437, Visits: 3,945 |
| Hi Mark,
I think I didn't explain my problem correctly. I WAS talking about the Up/Down issue (Up not being linked with Down). Since I really HAVE to use WUP now, I was looking for a workaround to this problem.
The idea is the following:
* The device goes down, but it's a very temporary condition (let's say the device is next to a slow WAN link and a ping gets lost from time to time) that lasts 1 minute.
* Because I don't want to be flooded with alarms, I configured WUP to raise an alarm only if the device is down for 2 minutes.
* In this case I don't get the down but I do get the up... which is EXACTLY the issue we're talking about.
My idea was that, by putting a higher timeout and retry count on my ping monitor, I would PREVENT WUP from seeing the device as down... unless, after 4*30 seconds, the device REALLY didn't answer of course, but in that case I DO want a down and then an up notification!
So the question remains: what is the nasty side effect, if any, of putting a 30-second timeout and 4 retries on a ping monitor? Will the engine "wait" for 2 minutes before polling the next device, or will it somehow "queue" this request, go on to the next device, and handle the previous request when the answer comes back?
Thanks,
Sergio
Reading, writing and arithmetic - If you need to choose, please take option 1. |
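Editor's note: the scenario in the post above (a one-minute outage against a two-minute Down threshold) can be reproduced with a small state machine. This is a hypothetical model written for illustration, not WUP's actual alerting logic. The class, its fields, and the 30-second poll interval in the sample data are all assumptions. The `link_up_to_down` flag models the behaviour the posters are asking for: send Up only if a Down was sent first.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Notifier:
    down_after_s: int       # send DOWN only after this much continuous downtime
    link_up_to_down: bool   # True: send UP only if a DOWN was sent first
    down_since: Optional[int] = None
    down_sent: bool = False
    events: List[str] = field(default_factory=list)

    def observe(self, t: int, reachable: bool) -> None:
        """Feed one poll result taken at time t (in seconds)."""
        if not reachable:
            if self.down_since is None:
                self.down_since = t
            if not self.down_sent and t - self.down_since >= self.down_after_s:
                self.events.append(f"{t}s DOWN")
                self.down_sent = True
        elif self.down_since is not None:
            # The device answered again after at least one missed poll.
            if self.down_sent or not self.link_up_to_down:
                self.events.append(f"{t}s UP")
            self.down_since = None
            self.down_sent = False


def simulate(samples: List[Tuple[int, bool]], link_up_to_down: bool) -> List[str]:
    notifier = Notifier(down_after_s=120, link_up_to_down=link_up_to_down)
    for t, reachable in samples:
        notifier.observe(t, reachable)
    return notifier.events


# About one minute of missed polls, shorter than the 2-minute threshold.
BLIP = [(0, True), (30, False), (60, False), (90, True), (120, True)]
# A real outage that lasts longer than the threshold.
OUTAGE = [(0, True), (30, False), (60, False), (90, False),
          (120, False), (150, False), (180, True)]

if __name__ == "__main__":
    print(simulate(BLIP, link_up_to_down=False))    # unlinked: a lone UP
    print(simulate(BLIP, link_up_to_down=True))     # linked: nothing is sent
    print(simulate(OUTAGE, link_up_to_down=True))   # linked: DOWN, then UP
```

With the unlinked rule the short blip produces an Up with no Down, which is the complaint in this thread. With the linked rule the blip is silent and the real outage still produces both notifications.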
| | | | 
Time Traveler
       
Group: Administrators Last Login: 10/10/2008 4:43:07 PM Posts: 359, Visits: 293 |
| Sergio, With that setting, the effect is that you delay the reported status of the device if it's actually down, just as you suspected. The engine will continue to monitor other devices, but for the ones where a missed poll was detected, it will wait for 30 seconds and then retry that device according to the number of retries you specified.
Mark Singh Ipswitch, Inc. |
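Editor's note: the behaviour described in the reply above (other devices keep being polled while one device waits out its timeout and retries) is what a non-blocking poller does. The sketch below illustrates that general idea with Python's `asyncio`. It is not how WUP is implemented, which the thread does not show. The host names, the simulated `fake_ping`, and the timings (scaled down from 30 seconds to 0.2 seconds) are all assumptions.

```python
import asyncio
import time
from typing import Dict, List, Optional, Tuple


async def fake_ping(reply_delay_s: Optional[float]) -> None:
    """Stand-in for a real ping: None means the device never answers."""
    if reply_delay_s is None:
        await asyncio.Event().wait()   # blocks until cancelled by the timeout
    else:
        await asyncio.sleep(reply_delay_s)


async def check(reply_delay_s: Optional[float], timeout_s: float,
                attempts: int) -> Tuple[bool, float]:
    """Return (is_up, seconds spent) for one device."""
    start = time.monotonic()
    for _ in range(attempts):
        try:
            await asyncio.wait_for(fake_ping(reply_delay_s), timeout=timeout_s)
            return True, time.monotonic() - start
        except asyncio.TimeoutError:
            continue                   # missed poll: try this device again
    return False, time.monotonic() - start


async def poll_all(devices: Dict[str, Optional[float]], timeout_s: float,
                   attempts: int) -> Dict[str, Tuple[bool, float]]:
    names: List[str] = list(devices)
    # All checks run concurrently, so a slow device does not hold up the rest.
    results = await asyncio.gather(
        *(check(devices[name], timeout_s, attempts) for name in names))
    return dict(zip(names, results))


def demo() -> Dict[str, Tuple[bool, float]]:
    # Hypothetical devices: two answer quickly, "printer" never answers.
    devices = {"router": 0.01, "printer": None, "switch": 0.02}
    return asyncio.run(poll_all(devices, timeout_s=0.2, attempts=4))


if __name__ == "__main__":
    for name, (is_up, elapsed) in demo().items():
        print(f"{name}: {'up' if is_up else 'down'} after {elapsed:.2f}s")
```

In this model only the unresponsive device pays the full timeout-times-attempts delay; the responsive devices are reported almost immediately.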
| | | | 
Time Traveler
       
Group: WhatsUp Gold Expert Last Login: Today @ 12:33:43 PM Posts: 1,437, Visits: 3,945 |
| Thanks Mark. I guess this answers my question. I will play with that a little and let you know if I see any big drawback in the polling sequence.
Reading, writing and arithmetic - If you need to choose, please take option 1. |
| | | | 
Time Traveler
       
Group: Administrators Last Login: 10/10/2008 4:43:07 PM Posts: 359, Visits: 293 |
| That is cool, Sergio. It is a way around the instant UP for a missed packet, so if you are satisfied with that configuration and do not see too many drawbacks, you may want to post it to the email forum for others to see.
Mark Singh Ipswitch, Inc. |
| | | | 
Forum Member
       
Group: Forum Members Last Login: 10/5/2006 10:23:00 AM Posts: 35, Visits: 1 |
| Mark, For #3, where you mention that the UP is a bug that will be fixed in SP1, are you referring to getting UP messages without a device first going into a DOWN state? I've been getting that a lot where we alert DOWN at 5 mins and UP at 2 mins. It seems that if a device misses a few pings, but not enough to reach the 5 min down state, and then answers pings again, we get UP messages with no DOWNs. Is that the bug? When is SP1 estimated to be out? Thanks! al
Senior Network Engineer Fox Chase Cancer Center |