BB Unix Network Monitor - Message
Re: {bb} Problems monitoring SMTP Service
On Wed, 2005-08-31 at 07:54, Dirk H. Schulz wrote:
> Hi Philip,
>
> One example from this night:
>
> Aug 31 00:52:08 smtpmachine postfix/smtpd[17060]: connect from
> bbnet.domain.tld[xxx.xxx.xxx.xxx]
> Aug 31 00:52:08 smtpmachine postfix/smtpd[17060]: disconnect from
> bbnet.domain.tld[xxx.xxx.xxx.xxx]
> Aug 31 00:52:26 smtpmachine postfix/smtpd[17060]: connect from
> bbnet.domain.tld[xxx.xxx.xxx.xxx]
> Aug 31 00:52:26 smtpmachine postfix/smtpd[17060]: disconnect from
> bbnet.domain.tld[xxx.xxx.xxx.xxx]
> Aug 31 00:52:40 smtpmachine postfix/smtpd[17060]: connect from
> bbnet.domain.tld[xxx.xxx.xxx.xxx]
> Aug 31 00:52:40 smtpmachine postfix/smtpd[17060]: disconnect from
> bbnet.domain.tld[xxx.xxx.xxx.xxx]
> Aug 31 00:56:48 smtpmachine postfix/smtpd[17208]: connect from
> bbnet.domain.tld[xxx.xxx.xxx.xxx]
> Aug 31 00:56:48 smtpmachine postfix/smtpd[17208]: disconnect from
> bbnet.domain.tld[xxx.xxx.xxx.xxx]
> Aug 31 00:56:51 smtpmachine postfix/smtpd[17208]: connect from
> bbnet.domain.tld[xxx.xxx.xxx.xxx]
> Aug 31 00:56:51 smtpmachine postfix/smtpd[17208]: disconnect from
> bbnet.domain.tld[xxx.xxx.xxx.xxx]
> Aug 31 00:56:53 smtpmachine postfix/smtpd[17208]: connect from
> bbnet.domain.tld[xxx.xxx.xxx.xxx]
> Aug 31 00:56:53 smtpmachine postfix/smtpd[17208]: disconnect from
> bbnet.domain.tld[xxx.xxx.xxx.xxx]
>
>
> At 00:52:07 BBDISPLAY claims smtpmachine's smtp service to be down for
> 0:04:45. The times should be synchronized quite well since they all
> synchronize daily against the same ntp server.
It appears that your BBNET is actually testing 3 times each BBSLEEP
period. Is this what you intended? If not, do you have this SMTP
server listed more than once in your bb-hosts file?
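One way to confirm the "three tests per period" pattern is to group the
connect lines from the mail log into bursts. The following is only a rough
sketch: the regular expression assumes the syslog layout shown in your
excerpt (with each wrapped line joined back together), and the 60-second
gap used to separate bursts, the function names, and the year are my own
assumptions, not anything BB defines.

```python
import re
from datetime import datetime

# Matches lines such as:
# Aug 31 00:52:08 smtpmachine postfix/smtpd[17060]: connect from bbnet.domain.tld[...]
CONNECT_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+postfix/smtpd\[\d+\]: "
    r"connect from (\S+?)\["
)

# Log excerpt from the thread, with the e-mail line wrapping undone.
SAMPLE_LOG = """\
Aug 31 00:52:08 smtpmachine postfix/smtpd[17060]: connect from bbnet.domain.tld[xxx.xxx.xxx.xxx]
Aug 31 00:52:08 smtpmachine postfix/smtpd[17060]: disconnect from bbnet.domain.tld[xxx.xxx.xxx.xxx]
Aug 31 00:52:26 smtpmachine postfix/smtpd[17060]: connect from bbnet.domain.tld[xxx.xxx.xxx.xxx]
Aug 31 00:52:40 smtpmachine postfix/smtpd[17060]: connect from bbnet.domain.tld[xxx.xxx.xxx.xxx]
Aug 31 00:56:48 smtpmachine postfix/smtpd[17208]: connect from bbnet.domain.tld[xxx.xxx.xxx.xxx]
Aug 31 00:56:51 smtpmachine postfix/smtpd[17208]: connect from bbnet.domain.tld[xxx.xxx.xxx.xxx]
Aug 31 00:56:53 smtpmachine postfix/smtpd[17208]: connect from bbnet.domain.tld[xxx.xxx.xxx.xxx]
"""


def connect_times(lines, client, year=2005):
    """Return the timestamps of all 'connect from <client>' lines.

    Syslog lines carry no year, so one is supplied (assumed) by the caller.
    """
    times = []
    for line in lines:
        m = CONNECT_RE.match(line)
        if m and m.group(2) == client:
            stamp = f"{year} {m.group(1)}"
            times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return times


def group_bursts(times, gap_seconds=60):
    """Group timestamps into bursts; a new burst starts after a gap
    longer than gap_seconds since the previous connect."""
    bursts = []
    for t in sorted(times):
        if bursts and (t - bursts[-1][-1]).total_seconds() <= gap_seconds:
            bursts[-1].append(t)
        else:
            bursts.append([t])
    return bursts


if __name__ == "__main__":
    times = connect_times(SAMPLE_LOG.splitlines(), "bbnet.domain.tld")
    for burst in group_bursts(times):
        print(burst[0].strftime("%H:%M:%S"), "connects in burst:", len(burst))
```

If every burst contains three connects, the host is probably being tested
three times per cycle rather than once.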
If you want to display the machine on multiple HTML pages, only the
first entry in your bb-hosts file should specify the tests that you
want. The second (and subsequent) lines for this host are only
place-holders and should only have the "noconn" directive.
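A quick way to spot such repeated entries is to scan bb-hosts for hosts
that appear more than once with tests on a later line. This is only a
sketch: it assumes the usual "IP hostname # directives" line layout, and
the sample file contents, addresses, and function name below are made up
for illustration.

```python
# Hypothetical bb-hosts contents; the addresses and page names are invented.
SAMPLE_BB_HOSTS = """\
page mail Mail servers
10.0.0.25 smtpmachine.domain.tld # smtp ssh
page colo Colocation
10.0.0.25 smtpmachine.domain.tld # smtp ssh
10.0.0.26 other.domain.tld # noconn
"""


def find_duplicate_tests(text):
    """Return (line_number, hostname, extra_directives) for every repeated
    host entry that is not a pure 'noconn' place-holder."""
    seen = {}
    problems = []
    for lineno, raw in enumerate(text.splitlines(), 1):
        before_hash, _, after_hash = raw.strip().partition("#")
        fields = before_hash.split()
        # Host lines are assumed to start with an IP address; skip the rest
        # (page/group lines, blank lines, comment-only lines).
        if len(fields) < 2 or not fields[0][0].isdigit():
            continue
        host = fields[1]
        directives = after_hash.split()
        if host not in seen:
            seen[host] = lineno
            continue
        extra = [d for d in directives if d != "noconn"]
        if extra or "noconn" not in directives:
            problems.append((lineno, host, extra))
    return problems


if __name__ == "__main__":
    for lineno, host, extra in find_duplicate_tests(SAMPLE_BB_HOSTS):
        print(f"line {lineno}: {host} repeated with directives {extra}")
```

Any line it reports is a candidate for being reduced to just "noconn".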
> It is "the internet", simply. It is two different colocation centers
> with very different upstream providers.
Because of this, you're actually "testing the internet" at the
same time. If you are doing this to receive warning of a problem,
you might want to alter your paging rules so that you only get an
alert after two or three "down" reports. If you're trying to
generate "availability" type reporting, I doubt that it is
possible to get accurate statistics without either testing
from multiple locations or testing at the co-location centre.
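The idea behind "alert only after two or three down reports" can be
sketched as a small counter. This is not how BB's paging rules are
implemented or configured; the class name and the threshold of three are
assumptions chosen purely to illustrate the logic.

```python
class ConsecutiveDownAlert:
    """Signal an alert only once a run of 'down' reports reaches a threshold."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.streak = 0  # number of consecutive 'down' reports seen so far

    def report(self, is_up):
        """Record one test result; return True exactly when the streak of
        failures first reaches the threshold."""
        if is_up:
            self.streak = 0
            return False
        self.streak += 1
        return self.streak == self.threshold


if __name__ == "__main__":
    alert = ConsecutiveDownAlert(threshold=3)
    # A single failed test followed by a recovery does not page anyone;
    # three failures in a row do.
    for result in [False, True, False, False, False, False]:
        print("up" if result else "down", "-> page:", alert.report(result))
```

A single slow round trip across the internet then shows up on the display
without waking anybody, while a sustained outage still pages.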
> Yes, on smtpmachine ssh is tested as well. It also has regular problems,
> but not that much. With smtp they occur every 20 to 120 minutes, with
> ssh every few days. And the ssh service does not get a red "down" dot,
> but a black "unavailable".
This suggests to me that it's likely to be caused by delays on the
internet exceeding the BB time-out values. The default values work
well for tests on local networks, but you may need to increase the
values of the BBNETTIMER variables in bbdef-server.sh. If you also
have a large number of tests on local networks, it might be better
to create a secondary BBNET (with longer time-out values) just
for testing remote hosts.
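To see whether slow responses (rather than real outages) are the cause,
you could time the SMTP banner from the BBNET machine with a short and a
long time-out. The sketch below is not what BB itself runs; the function
name, the 512-byte read, and the time-out values are assumptions for
illustration.

```python
import socket
import time


def probe_smtp(host, port=25, timeout=10.0):
    """Connect to an SMTP port and wait for the greeting banner.

    Returns (ok, elapsed_seconds, detail). 'ok' is True only when a banner
    starting with the SMTP greeting code 220 arrives within the time-out,
    which is applied to the connect and to the read separately.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            banner = sock.recv(512).decode("ascii", "replace").strip()
    except OSError as exc:  # covers time-outs, refusals, and resolver errors
        return False, time.monotonic() - start, str(exc)
    return banner.startswith("220"), time.monotonic() - start, banner


# Example use (hostname taken from the log excerpt, time-outs are guesses):
#   for t in (5.0, 30.0):
#       print(t, probe_smtp("smtpmachine", 25, timeout=t))
```

If the probe fails with the short time-out but succeeds with the long one,
that points at latency rather than at the SMTP server itself.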
Cheers, Phil.
--
Vail's Second Axiom: The amount of work to be done increases in
proportion to the amount of work already completed.