We have a Big Brother installation monitoring 250 machines with a
primary and a failover server (both RHEL 3 boxen with OpenSSH).
Today, the server started kicking out purple alerts for two of the
machines on the ssh check. There is nothing unique about them. They
are Solaris 9, like half of our systems. They run SSH.com, like most
of our systems. They were monitoring fine for a week. The only
changes I made today were to add bbwarnrules.cfg entries for both
servers.
I set up tcpdump to confirm that the server is checking every five
minutes. It is. I checked that ssh is working. It is. I removed the
ssh status files for those systems with bbrm and have been watching
for an hour. tcpdump shows the server connecting to both machines on
port 22, yet no file is written to $BBVAR/logs to replace the ones
bbrm took out.
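To watch for the status files coming back without listing the directory by hand, something like the following might do. It is only a rough sketch: the file naming (dots in the host name replaced by commas, followed by a dot and the service name) is my assumption about how the logs directory is laid out, and the helper names status_file_path and status_age are made up for illustration.

```python
import os
import time


def status_file_path(logs_dir, host, service):
    """Build the expected status file path for one host/service pair.

    Assumption: dots in the host name are stored as commas, so
    libmgt1.acsu.buffalo.edu + ssh becomes libmgt1,acsu,buffalo,edu.ssh.
    """
    return os.path.join(logs_dir, host.replace(".", ",") + "." + service)


def status_age(logs_dir, host, service, now=None):
    """Return the age of the status file in seconds, or None if it is missing."""
    path = status_file_path(logs_dir, host, service)
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        return None
    if now is None:
        now = time.time()
    return now - mtime
```

Calling status_age(os.path.join(os.environ["BBVAR"], "logs"), "libmgt1.acsu.buffalo.edu", "ssh") every few minutes would return None for as long as the file is missing, and a small number of seconds once a fresh result has been written.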
So my question is this: why would Big Brother run a network check
but not write the results to the logs directory?
For reference, here are the relevant bb-hosts lines:
128.205.7.27 libmgt2.acsu.buffalo.edu # ssh
128.205.7.26 libmgt1.acsu.buffalo.edu # ssh
Here are the lines from bbwarnrules.cfg:
libmgt2.acsu.buffalo.edu*;; conn ssh cpu disk msgs procs;;*;0500-1700;library-support@gory.acsu.buffalo.edu:15 ext-epagesvc-University_libraries:15 ext-epagesvcsec-University_libraries:~15-30
libmgt1.acsu.buffalo.edu*;; conn ssh cpu disk msgs procs;;*;0500-1700;library-support@gory.acsu.buffalo.edu:15 ext-epagesvc-University_libraries:15 ext-epagesvcsec-University_libraries:~15-30
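In case the rule syntax matters, here is a small sketch that splits one of these rules into its semicolon-separated fields, with each rule taken as a single line. The field names (hosts, exhosts, services, exservices, days, times, recipients) reflect my understanding of the bbwarnrules.cfg layout and should be treated as an assumption; parse_rule is a hypothetical helper, not part of Big Brother.

```python
# Assumed field order for one bbwarnrules.cfg rule (seven fields).
FIELD_NAMES = ("hosts", "exhosts", "services", "exservices",
               "days", "times", "recipients")

# The libmgt2 rule from this message, written as a single line.
RULE = ("libmgt2.acsu.buffalo.edu*;; conn ssh cpu disk msgs procs;;*;"
        "0500-1700;library-support@gory.acsu.buffalo.edu:15 "
        "ext-epagesvc-University_libraries:15 "
        "ext-epagesvcsec-University_libraries:~15-30")


def parse_rule(line):
    """Split a rule into named fields; each field becomes a list of tokens."""
    parts = line.strip().split(";")
    if len(parts) != len(FIELD_NAMES):
        raise ValueError("expected %d fields, got %d"
                         % (len(FIELD_NAMES), len(parts)))
    # Whitespace separates multiple entries inside a field.
    return {name: value.split() for name, value in zip(FIELD_NAMES, parts)}


rule = parse_rule(RULE)
print(rule["services"])
print(rule["recipients"])
```

A rule that had been broken across several lines in the file itself would fail the field-count check here, which is a quick way to rule out a wrapping problem in the new entries.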