BB Unix Network Monitor - Message
RE: {bb} The bb_rename error problem
- To: bb@bb4.com
- Subject: RE: {bb} The bb_rename error problem
- From: brent.mccrackin@bell.ca
- Date: Fri, 28 Jan 2005 14:42:30 -0500
- Reply-to: bb@bb4.com
- Sender: owner-bb@bb4.com
- Thread-topic: {bb} The bb_rename error problem
Ooooooh... I think that might be it!
I am (was) running bbretest-net.sh as an extension script with a period
of 60 seconds. If it updates a test with a new status message before
(or at the same time as) BBD has finished with the original status
message, then there will be a conflict - especially on a server with a
high I/O load. So some file locking may be needed to handle such
instances (maybe start using .somehost.test.xxx files where xxx is some
unique number).
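
The unique-name idea above could look roughly like the following. This is only a sketch in Python, not how BBD actually handles its files: the function name write_status and the use of mkstemp to generate the unique suffix are assumptions made for illustration. Each writer gets its own ".somehost.test.xxx" temp file, so no two writers ever rename the same source file.

```python
import os
import tempfile


def write_status(log_dir: str, host: str, test: str, message: str) -> str:
    """Write a status log through a uniquely named temp file (hypothetical helper)."""
    final_path = os.path.join(log_dir, f"{host}.{test}")
    # mkstemp creates ".host.test.<random>" and guarantees the name is unused,
    # so concurrent writers never share a temp file.
    fd, tmp_path = tempfile.mkstemp(prefix=f".{host}.{test}.", dir=log_dir)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(message)
        # A rename within one filesystem replaces the target atomically on POSIX.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Remove our own temp file if we failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as log_dir:
        write_status(log_dir, "somehost", "http", "green: first report\n")
        write_status(log_dir, "somehost", "http", "red: retest report\n")
        print(sorted(os.listdir(log_dir)))
```

With this scheme the last writer to rename wins, which may or may not be the newest status, so it removes the rename error but does not by itself order the messages.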
So, to quickly fix that, I've removed bbretest-net.sh from the
bb-bbexttab and incorporated it into the bb-network.sh script to ensure
it runs after the initial bbtest-net is done, with a sleep period in
between them (unless bbtest-net fails, then the script does its error
report and dumps out). The sleep needs to be long enough to allow BBD
to finish with all the previous status messages before the new ones are
sent in, but not so long that bb-network.sh takes longer than BBSLEEP.
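
The sequencing and the sleep budget described above can be sketched as follows. This is Python rather than the actual bb-network.sh, and everything in it is an assumption for illustration: the function names, the 20 second settle time, the 30 second estimate for the retest run, and the 300 second value for BBSLEEP.

```python
import subprocess
import sys
import time


def settle_pause(elapsed, settle_seconds, retest_estimate, bbsleep_seconds):
    """Seconds to sleep between the two runs without overrunning the cycle."""
    remaining = bbsleep_seconds - elapsed - retest_estimate
    return max(0.0, min(settle_seconds, remaining))


def run_cycle(first_cmd, retest_cmd, settle_seconds=20.0,
              retest_estimate=30.0, bbsleep_seconds=300.0):
    """Run the initial test, pause so BBD can catch up, then run the retest."""
    start = time.monotonic()
    first = subprocess.run(first_cmd)
    if first.returncode != 0:
        # Initial test failed: report and bail out without retesting.
        print(f"initial test failed with code {first.returncode}", file=sys.stderr)
        return first.returncode
    elapsed = time.monotonic() - start
    time.sleep(settle_pause(elapsed, settle_seconds, retest_estimate, bbsleep_seconds))
    return subprocess.run(retest_cmd).returncode


if __name__ == "__main__":
    # Placeholder commands; the real ones would be the bbtest-net and retest calls.
    ok = [sys.executable, "-c", "pass"]
    print(run_cycle(ok, ok, settle_seconds=1.0))
```

The pause shrinks to zero when the first run has already used up most of the cycle, which keeps the whole script under BBSLEEP at the cost of giving BBD less time to settle.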
So now I have BBVAR/logs running on a tmpfs that I hope stays in RAM
(limited size to hopefully ensure that, plus the system has lots anyway
with very little swap usage) and the bbretest-net function is now
guaranteed to run after the initial bbtest-net function is finished.
To answer Robert, the CPU load and utilization on that server are very
low - BBGEN removed any concerns on that front a while ago.
Very little of BB is now running as shell scripts; almost all of it is
now binary. The I/O load is no longer over 90% now that BBVAR/logs is on
tmpfs, so there is some headroom there.
Thanks for everyone's input and suggestions.
---
Brent B McCrackin
UNIX Systems Specialist - Bell Sympatico
Brent.McCrackin@Bell.ca PH: 416-353-0692
"Serenity through viciousness."
-----Original Message-----
From: owner-bb@bb4.com [mailto:owner-bb@bb4.com] On Behalf Of Henrik
Storner
Sent: January 28, 2005 2:05 PM
To: bb@bb4.com
Subject: Re: {bb} The bb_rename error problem
In <41FA6C81.20801@bb4.com> Robert-Andre Croteau <robert@bb4.com>
writes:
>brent.mccrackin@bell.ca wrote:
>> Wed Jan 26 08:56:23 2005 bbd bb_rename Could not rename
>> /bbvar/logs/.somehost.http to /bbvar/logs/somehost.http - errno: 2
>the error occurs because /bbvar/logs/.somehost.http does not exist
>anymore. The only reason this can happen (from my POV) is that
>another status for the same somehost.http was received and is being
>processed concurrently.
>Reasons this can happen ?
>2) Overloaded BB server. Very slow to process all incoming status
>logs in a timely fashion and actually didn't process that particular
>status in a BBSLEEP cycle.
I agree with this analysis. Does it seem possible that the BB server
cannot process one status-message before the next status-message for
the same host+test combination arrives? Yes, it is quite possible.
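
The failure mode discussed above can be reproduced in a few lines. This is a simplified illustration in Python, not BBD's actual code: it assumes two handlers share the same ".somehost.http" temp name and replays their steps in a fixed order, so that the slower handler's rename finds the temp file already gone and gets errno 2 (ENOENT), as in the log line quoted earlier.

```python
import errno
import os
import tempfile


def reproduce_rename_conflict(log_dir: str) -> int:
    """Replay two overlapping status updates; return the errno of the failed rename."""
    tmp_path = os.path.join(log_dir, ".somehost.http")
    final_path = os.path.join(log_dir, "somehost.http")

    # Handler A receives a status and writes the shared temp file.
    with open(tmp_path, "w") as f:
        f.write("status A\n")
    # Handler B receives the next status for the same host+test
    # before A has finished, and writes the same temp file.
    with open(tmp_path, "w") as f:
        f.write("status B\n")
    # Handler B finishes first and renames the temp file into place.
    os.rename(tmp_path, final_path)
    # Handler A now tries its own rename, but the temp file no longer exists.
    try:
        os.rename(tmp_path, final_path)
    except FileNotFoundError as exc:
        return exc.errno
    return 0


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as log_dir:
        code = reproduce_rename_conflict(log_dir)
        print(code == errno.ENOENT)
```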
Brent mentioned in his initial posting that he was using bbgen, and
presumably also the bbgen network tester. There is an extra script
included with bbgen, which triggers a failing network test to be
repeated at 1-minute intervals for the first 30 minutes after a
failure is detected. I don't know if Brent is using this (the
"bbretest-net.sh" extension script), but if he is, then the status
messages can arrive much more frequently. I use it, and if we have a
network "blip" (e.g. when a network switch reloads), BB records
outages of 30 seconds, meaning that the time between the initial
status report and the next was 30 seconds.
Of course the problem gets worse when problems appear, because changes
in status trigger more file updates (history logs) and process
activity (alerts need to be sent) than what you see when everything is
green. So if the BB server is busy enough that processing a message
takes 1 or 2 minutes, this is a very likely reason for this problem.
Henrik
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-==-=-=-=-=-=-=-=-=-=-=-=
To unsubscribe from this list, or to subscribe to the bb-digest list
send e-mail to mailto:majordomo@bb4.com with unsubscribe bb -and/or-
subscribe bb-digest in the BODY of the message.