Big Brother - Help

A Web-based Systems and Network Monitoring and Notification System

Order of severity

Serious Trouble
No report lately
May need attention
All is well
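
As a rough illustration of how the ordering above decides the screen color, the sketch below picks the most severe color among a set of dots. It uses the colors this page associates with each level (red, purple, yellow) and assumes green for "All is well". The names SEVERITY and screen_color are made up for this example and are not part of Big Brother.

```python
# Ranking follows the "Order of severity" list; higher means more severe.
# Green for "All is well" is an assumption of this sketch.
SEVERITY = {"red": 3, "purple": 2, "yellow": 1, "green": 0}


def screen_color(dot_colors):
    """Return the most severe color among the dots, or green if there are none."""
    return max(dot_colors, key=SEVERITY.__getitem__, default="green")


print(screen_color(["green", "yellow", "purple"]))
```

With one purple dot and one yellow dot on the display, the screen would be purple, since "No report lately" outranks "May need attention".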

All connections are checked every 5 minutes


BB Installation and Configuration Manual


Big Brother FAQ


Pager Codes

The administrator is notified when conditions warrant it. The numeric message is formatted as follows: [3 DIGIT CODE] [IP-ADDRESS]
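
A script that receives these messages might split them as in the sketch below. It assumes the code and the address are separated by whitespace and that the address arrives in dotted form; neither detail is confirmed here. The function name and the sample code 100 are hypothetical.

```python
import ipaddress


def parse_pager_message(message):
    """Split '[3 DIGIT CODE] [IP-ADDRESS]' into (code, address).

    Assumes whitespace separation and a dotted address (not confirmed).
    """
    code, ip_text = message.split()
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"expected a 3-digit code, got {code!r}")
    # ip_address() raises ValueError if the text is not a valid address.
    return code, ipaddress.ip_address(ip_text)


code, address = parse_pager_message("100 192.168.1.10")  # sample values only
print(code, address)
```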


Severe Conditions

Most severe conditions result in the administrator being notified. These include loss of network connectivity, loss of HTTP access, and disks 95% or more full, since these can result in a system hang. Furthermore, any "NOTICE" message in the message file causes a notification, since it may signal a disk fault.

Under these circumstances, the screen should turn red. Click on the corresponding red dot for additional information about the condition.

If a severe situation occurs that Big Brother does not notice, use the PAGE/ACK button on the main screen to notify the administrator manually.


Warning Conditions

These include HTTP server errors, disks 90-94% full, the death of important processes, and "WARNING" messages in the system logs.

The screen should turn yellow if this is the most severe situation at the time. Click on the corresponding yellow dot for additional information, and notify the administrator manually if necessary.


No Report Warnings

Each report is checked for freshness. If any report is more than 30 minutes old, it is marked with a purple dot, and the screen turns purple, assuming that it is the most serious situation at the time.

Stale reports may be the result of heavily loaded systems, but they may also indicate a more serious loss of communication within the Big Brother system itself.
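
The freshness rule can be pictured with a short sketch. The 30-minute limit comes from the text above; the names MAX_AGE and is_stale are invented for this example, and Big Brother's own check may be implemented quite differently.

```python
from datetime import datetime, timedelta

MAX_AGE = timedelta(minutes=30)  # freshness limit stated in this section


def is_stale(report_time, now):
    """True when the report is more than 30 minutes old (a purple dot)."""
    return now - report_time > MAX_AGE


now = datetime(1999, 1, 1, 12, 0)
print(is_stale(datetime(1999, 1, 1, 11, 29), now))
print(is_stale(datetime(1999, 1, 1, 11, 45), now))
```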


System Information

Click on any server name for additional details about the machine. Information about all components is available, including serial numbers, partition sizes, SCSI addresses, and the physical locations of the devices. This information lives in the www/notes directory.


General Information

The current status of any individual component is always available by clicking the appropriate dot in the display matrix. You may have to hit Reload to get the most recent entry.

Occasionally the screen changes color for CPU or HTTP warnings. These can usually be disregarded since Big Brother has been instructed to be very sensitive during this initial test. Similarly, internet connections may turn yellow when the network is heavily loaded. Although it should be checked out, this is usually not a problem unless the whole Internet section goes yellow.


Big Brother Column Information

conn

The conn column denotes the ping check performed periodically. This code is located in bb-network.sh.


nntp

The nntp column denotes the nntp check performed periodically. This code is located in bb-network.sh. It makes sure the news server is alive and well.


cpu

The cpu column denotes the cpu check performed periodically. The figure is the 5-minute load average, which is the second of the load-average values reported by the 'uptime' command. The code for this test is located in bb-local.sh.
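
To show which figure is meant, the sketch below pulls the second load-average value out of a line of 'uptime' output. It is not the code in bb-local.sh; the function name is hypothetical, the sample line is made up, and the exact 'uptime' layout varies between platforms.

```python
import re


def five_minute_load(uptime_output):
    """Return the second load-average figure from a line of 'uptime' output."""
    match = re.search(r"load averages?:\s*(.+)$", uptime_output.strip())
    if match is None:
        raise ValueError("no load average found in uptime output")
    # Values may be separated by commas, spaces, or both, depending on the platform.
    values = re.split(r"[,\s]+", match.group(1).strip())
    return float(values[1])


sample = " 10:15am  up 12 days,  3:02,  4 users,  load average: 0.15, 0.42, 0.38"
print(five_minute_load(sample))
```

Locales that write decimals with a comma would need extra handling that this sketch does not attempt.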


disk

The disk column denotes the disk check performed periodically. This test runs the 'df' command and reports the fullest disk. The default warning level is 90%, and the system is set to panic at 95%. These values are set in $BBHOME/etc/bbdef.sh and may be changed. The code for the disk test lives in bb-local.sh. You may also set warning and panic levels individually in the etc/bb-dftab file; see etc/bb-dftab.INFO for details.
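
The default thresholds work out as in the sketch below. The 90% and 95% levels are the defaults quoted above; the function and variable names are invented for this example, green as the all-clear color is an assumption, and the mount points and percentages are sample data.

```python
DISK_WARN = 90   # default warning level, in percent
DISK_PANIC = 95  # default panic level, in percent


def disk_color(percent_full, warn=DISK_WARN, panic=DISK_PANIC):
    """Map a disk's fullness to a status color using the two thresholds."""
    if percent_full >= panic:
        return "red"
    if percent_full >= warn:
        return "yellow"
    return "green"


def fullest_disk(usage_by_mount):
    """Return (mount point, percent full) for the fullest filesystem."""
    return max(usage_by_mount.items(), key=lambda item: item[1])


usage = {"/": 45, "/var": 92, "/home": 67}  # sample figures, not real df output
mount, percent = fullest_disk(usage)
print(mount, percent, disk_color(percent))
```

Passing different warn and panic values to disk_color mirrors the idea of changing the defaults, although the real settings live in the files named above.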


dns

The dns column verifies the status of the DNS server on that machine. The test is basically an nslookup with the server name and IP address as arguments.


ftp

The ftp column denotes the ftp check performed periodically. This code is located in bb-network.sh. It is part of the new group of generic server tests performed. To test this service on a given machine, just include 'ftp' on the line in the bb-hosts file.


http

The http column denotes the http check performed periodically. This code is located in bb-network.sh. It will return OK if the server is there and does not return a string containing the word 'Error'. This test should be more rigorous. Note that password-protected pages are reported as errors when they should not be.
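
The rule described above amounts to something like the following sketch. It is an illustration rather than the code in bb-network.sh: the function name is hypothetical, and treating the match on 'Error' as case-sensitive is an assumption.

```python
def http_check_ok(server_responded, body):
    """OK when the server answered and the reply lacks the word 'Error'."""
    return bool(server_responded) and "Error" not in body


print(http_check_ok(True, "<html>Welcome</html>"))
print(http_check_ok(True, "401 Authorization Error"))
```

The second call shows how a password-protected page on a healthy server could still fail: its reply contains the word 'Error'.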


msgs

The msgs column denotes the msgs check performed periodically. This code is located in bb-local.sh. Only NOTICE and WARNING conditions are considered. Note that a NOTICE condition will cause a notification (code red) whereas a WARNING just turns the screen yellow. There is no way to turn these messages off, short of clearing out the messages file manually or modifying the tags from WARNING to wARNING and NOTICE to nOTICE. You may also introduce tags in the etc/bbdef.sh file in the PAGEMSG and MSGS variables.
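
A simplified picture of this check is sketched below. The tag lists stand in for the PAGEMSG and MSGS variables, but how those variables are actually read is not shown here, so the function name and its parameters are assumptions. Green as the all-clear color is also assumed.

```python
def msgs_color(lines, page_tags=("NOTICE",), warn_tags=("WARNING",)):
    """Return red if any paging tag appears, yellow for a warning tag, else green."""
    color = "green"
    for line in lines:
        if any(tag in line for tag in page_tags):
            return "red"  # a paging tag outranks everything else
        if any(tag in line for tag in warn_tags):
            color = "yellow"
    return color


print(msgs_color(["kernel: WARNING: low swap"]))
print(msgs_color(["kernel: wARNING: low swap", "scsi: nOTICE: retry"]))
```

The second call shows why changing the case of a tag silences it: the match is on the exact spelling.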


pop3

The pop3 column denotes the pop3 check performed periodically. This is part of the generic test code in bb-network.sh. It checks that the pop3 server is alive and well. To test a machine for the pop3 server, put the word 'pop3' on that server's line in the bb-hosts file. You may have to put pop-3 instead on certain platforms. Check /etc/services for the correct spelling.
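
One way to find the right spelling is to look it up in the services file, as sketched below. The helper name is hypothetical and the sample text is made up; on a real system you would pass in the contents of /etc/services.

```python
def pop3_service_name(services_text, candidates=("pop3", "pop-3")):
    """Return the first candidate spelling listed as a service name or alias."""
    names = set()
    for line in services_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop trailing comments
        if len(fields) >= 2:
            # fields[0] is the service name, fields[1] is port/protocol,
            # and anything after that is an alias.
            names.add(fields[0])
            names.update(fields[2:])
    for candidate in candidates:
        if candidate in names:
            return candidate
    return None


sample = "smtp 25/tcp mail\npop-3 110/tcp # POP version 3\n"
print(pop3_service_name(sample))
```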


procs

The procs column denotes the procs check performed periodically. This code is located in bb-local.sh. It makes sure that the processes defined in etc/bbdef.sh in the PROCS variable exist on the local machine. If a process does not exist, and it has been defined in the PAGEPROCS variable, then the code is red and a notification is sent out. The ps command is used to get a current process listing.
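
The decision can be sketched as follows. The arguments stand in for the PROCS and PAGEPROCS variables and for the names found in the ps listing; the function name is hypothetical. Treating a missing PROCS entry as yellow follows the Warning Conditions section, and checking the two lists independently is an assumption of this sketch.

```python
def procs_color(running, procs, pageprocs):
    """Color for the procs column, given the process names currently running."""
    running = set(running)
    if any(name not in running for name in pageprocs):
        return "red"     # a paging process is missing: notify
    if any(name not in running for name in procs):
        return "yellow"  # an important process is missing: warn
    return "green"


print(procs_color({"sendmail", "cron"}, procs=["cron", "inetd"], pageprocs=["sendmail"]))
```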


smtp

The smtp column denotes the smtp check performed periodically. This is part of the generic server test code located in bb-network.sh. It makes sure that the SMTP process (usually sendmail) is alive and well.

Copyright © 1997-1999 The MacLawran Group Inc - All Rights Reserved