wassup is a bash script that pings a target host to see if it is online. If the target host does not respond to the ping, wassup will send email alerts to the specified addresses. If the target responds to the ping the next time wassup runs, another notification will be sent to show that the host is online again.
wassup requires no special privileges, and may be run as an ordinary user. For best results, the user should have the ability to run wassup in a crontab. For example, to run it every 5 minutes:
*/5 * * * * /home/bob/wassup/wassup 192.168.1.2 >/dev/null
Untar wassup and copy the entire wassup directory to the desired location in a user's home directory.
To configure, edit SAFETY, BASEDIR, and MAILTO, which appear at the beginning of the script.
The safety host is a reliable host that is always online, pingable, and reachable from the monitoring machine. If the target host doesn't respond to a ping, wassup will check the safety host to confirm that the network is available, preventing false alarms.
SAFETY="www.example.com"
Lockfiles and logs will be written in BASEDIR, which must be
writable by the user running wassup. This is also where the bodies of the
notification messages will be stored.
BASEDIR=/home/bob/wassup
Specify the notification address, or a list of addresses separated by commas with no spaces. These accounts should remain accessible when the target is down, so avoid mailboxes that depend on the target host itself.
MAILTO=bob
MAILTO=bob,bob@example.com,5551234567@mobile.example.net
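The "commas with no spaces" rule is easy to break when editing the list by hand. The following is a minimal sketch of a check you could run on a candidate value before pasting it into the script. The function name valid_mailto is hypothetical and is not part of wassup.

```bash
#!/bin/bash
# Hypothetical helper, not part of wassup.
# Succeeds only if the list is non-empty, contains no whitespace,
# and has no empty entries (leading, trailing, or doubled commas).
valid_mailto() {
    case $1 in
        '' | *[[:space:]]* | ,* | *, | *,,*) return 1 ;;
        *) return 0 ;;
    esac
}

# Example:
#   valid_mailto "bob,bob@example.com" && echo "list looks fine"
```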
Optionally, edit the notification messages:
MESSAGE_DOWN=$BASEDIR/message_down
MESSAGE_UP=$BASEDIR/message_up
These files contain the body of the email notifications. I have included samples designed to be informative, yet small enough to display on a cell phone or beeper. To provide custom messages for each target, change the variable to:
MESSAGE_DOWN=$BASEDIR/${TARGET}.message_down
and make sure matching files exist in the base directory for each monitored host. This obviously increases the maintenance overhead of the application, so avoid this customization unless there is a pressing need.
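If you do use per-target messages, the matching files can be seeded from the included samples. This is only a sketch: the function name seed_messages is hypothetical, and it assumes that MESSAGE_UP is given the same ${TARGET} prefix as MESSAGE_DOWN.

```bash
#!/bin/bash
# Hypothetical helper, not part of wassup.
# Usage: seed_messages BASEDIR TARGET [TARGET...]
# Copies the sample message bodies to TARGET.message_down and
# TARGET.message_up, leaving any file that already exists untouched.
seed_messages() {
    local basedir=$1 target kind
    shift
    for target in "$@"; do
        for kind in message_down message_up; do
            [ -e "$basedir/$target.$kind" ] ||
                cp "$basedir/$kind" "$basedir/$target.$kind"
        done
    done
}

# Example:
#   seed_messages /home/bob/wassup 192.168.1.2
```

Edit the copies afterwards; running the function again will not overwrite them.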
When wassup runs, it will attempt to ping the target host. If the target can't be reached, the safety host will be tried. If the safety does not respond, wassup will abort with an appropriate log entry. Since this indicates a problem with the local network or interface, no notification is sent.
If the safety responds, wassup will try to ping the target host again. If the target still doesn't respond, wassup will assume the target is offline and check for a lockfile. If no lockfile exists, it will make a log entry, email the alert, and create a lockfile to prevent further notifications.
If the target host responds to the initial ping, wassup checks for a lockfile. The presence of a lockfile indicates that the target was previously offline, so wassup writes to the log, sends a notification that the host is back online, and removes the lockfile.
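The decision flow described above can be sketched roughly as follows. This is an illustration, not the actual wassup source: the function names is_up and decide are hypothetical, and the sketch prints the action instead of logging or sending mail. The case where the target answers only on the second ping is not described above, so the sketch simply does nothing there.

```bash
#!/bin/bash
# Illustrative sketch of the flow only; names are hypothetical.

# Succeed if the host answers a single ping.
is_up() {
    ping -c 1 "$1" >/dev/null 2>&1
}

# Usage: decide TARGET SAFETY LOCKFILE
# Prints one of: none, abort, alert_down, alert_up
decide() {
    local target=$1 safety=$2 lockfile=$3

    if is_up "$target"; then
        # Target answers: a leftover lockfile means it was down earlier.
        if [ -e "$lockfile" ]; then
            rm -f "$lockfile"
            echo alert_up
        else
            echo none
        fi
        return
    fi

    # Target is silent: check the safety host to rule out a local problem.
    if ! is_up "$safety"; then
        echo abort
        return
    fi

    # The network is fine, so give the target a second chance.
    if is_up "$target"; then
        echo none   # assumption: nothing to do in this case
        return
    fi

    # Still down: alert once, then let the lockfile suppress repeats.
    if [ -e "$lockfile" ]; then
        echo none
    else
        : > "$lockfile"
        echo alert_down
    fi
}

# Example:
#   decide 192.168.1.2 www.example.com /home/bob/wassup/192.168.1.2.lock
```

The lockfile name in the example is made up; the point is that the lockfile is the only state carried between runs.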
wassup's logging is very simple, using parameter=value pairs. It writes an entry only when the safety host can't be reached, when the target goes offline, and when the target comes back online. This keeps the log quite small, so it will probably never need to be rotated. If it gets big, something is seriously wrong with either your script variables or your network.
To disable logging, comment out any lines that echo to the logfile.
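For illustration, a log line built from parameter=value pairs might be written like this. The field names (date, target, status) and the function name log_event are assumptions made for this sketch; check the script itself for the fields it actually writes.

```bash
#!/bin/bash
# Hypothetical helper, not part of wassup.
# Usage: log_event LOGFILE TARGET STATUS
# Appends one line such as: date=... target=192.168.1.2 status=down
log_event() {
    local logfile=$1 target=$2 status=$3
    printf 'date=%s target=%s status=%s\n' \
        "$(date '+%Y-%m-%d_%H:%M:%S')" "$target" "$status" >> "$logfile"
}

# Example:
#   log_event /home/bob/wassup/wassup.log 192.168.1.2 down
```

Keeping each event on a single line of pairs means the log can be filtered with grep, for example by target or by status.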
In order to work, wassup requires that both the target and safety machines respond to pings.
Test wassup on the command line before running it in cron, using both a live target and an unreachable host. Once things are working, add it to the user's crontab.
Be sure to use the complete path to the wassup script in the crontab. Also make sure that the script is executable.
BASEDIR and its files must be writable by the user running wassup.
The first line must point to the system's bash executable.
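The checks above can be bundled into a quick preflight test to run before adding the crontab entry. This is a sketch under stated assumptions: the function name preflight is hypothetical, and it only verifies that the script is executable, that BASEDIR is a writable directory, and that the interpreter named on the first line exists.

```bash
#!/bin/bash
# Hypothetical helper, not part of wassup.
# Usage: preflight /path/to/wassup /path/to/BASEDIR
# Prints one line per problem found and returns non-zero if any exist.
preflight() {
    local script=$1 basedir=$2 status=0 interp

    # cron runs the script directly, so it must be executable.
    [ -x "$script" ] || { echo "not executable: $script"; status=1; }

    # Lockfiles and logs are written under BASEDIR.
    { [ -d "$basedir" ] && [ -w "$basedir" ]; } ||
        { echo "not a writable directory: $basedir"; status=1; }

    # Pull the interpreter path out of the first line (the "#!" line).
    interp=$(sed -n '1s/^#![[:space:]]*\([^[:space:]]*\).*/\1/p' "$script" 2>/dev/null)
    [ -n "$interp" ] && [ -x "$interp" ] ||
        { echo "bad interpreter line in $script: ${interp:-none}"; status=1; }

    return "$status"
}

# Example:
#   preflight /home/bob/wassup/wassup /home/bob/wassup && echo "ready for cron"
```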
README
wassup
message_up
message_down
v.20080805:
v.20031008:
v.20011129: