Nagios, Splunk, SNMP are various monitoring tools that can alert you to the status of connectivity to your SQL server. While I would like to employ these tools to monitor our network, I find that I really just want a tool to tell me 'Hey! Go check the server! It ain't working because of x,y,z...'. Checkpg.sh is such a tool.
checkpg.sh is a simple shell script that can notify an admin when there is a potential connectivity problem to a PostgreSQL server. The script accepts the following command line arguements and requires one database name to connect to.
-H print this help and exit -h hostname of the database server (default $PGHOST) -p listening port of the database server (default $PGPORT) -U name of database user (default $PGUSER) -L name of logfile (default $LOGFILE) -E email of admin to send notification to (default $EMAIL)
For sending a notification email, I prefer to use 'mutt', which may need to be installed via your package manager on Linux. For OSX servers, you can install mutt also.
Please note that the server you are checking should be able to accept connections from the host you are running the script from, this may mean modifying the firewall rules and the pg_hba.conf on the server to allow trust from the particular host that is running the script.
Example crontab entry to poll the PostgreSQL server every 15 minutes:
*/15 * * * * /home/xtuple/checkpg/checkpg.sh -h 192.168.0.4 -p 5432 -U postgres -L /home/xtuple/checkpg/logs/checkpg.log -E dbadmin@xyzco.com livedata
In the above example, the IP of the server we're checking is 192.168.0.4 and this instance of postgres should be running on port 5432. Connecting as user postgres, to the database named 'postgres'. We are logging our output to checkpg.log, sending notifications to dbadmin@xyzco.com, and we are connecting to the database named 'livedata'.
Example of what is logged:
13:30:01 Exit Code 0 - Everything looks good! 13:45:01 Exit Code 0 - Everything looks good! 14:00:01 Exit Code 0 - Everything looks good! 14:15:09 Error code 2 - The connection to the server went bad and the session was not interactive Is the PostgreSQL Service running on 192.168.0.4 port 5432? Does database named livedata exist? Ping 192.168.0.4 Reported Code: 0 Server is UP.
When a fail occurs, an email is kicked out with the log attached to the admin named in the command line.