Some utility scripts used at Liverpool
Grid Engine
* /usr/local/bin/busy-nodes: List nodes currently running jobs;
* /usr/local/bin/dead-nodes: List dead nodes (where the exec daemon isn't responding, cf. qhost(1));
* /usr/local/sbin/disable-nodes: Disable queues on a list of nodes;
* /usr/local/sbin/disabled-nodes: List nodes disabled in one or more queues;
* /usr/local/sbin/enable-nodes: Enable all queues on a list of nodes;
* /usr/local/bin/idle-nodes: List nodes currently not running jobs;
* /usr/local/bin/nodes-in-job: List nodes used by a running job;
* /usr/local/sbin/qselect-node-list: Produce pdsh(1)-compatible host list according to qselect(1) criteria;
* /usr/local/sbin/sge-disable-submits: Disable job submission except for users in group test, e.g. to drain the cluster;
* /usr/local/sbin/sge-enable-submits: Re-enable general job submission;
* /usr/local/sbin/sge-restrict-nodes: Restrict a list of nodes to access only by group "testing";
* /usr/local/sbin/sge-unrestrict-nodes: Remove restriction to group "testing" on a list of nodes;
* /usr/local/sbin/sge-user-lists: Show consolidated per-node user and excluded user lists.
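
As an illustration of what scripts like dead-nodes and qselect-node-list have to do, here is a minimal Python sketch. It is not the Liverpool implementation (which is not shown here). It assumes that qhost(1) prints "-" in the LOAD column for hosts whose exec daemon is not reporting, and that qselect(1) prints one queue instance per line in the form queue@host. The function names and the sample data are hypothetical.

```python
# Hypothetical sample in the style of qhost(1) output; real column sets vary
# between Grid Engine versions, so the LOAD column is located by header name.
SAMPLE_QHOST = """\
HOSTNAME                ARCH         NCPU  LOAD  MEMTOT  MEMUSE  SWAPTO  SWAPUS
-------------------------------------------------------------------------------
global                  -               -     -       -       -       -       -
node001                 lx-amd64        8  0.52   15.6G    2.1G    4.0G     0.0
node002                 lx-amd64        8     -   15.6G       -    4.0G       -
"""


def dead_nodes(qhost_output: str) -> list[str]:
    """Return hosts whose LOAD column is "-" in qhost-style output.

    Raises ValueError if the header line has no LOAD column.
    """
    lines = [ln for ln in qhost_output.splitlines() if ln.strip()]
    if not lines:
        return []
    load_col = lines[0].split().index("LOAD")
    dead = []
    for line in lines[1:]:
        fields = line.split()
        # Skip the dashed separator line and the pseudo-host "global".
        if set(line.strip()) == {"-"} or fields[0] == "global":
            continue
        if len(fields) > load_col and fields[load_col] == "-":
            dead.append(fields[0])
    return dead


def pdsh_host_list(qselect_output: str) -> str:
    """Turn qselect-style "queue@host" lines into a comma-separated host list."""
    hosts = []
    for line in qselect_output.splitlines():
        _, sep, host = line.strip().partition("@")
        # Keep each host once, in first-seen order; ignore lines without "@".
        if sep and host and host not in hosts:
            hosts.append(host)
    return ",".join(hosts)


if __name__ == "__main__":
    print(dead_nodes(SAMPLE_QHOST))
    print(pdsh_host_list("all.q@node001\nall.q@node002\nlong.q@node001\n"))
```

The comma-separated string is the form accepted by pdsh(1) with -w. In practice the input would come from running qhost or qselect rather than from a literal string.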
Monitoring
* /usr/local/sbin/freeipmi-gmetric-temp: Out-of-band IPMI temperature monitoring for Ganglia with FreeIPMI. It is run from cron on the relevant nodes and is somewhat site-specific;
* /usr/local/sbin/smartd-nsca: Send smartd(8) reports to Nagios host running NSCA. Used in /etc/smartd.conf like:
    # Ignore temperature and power-on hours reports, use nsca to
    # report failures.
    /dev/sda -a -d sat -I 194 -I 231 -I 9 -m root@localhost -M daily -M exec /usr/local/sbin/smartd-nsca
* /etc/cron.daily/set-ipmi-times: cron job to synchronize service processor times.
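
To show the kind of translation a script such as smartd-nsca performs, here is a minimal Python sketch, not the actual script. It relies on two documented behaviours: smartd(8) exports SMARTD_DEVICE and SMARTD_MESSAGE in the environment of a program run via -M exec, and send_nsca reads tab-separated passive service check results (host, service, return code, output) on standard input. The function name, the service name "SMART" and the fixed CRITICAL status are assumptions made for illustration.

```python
import os


def nsca_service_line(env, nagios_host_name, service="SMART", status=2):
    """Build one passive service check line in the format send_nsca reads.

    env is a mapping such as os.environ; status 2 means CRITICAL in Nagios.
    The service name "SMART" is a hypothetical choice.
    """
    device = env.get("SMARTD_DEVICE", "unknown device")
    message = env.get("SMARTD_MESSAGE", "no message")
    # Collapse tabs and newlines so the report stays a single four-field line.
    text = " ".join(f"{device}: {message}".split())
    return f"{nagios_host_name}\t{service}\t{status}\t{text}\n"


if __name__ == "__main__":
    # os.uname().nodename stands in for whatever host name Nagios expects.
    print(nsca_service_line(os.environ, os.uname().nodename), end="")
```

The resulting line would then be piped to send_nsca, pointed at the Nagios host. That step is site-specific and is left out here.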
Miscellaneous
* /usr/local/sbin/remote-ipmitool: Run ipmitool(1) commands on a given node number's service processor;
* /usr/local/sbin/useradd-grid: Add a Globus user (site-specific);
* /usr/local/sbin/baystack-port: Look up a MAC address in Baystack port table;
* /usr/local/sbin/submit-reboot-job: Submit a job to reboot a given node when it is empty. (Somewhat site-specific.)
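
For a wrapper in the spirit of remote-ipmitool, the core step is mapping a node number to its service processor and building the ipmitool(1) command line. The Python sketch below shows one way to do that. It is not the Liverpool script: the host naming pattern, user name and password file path are all hypothetical placeholders. Only the ipmitool options -I, -H, -U and -f are taken from ipmitool(1).

```python
import subprocess


def ipmitool_argv(node_number, ipmi_args,
                  sp_pattern="node{:03d}-sp",             # hypothetical naming scheme
                  user="admin",                           # hypothetical user
                  password_file="/etc/ipmi-password"):    # hypothetical path
    """Return the argument list for running ipmitool against a node's
    service processor over the network (lanplus interface)."""
    host = sp_pattern.format(node_number)
    return ["ipmitool", "-I", "lanplus", "-H", host,
            "-U", user, "-f", password_file, *ipmi_args]


def run_remote_ipmitool(node_number, ipmi_args):
    """Run the command and return its exit status (requires ipmitool)."""
    return subprocess.run(ipmitool_argv(node_number, ipmi_args)).returncode
```

For example, ipmitool_argv(7, ["chassis", "power", "status"]) targets the host node007-sp under the assumed naming scheme. Passing an argument list rather than a shell string avoids quoting problems with the ipmitool subcommand.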