LSF integrates with Windows Performance Monitor, so you can chart LSF cluster, host, queue, and job performance information. Windows Performance Monitor can also be used to trigger external commands when specified thresholds are exceeded.
A service called LSF Monitor passes information from LSF to the Windows Performance Monitor. LSF Monitor must be installed separately.
Once installed, LSF Monitor automatically sends information to the Windows Performance Monitor, where you can chart it.
The host, queue, and job objects support multiple instances.
The following LSF information is available:
Cluster information
Host information
Queue information
Job information
External information

Cluster information:
Number of available servers
Number of unavailable servers
Number of servers where an LSF daemon (sbatchd or RES service) is down
Number of unlicensed servers
Number of pending jobs in the cluster
Number of running jobs in the cluster
Number of suspended jobs in the cluster
Number of sick jobs (jobs submitted with no password, jobs with job dependency never satisfied, and jobs pending more than 3 days)
Response time of LIM (as measured by the time to make an ls_load call)
Response time of mbatchd (as measured by the time to make an lsb_queueinfo call)

Host information:
Load indices: r15s, r15m, mem, swap, pg, ut
Number of running jobs
Number of suspended jobs
Number of reserved job slots
External load indices

Queue information:
Number of pending jobs
Number of running jobs
Number of suspended jobs
Number of reserved job slots

Job information:
CPU time used by the job
Memory used by the job (for jobs running on UNIX only)
Swap space used by the job (for jobs running on UNIX only)

External information:
Values of one or two external load indices (configured by the LSF administrator)
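The "sick jobs" counter in the cluster information combines three separate conditions. The following Python sketch shows one way those conditions could be evaluated. It is an illustration only: the PendingJob record, its field names, and the is_sick and count_sick helpers are hypothetical and are not part of LSF or LSF Monitor. Only the three conditions and the 3-day threshold come from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Threshold from the text: jobs pending more than 3 days count as sick.
MAX_PENDING = timedelta(days=3)


@dataclass
class PendingJob:
    """Hypothetical job record for illustration; not an LSF data structure."""
    job_id: int
    submitted_at: datetime
    has_password: bool = True            # False if submitted with no password
    dependency_satisfiable: bool = True  # False if the dependency is never satisfied


def is_sick(job: PendingJob, now: datetime) -> bool:
    """Return True if the job meets any of the three 'sick job' conditions."""
    if not job.has_password:
        return True
    if not job.dependency_satisfiable:
        return True
    # Strictly more than 3 days, matching "pending more than 3 days".
    return now - job.submitted_at > MAX_PENDING


def count_sick(jobs, now: datetime) -> int:
    """Count the sick jobs in an iterable of PendingJob records."""
    return sum(1 for job in jobs if is_sick(job, now))
```

A job that matches more than one condition is still counted once, because the conditions are combined with a logical OR.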
You must have a cluster running LSF version 4.0 or higher. The cluster can include UNIX hosts. LSF Monitor can be installed on any LSF server or client host running Windows. You must specify a cluster administrator account and password.
The LSF Monitor setup program is installed with LSF. LSF Monitor is not supported on 64-bit machines. To install the LSF Monitor service, run lsfmon -install.
Back up your registry before you make any changes.
You can configure sample intervals for host, queue and job information along with external load indices.
LSF Monitor periodically samples information from LSF and updates the Windows Performance Monitor.
By default, information is sampled at the following intervals:
Host information = 30 seconds
Queue information = 45 seconds
Job information = 60 seconds
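To see how the three default intervals interleave, the following Python sketch lists the sample times that fall within a given window. It is a simplified model, not LSF Monitor's implementation: the sample_schedule function and the DEFAULT_INTERVALS name are hypothetical, and the sketch assumes each category is first sampled one full interval after the service starts.

```python
import heapq

# Default sample intervals from the text, in seconds.
DEFAULT_INTERVALS = {"host": 30, "queue": 45, "job": 60}


def sample_schedule(intervals, horizon):
    """Return (time, category) pairs for every sample due within `horizon` seconds.

    Assumption: the first sample of each category is taken one full
    interval after start, and later samples follow at fixed spacing.
    """
    if any(interval <= 0 for interval in intervals.values()):
        raise ValueError("sample intervals must be positive")

    # Min-heap ordered by next due time, then by category name.
    heap = [(interval, name) for name, interval in intervals.items()]
    heapq.heapify(heap)

    events = []
    while heap and heap[0][0] <= horizon:
        due, name = heapq.heappop(heap)
        events.append((due, name))
        # Schedule the next sample for this category.
        heapq.heappush(heap, (due + intervals[name], name))
    return events


if __name__ == "__main__":
    for due, name in sample_schedule(DEFAULT_INTERVALS, 180):
        print(f"t={due:>3}s  sample {name} information")
```

Under these assumptions, host information is refreshed twice for every job sample, so charts of host counters show finer time resolution than charts of job counters.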