Charting Resources with Windows Performance Monitor

LSF integrates with Windows Performance Monitor, so you can chart LSF cluster, host, queue, and job performance information. Windows Performance Monitor can also be used to trigger external commands when specified thresholds are exceeded.

A service called LSF Monitor passes information from LSF to the Windows Performance Monitor. LSF Monitor must be installed separately.

LSF Monitor statistics

Once installed, LSF Monitor automatically sends information to the Windows Performance Monitor. Use the Windows Performance Monitor to chart LSF performance information.

The host, queue, and job objects support multiple instances.

The following LSF information is available:

  • Cluster information

  • Host information

  • Queue information

  • Job information

  • External information

Cluster information

  • Number of available servers

  • Number of unavailable servers

  • Number of servers where an LSF daemon (sbatchd or RES service) is down

  • Number of unlicensed servers

  • Number of pending jobs in the cluster

  • Number of running jobs in the cluster

  • Number of suspended jobs in the cluster

  • Number of sick jobs (jobs submitted with no password, jobs with job dependency never satisfied, and jobs pending more than 3 days)

  • Response time of LIM (as measured by the time to make an ls_load call)

  • Response time of mbatchd (as measured by the time to make an lsb_queueinfo call)
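The "sick jobs" counter above combines three conditions. The following is a minimal sketch of that rule, not LSF Monitor's actual implementation: the JobRecord type and its field names are hypothetical and do not correspond to any LSF API, and only the 3-day limit is taken from the list above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional

# "Jobs pending more than 3 days", as stated in the cluster information list.
PENDING_LIMIT = timedelta(days=3)


@dataclass
class JobRecord:
    """Hypothetical job summary; these fields are illustrative, not an LSF API."""
    job_id: int
    has_password: bool               # False if the job was submitted with no password
    dependency_unsatisfiable: bool   # True if the job dependency can never be satisfied
    pending_since: Optional[datetime]  # None when the job is not pending


def is_sick(job: JobRecord, now: datetime) -> bool:
    """Return True if the job meets any of the three 'sick job' conditions."""
    if not job.has_password:
        return True
    if job.dependency_unsatisfiable:
        return True
    if job.pending_since is not None and now - job.pending_since > PENDING_LIMIT:
        return True
    return False


def count_sick(jobs: Iterable[JobRecord], now: datetime) -> int:
    """Count the jobs that would contribute to a 'sick jobs' total."""
    return sum(1 for job in jobs if is_sick(job, now))
```

Passing the current time in explicitly keeps the check deterministic; a job that has been pending for exactly 3 days is not counted, because the rule says "more than".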

Host information

  • Load indices: r15s, r15m, mem, swap, pg, ut

  • Number of running jobs

  • Number of suspended jobs

  • Number of reserved job slots

  • External load indices

Queue information

  • Number of pending jobs

  • Number of running jobs

  • Number of suspended jobs

  • Number of reserved job slots

Job information

  • CPU time used by the job

  • Memory used by the job (for jobs running on UNIX only)

  • Swap space used by the job (for jobs running on UNIX only)

External information

  • Values of one or two external load indices (configured by the LSF administrator)

Install LSF Monitor

Before you begin

You must have a cluster running LSF version 4.0 or higher. LSF Monitor can be installed on any LSF server or client host that runs Windows; the cluster itself can also include UNIX hosts. You need a cluster administrator account and password to configure the service.

About this task

The LSF Monitor setup program is installed with LSF, but the LSF Monitor service is not installed until you run the setup program. LSF Monitor is not supported on 64-bit machines. Use lsfmon -install to install the LSF Monitor service.

Procedure

  1. Log on to a Windows host as an LSF user in an existing LSF cluster.
  2. In a command prompt, type:

    lsfmon -install

    LSF Monitor is installed.

  3. In the Windows Control Panel, click Services.

    The Services window opens.

  4. Right-click LSF Monitor and click Properties.
  5. In the Log On As section, deselect System Account, select This Account, and specify an LSF cluster administrator account (such as Administrator).
  6. Type in the password twice and click OK.
  7. In the Services window, select LSF Monitor and click Start to start the service.
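To confirm from a script that the service started, one option is to parse the output of the Windows sc query command. This is a minimal sketch, not part of LSF: the service name LSFMonitor is an assumption inferred from the registry subkey under Services that this document references, and the parser assumes the usual "STATE : <code> <NAME>" line that sc query prints.

```python
import subprocess
import sys
from typing import Optional

# Assumption: the service key name matches the Services registry subkey.
SERVICE_NAME = "LSFMonitor"


def parse_state(sc_output: str) -> Optional[str]:
    """Extract the state name (for example RUNNING or STOPPED) from `sc query` output."""
    for line in sc_output.splitlines():
        label, sep, rest = line.partition(":")
        if sep and label.strip() == "STATE":
            parts = rest.split()
            # The field holds a numeric code followed by the state name.
            return parts[1] if len(parts) >= 2 else None
    return None


def query_service_state(name: str = SERVICE_NAME) -> Optional[str]:
    """Run `sc query <name>` (Windows only) and return the parsed state, if any."""
    completed = subprocess.run(
        ["sc", "query", name], capture_output=True, text=True, check=False
    )
    return parse_state(completed.stdout)


if __name__ == "__main__" and sys.platform == "win32":
    print(query_service_state())
```

The parsing is kept separate from the subprocess call so that it can be checked against captured output on any platform.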

Configure LSF Monitor

Before you begin

Back up your registry before you make any changes.

About this task

You can configure the sample intervals for host, queue, and job information, and you can specify which external load indices LSF Monitor reports.

LSF Monitor periodically samples information from LSF and updates the Windows Performance Monitor.

By default, information is sampled at the following intervals:

  • Host information = 30 seconds

  • Queue information = 45 seconds

  • Job information = 60 seconds

Procedure

  1. Change the sample intervals for LSF host, job, or queue information by modifying the Windows Registry settings.
    1. Select the Registry subkey:
      HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LSFMonitor
    2. Edit the appropriate value, and specify the new sample interval in seconds:
      • SampleIntervalHost

      • SampleIntervalJob

      • SampleIntervalQueue

  2. Configure LSF Monitor to monitor external load indices.
    1. Go to the Registry subkey HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LSFMonitor.
    2. Specify the appropriate value and type the name of an external load index that is configured in your cluster:
      • ExternalLoadIndex1

      • ExternalLoadIndex2
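Before editing the registry, it can help to see which sample intervals are currently in effect. The following read-only sketch uses Python's standard winreg module to look up the three interval values under the subkey named above. It assumes that a missing value means the documented default applies, and it makes no assumption about the registry value type: whatever is stored is converted with int(). The function names are illustrative.

```python
import sys

# Subkey of HKEY_LOCAL_MACHINE, as given in the procedure above.
SUBKEY = r"SYSTEM\CurrentControlSet\Services\LSFMonitor"

# Default sample intervals in seconds, as documented.
DEFAULTS = {
    "SampleIntervalHost": 30,
    "SampleIntervalQueue": 45,
    "SampleIntervalJob": 60,
}


def read_registry_values(names):
    """Return {name: raw value} for the values present under the LSFMonitor subkey.

    Returns an empty dict on non-Windows platforms or if the subkey is missing.
    """
    if sys.platform != "win32":
        return {}
    import winreg  # Windows-only standard library module

    found = {}
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, SUBKEY) as key:
            for name in names:
                try:
                    value, _value_type = winreg.QueryValueEx(key, name)
                except FileNotFoundError:
                    continue  # value not set
                found[name] = value
    except FileNotFoundError:
        pass  # subkey not present, for example when LSF Monitor is not installed
    return found


def effective_intervals(raw):
    """Merge raw registry values over the defaults, validating each as positive seconds."""
    result = dict(DEFAULTS)
    for name, value in raw.items():
        if name not in DEFAULTS:
            continue
        seconds = int(value)
        if seconds <= 0:
            raise ValueError(f"{name} must be a positive number of seconds")
        result[name] = seconds
    return result


if __name__ == "__main__":
    print(effective_intervals(read_registry_values(DEFAULTS)))
```

Because the sketch only reads values, it is safe to run before and after a manual edit to confirm that the change was stored as intended.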

Administer LSF Monitor

Procedure

  • Start or stop LSF Monitor.

    Use the Windows Control Panel to start or stop the LSF Monitor service.

  • Use the Windows Event Viewer to view the Windows event log.

    Errors related to LSF API calls and the operation of LSF services are logged to the Windows event log.

  • Uninstall LSF Monitor. From a command prompt, type:

    lsfmon -remove

    This command stops the LSF Monitor service if it is running, then removes it and removes related information from the Windows Registry.