HPC features are installed on UNIX or Linux hosts as part of the PARALLEL template. Installation makes some configuration changes for you automatically. You should also add the appropriate resource names under the RESOURCES column of the Host section of lsf.cluster.cluster_name.
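As a minimal sketch of that manual step, a Host section in lsf.cluster.cluster_name might look like the following. The host names hostA and hostB and the column values other than RESOURCES are hypothetical placeholders. The resource names are taken from the Boolean resources that the installation defines in lsf.shared, so substitute the ones that match what each host actually provides.

```
Begin Host
HOSTNAME  model  type  server  r1m  mem  swp  RESOURCES
hostA     !      !     1       3.5  ()   ()   (ibmmpi)
hostB     !      !     1       3.5  ()   ()   (intelmpi openmp)
End Host
```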
The HPC feature installation automatically configures the following files:
lsb.modules
lsb.resources
lsb.queues
lsf.cluster.cluster_name
lsf.conf
lsf.shared
Adds the external scheduler plugin module names to the PluginModule section of lsb.modules:
Begin PluginModule
SCH_PLUGIN RB_PLUGIN SCH_DISABLE_PHASES
schmod_default () ()
schmod_fcfs () ()
schmod_fairshare () ()
schmod_limit () ()
schmod_parallel () ()
schmod_reserve () ()
schmod_mc () ()
schmod_preemption () ()
schmod_advrsv () ()
schmod_ps () ()
schmod_affinity () ()
#schmod_dc () ()
schmod_aps () ()
schmod_cpuset () ()
End PluginModule
The HPC plugin names must be configured after the standard LSF plugin names in the PluginModule list.
For IBM POE jobs, lsfinstall configures the ReservationUsage section in lsb.resources to reserve HPS resources on a per-slot basis.
Resource usage defined in the ReservationUsage section overrides the cluster-wide RESOURCE_RESERVE_PER_SLOT parameter defined in lsb.params if it also exists.
Begin ReservationUsage
RESOURCE METHOD
adapter_windows PER_SLOT
nrt_windows PER_SLOT
End ReservationUsage
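To illustrate the override, suppose lsb.params also contained the cluster-wide setting below. This is an assumed example, not something lsfinstall is stated to write. With the ReservationUsage section above in place, adapter_windows and nrt_windows would still be reserved per slot, because their per-resource METHOD takes precedence over the cluster-wide parameter.

```
# lsb.params (hypothetical cluster-wide setting, for illustration only)
Begin Parameters
RESOURCE_RESERVE_PER_SLOT = N
End Parameters
```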
Configures the following queues in lsb.queues: hpc_linux and hpc_linux_tv for Linux jobs and for debugging them, hpc_ibm for IBM POE jobs, and hpc_ibm_tv for debugging IBM POE jobs:
Begin Queue
QUEUE_NAME = hpc_linux
PRIORITY = 30
NICE = 20
#RUN_WINDOW = 5:19:00-1:8:30 20:00-8:30
#r1m = 0.7/2.0 # loadSched/loadStop
#r15m = 1.0/2.5
#pg = 4.0/8
#ut = 0.2
#io = 50/240
#CPULIMIT = 180/hostA # 3 hours of host hostA
#FILELIMIT = 20000
#DATALIMIT = 20000 # jobs data segment limit
#CORELIMIT = 20000
#PROCLIMIT = 5 # job processor limit
#USERS = all # users who can submit jobs to this queue
#HOSTS = all # hosts on which jobs in this queue can run
#PRE_EXEC = /usr/local/lsf/misc/testq_pre >> /tmp/pre.out
#POST_EXEC = /usr/local/lsf/misc/testq_post |grep -v Hey
DESCRIPTION = IBM Platform LSF 9.1 for linux.
End Queue
Begin Queue
QUEUE_NAME = hpc_linux_tv
PRIORITY = 30
NICE = 20
#RUN_WINDOW = 5:19:00-1:8:30 20:00-8:30
#r1m = 0.7/2.0 # loadSched/loadStop
#r15m = 1.0/2.5
#pg = 4.0/8
#ut = 0.2
#io = 50/240
#CPULIMIT = 180/hostA # 3 hours of host hostA
#FILELIMIT = 20000
#DATALIMIT = 20000 # jobs data segment limit
#CORELIMIT = 20000
#PROCLIMIT = 5 # job processor limit
#USERS = all # users who can submit jobs to this queue
#HOSTS = all # hosts on which jobs in this queue can run
#PRE_EXEC = /usr/local/lsf/misc/testq_pre >> /tmp/pre.out
#POST_EXEC = /usr/local/lsf/misc/testq_post |grep -v Hey
TERMINATE_WHEN = LOAD PREEMPT WINDOW
RERUNNABLE = NO
INTERACTIVE = NO
DESCRIPTION = IBM Platform LSF 9.1 for linux debug queue.
End Queue
Begin Queue
QUEUE_NAME = hpc_ibm
PRIORITY = 30
NICE = 20
#RUN_WINDOW = 5:19:00-1:8:30 20:00-8:30
#r1m = 0.7/2.0 # loadSched/loadStop
#r15m = 1.0/2.5
#pg = 4.0/8
#ut = 0.2
#io = 50/240
#CPULIMIT = 180/hostA # 3 hours of host hostA
#FILELIMIT = 20000
#DATALIMIT = 20000 # jobs data segment limit
#CORELIMIT = 20000
#PROCLIMIT = 5 # job processor limit
#USERS = all # users who can submit jobs to this queue
#HOSTS = all # hosts on which jobs in this queue can run
#PRE_EXEC = /usr/local/lsf/misc/testq_pre >> /tmp/pre.out
#POST_EXEC = /usr/local/lsf/misc/testq_post |grep -v Hey
RES_REQ = select[ poe > 0 ]
EXCLUSIVE = Y
REQUEUE_EXIT_VALUES = 133 134 135
DESCRIPTION = IBM Platform LSF 9.1 for IBM. This queue is to run POE jobs ONLY.
End Queue
Begin Queue
QUEUE_NAME = hpc_ibm_tv
PRIORITY = 30
NICE = 20
#RUN_WINDOW = 5:19:00-1:8:30 20:00-8:30
#r1m = 0.7/2.0 # loadSched/loadStop
#r15m = 1.0/2.5
#pg = 4.0/8
#ut = 0.2
#io = 50/240
#CPULIMIT = 180/hostA # 3 hours of host hostA
#FILELIMIT = 20000
#DATALIMIT = 20000 # jobs data segment limit
#CORELIMIT = 20000
#PROCLIMIT = 5 # job processor limit
#USERS = all # users who can submit jobs to this queue
#HOSTS = all # hosts on which jobs in this queue can run
#PRE_EXEC = /usr/local/lsf/misc/testq_pre >> /tmp/pre.out
#POST_EXEC = /usr/local/lsf/misc/testq_post |grep -v Hey
RES_REQ = select[ poe > 0 ]
REQUEUE_EXIT_VALUES = 133 134 135
TERMINATE_WHEN = LOAD PREEMPT WINDOW
RERUNNABLE = NO
INTERACTIVE = NO
DESCRIPTION = IBM Platform LSF 9.1 for IBM debug queue. This queue is to run POE jobs ONLY.
End Queue
For IBM POE jobs, configures the ResourceMap section of lsf.cluster.cluster_name to map the following shared resources for POE jobs to all hosts in the cluster:
Begin ResourceMap
RESOURCENAME LOCATION
adapter_windows [default]
ntbl_windows [default]
poe [default]
dedicated_tasks (0@[default])
ip_tasks (0@[default])
us_tasks (0@[default])
End ResourceMap
Sets the following parameters in lsf.conf:
LSB_SUB_COMMANDNAME=Y: Enables the LSF_SUB_COMMANDLINE environment variable required by esub.
LSF_ENABLE_EXTSCHEDULER=Y: LSF uses an external scheduler for topology-aware external scheduling.
LSB_CPUSET_BESTCPUS=Y: LSF schedules jobs based on the shortest CPU radius in the processor topology using a best-fit algorithm.
LSF_VPLUGIN="/opt/mpi/lib/pa1.1/libmpirm.sl": On HP-UX hosts, sets the full path to the HP vendor MPI library libmpirm.sl.
LSB_RLA_PORT=port_number, where port_number is the TCP port used for communication between the LSF HPC topology adapter (RLA) and sbatchd. The default port number is 6883.
LSB_SHORT_HOSTLIST=1: Displays an abbreviated list of hosts in bjobs and bhist for a parallel job where multiple processes of a job are running on a host. Multiple processes are displayed in the format processes*hostA.
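Taken together, the lsf.conf settings described above would look roughly like the fragment below. It is a sketch assembled from those descriptions, not a copy of a generated file: 6883 is the stated default for LSB_RLA_PORT, and the LSF_VPLUGIN line is shown commented out because it applies only on HP-UX hosts.

```
# lsf.conf entries for HPC features (illustrative sketch)
LSB_SUB_COMMANDNAME=Y
LSF_ENABLE_EXTSCHEDULER=Y
LSB_CPUSET_BESTCPUS=Y
LSB_RLA_PORT=6883
LSB_SHORT_HOSTLIST=1
# HP-UX hosts only:
# LSF_VPLUGIN="/opt/mpi/lib/pa1.1/libmpirm.sl"
```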
Defines the following shared resources required by HPC features in lsf.shared:
Begin Resource
RESOURCENAME TYPE INTERVAL INCREASING DESCRIPTION # Keywords
slurm Boolean () () (SLURM)
cpuset Boolean () () (CPUSET)
mpich_gm Boolean () () (MPICH GM MPI)
lammpi Boolean () () (LAM MPI)
mpichp4 Boolean () () (MPICH P4 MPI)
mvapich Boolean () () (Infiniband MPI)
sca_mpimon Boolean () () (SCALI MPI)
ibmmpi Boolean () () (IBM POE MPI)
hpmpi Boolean () () (HP MPI)
intelmpi Boolean () () (Intel MPI)
crayxt3 Boolean () () (Cray XT3 MPI)
crayx1 Boolean () () (Cray X1 MPI)
fluent Boolean () () (fluent availability)
ls_dyna Boolean () () (ls_dyna availability)
nastran Boolean () () (nastran availability)
pvm Boolean () () (pvm availability)
openmp Boolean () () (openmp availability)
ansys Boolean () () (ansys availability)
blast Boolean () () (blast availability)
gaussian Boolean () () (gaussian availability)
lion Boolean () () (lion availability)
scitegic Boolean () () (scitegic availability)
schroedinger Boolean () () (schroedinger availability)
hmmer Boolean () () (hmmer availability)
adapter_windows Numeric 30 N (free adapter windows on css0 on IBM SP)
ntbl_windows Numeric 30 N (free ntbl windows on IBM HPS)
poe Numeric 30 N (poe availability)
css0 Numeric 30 N (free adapter windows on css0 on IBM SP)
csss Numeric 30 N (free adapter windows on csss on IBM SP)
dedicated_tasks Numeric () Y (running dedicated tasks)
ip_tasks Numeric () Y (running IP tasks)
us_tasks Numeric () Y (running US tasks)
End Resource