Once a Session Scheduler session job has been dispatched and starts running, Session Scheduler parses the task definition file specified on the ssched command. Each line of the task definition file is one task. Tasks run on the hosts in the allocation in any order. Dependencies between tasks are not supported.
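Because each line of the task definition file is one task, such a file can be generated programmatically. The following Python sketch is an illustration only and is not part of Session Scheduler: the function name `write_task_file`, the file name `my.tasks`, and the sample commands are hypothetical, and it assumes every task is a plain single-line command.

```python
from pathlib import Path


def write_task_file(path, commands):
    """Write one task per line and return the number of tasks written."""
    lines = []
    for cmd in commands:
        cmd = cmd.strip()
        # A task occupies exactly one line, so an embedded newline would
        # silently turn one task into several.
        if not cmd or "\n" in cmd:
            raise ValueError(f"not a single-line task: {cmd!r}")
        lines.append(cmd)
    Path(path).write_text("".join(line + "\n" for line in lines))
    return len(lines)


if __name__ == "__main__":
    # Hypothetical commands. Tasks may run in any order, so no task
    # should rely on the output of another.
    n = write_task_file("my.tasks", [f"./simulate --seed {i}" for i in range(4)])
    print(f"wrote {n} tasks")
```

Since dependencies between tasks are not supported, any ordering requirement has to be handled outside the task definition file, for example by running separate sessions.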
Session Scheduler status is posted to the Session Scheduler session job through the LSF bpost command. Use bread or bjobs -l to view Session Scheduler status. The status includes the current number of pending, running and completed tasks. LSF administrators can configure how often the status is updated.
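For scripted monitoring, bread or bjobs -l can be invoked from a small wrapper. The Python sketch below is a hedged example: the helper names are hypothetical, it assumes the LSF commands are on the PATH, and it returns the raw command output unparsed because the layout of the status message is not described here.

```python
import subprocess


def status_command(job_id, long_format=False):
    """Build the argv used to view the status of a session job."""
    if int(job_id) <= 0:
        raise ValueError("job_id must be a positive integer")
    if long_format:
        return ["bjobs", "-l", str(job_id)]
    return ["bread", str(job_id)]


def read_status(job_id, long_format=False):
    """Run the command and return its raw output (needs an LSF environment)."""
    result = subprocess.run(
        status_command(job_id, long_format),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```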
When all tasks are completed, the Session Scheduler exits normally.
ssched runs under the submission user account. Any processes it creates, either locally or remotely, also run under the submission user account. Session Scheduler does not require any privileges beyond those normally granted to a user.
The Session Scheduler session job is compatible with all currently supported LSF job submission and execution parameters, including pre-execution, post-execution, job-starters, I/O redirection, queue and application profile configuration.
Run limits are interpreted and enforced as they are for normal LSF parallel jobs. Application-level checkpointing is also supported. Job chunking is not relevant to Session Scheduler jobs because a single Session Scheduler session is generally long running and should not be chunked.
If the Session Scheduler session is killed (bkill) or requeued (brequeue), Session Scheduler kills all running tasks, execution agents, and any other processes it has started, both local and remote. It also cleans up any temporary files it created and then exits. If the session is then requeued and restarted, all tasks are rerun.
If the Session Scheduler session is suspended (bstop), the Session Scheduler and all local and remote components will be stopped until the session is resumed (bresume).
ssched and the sservice and sschild execution agents ensure that the user's submission environment variables are set correctly for each task. To minimize the load on LSF, mbatchd has no knowledge of individual tasks.
Each line of the task definition file has the following format:
[task_options] command [arguments]
Jobs corresponding to the Session Scheduler session have one record in lsb.acct. This record represents the aggregate resource usage of all tasks in the allocation.
If task accounting is enabled with SSCHED_ACCT_DIR in lsb.params, Session Scheduler creates a task accounting file for each Session Scheduler session job and appends an accounting record to the end of the file for each task. The record format is similar to that of the LSF accounting file lsb.acct, but with additional fields.
The accounting file is named jobID.ssched.acct. If no directory is specified, accounting records are not written.
The Session Scheduler accounting directory must be accessible and writable from all hosts in the cluster. Each Session Scheduler session (each ssched instance) creates one accounting file. Each file contains one accounting entry for each task. Each completed task index has one line in the file. Each line records the resource usage of one task.
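Because each session writes a single file named jobID.ssched.acct with one line per completed task, the number of completed tasks recorded for a session can be found by counting lines. This is a minimal Python sketch, assuming the file sits directly in the configured accounting directory; the function names are hypothetical.

```python
from pathlib import Path


def acct_file_path(acct_dir, job_id):
    """Return the expected accounting file path for a session job."""
    return Path(acct_dir) / f"{job_id}.ssched.acct"


def count_task_records(acct_dir, job_id):
    """Count non-empty lines (one per completed task); 0 if no file exists."""
    path = acct_file_path(acct_dir, job_id)
    if not path.is_file():
        return 0
    with path.open() as fh:
        return sum(1 for line in fh if line.strip())
```

A count of 0 can mean either that no task has completed yet or that task accounting is not enabled, so the two cases need to be told apart by checking the configuration.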
| Field | Description |
|---|---|
| Event type (%s) | TASK_FINISH |
| Version Number (%s) | 9.1.2 |
| Event Time (%d) | Time the event was logged (in seconds since the epoch) |
| jobId (%d) | ID for the job |
| userId (%d) | UNIX user ID of the submitter |
| options (%d) | Always 0 |
| numProcessors (%d) | Always 1 |
| submitTime (%d) | Task enqueue time |
| beginTime (%d) | Always 0 |
| termTime (%d) | Always 0 |
| startTime (%d) | Task start time |
| userName (%s) | User name of the submitter |
| queue (%s) | Always empty |
| resReq (%s) | Always empty |
| dependCond (%s) | Always empty |
| preExecCmd (%s) | Task pre-execution command |
| fromHost (%s) | Submission host name |
| cwd (%s) | Execution host current working directory (up to 4094 characters) |
| inFile (%s) | Task input file name (up to 4094 characters) |
| outFile (%s) | Task output file name (up to 4094 characters) |
| errFile (%s) | Task error output file name (up to 4094 characters) |
| jobFile (%s) | Task script file name |
| numAskedHosts (%d) | Always 0 |
| askedHosts (%s) | Always empty |
| numExHosts (%d) | Always 1 |
| execHosts (%s) | Name of task execution host |
| jStatus (%d) | 64 indicates task completed normally. 32 indicates task exited abnormally |
| hostFactor (%f) | CPU factor of the task execution host |
| jobName (%s) | Always empty |
| command (%s) | Complete batch task command specified by the user (up to 4094 characters) |
| lsfRusage (%f) | All rusage fields contain resource usage information for the task |
| mailUser (%s) | Always empty |
| projectName (%s) | Always empty |
| exitStatus (%d) | UNIX exit status of the task |
| maxNumProcessors (%d) | Always 1 |
| loginShell (%s) | Always empty |
| timeEvent (%s) | Always empty |
| idx (%d) | Session Job Index |
| maxRMem (%d) | Always 0 |
| maxRSwap (%d) | Always 0 |
| inFileSpool (%s) | Always empty |
| commandSpool (%s) | Always empty |
| rsvId (%s) | Always empty |
| sla (%s) | Always empty |
| exceptMask (%d) | Always 0 |
| additionalInfo (%s) | Always empty |
| exitInfo (%d) | Always 0 |
| warningAction (%s) | Always empty |
| warningTimePeriod (%d) | Always 0 |
| chargedSAAP (%s) | Always empty |
| licenseProject (%s) | Always empty |
| options3 (%d) | Always 0 |
| app (%s) | Always empty |
| taskID (%d) | Task ID |
| taskIdx (%d) | Task index |
| taskName (%s) | Task name |
| taskOptions (%d) | Bit mask of task options: |
| taskExitReason (%d) | Task exit reason: |
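
To show how the field order listed above might be used, the following Python sketch splits one accounting record and names its leading fields. It assumes, based only on the stated similarity to lsb.acct, that fields are separated by whitespace and that string fields are enclosed in double quotes; that layout is an assumption and is not confirmed here. Only the fields up to numAskedHosts are mapped, because askedHosts and lsfRusage may span a variable number of columns. The helper name, the dictionary keys for the first three fields, and the sample record are hypothetical.

```python
import shlex

# Leading fields in documented order, with the converter implied by %s / %d.
LEADING_FIELDS = [
    ("eventType", str), ("version", str), ("eventTime", int),
    ("jobId", int), ("userId", int), ("options", int),
    ("numProcessors", int), ("submitTime", int), ("beginTime", int),
    ("termTime", int), ("startTime", int), ("userName", str),
    ("queue", str), ("resReq", str), ("dependCond", str),
    ("preExecCmd", str), ("fromHost", str), ("cwd", str),
    ("inFile", str), ("outFile", str), ("errFile", str),
    ("jobFile", str), ("numAskedHosts", int),
]


def parse_leading_fields(line):
    """Map the first columns of one TASK_FINISH record to named values."""
    # Assumed quoting: "string" fields in double quotes, bare numbers.
    tokens = shlex.split(line)
    if len(tokens) < len(LEADING_FIELDS):
        raise ValueError("record has fewer columns than expected")
    record = {
        name: conv(tok) for (name, conv), tok in zip(LEADING_FIELDS, tokens)
    }
    if record["eventType"] != "TASK_FINISH":
        raise ValueError(f"unexpected event type: {record['eventType']!r}")
    return record


if __name__ == "__main__":
    # Hypothetical record, truncated after numAskedHosts for illustration.
    sample = ('"TASK_FINISH" "9.1.2" 1400000100 1234 501 0 1 1400000000 0 0 '
              '1400000005 "alice" "" "" "" "" "hostA" "/home/alice" '
              '"" "out.txt" "" "" 0')
    rec = parse_leading_fields(sample)
    # Queue wait of the task: start time minus enqueue time, in seconds.
    print(rec["jobId"], rec["userName"], rec["startTime"] - rec["submitTime"])
```

Parsing by position stops early on purpose: mapping the later columns, such as jStatus or taskID, would first require confirming how many columns the variable-length fields occupy in a real file.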