
HOWTO
- Introduction
- User guide
- Description of the different commands
- Visualisation tools
- Admin guide
- Admin commands
- Database scheme
- Some tricks
1. Introduction
OAR is a resource manager (or batch scheduler) for large clusters.
In terms of functionality, it is close to PBS, LSF, CCS and Condor.
It is suitable for production platforms and research experiments.
2. User guide
2.1. Description of the different commands
All user commands are installed on the cluster login nodes, so you
must first connect to one of these machines.
The oarstat command prints the jobs in execution on the terminal.
Options:
-f : prints each job in full detail
-a : prints more details and keeps the table format
Examples:
# oarstat
# oarstat -f
The oarnodes command prints information about the cluster nodes
(state, which jobs run on which nodes, node properties, ...).
Example:
# oarnodes
The user submits a job with the oarsub command.
So what is a job in our context?
A job is defined by the resources it needs and a script/program to
run. The user must therefore specify how many nodes and what kind of
resources the application needs. OAR then grants or refuses the
request and controls the execution.
When a job is launched, OAR executes the user program only on the
first reserved node. The program can read some environment variables
to learn about its environment:
$OAR_NODEFILE : contains the name of a file which lists all the nodes reserved for this job
$OAR_JOBID : contains the OAR job identifier
$OAR_NB_NODES : contains the number of reserved nodes
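As a hedged illustration of how a job program might use these variables, the Python sketch below reads them and loads the node list. The helper name read_oar_environment is hypothetical, and the sketch assumes that the node file holds one hostname per line, which this guide does not state explicitly.

```python
import os


def read_oar_environment(environ=os.environ):
    """Collect the OAR job context from environment variables.

    Assumption: the file named by OAR_NODEFILE holds one hostname per line.
    """
    node_file = environ["OAR_NODEFILE"]
    with open(node_file) as handle:
        # Ignore blank lines so a trailing newline does not create an empty host.
        nodes = [line.strip() for line in handle if line.strip()]
    return {
        "job_id": environ["OAR_JOBID"],
        "nb_nodes": int(environ["OAR_NB_NODES"]),
        "nodes": nodes,
    }


if __name__ == "__main__":
    if "OAR_NODEFILE" in os.environ:
        context = read_oar_environment()
        print("job", context["job_id"], "runs on", context["nodes"])
    else:
        print("not running inside an OAR job")
```

Because OAR starts the program on the first reserved node only, a program of this kind would typically use the node list to start its own processes on the other nodes.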
Options:
-q queuename : specify the queue for this job
-I : turn on INTERACTIVE mode (OAR gives you a shell instead of executing a script)
-l : define the list of resources requested for this job; the parameters are:
  nodes : number of nodes requested
  weight : the weight that you want to reserve on each node
  walltime : maximum time requested. Format is [hour:mn:sec|hour:mn|hour]; once this time has elapsed, the job is killed
-p "properties" : specify reservation properties with SQL syntax
-r "2004-05-11 23:32:03" : ask for a reservation job that begins at the date given as argument
-c jobId : connect to a reservation in the Running state
-v : turn on verbose mode
Examples:
# oarsub test.sh
(the "test.sh" script will be run on 1 node of default weight in the default queue with a walltime of 1 hour)
# oarsub -l nodes=2,walltime=2:15:00 test.sh
(the "test.sh" script will be run on 2 nodes of default weight in the default queue with a walltime of 2:15:00)
# oarsub -p "hostname = 'host2' OR hostname = 'host3'" test.sh
(the "test.sh" script will be run on the node host2 or on the node host3)
# oarsub -I
(gives a shell on a node)
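The walltime format [hour:mn:sec|hour:mn|hour] accepted by oarsub can be made concrete with a small parser. This is only a sketch: walltime_to_seconds is a hypothetical helper, not part of OAR, and it assumes that missing minute and second fields count as zero.

```python
def walltime_to_seconds(walltime):
    """Convert 'hour', 'hour:mn' or 'hour:mn:sec' into a number of seconds."""
    parts = walltime.split(":")
    if not 1 <= len(parts) <= 3:
        raise ValueError("expected hour[:mn[:sec]], got %r" % walltime)
    # Pad the missing minute and second fields with zeros.
    hours, minutes, seconds = [int(p) for p in parts] + [0] * (3 - len(parts))
    if hours < 0 or not 0 <= minutes < 60 or not 0 <= seconds < 60:
        raise ValueError("field out of range in %r" % walltime)
    return hours * 3600 + minutes * 60 + seconds


print(walltime_to_seconds("2:15:00"))  # walltime used in the second example
print(walltime_to_seconds("1"))        # one hour, written in the short form
```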
Users can delete their jobs with the oardel command.
Example:
# oardel 14
(deletes job 14)
2.2. Visualisation tools
Monika is a web CGI normally installed on the cluster front-end.
This tool runs oarnodes and oarstat, then formats the data into an
HTML page. It gives you a global view of the cluster state and of
where your jobs are running.
(Monika screenshot)
DrawOARGantt is also a web CGI. It creates a Gantt chart which shows
how jobs are distributed over the nodes through time. It is very
useful to see cluster occupation in the past and to know when a job
will be launched in the future.
(DrawOARGantt screenshot)
3. Admin guide
3.1. Admin commands
The oarnodesetting command must be run by the oar user. It changes
the state of a node dynamically, or adds a new node to the OAR
database if it does not already exist.
Options:
-s : specify the new node state (Alive, Absent or Dead)
-h : specify the node name
-w : specify maxWeight for this node. This option is relevant only for a new node; otherwise it is ignored.
Examples:
# oarnodesetting -s Alive -h host1.imag.fr -w 2
(adds a new node "host1" to the OAR database with a maxWeight of 2)
# oarnodesetting -s Absent -h host1.imag.fr
(puts node "host1" in the Absent state, so it becomes inaccessible in OAR)
3.2. Database scheme
TABLE jobs :
Each oarsub inserts a new line in this table.
idJob INT UNSIGNED NOT NULL AUTO_INCREMENT
Job identity number. It is given to users when they submit a job.
jobType ENUM('INTERACTIVE','PASSIVE') DEFAULT 'PASSIVE' NOT NULL
INTERACTIVE means that the user asks for a shell on the reserved
nodes. PASSIVE means that the job is a script/executable to run on
the reserved nodes.
infoType VARCHAR(255)
String with syntax "host:port". This field is not NULL for
interactive jobs. "host" is the host where the oarsub command was
launched and "port" is the socket port on which oarsub waits for a
connection. When the interactive job is run, OAR connects to the
oarsub socket to wake it up and launch its bipbip (log on the
reserved nodes).
state ENUM('Waiting','Hold','toLaunch','toError','toAckReservation','Launching','Running','Terminated','Error') NOT NULL
This is the job state.
message VARCHAR(255)
Log message. This is useful when a job is in the Error state and we
want to know why.
user VARCHAR(20) NOT NULL
User name.
nbNodes INT UNSIGNED NOT NULL
Number of nodes to reserve.
weight INT UNSIGNED NOT NULL
Weight to reserve on each node.
command VARCHAR(255) NOT NULL
The command to launch if it is a PASSIVE job.
bpid VARCHAR(255)
String with syntax "host:pid:port". "host" is the hostname where
bipbip is launched. "pid" is the process id of bipbip. "port" is the
socket port opened by bipbip. Leon connects to this socket to give
kill instructions.
queueName VARCHAR(100) NOT NULL
Queue used for this job.
reservation ENUM('None','toSchedule','Scheduled') DEFAULT 'None' NOT NULL
This is for jobs in reservation mode (the different states of a
reservation job).
maxTime TIME NOT NULL
Walltime for this job.
properties VARCHAR(255)
This string is a sub-request that gives constraints on the nodes
that this job wants. It is an SQL "WHERE" clause on the
nodeProperties table.
launchingDirectory VARCHAR(255) DEFAULT ' ' NOT NULL
This is the directory from which the user launched the oarsub
command.
submissionTime DATETIME NOT NULL
Time when the job was inserted in the database.
startTime DATETIME NOT NULL
Time when the job started its execution.
stopTime DATETIME NOT NULL
Time when the job finished or was killed.
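The infoType and bpid fields of the jobs table pack several values into one colon-separated string. A minimal sketch of how a tool reading this table might split them is given below. The function names and the sample values are made up for illustration, and the sketch assumes that host names contain no colon.

```python
from collections import namedtuple

InfoType = namedtuple("InfoType", "host port")
Bpid = namedtuple("Bpid", "host pid port")


def parse_info_type(value):
    """Split an infoType string with syntax "host:port"."""
    host, port = value.split(":")
    return InfoType(host, int(port))


def parse_bpid(value):
    """Split a bpid string with syntax "host:pid:port"."""
    host, pid, port = value.split(":")
    return Bpid(host, int(pid), int(port))


# Sample values invented for this example.
print(parse_info_type("login1:45678"))
print(parse_bpid("host2:1234:45679"))
```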
TABLE admissionRules :
This table is used when a new job is submitted. It lets you define a
default behaviour for the properties that the user did not give. For
example, you can specify a default walltime when it is not set on
the oarsub command line.
rule VARCHAR(255) NOT NULL
This is a string of Perl code.
TABLE nodes :
This table contains node information.
hostname VARCHAR(100) NOT NULL
Node name.
state ENUM('Alive','Dead','Suspected','Absent') NOT NULL
Node state:
- Alive : this node can be reserved.
- Absent : the node is not in the pool but will come back soon.
- Dead : this node is out of order and will not come back soon.
- Suspected : OAR suspects that the node is down.
maxWeight INT UNSIGNED DEFAULT 1 NOT NULL
Maximum weight for the node. For example, you can give a weight of 2
to a dual-processor computer, so that users can reserve half a node.
weight INT UNSIGNED NOT NULL
Current weight used on this node.
nextState ENUM('UnChanged','Alive','Dead','Absent','Suspected') DEFAULT 'UnChanged' NOT NULL
This field is used for dynamic nodes. When a node wants to change
its state, this field is set to the next state and OAR manages the
transition. For example, OAR can kill some jobs when a node leaves.
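To illustrate how maxWeight and weight relate, the sketch below computes the weight still free on a node and checks whether a request fits. The rule "requested weight <= maxWeight - weight" is an assumption drawn from the field descriptions above, not a description of the actual OAR scheduler, and the Node class is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    hostname: str
    state: str       # 'Alive', 'Dead', 'Suspected' or 'Absent'
    max_weight: int  # maxWeight column
    weight: int      # weight currently used


def free_weight(node):
    """Weight still available on the node."""
    return node.max_weight - node.weight


def can_host(node, requested_weight):
    """Assumed rule: only Alive nodes with enough free weight can be reserved."""
    return node.state == "Alive" and requested_weight <= free_weight(node)


dual = Node("host1", "Alive", max_weight=2, weight=1)
print(can_host(dual, 1))  # half of the dual-processor node is still free
print(can_host(dual, 2))  # the whole node is no longer available
```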
TABLE nodeState_log :
hostname VARCHAR(100) NOT NULL
Node hostname.
changeState ENUM('Alive','Dead','Suspected','Absent') NOT NULL
Node state after the change.
date DATETIME NOT NULL
Event date.
TABLE nodeProperties :
This table specifies some node properties. You can add any property
that you want (just add new fields).
hostname VARCHAR(100) NOT NULL
Hostname that you can find in the nodes table.
besteffort ENUM('YES','NO') DEFAULT 'YES' NOT NULL
This property indicates whether a node accepts besteffort jobs.
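Since the properties field of a job is an SQL WHERE clause on nodeProperties, node selection can be pictured as a plain SELECT. The sketch below uses Python's built-in sqlite3 module with an in-memory table in place of the real database. The matching_nodes helper and the sample rows are hypothetical, sqlite has no ENUM type so besteffort is stored as text, and the query that OAR really issues may differ.

```python
import sqlite3


def matching_nodes(connection, properties):
    """Return hostnames whose nodeProperties row satisfies the WHERE clause.

    The clause is pasted into the query as-is, so it must come from a
    trusted source (here, the example itself).
    """
    query = "SELECT hostname FROM nodeProperties WHERE " + properties
    return sorted(row[0] for row in connection.execute(query))


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE nodeProperties ("
    " hostname VARCHAR(100) NOT NULL,"
    " besteffort TEXT NOT NULL DEFAULT 'YES')"
)
connection.executemany(
    "INSERT INTO nodeProperties (hostname, besteffort) VALUES (?, ?)",
    [("host1", "YES"), ("host2", "NO"), ("host3", "YES")],
)

# Same clause as in the oarsub -p example of the user guide.
print(matching_nodes(connection, "hostname = 'host2' OR hostname = 'host3'"))
print(matching_nodes(connection, "besteffort = 'YES'"))
```

Adding a new property then amounts to adding a column to the table and referring to it in the clause.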
TABLE processJobs :
This table links current jobs and nodes (you can find out where a
job is launched).
idJob INT UNSIGNED NOT NULL
Job identity.
hostname VARCHAR(100) NOT NULL
Node where the job is running.
TABLE processJobs_log :
This is the same table as processJobs but it contains old jobs.
idJob INT UNSIGNED NOT NULL
hostname VARCHAR(100) NOT NULL
TABLE fragJobs :
When a job is killed, this table is filled in.
fragIdJob INT UNSIGNED NOT NULL
Identity of the job to kill.
fragDate DATETIME NOT NULL
Request date.
fragState ENUM('LEON','TIMER_ARMED','LEON_EXTERMINATE','FRAGGED') DEFAULT 'LEON' NOT NULL
Job kill state:
- LEON : a "soft" Leon must be run on this job.
- TIMER_ARMED : a Leon was launched and we are waiting for the job to end.
- LEON_EXTERMINATE : a "hard" Leon must be run on this job.
- FRAGGED : the job is fragged, nothing more to do.
TABLE queue :
This table gives the scheduler to use for each queue.
queueName VARCHAR(100) NOT NULL
Queue name, which you can also find in the jobs table.
priority INT UNSIGNED NOT NULL
Queue priority.
schedulerPolicy VARCHAR(100) NOT NULL
Name of the program that corresponds to the scheduler implementing
this policy.
state ENUM('Active','notActive') NOT NULL DEFAULT 'Active'
Lets you activate or deactivate a queue.
TABLE ganttJobsPrediction :
This table stores the scheduler decisions. It tells you when a job
will start.
idJob INT UNSIGNED NOT NULL
Job identity.
startTime DATETIME NOT NULL
Date when the job will start.
TABLE ganttJobsNodes :
This table indicates which nodes will be assigned to a job.
idJob INT UNSIGNED NOT NULL
Job identity.
hostname VARCHAR(100) NOT NULL
Node assigned by the scheduler.
DEFAULT DATA IN DATABASE:
INSERT IGNORE INTO `admissionRules` ( `rule` ) VALUES
('if (not defined($maxTime)) {$maxTime = "1:00:00";}');
The default walltime is 1 hour.
INSERT IGNORE INTO `admissionRules` ( `rule` ) VALUES
('if (not defined($queueName)) {$queueName="default";}');
The default job queue is "default".
INSERT IGNORE INTO `admissionRules` ( `rule` ) VALUES
('if ((defined($maxTime)) && ($jobType eq "INTERACTIVE") && (sql_to_duration($maxTime) > sql_to_duration("12:00:00"))) {$maxTime = "12:00:00";}');
The maximum walltime for an interactive job is 12 hours.
INSERT IGNORE INTO `admissionRules` ( `rule` ) VALUES
('if (($queueName eq "admin") && ($user ne "oar")) {$queueName="default";}');
Only the oar user can use the admin queue, so that its jobs can pass
before all other waiting jobs.
INSERT IGNORE INTO `queue` (`queueName` , `priority` , `schedulerPolicy`) VALUES
('default','1','oar_sched_fifo_queue_killer');
Defines the scheduler to use for the default queue.
INSERT IGNORE INTO `queue` (`queueName` , `priority` , `schedulerPolicy`) VALUES
('besteffort','0','oar_sched_fifo_queue');
Defines the scheduler to use for the besteffort queue.
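The four default admission rules can be read as a small chain of rewrites applied to each submitted job. The Python sketch below mirrors them for illustration only: the real rules are Perl strings stored in admissionRules, and apply_default_rules and to_seconds are hypothetical stand-ins (the latter for sql_to_duration, whose exact behaviour is not described in this guide).

```python
def to_seconds(duration):
    """Stand-in for sql_to_duration: 'h:mm:ss' to seconds (assumed semantics)."""
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def apply_default_rules(job):
    """Return a copy of the job dict after the four default rules."""
    job = dict(job)
    # Rule 1: default walltime of 1 hour.
    if job.get("maxTime") is None:
        job["maxTime"] = "1:00:00"
    # Rule 2: default queue.
    if job.get("queueName") is None:
        job["queueName"] = "default"
    # Rule 3: interactive jobs are capped at 12 hours.
    if (job["jobType"] == "INTERACTIVE"
            and to_seconds(job["maxTime"]) > to_seconds("12:00:00")):
        job["maxTime"] = "12:00:00"
    # Rule 4: only the oar user may stay in the admin queue.
    if job["queueName"] == "admin" and job["user"] != "oar":
        job["queueName"] = "default"
    return job


print(apply_default_rules({"user": "alice", "jobType": "INTERACTIVE",
                           "maxTime": "24:00:00", "queueName": "admin"}))
```

The rules are applied in insertion order here, so the defaults set by the first two rules are visible to the last two; whether OAR guarantees that order is not stated in this guide.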
3.3. Some tricks