run(self,
mode='local',
dataDirRoot='.',
isContinue=False,
isForceContinue=False,
nCores=None,
memMb=None,
isDryRun=False,
retryMax=2,
retryWait=90,
retryWindow=360,
retryMode='nonlocal',
mailTo=None,
updateInterval=60,
schedulerArgList=None,
isQuiet=False,
warningLogFile=None,
errorLogFile=None,
successMsg=None,
startFromTasks=None,
ignoreTasksAfter=None,
resetTasks=None)
|
|
Call this method to execute the workflow() method overridden in a
child class and specify the resources available for the workflow to
run.
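The resource defaults documented under Parameters can be summarised in a small sketch. The helper name `resolve_resources` and its `sge_max_jobs` argument are hypothetical and are not part of pyflow; the handling of 'unlimited' is an assumption wherever the text is silent.

```python
def resolve_resources(mode="local", nCores=None, memMb=None, sge_max_jobs=128):
    """Hypothetical helper mirroring the documented defaults; not pyflow's own code."""
    if mode not in ("local", "sge"):
        raise ValueError("mode must be 'local' or 'sge'")

    if nCores is None:
        # Documented default: 1 for local mode, 128 for sge mode.
        nCores = 1 if mode == "local" else sge_max_jobs
    elif mode == "sge" and nCores != "unlimited" and nCores > sge_max_jobs:
        # Documented: values above the sge maximum are reduced to the maximum.
        # Assumption: 'unlimited' is passed through unchanged, since the text
        # does not say how it is treated in sge mode.
        nCores = sge_max_jobs

    if mode == "local":
        if memMb is None:
            # Documented default: 2048*nCores for local mode.
            # Assumption: an unlimited core count implies unlimited memory.
            memMb = "unlimited" if nCores == "unlimited" else 2048 * nCores
    else:
        # Documented: memMb is ignored in non-local modes; modelled here by
        # always reporting 'unlimited'.
        memMb = "unlimited"

    return nCores, memMb
```

For instance, `resolve_resources(nCores=4)` yields 4 cores and 8192 MB in this sketch, while any memMb value passed together with `mode="sge"` is discarded.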
Task retry behavior: Retry attempts will be made per the arguments
below for distributed workflow runs (e.g. sge run mode). Note this means
that retries will be attempted for tasks with an 'isForceLocal' setting
during distributed runs.
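One way to read the retry rules, combining this paragraph with the retryMax, retryWindow and retryMode entries under Parameters, is sketched here. The function `retry_allowed` and its argument names are hypothetical, and the special case for failed make jobs is not modelled.

```python
def retry_allowed(run_mode, retry_mode, attempts_so_far, seconds_since_first_submit,
                  retry_max=2, retry_window=360):
    """Hypothetical decision function for whether a failed task may be resubmitted."""
    if retry_mode not in ("nonlocal", "all"):
        raise ValueError("retry_mode must be 'nonlocal' or 'all'")

    # 'nonlocal': no retries when the whole run is in local mode.
    # The test is on the run mode, not the task, so an isForceLocal task
    # inside an sge run is still retried.
    if retry_mode == "nonlocal" and run_mode == "local":
        return False

    # Never exceed the maximum number of retries.
    if attempts_so_far >= retry_max:
        return False

    # A window of zero or less means no time limit on retries.
    if retry_window > 0 and seconds_since_first_submit > retry_window:
        return False

    return True
```

In the real scheduler the resubmission would additionally be delayed by retryWait seconds; that delay is left out of the sketch.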
Task error behavior: When a task error occurs the task manager stops
submitting new tasks and allows all currently running tasks to complete.
Note that in this case 'task error' means that the task could not be
completed after exhausting attempted retries.
Workflow exception behavior: Any exceptions thrown from the python
code of classes derived from WorkflowRunner will be logged and trigger
notification (e.g. email). The exception will not propagate to the
client's stack. In sub-workflows the exception is handled exactly like a
task error (i.e. task submission is shut down and remaining tasks are
allowed to complete). An exception in the master workflow will lead to
workflow termination without waiting for currently running tasks to
finish.
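From the caller's side, the consequence is that failures show up in the return value of run() rather than as a raised exception. The stand-in below only models that contract; `SketchRunner`, `Broken` and `Fine` are hypothetical classes, not pyflow code, and task draining and notification are not modelled.

```python
import logging


class SketchRunner:
    """Hypothetical stand-in for a WorkflowRunner-like class."""

    def workflow(self):
        raise NotImplementedError("override workflow() in a child class")

    def run(self):
        try:
            self.workflow()
        except Exception:
            # Log the failure instead of letting it reach the caller.
            logging.getLogger("sketch").exception("workflow raised an exception")
            return 1
        return 0


class Broken(SketchRunner):
    def workflow(self):
        raise RuntimeError("boom")


class Fine(SketchRunner):
    def workflow(self):
        pass


broken_status = Broken().run()  # the RuntimeError is logged, not re-raised
fine_status = Fine().run()
```

A client would therefore check the integer status, for example by passing it to `sys.exit()`, instead of wrapping run() in a try/except block.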
- Parameters:
mode - Workflow run mode. Current options are (local|sge)
dataDirRoot - All workflow data is written to {dataDirRoot}/pyflow.data/. This
includes workflow/task logs, persistent task state data, and
summary run info. Two workflows cannot simultaneously use the
same dataDir.
isContinue - If True, continue workflow from a previous incomplete run based
on the workflow data files. You must use the same dataDirRoot as
a previous run for this to work. Set to 'Auto' to have the run
continue only if the previous dataDir exists. (default: False)
isForceContinue - Only used if isContinue is not False. Normally, when a run is
continued, the commands of completed tasks are checked to ensure they
match. When isForceContinue is True, failing this check is
reduced from an error to a warning.
nCores - Total number of cores available, or 'unlimited'. sge is currently
configured for a maximum job count of 128; any value higher than
this in sge mode will be reduced to the maximum. (default: 1 for
local mode, 128 for sge mode)
memMb - Total memory available (in megabytes), or 'unlimited'. Note that
this value will be ignored in non-local modes (such as sge),
because in this case total memory available is expected to be
known by the scheduler for each node in its cluster. (default:
2048*nCores for local mode, 'unlimited' for sge mode)
isDryRun - List the commands to be executed without running them. Note that
recursive and dynamic workflows will potentially have to account
for the fact that expected files will be missing -- here
'recursive workflow' refers to any workflow which uses the
addWorkflowTask() method, and 'dynamic workflow' refers to any
workflow which uses the waitForTasks() method. These types of
workflows can query this status with the isDryRun() method to make
accommodations. (default: False)
retryMax - Maximum number of task retries. (default: 2)
retryWait - Delay (in seconds) before resubmitting a task. (default: 90)
retryWindow - Maximum time (in seconds) after the first task submission in
which retries are allowed. A value of zero or less puts no limit
on the time when retries will be attempted. Retries are always
allowed (up to retryMax times) for failed make jobs.
retryMode - Modes are 'nonlocal' and 'all'. For 'nonlocal', retries are not
attempted in local run mode. For 'all', retries are attempted for
any run mode. The default mode is 'nonlocal'.
mailTo - An email address or container of email addresses. Notification
will be sent to each email address when either (1) the run
successfully completes, (2) the first task error occurs, or (3) an
unhandled exception is raised. The intention is to send one
status message per run() indicating either success or the reason
for failure. This should occur for all cases except a host
hardware/power failure. Note that mail comes from
'pyflow-bot@csaunders-ubuntu64' (configurable), which may be
classified as junk-mail by your system.
updateInterval - How often (in minutes) pyflow should log a status update message
summarizing the run status. Set this to zero or less to turn the
update off.
schedulerArgList - A list of arguments to pass on to an
external scheduler when non-local modes are used (e.g. in sge
mode you could pass schedulerArgList=['-q','work.q'] to put the
whole pyflow job into the sge work.q queue).
isQuiet - Don't write any logging output to stderr (but still write the log to
pyflow_log.txt).
warningLogFile - Replicate all warning messages to the specified file. Warning
messages will still appear in the standard logs; this file will
contain a subset of the log messages pertaining to warnings only.
errorLogFile - Replicate all error messages to the specified file. Error
messages will still appear in the standard logs; this file will
contain a subset of the log messages pertaining to errors only.
It should be empty for a successful run.
successMsg - Provide a string containing a custom message which will be
prepended to pyflow's standard success notification. This message
will appear in the log and any configured notifications (e.g.
email). The message may contain linebreaks.
startFromTasks (A single string, or set, tuple or list of strings) - A task label or container of task labels. Any tasks which are not
in this set or descendants of this set will be marked as
completed.
ignoreTasksAfter (A single string, or set, tuple or list of strings) - A task label or container of task labels. All descendants of
these task labels will be ignored.
resetTasks (A single string, or set, tuple or list of strings) - A task label or container of task labels. These tasks and all of
their descendants will be reset to the "waiting" state
to be re-run. Note this option will only affect a workflow which
has been continued from a previous run. This will not override
any nodes altered by the startFromTasks setting in the case that
both options are used together.
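The effect of startFromTasks and ignoreTasksAfter on a task graph can be illustrated with a small sketch. The graph representation (a dict mapping each label to its child labels) and the helpers `as_label_set`, `descendants` and `plan` are hypothetical; how the two options combine is an assumption, and resetTasks is not modelled.

```python
def as_label_set(labels):
    """Accept None, a single string, or a set/tuple/list of strings."""
    if labels is None:
        return set()
    if isinstance(labels, str):
        return {labels}
    return set(labels)


def descendants(children, roots):
    """Return every task reachable from the roots by following child edges."""
    seen = set()
    stack = list(roots)
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def plan(children, startFromTasks=None, ignoreTasksAfter=None):
    """Split tasks into (to_run, marked_complete, ignored) under the stated assumptions."""
    all_tasks = set(children) | {c for cs in children.values() for c in cs}
    start = as_label_set(startFromTasks)

    marked_complete = set()
    if start:
        # Tasks outside the start set and its descendants are marked completed.
        marked_complete = all_tasks - (start | descendants(children, start))

    # All descendants of the ignoreTasksAfter labels are ignored.
    ignored = descendants(children, as_label_set(ignoreTasksAfter))

    to_run = all_tasks - marked_complete - ignored
    return to_run, marked_complete, ignored


# Example graph: a -> b -> c -> d, and a -> e
children = {"a": ["b", "e"], "b": ["c"], "c": ["d"]}
to_run, marked_complete, ignored = plan(children, startFromTasks="b", ignoreTasksAfter="c")
```

With this graph, starting from "b" and ignoring everything after "c" leaves only "b" and "c" to run; "a" and "e" are treated as completed and "d" is ignored.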
- Returns:
- 0 if all tasks completed successfully and 1 otherwise
|