General

Where can I download the sources and binaries of the Grid Workflow Execution Service?
At http://www.gridworkflow.org/kwfgrid/distributions/
Is the Grid Workflow Execution Service open source?
It is open source for non-commercial usage. For commercial usage, or if you want to (re-)use parts of the software in your own product, you must contact steffen.unger@first.fraunhofer.de to obtain a special license agreement. The Grid Workflow Execution Service is therefore not open source in the sense of the Open Source Initiative's (OSI) definition of the term "open source", although the source is open in the sense that everybody can download it. You can find the license agreement at http://www.gridworkflow.org/kwfgrid/gwes/docs/license.html.
How should I pronounce "GWES"?
Within the K-Wf Grid project we pronounce it "ge-wes".
How should I pronounce "K-Wf Grid"?
Well, we are almost famous in the Grid community for having the ugliest acronym in history :-) But you could pronounce it "quif-grid".

Grid Workflow Description Language

What is "GWorkflowDL"?
GWorkflowDL is an acronym for "Grid Workflow Description Language". The GWorkflowDL is an XML-based description language for Grid workflows developed within the K-Wf Grid project.
What are these side effects? Do we have to assign a side effect for each operation?
No, you do not have to assign a side effect to each operation. We need these side effects only in order to automatically compose workflows that contain web service operations without real response values. For example, the operation "void A.store()" writes its output data to an SQL database and returns without any return value. We then need a side effect, e.g. "DataXY stored in SQL database", in order to use this operation in a workflow. The operation "String A.getData()", however, does not need any side effect.
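The role of side effects during composition can be illustrated with a minimal Python sketch. It is not the K-Wf Grid composition algorithm; the `Operation` class, the `find_producers` helper and the way results are described as plain strings are assumptions made only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Operation:
    """Hypothetical description of a web service operation."""
    name: str
    returns: Optional[str] = None      # description of the return value, if any
    side_effect: Optional[str] = None  # description of the side effect, if any

    def provides(self) -> Set[str]:
        """Everything a composer could use this operation for."""
        return {item for item in (self.returns, self.side_effect) if item}


def find_producers(operations: List[Operation], required: str) -> List[str]:
    """Names of the operations that provide the required result."""
    return [op.name for op in operations if required in op.provides()]


operations = [
    # Returns nothing, so only the side effect makes it usable in a workflow.
    Operation("A.store", side_effect="DataXY stored in SQL database"),
    # Has a real return value, so no side effect is needed.
    Operation("A.getData", returns="String"),
]
```

Without the side effect, `A.store` would provide nothing that a composition step could ask for, which is why such operations need one.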
Why do the tokens on the input places contain real data?
We need tokens with real data in order to enact concrete workflows. In this case there are different (instances of) workflows for different input instances. However, it is also possible to put an abstract token without any content onto a place. A refinement tool or the user must then provide the real values before the workflow is run.
What is the owl attribute meant for?
The owl attribute can be used to annotate the corresponding element with metadata. In the K-Wf Grid project we use this attribute to refer to a certain ontological property. This property is then used in an RDQL query to the Grid Organizational Memory (a kind of registry) to retrieve additional information about the element.
Which properties are supported within workflows?
activitiesCompleted (output):
    Number of completed activities within this workflow (e.g. for progress bar)

activitiesTerminated (output):
    Number of terminated activities within this workflow

birthdayMs (output):
    Time of the initialization of this workflow (in milliseconds since 1970)

complexity (output):
    Workflow Analysis: measure for the Karp-Miller-Tree complexity of this workflow

decision.1, decision.2, ... (output): Workflow Analysis: This workflow contains a decision
    (e.g., a conflict where two transitions compete for the same token). Conditions which
    resolve the decision are not taken into account.
    "TAKE_CONFLICT" if input place contains only one token 
    "TAKE_CHOICE"   if input place contains more than one token
    "PUT_CONFLICT"  if output place has capacity for only one more token
    "PUT_CHOICE"    if output place has capacity for more than one token

domain (input):
    Class of application domain of this workflow (for WCT)

DN (output):
    The distinguished name of the workflow owner's credential

durationUndefinedMs (output):
    Duration in milliseconds during which the status was undefined and not yet initialized.

durationInitiatedMs (output):
    Duration in milliseconds during which the status was initiated and not running.

durationRunningMs (output):
    Duration in milliseconds during which the status was running (not including active).

durationActiveMs (output):
    Duration in milliseconds during which the status was active.

durationSuspendedMs (output):
    Duration in milliseconds during which the status was suspended.

durationTotalMs (output):
    Total workflow duration in milliseconds.

endTimeMs (output):
    Time of the completion or termination of this workflow (in milliseconds since 1970)

error.1, error.2, ... (output):
    String with critical error message

faultManagementPolicy (input):
    "abortOnActivityTerminated" (default)
    "continueOnActivityTerminated"
    "suspendOnActivityTerminated"

isUnbounded (output):
    "true" if workflow is unbounded (e.g., may have an infinite loop)
    "false" if workflow is bounded

occurrence.sequence (input,output):
    Stores the occurrence sequence of transitions. Each entry consists of the ID of the
    transition followed by a space.

redistributionOfFailedActivities (input):
    "true" (default) - use fault tolerance mechanism to restart failed activities
    "false"

resource.repository.dgrdl.collection (input):
    Name of the XML database collection that contains the resource descriptions
    to be used during the resource matching process

status (output): Current status of the workflow
    "UNDEFINED"
    "INITIATED"
    "RUNNING"
    "SUSPENDED"
    "ACTIVE"
    "TERMINATED"
    "COMPLETED"
    "FAILED"

warn.1, warn.2, ... (output):
    String with non-critical warning message

workflow.branchingFactor (output):
    Defined as number of edges divided by number of places and transitions                

workflow.sequentialExecutionPathSizeMs (output):
    Defined as the sum of the total activity duration of all activities (in milliseconds)  

workflow.speedupActive (output):
    Defined as the sequential execution path size divided by the active workflow duration. Does not consider the time
    used in workflow status running, initiated, suspended, etc.

workflow.speedupTotal (output):
    Defined as the sequential execution path size divided by the makespan. The makespan is defined as the
    total workflow duration.

workflow.persistence (input):
    "true" (default) - store workflows in database
    "false"          - do not store workflows in database
                
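The derived metrics listed above (workflow.branchingFactor, workflow.speedupActive and workflow.speedupTotal) follow directly from their definitions. The following is a minimal Python sketch of those formulas, not the GWES implementation; the function and parameter names are hypothetical and the numbers are made up for illustration.

```python
def branching_factor(num_edges: int, num_places: int, num_transitions: int) -> float:
    """Number of edges divided by the number of places and transitions."""
    return num_edges / (num_places + num_transitions)


def speedup(sequential_path_ms: float, duration_ms: float) -> float:
    """Sequential execution path size divided by a workflow duration.

    Pass durationActiveMs to obtain workflow.speedupActive, or
    durationTotalMs (the makespan) to obtain workflow.speedupTotal.
    """
    return sequential_path_ms / duration_ms


# Illustrative values only (assumptions, not measured data).
activity_durations_ms = [4000, 6000, 2000]          # total duration of each activity
sequential_path_ms = sum(activity_durations_ms)     # workflow.sequentialExecutionPathSizeMs
speedup_active = speedup(sequential_path_ms, 8000)  # assuming durationActiveMs = 8000
speedup_total = speedup(sequential_path_ms, 10000)  # assuming durationTotalMs = 10000
```

Because the total duration also includes the time spent in status initiated, running or suspended, workflow.speedupTotal can never exceed workflow.speedupActive for the same workflow.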
Which properties are supported within transitions?
aab.refinement.failed (output):
    deprecated -> yellow2blue.refinement.failed

activityHasFailed (output):
    "true": An activity related to this transition failed and the fault management is trying
            to recover from the failure.

breakpoint (input/output): If a transition contains a breakpoint property, the GWES suspends the
    execution of the workflow when reaching the transition. The GWES then sets the property value to
    "REACHED". When resuming the workflow, the GWES sets the value to "RELEASED" until the
    breakpoint is reached again. Valid values are:
    empty
    "REACHED"
    "RELEASED"

combine.data.groups (input):
    The transition will generate output tokens with a "data.group" property that is a combination of the input token
    data groups. If, e.g., a token from one input place belongs to data.group="a" and the token from a second input
    place belongs to data.group="b", then the output token will get data.group="a x b".

decision.1, decision.2, ... (output): Workflow Analysis: This transition is involved in a decision
    (e.g., a conflict where two transitions compete for the same token). Conditions which
    resolve the decision are not taken into account.
    "TAKE_CONFLICT" if input place contains only one token
    "TAKE_CHOICE"   if input place contains more than one token
    "PUT_CONFLICT"  if output place has capacity for only one more token
    "PUT_CHOICE"    if output place has capacity for more than one token

durationActiveMs.max (output):
    Maximum value of the duration in milliseconds for an activity being in status "active" regarding all activities
    triggered by this transition.

durationActiveMs.mean (output):
    Arithmetic mean value of the duration in milliseconds of the activities being in status "active". This mean value
    is calculated as the average over all activities triggered by this transition.

durationActiveMs.stdDeviation (output):
    Standard deviation of durationActiveMs.mean.

durationActiveMs.min (output):
    Minimum value of the duration in milliseconds for an activity being in status "active" regarding all activities
    triggered by this transition.

durationTotalMs.max (output):
    Maximum value of the total activity duration in milliseconds of all activities triggered by this transition.

durationTotalMs.mean (output):
    Arithmetic mean value of the total activity duration in milliseconds calculated as average over all activities
    triggered by this transition.

durationTotalMs.stdDeviation (output):
    Standard deviation of durationTotalMs.mean.

durationTotalMs.min (output):
    Minimum value of the total activity duration in milliseconds of all activities triggered by this transition.

ignore.data.groups (input): Ignore "data.group" property from specific tokens
    "read": Ignore "data.group" property from tokens on read places
    "control": Ignore "data.group" property from control tokens

instanceGroupID (input):
    deprecated -> resource.allocation.group

isQuasiLive (output):  Workflow Analysis: "false" if the transition will never fire
    "true"
    "false"

last.operation.name (output):
    Last operation name used by this transition.

last.resource.name (output):
    Last resource name used by this transition.

numberActivities (output):
    Number of completed or terminated activities triggered by this transition  

priority (input):
    Priority of this transition (high number = high priority = comes first if possible)

red2yellow.refinement.failed (output):
    "true" if automatic workflow refinement from red to yellow failed.

remove.resource.allocation.group (input):
    Remove the resource.allocation.group property from the output tokens of this transition.
    This is used when the spatial co-scheduling should be reset within workflows, e.g., in
    iterations

resource.allocation.group (input):
    String (without spaces) that defines the instance group to which this transition belongs.
    Transitions with the same instance group that are connected by arcs are scheduled on the same
    resources (spatial co-scheduling)

resourcematcher.refinement.failed (output):
    deprecated -> yellow2blue.refinement.failed

revertedToYellow (output): Fault management
    "true"

status (output): status of the last activity triggered by this transition
    "UNDEFINED"
    "INITIATED"
    "RUNNING"
    "SUSPENDED"
    "ACTIVE"
    "TERMINATED"
    "COMPLETED"
    "FAILED"

timeout (input):
    Default value for timeout.running and timeout.active.

timeout.running (input):
    Timeout in milliseconds for this activity in state running or active (including waiting time).

timeout.active (input):
    Timeout in milliseconds for this activity in state active (not including waiting time).

wct.refinement.failed (output): deprecated -> red2yellow.refinement.failed

yellow2blue.refinement.failed (output):
    "true" if automatic workflow refinement from yellow to blue failed.

xmlns:XXX (input):
    This property maps an XML namespace prefix onto a specific namespace URI. Replace "XXX"
    by the namespace prefix string. The namespace prefixes defined by this property can be used
    within the conditions and edgeExpressions of this transition.
    The following namespace prefixes are predefined:
      xmlns:tns=[target namespace as defined in the corresponding WSDL]
      xmlns:gwdl="http://www.gridworkflow.org/gworkflowdl"
      xmlns:soapenv="http://www.w3.org/2003/05/soap-envelope"
      xmlns:xsd="http://www.w3.org/2001/XMLSchema"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
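The durationActiveMs.* and durationTotalMs.* properties in the list above summarize the durations of all activities triggered by one transition. The following Python sketch shows how such a summary could be computed. It is only an illustration: the function name is hypothetical, and the use of the population standard deviation is an assumption, since the list does not say which form GWES uses.

```python
from statistics import mean, pstdev


def duration_statistics(durations_ms):
    """Summarize the activity durations (in milliseconds) of one transition."""
    if not durations_ms:
        raise ValueError("at least one activity duration is required")
    return {
        "max": max(durations_ms),
        "mean": mean(durations_ms),
        # Assumption: population standard deviation; the sample form
        # (statistics.stdev) would give a slightly larger value.
        "stdDeviation": pstdev(durations_ms),
        "min": min(durations_ms),
    }


# Made-up durations of three activities triggered by the same transition.
stats = duration_statistics([1000, 2000, 3000])
```

The keys of the returned dictionary mirror the property suffixes, so `stats["mean"]` corresponds to durationActiveMs.mean or durationTotalMs.mean, depending on which durations are passed in.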
                
Which properties are supported within places?
decision.1, decision.2, ... (output): Workflow Analysis: This place is involved in a decision
    (e.g., a conflict where two transitions compete for the same token). Conditions which
    resolve the decision are not taken into account.
    "TAKE_CONFLICT" if input place contains only one token
    "TAKE_CHOICE"   if input place contains more than one token
    "PUT_CONFLICT"  if output place has capacity for only one more token
    "PUT_CHOICE"    if output place has capacity for more than one token

isFinallyMarked (output): Workflow Analysis: "true" if the place will be marked when the workflow
    completes.
    "true"
    "false"

isQuasiLive (output): Workflow Analysis: "false" if the place can never get a token
    "true"
    "false"
                
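The four decision values used in the workflow, transition and place properties depend only on the number of tokens on the input place and on the remaining capacity of the output place. A minimal Python sketch of this classification, assuming the counts are already known, is given below. The function names are hypothetical, and raising an error for counts below one is an assumption, as the lists above do not cover that case.

```python
def classify_take(tokens_on_input_place: int) -> str:
    """Decision type when several transitions compete for tokens on an input place."""
    if tokens_on_input_place < 1:
        # Assumption: without a token there is nothing to compete for.
        raise ValueError("input place contains no token")
    return "TAKE_CONFLICT" if tokens_on_input_place == 1 else "TAKE_CHOICE"


def classify_put(free_capacity_on_output_place: int) -> str:
    """Decision type when several transitions compete for capacity on an output place."""
    if free_capacity_on_output_place < 1:
        # Assumption: without free capacity no transition can put a token.
        raise ValueError("output place has no free capacity")
    return "PUT_CONFLICT" if free_capacity_on_output_place == 1 else "PUT_CHOICE"
```

For example, `classify_take(1)` yields "TAKE_CONFLICT", because only one of the competing transitions can consume the single token.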
Which properties are supported within tokens?
data.group (input):
    String that identifies the group to which the data on the token belongs. If a transition has several input or
    read places, it tries to group tokens with the same data.group identifier. If a token does not contain a data.group
    identifier, then it is treated as an arbitrary group (wild card).

reservedByTransition (input/output):
    The transition ID that reserves this token. Used by the spatial co-allocation algorithm.  

resource.allocation.group.XXX (input/output):
    Spatial co-allocation: The property value is the name of the resource onto which the corresponding resource
    allocation group has been mapped. "XXX" denotes the identifier specified by the transition property "resource.allocation.group".
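The data.group matching rule described above, including the wild card behaviour of tokens without a data.group, can be sketched in Python as follows. This is an illustration of the rule only and not the GWES scheduling code; representing tokens as dictionaries and the names `groups_match` and `matching_combinations` are assumptions.

```python
from itertools import product


def groups_match(groups) -> bool:
    """True if all tokens that carry a data.group share the same identifier.

    None stands for a token without a data.group property (wild card).
    """
    named = {group for group in groups if group is not None}
    return len(named) <= 1


def matching_combinations(places):
    """Yield one-token-per-place combinations whose data groups match."""
    for combination in product(*places):
        if groups_match(token.get("data.group") for token in combination):
            yield combination


# Hypothetical tokens on two input places of the same transition.
place_1 = [{"value": 1, "data.group": "a"}, {"value": 2, "data.group": "b"}]
place_2 = [{"value": 10, "data.group": "b"}, {"value": 20}]  # last token is a wild card

pairs = [(first["value"], second["value"])
         for first, second in matching_combinations([place_1, place_2])]
```

In this example the token with value 1 (group "a") can only be combined with the wild card token, whereas the token with value 2 (group "b") matches both tokens on the second place.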
                

Troubleshooting

What are the exit codes of the components that start remote command line programs?
Here is a list of pre-defined exit codes:
  • 0: (All components) OK
  • 1: (Nagios test script) WARNING
  • 2: (Nagios test script) CRITICAL
  • 3: (Nagios test script) UNKNOWN
  • 96: (GWES cli start script) The stdin is not available or not readable
  • 97: (GWES cli start script) The executable is not available or not executable
  • 98: (GWES cli start script) The executable has not been specified
  • 99: (GWES cli start script) The working directory has not been specified
  • 127: (WS-GRAM:JobManager.pm) A child error code with an exit code of 127 indicates that the application could not be run.
  • 128+: (Linux) Program has been interrupted by signal X-128, e.g., 143-128 = 15 = SIGTERM.
  • 153: (PBS:qstat) Unknown Job Id
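The list above can be turned into a small lookup helper, for example when post-processing job results. The following Python sketch is hypothetical and not part of GWES. It assumes that explicitly listed codes take precedence over the generic "128+" rule, which matters for 153, since that value could also be read as 128 + 25.

```python
import signal

# Pre-defined exit codes, copied from the list above.
KNOWN_EXIT_CODES = {
    0: "(All components) OK",
    1: "(Nagios test script) WARNING",
    2: "(Nagios test script) CRITICAL",
    3: "(Nagios test script) UNKNOWN",
    96: "(GWES cli start script) The stdin is not available or not readable",
    97: "(GWES cli start script) The executable is not available or not executable",
    98: "(GWES cli start script) The executable has not been specified",
    99: "(GWES cli start script) The working directory has not been specified",
    127: "(WS-GRAM:JobManager.pm) The application could not be run",
    153: "(PBS:qstat) Unknown Job Id",
}


def describe_exit_code(code: int) -> str:
    """Return a human-readable description of an exit code."""
    if code in KNOWN_EXIT_CODES:  # assumption: listed codes win over the 128+ rule
        return KNOWN_EXIT_CODES[code]
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:  # signal number not known on this platform
            name = "unknown signal"
        return f"(Linux) Program has been interrupted by signal {signum} ({name})"
    return "Not a pre-defined exit code"
```

Calling `describe_exit_code(143)` reports signal 15, which corresponds to the SIGTERM example in the list.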
How can I debug the GWES if there are problems? What are possible sources of errors? Where do I find the log files?
Here is a checklist for detecting possible sources of errors:
  • Check the error and warn messages in the workflow details tab (GWES servlet or portlet).
  • Check the XML resource database: Are all resource descriptions up to date? If not, then restart the corresponding resource updater daemon.
  • Check the XML resource database for double entries of properties using the eXist Java webstart client.
  • Check $CATALINA_HOME/logs/gwes.log (search for "OutOfMemory" and "Exception")
  • Check $CATALINA_HOME/logs/resmatch.log
  • Check $CATALINA_HOME/logs/catalina.out
  • Check the Ganglia output
  • Check for free space on all relevant partitions (df -h)
  • For more detailed debug output, modify the file $CATALINA_HOME/webapps/gwes/WEB-INF/classes/log4j.properties and restart the Tomcat container.
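Two of the checks above, searching gwes.log for "OutOfMemory" and "Exception" and checking the free disk space, can be automated with a few lines of Python. The script below is a hypothetical helper and not shipped with GWES. It assumes that CATALINA_HOME is set in the environment; the fallback path is a placeholder.

```python
import os
import shutil
from pathlib import Path


def find_suspicious_lines(log_path, keywords=("OutOfMemory", "Exception")):
    """Return (line number, text) pairs of log lines containing any keyword."""
    hits = []
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if any(keyword in line for keyword in keywords):
                hits.append((number, line.rstrip("\n")))
    return hits


def free_space_gib(path):
    """Free space in GiB on the partition that holds path (compare with df -h)."""
    return shutil.disk_usage(path).free / 2**30


if __name__ == "__main__":
    # Assumption: CATALINA_HOME is set; "/opt/tomcat" is only a placeholder.
    catalina_home = Path(os.environ.get("CATALINA_HOME", "/opt/tomcat"))
    gwes_log = catalina_home / "logs" / "gwes.log"
    if gwes_log.exists():
        for number, text in find_suspicious_lines(gwes_log):
            print(f"{gwes_log}:{number}: {text}")
        print(f"Free space: {free_space_gib(catalina_home):.1f} GiB")
```

The same `find_suspicious_lines` function can be pointed at resmatch.log or catalina.out by passing a different path.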