Thursday, January 24, 2008

JS API

Google Doc address: http://docs.google.com/Doc?id=dmxthpg_102fvp7vvhk


(1) Workflow:

Description:

This class represents a workflow. A workflow can contain more than one job. Note: a job in a workflow can itself be another workflow.

Functions:

Workflow.addJob( jobname, job )

A new job is appended to this workflow.

Workflow.deleteJob( jobname )

Delete the job whose name is given by the parameter jobname. All dependencies that involve this job are deleted as well.

Workflow.listJobs( )

List all jobs in a workflow.

Workflow.searchByName( jobname )

Search for a job by name.

Workflow.addDependency( jobparent, jobchild )

Add a dependency stating that job jobparent must be executed before job jobchild.

Workflow.removeDependency( jobparent, jobchild )

Remove the dependency between job jobparent and job jobchild.

Workflow.toggleDependency( jobparent, jobchild )

Toggle dependency. If there is a dependency, remove it. If there is not, create a new one.
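
The functions above can be sketched as a small in-memory class. This is only an illustration under assumptions, not the actual implementation: the internal storage (a Map of jobs plus a Set of encoded parent/child pairs), the helper hasDependency, and the return value of searchByName are all hypothetical choices that the API description does not specify.

```javascript
// Minimal in-memory sketch of the Workflow class described above.
// Storage layout and the hasDependency helper are assumptions.
class Workflow {
  constructor() {
    this.jobs = new Map();         // jobname -> job (a job may itself be a Workflow)
    this.dependencies = new Set(); // each entry is JSON.stringify([parent, child])
  }

  static key(jobparent, jobchild) {
    return JSON.stringify([jobparent, jobchild]);
  }

  addJob(jobname, job) {
    this.jobs.set(jobname, job);
  }

  deleteJob(jobname) {
    this.jobs.delete(jobname);
    // Drop every dependency in which the deleted job is the parent or the child.
    for (const k of [...this.dependencies]) {
      if (JSON.parse(k).includes(jobname)) this.dependencies.delete(k);
    }
  }

  listJobs() {
    return [...this.jobs.keys()];
  }

  // Assumption: returns the job itself, or null when no job has that name.
  searchByName(jobname) {
    return this.jobs.has(jobname) ? this.jobs.get(jobname) : null;
  }

  // Hypothetical helper, not part of the API listed above.
  hasDependency(jobparent, jobchild) {
    return this.dependencies.has(Workflow.key(jobparent, jobchild));
  }

  addDependency(jobparent, jobchild) {
    this.dependencies.add(Workflow.key(jobparent, jobchild));
  }

  removeDependency(jobparent, jobchild) {
    this.dependencies.delete(Workflow.key(jobparent, jobchild));
  }

  toggleDependency(jobparent, jobchild) {
    if (this.hasDependency(jobparent, jobchild)) {
      this.removeDependency(jobparent, jobchild);
    } else {
      this.addDependency(jobparent, jobchild);
    }
  }
}

// Usage: job bodies are placeholder strings here.
const demoWorkflow = new Workflow();
demoWorkflow.addJob("job1", "first job body");
demoWorkflow.addJob("job2", "second job body");
demoWorkflow.toggleDependency("job1", "job2"); // creates job1 -> job2
demoWorkflow.deleteJob("job1");                // also removes job1 -> job2
```

Storing dependencies as pairs keeps toggleDependency a simple membership test, and deleteJob only has to scan the pairs once to clean up both directions.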


(2) WFQueue

Description:

This class represents a workflow queue in which all workflows have no dependency on each other. In other words, workflows in WFQueue can be executed in arbitrary order.

Functions:

WFQueue.addWorkflow( name, workflow )

Append the given workflow to the queue under the given name.

WFQueue.removeWorkflowByIndex( index )

Remove the workflow at the position given by the parameter index.

WFQueue.removeWorkflowByName( name )

Remove the workflow whose name is given by the parameter name.

WFQueue.clearAll( )

Remove all workflows in the queue.

WFQueue.searchByName( name )

Return the index of the workflow with the name specified by the parameter. If no corresponding workflow exists, return -1.
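
A possible shape for WFQueue is sketched below. It is an assumption-laden illustration: the entries array and the decision to silently ignore an out-of-range index are choices made for the example, not documented behaviour.

```javascript
// Minimal sketch of WFQueue: an ordered list of named workflows.
// The "entries" array and the silent handling of bad indexes are assumptions.
class WFQueue {
  constructor() {
    this.entries = []; // { name, workflow } objects in insertion order
  }

  addWorkflow(name, workflow) {
    this.entries.push({ name, workflow });
  }

  // Returns the index of the first workflow with that name, or -1.
  searchByName(name) {
    return this.entries.findIndex((entry) => entry.name === name);
  }

  removeWorkflowByIndex(index) {
    if (Number.isInteger(index) && index >= 0 && index < this.entries.length) {
      this.entries.splice(index, 1);
    }
  }

  removeWorkflowByName(name) {
    // searchByName yields -1 for an unknown name, which the index check ignores.
    this.removeWorkflowByIndex(this.searchByName(name));
  }

  clearAll() {
    this.entries.length = 0;
  }
}
```

Because searchByName returns -1 for a missing name, removeWorkflowByName can be written directly on top of removeWorkflowByIndex.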


(3) Executor

Description:

This class submits workflows to a remote server.

Functions:

Executor.Executor( serverURL )

Constructor of class Executor. The parameter specifies the URL of the remote server to which all subsequent submissions will be sent.

Executor.submitWF( workflow )

Submit a workflow to the remote server.

Executor.submitWFQueue( workflow )

Submit a workflow queue to the remote server.

Executor.transfer( from, to )

Shortcut for transferring a file. Note: the same task can be done by putting a corresponding job into a workflow and then submitting that workflow to the remote server. However, the “transfer” job is used so frequently that I decided to provide it as a separate function.
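
The relationship between transfer and submitWF can be sketched as follows. Everything about the wire format here is hypothetical: the send hook, the payload shape, and the representation of a transfer job as a plain object are placeholders for whatever the real client sends to the server.

```javascript
// Sketch of Executor showing transfer() as a shortcut over submitWF().
// The "send" hook and all payload shapes are hypothetical placeholders.
class Executor {
  constructor(serverURL) {
    this.serverURL = serverURL;
    // Stand-in transport: a real client would issue a web-service call here.
    this.send = (url, payload) => ({ url, payload });
  }

  submitWF(workflow) {
    return this.send(this.serverURL, { kind: "workflow", workflow });
  }

  submitWFQueue(queue) {
    return this.send(this.serverURL, { kind: "queue", queue });
  }

  transfer(from, to) {
    // Equivalent to submitting a workflow that contains a single transfer job.
    const singleJobWorkflow = { jobs: { transfer: { type: "transfer", from, to } } };
    return this.submitWF(singleJobWorkflow);
  }
}
```

Keeping the transport in one replaceable hook means submitWF, submitWFQueue and transfer all share the same path to the server.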


(4) Queryer

Description:

This class is used to query the state of workflows submitted by end users.

Functions:

Queryer.query( username )

Get the state of all workflows submitted by a user.

Queryer.query( username, workflowid )

Get the state of one specified workflow submitted by a user.

Queryer.query( username, workflowids )

Get the state of several specified workflows submitted by a user.
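
JavaScript has no function overloading, so the three forms of query have to share one function that inspects its arguments. A minimal sketch, assuming a hypothetical in-memory store in place of the real server call, and with made-up state names:

```javascript
// Sketch of the three query() forms dispatched from a single function.
// The in-memory "states" store and the state names are hypothetical.
class Queryer {
  constructor(states) {
    this.states = states; // { username: { workflowid: state } }
  }

  query(username, workflowids) {
    const own = this.states[username] || {};
    // Form 1: no ids given, so return every workflow of this user.
    if (workflowids === undefined) return { ...own };
    // Forms 2 and 3: normalise a single id to a one-element array.
    const ids = Array.isArray(workflowids) ? workflowids : [workflowids];
    const result = {};
    for (const id of ids) {
      const known = Object.prototype.hasOwnProperty.call(own, id);
      result[id] = known ? own[id] : "unknown";
    }
    return result;
  }
}

const demoQueryer = new Queryer({
  alice: { test_79: "completed", test_80: "running" },
});
const allStates = demoQueryer.query("alice");
const oneState = demoQueryer.query("alice", "test_79");
const someStates = demoQueryer.query("alice", ["test_80", "test_99"]);
```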


(5) Authenticator

Functions:

Authenticator.login( username, password )

Authenticator.logout( username, [ password ] )

Sunday, January 20, 2008

Add Support for Job Dependency Edit

Lately, I have been working on support for client-side job queue management and job dependency management.
These two parts could be designed and implemented separately, but I think putting them together is more user-friendly.
Before jobs are submitted, they are maintained on the client side.
Currently, the following functions are supported:
(1) add a job to the job queue
(2) remove a job from the job queue
(3) edit dependencies between jobs.
To make the system easy to use, I provide a visual widget interface.

Main interface:
job_management_small
Job addition:
After a user enters the workflow description in the text area and the workflow name in the text field, he/she can add the workflow to the job queue by clicking the button "Add to Queue". If a job with that name already exists, a prompt window pops up, and the user can choose to overwrite the existing job or change the name.
By clicking the button "Job Management" or the tab "Job Management", the user is taken to the job management panel.
Note: every workflow must have a name; the text field "Workflow Name" cannot be left blank. Moreover, names must be unique: no two workflows/jobs can share the same name.
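
The naming rules above can be sketched as a small check performed before a job enters the queue. The function name, the result strings, the Map used as the queue, and the trimming of whitespace are all assumptions made for the example.

```javascript
// Sketch of the name checks applied by "Add to Queue".
// Function name, result strings and whitespace trimming are assumptions.
function addToQueue(queue, name, description, overwrite = false) {
  const trimmed = name.trim();
  if (trimmed === "") return "name-required"; // blank names are rejected
  if (queue.has(trimmed) && !overwrite) {
    return "name-exists"; // caller should prompt: overwrite or rename
  }
  queue.set(trimmed, description);
  return "added";
}

const jobQueue = new Map(); // job name -> workflow description
const firstAdd = addToQueue(jobQueue, "job1", "first description");
const duplicateAdd = addToQueue(jobQueue, "job1", "second description");
const overwriteAdd = addToQueue(jobQueue, "job1", "second description", true);
```

Returning a status instead of throwing lets the user interface decide how to prompt, which matches the overwrite-or-rename choice described above.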

Job Management Panel:
As shown in the picture, every rectangle represents a job and every arrow represents a dependency between two jobs.
When a user adds job1, job2, ..., jobn, the default relation is a chain: job2 depends on job1, job3 depends on job2, and so on up to jobn, which depends on jobn-1.
When you hover the cursor over a rectangle for a few seconds, a pop-up window showing the content of that job is displayed.
job_dependency_panel 
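
The default chain described above (each job depends on the one added just before it) can be generated as in the following sketch. The function name and the [parent, child] pair representation are assumptions for illustration.

```javascript
// Sketch: build the default chain of dependencies for jobs in insertion order.
// Each pair is [parent, child], meaning the child depends on the parent.
function defaultDependencies(jobNames) {
  const dependencies = [];
  for (let i = 1; i < jobNames.length; i++) {
    dependencies.push([jobNames[i - 1], jobNames[i]]);
  }
  return dependencies;
}

const chain = defaultDependencies(["job1", "job2", "job3"]);
// chain is [["job1", "job2"], ["job2", "job3"]]
```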

When you right-click a rectangle, a context menu with several items is displayed.
job_dependency_panel_contextmenu
Currently, items are "from", "to", "delete" and "edit".
(1) If menu item "delete" is clicked, the corresponding job is permanently deleted from the job queue.
   When a job is deleted, all related dependencies are deleted as well. There are two kinds: dependencies of other jobs on this job, and dependencies of this job on other jobs.
(2) If menu item "from" is clicked, the corresponding job is marked as the starting point of a dependency. Assume it is called parentJob.
(3) If menu item "to" is clicked, the corresponding job is marked as the end point of a dependency. Assume it is called childJob.
Then there are three possible cases:
  (3.1) parentJob is null.
    This means the user has not yet selected a job by clicking menu item "from". Nothing happens.
  (3.2) Job parentJob is not yet a prerequisite of job childJob.
    A line is drawn from the rectangle that represents job parentJob to the rectangle that represents job childJob, and job parentJob becomes a prerequisite of job childJob.
  (3.3) Job parentJob is already a prerequisite of job childJob.
    In this case there is already a line from the rectangle that represents job parentJob to the rectangle that represents job childJob. The relation is deleted and the line is removed from the display.
(4) If menu item "edit" is clicked, the system redirects the user to the workflow edit panel.
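
The "from"/"to" handling in cases (3.1) to (3.3) amounts to a small state machine, sketched below without any drawing code. The class name, method names and result strings are hypothetical. The sketch also assumes that parentJob stays selected after a "to" click, which the description above does not specify.

```javascript
// Sketch of the context-menu logic for "from", "to" and "delete".
// Names and return values are hypothetical; drawing of lines is omitted.
class DependencyEditor {
  constructor() {
    this.parentJob = null;  // set by "from"
    this.edges = new Set(); // JSON.stringify([parentJob, childJob])
  }

  clickFrom(job) {
    this.parentJob = job;
  }

  clickTo(childJob) {
    if (this.parentJob === null) return "ignored"; // case 3.1
    const key = JSON.stringify([this.parentJob, childJob]);
    if (this.edges.has(key)) {                     // case 3.3
      this.edges.delete(key);
      return "removed";
    }
    this.edges.add(key);                           // case 3.2
    return "added";
  }

  clickDelete(job) {
    // Remove edges in both directions: into the job and out of the job.
    for (const key of [...this.edges]) {
      if (JSON.parse(key).includes(job)) this.edges.delete(key);
    }
    if (this.parentJob === job) this.parentJob = null;
  }
}
```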

You can drag and drop the rectangles anywhere on the screen. Lines attached to a moved rectangle (both the lines that start from the job and the lines that end at it) move with it. Note: the lines themselves cannot be moved by drag-and-drop.
The following picture is a sample job dependency graph I created:
job_dependency_sample
Next step:
Currently, all operations above are carried out on the client side, and no interaction with the server is involved.
Issue:
The next step is to work out how to send the job queue to the server side.
As far as I know, Karajan does not directly support workflow-level composition. However, it provides two elements, parallel and sequential, which control the execution order of subtasks within a workflow.
So one idea is to put all jobs in a job queue into a single big workflow that uses the elements parallel and sequential to represent the original relationships. A natural question is whether parallel and sequential are enough to express every possible relationship among jobs.
My answer is no.
For the job dependency graph shown above, I cannot think of a way to represent it with the Karajan elements parallel and sequential.
Solution 1:
If my conclusion is correct, we can implement a subsystem that manages the order in which jobs are submitted to the underlying grid infrastructure. In my opinion, it is better to put it on the server side.
Solution 2:
Another solution is to simplify the issue at the cost of performance. We can obtain a job submission sequence by using topological sort, so that all jobs are submitted sequentially. Obviously, performance is not optimal because some jobs could actually be executed in parallel.
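
As a sketch of Solution 2, the following function derives one valid submission order with Kahn's algorithm, a standard way to compute a topological sort. The function name and the input format (job names plus [parent, child] pairs) are assumptions. The example graph is hypothetical: A must run before C and D, while B must run only before D. This "N"-shaped pattern is the kind of dependency that nested parallel and sequential blocks cannot express exactly.

```javascript
// Sketch: compute a sequential submission order with Kahn's algorithm.
// jobs: array of job names; dependencies: array of [parent, child] pairs.
function topologicalOrder(jobs, dependencies) {
  const indegree = new Map(jobs.map((job) => [job, 0]));
  const children = new Map(jobs.map((job) => [job, []]));
  for (const [parent, child] of dependencies) {
    children.get(parent).push(child);
    indegree.set(child, indegree.get(child) + 1);
  }

  // Jobs with no unmet prerequisites can be submitted next.
  const ready = jobs.filter((job) => indegree.get(job) === 0);
  const order = [];
  while (ready.length > 0) {
    const job = ready.shift();
    order.push(job);
    for (const child of children.get(job)) {
      indegree.set(child, indegree.get(child) - 1);
      if (indegree.get(child) === 0) ready.push(child);
    }
  }

  // If some jobs were never released, the dependencies contain a cycle.
  if (order.length !== jobs.length) {
    throw new Error("dependency cycle detected");
  }
  return order;
}

const submissionOrder = topologicalOrder(
  ["A", "B", "C", "D"],
  [["A", "C"], ["A", "D"], ["B", "D"]]
);
// submissionOrder is ["A", "B", "C", "D"]
```

In this order C waits for B even though it depends only on A, which is exactly the performance loss mentioned above.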

Friday, January 11, 2008

Tutorial

(1) Workflow submission

ui_annotated_1

(1.1) If you already have an existing workflow, you can just paste it into the input area and click the button “WS Workflow Submission”.

(1.2) If you just want to get familiar with the user interface and the content of the workflow does not matter, you can click the button “click to see example”, and a pop-up window containing a sample workflow appears. You can then copy the sample, paste it into the input area, and submit it. The pop-up window is hidden when you press the “Esc” key or click anywhere else so that the pop-up window loses focus.

After the workflow is submitted, the response from the server is displayed in the area labeled “Response from server”.

Note: the response is appended to the content of the output area. If you want to discard the existing response, clear it first by clicking the button “clear”.

Sample response is:
-------------------------------------------------
(This is done by using web service with status monitoring enabled)
ID for the workflow you just submitted is:
test_79
You can use it as a handle to check its status.
-------------------------------------------------
As you can see, the id of the workflow you just submitted is returned. You can then use it to check the workflow's status.

(2) Workflow status query

ui_annotated_2

(2.1) If you want to query the status of all submitted workflows belonging to you, just click the button “Get State of All Workflows”.

(2.2) If you want to query the status of specific workflows, enter the ids of those workflows and then click the button “Get State of A Workflow”. Note: the name of the button is somewhat misleading, because more than one workflow can be queried at a time. Multiple workflow ids should be separated by line feeds, blank spaces, or tabs.

Note: the response is appended to the content of the output area. If you want to discard the existing response, clear it first by clicking the button “clear”.
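
The separator rule in (2.2) can be sketched as a single split on whitespace. The function name is hypothetical, and the ids other than test_79 are invented for the example.

```javascript
// Sketch: split user input into workflow ids on line feeds, spaces or tabs.
function parseWorkflowIds(input) {
  // Leading or trailing whitespace produces empty strings, which are dropped.
  return input.split(/\s+/).filter((id) => id.length > 0);
}

const parsedIds = parseWorkflowIds("test_79\ntest_80  test_81\t");
// parsedIds is ["test_79", "test_80", "test_81"]
```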

(3) Workflow composition

When the widget toolbox is expanded, it looks like:

wf_composition1

When you move the cursor over an element in the panel, a pop-up window shows a brief description of that element. When you move the cursor away from the element, the pop-up window disappears.

wf_composition3

If you want to insert an element into the workflow, just click the corresponding element in the toolbox panel. If the element has no parameters, the corresponding XML snippet is inserted directly into the workflow. If the element has attributes that need to be set, a window pops up. For the element “execute”, the pop-up window looks like this:

wf_composition2

You can specify the values of the attributes and insert the element into the workflow by clicking the button “Save”. You do not need to specify values for all attributes; just set the ones you need.

The XML snippet is not simply appended to the existing content of the input area. Instead, it is inserted at the current caret position. Moreover, the snippet can be inserted so that it encloses your selected text.
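
The insertion behaviour can be sketched as a pure string function. The function name and parameters are assumptions; in a browser, the two offsets would typically come from the selectionStart and selectionEnd properties of the text area. When the two offsets are equal there is no selection, and the snippet is simply inserted at the caret.

```javascript
// Sketch: insert a snippet at the caret, or wrap the current selection.
// selectionStart/selectionEnd are character offsets into "text".
function insertSnippet(text, selectionStart, selectionEnd, openTag, closeTag = "") {
  const before = text.slice(0, selectionStart);
  const selected = text.slice(selectionStart, selectionEnd);
  const after = text.slice(selectionEnd);
  return before + openTag + selected + closeTag + after;
}

// Caret between the tags (offset 12), nothing selected.
const inserted = insertSnippet("<sequential></sequential>", 12, 12, "<parallel/>");
// inserted is "<sequential><parallel/></sequential>"

// Whole text selected (offsets 0 to 9), wrapped in an element.
const wrapped = insertSnippet("job1 job2", 0, 9, "<parallel>", "</parallel>");
// wrapped is "<parallel>job1 job2</parallel>"
```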

Wednesday, January 09, 2008

Milestone

Recent effort:
(1) Added the ability to query the status of a workflow by the combination of username and workflow id. Users can also query more than one workflow in a single query.
(2) Modified the user interface.
    Added a tab panel so that the submission panel and the status query panel are separated.
(3) Changed what is returned after a workflow is submitted.
    Originally, after a user submitted a workflow, nothing was returned until the workflow had been executed completely.
    Now the workflow id is returned right after submission, and it can be used as a handle to query the status of the workflow.
(4) Changed the location of the configuration file.
    Changed the method by which the configuration file is located (absolute path -> relative path).
    This took me a lot of time because it is not easy to get the current working directory in Axis2.

So, now the whole system satisfies our basic requirements.
Client side:
(1) workflow submission
(2) workflow status query
(3) user-friendly visual widget support for workflow composition
Server side:
(1) simple user management
(2) workflow execution (using the CoG Kit)
(3) workflow status service

Possible future work:
(1) More sophisticated user management system
(2) Security
(3) More powerful status queries.
    For example, query which workflows are completed, which workflows have started but not completed, which workflows have not started executing ...