
Friday, November 12, 2010

JsUnit Maven Plugin

The documentation is at http://jsunit.berlios.de/maven2.html. It's too brief, especially the following paragraph:

The type of the test suite, one of the following values:

ALLTESTS
Looks for a class AllTests derived from TestSuite and runs its suite.
TESTSUITES
Looks for all classes ending with TestSuite and that are derived from TestSuite and run their suites.
TESTCASES
Looks for all classes ending with TestCase and that are derived from TestCase and runs them (the default).

The problem is what "derived from" means and how to do that. I will show in detail how to use the JsUnit plugin.

1) sample test file

The following is a dummy test file. It should be put into src/test/js.

var dummyobj = dummyobj || {};

function DummyTest(name) {
  TestCase.call(this, name);
};

DummyTest.inherits(TestCase);

DummyTest.prototype.setUp = function() {
    dummyobj.name = "Gerald";
};

DummyTest.prototype.tearDown = function() {
    delete dummyobj.name;
};

DummyTest.prototype.testDummy = function() {
  this.assertEquals('Gerald', dummyobj.name);
};

DummyTest is the test case.
It "inherits" from class TestCase. All of its functions whose names start with "test" will be run as tests.

2) Inherit implementation

The following is an implementation of inherits borrowed from the Shindig code. It can be put in a file inherit_implementation.js under the directory src/main/js.

Function.prototype.inherits = function(parentCtor) {
    // Classic prototype-chain inheritance via a temporary constructor,
    // so that parentCtor's own constructor logic is not invoked.
    function tempCtor() {};
    tempCtor.prototype = parentCtor.prototype;
    this.superClass_ = parentCtor.prototype;
    this.prototype = new tempCtor();
    this.prototype.constructor = this;
};

3) pom.xml

<plugin>
    <groupId>de.berlios.jsunit</groupId>
    <artifactId>jsunit-maven2-plugin</artifactId>
    <executions>
        <execution>
            <id>test</id>
            <configuration>
                <sourceDirectory>${basedir}/src/main/js</sourceDirectory>
                <sources>
                    <source>inherit_implementation.js</source> 
                    <source>file_to_be_tested.js</source>
                </sources>
                <testSourceDirectory>${basedir}/src/test/js</testSourceDirectory>
                <testSuites>
                    <testSuite>
                        <name>SampleSuite</name>
                        <type>TESTCASES</type>
                        <includes>
                            <include>*/*test.js</include>
                        </includes>
                    </testSuite>
                </testSuites>
            </configuration>
            <goals>
                <goal>jsunit-test</goal>
            </goals>
        </execution>
    </executions>
</plugin>

Resources

For some tests, you need to provide a fake window object, a fake XmlHttpRequest object, DOM objects, etc. The env-js project is designed exactly for this purpose: it provides a simulated browser environment.
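For example, env.js could be loaded before the code under test by listing it as an ordinary source in the plugin configuration above. This is only a sketch; it assumes env.js has been downloaded into src/main/js:

<sources>
    <source>env.js</source>  <!-- simulated browser environment; hypothetical file name -->
    <source>inherit_implementation.js</source>
    <source>file_to_be_tested.js</source>
</sources>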

Thursday, February 21, 2008

DAG construction

After communicating with Mike, I got the dag.k file. Then I implemented the construction of a DAG workflow based on all the jobs and their relationships in a workflow. In other words, I put all this information together into one large Karajan workflow by using DAG. Note: this work is done at the client side, not at the server side. After construction, the large workflow can be submitted to the server just like a small job. The id of the newly submitted workflow is then returned, and based on this workflow id, a user can query the state of the workflow. In addition, the user can access the output files of the workflow by using a common HTTP GET request.

More details about DAG construction:
Assume that we have four jobs in a workflow: job1, job2, job3, job4.
And their dependencies are:
job1 -> job2 ( this means job2 depends on job1 )
job1 -> job3
job3 -> job4

job3 -> job5
job1 -> job5
These dependencies are represented in following graph:
[Image: sample_workflow]
The constructed DAG workflow then looks like this:
<project>
<include file="cogkit.k"/>
<include file="dag.k"/>
<discard>
    <dag>
        <node>
            <string>job1</string>            <!-- name of the job -->
            <element>
                <quotedlist/>
               content of job1
            </element>
            <edges>
                <string>job2</string>         <!-- job1 is a prerequisite of job2, job3 and job5 -->
                <string>job3</string>
                <string>job5</string>
            </edges>
        </node>
        <node>
            <string>job2</string>
            <element>
                <quotedlist/>
                content of job2
            </element>
        </node>
        <node>
            <string>job3</string>
            <element>
                <quotedlist/>
                content of job3
            </element>
            <edges>
                <string>job4</string>
                <string>job5</string>
            </edges>
        </node>
        <node>
            <string>job4</string>
            <element>
                <quotedlist/>
                content of job4
            </element>
        </node>
        <node>
            <string>job5</string>
            <element>
                <quotedlist/>
                content of job5
            </element>
        </node>
    </dag>
</discard>
</project>

Karajan Workflow Formats: .k and .xml

In the CoG Kit, Karajan workflows come in two supported formats -- .k and .xml.
Personally, I like xml because of its prevalence and openness: there are many handy tools that can process xml documents in various ways. Unfortunately, xml support in Karajan is not comprehensive. In my recent programming, I needed the element dag, which supports Directed Acyclic Graphs.
(1) Karajan does support DAG, but it does not give users a dag.xml file to import; only dag.k is provided. So I needed to find a way to convert dag.k to dag.xml. Thanks to Mike and Gregor for helping me out of this struggle. The following command can be used to do the task:

cog-workflow -intermediate dag.k

Although the following error pops up, I still get the file I need (dag.xml).

Execution failed:
Variable not found: #channel#defs
        kernel:export @ dag.k, line: 44

Note: what is actually generated is dag.kml, not dag.xml. For now, I assume they are the same because I have no better choice.

(2) In the documentation I could find, the sample workflows about DAG are written in .k format.
This is the only resource I found helpful: http://wiki.cogkit.org/index.php/Java_CoG_Kit_Workflow_Guide#Direct_Acyclic_Graphs.
Once again, I used the command cog-workflow to do this task, but the generated .kml (xml) file is lengthy and not suitable for human readers. So I decided to convert it manually. I found a useful link (http://wiki.cogkit.org/index.php/Java_CoG_Kit_Karajan_Workflow_Reference_Manual_4.1.5) that describes both formats in detail.

How to access results?

Now it is time to consider how to let users easily and conveniently access the results of their workflows. There are several questions here:
(1) How to track output files in Karajan workflows?

The first option is to analyze the content of the Karajan workflow to figure out the output files. For example, for the element execute, the attribute stdout indicates the name of an output file.
<execute executable="/bin/date" stdout="thedate" host="gf1.ucs.indiana.edu" provider="GT2" redirect="false"/>
However, tracking all output files this way is difficult and time-consuming, because many elements can generate output files; we would have to capture the possible output files from all of those elements.
Another option I can think of is kind of tricky. Each newly submitted workflow is executed in a newly created directory. After execution, the files in the directory (except the workflow file itself) are the output files. This is the method I am using in my implementation.
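A minimal sketch of this trick (names are hypothetical; it simply assumes each workflow runs in its own directory and the workflow file name is known):

import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class OutputFileTracker {
    // Everything left in the working directory after execution,
    // except the workflow file itself, is treated as an output file.
    public static List<File> outputFiles(File workflowDir, String workflowFileName) {
        List<File> outputs = new ArrayList<File>();
        File[] entries = workflowDir.listFiles();
        if (entries != null) {
            for (File f : entries) {
                if (f.isFile() && !f.getName().equals(workflowFileName)) {
                    outputs.add(f);
                }
            }
        }
        return outputs;
    }
}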
(2) How to organize output files?
For the same workflow, we can categorize it based on different criteria; for example, based on the date on which it was submitted, or the date on which it was completed... I would like to use the workflow id and the user id to categorize workflows. All workflows submitted by a user belong to the same group, which can be accessed by that user. Within this group, the workflow id is used to access a specific workflow. The id of every workflow belonging to a user is unique.
So, the directory layout may look like this:
users/user1/workflow_122/output_file1
users/user1/workflow_122/output_file2
users/user2/workflow_1/output_file1
...
(3) How can users access output files?
After talking with Marlon, I would like to provide a RESTful interface by which users can retrieve output files. In my implementation, the URLs to access output files look like this:
http://domain:port/resources/user_name/workflow_id/ retrieves the list of all output files of the corresponding workflow.
http://domain:port/resources/user_name/workflow_id/output_file retrieves the specified output file directly.
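Retrieving a result then needs nothing more than a plain HTTP GET. A sketch with a hypothetical host, user, and workflow id:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class OutputFetcher {
    public static void main(String[] args) throws Exception {
        // hypothetical host, user, workflow id and file name
        URL url = new URL("http://localhost:8080/resources/user1/workflow_122/output_file1");
        BufferedReader in = new BufferedReader(new InputStreamReader(url.openStream()));
        try {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);   // print the output file to the console
            }
        } finally {
            in.close();
        }
    }
}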

Thursday, February 07, 2008

Workflow Queue Support

In the previous implementation, I separated jobs from workflows. A job is a small task which currently is written in the Karajan workflow language (this is a bit confusing: the Karajan workflow language is used to represent a single job). A workflow consists of a variable number of jobs, and a workflow queue consists of some workflows. The workflows in a workflow queue can be executed in arbitrary order, which means that they are totally independent. However, the jobs in a workflow are usually related, so they must be executed in a certain order. Essentially, a workflow queue and a workflow have no fundamental difference, because we can convert between them; the difference is at the logical level.
Now, workflow queues are supported.

Workflow management panel
[Screenshot: wf_management]

When a new workflow is added, a new tab is created.
When an existing workflow is removed, the associated tab is removed from the user interface.
[Screenshot: wf_management2]

When a user creates and adds a new job, he/she can add it to any existing workflow. Because we have created a workflow called "new_workflow1", this workflow appears in the drop-down box.
[Screenshot: wf_management3]

When a user switches to the workflow panel, all jobs belonging to the workflow are displayed on a canvas. What's more, the relationships between jobs are displayed as well.
[Screenshot: wf_management4]

When a user wants to see the details of all jobs in a workflow, he/she can click the button "click to see all jobs" and a pop-up window displays the detailed information.
[Screenshot: wf_management5]

Sunday, January 20, 2008

Add Support for Job Dependency Edit

Lately, I have been working on support for client-side job queue management and job dependency management.
These two parts can be designed and implemented separately, but I think putting them together is better and more user-friendly.
Before jobs are submitted, they are maintained at the client side.
Currently, the following functionalities are supported:
(1) add a job to the job queue
(2) remove a job from the job queue
(3) edit dependencies between jobs.
To make the system easy to use, I provide a visual widget interface.

Main interface:
[Screenshot: job_management_small]
Job addition:
After a user inputs the workflow description in the text area and the workflow name in the text field, he/she can add the workflow to the job queue by clicking the button "Add to Queue". If a job with that name already exists, a prompt window pops up; the user can then choose to overwrite the current job or modify the name.
By clicking the button "Job Management" or the tab "Job Management", the user is redirected to the job management panel.
Note: the name of every workflow must be specified. In other words, the value of the text field "Workflow Name" cannot be left blank. Moreover, different workflows/jobs cannot have the same name, so the name of every job must be unique.

Job Management Panel:
As shown in the picture, every rectangle represents a job and every arrowed line represents a dependency between two jobs.
When a user adds job1, job2, ... jobn, the default relation is that job2 depends on job1, job3 depends on job2, ..., and jobn depends on jobn-1.
When you hover the cursor over a rectangle for a few seconds, a pop-up window displays the content of that job.
[Screenshot: job_dependency_panel]

When you right-click a rectangle, a context menu is displayed. This menu contains several items.
[Screenshot: job_dependency_panel_contextmenu]
Currently, items are "from", "to", "delete" and "edit".
(1) If menu item "delete" is clicked, the corresponding job is deleted permanently from the job queue.
   When a job is deleted, all related dependencies are deleted as well. There are two kinds of dependencies: other jobs may depend on this job, and this job may depend on other jobs.
(2) If menu item "from" is clicked, the corresponding job is marked as the starting point of a dependency. Assume it is called parentJob.
(3) If menu item "to" is clicked, the corresponding job is marked as the end point of the dependency. Assume it is called childJob.
Then there are three possible cases:
  (3.1) parentJob is null.
    In this case, the user has not selected a job by clicking menu item "from", so nothing happens.
  (3.2) parentJob is not yet a prerequisite of childJob.
    A line is drawn from the rectangle representing parentJob to the rectangle representing childJob, and parentJob is recorded as a prerequisite of childJob.
  (3.3) parentJob is already a prerequisite of childJob.
    In this case, there must be an existing line from the rectangle representing parentJob to the rectangle representing childJob. The relation is deleted and the line is removed from the display.
(4) If menu item "edit" is clicked, the system redirects the user to the workflow edit panel.

The user can use drag-and-drop to move the rectangles anywhere on the screen. Related lines (both the lines that start from the job and the lines that end at it) move as well. Note: you cannot use drag-and-drop to move the lines.
The following picture is a sample job dependency graph I got:
[Screenshot: job_dependency_sample]
Currently, all operations above are carried out at the client side; no interaction with the server is involved.
Next step:
The next step is to send the job queue to the server side.
Issue:
In Karajan, I don't think workflow-level composition is supported directly. However, Karajan provides two elements, parallel and sequential, which can control the execution order of subtasks in a workflow.
So one idea is to put all jobs of a job queue into a single big workflow which uses the elements parallel and sequential to represent the original relationships. One natural question is whether the elements parallel and sequential are enough to express every possible relationship among jobs.
My answer is no.
For the job dependency graph shown above, I cannot think of a way to represent it with the Karajan elements parallel and sequential.
Solution 1:
If my conclusion is correct, we can implement a subsystem which manages the order of job submission to the underlying grid infrastructure. In my opinion, it is better placed at the server side.
Solution 2:
Another solution is to simplify the issue at the cost of performance. We can get a job submission sequence by using a topological sort; in other words, all jobs are submitted sequentially. Obviously, performance is not the best, because some jobs could actually be executed in parallel. A sketch follows below.
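Here is a minimal sketch of that topological-sort idea (Kahn's algorithm), using the job1..job5 dependencies from the example above; the dependency map and the "submit" step are illustrative:

import java.util.*;

public class TopoSort {
    public static void main(String[] args) {
        // edges: prerequisite -> dependents (the sample dependencies above)
        Map<String, List<String>> edges = new HashMap<String, List<String>>();
        edges.put("job1", Arrays.asList("job2", "job3", "job5"));
        edges.put("job3", Arrays.asList("job4", "job5"));

        // count incoming edges for every job
        Map<String, Integer> indegree = new HashMap<String, Integer>();
        for (String job : Arrays.asList("job1", "job2", "job3", "job4", "job5")) {
            indegree.put(job, 0);
        }
        for (List<String> dependents : edges.values()) {
            for (String d : dependents) {
                indegree.put(d, indegree.get(d) + 1);
            }
        }

        // repeatedly submit a job whose prerequisites are all done
        Queue<String> ready = new LinkedList<String>();
        for (Map.Entry<String, Integer> e : indegree.entrySet()) {
            if (e.getValue() == 0) ready.add(e.getKey());
        }
        while (!ready.isEmpty()) {
            String job = ready.remove();
            System.out.println("submit " + job);   // sequential submission
            for (String d : edges.getOrDefault(job, Collections.<String>emptyList())) {
                indegree.put(d, indegree.get(d) - 1);
                if (indegree.get(d) == 0) ready.add(d);
            }
        }
    }
}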

Friday, January 11, 2008

Tutorial

(1) Workflow submission

[Screenshot: ui_annotated_1]

(1.1) If you already have an existing workflow, you can just paste it into the input area and click the button “WS Workflow Submission”.

(1.2) If you just want to get familiar with the user interface and the content of the workflow does not matter, you can click the button “click to see example”: a pop-up window containing a sample workflow appears. You can copy the sample, paste it into the input area, and submit it. The pop-up window is hidden if you press the “Esc” key or if you click anywhere else, which causes the pop-up window to lose focus.

After a workflow is submitted, the response from the server is displayed in the area labeled “Response from server”.

Note: the response is appended to the content of the output area. So if you want to discard the existing response, you should clear it first by clicking the button “clear”.

A sample response is:
-------------------------------------------------
(This is done by using web service with status monitoring enabled)
ID for the workflow you just submitted is:
test_79
You can use it as a handle to check its status.
As you can see, the id of the workflow you just submitted is returned; you can then use it to check the workflow's status.

(2) Workflow status query

[Screenshot: ui_annotated_2]

(2.1) If you want to query the status of all submitted workflows belonging to you, just click the button “Get State of All Workflows”.

(2.2) If you want to query the status of specific workflows, input the ids of the workflows whose status you want to query, then click the button “Get State of A Workflow”. Note: the name of the button is a bit misleading; actually, more than one workflow can be queried at a time. Multiple workflow ids should be separated by line feeds, spaces, or tabs.

Note: the response is appended to the content of the output area. So if you want to discard the existing response, you should clear it first by clicking the button “clear”.

(3) Workflow composition

When the widget toolbox is expanded, it looks like this:

[Screenshot: wf_composition1]

When you move the cursor over an element in the panel, a pop-up window shows a brief description of that element. When you move the cursor off the element, the pop-up window disappears.

[Screenshot: wf_composition3]

If you want to insert an element into the workflow, just click the corresponding element in the toolbox panel. If the element has no parameters, the corresponding xml snippet is inserted directly into the workflow. If the element has attributes which need to be set, a window pops up. For the element “execute”, the pop-up window looks like this:

[Screenshot: wf_composition2]

You can specify the values of various attributes and insert the element into the workflow by clicking the button “Save”. You don’t need to specify values for all attributes; just set the ones you need.

The xml snippet is not simply appended to the existing content of the input area. Instead, it is inserted at the current caret position. Moreover, the xml snippet can be inserted so that it encloses your selected text.

Wednesday, January 09, 2008

Milestone

Recent effort:
(1) Added functionality so that a user can query the status of a workflow based on the combination of username and workflow id. Moreover, users can query more than one workflow in a single query.
(2) Modified the user interface.
    Added a tab panel so that the submission panel and the status query panel are separated.
(3) Modified what is returned after a workflow is submitted.
    Originally, after a user submitted a workflow, nothing was returned until the workflow
    was executed completely.
    Now, after a user submits a workflow, the workflow id is returned. The workflow
    id can be used as a handle to query the status of the workflow.
(4) Modified the location of the configuration file.
    Changed the method by which the configuration file is located (absolute path -> relative path).
    This took a lot of time, because it is not so easy to get the current working directory
    in Axis2.

So, now the whole system satisfies our basic requirements.
Client side:
(1) workflow submission
(2) workflow status query
(3) user-friendly visual widget support for workflow composition
Server side:
(1) simple user management
(2) workflow execution (by using the CoG Kit)
(3) workflow status service

Possible future work:
(1) More sophisticated user management system
(2) Security
(3) Support for more powerful status queries.
    For example, query which workflows are completed, which workflows are started but not completed, which workflows have not started to execute ...

Friday, December 28, 2007

State server event arrival order

The state service was implemented before; detailed information is here: http://zhenhua-guo.blogspot.com/2007/12/state-service-implementation-zhenhua.html. However, there were several unsolved problems. Recently, I fixed one of them.
A typical procedure is:
(1) End user sends a workflow to agent server.
(2) Agent server transforms the original workflow into an internal format.
(3) Agent server sends the workflow to status server.
(4) Agent server submits the workflow to executor.
(5) Executor sends event messages to status server to report the progress of execution of the workflow.
The following scenario is possible:
    When the status server receives event messages from the executor, it may not yet have received the workflow from the agent server. In other words, although the workflow is sent by the agent server before the event messages are sent by the executor, the arrival order at the status server is not guaranteed.
In this case, the event messages would be lost and we would have no chance to restore them.
[Solution]
A message buffer.
If the scenario above happens, the received messages are buffered/saved temporarily at the status server. Then, when the workflow is received by the status server, all buffered/saved messages corresponding to that workflow are applied to it.
[For example]
(1) The executor sends the following message to the status server.
    user "tony"                          // user who owns the workflow
    workflow "weatherforecast"           // workflow id which uniquely identifies the workflow
    subjob: "solve equations"            // sub job
    status: started                      // status
(2) The status server receives that message.
    However, by checking its repository, the status server finds that the corresponding workflow has not been received yet.
    So the message is buffered.
(3) The status server receives the following workflow message from the agent server.
    user "tony"
    workflow "weatherforecast"
    workflow content "<project>....</project>"
(4) The buffered message is applied to the workflow, and the status of the workflow is changed to indicate that some element has started to execute.
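A minimal sketch of this buffering logic, with hypothetical types (plain strings) and a "user/workflowId" string as the key:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MessageBuffer {
    // workflows that have arrived, keyed by "user/workflowId"
    private final Map<String, String> workflows = new HashMap<String, String>();
    // event messages that arrived before their workflow did
    private final Map<String, List<String>> pending = new HashMap<String, List<String>>();

    public synchronized void onEventMessage(String key, String message) {
        if (workflows.containsKey(key)) {
            apply(key, message);            // normal case: workflow is known
        } else {
            // workflow not received yet: buffer the message
            List<String> buffered = pending.get(key);
            if (buffered == null) {
                buffered = new ArrayList<String>();
                pending.put(key, buffered);
            }
            buffered.add(message);
        }
    }

    public synchronized void onWorkflow(String key, String content) {
        workflows.put(key, content);
        // replay any messages that arrived too early, in arrival order
        List<String> buffered = pending.remove(key);
        if (buffered != null) {
            for (String m : buffered) {
                apply(key, m);
            }
        }
    }

    private void apply(String key, String message) {
        // update the element tree of the workflow here (omitted)
        System.out.println(key + " <- " + message);
    }
}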

Thursday, December 27, 2007

BugFix and Improvement on Karajan workflow Composition

Last week, I implemented a basic visual Karajan workflow composition interface which eases workflow composition. This is the related post: http://zhenhua-guo.blogspot.com/2007/12/karajan-workflow-composition.html.
This week, I fixed several bugs and made some improvements on top of it.
(1) The configuration of Karajan workflow elements is stored in a JavaScript object like this:
{
    elements: {},
    namespaces: {
        kernel: {
            elements: {
                import: {
                    properties: [],
                    widgetProps: {}
                }
            },
            namespaces: {}
        },
        sys: {
            namespaces: {},
            elements: {
                execute: {
                    properties: ["executable", "host", "stdout", "provider", "redirect"],
                    widgetProps: {height: "40px", width: "40px"}
                },
                echo: {
                    properties: ["message", "nl"],
                    widgetProps: {}
                }
            }
        }
    }
}

Note the element named "import": that name is a Karajan element name but also a keyword of JavaScript. As a result, the object above is not a legal JavaScript object!!! So more work is needed here. To work around this problem and make the architecture more scalable, I added one more layer between the configuration object above and the code that uses it.
I construct an element tree called KarajanElementTree whose nodes are KarajanNSNode or KarajanElementNode objects. A KarajanNSNode corresponds to a namespace in Karajan and a KarajanElementNode corresponds to a usable element in Karajan. In other words, given a workflow configuration, I build a tree based on it. The tree has a fixed interface which programmers can use to access the information of the various workflow elements. When the underlying workflow configuration is modified, I just need to change the implementation of the tree while the interface stays the same. In other words, the workflow configuration and its use are completely separated, so a change to one part does not require a change to the other.
Concretely speaking:
(*) underlying workflow configuration
For those elements whose names are keywords of JavaScript, I append a '_' to the element name and add a new property called "truename" to record the real name. Some people may argue that the real name can be obtained by removing the '_' character from the end of the name. Yes, that is right. However, considering future expansion, my choice is better. For example, maybe one day "import_" becomes a keyword of JavaScript as well, or the '_' character becomes disallowed in property names; then we would only need to modify the code which extracts the real name from the element name.

{
    elements: {},
    namespaces: {
        kernel: {
            elements: {
                import_: {
                    truename: "import",
                    properties: [],
                    widgetProps: {}
                }
            },
            namespaces: {}
        },
        sys: {
            namespaces: {},
            elements: {
                execute: {
                    properties: ["executable", "host", "stdout", "provider", "redirect"],
                    widgetProps: {height: "40px", width: "40px"}
                },
                echo: {
                    properties: ["message", "nl"],
                    widgetProps: {}
                }
            }
        }
    }
}

(**) Intermediate Element Tree
    KarajanElementNode:
    [ properties ]:
        name: name of the element, this is the name used to retrieve the object corresponding to that name in Javascript;
        truename: real Karajan name of the element.
        properties: properties of the Karajan elements;
        widgetProps: properties of the corresponding widget
    KarajanNSNode:
    [ properties ]:
        name: same as above;
        truename: same as above;
elements: contains all elements in this namespace;
        namespaces: contains all sub namespaces in this namespace.
    KarajanElementTree:
    [ properties ]:
        root: root of the tree. Typically, type of the root is KarajanNSNode.
(***) Upper layer that uses the Karajan workflow

	var workflowele = getEle( KarajanElementTree, elementname );
	var realname = workflowele.truename;

Now, to get the information about a Karajan element, we don't need to know the underlying mechanism. For example, the real name of the element can be obtained by accessing the property "truename".

(2) Empty namespace and element list elimination
In the previous implementation, a new accordion panel was created for every namespace and element list, no matter whether they were empty. As a result, the widget toolbox looked cluttered.
Now I have improved it: when a namespace or an element list is empty, no accordion panel is created for it at all.

(3) Add pop-up window to display element descriptions
In Karajan, there are hundreds of elements, and besides those, users can define their own customized elements. It is hard for a user to remember the usage of so many elements. Sometimes a user has used a certain element before but cannot remember its usage; at such times, a simple suggestive message is enough.
So I added a new property "description" to every element, describing the usage and functionality of that element. When the user moves the cursor over a widget, the corresponding description is displayed in a pop-up message window. When the user moves the cursor away, the window disappears.
Screenshot:
[Screenshot: wf_composition3]

(4) Better support for element insertion
In the previous implementation, a Karajan element could only be inserted at the current caret position. But it is possible that the user wants selected text to be enclosed by a certain element.
For example, we have workflow like this:

<project>
	<transfer srchost="aaa.com" srcfile="file1" desthost="localhost" destfile="file1"/>
	<transfer srchost="bbb.com" srcfile="file2" desthost="localhost" destfile="file1"/>
	<transfer srchost="ccc.com" srcfile="file3" desthost="localhost" destfile="file1"/>
</project>

The three transfer jobs are independent of each other, and we would like them to execute in parallel. The Karajan element "parallel" can be used for this. If we only support insertion at the current caret position, the user has to first insert the element "parallel" somewhere, then copy "</parallel>" and paste it after the last transfer job. It is better if the user can select the jobs that should be executed in parallel and have the element "parallel" enclose the selected jobs during insertion, as sketched below.
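The intended result of such an enclosing insertion would look like this (a sketch using the parallel element mentioned above):

<project>
    <parallel>
        <transfer srchost="aaa.com" srcfile="file1" desthost="localhost" destfile="file1"/>
        <transfer srchost="bbb.com" srcfile="file2" desthost="localhost" destfile="file1"/>
        <transfer srchost="ccc.com" srcfile="file3" desthost="localhost" destfile="file1"/>
    </parallel>
</project>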
Now, I have implemented this functionality. However, it sometimes does not work well in IE....

(5) Add more Karajan elements to the configuration object
    Karajan contains so many built-in elements that it is not practical to add them all to the JavaScript configuration object at once. I decided to add them gradually. I have added many so far, but there are still many left...

Thursday, December 06, 2007

State Service Implementation

Zhenhua Guo

Previous posts gave the initial design of the system; they mainly focused on high-level abstraction. Now I have nearly finished implementing the state server, and some minor changes to the design have been made. It is time to elaborate on the implementation of every component in the system.

Architecture

Figure 1

Client

Clients initiate job submission. Currently, the Karajan workflow language is supported and it is the only way to represent a workflow. Every user needs an account to submit jobs; however, the user management system at the agent server has not been decided yet, so I use the built-in user management system and authentication mechanism in Apache Tomcat.
The client invokes functions at the agent server via XML-RPC (the Jsolait JavaScript library is used). At the client side, we provide functionality for converting between JSON and XML.
After authentication, the client sends the user name and the workflow to the agent for job submission.
Then the client periodically sends requests to the state server to check the state of the workflow. The data is represented in JSON.

Agent
The agent uses the Apache XML-RPC toolkit (http://ws.apache.org/xmlrpc/) for its server implementation.
After the user is authenticated, the agent receives a job submission request which consists of a user name and a workflow. The agent then transforms the original workflow into an internal format: echo messages are added to report the progress of execution of the workflow.

For example, if the original workflow is:

<project>
<include file="cogkit.xml"/>
<execute executable="/bin/rm" arguments="-f thedate" host="gf1.ucs.indiana.edu" provider="GT2" redirect="false"/>
<execute executable="/bin/date" stdout="thedate" host="gf1.ucs.indiana.edu" provider="GT2" redirect="false"/>
<transfer srchost="gf1.ucs.indiana.edu" srcfile="thedate" desthost="localhost" provider="gridftp"/>
</project>

then the internal workflow is:

<project>
<include file="cogkit.xml"/>
<echo message="/2|job:execute(/bin/rm) started|1"/>
<execute executable="/bin/rm" arguments="-f thedate" host="gf1.ucs.indiana.edu" provider="GT2" redirect="false"/>
<echo message="/2|job:execute(/bin/rm) completed|2"/>
<echo message="/3|job:execute(/bin/date) started|1"/>
<execute executable="/bin/date" stdout="thedate" host="gf1.ucs.indiana.edu" provider="GT2" redirect="false"/>
<echo message="/3|job:execute(/bin/date) completed|2"/>
<echo message="/4|job:transfer started|1"/>
<transfer srchost="gf1.ucs.indiana.edu" srcfile="thedate" desthost="localhost" provider="gridftp"/>
<echo message="/4|job:transfer completed|2"/>
</project>

    The echo statements added by the agent are the state reporting messages. The format of a message is:
       /3/4/2|job:xxx|1
(“/3/4/2” is the path, “job:xxx” is the job state description and “1” is the status code).
    The first part is the path, which contains all characters before the first ‘|’ character. It indicates the position of the reported element in the whole workflow. The path is /3/4/2 in the sample, which means the element currently reported is the second child of the fourth child of the third child of the root element. The root element is <project>.
    The second part is the job state description, which contains all characters between the two ‘|’ characters. This reports the state of the element.
    The last part is the job state code, which appears after the last ‘|’ character. It is an integer which marks the state. Currently, 1 means the subtask has started and 2 means the subtask has completed.
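For illustration, parsing such a message could look like this (a sketch; the class and method names are mine, not the actual implementation):

public class StateMessage {
    public final String path;        // e.g. "/3/4/2": position in the element tree
    public final String description; // e.g. "job:execute(/bin/date) started"
    public final int statusCode;     // 1 = started, 2 = completed

    public StateMessage(String path, String description, int statusCode) {
        this.path = path;
        this.description = description;
        this.statusCode = statusCode;
    }

    // Parse a message such as "/3/4/2|job:xxx|1".
    public static StateMessage parse(String raw) {
        int first = raw.indexOf('|');
        int last = raw.lastIndexOf('|');
        String path = raw.substring(0, first);
        String description = raw.substring(first + 1, last);
        int code = Integer.parseInt(raw.substring(last + 1));
        return new StateMessage(path, description, code);
    }
}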

A unique id is generated for every workflow of a user. I call it the wfid (workflow id). In other words, every workflow is uniquely identified by the combination of user name and workflow id.
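A trivial sketch of generating such ids (a hypothetical class, not the actual code; ids like "test_79" shown elsewhere in this blog suggest a username_counter form):

import java.util.HashMap;
import java.util.Map;

public class WfIdGenerator {
    // per-user counters
    private final Map<String, Integer> counters = new HashMap<String, Integer>();

    public synchronized String next(String user) {
        int n = counters.containsKey(user) ? counters.get(user) + 1 : 1;
        counters.put(user, n);
        return user + "_" + n;   // e.g. "test_79"
    }
}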

After the transformation is done, the agent sends the original workflow to the state server and the transformed workflow to the executor. Besides the workflows themselves, the agent also sends the corresponding wfid to the state server and the executor.

Note that the executor and the state service are both wrapped as web services.

Executor

This part is wrapped as a web service. The executor receives the transformed workflow from the agent and submits it to the underlying grid infrastructure by using the Java CoG Kit. Currently, I invoke the command line tool provided by the CoG Kit to submit workflows to the grid infrastructure. This may change in the future; the Java CoG API may be a better choice.

When the workflow starts executing, the executor sends a message to the state server to indicate the start of execution of the workflow.

During the execution, state messages are generated to report the progress of execution. Every time a state message is generated, the executor sends it to the state server.

When the workflow is completed, the executor sends a message to the state server to indicate the completion of execution of the workflow.

State server

This part is wrapped as web service.

During job submission, the state server receives the workflow from the agent. Then the state server builds an element tree based on the workflow. State data is stored in every node of the tree. A helper function is provided to serialize the element tree into JSON format or XML format (XML serialization has not been implemented yet).

During job execution, the state server receives state messages from the executor, in the format defined above. The state server updates the state of the corresponding node in the element tree. There are two kinds of state messages: messages about the state of the whole workflow, and messages about the state of a specific element in the workflow.


Issues:

  1. After a user sends a workflow to the agent, what should be returned?

Option 1: unique id of the submitted workflow

The end user can check the state of this submitted workflow by using the unique id. However, it is not easy to get the output of the workflow execution. To do this, the executor or agent needs to store the result temporarily; then, when the end user checks the result, the agent can return it.

Option 2: output of the workflow execution

Because the end user does not know the id of the workflow which was just submitted, the end user can only get the state of all workflows submitted so far. This is the current choice.

  2. Reliable messaging

The executor sends state messages to the state server. Semantically, the order of the state messages should be guaranteed so that it reflects the true order of execution of the subtasks. In the current system, this is not guaranteed. I am considering using WS Reliable Messaging for this.

  3. Guarantee of the order of messages.

From the description of the state server, we know that the state server should handle the workflow from the agent before it handles the state messages from the executor. In figure 1, this means the workflow-delivery step must occur before the state-message step. However, this cannot be guaranteed now: because of unpredictable network delay and process/thread scheduling, I have no way to satisfy that requirement without modifying the code of the state server.

If the state server handles state messages from the executor before it handles the workflow from the agent, it obviously will not find the corresponding workflow in its database.

One solution:

The state server preserves the state messages from the executor. How long should the state messages be preserved? A fixed time? Indefinitely, until the state server receives the workflow from the agent?

Tuesday, November 27, 2007

Task Execution in CogKit

The CoG Kit hides the complexity of backend grid services and provides a uniform interface. To use the CoG Kit, the first question is: how to submit jobs?
The CoG Kit provides several methods for users to submit jobs, which are flexible enough to satisfy almost all types of requirements.
(1) API

This interface is used by programmers. Note that in my case I need the built-in mechanism provided by the CoG Kit to capture events in real time so that progress can be reported to end users. So event support is considered for every kind of API.
(1.1) build jobs in program
With this set of classes, programmers can specify every aspect of a job in their programs. The main classes involved here include Task, Specification (JobSpecification, ...), Service, TaskHandler, ServiceContact, SecurityContext ... I briefly described these classes at http://zhenhua-guo.blogspot.com/2007/10/first-cog-program.html.
A sample program which uses this interface:
// create a task
Task task = new TaskImpl("mytest", Task.JOB_SUBMISSION);

// build the specification of the job
JobSpecification spec = new JobSpecificationImpl();
spec.setExecutable("/bin/ls");
spec.setStdInput(null);
spec.setRedirected(false);
spec.setStdOutput("abstractions-testOutput");
spec.setBatchJob(true);

// create a service object which is the local representation of the remote service
Service service = new ServiceImpl(Service.JOB_SUBMISSION);
service.setProvider("GT2");

SecurityContext sc = null;
try {
    sc = AbstractionFactory.newSecurityContext("GT2");
} catch (Exception e) {
    System.exit(1);
}
sc.setCredentials(null);

ServiceContact scontact = new ServiceContactImpl("abc.com", 1234);

service.setSecurityContext(sc);
service.setServiceContact(scontact);

task.setSpecification(spec);
task.setService(Service.JOB_SUBMISSION_SERVICE, service);

TaskHandler handler = new GenericTaskHandler();

try {
    handler.submit(task);
} catch (Exception e) {
    System.exit(1);
}
Event:
To add a status listener, the addStatusListener method can be used. The concrete status listener must implement the StatusListener interface. However, the granularity of status change reports does not satisfy my requirement: only the "started/completed/failed" status of the whole workflow can be captured. In other words, we cannot get detailed progress about what is going on inside the workflow.
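For example, a listener could be attached like this (a sketch; I assume the StatusListener callback is statusChanged(StatusEvent), so check the actual CoG interfaces):

// Attach a listener before submitting the task.
task.addStatusListener(new StatusListener() {
    public void statusChanged(StatusEvent event) {
        // Only coarse-grained, task-level transitions (started/completed/failed)
        // arrive here; nothing about progress inside the workflow.
        System.out.println("task status: " + event.getStatus().getStatusCode());
    }
});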
(1.2) Karajan workflow language support
When using the interface described above, users spend more time on writing and debugging programs than on the logical representation of the job. This is not what we expect; it is neither convenient nor efficient to program that way. To solve this problem, the CoG Kit team provides additional support for workflow composition --- the Karajan workflow engine. The workflow description can be written in both a native format and an XML format. It supports all the basic elements that a workflow engine should support: user-defined elements, variables, functions, conditional statements (if...else...), loop statements (for, while, ...), sequential execution, parallel execution... In a word, the language is very powerful. Users can now focus mainly on the composition of the workflow instead of on writing and debugging programs. Here, another question crops up: how to submit the workflow description to the engine?
(1.2.1) class KarajanWorkflow
org.globus.cog.karajan.KarajanWorkflow can be used to submit jobs.
Sample code looks like this:
KarajanWorkflow workflow = new KarajanWorkflow();
String filename = "karajan.xml";
File workflowfile = new File(filename);
if (!workflowfile.exists()) {
    System.out.println("The karajan workflow file " + filename + " does not exist!!");
    return;
}
workflow.setSpecification(workflowfile);
workflow.start();
workflow.waitFor();
Event:
However, there is a big drawback here: as far as I know, programmers have no way to capture the events generated during the execution of the workflow.
(1.2.2) class ExecutionContext
Actually, this class is used by class KarajanWorkflow internally; I figured that out when reading the source code. This class provides detailed event reports about the internal execution progress of a workflow.
Sample code looks like:
//load workflow description from a file and construct a tree based
//on the content as logical representation.
ElementTree tree = Loader.load("karajan.xml");
//create execution context
ExecutionContext ec = new ExecutionContext(tree);
ec.start();
ec.waitFor();
Sample code with event handling:
//load workflow description from a file and construct a tree based
//on the content as logical representation.
ElementTree tree = Loader.load("karajan.xml");
//create execution context
ExecutionContext ec = new ExecutionContext(tree);
ec.addEventListener(this); //specify event listener
ec.setMonitoringEnabled(true);
ec.start();
ec.waitFor();
The class which handles events must implement the EventListener interface. The only function that must be implemented is:
public void event(Event e) {
    if (e instanceof StatusMonitoringEvent) {
        // do some operations
    } else if (e instanceof ProgressMonitoringEvent) {
        // do other operations
    }
}
Generally, users want to know which element/node in the workflow generated an event. There is a special class called FlowElement which represents a subpart (execute/transfer/echo/...) of a workflow. You can get the element corresponding to an event by invoking the event.getFlowElement() function. In addition, FlowElement provides methods to get its children so that you can traverse them.
Note: after the workflow is loaded by the system, it is converted to an internal format which is more complex and contains more elements than those you wrote. As a result, a lot of events are generated even if the workflow description file is very simple, so some filtering work is needed here. My solution: all elements are stored in an ElementTree. Then, when an event is received, the target/subject of the event is checked to see whether it is part of the ElementTree; if not, it is simply ignored.
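A sketch of that filtering idea as a fragment of the listener class; getFlowElement() is mentioned above, but getChildren() is an assumed accessor name, and the sets come from java.util:

// Collect the elements of the user-written workflow once (uses java.util.Set/HashSet).
private final Set<FlowElement> knownElements = new HashSet<FlowElement>();

private void collect(FlowElement element) {
    knownElements.add(element);
    for (FlowElement child : element.getChildren()) {   // assumed accessor name
        collect(child);
    }
}

public void event(Event e) {
    // Ignore events whose source is not part of the original ElementTree.
    if (!knownElements.contains(e.getFlowElement())) {
        return;
    }
    // handle events for user-written elements only
}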
(2) Desktop interface
This interface satisfies the requirements of common users, not programmers. The CoG Kit provides both command line tools and a graphical user interface. These functionalities are written as scripts: the script files first do some configuration work (mainly CLASSPATH configuration) and then execute built-in .class files in the CoG Kit package. In other words, they are just a thin wrapper around the API.

Tuesday, October 23, 2007

A problem of JavaCOG

I submitted an execution task. The CoG program seems to wait forever, but the task actually is executed by the remote server.
The reason seems to be that the CoG program expects a response from the remote server (a Globus server or others), but the remote server does not return any response even when the task is completed successfully.
I don't know whether my understanding is correct. I tried different ways to submit a task.
(1) by using the command line tool
    task 1: transfer a local file to a remote machine using gridftp.
    cog-file-transfer.bat -s file:///E:/my_program/web_app/CogTest/bin/testkarajan.xml -d gsiftp://gf1.ucs.indiana.edu/home/zhguo/testkarajan.xml
    This completes successfully.
    task 2: execute a command on a remote machine
    cog-job-submit.bat -s gf1.ucs.indiana.edu -e /bin/ls -stdout list.txt -d /home/zhguo -provider gt2
    This command hangs forever and does not return control to the user until it is terminated forcibly. But the corresponding command (in this case, /bin/ls) is executed on the remote machine and the result is written into the specified file.
    cog-job-submit.bat -s gf1.ucs.indiana.edu -e /bin/ls -stdout list.txt -d /home/zhguo -provider gt2 -b
    This works well.
    So, what is the difference between batch mode and regular mode?
(2) by using the GUI tool
     The same results as above.
(3) by using the API
    I wrote a program which utilized the Karajan engine to handle the Karajan workflow related stuff.
    None of the kinds of invocation work well.

How can this be solved? Is there anything I missed?

Sunday, October 07, 2007

First COG program

Recently, I wrote a simple JavaCOG program which makes use of the Globus middleware installed on gridfarm001. Four kinds of objects in the program are important:
  • Task--abstraction of the work you want to execute on remote machines.
  • JobSpecification
  • Service--local representation of remote service.
  • TaskHandler
To appropriately set the Service object, two other objects are necessary:
  • ServiceContact -- specifies endpoint (host and port) you want to interact with.
  • SecurityContext--specifies the credential you will use to authenticate with the server.
Because Service is the local representation of a remote grid service, a provider must be specified to complete the translation between the upper abstraction and the underlying infrastructure. In my case, the underlying infrastructure is Globus; however, I didn't know the version of Globus. At first, I used the provider GT4, but the program seemed to have no effect before terminating automatically. I added additional debugging and logging statements, but nothing was generated. Debugging step by step, I found that the program was terminated when the handler submitted the task. Then I tried the command line tool provided by JavaCOG and it worked, so I was sure that some configuration in my program was incorrect. Finally, I found that it was because of the version of Globus: gridfarm installs GT2. I didn't know that and assumed it was GT4, so I used GT4 as the service provider. Then you know the result... One bad thing is that my program didn't give any useful information before terminating; no exception could be caught. I don't know why.