Friday, December 28, 2007

State server event arrival order

State service has been implemented before. Detailed information is here: http://zhenhua-guo.blogspot.com/2007/12/state-service-implementation-zhenhua.html. However, there are several unsolved problems. Recently, I fixed one of them.
A typical procedure is:
(1) End user sends a workflow to agent server.
(2) Agent server transforms the original workflow into an internal format.
(3) Agent server sends the workflow to status server.
(4) Agent server submits the workflow to executor.
(5) Executor sends event messages to status server to report the progress of execution of the workflow.
The following scenario is possible:
    When the status server receives event messages from the executor, it may not yet have received the workflow from the agent server. In other words, although the agent server sends the workflow before the executor sends event messages, the arrival order at the status server is not guaranteed.
In this case, the event messages are lost and there is no way to recover them.
[Solution]
Message buffer.
If the scenario above happens, the received messages are buffered temporarily at the status server. Then, when the status server receives a workflow, all buffered messages corresponding to that workflow are applied to it.
[For example]
(1) Executor sends following message to status server.
    user "tony"                      //user who owns the workflow
    workflow "weatherforecast"       //workflow id which uniquely identifies the workflow
    subjob: "solve equations"        //sub job
    status: started                  //status
(2) Status server receives that message.
    However, by checking its repository, status server finds that the corresponding workflow has not been received yet.
    So, that message is buffered.
(3) Status server receives the following workflow message from agent server.
    user "tony"
    workflow "weatherforecast"
    workflow content "<project>....</project>"
(4) Apply the message to the workflow and status of the workflow is changed to indicate that some element starts to execute.
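
The buffering described above can be sketched roughly as follows. This is only an illustration under my own assumptions: the names StatusServer, receiveEvent and receiveWorkflow are hypothetical, messages are kept in memory, and a workflow is identified by the pair (user, workflow id) as in the example.

```javascript
// Hypothetical sketch of the message buffer; not the real status server code.
class StatusServer {
  constructor() {
    this.workflows = new Map(); // key -> { content, events: [] }
    this.pending = new Map();   // key -> event messages that arrived too early
  }

  // A workflow is identified by user name plus workflow id.
  static key(user, workflowId) {
    return JSON.stringify([user, workflowId]);
  }

  // Called when an event message arrives from the executor.
  receiveEvent(msg) {
    const key = StatusServer.key(msg.user, msg.workflow);
    const wf = this.workflows.get(key);
    if (wf) {
      wf.events.push({ subjob: msg.subjob, status: msg.status });
      return "applied";
    }
    // Workflow not received yet: keep the message instead of dropping it.
    if (!this.pending.has(key)) this.pending.set(key, []);
    this.pending.get(key).push(msg);
    return "buffered";
  }

  // Called when the workflow arrives from the agent server.
  // Returns the number of buffered messages that were replayed.
  receiveWorkflow(user, workflowId, content) {
    const key = StatusServer.key(user, workflowId);
    this.workflows.set(key, { content: content, events: [] });
    const buffered = this.pending.get(key) || [];
    this.pending.delete(key);
    for (const msg of buffered) this.receiveEvent(msg);
    return buffered.length;
  }
}
```

Buffered messages are replayed in the order they arrived, so the relative order of events from the executor is kept as the status server saw it.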

Thursday, December 27, 2007

BugFix and Improvement on Karajan workflow Composition

Last week, I implemented a basic visual Karajan workflow composition interface which eases workflow composition. This is the related post: http://zhenhua-guo.blogspot.com/2007/12/karajan-workflow-composition.html.
This week, I fixed several bugs and made some improvements on top of it.
(1) The configuration of Karajan workflow is stored in a javascript object like this:
{
    elements: {},
    namespaces: {
    	kernel:{
    		elements:{
    			import:{
				properties: [],
				widgetProps: {}
    			}
    		},
    		namespaces:{
    		}
    	},
        sys: {
            namespaces: { },
            elements: {
                execute: {
                    properties: ["executable", "host", "stdout", "provider", "redirect"],
                    widgetProps: {height:"40px", width:"40px"}
                },
                echo: {
                    properties: ["message", "nl"],
                    widgetProps: {}
                }
            }
        }
    }
}

Note the element named import in the kernel namespace. "import" is also a reserved word in JavaScript. As a result, the object above is not a legal JavaScript object literal (at least in engines that reject reserved words as unquoted property names), so more work is needed here. To work around this problem and make the architecture more scalable, I added one more layer between the configuration object above and the code that uses it.
I construct an element tree called KarajanElementTree whose nodes are either KarajanNSNode or KarajanElementNode. KarajanNSNode corresponds to a namespace in Karajan and KarajanElementNode corresponds to a usable element in Karajan. In other words, given a workflow configuration, I build a tree based on it. The tree has a fixed interface which programmers use to access the information of the various workflow elements. When the underlying workflow configuration format is modified, I just need to change the implementation of the tree while the interface stays the same. The workflow configuration and the code that uses it are thus separated, so a change to one part does not require a change to the other.
Concretely speaking:
(*) underlying workflow configuration
For elements whose names are JavaScript keywords, I append a '_' to the element name and add a new property called "truename" to record the real name. Some people may argue that the real name can be obtained by removing the trailing '_' from the name. That is right, but an explicit property is more robust against future changes. For example, maybe one day "import_" becomes a JavaScript keyword as well, or the '_' character is no longer allowed in a property name. With the suffix convention, we would then need to modify every piece of code that extracts the real name from the element name; with "truename", only the configuration changes.

{
    elements: {},
    namespaces: {
    	kernel:{
    		elements:{
    			import_:{
				truename: "import",
				properties: [],
				widgetProps: {}
    			}
    		},
    		namespaces:{
    		}
    	},
        sys: {
            namespaces: { },
            elements: {
                execute: {
                    properties: ["executable", "host", "stdout", "provider", "redirect"],
                    widgetProps: {height:"40px", width:"40px"}
                },
                echo: {
                    properties: ["message", "nl"],
                    widgetProps: {}
                }
            }
        }
    }
}

(**) Intermediate Element Tree
    KarajanElementNode:
    [ properties ]:
        name: name of the element, this is the name used to retrieve the object corresponding to that name in Javascript;
        truename: real Karajan name of the element.
        properties: properties of the Karajan elements;
        widgetProps: properties of the corresponding widget
    KarajanNSNode:
    [ properties ]:
        name: same as above;
        truename: same as above;
        elements: contains all elements in this namespace;
        namespaces: contains all sub namespaces in this namespace.
    KarajanElementTree:
    [ properties ]:
        root: root of the tree. Typically, type of the root is KarajanNSNode.
(***) Upper layer that uses the Karajan workflow

	var workflowele = getEle( KarajanElementTree, elementname );
	var realname = workflowele.truename;

Now, to get information about a Karajan element, we don't need to know the underlying mechanism. For example, the real name of the element is obtained by accessing the property "truename".
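
A minimal sketch of how such a tree could be built from the configuration object is given below. The node types and getEle come from the description above, but the constructor signatures, the dotted-path lookup (e.g. "kernel.import_") and the fallback of truename to name are my assumptions for illustration, not necessarily what the real code does.

```javascript
// Illustrative sketch only; constructor signatures are assumptions.
function KarajanElementNode(name, cfg) {
  this.name = name;                      // key used in the configuration object
  this.truename = cfg.truename || name;  // real Karajan name (falls back to name)
  this.properties = cfg.properties || [];
  this.widgetProps = cfg.widgetProps || {};
}

function KarajanNSNode(name, cfg) {
  this.name = name;
  this.truename = cfg.truename || name;
  this.elements = {};
  this.namespaces = {};
  var elements = cfg.elements || {};
  for (var e in elements) {
    this.elements[e] = new KarajanElementNode(e, elements[e]);
  }
  var namespaces = cfg.namespaces || {};
  for (var n in namespaces) {
    this.namespaces[n] = new KarajanNSNode(n, namespaces[n]);
  }
}

function KarajanElementTree(config) {
  this.root = new KarajanNSNode("", config); // the root is a namespace node
}

// Resolve a dotted path such as "kernel.import_" to an element node, or null.
function getEle(tree, path) {
  var parts = path.split(".");
  var elementName = parts.pop();
  var ns = tree.root;
  for (var i = 0; i < parts.length; i++) {
    ns = ns.namespaces[parts[i]];
    if (!ns) return null;
  }
  return ns.elements[elementName] || null;
}
```

With the configuration shown earlier, getEle(tree, "kernel.import_").truename gives "import", while elements without an explicit truename simply report their own name.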

(2) Empty namespace and element list elimination
In the previous implementation, a new accordion panel was created for every namespace and element list, whether or not it was empty. As a result, the widget toolbox looked cluttered.
Now this is improved: when a namespace or an element list is empty, no accordion panel is created for it at all.
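
The check itself is small. Below is a rough sketch, assuming the configuration layout shown above; panelsFor and the panel title strings are hypothetical names used only for illustration, and the actual creation of Ext panels is left out.

```javascript
// True if the object has no own properties.
function isEmpty(obj) {
  for (var k in obj) {
    if (Object.prototype.hasOwnProperty.call(obj, k)) return false;
  }
  return true;
}

// Returns the titles of the panels that should be created for one namespace,
// skipping an empty element list and empty sub namespaces.
function panelsFor(nsName, ns) {
  var panels = [];
  if (!isEmpty(ns.elements || {})) {
    panels.push(nsName + " elements");
  }
  var subs = ns.namespaces || {};
  for (var name in subs) {
    var sub = subs[name];
    if (isEmpty(sub.elements || {}) && isEmpty(sub.namespaces || {})) {
      continue; // nothing to show, so no panel
    }
    panels.push(nsName + "." + name);
  }
  return panels;
}
```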

(3) Add Pop-Up Window to display element description
In Karajan, there are hundreds of elements. Besides that, users can define their own customized elements. It is hard for a user to remember usage of so many elements. Sometimes, a user has used a certain element, but he/she cannot remember the usage of the element. At this time, a simple suggestive message is enough.
So, I add a new property "description" to every element which describes the usage and functionality of that element. When user moves cursor over a widget, the corresponding description is displayed in a pop-up message window. When user moves cursor out, the window disappears.
Screenshot:
wf_composition3 

(4) Better Support in Element insertion
In previous implementation, a Karajan element can just be inserted into current caret position. It is possible that user wants selected text to be enclosed by a certain element.
For example, we have workflow like this:

<project>
	<transfer srchost="aaa.com" srcfile="file1" desthost="localhost" destfile="file1"/>
	<transfer srchost="bbb.com" srcfile="file2" desthost="localhost" destfile="file1"/>
	<transfer srchost="ccc.com" srcfile="file3" desthost="localhost" destfile="file1"/>
</project>

The three transfer jobs are independent of each other, so we would like to execute them in parallel. The Karajan element "parallel" can be used for this. If we only support insertion of elements at the current caret position, the user needs to first insert the element "parallel" somewhere, and then cut "</parallel>" and paste it after the last transfer job. It is better if the user can select the jobs that should be executed in parallel and have the element "parallel" enclose the selected jobs during insertion.
Now, I have implemented this functionality. However, it sometimes does not work well in IE....
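
The core of this feature is plain string manipulation. Here is a small sketch, assuming the selection is given as start/end character offsets into the text of the output panel; the function name wrapSelection is hypothetical, and how the offsets are obtained from the browser is not shown.

```javascript
// Wrap the selected range [start, end) of text in <name>...</name>.
// With an empty selection (start === end) this is a plain insertion
// of an empty element at the caret position.
function wrapSelection(text, start, end, name) {
  var open = "<" + name + ">";
  var close = "</" + name + ">";
  return text.slice(0, start) + open + text.slice(start, end) + close + text.slice(end);
}
```

Handling insertion at the caret and enclosing a selection with the same function keeps the two code paths identical except for the offsets.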

(5) Add more Karajan elements into the configuration object
    Karajan has so many built-in elements that it is not practical to add all of them to the javascript configuration object at once. I decided to add them gradually. I have added many so far, but many are still left...

Thursday, December 20, 2007

Karajan workflow composition

Recently, I have been focusing on providing visual widgets to ease composition of Karajan workflows. It is not practical to build the application from scratch, so I surveyed several prevailing javascript/Ajax frameworks including qooxdoo, prototype, jQuery, Ext and mootools.
Survey of js frameworks:
(1) jQuery(http://jquery.com/)
   As its name implies, its emphasis is query. At first sight, the framework is beautiful and concise. It supports CSS DOM selectors and XPath syntax. Besides, it provides further convenient selection syntax. Some examples follow:
   $("a")
   $("a[@name]")
   $("a[@href=content/resources]")
   $("ul > li")
   $("ul li")
   $("ul .list > a")
   $("#output")
More examples:
   $("li:eq(0)")   //gets the first item
   $("li:lt(3)")    //get the first three items
   $("li:not(.groove)")    //get li elements without class groove
Beautiful, right? We can do a lot of work with just a few lines of code!
Chaining: Most functions return a jQuery object so that you can directly invoke more functions.
$('form#login')
    // hide all the labels inside the form with the 'optional' class
    .find('label.optional').hide().end()

    // add a red border to any password fields in the form
    .find('input:password').css('border', '1px solid red').end()

    // add a submit handler to the form
    .submit(function(){
        return confirm('Are you sure you want to submit?');
    });

Whether you like this kind of code or not, it is functionality provided by jQuery. I prefer multiple lines of code and self-documenting variable names, which look clearer to me.
Plug-ins: jQuery is flourishing, judging by the number of its plug-ins. The number has increased sharply recently and many developers contribute.
However, jQuery does not excel at UI. In other words, if you want to build a fancy user interface, jQuery is not the first choice.

(2)Ext(http://extjs.com/)
   Originally, Ext was based on YUI and it was developed as extension for YUI. Then, Ext broke away from YUI and was developed as an independent project.
   Its emphasis is abundance of UI widgets. It provides many fancy and convenient UI widgets which can be used easily to build our own GUI. The configuration of UI widget looks like this:
   var panel = new Ext.Panel({
      title: "This is title",
      width: 400,
      height: 300,
      border: true,
      layout: "accordion",
      items: [ .... ]
   });
It is more convenient than invoking a bunch of functions to set the values of properties (e.g. panel.setWidth(400); panel.setHeight(300); ...).
Not long ago, integration of jQuery and Ext was announced, which is good news for web application developers. However, progress on the integration is rather slow, and support for jQuery in Ext 2 is limited and buggy.

(3)Qooxdoo
   This framework is fairly comprehensive and includes almost all common functionality. Documentation is not bad. It is growing rapidly and seems promising.
   However, the current version of this framework is 0.7, which means it is still in beta phase and not appropriate for production use. Apart from that, it aims to control the whole web page through Qooxdoo, so it is difficult for end users to directly access or modify the underlying DOM elements. This drawback is annoying because users inevitably want to manipulate underlying elements directly sometimes.

(4)Dojo(http://dojotoolkit.org/)
    This framework is very comprehensive and complex. It is the most powerful framework I have seen. It provides lots of functionality: UI widgets, event system, offline support, presentation... As a result, the framework is somewhat bloated and cumbersome. Bugs are not rare... Besides, the documentation is poor, which makes development more difficult.
    Maybe, in the future, Dojo will become outstanding in terms of functionality, performance and documentation. But for now, it is far from that.

(5)Prototype(http://www.prototypejs.org/)
    This framework adds basic OO features to javascript, e.g. inheritance. It is actually a language (javascript) extension library. Moreover, it extends some built-in objects of javascript (String, Array...) to offer more convenient functionality. script.aculo.us (http://script.aculo.us/) is built on top of prototype and provides UI widgets.
    I read some articles about prototype and it seems that its support for OO features has problems in some situations.

Karajan Workflow Composition
Anyway, finally I chose Ext as my javascript framework.
Some screenshots about the workflow composition panel:
wf_composition1 
The panel is organized according to namespaces, so if the user knows the namespace of an element, it is effortless to find the corresponding widget in the toolbox. All main panels are structured in an accordion layout: if the user clicks the title bar of a panel, that panel is expanded and all other panels are collapsed.

Karajan workflow element edit panel:
wf_composition2 
After the values of the various properties are typed in and the user clicks the "Save" button, the xml snippet corresponding to the element is inserted into the output panel. For the sys.execute element, the xml snippet looks like this
    <sys:execute executable="..." host="..." stdout="..." provider="..." redirect="..."/>
Currently, all values are enclosed in double quotation marks. The reason is that a Karajan workflow is an XML document by nature, and in an XML document the value of every attribute MUST be enclosed in quotation marks.
For other workflow languages, this is not always correct, because some properties are of integer/boolean type and their values should not be enclosed in quotation marks.
The xml snippet is not simply appended to the output panel. Instead, it is inserted at the current cursor position.
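
As an illustration, generating such a snippet from the typed values could look like the sketch below. The function names toXml and escapeAttr are hypothetical; escaping special characters and omitting properties that were left empty are assumptions of this sketch, not something the current implementation is known to do.

```javascript
// Escape characters that are not allowed inside a double-quoted XML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/"/g, "&quot;");
}

// Build e.g. <sys:execute executable="/bin/date" host="localhost"/>.
// properties: ordered list of property names from the configuration object.
// values: map from property name to the value typed by the user.
function toXml(ns, name, properties, values) {
  var xml = "<" + (ns ? ns + ":" : "") + name;
  for (var i = 0; i < properties.length; i++) {
    var v = values[properties[i]];
    if (v === undefined || v === "") continue; // skip properties left empty
    xml += " " + properties[i] + '="' + escapeAttr(v) + '"';
  }
  return xml + "/>";
}
```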

Scalability and Maintainability
    During design, I kept one principle in mind: the built-in elements of Karajan are abundant and users can add their own customized elements. As a result, adding elements to the widget window/toolbox should be easy and scalable.
The configuration is a javascript object:

{
    elements: {},
    namespaces: {
        sys: {
            namespaces: {
                file: {
                    elements: {
                        read: {
                            properties: ["name"],
                            widgetProps: { }
                        },
                        write: {
                            properties: ["name", "append"],
                            widgetProps: {}
                        }
                    },
                    namespaces: {}
                }
            },
            elements: {
                execute: {
                    properties: ["executable", "host", "stdout", "provider", "redirect"],
                    widgetProps: {height:"40px", width:"40px"}
                },
                echo: {
                    properties: ["message", "nl"],
                    widgetProps: {}
                },
                parallel: {
                    properties: [],
                    widgetProps: {}
                },
                sequential: {
                    properties: [],
                    widgetProps: {}
                }
            }
        }
    }
}

In Karajan, namespace is supported. For every namespace, there are two properties: elements and namespaces. Property elements contains information about those elements directly in the namespace. Property namespaces contains information about sub namespaces.
In the above example, namespace sys contains elements execute, echo, parallel and sequential and it contains the sub namespace file. Namespace file in turn contains elements read and write and no sub namespaces.
For every element, it contains two properties: properties and widgetProps. Property properties contains list of parameters about the elements. Property widgetProps contains configuration information about how to display the corresponding widget in the toolbox window.
In above example, element execute has properties executable, host, stdout, provider and redirect.
To add more elements, I just need to modify the configuration object shown above. Obviously, it is convenient to modify it.
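
To show how little code the toolbox needs in order to pick up new elements, here is a small sketch that walks a configuration object of this shape and lists the fully qualified names of all elements. The function name listElements is hypothetical.

```javascript
// Recursively collect names such as "sys.execute" or "sys.file.read".
// ns: a namespace object with optional "elements" and "namespaces" properties.
// prefix: qualified name of ns followed by ".", or "" for the root.
function listElements(ns, prefix) {
  var names = [];
  var elements = ns.elements || {};
  for (var e in elements) {
    names.push(prefix + e);
  }
  var namespaces = ns.namespaces || {};
  for (var n in namespaces) {
    names = names.concat(listElements(namespaces[n], prefix + n + "."));
  }
  return names;
}
```

Adding an element then only means adding one entry to the object, for example config.namespaces.sys.elements.wait = { properties: [], widgetProps: {} }, and the traversal finds it without any other change.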

Improvements we can do in the future
(1) Now I list all properties of an element in the edit panel. For some elements, the number of properties is more than ten, but only some properties are used frequently and the others are seldom used. In the future, we can hide the optional properties at first and display only the necessary ones. If the user wants to use all properties, we show the optional properties as well.
(2) Search functionality. The number of elements may be enormous and it is painful to browse all namespaces to find the desired element.
However, this improvement is not strictly necessary. The official website of CogKit provides a reference manual for Karajan workflow which includes detailed information about all Karajan elements. So the user can first consult the reference manual for the desired element and get its full name, which includes the name of the namespace in which the element is located. Given the namespace, it is very easy to find the corresponding widget in the toolbox.

Friday, December 14, 2007

Client Side Enhancement

Obviously, composing an XML workflow file directly is not a pleasant job. Users may spend more time on XML formatting (start tags, end tags ...) than on business logic. So it is worthwhile to provide auxiliary tools that ease composition of workflows. Visual widgets which support drag and drop can be used. In Web 2.0, there are several mainstream technologies including Ajax, Flash, Flex, Silverlight and OpenLaszlo. I think Ajax is preferred because it runs on almost all platforms; support for Ajax is built into almost every prevailing browser.
If we decide to use Ajax, it is time consuming to build our application from scratch, considering there are so many mature Ajax frameworks. Some of them are: dojo, qooxdoo, jquery, prototype, mootools, GWT, YUI, Ext ... I have surveyed these frameworks.
GWT is developed by Google and YUI is developed by Yahoo. GWT provides conversion from Java to Javascript so that programmers can use Java to build web application. However, the customization of the webpage (layout, style ...) is not convenient.
Prototype, Dojo, Qooxdoo, jQuery and mootools are all open source and each has its own strengths and weaknesses. However, the differences are subtle. I used Qooxdoo in a previous project and I don't recommend it. All the other frameworks seem good, and further investigation is necessary to pick the most appropriate one for us.

Thursday, December 06, 2007

State Service Implementation

Zhenhua Guo

Previous posts gave initial design of the system. They mainly focus on high level abstraction. Now, I have nearly implemented the state server and some minor changes of the design have been made. It is time to elaborate the implementation of every component in the system.

Architecture

Figure 1

Client

Clients initiate the job submission. Currently, the Karajan workflow language is supported and it is the only way to represent a workflow. In fact, every user needs an account to submit jobs. However, currently the user management system at the agent server has not been decided. So I use the built-in user management system and authentication mechanism in Apache Tomcat.
Client invokes functions at the agent server by XML-RPC (Jsolait javascript library is used.). At client side, we provide functionality of converting between JSON and XML.
After authentication, what the client sends to the agent is user name and workflow for job submission.
Then the client periodically sends requests to state server to check the state of the workflow. The data is represented in JSON.

Agent
Agent uses Apache XML-RPC toolkit (http://ws.apache.org/xmlrpc/) to build server implementation.
After the user is authenticated, agent should receive job submission request which consists of user name and workflow. Then agent transforms original workflow into an internal format. Echo messages are added to report the progress of execution of the workflow.

For example, if original workflow is:

<project>
<include file="cogkit.xml"/>
<execute executable="/bin/rm" arguments="-f thedate" host="gf1.ucs.indiana.edu" provider="GT2" redirect="false"/>
<execute executable="/bin/date" stdout="thedate" host="gf1.ucs.indiana.edu" provider="GT2" redirect="false"/>
<transfer srchost="gf1.ucs.indiana.edu" srcfile="thedate" desthost="localhost" provider="gridftp"/>
</project>

then the internal workflow is:

<project>
<include file="cogkit.xml"/>
<echo message="/2|job:execute(/bin/rm) started|1"/>
<execute executable="/bin/rm" arguments="-f thedate" host="gf1.ucs.indiana.edu" provider="GT2" redirect="false"/>
<echo message="/2|job:execute(/bin/rm) completed|2"/>
<echo message="/3|job:execute(/bin/date) started|1"/>
<execute executable="/bin/date" stdout="thedate" host="gf1.ucs.indiana.edu" provider="GT2" redirect="false"/>
<echo message="/3|job:execute(/bin/date) completed|2"/>
<echo message="/4|job:transfer started|1"/>
<transfer srchost="gf1.ucs.indiana.edu" srcfile="thedate" desthost="localhost" provider="gridftp"/>
<echo message="/4|job:transfer completed|2"/>
</project>

    The echo statements are state reporting messages which are added by the agent. The format of a message is:
       /3/4/2|job:xxx|1
(“/3/4/2” is the path, “job:xxx” is the job state description and “1” is the status code).
    The first part is the path, which contains all characters that appear before the first ‘|’ character. This part indicates the position of the reported element in the whole workflow. The path is /3/4/2 in the sample, which means the reported element is the second child of the fourth child of the third child of the root element. The root element is <project>.
    The second part is the job state description, which contains all characters between the two ‘|’ characters. This reports the state of the element.
    The last part is the job state code, which appears after the last ‘|’ character. This is an integer which marks the state. Currently, 1 means the subtask is started and 2 means the subtask is completed.
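
Splitting a message into these three parts is straightforward. A minimal sketch follows; the function name parseStateMessage and the shape of the returned object are assumptions made for illustration.

```javascript
// Parse a state message such as "/3/4/2|job:xxx|1".
// The path is everything before the first '|', the state code is everything
// after the last '|', and the description is what lies between them.
function parseStateMessage(message) {
  var first = message.indexOf("|");
  var last = message.lastIndexOf("|");
  if (first === -1 || first === last) {
    throw new Error("Malformed state message: " + message);
  }
  var path = message.slice(0, first)
    .split("/")
    .filter(function (part) { return part !== ""; })
    .map(Number); // 1-based child positions, starting below the root element
  return {
    path: path,
    description: message.slice(first + 1, last),
    code: parseInt(message.slice(last + 1), 10)
  };
}
```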

A unique id is generated for every workflow of a user. I call it wfid(workflow id). In other words, every workflow is uniquely identified by combination of user name and workflow id.

After the transformation is done, agent sends original workflow to state server and sends transformed workflow to executor. Besides the workflows themselves, agent also sends corresponding wfid to state server and executor.
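
The transformation from the original workflow to the internal one can be sketched as below. This is a simplified illustration, not the agent's actual code: it only handles direct children of <project>, it takes the children as already-parsed objects with hypothetical fields name, attrs and xml, and it assumes that only execute and transfer elements are treated as jobs, as in the example above.

```javascript
// Elements that get "started"/"completed" echo messages (assumption).
var JOB_ELEMENTS = { execute: true, transfer: true };

// Description part of the message, e.g. "job:execute(/bin/rm)".
function describe(child) {
  return child.name === "execute"
    ? "job:execute(" + child.attrs.executable + ")"
    : "job:" + child.name;
}

// children: child elements of <project> in document order.
// Returns the lines of the transformed workflow body.
function instrument(children) {
  var out = [];
  for (var i = 0; i < children.length; i++) {
    var child = children[i];
    if (!JOB_ELEMENTS[child.name]) {
      out.push(child.xml); // e.g. <include .../> is copied unchanged
      continue;
    }
    var path = "/" + (i + 1); // 1-based position in the ORIGINAL workflow
    var desc = describe(child);
    out.push('<echo message="' + path + "|" + desc + ' started|1"/>');
    out.push(child.xml);
    out.push('<echo message="' + path + "|" + desc + ' completed|2"/>');
  }
  return out;
}
```

Note that the path refers to positions in the original workflow, which is what the state server stores, not to positions in the transformed one.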

Note executor and state service are both wrapped as web services.

Executor

This part is wrapped as web service. Executor receives transformed workflow from agent and submits the workflow to underlying grid infrastructure by using Java CoGKit. Currently, I invoke command line tool provided by CoGKit to submit workflows to grid infrastructure. In the future, this may be changed. API of Java CoG may be a better choice.

When the workflow starts to be executed, executor sends message to state server to indicate the starting of execution of workflow.

During the execution, state messages are generated to report progress of execution. Every time a state message is generated, executor sends state message to state server.

When the workflow is completed, executor sends message to state server to indicate the completion of execution of workflow.

State server

This part is wrapped as web service.

During job submission, state server receives the workflow from agent. Then state server builds an element tree based on the workflow. State data is stored in every node of the workflow. Helper function is provided to serialize the element tree into JSON format or XML format (XML format serialization has not been implemented).

During job execution, state server receives state message from executor. Format of state message is defined above. State server updates state of corresponding node in the element tree. There are two kinds of state messages here. One is state messages related to state of whole workflow. The other is state messages related to state of a specific element in the workflow.
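
Updating the element tree from a parsed state message might look like the following sketch. The node shape ({ name, state, children }) and the function names findByPath and applyState are hypothetical; the path is the list of 1-based child positions described earlier.

```javascript
// Walk from the root along a path of 1-based child positions.
// Returns the node, or null if the path does not exist in the tree.
function findByPath(root, path) {
  var node = root;
  for (var i = 0; i < path.length; i++) {
    node = (node.children || [])[path[i] - 1];
    if (!node) return null;
  }
  return node;
}

// Store the new state code (1 = started, 2 = completed) in the addressed node.
// Returns false when no node matches the path.
function applyState(root, path, code) {
  var node = findByPath(root, path);
  if (!node) return false;
  node.state = code;
  return true;
}
```

Since the tree in this sketch is made of plain objects, JSON.stringify(root) is enough to produce a JSON representation for the client.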


Issues:

  1. After a user sends a workflow to agent, what should be returned?

Option 1: unique id of the submitted workflow

The end user can check state of this submitted workflow by using the unique id. However, it is not easy to get the output of the workflow execution. To do this, executor or agent needs to store the result temporarily. Then when end user checks the result, agent can return it.

Option 2: output of the workflow execution

Because end user does not know id of the workflow which was submitted just now, the end user can only get state of all submitted workflows so far. This is current choice.

  2. Reliable messaging

Executor sends state messages to state server. Semantically, order of the state messages should be guaranteed which reflects the true order of execution of subtasks. In current system, this is not guaranteed. I am considering using WS Reliable Messaging to do this.

  3. Guarantee of the order of messages.

From the description of the state server, we know that the state server should handle the workflow from the agent before it handles state messages from the executor. In figure 1, this means the step in which the agent sends the workflow to the state server should occur before the step in which the executor sends state messages to the state server. However, this cannot be guaranteed now. Because of unpredictable network delay and process/thread scheduling, I have no way to satisfy that requirement without modifying the code of the state server.

If state server handles state messages from executor before it handles workflow from agent, obviously it will not find the corresponding workflow in its database.

One solution:

The state server preserves the state messages from the executor. How long should the state messages be preserved? For a fixed time? Indefinitely, until the state server receives the workflow from the agent?

Friday, November 30, 2007

Detailed Architecture

Zhenhua Guo

Job Submission


Assumption:

User has logged in.

Detailed Steps:

  1. End user submits workflow to the agent server. The user name must be submitted at the same time.

  2. Agent checks the validity of the user.

Then it generates a unique user id based on the user name and a unique workflow id based on the workflow.

  3. Agent server sends user id, workflow id and workflow to both status server and executor.

  4. Executor fetches a proxy certificate from MyProxy server.

  5. Executor submits workflow and proxy certificate to back-end grid infrastructure.

The grid infrastructure then actually executes the workflow.

  6. Grid infrastructure returns result to executor.

  7. Executor returns result to agent server.

  8. Agent server returns result to end user.

Note: workflow id will be used in status query.

Job Status Query


Functionality:

Support querying the status of all workflows submitted by a user. Currently, the user cannot single out one specific workflow to query.

Event notification:

  1. During the execution of a workflow by underlying grid infrastructure, executor is notified when status of the workflow changes.

  2. Executor sends notification to status server.

The request consists of:

  • user id: to identify the user who owns the workflow

  • workflow id: to uniquely identify a workflow submitted by the user. It is necessary because a user can submit more than one workflow.

  • element: the element in a workflow of which status changes. A workflow contains many sub tasks/elements. This piece of information identifies the element of which status changes.

  • status: the new status of an element.

Status query:

  1. Client sends job query request to agent. Note: currently only user name is submitted.

  2. Agent checks validity of the user. The user account is transformed to a user id.

  3. Agent sends the user id to status server to query status of workflows submitted by this user.

  4. Status server returns status report to agent. The user id is returned as well.

  5. Agent sends result to end user.

Tuesday, November 27, 2007

Task Execution in CogKit

CogKit hides complexity of backend grid services and provides a uniform interface. To use CogKit, the first question should be: how to submit jobs.
CogKit provides several methods for users to submit jobs, which is flexible enough to satisfy almost all types of requirements.
(1) API

This interface is used by programmers. Note, in my case, I need built-in mechanism provided by CogKit to capture events in real time so that progress can be reported to end users. So, event support is considered in every kind of API.
(1.1) build jobs in program
With this set of classes, programmers can specify every aspect of a job in a program. The main classes involved include Task, Specification (JobSpecification, ...), Service, TaskHandler, ServiceContact, SecurityContext ... I have briefly described these classes at http://zhenhua-guo.blogspot.com/2007/10/first-cog-program.html.
Sample program which uses this interface:
//create a task
Task task = new TaskImpl("mytest", Task.JOB_SUBMISSION);

//build specification about the job
JobSpecification spec = new JobSpecificationImpl();
spec.setExecutable("/bin/ls");
spec.setStdInput(null);
spec.setRedirected(false);
spec.setStdOutput("abstractions-testOutput");
spec.setBatchJob(true);

//create service object which is local representation of remote service
Service service = new ServiceImpl(Service.JOB_SUBMISSION);
service.setProvider("GT2");

SecurityContext sc = null;
try {
    sc = AbstractionFactory.newSecurityContext("GT2");
} catch (Exception e) {
    System.exit(1);
}
sc.setCredentials(null);

ServiceContact scontact= new ServiceContactImpl("abc.com", 1234);

service.setSecurityContext(sc);
service.setServiceContact(scontact);

task.setSpecification(spec);
task.setService(Service.JOB_SUBMISSION_SERVICE,service);

TaskHandler handler = new GenericTaskHandler();

try {
    handler.submit( task );
} catch (Exception e) {
    System.exit(1);
}
Event:
To add a status listener, the addStatusListener method can be used. The concrete status listener must implement the StatusListener interface. However, the granularity of the status change reports does not satisfy my requirement. Only the status "started/completed/failed" of the whole workflow can be captured. In other words, we cannot get detailed progress about what is going on inside the workflow.
(1.2)Karajan workflow language support
When using the interface described above, users spend more time on writing and debugging programs than on the logical representation of the job, so programming that way is neither convenient nor efficient. To solve this problem, the CogKit team provides additional support for workflow composition: the Karajan workflow engine. The workflow description can be written in either the native format or the XML format. It supports all the basic elements that a workflow engine should support: user-defined elements, variables, functions, conditional statements (if...else...), loop statements (for, while, ...), sequential execution, parallel execution... In a word, the language is very powerful. Now, users can focus mainly on composition of the workflow instead of writing and debugging programs. Here, another question crops up: how to submit the workflow description to the engine?
(1.2.1) class KarajanWorkflow
org.globus.cog.karajan.KarajanWorkflow can be used to submit jobs.
Sample code looks like this:
KarajanWorkflow workflow = new KarajanWorkflow();
String filename = "karajan.xml";
File workflowfile = new File(filename);
if( !workflowfile.exists() ){
 System.out.println("The karajan workflow file " + filename +" does not exist!!");
 return ;
}
workflow.setSpecification( workflowfile );
workflow.start();
workflow.waitFor();
Event:
However, there exists a big drawback here. As far as I know, programmers have no way to capture events generated during the execution of the workflow.
(1.2.2) class ExecutionContext
Actually, this class is used internally by class KarajanWorkflow. I figured this out when I read the source code. This class provides detailed event reports about the internal execution progress of a workflow.
Sample code looks like:
//load workflow description from a file and construct a tree based
//on the content as logical representation.
ElementTree tree = Loader.load("karajan.xml");
//create execution context
ExecutionContext ec = new ExecutionContext(tree);
ec.start();
ec.waitFor();
Sample code with event handling:
//load workflow description from a file and construct a tree based
//on the content as logical representation.
ElementTree tree = Loader.load("karajan.xml");
//create execution context
ExecutionContext ec = new ExecutionContext(tree);
ec.addEventListener(this); //specify event listener
ec.setMonitoringEnabled(true);
ec.start();
ec.waitFor();
The class which handles events must implement the EventListener interface. The only method that must be implemented is:
public void event(Event e){
    if (e instanceof StatusMonitoringEvent) {
        //do some operations
    } else if (e instanceof ProgressMonitoringEvent) {
        //do other operations
    }
}
Generally, users want to know which element/node in the workflow generated an event. There is a special class called FlowElement which represents a subpart (execution/transfer/echo/...) of a workflow. You can get the element corresponding to an event by invoking event.getFlowElement(). In addition, FlowElement provides methods to get its children so that you can traverse the workflow.
Note: After the workflow is loaded by the system, it is converted into an internal format which is more complex and contains more elements than those you wrote. As a result, a lot of events are generated even if the workflow description file is very simple, so some filtering is needed. My solution: all elements are stored in an ElementTree. When an event is received, the target/subject of the event is checked to see whether it is part of the ElementTree. If not, the event is ignored.
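The filtering idea can be sketched roughly as follows. This is a minimal, self-contained illustration, not CoG Kit code: Node and ElementFilter are hypothetical stand-ins for FlowElement and for the listener, and the sketch assumes that user-written elements can be recognised by object identity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a workflow element (not the CoG Kit FlowElement class).
class Node {
    final String name;
    final List<Node> children = new ArrayList<>();

    Node(String name) { this.name = name; }

    // Adds a child and returns this node so calls can be chained.
    Node add(Node child) { children.add(child); return this; }
}

// Remembers every element of the user-written tree and rejects events from anything else.
class ElementFilter {
    // Identity-based set: two elements with the same name are still distinct nodes.
    private final Set<Node> known = Collections.newSetFromMap(new IdentityHashMap<>());

    ElementFilter(Node root) { collect(root); }

    // Depth-first traversal that records the root and all of its descendants.
    private void collect(Node n) {
        known.add(n);
        for (Node c : n.children) collect(c);
    }

    // True if the event source belongs to the user-written workflow.
    boolean accept(Node eventSource) { return known.contains(eventSource); }
}

class FilterDemo {
    public static void main(String[] args) {
        Node transfer = new Node("transfer");
        Node root = new Node("project").add(new Node("execute")).add(transfer);
        ElementFilter filter = new ElementFilter(root);

        Node internal = new Node("transfer"); // same name, but not part of the user's tree
        System.out.println("user element accepted: " + filter.accept(transfer));
        System.out.println("internal element accepted: " + filter.accept(internal));
    }
}
```

In a real listener, the accept check would be the first thing done inside event(Event e), before any further processing.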
(2) desktop interface
This interface satisfies the requirements of common users, not programmers. CogKit provides both command line tools and a graphical user interface. These functionalities are written as scripts. The script files first do some configuration work (mainly CLASSPATH configuration) and then execute built-in .class files in the CogKit package. In other words, they are just a thin wrapper around the API.

Sunday, November 25, 2007

Refinement

Zhenhua Guo

Architecture:

Job Submission and Execution

Query:

Some issues:

Class Design

Client-side Classes:

Agent side Classes

Executor side classes

Architecture:

Job Submission and Execution

Query:

Some issues:

(1) Authentication

As we all know, username and password is a fundamental authentication method.

Should we support other types of authentication, especially users’ certificate authentication?

The answer is no. Generally, client-side javascript is not allowed to access the local file system. Although some systems provide extensions to support local file access (e.g. the IE ActiveX object FileSystemObject), this is not portable. As a result, to get the user's certificate we must provide a text area on the web page where the user can paste his/her certificate. Web pages mostly cope with printable characters, whereas a certificate contains lots of non-printable characters, so the user cannot simply copy and paste its content. In a word, lots of tricky work must be done. Meanwhile, our system becomes less convenient to use because the user must do some additional work.

Now let's rethink this issue. Why do we need to support certificate authentication? In our system, HTTPS is used as the underlying transmission protocol, so we don't need to worry about eavesdropping attacks. By imposing strength constraints on users' passwords, we can guarantee a desirable security level. So I think username and password authentication is enough for us.

(2) Built-in authentication vs. our own authentication

Tomcat provides built-in support for username and password authentication. However, sometimes it is not flexible enough to satisfy our requirements.

(3) Work flow support

Do we need to support workflow-level composition? For example, one workflow depends on another workflow, or two workflows can be executed in parallel.

If answer is yes, does Karajan workflow engine support it?

If not, we must support it in javascript and now we encounter another big issue: javascript does not provide multithread support. I think we can do some workaround here. But the performance and functionality may not satisfy us.

(4) Job manipulation

To manipulate a job (cancel, suspend, resume ...), client should have a valid handle to refer to the job. I am not sure how users can get this handle. I think this handle should be returned when user submits a job. Then when user wants to manipulate the job, he/she needs to send that handle to tell agent which job will be manipulated.
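The handle idea could look roughly like the sketch below. It is only an illustration under assumptions: JobRegistry, its State enum and the use of a random UUID as the handle are hypothetical choices, not part of CogKit or of our current code.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical agent-side registry; all names are illustrative only.
class JobRegistry {
    enum State { RUNNING, SUSPENDED, CANCELLED }

    private final Map<String, State> jobs = new ConcurrentHashMap<>();

    // Submitting a job returns an opaque handle that the client keeps.
    String submit(String workflowDescription) {
        String handle = UUID.randomUUID().toString();
        jobs.put(handle, State.RUNNING);
        return handle;
    }

    // Each operation succeeds only if the handle is known and the job is in a suitable state.
    boolean suspend(String handle) {
        return jobs.replace(handle, State.RUNNING, State.SUSPENDED);
    }

    boolean resume(String handle) {
        return jobs.replace(handle, State.SUSPENDED, State.RUNNING);
    }

    boolean cancel(String handle) {
        return jobs.replace(handle, State.RUNNING, State.CANCELLED)
            || jobs.replace(handle, State.SUSPENDED, State.CANCELLED);
    }

    // Returns null for an unknown handle.
    State state(String handle) { return jobs.get(handle); }
}

class JobRegistryDemo {
    public static void main(String[] args) {
        JobRegistry registry = new JobRegistry();
        String handle = registry.submit("<project>....</project>");
        registry.suspend(handle);
        System.out.println(handle + " -> " + registry.state(handle));
    }
}
```

Using an unguessable handle also means that a client cannot manipulate a job it did not submit simply by guessing an identifier, although real access control would still need the user's identity.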

(5) Status checking

Does the user have to be authenticated to do status checking? In other words, do we allow status checking issued by anonymous users?

What mechanisms are used to support status checking?

Ø check on demand

Every time a client sends a status-checking request, the status repository sends a similar request to the executor... This mechanism guarantees that the client always gets an accurate, real-time result. However, it may incur unnecessary workload: even if the status of a certain job has not changed, the whole process must still be performed.

In this case, a separate status repository does not benefit us except that it results in a clearer logic representation.

Ø update on demand

Every time the status of a job changes, the executor notifies the status repository. When a user sends a request to get the status of his/her jobs, the agent server gets the result from the status repository, and the status repository does not need to send a request to the executor. By this means, unnecessary communication is eliminated.

I am not sure whether Java CoG provides callback support so that a specified action will be performed if a certain event occurs.
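To make the update-on-demand mechanism concrete, here is a minimal sketch. It assumes the executor can call back into the repository somehow; StatusRepository and its method names are hypothetical, and statuses are plain strings for simplicity.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical status repository for the "update on demand" mechanism.
class StatusRepository {
    private final Map<String, String> statusByJob = new ConcurrentHashMap<>();

    // Called by the executor whenever the status of a job changes.
    void onStatusChange(String jobId, String newStatus) {
        statusByJob.put(jobId, newStatus);
    }

    // Called by the agent server to answer a client query.
    // The answer comes from local state; the executor is not contacted.
    String query(String jobId) {
        return statusByJob.getOrDefault(jobId, "unknown");
    }
}

class StatusRepositoryDemo {
    public static void main(String[] args) {
        StatusRepository repository = new StatusRepository();
        repository.onStatusChange("weatherforecast", "started");
        repository.onStatusChange("weatherforecast", "completed");
        System.out.println(repository.query("weatherforecast"));
        System.out.println(repository.query("not-submitted"));
    }
}
```

The cost of a query is a single map lookup no matter how many clients ask, which is the main advantage over check on demand.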

Class Design

Client-side Classes:

Foundation Stones:

1) Workflow

Description:

This class is used to encapsulate functionality related to workflow manipulation. A workflow can contain more than one task.

Now, only Karajan workflow language is supported.

Methods:

addTask(Task); //add a task to workflow description

removeTask(Task) //remove a task from workflow description

setWorkflow(String workflowDescription); //set workflow description directly

toXML() //convert description into a string which can be transferred on the wire.

2) WFStatus

Description:

This class contains status of a workflow.

Methods:

getStatus();

3) Task

Description:

This class represents a detailed task. The task

Methods:

initialize(String description) //For example, initialize(“<transfer source=... destination=...>”);

toXML();

4) WFGraph

Description:

This class contains a list of workflows which will be executed and maintains the relationships among these workflows. The detailed representation may vary (queue, stack...).

Methods:

addWorkflow(Workflow);

removeWorkflow(Workflow);
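One possible representation of WFGraph is a dependency graph, sketched below. This is an assumption rather than a decided design: workflows are identified by plain strings, addDependency and executionOrder are hypothetical methods that do not appear in the list above, and the ordering uses a standard topological sort.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of WFGraph as a dependency graph between workflows.
class WFGraphSketch {
    // successors.get(w) lists the workflows that must wait for w to finish.
    private final Map<String, List<String>> successors = new LinkedHashMap<>();

    void addWorkflow(String id) {
        successors.putIfAbsent(id, new ArrayList<>());
    }

    // Declares that workflow "then" may only start after workflow "first".
    void addDependency(String first, String then) {
        addWorkflow(first);
        addWorkflow(then);
        successors.get(first).add(then);
    }

    // Returns an order in which every workflow comes after the ones it depends on.
    List<String> executionOrder() {
        // Count, for each workflow, how many unfinished prerequisites it has.
        Map<String, Integer> pending = new LinkedHashMap<>();
        for (String id : successors.keySet()) pending.put(id, 0);
        for (List<String> next : successors.values())
            for (String id : next) pending.merge(id, 1, Integer::sum);

        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : pending.entrySet())
            if (e.getValue() == 0) ready.add(e.getKey());

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String id = ready.remove();
            order.add(id);
            for (String next : successors.get(id))
                if (pending.merge(next, -1, Integer::sum) == 0) ready.add(next);
        }
        if (order.size() != successors.size())
            throw new IllegalStateException("cyclic dependency between workflows");
        return order;
    }
}

class WFGraphDemo {
    public static void main(String[] args) {
        WFGraphSketch graph = new WFGraphSketch();
        graph.addDependency("fetch data", "solve equations");
        graph.addDependency("solve equations", "plot");
        System.out.println(graph.executionOrder());
    }
}
```

Workflows that are ready at the same time have no dependency between them, so they are the candidates for parallel execution.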

5) Agent

Description:

This class communicates with agent server.

Methods:

getURL(); setURL(); //getter and setter for URL of agent

getPort(); setPort();

Workflow manipulation:

Job manipulation:

6) Submit/cancel/suspend/resume/

Description:

This class submits workflows to remote server.

Methods:

submit(WFGraph, Agent);

submit(Workflow, Agent);

...

Monitor:

7) Monitor

Description:

This class contains functionalities which are needed to monitor status of submitted workflows by a user.

Methods:

getStatus(WFGraph, Workflow, Agent); //get status of a workflow,

getStatus(WFGraph, Agent); //get status of all workflows in a workflow graph

Authentication:

8) User

Description:

This class represents a user in our system. Note: this user account does not necessarily appear in the backend grid service account list.

Methods:

getUsername();

getPassword();

toXML();

9) UPAuth (Username Password authentication)

Description:

This class handles authentication of users.

Methods:

auth(User, Agent); //send authentication request to agent server

Agent side Classes

Agent server sits between client and backend executor. So it communicates with two parties.

Communication with client:

Accept requests from clients. Requests include:

(1) Login

(2) Logout

(3) Register

(4) Job submission

(5) Job status query

(6) Job cancellation

(7) Job suspension

(8) Job resumption

In the meanwhile, agent sends requests to backend executor. Requests include:

(1) Job submission

(2) Job status query

(3) Job cancellation

(4) Job suspension

(5) Job resumption

Note: I assume agent and executor trust each other, which is guaranteed by underlying authentication system.
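The two request lists above suggest a simple routing rule inside the agent: account-related requests are answered by the agent itself, and job-related requests are forwarded to the executor. A minimal sketch follows; RequestType and AgentRouter are hypothetical names, and returning a string is just a placeholder for real dispatching.

```java
import java.util.EnumSet;
import java.util.Set;

// The eight request types a client may send to the agent (names are illustrative).
enum RequestType {
    LOGIN, LOGOUT, REGISTER,
    JOB_SUBMISSION, JOB_STATUS_QUERY, JOB_CANCELLATION, JOB_SUSPENSION, JOB_RESUMPTION
}

// Hypothetical router that decides which side handles a request.
class AgentRouter {
    // Account-related requests never leave the agent.
    private static final Set<RequestType> LOCAL =
            EnumSet.of(RequestType.LOGIN, RequestType.LOGOUT, RequestType.REGISTER);

    // Returns "agent" for locally handled requests and "executor" for forwarded ones.
    static String route(RequestType type) {
        return LOCAL.contains(type) ? "agent" : "executor";
    }
}

class AgentRouterDemo {
    public static void main(String[] args) {
        for (RequestType type : RequestType.values()) {
            System.out.println(type + " -> " + AgentRouter.route(type));
        }
    }
}
```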

Executor side classes

Executor receives requests from agent and sends requests to MyProxy server or underlying grid service provider.

Requests from agent:

(1) Job submission

(2) Job status query

(3) Job cancellation

(4) Job suspension

(5) Job resumption

Requests sent to MyProxy (Java CoG is used here):

(1) Fetch a certificate

(2) Renew a certificate?

Requests sent to grid service provider (Java CoG is used here):

(1) Job submission

(2) Job status query

(3) Job cancellation

(4) Job suspension

(5) Job resumption

Monday, November 12, 2007

Design alternatives

Zhenhua Guo

Where should the control logic be located, at service side or client side?

(1) Client side

There will be lots of javascript work.

To use JavaCoG, there are two choices:

(a) Support task/taskgraph submission:

In javascript, mirror related classes (Task, Specification, Taskgraph … ) of JavaCoG. Those classes in javascript just send encoded instructions/commands to agent server.

Like this:

In a work submission, a client needs to contact server many times besides status checking.


(b) Support Karajan workflow submission:

The client javascript must support the Karajan workflow language, including parsing it, generating corresponding control logic... This functionality has been implemented at JavaCoG. So, I don't think it is reasonable to reimplement it in Javascript.

One issue:

  • How to submit task to the agent?

  1. Client splits the original workflow description into separate smaller ones.

Client javascript maintains relationship (parallel or sequential …) among these small pieces. When needed, it submits a certain workflow piece to agent.

If so, maybe submitting the whole Karajan workflow is better.

  2. Convert the workflow description to detailed instructions and then send these instructions in the way described above in (a).

(2) Server side

Client side composes the Karajan workflow document based on the users' requirement and then sends it to server.


Summary:

Method (2) is preferred.

Notification:

(1) Pull

Client side javascript program periodically sends messages to server to check the status of submitted task. The interval of polling is difficult to choose.

One solution is that the pull action is driven by users. We can put a button on the webpage. If the user wants to update the status report, just click the button. Obviously, this is not user-friendly.

(2) Push

In javascript, a client program can not listen to a specified port like regular socket programming. However, we can make use of the request-reply communication style to indirectly simulate that functionality. Every time the client sends a job status checking request to the server, server does not return immediately the response message until some conditions(job is completed, job fails or time out) are satisfied.
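On the server side, holding a request until something happens could be sketched as below. This is only an illustration of the idea, not servlet code: StatusChannel is a hypothetical class, one instance is assumed per job, and a real server would call awaitChange from the request-handling thread.

```java
// Hypothetical per-job channel that lets a status request wait for a change.
class StatusChannel {
    private String status = "submitted";

    // Called when the executor reports a new status; wakes up waiting requests.
    synchronized void update(String newStatus) {
        status = newStatus;
        notifyAll();
    }

    // Blocks until the status differs from lastSeen or the timeout elapses,
    // then returns the current status in either case.
    synchronized String awaitChange(String lastSeen, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        try {
            while (status.equals(lastSeen)) {
                long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMillis <= 0) break; // timed out: answer with the unchanged status
                wait(remainingMillis);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt flag and answer now
        }
        return status;
    }
}

class LongPollDemo {
    public static void main(String[] args) {
        StatusChannel channel = new StatusChannel();
        // Simulates the executor reporting completion from another thread.
        new Thread(() -> channel.update("completed")).start();
        System.out.println(channel.awaitChange("submitted", 5000));
    }
}
```

The client passes the last status it saw, so a change that happened between two requests is returned immediately instead of being missed.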



Complex workflow support

I am not very clear about support provided by Karajan for complex workflow. Can a workflow be part of another larger workflow?

If so, we don’t bother to do it.

If not, we need to complete simple control logic in client side javascript. Support sequential execution, parallel execution...

UI consideration

Do we need to provide client side UI components to help users build workflow more quickly? Are the UI components workflow language specific?


Server side components:

  1. User management

    1. write our own management system

    2. use existing system

How to integrate it into our system?

Where to store user related data, including users’ identity information, submitted tasks...? In database, or just regular files?

  2. Job management

queue...


How to check status of submitted jobs by using CogKit?

Wednesday, October 31, 2007

Install JDK, Tomcat and Axis2 on Ubuntu

Previously, I deployed all java related stuff on my windows machine. Now, I would like to deploy on linux.
My environment:
Operating System: Ubuntu 6.06.1 LTS
JDK 6 installation
In Ubuntu, an open source Java environment has been included. It is Gcj(http://gcc.gnu.org/java/). It is a GNU project. However, I prefer Sun JDK.
(1) download and install Sun JDK 1.6
Ubuntu is a branch of Debian, so it naturally inherits the great package management tool -- apt-get. It is an easy job to install packages on Ubuntu.
First, you should add "universe" and "multiverse" repositories to your /etc/apt/sources.list file. Then, execute following command:
   sudo apt-get install sun-java6-jdk
By default, it will be installed to /usr/lib/jvm/.
(2) configure it
Because we now have two Java compilers/interpreters installed, configuration is a must to make things work the way I expect. In directory /usr/lib/jvm/, there are several *.jinfo files which contain information about installed jdk/jre packages. You can change jdk/jre alternatives by using the command update-java-alternatives:
   sudo update-java-alternatives -l   //list all available jre/jdk installations
   sudo update-java-alternatives -s java-6-sun  //set sun's jdk to be used
Or, you can use:
   sudo update-alternatives --config java
Then a list of available jdk/jre installations is displayed. In my case, it is:
There are 3 alternatives which provide `java'.

  Selection    Alternative
-----------------------------------------------
          1    /usr/bin/gij-wrapper-4.1
 +        2    /usr/lib/jvm/java-gcj/jre/bin/java
*         3    /usr/lib/jvm/java-6-sun/jre/bin/java

Press enter to keep the default[*], or type selection number: 
Then, set the environment variable. Add the following to the file /etc/environment:
   JAVA_HOME="/usr/lib/jvm/java-6-sun"
(3) test
Type:
javac -version
java -version
to see whether your configuration works.
Install Tomcat 5.5
(1) Download and install
sudo apt-get install tomcat5.5 tomcat5.5-admin tomcat5.5-webapps
Note: here I installed three packages. tomcat5.5 contains the basic implementation of the Servlet and JSP specifications. tomcat5.5-admin contains two web-based management interfaces. tomcat5.5-webapps contains documents and some sample web applications.
(2)Configure
I installed webapps and admin.
Set the CATALINA_HOME environment variable. In my case, CATALINA_HOME=/usr/share/tomcat5.5.
The example apps are installed to "/usr/share/tomcat5.5/webapps/".
Global configuration files (server.xml and web.xml) are located in "/etc/tomcat5.5".
To make use of the web-based management tool, you need an account. You can set up your account by modifying file /usr/share/tomcat5.5/conf/tomcat-users.xml. Add the following line:
   <user username="username"  password="password" roles="manager,admin" />
Then, restart tomcat
   sudo /etc/init.d/tomcat5.5 restart
Note: "admin" and "manager" are two different roles.
(3)Start/Stop server
Script used to start/stop tomcat server is /etc/init.d/tomcat5.5.
sudo /etc/init.d/tomcat5.5 start
sudo /etc/init.d/tomcat5.5 stop
sudo /etc/init.d/tomcat5.5 restart
sudo /etc/init.d/tomcat5.5 force-reload
sudo /etc/init.d/tomcat5.5 status
However, you had better use the scripts in directory /usr/share/tomcat5.5/bin/. I guess /etc/init.d/tomcat5.5 is intended to be used internally.
      /usr/share/tomcat5.5/bin/startup.sh
      /usr/share/tomcat5.5/bin/shutdown.sh
(4)Test
With this installation, Tomcat listens on port 8180 by default. On Windows, it listens on 8080. Strange...
You can visit http://localhost:8180 to check whether your installation succeeds.
(5) Deploy your own web app
Your application should be put into directory "/usr/share/tomcat5.5/webapps/". You can use a .war file or just regular file system tree.

Problem:

   After installing Tomcat 5.5, I typed http://localhost:8180 to do test. Everything seemed to work well -- management, administration ... However, when I tried the JSP example at http://156.56.104.196:8180/jsp-examples/, error occurred!
Error message is:

HTTP Status 500 -


type Exception report

message

description The server encountered an internal error () that prevented it from fulfilling this request.

exception

org.apache.jasper.JasperException: Unable to load class for JSP
 org.apache.jasper.JspCompilationContext.load(JspCompilationContext.java:598)
 org.apache.jasper.servlet.JspServletWrapper.getServlet(JspServletWrapper.java:147)
 org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:315)
 org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
 org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
 javax.servlet.http.HttpServlet.service(HttpServlet.java:802)

root cause
......

I made sure that all related .jar files were positioned correctly. What was weird was that servlet example could run correctly while JSP example always incurred error. I wrote a simple JSP file:
   <% out.println("Hello,world"); %>
After deploying it, I found that it worked!!!

Solution
After searching with Google, I finally found this page, http://forum.java.sun.com/thread.jspa?threadID=693082&messageID=4028997, which gives the solution. The cause is an incorrect configuration in the Tomcat 5.5 distribution!!! I simply repeat the solution here:
Edit /usr/share/tomcat5.5/webapps/jsp-examples/WEB-INF/tagPlugins.xml.
Original one is:

<tag-plugins>
  <tag-plugin>
    <tag-class>org.apache.taglibs.standard.tag.rt.core.IfTag</tag-class>
    <plugin-class>org.apache.jasper.tagplugins.jstl.If</plugin-class>
  </tag-plugin>
  <tag-plugin>
    <tag-class>org.apache.taglibs.standard.tag.common.core.ChooseTag</tag-class>
    <plugin-class>org.apache.jasper.tagplugins.jstl.Choose</plugin-class>
  </tag-plugin>
  <tag-plugin>
    <tag-class>org.apache.taglibs.standard.tag.rt.core.WhenTag</tag-class>
    <plugin-class>org.apache.jasper.tagplugins.jstl.When</plugin-class>
  </tag-plugin>
  <tag-plugin>
    <tag-class>org.apache.taglibs.standard.tag.common.core.OtherwiseTag</tag-class>
    <plugin-class>org.apache.jasper.tagplugins.jstl.Otherwise</plugin-class>
  </tag-plugin>
  <tag-plugin>
    <tag-class>org.apache.taglibs.standard.tag.rt.core.ForEachTag</tag-class>
    <plugin-class>org.apache.jasper.tagplugins.jstl.ForEach</plugin-class>
  </tag-plugin>
</tag-plugins>
Note that the <plugin-class> lines (highlighted in orange in the original post) are not correct: the package name is missing ".core". The altered copy is:
<tag-plugins>
  <tag-plugin>
    <tag-class>org.apache.taglibs.standard.tag.rt.core.IfTag</tag-class>
    <plugin-class>org.apache.jasper.tagplugins.jstl.core.If</plugin-class>
  </tag-plugin>
  <tag-plugin>
    <tag-class>org.apache.taglibs.standard.tag.common.core.ChooseTag</tag-class>
    <plugin-class>org.apache.jasper.tagplugins.jstl.core.Choose</plugin-class>
  </tag-plugin>
  <tag-plugin>
    <tag-class>org.apache.taglibs.standard.tag.rt.core.WhenTag</tag-class>
    <plugin-class>org.apache.jasper.tagplugins.jstl.core.When</plugin-class>
  </tag-plugin>
  <tag-plugin>
    <tag-class>org.apache.taglibs.standard.tag.common.core.OtherwiseTag</tag-class>
    <plugin-class>org.apache.jasper.tagplugins.jstl.core.Otherwise</plugin-class>
  </tag-plugin>
  <tag-plugin>
    <tag-class>org.apache.taglibs.standard.tag.rt.core.ForEachTag</tag-class>
    <plugin-class>org.apache.jasper.tagplugins.jstl.core.ForEach</plugin-class>
  </tag-plugin>
</tag-plugins>
Install Axis2 1.3 in Tomcat 5.5
(1) Download Axis2 1.3 Release
I chose to download .war file which would be deployed in tomcat 5.5
Command:
   wget http://www.eng.lsu.edu/mirrors/apache/ws/axis2/1_3/axis2-1.3-war.zip
(2) Deploy axis2.
Copy the downloaded axis2.war to /usr/share/tomcat5.5/webapps/. Then restart Tomcat. Tomcat automatically extracts and deploys the .war file.
(3) Test
Go to http://localhost:8180/axis2/. You should see the welcome page of Axis2.
Axis2 provides a user-friendly web-based interface to view all available services, validate the installation, upload new service (actually what are uploaded are .aar files)...
Standard binary distribution of Axis2 1.3
In fact, if you want to develop web service by using Axis2, you should download standard binary distribution which contains complete version of Axis2.
The most important parts of Axis2 are the tools under directory bin. These tools include axis2.sh/axis2.bat, axis2server.sh/axis2server.bat, java2wsdl.sh/java2wsdl.bat and wsdl2java.sh/wsdl2java.bat. I described the usage of these tools in my last post.

Tuesday, October 30, 2007

Web Service Development and Deployment using Axis2

Goal
Develop and deploy web service in Apache Tomcat by using Axis2 1.2/1.3.
Prerequisite:
(1) Install JDK, Tomcat and Axis2.
      Keys:
         (a) Versions of these packages must match.
         (b) JDK is recommended. Though JRE works most of the time, some features of the additional libraries/packages may not work well.
(2) Configure JDK. I think the most important part is to set the environment.
     There are already lots of tutorials around the internet. Typically, you should set environment variables : JAVA_HOME and CLASSPATH.
(3) Configure Tomcat
     Tomcat must be run on JVM. So path of JVM needs to be configured correctly.
(4) Deploy Axis2
     Axis2 is deployed in Tomcat. You can put the axis2.war file into the webapps directory of Tomcat. Tomcat automatically extracts and deploys it.
     In addition, you should set AXIS2_HOME environment variable to point to the location of axis2.
Procedure
Now, let's start to write our service prototype.
(1) Define interface in Java.
My sample interface:
File Add.java
package test;
interface Add{
   int addoperation( int op1, int op2 ) ;
}
(2) Now, compile Add.java to .class file. Command:
     javac Add.java
(3) Then, we will generate WSDL document based on the interface defined above.
     The tool is Java2WSDL which is included in Axis2 package.  My command used to generate WSDL document is:
          Java2WSDL -cn test.Add
     The newly generated WSDL document is Add.wsdl.
     Note: (1) Java2WSDL generates WSDL document based on .class file, not .java source file. So step (2) is necessary.
              (2) The Java2WSDL command must be invoked in the right directory: the package name must match
                   the directory hierarchy. In my case, the directory tree is D:/apps/demo/test/Add.java, and when I invoke
                   Java2WSDL the current working directory is D:/apps/demo/.
(4) Use WSDL2Java (included in Axis2 package) to generate stub Java functions based on WSDL document.
There are two parts: client side and server side.
Detailed information about WSDL2Java/Java2WSDL is here: http://ws.apache.org/axis/java/user-guide.html#UsingWSDLWithAxis
(4.1) Server side
First you need to copy the generated Add.wsdl to the machine where service is to be located. In my case, the Add.wsdl is stored at D:/apps/demo/service.
(4.1.1) Generate stub functions
Again, WSDL2Java is used. Command is:    WSDL2Java -uri C:/apps/demo/test/Add.wsdl -p test -d adb -s -ss -sd -ssi
After execution of this command, the directory tree looks like:
D:/apps/demo/service/
   Add.wsdl 
   resource
      Add.wsdl
      services.xml
   src
      test
         AddMessageReceiverInOut.java
         Addoperation.java
         AddoperationResponse.java
         AddSkeleton.java
         AddSkeletonInterface.java
         ExtensionMapper.java
   build.xml
(4.1.2) Add functionality of your service to generated methods
File AddSkeleton.java contains code directly related to your web service implementation. Initial content of AddSkeleton.java is:
/**
 * AddSkeleton.java
 *
 * This file was auto-generated from WSDL
 * by the Apache Axis2 version: 1.3  Built on : Aug 10, 2007 (04:45:47 LKT)
 */
package test;

/**
 *  AddSkeleton java skeleton for the axisService
 */
public class AddSkeleton implements AddSkeletonInterface {
    /**
     * Auto generated method signature
     * @param addoperation0
     */
    public test.AddoperationResponse addoperation(
        test.Addoperation addoperation0) {
        //TODO : fill this with the necessary business logic
        throw new java.lang.UnsupportedOperationException("Please implement " +
            this.getClass().getName() + "#addoperation");
    }
}

I modify AddSkeleton.java to add my web service implementation. New content is:
package test;

public class AddSkeleton implements AddSkeletonInterface {
    public test.AddoperationResponse addoperation(
        test.Addoperation addoperation0) {
         //first, get two parameters
         int op1 = addoperation0.getParam0();
         int op2 = addoperation0.getParam1();
         int sum = op1 + op2; //calculate summation
         AddoperationResponse resp = new AddoperationResponse();//construct response object
         resp.set_return( sum );//set the return value
         return resp;
    }
}
(4.1.3) Build a service
Change directory to D:/apps/demo/service/, then execute the following command:
ant jar.server
After execution, directory tree looks like:
D:/apps/demo/service/
   Add.wsdl 
   resource
      Add.wsdl
      services.xml
   src
      test
         AddMessageReceiverInOut.java
         Addoperation.java
         AddoperationResponse.java
         AddSkeleton.java
         AddSkeletonInterface.java
         ExtensionMapper.java
   build.xml
   build
      classes
         META-INF
            Add.wsdl
            services.xml
         test
            some .class files
      lib
         Add.aar
Note: Add.aar is the service archive.
(4.1.4) Deploy the service
Copy Add.aar to the "services" directory (Tomcat_directory/webapps/axis2/WEB-INF/services) of the deployed axis2. Axis2 also provides a web-based interface to manage the deployed services. You can use that interface to upload your .aar file.
(4.2) Client side
(4.2.1) Generate stub functions
My command used to generate client-side Web service stub functions is:
    WSDL2Java -uri Add.wsdl -p test -d adb -s
After execution of this command, directory tree should look like:
D:/apps/demo/ 
   Add.wsdl 
   test
      Add.java 
      Add.class
   src
      test
         AddStub.java
   build.xml
The file AddStub.java contains the newly generated client-side stub functions.
(4.2.2) develop your code to use web service
(4.2.2.1) Write source code to invoke remote web service
Create a new file called Client.java (Actually, you can use any name you like) in the directory D:/apps/demo/src/test/.
Content of the file:
package test;

import java.io.*;
import test.AddStub.Addoperation;
import test.AddStub.AddoperationResponse;

public class Client {
    public static int op1 = 0;
    public static int op2 = 0;
    public static void main(java.lang.String[] args) {
        try {
            AddStub stub = new AddStub("http://localhost:8080/axis2/services/Add");
            BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
            System.out.println("Please input the first operand:");
            String input = in.readLine();
            op1 = Integer.parseInt( input );
            System.out.println("Please input the second operand:");
            input = in.readLine();
            op2 = Integer.parseInt( input );
            add(stub);
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println("\n\n\n");
        }
    }

    /* invoke the 'add' web service */
    public static int add(AddStub stub) {
        try {
            Addoperation req = new Addoperation();
            req.setParam0(op1);
            req.setParam1(op2);
            AddoperationResponse resp = new AddoperationResponse();
            resp = stub.addoperation(req);//invoke the web service
            System.out.println("done");
            System.out.println(op1 + "+"+ op2 + " is " + resp.get_return());
            return resp.get_return();
        } catch (Exception e) {
            e.printStackTrace();
            System.out.println("\n\n\n");
        }
        return 0;
    }
}
Note: In axis2 1.4, the generated code is different from code generated by axis2 1.3.
In axis2 1.3, parameters of a request are set using setParam0, setParam1,...
In axis2 1.4, parameters of a request are set using setter functions of those properties, e.g. setName, setId.
Also, the method to get the return value is different. There may be some other changes I don't know about, so I encourage you to consult the official Axis2 documentation.
(4.2.2.2) compile the source file and execute it.
Change your current working directory to D:/apps/demo/, then execute the following command:
    ant jar.client
Then, a new directory "build" is created. Now the directory looks like:
D:/apps/demo/ 
   Add.wsdl 
   test
      Add.java 
      Add.class
   src
      test
         AddStub.java
   build.xml
   build
      classes
         test
            some .class files
      lib
         Add-test-client.jar
Now, you can use java command to execute the program.
Remember: first you must set CLASSPATH to include the axis2 .jar libraries and the newly created Add-test-client.jar. An alternative is to use the -cp option of the java command. The command to execute your program is:
    java test.Client      #with CLASSPATH being set
or
  java -cp <axis2_jars>:./Add-test-client.jar test.Client      #without CLASSPATH being set
To relieve programmers of this fussy work, axis2 provides a tool, axis2.bat/axis2.sh. This script automatically sets CLASSPATH to include all the .jar files of axis2. How does the script know where your axis2 is installed? You must create an environment variable called AXIS2_HOME which contains the path where axis2 is installed. However, the newly created Add-test-client.jar is not included in the classpath by the axis2.bat/axis2.sh script, so you still need to add it yourself. Execute your program using the command:
    axis2.sh -cp ./Add-test-client.jar test.Client
In fact, I wrote my own script to ease invocation of that program. Now, I don't need to set long CLASSPATH or input long command line with -cp option.