Friday, November 30, 2007

Detailed Architecture

Zhenhua Guo

Job Submission


Assumption:

User has logged in.

Detailed Steps:

  1. The end user submits a workflow to the agent server. The username must be submitted along with it.

  2. The agent checks the validity of the user.

Then it generates a unique user id based on the user name and a unique workflow id based on the workflow.

  3. The agent server sends the user id, workflow id and workflow to both the status server and the executor.

  4. The executor fetches a proxy certificate from the MyProxy server.

  5. The executor submits the workflow and the proxy certificate to the backend grid infrastructure.

The grid infrastructure then actually executes the workflow.

  6. The grid infrastructure returns the result to the executor.

  7. The executor returns the result to the agent server.

  8. The agent server returns the result to the end user.

Note: workflow id will be used in status query.
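
The id generation in step 2 could be sketched as follows. This is only an illustration under assumptions: the class IdGenerator and its methods are hypothetical names, and name-based UUIDs plus a per-agent sequence number are one possible scheme, not necessarily the one the agent uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper; none of these names come from the project or from CogKit. */
final class IdGenerator {
    // Per-agent sequence number, so identical workflow text still gets distinct ids.
    private final AtomicLong counter = new AtomicLong();

    /** The same username always maps to the same user id (name-based UUID). */
    String userIdFor(String username) {
        return UUID.nameUUIDFromBytes(username.getBytes(StandardCharsets.UTF_8)).toString();
    }

    /** Workflow id derived from the owner, a sequence number and the workflow text. */
    String workflowIdFor(String userId, String workflowDescription) {
        long seq = counter.incrementAndGet();
        String name = userId + "\n" + seq + "\n" + workflowDescription;
        return UUID.nameUUIDFromBytes(name.getBytes(StandardCharsets.UTF_8)).toString();
    }
}
```

With this scheme the user id is stable across logins, while resubmitting the same workflow produces a new workflow id, which keeps later status queries unambiguous.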

Job Status Query


Functionality:

Supports querying the status of all workflows submitted by a user. Currently, the user cannot restrict the query to one particular workflow.

Event notification:

  1. While the underlying grid infrastructure executes a workflow, the executor is notified whenever the status of the workflow changes.

  2. The executor sends a notification to the status server.

The notification consists of:

  • user id: to identify the user who owns the workflow

  • workflow id: to uniquely identify a workflow submitted by the user. It is necessary because a user can submit more than one workflow.

  • element: the workflow element whose status changed. A workflow contains many subtasks (elements), and this field identifies which one the notification is about.

  • status: the new status of an element.
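
The four fields above could be carried in a small value class like the one below. It is a minimal sketch: the class name StatusNotification, the use of plain strings for every field, and the key() helper are assumptions made for illustration only.

```java
import java.util.Objects;

/** Hypothetical message type; the fields mirror the list above. */
final class StatusNotification {
    final String userId;      // owner of the workflow
    final String workflowId;  // which of the user's workflows
    final String element;     // element inside the workflow whose status changed
    final String status;      // new status of that element

    StatusNotification(String userId, String workflowId, String element, String status) {
        this.userId = Objects.requireNonNull(userId, "userId");
        this.workflowId = Objects.requireNonNull(workflowId, "workflowId");
        this.element = Objects.requireNonNull(element, "element");
        this.status = Objects.requireNonNull(status, "status");
    }

    /** Key under which a status server could store the latest status of this element. */
    String key() {
        return userId + "/" + workflowId + "/" + element;
    }

    @Override
    public String toString() {
        return key() + " -> " + status;
    }
}
```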

Status query:

  1. The client sends a job query request to the agent. Note: currently only the user name is submitted.

  2. The agent checks the validity of the user. The user account is transformed into a user id.

  3. The agent sends the user id to the status server to query the status of the workflows submitted by this user.

  4. The status server returns a status report to the agent. The user id is returned as well.

  5. The agent sends the result to the end user.

Tuesday, November 27, 2007

Task Execution in CogKit

CogKit hides the complexity of backend grid services and provides a uniform interface. The first question when using CogKit is how to submit jobs.
CogKit provides several ways to submit jobs, and together they are flexible enough to satisfy almost all types of requirements.
(1) API

This interface is used by programmers. Note that in my case I need a built-in CogKit mechanism for capturing events in real time so that progress can be reported to end users. Event support is therefore considered for every kind of API below.
(1.1) build jobs in program
With this set of classes, programmers can specify every aspect of a job in code. The main classes involved include Task, Specification (JobSpecification, ...), Service, TaskHandler, ServiceContact, SecurityContext ... I have briefly described these classes at http://zhenhua-guo.blogspot.com/2007/10/first-cog-program.html.
Sample program which uses this interface:
//create a task
Task task = new TaskImpl("mytest", Task.JOB_SUBMISSION);

//build the specification of the job
JobSpecification spec = new JobSpecificationImpl();
spec.setExecutable("/bin/ls");
spec.setStdInput(null);
spec.setRedirected(false);
spec.setStdOutput("abstractions-testOutput");
spec.setBatchJob(true);

//create a service object, the local representation of the remote service
Service service = new ServiceImpl(Service.JOB_SUBMISSION);
service.setProvider("GT2");

//create a security context for the GT2 provider
SecurityContext sc = null;
try {
    sc = AbstractionFactory.newSecurityContext("GT2");
} catch (Exception e) {
    e.printStackTrace();
    System.exit(1);
}
sc.setCredentials(null);

//host and port of the remote service
ServiceContact scontact = new ServiceContactImpl("abc.com", 1234);

service.setSecurityContext(sc);
service.setServiceContact(scontact);

task.setSpecification(spec);
task.setService(Service.JOB_SUBMISSION_SERVICE, service);

//submit the task through a generic handler
TaskHandler handler = new GenericTaskHandler();
try {
    handler.submit(task);
} catch (Exception e) {
    e.printStackTrace();
    System.exit(1);
}
Event:
A status listener can be added with the addStatusListener method. The concrete listener must implement the StatusListener interface. However, the granularity of the status change reports does not satisfy my requirement: only the "started/completed/failed" status of the whole workflow can be captured. In other words, we cannot get detailed progress about what is going on inside the workflow.
(1.2) Karajan workflow language support
With the interface described above, users spend more time writing and debugging programs than on the logical representation of the job, which is neither convenient nor efficient. To solve this problem, the CogKit team provides additional support for workflow composition: the Karajan workflow engine. A workflow description can be written in either the native format or XML. The language supports all the basic elements a workflow engine should offer: user-defined elements, variables, functions, conditional statements (if...else...), loop statements (for, while, ...), sequential execution, parallel execution... In short, the language is very powerful, and users can focus on composing the workflow instead of writing and debugging programs. This raises another question: how to submit a workflow description to the engine?
(1.2.1) class KarajanWorkflow
org.globus.cog.karajan.KarajanWorkflow can be used to submit jobs.
Sample code looks like this:
KarajanWorkflow workflow = new KarajanWorkflow();
String filename = "karajan.xml";
File workflowfile = new File(filename);
if (!workflowfile.exists()) {
    System.out.println("The karajan workflow file " + filename + " does not exist!!");
    return;
}
workflow.setSpecification(workflowfile);
workflow.start();
workflow.waitFor();
Event:
There is a big drawback here, however. As far as I know, programmers have no way to capture the events generated during the execution of the workflow.
(1.2.2) class ExecutionContext
Actually, this class is used internally by class KarajanWorkflow, which I found out by reading the source code. It provides detailed event reports about the internal execution progress of a workflow.
Sample code looks like this:
//load workflow description from a file and construct a tree based
//on the content as logical representation.
ElementTree tree = Loader.load("karajan.xml");
//create execution context
ExecutionContext ec = new ExecutionContext(tree);
ec.start();
ec.waitFor();
Sample code with event handling:
//load workflow description from a file and construct a tree based
//on the content as logical representation.
ElementTree tree = Loader.load("karajan.xml");
//create execution context
ExecutionContext ec = new ExecutionContext(tree);
ec.addEventListener(this); //specify event listener
ec.setMonitoringEnabled(true);
ec.start();
ec.waitFor();
The class which handles events must implement the EventListener interface. The only method that must be implemented is:
public void event(Event e) {
    if (e instanceof StatusMonitoringEvent) {
        //do some operations
    } else if (e instanceof ProgressMonitoringEvent) {
        //do other operations
    }
}
Generally, users want to know which element/node in the workflow generated an event. A special class called FlowElement represents a subpart (execution/transfer/echo/...) of a workflow. You can get the element corresponding to an event by invoking event.getFlowElement(). In addition, FlowElement provides methods to get its children so that you can traverse the workflow.
Note: After the workflow is loaded by the system, it is converted into an internal format which is more complex and contains more elements than those you wrote. As a result, a lot of events are generated even if the workflow description file is very simple, so some filtering is needed. My solution: all elements are stored in an ElementTree. When an event is received, its target/subject is checked to see whether it is part of the ElementTree. If not, the event is ignored.
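
The filtering idea in the note can be sketched independently of CogKit. In the example below, Node is a hypothetical stand-in for a workflow element (it is not CogKit's FlowElement), and ElementFilter is an assumed helper name. The filter walks the tree once, remembers every element by identity, and later accepts only events whose source is one of those elements.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a workflow element; not CogKit's FlowElement. */
final class Node {
    final String name;
    final List<Node> children = new ArrayList<>();

    Node(String name) { this.name = name; }

    Node add(Node child) { children.add(child); return this; }
}

/** Remembers the elements of the user's tree and rejects events from anything else. */
final class ElementFilter {
    // Identity-based set: two distinct elements with equal names stay distinct.
    private final Set<Node> known = Collections.newSetFromMap(new IdentityHashMap<>());

    ElementFilter(Node root) { collect(root); }

    // Depth-first traversal that records every element reachable from the root.
    private void collect(Node node) {
        if (known.add(node)) {
            for (Node child : node.children) {
                collect(child);
            }
        }
    }

    /** True if the event source belongs to the tree the user wrote. */
    boolean accepts(Node eventSource) {
        return known.contains(eventSource);
    }
}
```

An event listener would call accepts() on the element attached to each event and drop the event when it returns false.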
(2) desktop interface
This interface serves ordinary users rather than programmers. CogKit provides both a command-line tool and a graphical user interface. Both are implemented as scripts, which first do some configuration work (mainly setting CLASSPATH) and then execute built-in .class files from the CogKit package. In other words, they are just a thin wrapper around the API.

Sunday, November 25, 2007

Refinement

Zhenhua Guo

Architecture:

Job Submission and Execution

Query:

Some issues:

Class Design

Client-side Classes:

Agent side Classes

Executor side classes

Architecture:

Job Submission and Execution

Query:

Some issues:

(1) Authentication

Username and password is the fundamental authentication method.

Should we support other types of authentication, especially authentication with users’ certificates?

The answer is no. Client-side JavaScript is generally not allowed to access the local file system. Although some systems provide extensions that support local file access (e.g. the IE ActiveX object FileSystemObject), this is not portable. As a result, to get the user’s certificate we would have to provide a text area on the web page where the user can paste it. Web pages mostly cope with printable characters, while a certificate in binary form contains many non-printable characters, so the user cannot simply copy and paste its content. In short, a lot of tricky work would be needed. At the same time, our system would become less convenient because the user must do additional work.

Now let’s rethink this issue. Why would we need to support certificate authentication? In our system, HTTPS is used as the underlying transport protocol, so we do not need to worry about eavesdropping. By imposing strength constraints on users’ passwords, we can reach the desired security level. So I think username and password authentication is enough for us.

(2) Built-in authentication vs. our own authentication

Tomcat provides built-in support for username and password authentication. However, it is sometimes not flexible enough to satisfy our requirements.

(3) Work flow support

Do we need to support workflow-level composition? For example, one workflow may depend on another workflow, or two workflows may be executed in parallel.

If the answer is yes, does the Karajan workflow engine support it?

If not, we must support it in JavaScript, and then we encounter another big issue: JavaScript does not provide multithreading support. I think we can work around this, but the performance and functionality may not satisfy us.

(4) Job manipulation

To manipulate a job (cancel, suspend, resume ...), the client needs a valid handle that refers to the job. I am not sure how users can get this handle. I think it should be returned when the user submits a job. Then, when the user wants to manipulate the job, he/she sends that handle to tell the agent which job to manipulate.
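
A handle-based design along these lines might look like the sketch below. Everything in it is an assumption for illustration: the class JobRegistry, the three states, and the choice of a random UUID string as the handle. A real agent would also forward each operation to the executor.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical agent-side registry that maps job handles to job states. */
final class JobRegistry {
    enum State { RUNNING, SUSPENDED, CANCELLED }

    private final Map<String, State> jobs = new ConcurrentHashMap<>();

    /** Called on submission; the returned handle is what the client keeps. */
    String submit() {
        String handle = UUID.randomUUID().toString();
        jobs.put(handle, State.RUNNING);
        return handle;
    }

    /** Each operation succeeds only for a known handle in a suitable state. */
    boolean suspend(String handle) {
        return jobs.replace(handle, State.RUNNING, State.SUSPENDED);
    }

    boolean resume(String handle) {
        return jobs.replace(handle, State.SUSPENDED, State.RUNNING);
    }

    boolean cancel(String handle) {
        // A job can be cancelled whether it is running or suspended.
        return jobs.replace(handle, State.RUNNING, State.CANCELLED)
            || jobs.replace(handle, State.SUSPENDED, State.CANCELLED);
    }

    /** Returns null for an unknown handle. */
    State stateOf(String handle) {
        return jobs.get(handle);
    }
}
```

Returning a boolean lets the agent tell the client when a handle is unknown or the requested transition makes no sense, for example resuming a job that was never suspended.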

(5) Status checking

Does the user have to be authenticated to do status checking? In other words, do we allow status checking issued by anonymous users?

What mechanisms are used to support status checking?

• check on demand

Every time a client sends a status-checking request, the status repository sends a similar request to the executor... This mechanism guarantees that the client always gets an accurate, real-time result. However, it may incur unnecessary workload: even if the status of a job has not changed, the whole process must still be performed.

In this case, a separate status repository does not benefit us, except that it results in a clearer logical structure.

• update on demand

Every time the status of a job changes, the executor notifies the status repository. When a user sends a request to get the status of his/her jobs, the agent server gets the result from the status repository, and the status repository does not need to send a request to the executor. In this way, unnecessary communication is eliminated.
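
The "update on demand" mechanism can be sketched as an in-memory repository that the executor writes to and the agent reads from. The class name StatusRepository, the string-typed fields and the "workflowId/element" key format are assumptions for this example; persistence and authentication are left out.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical status repository: pushed updates in, queries answered locally. */
final class StatusRepository {
    // userId -> ("workflowId/element" -> latest status)
    private final Map<String, Map<String, String>> byUser = new ConcurrentHashMap<>();

    /** Called by the executor whenever the status of an element changes. */
    void update(String userId, String workflowId, String element, String status) {
        byUser.computeIfAbsent(userId, k -> new ConcurrentHashMap<>())
              .put(workflowId + "/" + element, status);
    }

    /** Called by the agent; answers from memory without contacting the executor. */
    Map<String, String> query(String userId) {
        Map<String, String> statuses = byUser.get(userId);
        if (statuses == null) {
            return Collections.emptyMap();
        }
        // Sorted copy, so the caller gets a stable snapshot.
        return new TreeMap<>(statuses);
    }
}
```

A query costs one map lookup regardless of how many jobs are running, which is the saving the paragraph above describes.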

I am not sure whether Java CoG provides callback support so that a specified action will be performed if a certain event occurs.

Class Design

Client-side Classes:

Foundation Stones:

1) Workflow

Description:

This class is used to encapsulate functionality related to workflow manipulation. A workflow can contain more than one task.

Now, only Karajan workflow language is supported.

Methods:

addTask(Task); //add a task to workflow description

removeTask(Task) //remove a task from workflow description

setWorkflow(String workflowDescription); //set workflow description directly

toXML() //convert description into a string which can be transferred on the wire.

2) WFStatus

Description:

This class contains status of a workflow.

Methods:

getStatus();

3) Task

Description:

This class represents a detailed task. The task

Methods:

initialize(String description) //For example, initialize("<transfer source=... destination=...>");

toXML();

4) WFGraph

Description:

This class contains a list of workflows to be executed and maintains the relationships among these workflows. The concrete representation may vary (queue, stack...).

Methods:

addWorkflow(Workflow);

removeWorkflow(Workflow);

5) Agent

Description:

This class communicates with agent server.

Methods:

getURL(); setURL(); //getter and setter for URL of agent

getPort(); setPort();

Workflow manipulation:

Job manipulation:

6) Submit/cancel/suspend/resume/

Description:

This class submits workflows to remote server.

Methods:

submit(WFGraph, Agent);

submit(Workflow, Agent);

...

Monitor:

7) Monitor

Description:

This class contains the functionality needed to monitor the status of the workflows submitted by a user.

Methods:

getStatus(WFGraph, Workflow, Agent); //get status of a workflow

getStatus(WFGraph, Agent); //get status of all workflows in a workflow graph

Authentication:

8) User

Description:

This class represents a user in our system. Note: this user account does not necessarily appear in the backend grid service account list.

Methods:

getUsername();

getPassword();

toXML();

9) UPAuth (Username Password authentication)

Description:

This class handles authentication of users.

Methods:

auth(User, Agent); //send authentication request to agent server

Agent side Classes

The agent server sits between the client and the backend executor, so it communicates with both parties.

Communication with client:

Accept requests from clients. Requests include:

(1) Login

(2) Logout

(3) Register

(4) Job submission

(5) Job status query

(6) Job cancellation

(7) Job suspension

(8) Job resumption

The agent in turn sends requests to the backend executor. Requests include:

(1) Job submission

(2) Job status query

(3) Job cancellation

(4) Job suspension

(5) Job resumption

Note: I assume the agent and the executor trust each other, which is guaranteed by the underlying authentication system.

Executor side classes

The executor receives requests from the agent and sends requests to the MyProxy server or the underlying grid service provider.

Requests from agent:

(1) Job submission

(2) Job status query

(3) Job cancellation

(4) Job suspension

(5) Job resumption

Requests sent to MyProxy (Java CoG is used here):

(1) Fetch a certificate

(2) Renew a certificate?

Requests sent to grid service provider (Java CoG is used here):

(1) Job submission

(2) Job status query

(3) Job cancellation

(4) Job suspension

(5) Job resumption

Monday, November 12, 2007

Design alternatives

Zhenhua Guo

Where should the control logic be located, at service side or client side?

(1) Client side

There will be a lot of JavaScript work.

To use JavaCoG, there are two choices:

(a) Support task/taskgraph submission:

In JavaScript, mirror the related classes (Task, Specification, Taskgraph …) of JavaCoG. Those JavaScript classes just send encoded instructions/commands to the agent server.

Like this:

For a single job submission, the client needs to contact the server many times, in addition to status checking.


(b) Support Karajan workflow submission:

The client JavaScript must support the Karajan workflow language, including parsing it, generating the corresponding control logic... This functionality has already been implemented in JavaCoG, so I don't think it is reasonable to reimplement it in JavaScript.

One issue:

  • How to submit tasks to the agent?

  1. The client splits the original workflow description into separate smaller ones.

Client JavaScript maintains the relationships (parallel, sequential …) among these small pieces. When needed, it submits a certain workflow piece to the agent.

If so, submitting the whole Karajan workflow is probably better.

  2. Convert the workflow description to detailed instructions and then send these instructions in the way described above in (a).

(2) Server side

The client side composes the Karajan workflow document based on the user's requirements and then sends it to the server.


Summary:

Method (2) is preferred.

Notification:

(1) Pull

The client-side JavaScript program periodically sends messages to the server to check the status of the submitted task. The polling interval is difficult to choose.

One solution is to let users drive the pull action. We can put a button on the web page, and the user clicks it whenever he/she wants to update the status report. Obviously, this is not user-friendly.

(2) Push

In JavaScript, a client program cannot listen on a specified port as in regular socket programming. However, we can use the request-reply communication style to simulate that functionality indirectly. Every time the client sends a job status checking request, the server does not return the response immediately but holds it until some condition is satisfied (the job is completed, the job fails, or a timeout expires).
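
The server side of this held request (often called long polling) can be sketched with a lock and a condition variable. The class LongPollStatus, the version counter and the "version:status" reply format are assumptions for this example; wiring it into a servlet is left out.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical holder for one job's status that supports held (long-poll) requests. */
final class LongPollStatus {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition changed = lock.newCondition();
    private long version = 0;             // incremented on every status change
    private String status = "submitted";

    /** Called when the executor reports a new status; wakes all held requests. */
    void publish(String newStatus) {
        lock.lock();
        try {
            status = newStatus;
            version++;
            changed.signalAll();
        } finally {
            lock.unlock();
        }
    }

    /**
     * Holds the request until the version differs from the one the client last
     * saw or the timeout expires, then returns "version:status".
     */
    String waitForChange(long lastSeenVersion, long timeoutMillis) {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        lock.lock();
        try {
            while (version == lastSeenVersion && nanos > 0) {
                try {
                    nanos = changed.awaitNanos(nanos);  // returns the remaining wait time
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // keep the interrupt flag and answer now
                    break;
                }
            }
            return version + ":" + status;
        } finally {
            lock.unlock();
        }
    }
}
```

The client passes back the version it received last time, so a change that happens between two requests is returned immediately instead of being missed. On timeout the client simply issues the next request.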



Complex workflow support

I am not sure how much support Karajan provides for complex workflows. Can a workflow be part of another, larger workflow?

If so, we do not need to implement it ourselves.

If not, we need to implement simple control logic in client-side JavaScript: support for sequential execution, parallel execution...

UI consideration

Do we need to provide client side UI components to help users build workflow more quickly? Are the UI components workflow language specific?


Server side components:

  1. User management

    1. write our own management system

    2. use an existing system

How to integrate it into our system?

Where to store user-related data, including users’ identity information, submitted tasks...? In a database, or just in regular files?

  2. Job management

queue...


How to check status of submitted jobs by using CogKit?