Thursday, March 13, 2008

Some workflow-related projects

DIET

http://graal.ens-lyon.fr/DIET/

DIET (Distributed Interactive Engineering Toolbox) seems to be similar to Condor. It is based on Grid-RPC, and the project aims to develop a set of tools for building computational servers. DIET consists of a set of elements that can be used together to build applications using the Grid-RPC paradigm.

Clients submit computation requests to a scheduler, whose goal is to find a server available on the grid. Scheduling is applied to balance the work among the servers, and a list of available servers is sent back to the client. The client can then send the data and the request to one of the suggested servers to solve its problem. The project focuses on scalable middleware, with initial efforts aimed at distributing the scheduling problem across multiple agents.

So the goal of the DIET project is different from that of our project.


Taverna

Recently, I installed and tried Taverna. Then I investigated its functionality in detail.

Its manual is here: http://www.mygrid.org.uk/usermanual1.7.

Taverna was created by the myGrid project and is a tool for designing and executing workflows. “It provides a desktop authoring environment and enactment engine for scientific workflows expressed in SCUFL (Simple Conceptual Unified Flow Language).” SCUFL appears to be specific to Taverna rather than a published standard, which may be why I cannot find detailed information about it.

Services are connected with data links (which provide data flow) and control links (which coordinate services that are not connected through data flow).
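To make the difference between the two kinds of link concrete, here is a minimal sketch in Python. It is not Taverna code: the processor names and the `execution_order` helper are hypothetical, and it only assumes that both data links and control links force the upstream service to finish before the downstream one starts.

```python
from graphlib import TopologicalSorter

# Hypothetical processors and links, for illustration only.
# A data link passes an output of one service to an input of another.
data_links = [("fetch_sequence", "align"), ("align", "render_report")]
# A control link passes no data; it only says "run this one first".
control_links = [("align", "cleanup_temp_files")]


def execution_order(data_links, control_links):
    """Return one execution order that honours both kinds of link."""
    sorter = TopologicalSorter()
    for upstream, downstream in data_links + control_links:
        # The downstream service depends on the upstream service.
        sorter.add(downstream, upstream)
    return list(sorter.static_order())


order = execution_order(data_links, control_links)
print(order)
```

In this sketch `cleanup_temp_files` receives nothing from `align`, yet it is still scheduled after it, which is the role the control link plays.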

Several features:

  1. Fault Tolerance

    • Retries for every processor

If a processor fails to execute, it is retried several times. Users can specify the maximum number of retries, the delay between retries, and a backoff. The backoff is a factor that determines how much the delay increases for each retry after the first.

    • Alternative processor

Users can specify an alternate processor, or a list of processors, that performs the same task as the primary processor. The alternate is used in place of the primary processor if the latter has failed. Note: the alternate has its own definable parameters for ‘Retries’, ‘Delay’ and ‘Backoff’.
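The retry and alternate behaviour described above can be sketched roughly as follows. This is my own illustration in Python, not Taverna's implementation: the function names, the parameter names and the choice to multiply the delay by the backoff factor after each retry are assumptions based on the description in the manual.

```python
import time


def run_with_retries(task, max_retries=3, delay=1.0, backoff=2.0, sleep=time.sleep):
    """Call task(); if it raises, retry up to max_retries times.

    The wait before the first retry is `delay`; every later wait is the
    previous wait multiplied by `backoff`.
    """
    wait = delay
    for attempt in range(max_retries + 1):
        try:
            return task()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted, report the failure to the caller
            sleep(wait)
            wait *= backoff


def run_with_alternates(processors, sleep=time.sleep):
    """Try each (task, settings) pair in order, primary first.

    Each processor carries its own retry settings, as the alternates do.
    """
    if not processors:
        raise ValueError("at least one processor is required")
    last_error = None
    for task, settings in processors:
        try:
            return run_with_retries(task, sleep=sleep, **settings)
        except Exception as error:
            last_error = error  # fall through to the next alternate
    raise last_error


def always_fails():
    raise RuntimeError("primary service unavailable")


waits = []  # record the delays instead of really sleeping
result = run_with_alternates(
    [
        (always_fails, {"max_retries": 2, "delay": 0.5, "backoff": 2.0}),
        (lambda: "result from alternate", {"max_retries": 1}),
    ],
    sleep=waits.append,
)
print(result, waits)
```

Here the primary is attempted three times in total (one run plus two retries) with waits of 0.5 and 1.0 seconds before the alternate takes over.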

  2. Iteration

Taverna supports two kinds of iteration: dot and cross. Cross iteration is an all-against-all iteration, which means it iterates over all combinations of input values. Dot iteration pairs items by position: the first item of one input is matched with the first item of the other input, the second item with the second item, and so on.
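A small Python sketch shows the difference between the two strategies. The function names are mine, and the behaviour for inputs of unequal length in the dot case (stopping at the shorter list, as `zip` does) is an assumption rather than something I checked in Taverna.

```python
from itertools import product


def cross_iteration(first, second):
    """All-against-all: every item of `first` with every item of `second`."""
    return list(product(first, second))


def dot_iteration(first, second):
    """Position by position: item i of `first` with item i of `second`."""
    return list(zip(first, second))


sequences = ["seqA", "seqB"]
databases = ["db1", "db2"]

print(cross_iteration(sequences, databases))
# [('seqA', 'db1'), ('seqA', 'db2'), ('seqB', 'db1'), ('seqB', 'db2')]
print(dot_iteration(sequences, databases))
# [('seqA', 'db1'), ('seqB', 'db2')]
```

With two inputs of two items each, cross iteration invokes the processor four times and dot iteration only twice.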

  3. Services

Taverna provides some built-in services, including XML transformation, base64 encoding/decoding, and writing text files. Beanshell and RShell scripts are supported as well.

It also supports some well-known bioinformatics services, including Soaplab, Biomart and Biomoby. I am not familiar with biology-related tools, so I am not sure whether these services are based on web services.

Taverna also has a feature called the WSDL scavenger. Users specify the address of a WSDL document, and Taverna automatically fetches the document and analyzes its content to extract the operations it supports. These operations are then added to the list of available processors, so that users can easily use them when composing a workflow.
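The extraction step can be approximated with a few lines of Python. This is only a sketch of the idea, not how Taverna does it: it assumes a WSDL 1.1 document, reads the operation names from each `portType`, and uses a made-up inline document (`SAMPLE_WSDL`) instead of fetching one from a real address.

```python
import xml.etree.ElementTree as ET

# Namespace of WSDL 1.1 documents.
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Hypothetical, heavily trimmed WSDL used only for this illustration.
SAMPLE_WSDL = """<definitions name="Example"
    xmlns="http://schemas.xmlsoap.org/wsdl/">
  <portType name="SequencePortType">
    <operation name="getSequence"/>
    <operation name="runBlast"/>
  </portType>
</definitions>"""


def list_operations(wsdl_text):
    """Map each portType name to the names of the operations it declares."""
    root = ET.fromstring(wsdl_text)
    operations = {}
    for port_type in root.findall("wsdl:portType", WSDL_NS):
        names = [
            operation.get("name")
            for operation in port_type.findall("wsdl:operation", WSDL_NS)
        ]
        operations[port_type.get("name")] = names
    return operations


print(list_operations(SAMPLE_WSDL))
# {'SequencePortType': ['getSequence', 'runBlast']}
```

A real scavenger would first download the document from the address the user gives and would also need the message and binding details to actually invoke each operation; the sketch stops at listing the names.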

In addition, Taverna can scavenge an existing workflow and extract its processors.

Summary

Taverna is a tool designed specifically for bioinformatics. However, some of its features may also be useful in more generic applications. These include the WSDL scavenger, dot/cross iteration, and fault tolerance.

It differs from our project in that Taverna is not based on the Grid. Its enactment engine runs on the client machine. In our project, the enactment engine is located on the server side and uses the Java CoG Kit to manage the execution of workflows.

Social website myExperiment

This web site supports finding and sharing workflows and has special support for Scufl workflows. Users can download workflows posted on the site. For every workflow written in Scufl, there is a corresponding .svg image, which is easier to understand than the verbose XML Scufl document.

myExperiment seems to offer almost all the features that most Web 2.0 sites offer: user management, group management, workflow management, blogs, forums, tagging, rating, and commenting. It also keeps some statistics, such as the number of reviews and the number of comments.

Moteur

This project is also based on SCUFL.


Karajan Workflow:

  1. Is there a way to invoke some operations described in a WSDL document?

