
Wednesday, January 06, 2010

Tomcat SSL configuration: importing a certificate chain into the keystore (error "keytool error: java.lang.Exception: Input not an X.509 certificate")

I had seldom used mutual authentication in the context of SSL. Recently, in our derived project (integration with MyOSG), we needed to enforce mutual SSL authentication.

Enable client authentication in Tomcat server

First, I enabled it in the Tomcat server configuration file.

An important option is "clientAuth".
Note: before this step, I had already generated and imported a certificate for the Tomcat server.
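For reference, here is a minimal sketch of what the relevant connector in server.xml looks like (this follows Tomcat 6's JSSE connector syntax; the port, file paths, and passwords are placeholders):

<Connector port="8443" protocol="HTTP/1.1" SSLEnabled="true"
           maxThreads="150" scheme="https" secure="true"
           clientAuth="true" sslProtocol="TLS"
           keystoreFile="conf/keystore" keystorePass="changeit"
           truststoreFile="conf/keystore" truststorePass="changeit"/>

clientAuth="true" makes Tomcat demand a certificate from every client; the truststore is what Tomcat uses to verify those client certificates, which is exactly the keystore that needs the CA chain imported below.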

Import user certificate

This is done by the end users who wish to access the protected services. They need to:

  1. Import the certificate received from the CA into their browser.
    Actually, both the private key and the certificate need to be imported.
    Firefox only supports the PKCS#12 format for this (see the openssl command after this list). If the private key has already been imported, you just need to import the certificate, whose format can be PEM, binary (DER), etc.
  2. Import the server's certificate into the browser's trusted CA repository.
    The aim is to make the browser trust the certificate received from the remote service. This is useful when the service certificate is not issued by a well-known top-level CA.
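If you received the private key and certificate as separate PEM files, a typical conversion to PKCS#12 looks like this (file names are placeholders):

openssl pkcs12 -export -in usercert.pem -inkey userkey.pem -out usercert.p12

The resulting usercert.p12 file can then be imported into Firefox.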

After those two steps, I directed my browser to the service URL. Unfortunately, I got the following error:

[image: browser error page]

After digging a little, I found that the cause was that the Tomcat server did not trust the certificate sent by my browser. So the solution is simple: add the certificate chain of my certificate to the Tomcat keystore.

I got the certificate chain from the issuer of my certificate. It's in PKCS#7 format and contains two certificates. You can view a PKCS#7-formatted file using the following commands:
1) openssl pkcs7 -print_certs -text < pkcs7_cert_chain.pem
2) keytool -printcert -file pkcs7_cert_chain.pem

When I tried to import it into the keystore using the following command
    keytool -importcert -file pkcs7_cert_chain.pem -keystore keystore -alias test-cert -trustcacerts
I got the following error
    keytool error: java.lang.Exception: Input not an X.509 certificate
I am sure keytool can recognize the file, because the following command prints out the information in the file correctly:
    keytool -printcert -file pkcs7_cert_chain.pem

Solution

After reading the keytool manual carefully, I found the following statements:

Importing a New Trusted Certificate

    Before adding the certificate to the keystore, keytool tries to verify it by attempting to construct
    a chain of trust from that certificate to a self-signed certificate (belonging to a root CA), using
    trusted certificates that are already available in the keystore.

Importing a Certificate Reply

    ……

      o If the reply is a PKCS#7 formatted certificate chain, the chain is first ordered (with the user
        certificate first and the self-signed root CA certificate last), before keytool attempts to
        match the root CA certificate provided in the reply with any of the trusted certificates in the
        keystore or the "cacerts" keystore file (if the -trustcacerts option was specified). If no match
        can be found, the information of the root CA certificate is printed out, and the user is
        prompted to verify it, e.g., by comparing the displayed certificate fingerprints with the fin-
        gerprints obtained from some other (trusted) source of information, which might be the root CA
        itself. The user then has the option of aborting the import operation. If the -noprompt option
        is given, however, there will be no interaction with the user.

So what I was doing was importing a trusted certificate chain, which is not allowed directly: when importing a trusted certificate, keytool only accepts a file that contains a single certificate.

So I extracted the two certificates into two files and fed them to keytool one by one. Details:
Use the command
    openssl pkcs7 -print_certs < pkcs7_cert_chain.pem
to display the two certificates in the original PKCS#7 file, and then copy and paste each certificate into an individual file.
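Assuming the two extracted files are named ca-root.pem and ca-intermediate.pem (hypothetical names), the imports look like this; import the root certificate first so that the intermediate certificate can be verified against it:

keytool -importcert -trustcacerts -keystore keystore -alias root-ca -file ca-root.pem
keytool -importcert -trustcacerts -keystore keystore -alias intermediate-ca -file ca-intermediate.pem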

Note: when you are importing a certificate reply from a CA, the certificate chain can be imported into the keystore directly. However, before doing that, you must make sure that the corresponding private key has already been imported into the same keystore.

Thursday, September 24, 2009

Trial of Apache Sling

Introduction

Sling is based on Felix, which is an implementation of the OSGi framework. It includes the Felix web console bundle (http://felix.apache.org/site/apache-felix-web-console.html) to make it easy to inspect the OSGi framework. Sling also integrates Jackrabbit and wraps it as an OSGi bundle. To update the Java Content Repository, a client app can send regular HTTP POST requests; in other words, Sling exposes the Java Content Repository in a RESTful way.
It can be run either standalone or in a servlet container (e.g. Apache Tomcat).

Resources

Sling: http://sling.apache.org/site/index.html
Felix usage documentation (useful for Sling too): http://felix.apache.org/site/apache-felix-framework-usage-documentation.html
Install and upgrade bundles in Felix web console: http://sling.apache.org/site/installing-and-upgrading-bundles.html
Address of Felix Management Web Console: http://ip_address:port/system/console/


When I clicked the "Configuration" link at the top, a NullPointerException was thrown.
It turns out that this is a problem in the Felix web console. See
http://issues.apache.org/jira/browse/FELIX-1135
http://issues.apache.org/jira/browse/FELIX-1028

This post contains some useful information http://groups.google.com/group/sakai-kernel/web/building-3akai-sling.

Solution

Download sling (standalone distribution) and unzip the tarball:
http://mirror.cc.columbia.edu/pub/software/apache/sling/org.apache.sling.launchpad.app-5-incubator-bin.tar.gz
Download new version of felix webconsole:
https://issues.apache.org/jira/secure/attachment/12407768/org.apache.felix.webconsole-1.2.9-SNAPSHOT.jar
Assume the new version of the web console is downloaded to directory <DIR>.

Execute the following commands in the directory where the Sling tarball was unzipped.

jar xf org.apache.sling.launchpad.app-5-incubator.jar 
cd resources/bundles/5/ 
rm org.apache.felix.webconsole-1.2.8.jar 
cp <DIR>/org.apache.felix.webconsole-1.2.9-SNAPSHOT.jar ./
jar cMf org.apache.sling.launchpad.app-5-incubator.jar resources META-INF org

Run the server:

java -jar org.apache.sling.launchpad.app-5-incubator.jar -p 4040

Note: you CANNOT use an arbitrary new version of the Felix web console without updating other components of the Felix package, because different versions of the web console require different versions of the Felix core/OSGi framework.

Misc.

If you are using both JCR Installer and Felix web console to update a bundle, you will get into trouble.
JCR installer:
    http://sling.apache.org/site/jcr-installer-jcrjcrinstall-and-osgiinstaller.html
JCR Installer tries to install OSGi bundles/modules found in the Java Content Repository.
Bug
   “JCR Install prevents update of bundle through other channels like the web console”
    http://issues.apache.org/jira/browse/SLING-1106

Wednesday, September 23, 2009

Java Content Repository - Jackrabbit

Resources

Wiki: http://en.wikipedia.org/wiki/Content_repository_API_for_Java
JSR 170: http://jcp.org/en/jsr/detail?id=170
JSR 170 API: http://www.day.com/maven/jsr170/javadocs/jcr-1.0/index.html
JSR 283 (JCR 2.0): http://jcp.org/en/jsr/detail?id=283 (in progress)
Jackrabbit: http://jackrabbit.apache.org/
List of Jackrabbit components: http://jackrabbit.apache.org/jackrabbit-components.html

Introduction

Jackrabbit is the reference implementation of JSR 170; support for JSR 283 is in progress.

Installation problem

When I tried to install Jackrabbit on a gridfarm machine that uses NFS v3, I ran into the problem described in this bug report (http://issues.apache.org/jira/browse/JCR-1605). Basically, Jackrabbit needs some file system features that are not implemented in NFS versions prior to v4.

Monday, August 10, 2009

MySQL Connection Timeout Error in Hibernate

Problem

I am using Hibernate and MySQL in our project. After the application is started, it works fine. However, if the application is not used for some time (like one day), it throws a connection exception when it is used again. Trace of the exception:
com.mysql.jdbc.CommunicationsException: 
The last packet successfully received from the server was seconds ago.
The last packet sent successfully to the server was 407270 seconds ago, which  is longer than the server configured value of 'wait_timeout'. 
You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem.

I googled this problem, and it seems that many other people have hit it too. I found some really nice blog posts about how to solve it.

Following the error message, I set the autoReconnect option (http://dev.mysql.com/doc/refman/5.0/en/connector-j-reference-configuration-properties.html) to true, but it still did not work. From what I found on Google, this option works on some platforms but not on others.
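For reference, the option is appended to the JDBC connection URL (host and database name here are placeholders):

jdbc:mysql://localhost:3306/mydb?autoReconnect=true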

Solution

Basically, the solution is to 1) use a connection pool manager and 2) tweak the parameters of the connection pool manager.

C3P0 is a free JDBC connection pool manager.
Some C3P0 configuration parameters are mapped to Hibernate configuration settings; the mapped parameters are listed at http://www.mchange.com/projects/c3p0/index.html#hibernate-specific.

"Please note that configuration parameters that ARE mapped into hibernate's config file MUST be set within hibernate's configuration, or else they will be overridden by hibernate-specified defaults.
[The hibernate-mapped c3p0 parameters are minPoolSize, maxPoolSize, maxIdleTime, maxStatements, acquireIncrement, testConnectionOnCheckout, and idleConnectionTestPeriod. These map to the fllowing hibernate parameters: hibernate.c3p0.min_size, hibernate.c3p0.max_size, hibernate.c3p0.timeout, hibernate.c3p0.max_statements, hibernate.c3p0.acquire_increment, hibernate.c3p0.validate, and hibernate.c3p0.idle_test_period. DataSources configured in Tomcat should always use c3p0-native parameter names. But pools constructed under the covers by Hibernate must be configured in hibernate's config file with hibernate-defined parameter names where applicable.] "

C3P0 parameters that are NOT mapped should be specified in the file WEB-INF/classes/c3p0.properties.
Note: the C3P0 parameter c3p0.testConnectionsOnCheckout is mapped to hibernate.c3p0.validate only in Hibernate 2.x!
Also, c3p0.testConnectionsOnCheckout is really expensive. Users can use c3p0.maxIdleTime (mapped to hibernate.c3p0.timeout in Hibernate) instead.

My hibernate.cfg.xml includes following configuration:
        <property name="connection.provider_class">org.hibernate.connection.C3P0ConnectionProvider</property>
        <property name="c3p0.min_size">5</property>
        <property name="c3p0.max_size">30</property>
        <property name="c3p0.timeout">600</property>
        <property name="c3p0.max_statements">0</property>
        <property name="c3p0.acquire_increment">5</property>
        <property name="c3p0.idle_test_period">60</property>

The idle test operation can be optimized by specifying the option preferredTestQuery.
I am using MySQL as the underlying RDB, and putting c3p0.preferredTestQuery = SELECT 1 into c3p0.properties works fine (this tip is taken from a comment at http://www.databasesandlife.com/automatic-reconnect-from-hibernate-to-mysql/#comment-3156).
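So in my setup the relevant part of WEB-INF/classes/c3p0.properties boils down to a single line (a minimal sketch; any other unmapped C3P0 parameters would go in the same file):

c3p0.preferredTestQuery=SELECT 1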

Resources

A really good article about how to reconnect from Hibernate to MySQL: http://www.databasesandlife.com/automatic-reconnect-from-hibernate-to-mysql/
C3P0 (Connection manager): http://www.mchange.com/projects/c3p0/index.html
How to configure C3P0 in Hibernate: https://www.hibernate.org/214.html
A forum thread about C3P0 configuration in Hibernate: https://forum.hibernate.org/viewtopic.php?t=934779&highlight=c3p0

Saturday, November 08, 2008

Maven cheat sheet

Download and install maven: http://maven.apache.org/download.html.

Running Maven
http://maven.apache.org/guides/getting-started/maven-in-five-minutes.html
http://maven.apache.org/guides/getting-started/index.html
By default, the local repository is located at USER_HOME/.m2/repository.

Configuration
http://maven.apache.org/guides/mini/guide-configuring-maven.html
Three levels: project (pom.xml), installation (settings.xml in the Maven installation's conf directory), and user (USER_HOME/.m2/settings.xml).

Build your own private/internal repository:
This article introduces how to create a repository using Artifactory: http://www.theserverside.com/tt/articles/article.tss?l=SettingUpMavenRepository. In addition, the author compares some mainstream Maven remote repository managers, including the standard maven-proxy, Dead Simple Maven Proxy, Proximity, and Artifactory.
In my case, I also use Artifactory, deployed to Tomcat. It has a nice web-based interface. Artifactory uses a database (Derby, I think) to store the repository data, so a user cannot inspect the repository content by looking directly at the directory.

Deploy your artifacts to remote repository by using maven-deploy plugin:
http://maven.apache.org/plugins/maven-deploy-plugin/usage.html
(1) If the artifacts are built using Maven, use the deploy:deploy Mojo.
In your pom.xml, insert a <distributionManagement/> element to tell Maven how to deploy the current package. If your repository is secured, you may also want to define corresponding <server/> entries in your settings.xml file to provide authentication information (see the sketch below).
Command: mvn deploy.
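A minimal sketch of the element (the id and url are placeholders; the id must match a <server> entry in settings.xml when authentication is required):

<distributionManagement>
  <repository>
    <id>internal-repo</id>
    <url>scp://repo.example.org/srv/maven2</url>
  </repository>
</distributionManagement>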
(2) If the artifacts are NOT built using Maven, use the deploy:deploy-file Mojo.
Sample command:
mvn deploy:deploy-file -Dpackaging=jar -Durl=file:/grids/c2/www/htdocs/maven2 \
-Dfile=./junit.jar -DgroupId=gridshib -DartifactId=junit -Dversion=GTLAB

FAQ:
(1) What does the Maven standard directory layout look like?
http://maven.apache.org/guides/introduction/introduction-to-the-standard-directory-layout.html
(2) How do I specify a parent artifact in pom.xml?
Read http://maven.apache.org/guides/introduction/introduction-to-the-pom.html.
(3) If a dependent package cannot be downloaded from the central Maven repository, three methods can be used to deal with it:

"
  1. Install the dependency locally using the install plugin. The method is the simplest recommended method. For example:
    mvn install:install-file -Dfile=non-maven-proj.jar -DgroupId=some.group -DartifactId=non-maven-proj -Dversion=1

    Notice that an address is still required, only this time you use the command line and the install plugin will create a POM for you with the given address.

  2. Create your own repository and deploy it there. This is a favorite method for companies with an intranet and need to be able to keep everyone in synch. There is a Maven goal called deploy:deploy-file which is similar to the install:install-file goal (read the plugin's goal page for more information).
  3. Set the dependency scope to system and define a systemPath. This is not recommended, however, but leads us to explaining the following elements:
"
(4) How do I add a new repository?
Put the following snippet into pom.xml or settings.xml.
<repository>
  <id>your-new-repository-id</id>
  <name>New Maven Repository </name>
  <layout>default</layout>
  <url>Address of the new repository</url>
  <snapshots>
    <enabled>enable-it?</enabled>
  </snapshots>
  <releases>
    <enabled>enable-it?</enabled>
  </releases>
</repository>
(5) How do I disable the default central Maven repository?
Put the following snippet into your pom.xml.
<repository>
  <id>central</id>
  <name>Maven Repository Switchboard</name>
  <layout>default</layout>
  <url>http://repo1.maven.org/maven2</url>
  <snapshots>
    <enabled>false</enabled>
  </snapshots>
  <releases>
    <enabled>false</enabled>
  </releases>
</repository>
(6) How can I package the project without running tests?
Pass the parameter -Dmaven.test.skip=true on the command line.
Note this property is defined by the Maven Surefire plugin.
(7) Why does "mvn clean" delete my source code?
In your pom.xml, if the content of the <directory> element nested in <build> is "./", "mvn clean" will delete everything in the current directory, including the src directory.
Two more elements can be used to specify the locations of compiled classes:
outputDirectory: the directory where compiled application classes are placed.
testOutputDirectory: the directory where compiled test classes are placed.
(8) How do I add resources to the built package?
http://maven.apache.org/guides/getting-started/index.html#How_do_I_add_resources_to_my_JAR.
http://maven.apache.org/guides/getting-started/index.html#How_do_I_filter_resource_files

Monday, August 25, 2008

Insert pubchem data into HBase

HBase shell
HBase provides a shell utility that lets users execute simple commands. The shell can be started using:
${HBASE_HOME}/bin/hbase shell
Then type the command help to get a help document that describes the usage of the supported commands. These commands can be used to manipulate data stored in HBase. E.g., the command list lists all tables in HBase, get retrieves row or cell contents, and put stores data into a cell.

Data insertion
The data source is ftp://ftp.ncbi.nlm.nih.gov/pubchem/. I modified the Python scripts and C source code given by Rajarshi.
Data retrieval and processing steps:

  1. Download all information about compounds from ftp://ftp.ncbi.nlm.nih.gov/pubchem/Compound/CURRENT-Full/SDF. I finally got 123 GB.
  2. Decompress those files.
  3. Extract information from the .sdf files and write it to .txt files. The openbabel library is used to compile the C++ code.
  4. Combine the .txt files generated in step 3 into one big .dat file.
  5. Write a Ruby script to insert all data in the .dat file into HBase.
    The command is like this: ${HBASE_HOME}/bin/hbase org.jruby.Main rubyscript.rb

Why did I write a Ruby script instead of a Java program in step 5?
HBase is written in Java and thus provides a Java API. However, compiling Java programs is kind of cumbersome -- setting a lengthy CLASSPATH, and so on.
So I chose to write scripts that can be executed directly by the HBase shell. I found useful information on this page; there is a section called "scripting" in that post, but the information there is far from complete -- it does not tell readers how to write the scripts. At first, I wrote a script that contained some shell commands, one command per line, and fed it to the HBase shell. Unfortunately, it didn't work. After numerous trials, I found that Ruby scripts can be fed to the shell. Ruby scripts cannot use the existing shell commands directly; the Ruby binding of the original Java API must be used.

I had not learned Ruby before, so I had to teach myself the basics. Ruby is rather flexible syntactically and supports many shorthands to improve productivity. Anyway, "Ruby is easy to learn, but hard to master". By the way, Ruby documentation seems less abundant than that of Python, Perl...

How to write Ruby scripts for HBase?
This site http://wiki.apache.org/hadoop/Hbase/JRuby contains related information, but I could not run the sample script successfully because of errors in the script! I wonder whether the author tested the code before releasing it; some of the errors are quite obvious.
After the Ruby script is completed, it can be executed using:
${HBASE_HOME}/bin/hbase org.jruby.Main rubyscript.rb 
Java API:
http://hadoop.apache.org/hbase/docs/current/api/index.html

My ruby script:

#!/usr/bin/ruby -w

include Java
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.HColumnDescriptor
import org.apache.hadoop.hbase.HTableDescriptor
import org.apache.hadoop.hbase.client.HBaseAdmin
import org.apache.hadoop.hbase.client.HTable
import org.apache.hadoop.hbase.io.BatchUpdate
import org.apache.hadoop.io.Text

pubchem_compound_fields = [
    'cid',
    'iupac_openeye_name',
    'iupac_cas_name',
    'iupac_name',
    'iupac_systematic_name',
    'iupac_traditional_name',
    'nist_inchi',
    'cactvs_xlogp',
    'cactvs_exact_mass',
    'openeye_mw',
    'openeye_can_smiles',
    'openeye_iso_smiles',
    'cactvs_tpsa',
    'total_charge',
    'heavy_atom_count']

compound_table_name = 'compound'

numfields = pubchem_compound_fields.length

path = "/home/zhguo/BigTable/BigTable-Pubchem/data/"
filename = "#{path}compound.dat"
file = File.new(filename, 'r')
counter = 0

conf = HBaseConfiguration.new
tablename = compound_table_name
tablename_text = Text.new(tablename)
desc = HTableDescriptor.new(tablename)
coltextarr = Array.new
pubchem_compound_fields.each_with_index do |v, i|
    if (i == 0) then next; end
    desc.addFamily(HColumnDescriptor.new("#{v}:"))
    coltextarr << Text.new("#{v}:")
end

admin = HBaseAdmin.new(conf)
if !admin.tableExists(tablename_text) then
    admin.createTable(desc)
=begin
    puts "deleting table #{tablename_text}"
    admin.disableTable(tablename_text)
    admin.deleteTable(tablename_text)
    puts "deleted table #{tablename_text} successfully"
=end
end

#admin.createTable(desc)
table = HTable.new(conf, tablename_text)

startind = 1641500 #from which line should we start.This
                   #is useful when you don't want to start
                   #from the beginning of the data file.

nlines = `cat #{filename} | wc -l`

logfilename = 'updatedb.log'
logfile = File.new(logfilename, "a")
while (line = file.gets) #&& (counter < 20)
    counter += 1
    if (counter < startind) then
        next
    end
    msg = "processing line #{counter}/#{nlines}"
    logfile.puts msg
    if counter%100 == 0 then
        print  msg
        STDOUT.flush
        logfile.flush
    end

    arr = line.split("\t")
    len = arr.length
    if (numfields != len) then
        next
    end
    rowindex = 0
    rowname = arr[rowindex]
    arr.delete_at(rowindex)
    row = Text.new(rowname)
    b = BatchUpdate.new(row)

    arr.each_with_index do |v, i|
        str = java.lang.String.new(v)
        b.put(coltextarr[i], str.getBytes("UTF-8"))
    end
    table.commit(b)
end

Sunday, August 24, 2008

Installation and configuration of Hadoop and Hbase

Hadoop

Installation
Hadoop installation instructions: http://hadoop.apache.org/core/docs/current/quickstart.html and http://hadoop.apache.org/core/docs/current/cluster_setup.html.
To set up a Hadoop cluster, generally two configuration files need to be modified: hadoop-site.xml and slaves.
(1) My hadoop-site.xml looks like:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>pg3:9000</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>pg3:9001</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
Read file hadoop-default.xml for all available options.
(2) My slaves file looks like:
localhost
pg1
pg2

I need to install Hadoop on three machines, and I use rsync to keep the configuration on these machines synchronized with each other.
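For example (assuming Hadoop is installed at the same path on every node):

rsync -av ${HADOOP_HOME}/conf/ pg1:${HADOOP_HOME}/conf/
rsync -av ${HADOOP_HOME}/conf/ pg2:${HADOOP_HOME}/conf/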

Commands
(*) Format a new file system: hadoop namenode -format
(*) Start/stop Hadoop
start-dfs.sh/stop-dfs.sh
    starts/stops the distributed file system (HDFS)
start-mapred.sh/stop-mapred.sh
    starts/stops the MapReduce service
start-all.sh/stop-all.sh
    starts/stops both HDFS and the MapReduce service

Hadoop reads the slaves file to find all the nodes and then starts the services on all of them.

Check status of the services
HDFS: http://domain:50070/
MapReduce: http://domain:50030/

HBase

Installation instructions: http://hadoop.apache.org/hbase/docs/current/api/overview-summary.html#overview_description
The configuration file is hbase-site.xml. My hbase-site.xml looks like
<configuration>
  <property>
    <name>hbase.master</name>
    <value>pg3:60000</value>
    <description>The host and port that the HBase master runs at.</description>
  </property>

  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://pg3.ucs.indiana.edu:9000/hbase</value>
    <description>The directory shared by region servers.</description>
  </property>
</configuration>

Commands
start-hbase.sh    starts the HBase service
stop-hbase.sh     stops the HBase service

Note: HBase builds its functionality on top of Hadoop, so sometimes HBase needs to know Hadoop's configuration. The following statements, excerpted from the HBase documentation, are important:

"Of note, if you have made HDFS client configuration on your hadoop cluster, hbase will not see this configuration unless you do one of the following:
  • Add a pointer to your HADOOP_CONF_DIR to CLASSPATH in hbase-env.sh
  • Add a copy of hadoop-site.xml to ${HBASE_HOME}/conf, or
  • If only a small set of HDFS client configurations, add them to hbase-site.xml
An example of such an HDFS client configuration is dfs.replication. If for example, you want to run with a replication factor of 5, hbase will create files with the default of 3 unless you do the above to make the configuration available to hbase. "
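For the dfs.replication example, option 3 would mean adding this property to hbase-site.xml:

  <property>
    <name>dfs.replication</name>
    <value>5</value>
  </property>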

Thursday, March 27, 2008

Workaround of bug in integration of db4o and axis2

I have been trying to solve the problem described in my last post.

I tried different versions of Tomcat and db4o, which did not help. I posted this problem in the db4o forum to ask for help, but no one supplied the right solution. I tried countless possible fixes and finally got it to work. The following is how I discovered my solution.

To speed up development, I looked for tool support. I have been using Eclipse as my IDE, so naturally WTP (Web Tools Platform) is a great plug-in for developing web applications in Eclipse. WTP provides support for Tomcat and Axis2, both of which are used in my project. Here (http://www.eclipsecon.com/webtools/community/tutorials/BottomUpAxis2WebService/bu_tutorial.html) is a great tutorial about how to deploy and start an Axis2 web service in Eclipse.

Then I suspected that the problem might come from Axis2, so I wrote a simple servlet to do a similar job, with Axis2 out of the picture. The test showed that everything worked correctly with db4o, which made it highly likely that Axis2 was the cause of the problem.

I investigated further to uncover what was wrong under the hood. I built the same Axis2 service in Eclipse and deployed it to Eclipse's temporary publish directory. Surprisingly, it worked! However, if I built the web service using ADB (Apache Databinding) in Axis2, wrapped it into a .aar archive file and deployed it into the services directory, it did not work! So the way Eclipse deploys an Axis2 web service MUST be different from what I had been doing. Finally, I found that Eclipse does not wrap the web service implementation into a .aar archive file. Instead, the deployment directory layout is:

axis2
    - WEB-INF
        - lib
        - conf
        - services
            - Sample                        //Sample is the name of this web service
                - META-INF
                    - services.xml       //this file describes the information of this web service.
        - classes
            - package                     //this is path corresponding to package
                - *.class                    //These .class files are implementation of web service.
    - META-INF
    - axis2-web

The original layout is:
axis2
    - WEB-INF
        - lib
        - conf
        - services
            - Sample.aar               //the .aar archive file
        -classes
    - META-INF
    - axis2-web

The difference lies in the services directory layout: exploded classes plus a per-service META-INF/services.xml instead of a single .aar archive.
The services.xml is also different. The new services.xml is:

<service name="Sample" >
    <description>
        Please Type your service description here
    </description>
    <messageReceivers>
        <messageReceiver mep="http://www.w3.org/2004/08/wsdl/in-only" class="org.apache.axis2.rpc.receivers.RPCInOnlyMessageReceiver" />
        <messageReceiver  mep="http://www.w3.org/2004/08/wsdl/in-out"  class="org.apache.axis2.rpc.receivers.RPCMessageReceiver"/>
    </messageReceivers>
    <parameter name="ServiceClass" locked="false">package.Sample</parameter>
</service>

Besides, in my previous use of Axis2, I used the tool provided by Axis2 to automatically generate a stub class and some other auxiliary client-side classes, and then added my implementation code to that stub class (which contains Axis2-specific stuff). In the new deployment, I did not rely on any Axis2-specific functionality: I just wrote my implementation, compiled it into .class files, and copied those .class files to the axis2/WEB-INF/classes/<package path>/ directory under Tomcat. In addition, I needed to create and edit the corresponding services.xml file manually; in my previous deployment, this file was generated automatically by the Axis2 tool.
To sum up, this new deployment method eases development of the web service because no Axis2-specific stuff is involved during development. The drawback is that it is not compact, since the files are scattered in different places. I did not find much useful information on the web about this kind of deployment. Maybe this method is not recommended, who knows...

However, I have no choice because only one of them works well.

Monday, March 24, 2008

DB4O corrupts in Tomcat with Axis2

I integrated db4o into our project to persist the state information of workflows. After I integrated it, it seemed to work OK: state information could be stored into the object database and retrieved successfully. However, after I restarted Tomcat, db4o broke during retrieval of data from the database. I tried the different kinds of query language db4o supports; unfortunately, none of them worked.

For Native Query, the error in Tomcat log is:

java.lang.IllegalArgumentException: argument type mismatch
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at com.db4o.query.Predicate.appliesTo(Unknown Source)
    at com.db4o.inside.query.PredicateEvaluation.evaluate(Unknown Source)
    at com.db4o.Platform4.evaluationEvaluate(Unknown Source)
    at com.db4o.QConEvaluation.visit(Unknown Source)
    at com.db4o.Tree.traverse(Unknown Source)
    at com.db4o.QCandidates.filter(Unknown Source)
    at com.db4o.QConEvaluation.evaluateEvaluationsExec(Unknown Source)
    at com.db4o.QCon.evaluateEvaluations(Unknown Source)
    at com.db4o.QCandidates.evaluate(Unknown Source)
    at com.db4o.QCandidates.execute(Unknown Source)
    at com.db4o.QQueryBase.executeLocal(Unknown Source)
    at com.db4o.QQueryBase.execute1(Unknown Source)
    at com.db4o.QQueryBase.getQueryResult(Unknown Source)
    at com.db4o.QQueryBase.execute(Unknown Source)
    at com.db4o.inside.query.NativeQueryHandler.execute(Unknown Source)
    at com.db4o.YapStreamBase.query(Unknown Source)
    at com.db4o.YapStreamBase.query(Unknown Source)
    at org.cogkit.cyberaide.axis2ws.StatusDB.getStatusByUID(StatusServiceInterfaceSkeleton.java:1204)
    at org.cogkit.cyberaide.axis2ws.StatusServiceInterfaceSkeleton.getJSONStatusByUID(StatusServiceInterfaceSkeleton.java:184)
    at org.cogkit.cyberaide.axis2ws.StatusServiceInterfaceMessageReceiverInOut.invokeBusinessLogic(StatusServiceInterfaceMessageRece
iverInOut.java:80)
    at org.apache.axis2.receivers.AbstractInOutSyncMessageReceiver.invokeBusinessLogic(AbstractInOutSyncMessageReceiver.java:42)
    at org.apache.axis2.receivers.AbstractMessageReceiver.receive(AbstractMessageReceiver.java:96)
    at org.apache.axis2.engine.AxisEngine.receive(AxisEngine.java:145)
    at org.apache.axis2.transport.http.HTTPTransportUtils.processHTTPPostRequest(HTTPTransportUtils.java:275)
    at org.apache.axis2.transport.http.AxisServlet.doPost(AxisServlet.java:120)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:710)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:803)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:263)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:844)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:584)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
    at java.lang.Thread.run(Thread.java:595)

It seems that the error results from an incompatible type conversion. I am sure I followed the instructions elaborated in the official documentation.

Then I tried SODA. The output was even weirder: the program did not behave as I expected, and state information could not be retrieved successfully. However, when I examined the Tomcat log, there was no error report at all! So I had no way to tell what happened under the hood, which was pretty annoying. It seemed that db4o did retrieve something, but the fields of the retrieved object were invalid.

I read almost all the posts in the db4o forum/community and tried all the suggested solutions, but it still doesn't work. I tried different versions of db4o; none of them solved my problem.

It has taken me a lot of time, and I am still not sure whether I will finally solve it...

Monday, March 17, 2008

State Information Persistence (Status server)

Previously, the state information of all workflows in the status server was kept in memory. As you know, this strategy is not practical when there is too much information to fit into memory; in other words, data persistence is necessary. After careful investigation, I found two solutions.
(1) Cache
    This strategy stores some data on hard disk while keeping recently used data in memory to improve performance. It is a data-centric mechanism that exploits locality.
    Ehcache (http://ehcache.sourceforge.net/) is a project based on this strategy.
    To put data into the cache, you MUST construct an Element object that contains a key and a value. Since Ehcache 1.2, the key and value can be any serializable objects, which improves flexibility. The only way to query data is to specify a key object; obviously, much work must be done to compose complex queries. Ehcache provides ways for users to control every aspect of cache behavior.
(2) Database
    This strategy uses a database to store information; the corresponding term is "object database". The database provides interfaces through which common database operations (insert/query/update/delete) can be invoked. However, this kind of database is different from a traditional relational database: the unit of manipulation in an object database is an object. In other words, you insert/query/delete objects instead of tuples. In addition to retrieving field values of a specific object, member functions of the object can also be invoked.
Db4o (http://www.db4o.com/) is an object database. Here is a simple introduction written by me: http://zhenhua-guo.blogspot.com/2008/03/db4o-introduction.html.

    Actually, these two strategies are not competing. An object database implementation may use a caching mechanism internally, so it is not surprising that some well-known projects (Hibernate, Spring, ...) use Ehcache. For higher-level programmers, I think an object database is easier to work with.

    In the end, I decided to apply db4o to our project. One important reason is that this strategy lets programmers write code at a high level: I don't need to care about details of the database infrastructure. As a result, the modification to my original code was small and could be done quickly. Moreover, maintenance of the code is easier: we store both data and related operations in the database instead of in scattered places. What's more, query support is more powerful in db4o than in Ehcache: it supports three query languages (Query by Example, Native Queries, and SODA), and complex queries can be composed easily.

One lesson: when dealing with data retrieved from an object database, users MUST be careful about type conversions so that incompatible conversions don't occur. Because this kind of error doesn't appear until run time, debugging becomes more difficult, especially for web applications. I found the cause of the error by carefully reading the lengthy Tomcat log.

Result:
   The state information of workflows is stored on hard disk and cached in memory using db4o.

DB4O introduction

Db4o is a high-performance object database for Java and .NET.

Open a database

    ObjectContainer db = Db4o.openFile(filename);

Insert

Objects are inserted using the set() method.

    ClassName obj = new ClassName(parameters);
    db.set(obj);

Retrieve

(1) Query by Example (QBE)

Create a prototypical object for db4o to use as an example of what you wish to retrieve. Db4o will return all of the objects that match all non-default field values. The results are returned as an ObjectSet instance.

    ClassName obj = new ClassName(values…); // prototypical object
    ObjectSet result = db.get(obj);
    listResult(result);

Db4o supplies a shortcut to retrieve all instances of a class:

    ObjectSet result = db.get(ClassName.class);

The following code can be used to iterate over the results:

    while (result.hasNext()) {
        System.out.println(result.next());
    }

(2) Native Query (NQ) --- the main db4o querying interface

Native Queries provide the ability to run one or more lines of code against all instances of a class. A native query expression returns true to mark a specific instance as part of the result set.

    List<ClassName> objs = db.query(new Predicate<ClassName>() {
        public boolean match(ClassName obj) { // must return the primitive boolean to override Predicate#match
            return obj.getProperty() == value;
        }
    });

Users must be very careful with side effects --- especially those that might affect persistent objects.

(3) SODA Query API

Update

Updating objects is as easy as storing them. You use the same set() method to update objects: just call set() again after modifying any object.

    ObjectSet result = db.get(new ClassName(parameters));
    ClassName found = (ClassName) result.next();
    found.methodName(parameters);
    db.set(found);

Note: we query for the object first. If the object is not 'known' (having been previously stored or retrieved during the current session), db4o will insert a new object instead of updating the existing one. In that case, db4o thinks that you want to insert a new object that has the same field values.

Delete

Objects are removed using the delete() method.

    ObjectSet result = db.get(new ClassName(…));
    ClassName found = (ClassName) result.next();
    db.delete(found);

If you want to tune db4o to get higher performance, you need to change the default configuration.
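For example, a common tweak is to index a frequently queried field. This is a sketch using the classic db4o configuration API (method names should be checked against your db4o version):

// must be called before the database file is opened
Db4o.configure().objectClass(ClassName.class).objectField("fieldName").indexed(true);
ObjectContainer db = Db4o.openFile(filename);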

Friday, February 15, 2008

RESTful web services in Java

Recently, I read some articles about RESTful web services and have been looking for Java libraries with good support for REST. REST is supported by Axis2, but the support is very limited. First, the document about REST support on the official web site (http://ws.apache.org/axis2/1_3/rest-ws.html) is horrible: the content is so brief that I had more questions and confusions than what the document addressed. Support for REST in Axis2 relies on a new feature in WSDL 2.0 that enables HTTP binding. Here is a good article about it: http://www.ibm.com/developerworks/webservices/library/ws-rest1/. However, HTTP binding does not enable programmers to implement a fully REST-style system. I did not dig into WSDL 2.0 to learn more about its HTTP binding. Here (http://wso2.org/blog/footballsoccerpainting/949) is an article from someone else who complains about Axis2.
Then I found that the JCP published a Java API specification for RESTful web services: JSR-311 (http://jcp.org/en/jsr/detail?id=311). That looks pretty good, because now we have a standard for how to use REST in Java. Jersey (https://jersey.dev.java.net/) is the reference implementation of the specification. Besides Jersey, Restlet (http://www.restlet.org/) is another implementation that provides more features. Note that the specification itself is still in a draft phase.

I decided to try Jersey. First download and unpack it.
Sample code:

@Path("/")
public class RESTTest{
    @HttpContext UriInfo uriInfo;	
    
    @GET
    @ProduceMime("text/plain")
    public String getUserAll(){
    	return "You want to retrieve information about all users.";
    }
    
    @Path("{user}")
    public UserResource getUserInfoAsText(@UriParam("user") String userid){
    	return new UserResource(userid);
    }
}
As you have probably noticed, Java annotations are used extensively. I find this approach handy and convenient.
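The UserResource class referenced above is not shown in the post. A plausible minimal sketch, reusing the same early-draft JSR-311 annotations (hypothetical, for illustration only):

public class UserResource {
    private final String userid;

    public UserResource(String userid) {
        this.userid = userid;
    }

    @GET
    @ProduceMime("text/plain")
    public String getUser() {
        // the locator method in RESTTest hands the {user} path segment to this sub-resource
        return "Information about user " + userid;
    }
}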