Showing posts with label http. Show all posts

Wednesday, April 08, 2009

Port attribute in HTTP Cookie

Recently, I ran into an HTTP session problem in Jetty.
I started two Jetty instances on the same machine. The two instances share the same host name but listen on different ports:
    http://example.com:8000/
    http://example.com:9000/
When I visit http://example.com:8000/, the server automatically sets a cookie (called JSESSIONID).
When I then visit the other URL, http://example.com:9000/, the user agent sends the cookie that was set for http://example.com:8000/ to that server as well. The server then gets confused :-(

After debugging, I found that the cause is that the server does NOT include a port number in the Set-Cookie header. According to section 3.3.1 of RFC 2965, if the Port attribute is missing from the header, the user agent behaves as follows:

   Port    The default behavior is that a cookie MAY be returned to any
           request-port.

That means a cookie set for address A also matches every address that shares all URL components with A except the port number.
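
To make the default behavior concrete, here is a minimal sketch of a matching rule that looks only at host and path. The function name cookie_applies and the simple prefix test are my own simplifications for illustration, not how any particular browser implements it.

```python
from urllib.parse import urlsplit


def cookie_applies(cookie_host: str, cookie_path: str, url: str) -> bool:
    """Return True if a cookie scoped to (cookie_host, cookie_path) would be sent to url.

    Simplified on purpose: only host and path are compared. The port is
    ignored, which mirrors the default when no Port attribute is present.
    """
    parts = urlsplit(url)
    request_host = (parts.hostname or "").lower()
    request_path = parts.path or "/"
    return request_host == cookie_host.lower() and request_path.startswith(cookie_path)


# The cookie was set by the instance on port 8000, but no port was recorded.
print(cookie_applies("example.com", "/", "http://example.com:8000/"))  # True
print(cookie_applies("example.com", "/", "http://example.com:9000/"))  # True
```

Both calls return True because the port never takes part in the comparison, which is exactly why the second Jetty instance receives the first one's JSESSIONID.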

For Tomcat, I searched its mailing list:
http://www.google.com/search?hl=en&client=firefox-a&rls=org.mozilla%3Aen-US%3Aofficial&q=cookie+port+site%3Ahttp%3A%2F%2Fmail-archives.apache.org%2Fmod_mbox%2Ftomcat-users%2F&btnG=Search
It seems that the port number is currently not supported in its cookie management.

Solution
(1) Use different domain names that map to the same IP address.
(2) Use different paths, so that each session cookie is scoped to its own path.
E.g.
    http://example.com:8000/webapp1
    http://example.com:9000/webapp2
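
As a rough illustration of solution (2), the sketch below builds a Set-Cookie header with a Path attribute and checks which request paths such a cookie would match. The session id value and the helper names session_cookie_header and path_matches are hypothetical; the matching rule is a simplified version of the usual path-prefix rule.

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id: str, path: str) -> str:
    """Build a Set-Cookie header line whose cookie is scoped to the given path."""
    cookie = SimpleCookie()
    cookie["JSESSIONID"] = session_id
    cookie["JSESSIONID"]["path"] = path
    return cookie.output()


def path_matches(cookie_path: str, request_path: str) -> bool:
    """Simplified path match: the cookie path must be a whole-segment prefix."""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # Avoid treating /webapp1 as a prefix of /webapp10.
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


# Hypothetical session id; each instance scopes its cookie to its own webapp.
print(session_cookie_header("abc123", "/webapp1"))
print(path_matches("/webapp1", "/webapp1/index.html"))  # True
print(path_matches("/webapp1", "/webapp2/index.html"))  # False
```

With the cookies scoped this way, a request to http://example.com:9000/webapp2 no longer carries the session cookie issued for /webapp1, even though the port is still ignored.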

Thursday, April 10, 2008

A Simple Web 2.0 Framework

I have completed a simple application which supports common web 2.0 features: rating, commenting, and tagging.
The demo is at http://156.56.104.196:8080/tagService/index_ui.html.

The basic unit in my system is called a record. A record is an abstraction of an object. In theory, it can be of any type: image, text, presentation... However, the key issue is how to present different types of records to end users. Text is easy: it can be displayed to users directly. For an image, an img element can be used, but that requires the image to be stored somewhere on the website so that it can be accessed through a URL. Currently, only text is supported. When a user retrieves a record, its text content is displayed directly in the browser.

Operations:
(1) Add a new record
Users add a new record by uploading a file to the server and specifying a name, description and tags. A unique id is then generated for the new record on the server side.
(2) List all records
Get all records in the system.
(3) Get a record based on its id
Given a record id, get the corresponding record.
(4) Get records according to tag
Given a user-specified tag, the matching records are returned. Currently, users can only specify a single tag.
(5) Post comment and rating of a certain record
Users can post comments and ratings about records.
(6) Comment and rating of comment
Besides comments on records, comments on comments are supported. That is, users can also post comments and ratings about existing comments.
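
The operations above can be sketched as a small in-memory model. This is only an illustration under my own assumptions: the class and method names (Record, Comment, RecordStore and so on) are hypothetical and are not the actual service code, and a single counter is used to hand out ids for both records and comments.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Optional


@dataclass
class Comment:
    id: int
    text: str
    rating: Optional[int] = None
    replies: List["Comment"] = field(default_factory=list)  # comments on this comment


@dataclass
class Record:
    id: int
    name: str
    description: str
    tags: List[str]
    content: str  # only text records are supported
    comments: List[Comment] = field(default_factory=list)


class RecordStore:
    def __init__(self) -> None:
        self._ids = count(1)  # one id sequence shared by records and comments
        self._records: Dict[int, Record] = {}
        self._comments: Dict[int, Comment] = {}

    def add_record(self, name, description, tags, content) -> Record:   # (1)
        record = Record(next(self._ids), name, description, list(tags), content)
        self._records[record.id] = record
        return record

    def list_records(self) -> List[Record]:                             # (2)
        return list(self._records.values())

    def get_record(self, record_id: int) -> Record:                     # (3)
        return self._records[record_id]

    def find_by_tag(self, tag: str) -> List[Record]:                    # (4)
        return [r for r in self._records.values() if tag in r.tags]

    def comment_on_record(self, record_id, text, rating=None) -> Comment:   # (5)
        comment = self._new_comment(text, rating)
        self._records[record_id].comments.append(comment)
        return comment

    def comment_on_comment(self, comment_id, text, rating=None) -> Comment:  # (6)
        reply = self._new_comment(text, rating)
        self._comments[comment_id].replies.append(reply)
        return reply

    def _new_comment(self, text, rating) -> Comment:
        comment = Comment(next(self._ids), text, rating)
        self._comments[comment.id] = comment
        return comment


store = RecordStore()
rec = store.add_record("notes", "meeting notes", ["work"], "hello")
first = store.comment_on_record(rec.id, "useful", rating=4)
store.comment_on_comment(first.id, "agreed")
print([r.name for r in store.find_by_tag("work")])  # ['notes']
```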

Architecture:
[Figure: web20_arch_ws]
The client sends a request to a server, which uses a servlet to handle it. The servlet then calls a web service that provides the web 2.0 functionality.

Presentation location:
What is stored in the backend database is raw data. It does not contain information about how to present it to end users, e.g. layout or font size.
There are two strategies for transforming raw data into its final presentation: a server-side servlet and client-side JavaScript.
(1) Server-side servlet

[Figure: web20_arch_html_ws]
When the servlet responds to a user's request, it returns an HTML document. The HTML document contains not only the data but also presentation information, so it can be displayed directly. In this method, the servlet formats the raw data received from the web service into a corresponding HTML document.

(2) Client-side JavaScript

[Figure: web20_arch_json_ws]
In this method, the servlet just returns raw data in JSON format, so the returned data does not contain any presentation information. Client-side JavaScript then displays the raw data in a specific manner. To use this method, AJAX is needed: the XMLHttpRequest object can be used to make asynchronous requests without refreshing the whole page.
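
To contrast the two strategies, here is a small sketch that produces both kinds of response for the same record. The record fields and the function names render_html and render_json are assumptions for illustration, and Python stands in for the servlet code.

```python
import json
from html import escape

# Hypothetical raw record as it might come back from the web service.
record = {"id": 1, "name": "notes", "tags": ["work", "2008"], "content": "a < b"}


def render_html(rec: dict) -> str:
    """Strategy (1): the server mixes data and presentation into one HTML page."""
    tags = ", ".join(escape(t) for t in rec["tags"])
    return (
        "<html><body>"
        f"<h1>{escape(rec['name'])}</h1>"
        f'<p class="tags">{tags}</p>'
        f"<pre>{escape(rec['content'])}</pre>"
        "</body></html>"
    )


def render_json(rec: dict) -> str:
    """Strategy (2): the server returns raw data only; the client decides the layout."""
    return json.dumps(rec)


print(render_html(record))
print(render_json(record))
```

The JSON response carries no markup at all, so any change to layout or font size stays entirely on the client side.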

Personally, I prefer the second method.
I recall that there are some server-side Java libraries that make this transformation easy.

AJAX
    AJAX is used frequently. However, the request format is not XML. I manually compose the POST body in JavaScript and then send it with XMLHttpRequest. Specifically, I use application/x-www-form-urlencoded, which is the default encoding for form submission. I chose it because it is simpler than the other encoding, multipart/form-data. For it to work correctly, the data must be escaped.
    Responses from the server are in JSON format, so the JavaScript decodes the JSON string and dynamically modifies the web page accordingly.
    An alternative method is to use XML-RPC.
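
The escaping step matters because characters such as & and = would otherwise be read as separators. A minimal sketch of the round trip, assuming hypothetical parameter names (recordId, comment) and a hypothetical JSON reply; Python stands in here for the JavaScript running in the page:

```python
import json
from urllib.parse import quote_plus


def compose_post_body(params: dict) -> str:
    """Manually build an application/x-www-form-urlencoded body, escaping each part."""
    return "&".join(
        f"{quote_plus(str(key))}={quote_plus(str(value))}"
        for key, value in params.items()
    )


body = compose_post_body({"recordId": 7, "comment": "Tom & Jerry = 100% fun"})
print(body)  # recordId=7&comment=Tom+%26+Jerry+%3D+100%25+fun

# Hypothetical response text from the servlet, decoded before updating the page.
response_text = '{"status": "ok", "commentId": 12}'
reply = json.loads(response_text)
print(reply["commentId"])  # 12
```

Without the escaping, the server would see the text after "Tom " as the start of a new parameter.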

Possible future work:
(1) User management
Currently, the system is open and everyone can use it anonymously.
(2) RESTful access
(3) Support image type in presentation layer.
(4) Persistence of server data.

Sunday, April 06, 2008

HTTP Form POST Encoding

A form is a common way to submit requests with various parameters to a server. In W3C HTML 4 (http://www.w3.org/TR/html4/interact/forms.html#h-17.13.4), the form element has an "enctype" attribute that indicates how form data is encoded for submission.

(1) application/x-www-form-urlencoded (default type)

    from W3C specification:
  1. Control names and values are escaped. Space characters are replaced by `+', and then reserved characters are escaped as described in [RFC1738], section 2.2: Non-alphanumeric characters are replaced by `%HH', a percent sign and two hexadecimal digits representing the ASCII code of the character. Line breaks are represented as "CR LF" pairs (i.e., `%0D%0A').
  2. The control names/values are listed in the order they appear in the document. The name is separated from the value by `=' and name/value pairs are separated from each other by `&'.
Examples
    home=Cosby&favorite+flavor=flies 
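
The example above can be reproduced with Python's standard library, which is used here only as a convenient way to check the rules (spaces become `+', line breaks become %0D%0A). The second field name, note, is made up for the illustration.

```python
from urllib.parse import parse_qsl, urlencode

# Name/value pairs are kept in document order, so a list of tuples is used.
fields = [("home", "Cosby"), ("favorite flavor", "flies")]
encoded = urlencode(fields)
print(encoded)  # home=Cosby&favorite+flavor=flies

# A line break inside a value is sent as a CR LF pair.
print(urlencode([("note", "line one\r\nline two")]))  # note=line+one%0D%0Aline+two

# Decoding reverses the escaping.
print(parse_qsl(encoded))  # [('home', 'Cosby'), ('favorite flavor', 'flies')]
```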

(2) multipart/form-data
From the W3C specification:
"The content type "application/x-www-form-urlencoded" is inefficient for sending large quantities of binary data or text containing non-ASCII characters. The content type "multipart/form-data" should be used for submitting forms that contain files, non-ASCII data, and binary data.

The content "multipart/form-data" follows the rules of all multipart MIME data streams as outlined in [RFC2045]. The definition of "multipart/form-data" is available at the [IANA] registry.

A "multipart/form-data" message contains a series of parts, each representing a successful control. The parts are sent to the processing agent in the same order the corresponding controls appear in the document stream. Part boundaries should not occur in any of the data; how this is done lies outside the scope of this specification.

As with all multipart MIME types, each part has an optional "Content-Type" header that defaults to "text/plain". User agents should supply the "Content-Type" header, accompanied by a "charset" parameter.

Each part is expected to contain:

  1. a "Content-Disposition" header whose value is "form-data".
  2. a name attribute specifying the control name of the corresponding control. Control names originally encoded in non-ASCII character sets may be encoded using the method outlined in [RFC2045]."

Examples:

   Content-Type: multipart/form-data; boundary=AaB03x
   --AaB03x
   Content-Disposition: form-data; name="submit-name"

   Larry
   --AaB03x
   Content-Disposition: form-data; name="files"; filename="file1.txt"
   Content-Type: text/plain

   ... contents of file1.txt ...
   --AaB03x--
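
To show how such a message is put together, here is a minimal sketch that assembles the same kind of body by hand. The function name build_multipart is hypothetical, and it handles text content only; a real client would work with bytes and would normally let an HTTP library pick the boundary.

```python
def build_multipart(fields, files, boundary):
    """Assemble a text-only multipart/form-data body.

    fields: list of (name, value)
    files:  list of (name, filename, content_type, text)
    Returns (Content-Type header value, body string).
    """
    lines = []
    for name, value in fields:
        lines += [
            f"--{boundary}",
            f'Content-Disposition: form-data; name="{name}"',
            "",  # blank line separates part headers from part data
            value,
        ]
    for name, filename, content_type, text in files:
        lines += [
            f"--{boundary}",
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"',
            f"Content-Type: {content_type}",
            "",
            text,
        ]
    lines.append(f"--{boundary}--")  # closing delimiter

    # Part boundaries must not occur in any of the data.
    payloads = [value for _, value in fields] + [text for *_, text in files]
    if any(boundary in payload for payload in payloads):
        raise ValueError("boundary occurs in the data; choose another one")

    return f"multipart/form-data; boundary={boundary}", "\r\n".join(lines) + "\r\n"


content_type, body = build_multipart(
    fields=[("submit-name", "Larry")],
    files=[("files", "file1.txt", "text/plain", "... contents of file1.txt ...")],
    boundary="AaB03x",
)
print("Content-Type:", content_type)
print(body)
```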