Wednesday, December 13, 2006

Status Report for Date 11/30/2006 to Date 12/13/2006

Recently I have had to prepare for three final examinations (for courses I selected) at the same time, so I spent less time on Maze. What I have done is listed below:
1) Make the Maze interface display in English
First, I tried to compile and build the Maze client from source code. Here is the detailed procedure: http://zhenhua-guo.blogspot.com/2006/12/compile-and-build-maze-client.html
Second, I changed the interface to support English. The original interface language is Chinese. The resource files that describe the windows, forms and so on are easy to change because they are separate files. However, many displayed strings are set dynamically in the program source code at run time. These strings are hard to modify because they are scattered all over the source code. I have tried to make all of them display in English, but some may still be left.
2) Try to make an install package
First, I must find free software to do this.
Second, an install package can perform many optional actions, for example writing to the registry or creating directories, so I need to ask the members of the Maze group what exactly should happen during installation.
3) I am writing a simple user manual.
This manual will tell new users how to use Maze.
4) I will write a list of features of Maze.
Although I have written a description of Maze, I don't think I have learned all of its features. Because there is no systematic technical specification for Maze, it is hard for me to find every feature by reading the source code. I will contact members of the Maze group to get more information. However, I am not sure whether they have a full description of Maze.

Friday, December 08, 2006

Compile and Build Maze Client

These days I have been working continuously on modifying, compiling and building the Maze client. Because there is no documentation and some additional packages are used, it is difficult to build successfully from source code.
I downloaded the Enterprise Trial edition of C++ Builder 6.0 from Borland's website. During the installation, I selected all optional packages in case some of them are needed later.
After installation of C++ Builder:
1) ShellControls package must be installed.
This package is not installed as a standard package when C++ Builder is installed.
It is located at "CB_DIR/Examples/ShellControls" (CB_DIR is the directory where C++ Builder is installed). There are two packages in this directory: bcbshlctrls.bpk and dclshlctrls.bpk. Package bcbshlctrls.bpk should be opened in C++ Builder and compiled. Package dclshlctrls.bpk should be opened in C++ Builder, compiled and installed.
(dclshlctrls.bpk is a design time package)

2) I think the project file of Maze is corrupted, so I had to modify it manually.
The Maze project needs ShellCtrls.obj. However, the project file has no statement telling the compiler how to build ShellCtrls.obj, so I added several directives to tell C++ Builder how to build it.
In section <OBJFILES> , I added "cbdebug\ShellCtrls.obj"
In section <FILELIST>, I added <FILE FILENAME="ShellCtrls.cpp" FORMNAME="" UNITNAME="ShellCtrls.cpp" CONTAINERID="CCompiler" DESIGNCLASS="" LOCALCOMMAND="">

3) If compile errors occur, for example because some macros in ShellCtrls.cpp are not defined,
then you should add a macro definition to this file.
"#define NO_WIN32_LEAN_AND_MEAN" should be added to ShellCtrls.cpp.
Note!!! This definition must be put at the beginning of the file. If you put it elsewhere, strange errors may occur!! (I spent a lot of time debugging this.)

4) To make sure everything is OK, I put all the needed source files, including those provided by C++ Builder, into the project directory. The files provided by C++ Builder are shellconsts.h, ShellCtrls.h and ShellCtrls.cpp.

Wednesday, November 29, 2006

Status Report for Date 11/16/2006 to Date 11/29/2006

I have done several tasks in these two weeks:
  1. I got Maze to run successfully. All its functions can be used now.
  2. For future maintenance, I wrote down logs that record the procedure of setting up the system in detail: http://zhenhua-guo.blogspot.com/2006/11/memorandum-for-maze.html and http://zhenhua-guo.blogspot.com/2006/11/search-engine-in-maze.html . Some discussion is included in these articles as well.
  3. I refined my documents and gained a deeper understanding of Maze.
  4. I ran some tests to verify simultaneous downloading, searching and chatting.

There are several points in Maze that may be improved.

  1. The search engine can work better. The problem is described in my previous article (http://zhenhua-guo.blogspot.com/2006/11/search-engine-in-maze.html). I think the cause is the text segmentation and the handling of file names. When the program segments text, a search string such as "a.b.c" is split into the strings 'a', 'b' and 'c'.
  2. Most of the work needed to set up a Maze system has to be done manually, which is not friendly to administrators. I had to create the corresponding directories by hand, and so on. A package could be built to make this procedure short and easy.
  3. There are some bugs in the Maze client, such as abnormally high CPU utilization. The Maze client is built with Borland C++ Builder.
  4. There are no setup documents. In other words, nothing tells an administrator how to install Maze.
  5. There is no detailed and systematic technical description of the Maze system in English. There is one master's thesis (written in Chinese) concentrating on the details, written by one of the original developers of Maze.
  6. The unit of sharing in Maze is the whole file. Its transfer speed is slower than that of some other P2P protocols such as BitTorrent, because BitTorrent splits a file into data blocks and these blocks can be transferred simultaneously.

Sunday, November 26, 2006

search engine in maze

With the Directory Server, we can collect information about the shared files of all users. Running the Indexer program on the collected information builds the index. Then, by running the index server program, we can provide the search service. However, this service is not oriented to ordinary users.
Search requests from the Maze client are sent to a CGI program (ftpsearch) on a specified httpd server. This CGI program formats the search request string and forwards the formatted request to the Index Server, so it is the Index Server that actually implements the search function. After the Index Server handles the request, it sends the result back to the CGI program (ftpsearch) in XML format. The CGI program then formats the output and sends the final result to the Maze client.

Why are the CGI program and the Index Server separated? Why not put both in a single program?
I can think of one reason:
This is in line with the current web development trend. The MVC (Model-View-Controller) pattern has proved to be a good architecture. By separating view formatting from transaction handling, we can modify either part without changing the other. As a result, the system is easy to maintain.

Detailed Set Up Procedure:
First, we need to run an httpd server. Apache is a good choice. The CGI program ftpsearch must be built successfully. Then we configure Apache to specify the CGI directory, root directory and so on, and put the CGI program in the specified directory. Besides that, we should also write a configuration file for this CGI program to tell it the addresses of the Index Servers ... The configuration file looks like:
wwwpath=/home/webdoor/wwwroot/
confpath=/home1/maze/Maze/bin/
logpath=/home1/maze/Maze/log/
NoSumUp=ON
servernum=10
server0=162.105.146.85
serverport0=23007
serverproto0=
server1=162.105.146.86
serverport1=23005
serverproto1=
server2=162.105.146.87
serverport2=23000
serverproto2=
server3=162.105.146.3
serverport3=23003
serverproto3=
server4=162.105.146.46
serverport4=23001
serverproto4=
server5=162.105.146.85
serverport5=23009
serverproto5=
server6=162.105.146.86
serverport6=23006
serverproto6=
server7=162.105.146.87
serverport7=23004
serverproto7=
server8=162.105.146.3
serverport8=23008
serverproto8=
server9=162.105.146.46
serverport9=23002
serverproto9=

server: IP address of the index server
serverport: service port of the index server (this must agree with the port the index server listens on)
serverproto: communication protocol (ftp or http)
One more file called "xmlresul.xml" is needed by the CGI program to generate the final result. The option "wwwpath" in the configuration file above indicates where this XML file is stored.
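
To make the layout of this file concrete, here is a minimal Python sketch that reads the key=value lines and collects the numbered server entries. The function name parse_ftpsearch_conf and the returned dictionary shape are my own illustration; the real ftpsearch program is not written this way as far as I know.

```python
def parse_ftpsearch_conf(text):
    """Parse key=value lines and return (options, servers).

    The key names (servernum, serverN, serverportN, serverprotoN) follow
    the sample file above; everything else is a hypothetical helper.
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, _, value = line.partition("=")
        options[key.strip()] = value.strip()

    servers = []
    for i in range(int(options.get("servernum", 0))):
        servers.append({
            "host": options[f"server{i}"],
            "port": int(options[f"serverport{i}"]),
            "proto": options.get(f"serverproto{i}", ""),
        })
    return options, servers


# A shortened version of the sample configuration (two servers only).
sample = """\
wwwpath=/home/webdoor/wwwroot/
servernum=2
server0=162.105.146.85
serverport0=23007
serverproto0=
server1=162.105.146.86
serverport1=23005
serverproto1=
"""
options, servers = parse_ftpsearch_conf(sample)
```

Each entry in servers then carries the address and port that must match one running index server.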

However, I found an interesting phenomenon.
When I search for the keyword "latex", two files are returned: "latex.doc" and "latex.txt".
When I search for the keyword "latex.doc", no files are returned.
I guess the index server or the CGI program does not handle special characters (such as ".") properly.
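
One way such a mismatch can arise is when file names and queries are split into terms differently. The Python sketch below is only a toy model of that guess, not Maze's code: I assume an indexer that drops the file extension and a query parser that splits on "." and requires every term to match. All function names are hypothetical.

```python
import re
from collections import defaultdict


def index_tokens(filename):
    # Assumption: the indexer keeps only the base name (extension dropped).
    base = filename.rsplit(".", 1)[0]
    return [t for t in re.split(r"[^0-9a-z]+", base.lower()) if t]


def query_tokens(query):
    # Assumption: the query is split on "." and other punctuation.
    return [t for t in re.split(r"[^0-9a-z]+", query.lower()) if t]


def build_index(filenames, tokenizer):
    index = defaultdict(set)
    for name in filenames:
        for tok in tokenizer(name):
            index[tok].add(name)
    return index


def search(index, query):
    # Every query term must match (AND semantics).
    terms = query_tokens(query)
    if not terms:
        return set()
    results = None
    for term in terms:
        hits = index.get(term, set())
        results = hits if results is None else results & hits
    return results


files = ["latex.doc", "latex.txt"]
mismatched = build_index(files, index_tokens)   # indexer and query disagree
consistent = build_index(files, query_tokens)   # same tokenizer on both sides

print(search(mismatched, "latex"))      # both files
print(search(mismatched, "latex.doc"))  # nothing: "doc" was never indexed
print(search(consistent, "latex.doc"))  # only latex.doc
```

In this model, using the same segmentation for file names and for queries makes "latex.doc" findable again. Whether Maze fails for this reason would have to be checked in the Indexer and ftpsearch source code.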

Wednesday, November 22, 2006

memorandum for maze

First, I set up and started the services "HeartBeatSvr", "UserSvr" and "DServer".
Second,
I added a new user called "maze".
Then I made the directories "resource" and "MazeSearch".
Then I made ten subdirectories a0-a9 under the directory "resource".
I made the directories basepath0-basepath9, conf0-conf9, cache0-cache9, and bin under the directory "MazeSearch".
After that, the directory structure looks like this:

/home/maze/resource/a0/
....
/home/maze/resource/a9/
/home/maze/MazeSearch/basepath0/
...
/home/maze/MazeSearch/basepath9/
/home/maze/MazeSearch/conf0
...
/home/maze/MazeSearch/conf9
/home/maze/MazeSearch/cache0
...
/home/maze/MazeSearch/cache9
/home/maze/MazeSearch/bin
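
Since creating these directories by hand is tedious, the layout can be generated by a small script. This is a sketch in Python, assuming the directory names described above (a0-a9, basepath0-basepath9, conf0-conf9, cache0-cache9, bin). The helper names maze_layout and create_layout are my own, and the demo writes into a temporary directory instead of /home/maze.

```python
import tempfile
from pathlib import Path


def maze_layout(root):
    """Return every directory of the layout under the given root."""
    root = Path(root)
    dirs = [root / "resource" / f"a{i}" for i in range(10)]
    for prefix in ("basepath", "conf", "cache"):
        dirs += [root / "MazeSearch" / f"{prefix}{i}" for i in range(10)]
    dirs.append(root / "MazeSearch" / "bin")
    return dirs


def create_layout(root):
    """Create the directories; existing ones are left untouched."""
    for d in maze_layout(root):
        d.mkdir(parents=True, exist_ok=True)


# Demo in a throwaway directory; on a real server the root would be /home/maze.
with tempfile.TemporaryDirectory() as tmp:
    create_layout(tmp)
    created = sorted(p.name for p in (Path(tmp) / "MazeSearch").iterdir())
```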

Now, let me introduce the usage of these directories.
/home/maze/MazeSearch/bin: this directory contains the binary executables IndexSvr and Indexer.

/home/maze/resource/a0-a9: these ten directories contain information about the shared files of each user. The hash function that maps a user to a directory is: userid mod 10.
(Note: The path is hard-coded in the program and cannot be configured.)
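
As a small illustration of this mapping, in Python (the function name resource_dir is hypothetical, and I assume user IDs are non-negative integers):

```python
def resource_dir(user_id, base="/home/maze/resource"):
    """Directory holding the shared-file information for one user."""
    return f"{base}/a{user_id % 10}/"


print(resource_dir(12345))  # /home/maze/resource/a5/
```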

/home/maze/MazeSearch/conf0-conf9: these ten directories contain the configure files needed by the IndexSvr program.
/home/maze/MazeSearch/basepath0-basepath9: these ten directories contain the index files.

Work Procedure:
How to build the index for the users' shared files?
Command line: Indexer -d config_file
Program: Indexer
Argument: the path of the configure file for IndexSvr

The content of the configure file looks like:
DataTimeOutDay=30
respath=/home/maze/resource/a3/
basepath=../basepath3/
cachepath=../cache3/
logpath=../log/
confpath=../conf3/
binpath=../bin/
restrictlist=/home/maze/MazeSearch/md5.list
MinSiteSize=1
ftpserverport=23006
testsiteserver=162.105.80.121
testsiteport=30607

Three important configure options:
respath: the directory for which the index will be built. In other words, the input directory for the IndexSvr program.
basepath: the directory where the generated index file is saved. In other words, the output directory for the IndexSvr program.
confpath: the directory where the configure file is saved.

How to start up IndexSvr?
Command line: IndexSvr path_for_indexfile service_port
Arguments:
path_for_indexfile: where the index files are stored
service_port: the port this IndexSvr listens on
(Note: the configuration of the CGI program ftpsearch (described below) must match this configuration.)

The search function in Maze is implemented through CGI. In other words, an httpd server is needed, and search requests are delivered to this httpd server. Then the corresponding CGI program, called ftpsearch, is invoked to do the search. ftpsearch makes use of the service provided by IndexSvr, so the httpd server must be configured so that ftpsearch works correctly and seamlessly with IndexSvr.


There are ten directories for storing index files, so we should run IndexSvr ten times, with each process providing the search service for one set of index files.
When a search request from a user is delivered to ftpsearch, ftpsearch sends one request to each of these ten IndexSvr processes and then waits for the responses.
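
The fan-out described above can be sketched in Python as follows. This is not how ftpsearch is written. I assume that the index file path passed to IndexSvr is the matching basepath directory, the ports below are placeholders, and the function that talks to one server (query_one) is passed in as a stub because the wire protocol is not documented here.

```python
from concurrent.futures import ThreadPoolExecutor


def indexsvr_commands(ports, search_root="/home/maze/MazeSearch"):
    """One 'IndexSvr path_for_indexfile service_port' command per index directory."""
    return [
        [f"{search_root}/bin/IndexSvr", f"{search_root}/basepath{i}/", str(port)]
        for i, port in enumerate(ports)
    ]


def fan_out(query, servers, query_one):
    """Send the query to every server in parallel and merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        partial = pool.map(lambda server: query_one(server, query), servers)
        return [hit for hits in partial for hit in hits]


# Placeholder ports; the real ones must match the ftpsearch configuration.
ports = [23000 + i for i in range(10)]
commands = indexsvr_commands(ports)


def fake_query_one(server, query):
    # Stub standing in for the real request to one IndexSvr process.
    host, port = server
    return [f"{query} from {host}:{port}"]


servers = [("127.0.0.1", port) for port in ports]
results = fan_out("latex", servers, fake_query_one)
```

Each entry of commands could be handed to subprocess.Popen to start one IndexSvr process. The merged results keep the order of the server list.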

Client configuration:
After maze-5.6 is installed on Windows, there is a configure file called config.xml in the directory where you installed Maze. This file should be modified to update the information about the Maze servers. The IP addresses and service ports of the Maze services must be configured correctly, and the search CGI must also be configured in order to use the search engine.

Wednesday, November 15, 2006

Status Report for Date 11/2/2006 to Date 11/15/2006

Recently I have kept working on Maze. I have done the following tasks in these two weeks:
1.
I wrote a detailed article to introduce Maze. Its content includes the architecture, components, implementation and so on. This article covers almost all aspects of Maze, and I will refine it in the following days.
Link: http://156.56.104.196/introduction2maze.pdf
2.
To make future upgrades easy, I changed the OS from Red Hat to Ubuntu. Because the software environment in which Maze runs is obsolete, I also upgraded it for future study and research: I installed new versions of the libraries, compiler and so on.
3.
On the server side, I compiled and installed all components of Maze.

Solved Difficulty:
The greatest difficulty came from the jumbled source code package that the Maze group sent to us. The directory structure is chaotic, and files for the server side and the client side are intertwined. There were also some errors in the makefiles and source code. At almost every step I needed to modify a makefile or the source code, so progress was slow. I sent email to members of the Maze group, and they told me they were reorganizing the Maze source code and that it would take a long time. However, after hard work, I have installed all components successfully.

Next Step:
Configure and try to run maze!!

Brief Introduction to Maze:
Architecture Overview:

Components:
1. Directory Server
Collects the shared directories of every peer and builds a central database. If a user creates or modifies shared directories, the update is sent to the Directory Server automatically to update the database.
2. Index Server
Based on the database on the Directory Server, the Index Server builds an index of all files of all peers and provides the search service.
Index building is not triggered every time a user's shared directory changes, for performance reasons. Instead, the index is rebuilt periodically according to actual conditions.
Note:
When providing the search service, the Index Server needs to access the Heart Server to narrow the search results to files of users who are currently online.
3. User Server
Implements user registration and authentication.
4. Heart Server
Maintains the list of currently online users and their status. Periodically (every several seconds), the client sends a short message to the Heart Server to let it know that the client is online.
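
The heartbeat idea can be sketched in a few lines of Python. This is only an in-memory illustration, not the real Heart Server: the class name, the 30-second timeout and the status string are my assumptions, and no network code is shown.

```python
import time


class HeartServer:
    """Tracks when each user last sent a heartbeat (illustrative only)."""

    def __init__(self, timeout=30.0, clock=time.monotonic):
        self.timeout = timeout      # assumed: seconds of silence before a user counts as offline
        self.clock = clock
        self.last_seen = {}         # user_id -> (timestamp, status)

    def heartbeat(self, user_id, status="online"):
        # Called whenever a client's short periodic message arrives.
        self.last_seen[user_id] = (self.clock(), status)

    def online_users(self):
        # A user is online if the last heartbeat is recent enough.
        now = self.clock()
        return {
            user: status
            for user, (seen, status) in self.last_seen.items()
            if now - seen <= self.timeout
        }


# Demo with a fake clock so the result does not depend on real time.
now = [0.0]
server = HeartServer(timeout=30.0, clock=lambda: now[0])
server.heartbeat("alice")
now[0] = 20.0
server.heartbeat("bob")
now[0] = 40.0
online = server.online_users()   # alice has been silent for 40 s, bob for 20 s
```

A list like this is what the Index Server would need in order to drop search hits that belong to offline users.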

Full Text can be obtained here:
http://156.56.104.196/introduction2maze.pdf

Wednesday, November 01, 2006

Status Report for Date 10/18/2006 to Date 11/1/2006

I read the documentation contained in the Maze source code package and learned about Maze in more depth. Then I wanted to install it on Red Hat Linux. However, when I compiled a certain directory, an error occurred. I checked the makefile and found that one of its source code files was missing, so I could not go on. After that, I sent an email to a student in the Maze group to ask for an explanation, and now I am waiting for his reply.