When Hadoop is started, it sets hadoop.log.dir using -Dhadoop.log.dir=$HADOOP_LOG_DIR.
If you don't set the environment variable HADOOP_LOG_DIR explicitly, it defaults to $HADOOP_HOME/logs. If HADOOP_HOME is also unset, Hadoop infers it from the path of the script you use to start Hadoop. So if you install Hadoop in directory <hadoop_dir> and HADOOP_LOG_DIR is not set, the log directory is <hadoop_dir>/logs.
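This fallback can be sketched in a few lines of shell. The function below is a paraphrase of the logic in Hadoop's startup scripts, not the actual code; `resolve_log_dir` is a hypothetical name, and the real script derives HADOOP_HOME from its own location rather than taking it as an argument:

```shell
# Sketch of the log-dir fallback: use HADOOP_LOG_DIR if set,
# otherwise fall back to $HADOOP_HOME/logs.
resolve_log_dir() {
  # $1 = HADOOP_HOME, $2 = HADOOP_LOG_DIR (may be empty)
  echo "${2:-$1/logs}"
}

resolve_log_dir /opt/hadoop ""                 # → /opt/hadoop/logs
resolve_log_dir /opt/hadoop /var/log/hadoop    # → /var/log/hadoop
```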
If you want to change the root log directory, edit 'conf/hadoop-env.sh' and add a line similar to:
export HADOOP_LOG_DIR=/your/local/log/dir
In the following table, replace the variables enclosed in angle brackets:
<jobid>: the ID of a job
<username>: the username of the user who started Hadoop
<host>: the hostname of the node that runs the process
Directory | Description | Related config parameters
--- | --- | ---
<hadoop.log.dir> | Logs of the various daemons |
&nbsp;&nbsp;hadoop-<username>-jobtracker-<host>.log | Log of the jobtracker daemon |
&nbsp;&nbsp;hadoop-<username>-namenode-<host>.log | Log of the namenode daemon |
&nbsp;&nbsp;hadoop-<username>-secondarynamenode-<host>.log | Log of the secondarynamenode daemon |
&nbsp;&nbsp;hadoop-<username>-tasktracker-<host>.log | Log of the tasktracker daemon |
&nbsp;&nbsp;hadoop-<username>-datanode-<host>.log | Log of the datanode daemon |
<hadoop.log.dir>/history | Job history files | mapreduce.jobtracker.jobhistory.location
&nbsp;&nbsp;job_<jobid>_conf.xml | Configuration file of a job. Only exists while the job is running. |
&nbsp;&nbsp;job_<jobid>_<username> | Only exists while the job is running. |
<hadoop.log.dir>/history/done | Logs of completed jobs | mapreduce.jobtracker.jobhistory.completed.location
&nbsp;&nbsp;job_<jobid>_<username> | Event log of the job. Includes all events of the job (e.g. job started, task started). |
&nbsp;&nbsp;job_<jobid>_conf.xml | Job conf file. Includes all configurations of the job. |
<hadoop.log.dir>/userlogs | Logs of task attempts. Stored on each tasktracker. |
&nbsp;&nbsp;job_<jobid> | Each directory contains the logs of all attempts of the job. |
/jobtracker/jobsInfo (in HDFS) | Job Status Store | mapreduce.jobtracker.persist.jobstatus.active, mapreduce.jobtracker.persist.jobstatus.hours, mapreduce.jobtracker.persist.jobstatus.dir
&nbsp;&nbsp;<jobId>.info | Job status of a job |
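Given the naming scheme in the table, a daemon's log file path can be assembled from the log directory, username, daemon name, and host. The helper below is a hypothetical convenience function, not part of Hadoop:

```shell
# Build the path of a daemon log file following the
# hadoop-<username>-<daemon>-<host>.log naming scheme.
daemon_log_path() {
  # $1 = log dir, $2 = username, $3 = daemon, $4 = host
  echo "$1/hadoop-$2-$3-$4.log"
}

daemon_log_path /opt/hadoop/logs hdfs namenode master1
# → /opt/hadoop/logs/hadoop-hdfs-namenode-master1.log
```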
Job logs in the <hadoop.log.dir>/history/done directory are kept for mapreduce.jobtracker.jobhistory.maxage. The default value is 1 week.
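To shorten the retention period, you can override the property in mapred-site.xml. The sketch below assumes the value is interpreted in milliseconds; verify the unit against your version's mapred-default.xml before relying on it:

```xml
<!-- mapred-site.xml: keep completed job history for 3 days
     instead of the default 1 week.
     Assumption: value is in milliseconds (3 * 24 * 3600 * 1000). -->
<property>
  <name>mapreduce.jobtracker.jobhistory.maxage</name>
  <value>259200000</value>
</property>
```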