When Hadoop starts, it sets the hadoop.log.dir property by passing -Dhadoop.log.dir=$HADOOP_LOG_DIR to the JVM.
If the environment variable HADOOP_LOG_DIR is not set explicitly, it defaults to $HADOOP_HOME/logs. If HADOOP_HOME is not set either, Hadoop infers it from the path of the script used to start Hadoop. So if Hadoop is installed in <hadoop_dir> and HADOOP_LOG_DIR is not set, the log directory is <hadoop_dir>/logs.
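The fallback order described above can be sketched in shell. This is only an illustration, not the code Hadoop's own scripts use: the function name resolve_log_dir is hypothetical, and the sketch assumes the start script lives in <hadoop_dir>/bin, so that the parent of the script's directory is the Hadoop home.

```shell
#!/bin/sh
# Hypothetical helper that mirrors the fallback order described above.
# $1: path of the script used to start Hadoop (only consulted when
#     HADOOP_HOME is unset).
resolve_log_dir() {
    script_path="$1"

    # 1. An explicitly set HADOOP_LOG_DIR always wins.
    if [ -n "${HADOOP_LOG_DIR:-}" ]; then
        printf '%s\n' "$HADOOP_LOG_DIR"
        return 0
    fi

    # 2. Otherwise use HADOOP_HOME; if that is unset too, guess it from the
    #    start script's location (assumption: the script is in <hadoop_dir>/bin).
    home="${HADOOP_HOME:-$(cd "$(dirname "$script_path")/.." && pwd)}"
    printf '%s\n' "$home/logs"
}
```

For example, with HADOOP_LOG_DIR unset and HADOOP_HOME=/opt/hadoop, calling resolve_log_dir /opt/hadoop/bin/start-all.sh prints /opt/hadoop/logs.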
To change the root log directory, edit the file 'conf/hadoop-env.sh' and add a line similar to
export HADOOP_LOG_DIR=/your/local/log/dir
In the following table, replace the variables enclosed in angle brackets:
<jobid>: id of a job
<username>: username of the user who starts up Hadoop.
<host>: host name of the node which runs the process.
Directory | Description | Related config parameters |
<hadoop.log.dir> | Log of various daemons | |
hadoop-<username>-jobtracker-<host>.log | Log of jobtracker daemon | |
hadoop-<username>-namenode-<host>.log | Log of namenode daemon | |
hadoop-<username>-secondarynamenode-<host>.log | Log of secondarynamenode daemon | |
hadoop-<username>-tasktracker-<host>.log | Log of tasktracker daemon | |
hadoop-<username>-datanode-<host>.log | Log of datanode daemon | |
job_<jobid>_conf.xml | Configuration file of a job | Only exists when the job is running. |
<hadoop.log.dir>/history | | mapreduce.jobtracker.jobhistory.location |
job_<jobid>_conf.xml | Only exists when the job is running. | |
job_<jobid>_<username> | Only exists when the job is running. | |
<hadoop.log.dir>/done | Log of completed jobs | mapreduce.jobtracker.jobhistory.completed.location |
job_<jobid>_<username> | Event logging. It includes all events of the job (e.g. job started, task started). | |
job_<jobid>_conf.xml | Job conf file. It includes all configurations of the job. | |
<hadoop.log.dir>/userlogs | Log of attempts. Stored on each task tracker. | |
job_<jobid> | Each directory contains log of all attempts of the job. | |
/jobtracker/jobsInfo (in HDFS) | Job Status Store | mapreduce.jobtracker.persist.jobstatus.active, mapreduce.jobtracker.persist.jobstatus.hours, mapreduce.jobtracker.persist.jobstatus.dir |
<jobid>.info | Job status of a job | |
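The naming patterns in the table can be turned into concrete paths with a small shell sketch. The function names daemon_log_path and job_userlog_dir are hypothetical, and defaulting <username> and <host> to the output of id -un and hostname is an assumption made for illustration; pass them explicitly if your installation differs.

```shell
#!/bin/sh
# Hypothetical helpers that expand the naming patterns from the table.

# Path of a daemon log: <hadoop.log.dir>/hadoop-<username>-<daemon>-<host>.log
# $1: log directory, $2: daemon name (e.g. namenode),
# $3: username (assumed default: current user), $4: host (assumed default: hostname)
daemon_log_path() {
    log_dir="$1"
    daemon="$2"
    user="${3:-$(id -un)}"
    host="${4:-$(hostname)}"
    printf '%s/hadoop-%s-%s-%s.log\n' "$log_dir" "$user" "$daemon" "$host"
}

# Directory holding the attempt logs of one job on a task tracker:
# <hadoop.log.dir>/userlogs/job_<jobid>
# $1: log directory, $2: job id
job_userlog_dir() {
    printf '%s/userlogs/job_%s\n' "$1" "$2"
}
```

For example, daemon_log_path /var/log/hadoop namenode alice node1 prints /var/log/hadoop/hadoop-alice-namenode-node1.log.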
Job logs in the <hadoop.log.dir>/history/done directory are kept for the period set by mapreduce.jobtracker.jobhistory.maxage. The default value is 1 week.
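To see which completed-job logs have already passed the default one-week age, a sketch like the following may help. The function name list_expired_job_logs is hypothetical, and using file modification time as the measure of age is an assumption for this illustration; it only lists files and does not reproduce Hadoop's own cleanup logic.

```shell
#!/bin/sh
# Hypothetical helper: list job log files at least N days old.
# $1: directory holding completed-job logs
# $2: age threshold in days (default 7, matching the one-week default)
list_expired_job_logs() {
    done_dir="$1"
    max_age_days="${2:-7}"
    # find's "-mtime +K" matches files whose age in whole days exceeds K,
    # so K = N - 1 selects files that are at least N days old.
    find "$done_dir" -type f -name 'job_*' -mtime +"$((max_age_days - 1))"
}
```

Call it with the directory to inspect, for example list_expired_job_logs /your/local/log/dir/history/done.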