I recently attended Cloudera’s Hadoop Training for Administrators course in Columbia, MD. You can read my recap of the first and second days here and here. On the third day, we covered cluster maintenance, monitoring, and benchmarking, job logging, and data importing.
Cluster Maintenance
- Common tasks include checking HDFS status, copying data between clusters, adding and removing nodes, rebalancing the cluster, namenode metadata backup and cluster upgrading.
- HDFS clusters can become unbalanced when new nodes are added, which can lead to performance issues.
- Clusters can be rebalanced using the balancer command, which moves blocks between nodes until each node’s disk utilization is within a set threshold of the cluster average. The balancer command should only be used after adding new nodes to a cluster.
- The namenode is a single point of failure (at this point in time).
- If namenode metadata is lost then the cluster is lost.
- The fsimage and edits files are the two primary files that hold namenode metadata on disk. The namenode doesn’t rewrite fsimage on every change to HDFS file metadata; rather, it appends changes incrementally to the edits log.
- At startup, the namenode loads the fsimage into RAM, then replays all entries from the edits log. The two files are merged at set intervals on the checkpoint node (aka the secondary namenode). This node copies both files, loads them into RAM, merges them, then copies the result back to the namenode.
- The checkpoint node does store a copy of these two files but depending on the last merge the data could be stale. It’s not meant to be a failover node, but more of a housekeeper.
- Wrigley recommends writing the namenode metadata to two local directories on different physical volumes and to an NFS directory. You can also retrieve copies of the namenode metadata over HTTP:
- fsimage: http://<namenode>:50070/getimage?getimage=1
- edits: http://<namenode>:50070/getimage?getedit=1
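The fsimage/edits mechanism described above can be modeled in a few lines. This is only a toy sketch, not the namenode's actual on-disk format: the dictionary-based `fsimage`, the tuple-based edit entries, and the function names `replay` and `checkpoint` are all assumptions made for illustration.

```python
# Toy model of namenode metadata: fsimage is a snapshot, edits is an append-only log.
# The data layout and names are hypothetical; the real fsimage/edits files are binary.

def replay(fsimage, edits):
    """Return the namespace obtained by applying every edit to a copy of fsimage."""
    namespace = dict(fsimage)  # copy so the input snapshot is left untouched
    for op, path, value in edits:
        if op == "create":
            namespace[path] = value
        elif op == "delete":
            namespace.pop(path, None)
        else:
            raise ValueError(f"unknown edit op: {op}")
    return namespace


def checkpoint(fsimage, edits):
    """Merge the edits into a new fsimage and return it with an empty edits log."""
    return replay(fsimage, edits), []


fsimage = {"/data/a.txt": {"replication": 3}}
edits = [
    ("create", "/data/b.txt", {"replication": 3}),
    ("delete", "/data/a.txt", None),
]

# What the namenode holds in RAM after startup: the snapshot plus replayed edits.
in_memory = replay(fsimage, edits)

# What the checkpoint node hands back: a merged snapshot and a truncated log.
new_fsimage, new_edits = checkpoint(fsimage, edits)
print(sorted(new_fsimage))
```

The staleness point also falls out of this model: a copy of fsimage taken before the merge is missing every change still sitting in edits.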
Cluster monitoring and Troubleshooting
- Use a general system monitoring tool like Nagios, Cacti, etc. to monitor your cluster.
- Monitor Hadoop daemons, disks and disk partitions, CPU usage, swap, and network transfer speeds.
- Log location is controlled in hadoop-env.sh
- Typically set to /var/log/hadoop
- Each daemon writes to logs
- .log is the first port of call for diagnosing issues. It uses log4j. The configuration for log4j is stored at conf/log4j.properties.
- Logs are rotated daily.
- Old logs are not deleted.
- .out is the combination of stdout and stderr, and doesn’t usually contain much output.
- Appenders are the destination for log messages
- The one that ships with Hadoop, DailyRollingFileAppender, is limited. It rotates the log file daily, but doesn’t limit the size of the log or the number of files; the admin has to provide scripts to manage this.
- CDH ships an alternate appender called RollingFileAppender that addresses the limitation of the default appender.
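Since the daily appender never deletes anything, here is a minimal sketch of the kind of cleanup script an admin might provide. It assumes rotated files end in `.log.YYYY-MM-DD` (the real suffix depends on the DatePattern in log4j.properties); the function name `purge_old_logs`, the file names, and the 30-day retention are made up for the example.

```python
import datetime
import pathlib
import re
import tempfile

# Assumed naming scheme for rotated files: <anything>.log.YYYY-MM-DD
ROTATED = re.compile(r"\.log\.(\d{4}-\d{2}-\d{2})$")


def purge_old_logs(log_dir, keep_days, today=None, dry_run=True):
    """Return rotated log files older than keep_days; delete them unless dry_run."""
    today = today or datetime.date.today()
    cutoff = today - datetime.timedelta(days=keep_days)
    expired = []
    for path in sorted(pathlib.Path(log_dir).iterdir()):
        match = ROTATED.search(path.name)
        if not match or not path.is_file():
            continue  # skip the live .log, .out files, and subdirectories
        try:
            rotated_on = datetime.date.fromisoformat(match.group(1))
        except ValueError:
            continue  # suffix looked like a date but is not a valid one
        if rotated_on < cutoff:
            expired.append(path)
            if not dry_run:
                path.unlink()
    return expired


# Demo on a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ["nn.log", "nn.log.2011-09-01", "nn.log.2011-10-04", "nn.out"]:
        pathlib.Path(demo_dir, name).touch()
    removed = purge_old_logs(demo_dir, keep_days=30,
                             today=datetime.date(2011, 10, 5), dry_run=False)
    removed_names = [p.name for p in removed]
    remaining = sorted(p.name for p in pathlib.Path(demo_dir).iterdir())
print(removed_names, remaining)
```

Running it with `dry_run=True` first (the default) lists what would be removed without touching anything.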
Job logs created by Hadoop
- When a job runs, two files are created: the job XML configuration file and the job status file. These files are stored in multiple places on local disk and in HDFS:
- Hadoop_log_dir/<job_id>_conf.xml (kept for 1 day by default)
- Hadoop_log_dir/history (kept for 30 days by default)
- <job_output_dir_in_HDFS>/_logs/history (kept forever by default)
- Jobtracker also keeps them in memory for a limited time
- Developer logs are stored in Hadoop_log_dir/userlogs (the location is hardcoded). You should be wary of large developer log files, as they can cause slave nodes to run out of space. By default, dev logs are deleted every 24 hours.
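One way to keep an eye on developer logs is to total the size of each subdirectory under userlogs and flag the large ones. The sketch below assumes one subdirectory per task attempt; the function names, the attempt names, and the 1 KB limit are hypothetical and only there to make the example runnable.

```python
import os
import tempfile


def dir_size(path):
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def oversized_userlogs(userlogs_dir, limit_bytes):
    """Return {subdirectory name: size} for entries larger than limit_bytes."""
    report = {}
    for entry in os.scandir(userlogs_dir):
        if entry.is_dir():
            size = dir_size(entry.path)
            if size > limit_bytes:
                report[entry.name] = size
    return report


# Demo on a throwaway directory; the attempt names are made up.
with tempfile.TemporaryDirectory() as userlogs:
    for attempt, nbytes in [("attempt_small", 10), ("attempt_big", 5000)]:
        os.mkdir(os.path.join(userlogs, attempt))
        with open(os.path.join(userlogs, attempt, "syslog"), "wb") as f:
            f.write(b"x" * nbytes)
    flagged = oversized_userlogs(userlogs, limit_bytes=1024)
print(flagged)
```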
Monitoring the Cluster with Ganglia
- We discussed several general system monitoring tools, but none of them integrate with Hadoop.
- Ganglia is designed for clusters. It integrates with Hadoop’s metrics-collection system, but doesn’t provide alerts.
Benchmarking a cluster
- Standard benchmark is Terasort
- Example: Generate a 10,000,000 line file, each line containing 100 bytes, then sort the file:
hadoop jar $HADOOP_HOME/hadoop-*-examples.jar teragen 10000000 input-dir
hadoop jar $HADOOP_HOME/hadoop-*-examples.jar terasort input-dir output-dir
- Benchmarks are predominantly used to test network and disk I/O.
- You should benchmark clusters before and after adding nodes, so that you have a baseline to compare against. It’s also good to do before and after upgrades.
- Cloudera is working on a benchmarking guide.
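The teragen example above works out to 10,000,000 rows × 100 bytes = 1 GB of input. A small helper makes it easy to size other runs. This is just a sketch: the function names and the `hadoop-examples.jar` placeholder are assumptions, and the helper only builds the argument list rather than launching anything.

```python
ROW_BYTES = 100  # each teragen row is 100 bytes, as in the example above


def teragen_rows(target_bytes):
    """Number of 100-byte rows needed to reach at least target_bytes."""
    return -(-target_bytes // ROW_BYTES)  # ceiling division


def teragen_args(examples_jar, target_bytes, out_dir):
    """Argument list for a teragen run of roughly target_bytes."""
    return ["hadoop", "jar", examples_jar, "teragen",
            str(teragen_rows(target_bytes)), out_dir]


one_gb = 10**9
rows = teragen_rows(one_gb)
args = teragen_args("hadoop-examples.jar", one_gb, "input-dir")  # placeholder jar name
print(rows)            # 10000000
print(" ".join(args))
```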
Populating HDFS from External Resources
- Flume is a distributed, reliable, available service for moving large amounts of data as it is produced. It was created at Cloudera as a spinoff of Facebook’s Scribe.
- Flume is ideally suited for gathering logs from multiple systems as they are generated.
- It’s configurable through a web browser or CLI, and can be extended by adding connectors to existing storage layers or data platforms.
- General sources already provided include data from files, syslog, and stdout from a process.
- Wrigley said there were some latency issues with Flume that are being fixed in the next minor version.
- Sqoop is the SQL-to-Hadoop database import tool. It was developed at Cloudera, is open source, and is included as part of CDH. (It’s about to become a top-level Apache project.)
- Sqoop uses JDBC to connect to the RDBMS.
- It examines each table and automatically generates a Java class to import the data into HDFS, then creates and runs a map-only MapReduce job to import the data. (Aside: per Mike Olson, you would have to be crack-pipe crazy to run MapReduce 2 in production.)
- By default, four mappers connect to the RDBMS, and each imports a quarter of the data.
- Sqoop features:
- Imports a single table, or all tables in a database
- Can specify which rows to import with a WHERE clause
- Can specify columns to import
- Can provide an arbitrary SELECT statement
- Can automatically create a Hive table based on imported data
- Supports incremental imports of data
- Can export data from HDFS back to a database table
- Cloudera has partnered with third parties (Oracle, MicroStrategy, and Netezza) to create native Sqoop connectors that are free but not open source.
- MicroStrategy has its own version of Sqoop for SQL Server, derived from the open-source Sqoop.
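To make the "four mappers, a quarter of the data each" idea concrete, here is a rough sketch of how a numeric key range could be divided among mappers. It is a simplified model, not Sqoop's actual splitter: the function names, the `id` column, and the exact boundary handling are assumptions for illustration.

```python
def split_ranges(lo, hi, num_mappers=4):
    """Split the inclusive key range [lo, hi] into contiguous half-open ranges.

    Returns (start, end) pairs; each mapper would read rows with
    start <= key < end. The last range ends at hi + 1 so hi is included.
    """
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    if hi < lo:
        raise ValueError("hi must not be smaller than lo")
    total = hi - lo + 1                    # number of key values to cover
    num_mappers = min(num_mappers, total)  # never create empty ranges
    base, extra = divmod(total, num_mappers)
    ranges = []
    start = lo
    for i in range(num_mappers):
        size = base + (1 if i < extra else 0)  # spread the remainder evenly
        ranges.append((start, start + size))
        start += size
    return ranges


def where_clauses(column, ranges):
    """One WHERE condition per mapper for the given ranges."""
    return [f"{column} >= {a} AND {column} < {b}" for a, b in ranges]


quarters = split_ranges(1, 1000)       # keys 1..1000, default of four mappers
clauses = where_clauses("id", quarters)  # "id" is a hypothetical key column
for clause in clauses:
    print(clause)
```

If the key values are unevenly distributed, equal-width ranges like these will not give each mapper an equal share of the rows, which is worth keeping in mind when picking the column to split on.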
Best practices for importing data
- Import data to an intermediate directory in HDFS; then once it’s completely uploaded in HDFS, move it to the final destination. This prevents other clients from believing the file is there until it is completely there and ready to be processed.
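The stage-then-move pattern can be sketched with the local filesystem standing in for HDFS. The directory names, file name, and `staged_write` function are hypothetical; on a real cluster the final step would be an HDFS move (for example with `hadoop fs -mv`) rather than `os.replace`.

```python
import os
import tempfile


def staged_write(final_dir, name, data, staging_dir):
    """Write data under staging_dir, then move it into final_dir in one step."""
    os.makedirs(staging_dir, exist_ok=True)
    os.makedirs(final_dir, exist_ok=True)
    staged = os.path.join(staging_dir, name)
    with open(staged, "wb") as f:
        f.write(data)          # readers of final_dir cannot see this yet
    final = os.path.join(final_dir, name)
    os.replace(staged, final)  # single rename; atomic on the same filesystem
    return final


with tempfile.TemporaryDirectory() as root:
    incoming = os.path.join(root, "incoming")  # stand-in for the intermediate dir
    ready = os.path.join(root, "ready")        # stand-in for the final destination
    path = staged_write(ready, "part-00000", b"row1\nrow2\n", incoming)
    visible = os.listdir(ready)
    leftover = os.listdir(incoming)
    with open(path, "rb") as f:
        content = f.read()
print(visible, leftover)
```

Because the file only appears in the final directory after the rename, a job that lists that directory never picks up a half-written file.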
Installing and managing other Hadoop projects
- The Hive metastore should be stored in an RDBMS such as MySQL. This is a simple configuration:
- Create a user and database in the RDBMS
- Modify hive-site.xml on each user’s machine to point to the shared Metastore
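For the hive-site.xml step, the connection settings are the standard `javax.jdo.option.*` properties. The sketch below renders a minimal file body so the same settings can be pushed to each user's machine; the host name, database name, user, and password are placeholders, and a real hive-site.xml will usually carry other properties as well.

```python
import xml.etree.ElementTree as ET


def hive_site(jdbc_url, driver, user, password):
    """Render a minimal hive-site.xml body with the metastore connection settings."""
    props = {
        "javax.jdo.option.ConnectionURL": jdbc_url,
        "javax.jdo.option.ConnectionDriverName": driver,
        "javax.jdo.option.ConnectionUserName": user,
        "javax.jdo.option.ConnectionPassword": password,
    }
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


xml_text = hive_site(
    "jdbc:mysql://metastore-host/metastore",  # hypothetical host and database
    "com.mysql.jdbc.Driver",
    "hiveuser",                               # hypothetical user
    "change-me",                              # placeholder password
)
print(xml_text)
```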