Hadoop Training Course Details
This hands-on Hadoop training course is for Data Analysts, BI Analysts, BI Developers, SAS Developers and other analysts who need to answer questions about Big Data stored in a Hadoop cluster. It teaches them how to develop applications and analyze data stored in Apache Hadoop using Hive. Students will learn the details of Hadoop 2.0, YARN and the Hadoop Distributed File System (HDFS), get an overview of MapReduce, and take a deep dive into using Hive to perform data analytics on Big Data.
Students will work through lab exercises using the Hortonworks Data Platform for Windows. In the labs they will issue HDFS commands to add and remove files and folders, run and monitor MapReduce jobs, retrieve HCatalog schemas from within a Pig script, join datasets, and use advanced Hive features such as windowing, views and multi-file inserts.
Upon completing the course, students will be able to:
- Understand the architecture of the Hadoop Distributed File System (HDFS) and how HDFS Federation works in Hadoop 2.0
- Use the Hadoop client to load data into HDFS
- Understand the various tools and frameworks in the Hadoop 2.0 ecosystem
- Use Sqoop to transfer data between Hadoop and a relational database
- Understand the architecture of MapReduce and run a MapReduce job on Hadoop 2.0
- Understand how Hive tables are defined and implemented
- Write efficient, SQL-like Hive queries for data analysis
- Perform data analytics on Big Data using Hive
- Use HCatalog with Hive
This instructor-led training (ILT) class is available for onsite and online delivery.