Email - info@dbauniversity.com

Phone - 720 934 1260, Chicago, USA.
Google Rating
4.8

Big Data using Cloudera Hadoop Training

Click here for quick registration

12 months of on-demand access to 20 videos.

Remote Desktop Connection provided for lab work.

Lab work: install a 3-node Cloudera Hadoop cluster.

Training cost: $449.

Download Course Brochure – PDF

Big Data with Hadoop Training

We currently offer a world-class Big Data with Hadoop training program for interested students and professionals. Because it is an online course, registration is open to anyone in the world. Click here to purchase the course for $449.

Course Details

1. Taught by Srini Ramineni (Founder, DBA University).
2. One year of on-demand access to our training videos, developed using whiteboard technology to give you a classroom-like experience.
3. Cloud lab access for each student through Remote Desktop Connection.
4. Additional lab work covers how to install and maintain a 3-node Cloudera Hadoop cluster (CDH) in the Amazon cloud.
5. Lab exercises with real-world data sets to challenge students on course topics.

Market demand for Big Data Professionals

Welcome to a new world. Welcome to the world of Big Data. According to a recent McKinsey Global Institute report, there are almost 200,000 positions available for Big Data analytical talent in the United States, and 1.5 million more data-savvy managers are needed to take full advantage of Big Data. The transformational potential of Big Data lies in the following five domains.

1. Health care (United States).
2. Public sector administration (European Union).
3. Retail (United States).
4. Manufacturing (Global).
5. Personal location data (Global).

Course Topics

Introduction to Big Data and the Hadoop Framework.

What is Big Data and what are the 3 characteristics of Big Data.
Introduction to Apache Hadoop.
History and current popular distributions of Hadoop.
The Big Data with Hadoop job market: current trends and future predictions.
Use cases of Hadoop and an overview of the entire Apache Hadoop ecosystem.
Lab practice: Connect to the DBA University single-node Hadoop server and browse its setup. Fully distributed Hadoop cluster lab work will follow.
Lab practice: How to set up a single-node Hadoop server on your own PC.

The Hadoop Distributed File System (HDFS).

Introduction to the Hadoop Distributed File System (HDFS).
What is the replication factor in HDFS, and best practices in HDFS design.
What are the NameNode, the Secondary NameNode and the DataNodes in HDFS.
Browse the HDFS using the web interface.
Identify configuration parameters for the NameNode and DataNodes.
High availability of data and metadata (NameNode) in HDFS.
Practice lab exercises working with HDFS using City of Chicago data sets.
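
To give a feel for the replication factor topic above, here is a minimal Python sketch (not course material) that estimates how many blocks and how much raw disk space a file occupies in HDFS. The function name `hdfs_footprint` is hypothetical, and the defaults of 3 replicas and 128 MB blocks are assumptions that mirror common Hadoop defaults; a real cluster may be configured differently.

```python
import math


def hdfs_footprint(file_size_mb, replication_factor=3, block_size_mb=128):
    """Estimate the HDFS footprint of one file.

    Assumed defaults: 3 replicas and 128 MB blocks (both configurable
    on a real cluster).
    """
    # A file is split into fixed-size blocks; the last block may be partial.
    blocks = math.ceil(file_size_mb / block_size_mb)
    # Every block is stored replication_factor times on different DataNodes.
    block_replicas = blocks * replication_factor
    # A partial last block only uses the space it needs, so raw storage
    # is simply the file size multiplied by the replication factor.
    raw_storage_mb = file_size_mb * replication_factor
    return {
        "blocks": blocks,
        "block_replicas": block_replicas,
        "raw_storage_mb": raw_storage_mb,
    }


if __name__ == "__main__":
    print(hdfs_footprint(1000))
```

Under these assumptions, a 1,000 MB file is split into 8 blocks and stored as 24 block replicas across the cluster.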

MapReduce computation paradigm.

Introduction to the MapReduce computation paradigm for Big Data processing.
What are mappers and reducers.
Learn about distributed data processing in MapReduce.
Understand the differences between MapReduce 1.0 and the latest MapReduce with YARN version.
Learn about the different components of the MapReduce computation framework.
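
As a small preview of mappers and reducers, the sketch below imitates the three phases of a word count job (map, shuffle, reduce) in plain Python on a single machine. It is an illustration only: it does not use the Hadoop API, and the function names `mapper`, `shuffle`, `reducer` and `word_count` are hypothetical. On a real cluster the framework runs many mappers and reducers in parallel and performs the shuffle itself.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle phase: group all values by key before they reach the reducers."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(word, counts):
    """Reduce phase: combine all values for one key into a single result."""
    return word, sum(counts)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    mapped = (pair for line in lines for pair in mapper(line))
    grouped = shuffle(mapped)
    return dict(reducer(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Hadoop"]))
```

Each mapper only sees its own line and each reducer only sees the values for its own key, which is what lets the real framework spread the work across many nodes.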

Apache Sqoop and Hadoop.

Introduction to the Apache Sqoop tool.
Prerequisites for the Sqoop data connector for Oracle and Hadoop.
How to import data from a relational database to Hadoop using Sqoop.
How to export data from Hadoop to a relational database using Sqoop.
Practice lab exercises with Apache Sqoop and an Oracle database.

Data warehousing in Hadoop (Apache Hive)

Introduction to Apache Hive.
Understand the components and architecture of Apache Hive.
The command line interfaces for running HiveQL: hive and beeline.
Learn about Hive Partitions and Buckets.
Learn and practice HiveQL statements.
How to work with the Twitter API to download tweet data.
Practice lab exercises working with real-time data sets in Hive.

Apache Pig

What is Apache Pig.
Learn about the Pig Data Model.
What are the rules and syntax of the Pig Latin language.
What is a JSON data object, and how to load and analyze JSON data sets using Pig.
Practice lab exercises working with real-time JSON data sets in Pig.

Install Cloudera Hadoop cluster in the cloud.

Choosing the hardware and compute resources for the servers (nodes) in a Hadoop cluster.
Software installation prerequisites of the Cloudera Hadoop cluster (CDH).
Understand the Cloudera Director and Cloudera Manager software components.
Learn how to perform cloud computing using Amazon Web Services (AWS).
Lab practice: How to install a 3-node Cloudera Hadoop cluster (CDH) in the cloud (AWS).
Lab practice: How to administer, manage and monitor the Hadoop cluster nodes using Cloudera Manager.
Learn about the Apache Hue web interface.
Lab practice: Use the Hue web interface to input Hive, Sqoop and Pig commands.

Apache Spark

Introduction to Apache Spark.
Compare Apache Spark and the MapReduce computational framework.
Learn about Spark SQL and DataFrames.
How to store and analyze JSON documents using the Apache Spark software framework.

Cloudera Impala

Introduction to Cloudera Impala.
Key Features of Cloudera Impala.
Cloudera Impala vs the MapReduce computational framework.
Comparison of Apache Hive, Pig and Impala.
Practice lab exercises using Cloudera Impala.

Apache Flume

Apache Flume and real world use cases.
What are the various components of Apache Flume.
Flume agent configuration.
Practice lab exercises using Apache Flume.

Contact Details

DBA University
605 W Madison St Suite 1108
Chicago IL 60661
United States of America

Ph 720 934 1260
Email info@dbauniversity.com
