BigData and Hadoop - Syllabus

This document is the syllabus for a course on Big Data and Hadoop. It lists the course objectives, five units of study covering topics such as HDFS, MapReduce, and Hive, and the course outcomes, which include understanding the fundamentals of big data architectures and using HDFS, MapReduce, and Hive to solve complex problems that require massive computation.

Uploaded by

Avish Khan

Chhattisgarh Swami Vivekanand Technical University, Bhilai

Name of Program: Bachelor of Technology
Branch: Common to all Branches          Semester: VII
Subject: Big Data and Hadoop            Code: D000719(022)
Total Theory Periods: 40                Total Tutorial Periods: Ten (Minimum)
Class Tests: Two (Minimum)              Assignments: Two (Minimum)
ESE Duration: Three Hours               Max Marks: 100    Min Marks: 35

Course Objectives
This course introduces the fundamental concepts of cloud computing and lays a strong foundation in Apache Hadoop
(a big data framework).
1. The HDFS file system and the MapReduce framework are studied in detail.
2. Hadoop tools such as Hive and HBase, which provide an interface to relational databases, are also covered as
part of this course work.
3. Analyzing data with Unix tools.
4. Sorting, and map-side and reduce-side joins.
5. Implementation, including Java and MapReduce clients.

UNIT I
Introduction to Big Data. What is Big Data? Why Big Data is important. Meet Hadoop. Data.
Data storage and analysis. Comparison with other systems. Grid computing. A brief history of
Hadoop. Apache Hadoop and the Hadoop ecosystem. Linux refresher. VMware installation of
Hadoop.

UNIT II
The design of HDFS. HDFS concepts. Command-line interface to the Hadoop Distributed File
System (HDFS). Hadoop file systems. Interfaces. Java interface to Hadoop. Anatomy of a file
read. Anatomy of a file write. Replica placement and coherency model. Parallel copying with
distcp. Keeping an HDFS cluster balanced.

UNIT III
Introduction. Analyzing data with Unix tools. Analyzing data with Hadoop. Java
MapReduce classes (new API). Data flow. Combiner functions. Running a distributed
MapReduce job. Configuration API. Setting up the development environment. Managing
configuration. Writing a unit test with MRUnit. Running a job in the local job runner. Running on a
cluster. Launching a job. The MapReduce web UI.

UNIT IV
Classic MapReduce. Job submission. Job initialization. Task assignment. Task execution.
Progress and status updates. Job completion. Shuffle and sort on the map and reduce sides.
Configuration tuning. MapReduce types. Input formats. Output formats. Sorting. Map-side and
reduce-side joins.

UNIT V
The Hive shell. Hive services. Hive clients. The metastore. Comparison with traditional
databases. HiveQL. HBasics. Concepts. Implementation. Java and MapReduce clients. Loading
data, web queries.
Text books:
1. Tom White, “Hadoop: The Definitive Guide”, 3rd Edition, O’Reilly Publications, 2012
2. Dirk deRoos, Chris Eaton, George Lapis, Paul Zikopoulos, Tom Deutsch, “Understanding Big Data:
Analytics for Enterprise Class Hadoop and Streaming Data”, McGraw Hill Osborne Media, 1st Edition, 2011
REFERENCES:
1. http://www.cloudera.com/content/cloudera-content/clouderadocs/HadoopTutorial/CDH4/Hadoop-Tutorial.html
2. https://www.ibm.com/developerworks/community/blogs/SusanVisser/entry/flashbook_understanding_big_data_analytics_for_enterprise_class_hadoop_and_streaming_data?lang=en

Course outcome [After undergoing the course, students will be able to:]
1. Understand the fundamentals of cloud and big data architectures.
2. Understand the HDFS file structure and the MapReduce framework, and use them to solve complex problems that
require massive computation power.
3. Use relational data in a Hadoop environment, using the Hive and HBase tools of the Hadoop ecosystem.
4. Understand the Hive shell.
5. Understand how Hive compares with traditional databases.
