LP BigData

Uploaded by

Akash
Big Data (KCS-061)

Course Outcome (CO) | Bloom's Knowledge Level (KL)
At the end of the course, the student will be able to:
CO1 | Demonstrate knowledge of Big Data Analytics concepts and its applications in business. | K1, K2
CO2 | Demonstrate functions and components of the Map Reduce framework and HDFS. | K1, K2
CO3 | Discuss data management concepts in a NoSQL environment. | K6
CO4 | Explain the process of developing Map Reduce based distributed processing applications. | K2, K5
CO5 | Explain the process of developing applications using HBase, Hive, Pig, etc. | K2, K5

DETAILED SYLLABUS (3-0-0)

Unit I (Proposed lectures: 06)
Introduction to Big Data: Types of digital data, history of Big Data innovation, introduction to Big Data platform, drivers for Big Data, Big Data architecture and characteristics, 5 Vs of Big Data, Big Data technology components, Big Data importance and applications, Big Data features - security, compliance, auditing and protection, Big Data privacy and ethics, Big Data Analytics, challenges of conventional systems, intelligent data analysis, nature of data, analytic processes and tools, analysis vs reporting, modern data analytic tools.

Unit II (Proposed lectures: 08)
Hadoop: History of Hadoop, Apache Hadoop, the Hadoop Distributed File System, components of Hadoop, data format, analyzing data with Hadoop, scaling out, Hadoop streaming, Hadoop pipes, Hadoop Ecosystem.
Map Reduce: Map Reduce framework and basics, how Map Reduce works, developing a Map Reduce application, unit tests with MRUnit, test data and local tests, anatomy of a Map Reduce job run, failures, job scheduling, shuffle and sort, task execution, Map Reduce types, input formats, output formats, Map Reduce features, real-world Map Reduce.

Unit III
HDFS (Hadoop Distributed File System): Design of HDFS, HDFS concepts, benefits and challenges, file sizes, block sizes and block abstraction in HDFS, data replication, how HDFS stores, reads, and writes files, Java interfaces to HDFS, command line interface, Hadoop file system interfaces, data flow, data ingest with Flume and Sqoop, Hadoop archives, Hadoop I/O: compression, serialization, Avro and file-based data structures.
Hadoop Environment: Setting up a Hadoop cluster, cluster specification, cluster setup and installation, Hadoop configuration, security in Hadoop, administering Hadoop, HDFS monitoring & maintenance, Hadoop benchmarks, Hadoop in the cloud.

Unit IV
Hadoop Ecosystem and YARN: Hadoop ecosystem components, schedulers, fair and capacity, Hadoop 2.0 new features - NameNode high availability, HDFS federation, MRv2, YARN, running MRv1 in YARN.
NoSQL Databases: Introduction to NoSQL.
MongoDB: Introduction, data types, creating, updating and deleting documents, querying, introduction to indexing, capped collections.
Spark: Installing Spark, Spark applications, jobs, stages and tasks, Resilient Distributed Datasets, anatomy of a Spark job run, Spark on YARN.
Scala: Introduction, classes and objects, basic types and operators, built-in control structures, functions and closures, inheritance.

Unit V (Proposed lectures: 09)
Hadoop Ecosystem Frameworks: Applications on Big Data using Pig, Hive and HBase.
Pig - Introduction to Pig, execution modes of Pig, comparison of Pig with databases, Grunt, Pig Latin, user-defined functions, data processing operators.
Hive - Apache Hive architecture and installation, Hive shell, Hive services, Hive metastore, comparison with traditional databases, HiveQL, tables, querying data and user-defined functions, sorting and aggregating, Map Reduce scripts, joins & subqueries.
HBase - HBase concepts, clients, example, HBase vs RDBMS, advanced usage, schema design, advanced indexing.
Zookeeper - how it helps in monitoring a cluster, how to build applications with Zookeeper.
IBM Big Data strategy, introduction to InfoSphere, BigInsights and BigSheets, introduction to Big SQL.

Text books and References:
1. Michael Minelli, Michelle Chambers, and Ambiga Dhiraj, "Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today's Businesses", Wiley.
2. "Big-Data Black Book", DT Editorial Services, Wiley.
3. Dirk deRoos, Chris Eaton, George Lapis, Paul Zikopoulos, Tom Deutsch, "Understanding Big Data Analytics for Enterprise Class Hadoop and Streaming Data", McGraw-Hill.
4. Thomas Erl, Wajid Khattak, Paul Buhler, "Big Data Fundamentals: Concepts, Drivers and Techniques", Prentice Hall.
5. Bart Baesens, "Analytics in a Big Data World: The Essential Guide to Data Science and its Applications (Wiley Big Data Series)", John Wiley & Sons.
6. Arshdeep Bahga, Vijay Madisetti, "Big Data Science & Analytics: A Hands-On Approach", VPT.
7. Anand Rajaraman and Jeffrey David Ullman, "Mining of Massive Datasets", CUP.
8. Tom White, "Hadoop: The Definitive Guide", O'Reilly.
9. Eric Sammer, "Hadoop Operations", O'Reilly.
10. Chuck Lam, "Hadoop in Action", Manning Publishers.
11. Deepak Vohra, "Practical Hadoop Ecosystem: A Definitive Guide to Hadoop-Related Frameworks and Tools", Apress.
12. E. Capriolo, D. Wampler, and J. Rutherglen, "Programming Hive", O'Reilly.
13. Lars George, "HBase: The Definitive Guide", O'Reilly.
14. Alan Gates, "Programming Pig", O'Reilly.
15. Michael Berthold, David J. Hand, "Intelligent Data Analysis", Springer.
16. Bill Franks, "Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics", John Wiley & Sons.
17. Glenn J. Myatt, "Making Sense of Data", John Wiley & Sons.
18. Pete Warden, "Big Data Glossary", O'Reilly.

ABES ENGINEERING COLLEGE, GHAZIABAD
Department of Computer Science
Lecture Plan

Semester: VI
Course name: Big Data
Course code: KCS-061
L-T-P: 3-0-0
Evaluation scheme: Sessional marks CT 30, TA 20, total 50; ESE 100; total 150
Credit: 3
Name of faculty: Dr. Pankaj Kumar Sharma, Mr. Ashwin Pert
Total lectures planned: 40

Lecture | Name of the Topic as given in the Syllabus | KL | Remarks

UNIT-I: Introduction to Big Data (K2)
L1 | Types of digital data, history of Big Data innovation, introduction to Big Data platform, drivers for Big Data | |
L2 | Big Data architecture and characteristics, 5 Vs of Big Data, Big Data technology components, Big Data importance and applications, Big Data features | |
L3 | Security, compliance, auditing and protection, Big Data privacy and ethics | K2 | Assignment, Quiz
L4 | Big Data Analytics, challenges of conventional systems, intelligent data analysis | |
L5 | Nature of data, analytic processes and tools | K1 |
L6 | Analysis vs reporting, modern data analytic tools | K2 |

UNIT-II: Basic Structural Modeling and Behavioral Modeling (K2, K3)
L7 | Hadoop: History of Hadoop, Apache Hadoop, the Hadoop Distributed File System | |
L8 | Components of Hadoop, data format, analyzing data with Hadoop | K2 |
L9 | Scaling out, Hadoop streaming | K2 |
L10 | Hadoop pipes, Hadoop Ecosystem | K3 |
L11 | Map Reduce: Map Reduce framework and basics, how Map Reduce works, developing a Map Reduce application | |
L12 | Unit tests with MRUnit, test data and local tests | K2 |
L13 | Anatomy of a Map Reduce job run, failures, job scheduling, shuffle and sort, task execution | |
L14 | Map Reduce types, input formats, output formats, Map Reduce features, real-world Map Reduce | |

Sessional Test

UNIT-III: Object Oriented Analysis; Structured analysis and structured design (SA/SD)
L15 | HDFS (Hadoop Distributed File System): Design of HDFS, HDFS concepts, benefits and challenges, file sizes, block sizes and block abstraction in HDFS, data replication | K2 |
L16 | How HDFS stores, reads, and writes files | K2 |
L17 | Java interfaces to HDFS, command line interface, Hadoop file system interfaces, data flow, data ingest with Flume and Sqoop, Hadoop archives | K3 |
L18 | Hadoop I/O: compression, serialization, Avro and file-based data structures | K2 |
L19 | Hadoop Environment: Setting up a Hadoop cluster, cluster specification | |
L20 | Cluster setup and installation, Hadoop configuration, security in Hadoop | |
L21 | Administering Hadoop, HDFS | K2 |
L22 | HDFS monitoring & maintenance, Hadoop benchmarks, Hadoop in the cloud | K2 |

UNIT-IV: Hadoop Eco System
L23 | Hadoop Ecosystem and YARN: Hadoop ecosystem components, schedulers, fair and capacity | K2 |
L24 | Hadoop 2.0 new features - NameNode high availability | K2 |
L25 | HDFS federation, MRv2 | K2, K3 |
L26 | YARN, running MRv1 in YARN | K2, K3 |
L27 | NoSQL Databases: Introduction to NoSQL | K2, K3 |
L28 | MongoDB: Introduction, data types, creating, updating and deleting documents, querying, introduction to indexing, capped collections | K2, K3 | Quiz 4
L29 | Spark: Installing Spark, Spark applications, jobs, stages and tasks | |
L30 | Resilient Distributed Datasets, anatomy of a Spark job run, Spark on YARN | K2, K3 |
L31 | Scala: Introduction, classes and objects, basic types and operators, built-in control structures, functions and closures, inheritance | K3, K4 | Test

UNIT-V: Hadoop Eco-System Framework
L32 | Hadoop Ecosystem Frameworks: Applications on Big Data using Pig, Hive and HBase | |
L33 | Pig - Introduction to Pig, execution modes of Pig | K2, K3 |
L34 | Comparison of Pig with databases, Grunt, Pig Latin | K2, K3 |
L35 | User-defined functions, data processing operators | |
L36 | Hive - Apache Hive architecture and installation, Hive shell, Hive services, Hive metastore, comparison with traditional databases | K2, K3 |
L37 | HiveQL, tables, querying data and user-defined functions, sorting and aggregating, Map Reduce scripts, joins & subqueries | K2, K3 |
L38 | HBase - HBase concepts, clients, example, HBase vs RDBMS, advanced usage, schema design, advanced indexing | K2, K3 |
L39 | Zookeeper - how it helps in monitoring a cluster, how to build applications with Zookeeper | K2, K3 |
L40 | IBM Big Data strategy, introduction to InfoSphere, BigInsights and BigSheets, introduction to Big SQL | K2, K3 |

PRE-UNIVERSITY EXAMINATION

KL - Bloom's Knowledge Level (K1, K2, K3, K4, K5, K6)
K1 - Remember, K2 - Understand, K3 - Apply, K4 - Analyze, K5 - Evaluate, K6 - Create

Text Books:
T1. Michael Minelli, Michelle Chambers, and Ambiga Dhiraj, "Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today's Businesses", Wiley.
T2. "Big-Data Black Book", DT Editorial Services, Wiley.
T3. Dirk deRoos, Chris Eaton, George Lapis, Paul Zikopoulos, Tom Deutsch, "Understanding Big Data Analytics for Enterprise Class Hadoop and Streaming Data", McGraw-Hill.
T4. Thomas Erl, Wajid Khattak, Paul Buhler, "Big Data Fundamentals: Concepts, Drivers and Techniques", Prentice Hall.
T5. Bart Baesens, "Analytics in a Big Data World: The Essential Guide to Data Science and its Applications (Wiley Big Data Series)", John Wiley & Sons.
T6. Arshdeep Bahga, Vijay Madisetti, "Big Data Science & Analytics: A Hands-On Approach", VPT.
T7. Anand Rajaraman and Jeffrey David Ullman, "Mining of Massive Datasets", CUP.
T8. Tom White, "Hadoop: The Definitive Guide", O'Reilly.
T9. Eric Sammer, "Hadoop Operations", O'Reilly.

Reference Books:
R1. Chuck Lam, "Hadoop in Action", Manning Publishers.
R2. Deepak Vohra, "Practical Hadoop Ecosystem: A Definitive Guide to Hadoop-Related Frameworks and Tools", Apress.
R3. E. Capriolo, D. Wampler, and J. Rutherglen, "Programming Hive", O'Reilly.
R4. Lars George, "HBase: The Definitive Guide", O'Reilly.
R5. Alan Gates, "Programming Pig", O'Reilly.
R6. Michael Berthold, David J. Hand, "Intelligent Data Analysis", Springer.
R7. Bill Franks, "Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics", John Wiley & Sons.
R8. Glenn J. Myatt, "Making Sense of Data", John Wiley & Sons.
R9. Pete Warden, "Big Data Glossary", O'Reilly.

Web references:
Cloud-Scale Analytics | Microsoft Azure
What is ... (AWS)
