Data Analytics Virtual Internship Report
On
BACHELOR OF TECHNOLOGY
In
INFORMATION TECHNOLOGY
By
BONAFIDE CERTIFICATE
EXTERNAL EXAMINER
VASIREDDY VENKATADRI INSTITUTE OF TECHNOLOGY
Department of Information Technology
CERTIFICATE OF AUTHENTICATION
We solemnly declare that this internship report, “CLOUD VIRTUAL INTERNSHIP”,
is the bona fide work done purely by us, carried out under the guidance of our Point of Contact, Mr. P.
Nagababu, towards partial fulfillment of the requirements of the Degree of Bachelor of
Technology in Information Technology from Jawaharlal Nehru Technological University,
Kakinada, during the year 2021-22.
Y.PRUDHVI -20BQ1A12I5________.
ABSTRACT
This Data Analytics Virtual Internship provides a strong foundation in AWS
fundamentals, covering common cloud concepts, core AWS services, the AWS platform and its
common use cases, AWS Cloud architectural principles (at a conceptual level), account security,
and compliance, through the Cloud Foundations course.
Solution architecture provides the ground for software development projects by tailoring IT
solutions to specific business needs and defining their functional requirements and stages of
implementation. It comprises many subprocesses that draw guidance from various enterprise
architecture viewpoints. With this, one can understand the requirements and the process to
deploy various AWS services such as Development & Management, Application Services, and
Foundation Services. The Capstone Project proposal is to create a platform that can be used to
build, test, deploy, and operate Docker applications at scale using AWS. The solution will offer a
framework that simplifies building infrastructure at the click of a button and automates
maintenance and scaling based on volumes.
Using Docker and Amazon Web Services, we intend to create a powerful framework and
toolset that can be used for building, deploying, testing, and operating any application.
LETTER OF UNDERTAKING
To
The Principal
Vasireddy Venkatadri Institute of Technology
Namburu,
Guntur.
We would like to take this opportunity to express our thanks to the teaching and
nonteaching staff in the Department of Information Technology, VVIT for their invaluable
help and support.
Y.PRUDHVI
Table of Contents
AWS Academy Cloud Foundations (ACFv2EN-17474):
Module 1 Cloud Concepts Overview
Introduction to Cloud Computing
Advantages of the Cloud
Introduction to AWS
Moving to the AWS Cloud
Module 2 Cloud Economics and Billing
Introduction
Fundamentals of Pricing
Total Cost of Ownership
AWS Organizations
AWS Billing & Cost Management
Technical Support Models
Module 3 AWS Global Infrastructure
Introduction
AWS Global Infrastructure
AWS Services & Service Categories
Module 4 AWS Cloud Security
Introduction
AWS Shared Responsibility Model
AWS IAM
Identity and Access Management
Securing a New AWS Account
Securing Accounts
Securing Data
Working to Ensure Compliance
Module 5 Networking and Content Delivery
Introduction
Networking Basics
Amazon VPC
VPC Wizard
VPC Networking
VPC Security
Route 53
CloudFront
Module 6 Compute
Introduction
Amazon EC2 Part 1
Amazon EC2 Part 2
Amazon EC2 Part 3
Introduction to Amazon EC2
Amazon EC2 Cost Optimization
Container Services
Introduction to AWS Lambda
Introduction to AWS Elastic Beanstalk
Module 7 Storage
Introduction
AWS EBS
Working with EBS
AWS S3
AWS EFS
S3 and EFS
AWS S3 Glacier
Module 8 Databases
Introduction
Amazon RDS
Build a Database Server
Amazon DynamoDB
Amazon Redshift
Amazon Aurora
Module 9 Cloud Architecture
Introduction
AWS Well-Architected Framework Design Principles
Operational Excellence
Security
Reliability
Performance Efficiency
Cost Optimization
Reliability & High Availability
AWS Trusted Advisor
Module 10 Auto Scaling and Monitoring
Introduction
Elastic Load Balancing
Amazon CloudWatch
Amazon EC2 Auto Scaling
AWS Academy Data Analytics :
Module 1 Introduction
Big Data
• Big Data Pipeline
• Big Data Tools
• Big Data Collection
• Big Data Storage
• Big Data Ingestion
• Big Data Processing and Analysis
• Big Data Visualization
Module 2 Lab 1
Lab 1 Introduction
Store Data in Amazon S3
Module 3 Lab 2
Lab 2 Introduction
Query Data in Amazon Athena
Module 4 Lab 3
Lab 3 Introduction
Query data in Amazon S3 with Amazon
Athena and AWS Glue
Module 5 Lab 4
Lab 4 Introduction
Analyze Data with Amazon Redshift
Module 6 Lab 5
Lab 5 Introduction
Analyze Data with Amazon SageMaker,
Jupyter Notebooks and Bokeh
Module 7 Lab 6
Lab 6 Introduction
Automate Loading Data with the AWS
Data Pipeline
Module 8 Lab 7
Lab 7 Introduction
Analyze Streaming Data with Amazon
Kinesis Firehose, Amazon Elasticsearch
and Kibana
Module 9 Lab 8
Lab 8 Introduction
Analyze IoT Data with AWS IoT
Analytics
About AICTE
History
The beginning of formal technical education in India can be dated back to the mid-19th
century. Major policy initiatives in the pre-independence period included the appointment of the
Indian Universities Commission in 1902, issue of the Indian Education Policy Resolution in 1904,
and the Governor General’s policy statement of 1913 stressing the importance of technical
education, the establishment of IISc in Bangalore, Institute for Sugar, Textile & Leather Technology
in Kanpur, N.C.E. in Bengal in 1905, and industrial schools in several provinces.
Initial Set-up
All India Council for Technical Education (AICTE) was set up in November 1945 as a
national-level apex advisory body to conduct a survey on the facilities available for technical
education and to promote development in the country in a coordinated and integrated manner. To
ensure this, as stipulated in the National Policy on Education (1986), AICTE was vested
with:
Statutory authority for planning, formulation, and maintenance of norms & standards
Quality assurance through accreditation
Funding in priority areas, monitoring, and evaluation
Maintaining parity of certification & awards
The management of technical education in the country
The Government of India (the Ministry of Human Resource Development) also constituted a
National Working Group to look into the role of AICTE in the context of proliferation of technical
institutions, maintenance of standards, and other related matters. The Working Group recommended
that AICTE be vested with the necessary statutory authority for making it more effective, which
would consequently require restructuring and strengthening with the necessary infrastructure and
operating mechanisms.
Overview of AICTE Internship Program
The most crucial element of internships is that they integrate classroom knowledge and
theory with practical application and skills developed in professional or community settings.
Organizations now recognize that work these days is more than just a way to earn one's
bread. It is a commitment, an awareness of others’ expectations, and a sense of ownership. To
see how an applicant might "perform" in various circumstances, they hire interns
and offer PPOs (Pre-Placement Offers) to the chosen few who have fulfilled all of their
requirements.
For getting a quicker and easier way out of such situations, many companies and students
have found AICTE to be of great help. Through its internship portal, AICTE has provided them with
the perfect opportunity to emerge as a winner in these trying times. The website provides the perfect
platform for students to put forth their skills & desires and for companies to place the intern
demand. It takes just 15 seconds to create an opportunity, which is auto-matched and auto-posted to
Google, Bing, Glassdoor, LinkedIn, and similar platforms. The selected intern's profile and availability are
validated by their respective colleges before they join or acknowledge the offer. Shortlisting the
right resume, with respect to skills, experiences, and location just takes place within seconds.
Nothing but authentic and verified companies can appear on the portal.
Additionally, there are multiple modes of communication to connect with interns. Both
companies and interns report satisfaction in terms of time management, quality, security against
fraud, and genuineness.
Fill in all the details, send in your application or demand, and just sit back and watch your vision
take flight.
AICTE Internship Platforms
About EduSkills
EduSkills is a non-profit organization that enables an Industry 4.0-ready digital workforce in
India. Our vision is to bridge the gap between academia and industry by ensuring world-class
curriculum access for our faculty and students.
We want to completely disrupt the teaching methodologies and ICT-based education system
in India. We work closely with all the important stakeholders in the ecosystem (students, faculty,
educational institutions, and Central/State Governments) by bringing them together through our
skilling interventions.
Our three-pronged engine targets social and business impact by working holistically on
Education, Employment and Entrepreneurship.
With a vision to create an industry-ready workforce who will eventually become leaders in
emerging technologies, EduSkills and AICTE launched a Virtual Internship program on Machine
Learning, supported by AWS Academy.
About AWS Academy:
b) Our internship ran from March to May. We first started Cloud Foundations and
completed it in one and a half months. We then started Cloud Architecting, which also took
about one and a half months. Finally, at the end of May, we completed our internship.
The professors also assisted us in completing the labs, which made cloud
architecting much easier. Because of the faculty's supervision, we were able to finish the
second portion of the internship, Cloud Architecting, with ease, and it was done by the end of May.
Training Program
a) I worked in the Data Analytics Internship offered by EduSkills on the AICTE platform.
I. The department gave us an AWS LMS account to train us to complete the internship. AWS Academy
Cloud Foundations and AWS Academy Cloud Architecting are the two courses that are mandatory to
complete the internship program. Each course includes knowledge checks and labs to give us
practical experience of working with the cloud.
II. The department guided us with online classes, scheduled for a week per course. The AWS
Cloud Foundations online classes were scheduled over one week, from 13th April to 19th April,
in which they gave an overview of the Cloud Foundations course with detailed explanations and
guided us through all 6 labs. The AWS Cloud Architecting online classes were scheduled for a
week, from 27th April to 3rd May, with detailed explanations of the guided and challenge labs.
A sheet was provided to complete the capstone project. They gave us the maximum level of
understanding of the course so that we could complete it smoothly and on time.
Amazon S3 overview :
Amazon S3 offers a range of object-level storage classes that are designed for different use cases:
Amazon S3 Standard
Amazon S3 Intelligent-Tiering
Amazon S3 Standard-Infrequent Access (Amazon S3 Standard-IA)
Amazon S3 One Zone-Infrequent Access (Amazon S3 One Zone-IA)
Amazon S3 Glacier
Amazon S3 Glacier Deep Archive
Because Amazon S3 is used throughout the course, you must know how to create Amazon S3
buckets and load data for the subsequent labs. In this lab, you use the AWS Management Console to
create an Amazon S3 bucket, add an IAM user to a group that has full access to the Amazon S3
service, upload files to Amazon S3, and run simple queries on the data in Amazon S3.
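The console steps above can also be scripted. The sketch below uses the boto3 SDK and is only an illustration: it assumes boto3 is installed and AWS credentials are configured, and the bucket, dataset, and key names are invented, not taken from the lab.

```python
def lab_object_key(dataset: str, filename: str) -> str:
    """Build a predictable S3 object key so later labs can find the data."""
    return f"data/{dataset}/{filename}"

def upload_lab_file(bucket: str, dataset: str, local_path: str) -> str:
    """Create the bucket if needed and upload one data file to it."""
    import boto3  # deferred import so the sketch loads without AWS credentials
    s3 = boto3.client("s3")
    # Note: outside us-east-1, create_bucket also needs a
    # CreateBucketConfiguration with a LocationConstraint.
    s3.create_bucket(Bucket=bucket)
    key = lab_object_key(dataset, local_path.rsplit("/", 1)[-1])
    s3.upload_file(local_path, bucket, key)
    return key
```

The key-naming helper keeps uploads consistent, which matters because Labs 2-4 query this same data in place.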
Lab 2 introduces you to Amazon Athena, the first analysis service in the course. You can use
Amazon Athena to query structured, unstructured, and semi-structured data, and it integrates with
AWS Glue. In this lab, you will practice using the AWS Management Console to run simple Athena
queries on the data that you stored in Amazon S3 in the previous lab.
Amazon Athena is an interactive query service that makes it easy to analyze data in Amazon S3
using standard SQL. Athena is serverless, so there is no infrastructure to manage, and you pay only for
the queries that you run. Athena is easy to use. Simply point to your data in Amazon S3, define the
schema, and start querying using standard SQL. Most results are delivered within seconds. With
Athena, there’s no need for complex ETL jobs to prepare your data for analysis. This makes it easy for
anyone with SQL skills to quickly analyze large-scale datasets.
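Athena itself runs against data in Amazon S3, but the style of standard SQL it accepts can be illustrated offline. The sketch below is only an analogy: the table and columns are invented, and an in-memory SQLite database stands in for Athena reading rows from S3.

```python
import sqlite3

# Illustrative stand-in for an Athena table; Athena would read the same
# kind of rows directly from files in Amazon S3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE taxi_trips (pickup_zone TEXT, fare REAL)")
conn.executemany(
    "INSERT INTO taxi_trips VALUES (?, ?)",
    [("Airport", 52.0), ("Midtown", 14.5), ("Airport", 48.0)],
)

# A typical analysis query: average fare per pickup zone.
rows = conn.execute(
    "SELECT pickup_zone, AVG(fare) FROM taxi_trips "
    "GROUP BY pickup_zone ORDER BY pickup_zone"
).fetchall()
print(rows)  # [('Airport', 50.0), ('Midtown', 14.5)]
```

The point of the analogy is that anyone who can write a GROUP BY query like this already has the skills the lab exercises; Athena simply removes the need to load the data into a database first.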
Benefits: fast and accurate; serverless, so you pay only for the queries that you run.
Lab 3 introduces you to AWS Glue. It builds on Lab 2 by showing how to use AWS Glue to infer the
schema from the data.
Lab 4 introduces you to Amazon Redshift. The lab addresses the Volume aspect of big data
problems: Amazon Redshift uses columnar storage to accommodate and scale to very large datasets.
Although the lab does not focus on the creation or management of Amazon Redshift clusters, you
should also review the overall architecture of an Amazon Redshift solution.
An architectural diagram is included in the lab instructions, and additional resources are listed
to provide more background material. Amazon Redshift is a fast, fully managed data warehouse that
makes it simple and cost-effective to analyze all your data by using standard SQL and your existing
business intelligence (BI) tools. Amazon Redshift is compatible with the tools that you already know
and use. It supports standard SQL, and it also provides high-performance Java Database
Connectivity (JDBC) and Open Database Connectivity (ODBC) connectors, which enable you to use the
SQL clients and BI tools of your choice.
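The columnar storage mentioned above can be sketched in a few lines. The column names and values below are invented for illustration; the point is that storing each column contiguously means an aggregate over one column only touches that column's data.

```python
# Row-oriented storage: each record holds every column together.
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 80.0},
    {"order_id": 3, "region": "east", "amount": 50.0},
]

# Column-oriented storage, as in Amazon Redshift: one contiguous
# sequence of values per column.
columns = {
    "order_id": [1, 2, 3],
    "region": ["east", "west", "east"],
    "amount": [120.0, 80.0, 50.0],
}

# An aggregate over one column reads only that column's values,
# which is why columnar stores scale well for analytic queries.
total = sum(columns["amount"])
print(total)  # 250.0
```

In a row store, the same SUM would have to scan past every order_id and region as well; at warehouse scale that difference in bytes read dominates query time.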
Lab 5: Analyze Data with Amazon SageMaker, Jupyter Notebooks, and Bokeh
Lab 5 introduces you to Amazon SageMaker, Jupyter notebooks, and the Bokeh Python
package. Amazon SageMaker is a fully managed machine learning service. Though machine learning is
not a part of this course, this lab uses Amazon SageMaker as a way of hosting a Jupyter notebook for
learners to work with. The main purpose of this lab is to provide you with an opportunity to
visualize data and practice using visualizations to support business decisions.
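The visualization step can be sketched without Bokeh itself, since the core task is aggregating raw records into a series a chart can plot. The values and 10-minute bins below are invented for illustration.

```python
from collections import Counter

# Raw measurements to visualize (illustrative values).
ride_durations_min = [4, 7, 12, 3, 25, 9, 14, 31, 6, 11]

def bin_label(minutes: int) -> str:
    """Assign a duration to a 10-minute histogram bucket."""
    low = (minutes // 10) * 10
    return f"{low}-{low + 9} min"

# Aggregate into the histogram series a Bokeh bar chart would plot.
histogram = Counter(bin_label(m) for m in ride_durations_min)
print(dict(histogram))
```

In the lab, a series like this would be handed to Bokeh inside the SageMaker-hosted notebook; a chart of these counts is what lets a viewer see at a glance that most rides are short, which is the kind of business insight the lab is after.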
Lab 6 introduces you to AWS Data Pipeline, a web service that you can use to migrate and
transform data. The main purpose of this lab is to provide learners with an opportunity to
automate moving data and to understand how this service fits into the larger context of
data analysis.
Lab 7: Analyze Streaming Data with Amazon Kinesis Data Firehose, Amazon Elasticsearch Service,
and Kibana
Lab 7 introduces you to Amazon Kinesis, Amazon Elasticsearch Service (Amazon ES), and
Kibana. This lab addresses the Velocity aspect of big data problems. Amazon Kinesis is a suite of
services for processing streaming data. With Amazon Kinesis, you can ingest real-time data such as
video, audio, website clickstreams, or application logs. You can process and analyse the data as it
arrives, instead of capturing it all to storage before you begin analysis.
In this lab, you use the Amazon Kinesis Data Firehose service to read data from an application
log and then send the data through Amazon ES to Kibana. Kibana is an open-source visualization and
analytics platform that integrates with Amazon ES.
Access Amazon Kinesis Data Firehose and Amazon Elasticsearch Service (Amazon ES) in
the AWS Management Console.
Create a Kinesis Data Firehose delivery stream.
Integrate a Kinesis Data Firehose delivery stream with Amazon ES.
Build visualizations with Kibana.
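The "process as it arrives" idea above can be sketched with a generator standing in for a Firehose delivery stream. The record format and latency values are invented for illustration: instead of landing all data in storage first, a running statistic is updated per record.

```python
def clickstream():
    """Stand-in for a streaming source: yields records one at a time."""
    for latency_ms in [120, 95, 180, 60, 240]:
        yield {"page": "/home", "latency_ms": latency_ms}

# Process each record on arrival, keeping only a running aggregate,
# rather than capturing everything to storage before analysis begins.
count, total = 0, 0
for record in clickstream():
    count += 1
    total += record["latency_ms"]

avg_latency = total / count
print(avg_latency)  # 139.0
```

This is the Velocity trade-off in miniature: the running aggregate is available the moment the last record arrives, and memory use stays constant no matter how long the stream runs.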
Lab 8 introduces you to AWS IoT Analytics and AWS IoT Core. AWS IoT Analytics automates
the steps required for analyzing IoT data. You can filter, transform, and enrich the data before storing it
in a time-series data store. AWS IoT Core provides connectivity between IoT devices and AWS services,
and it is fully integrated with AWS IoT Analytics.
In this lab, you set up the components of an AWS IoT Analytics implementation and then use a
Python script to simulate loading data into AWS IoT Core. After you have loaded the data into AWS IoT
Analytics, you perform queries to analyze the data.
AWS IoT Analytics is a fully-managed service that makes it easy to run and operationalize
sophisticated analytics on massive volumes of IoT data without having to worry about the cost and
complexity typically required to build an IoT analytics platform. It is the easiest way to run analytics on
IoT data and get insights to make better and more accurate decisions for IoT applications and machine
learning use cases.
IoT data is highly unstructured which makes it difficult to analyse with traditional analytics and
business intelligence tools that are designed to process structured data. IoT data comes from devices
that often record fairly noisy processes (such as temperature, motion, or sound). The data from these
devices can frequently have significant gaps, corrupted messages, and false readings that must be
cleaned up before analysis can occur. Also, IoT data is often only meaningful in the context of
additional, third party data inputs. For example, to help farmers determine when to water their crops,
vineyard irrigation systems often enrich moisture sensor data with rainfall data from the vineyard,
allowing for more efficient water usage while maximizing harvest yield.
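The clean-up step described above can be sketched directly: drop corrupted messages, gaps, and out-of-range readings before analysis. The field names, sample values, and 0.0-1.0 plausibility range below are invented for illustration.

```python
# Simulated IoT readings: a gap (None), a false reading, and a
# corrupted message mixed in with valid data.
readings = [
    {"sensor": "soil-1", "moisture": 0.42},
    {"sensor": "soil-1", "moisture": None},   # gap in the data
    {"sensor": "soil-2", "moisture": -3.0},   # false reading
    "###corrupted###",                        # corrupted message
    {"sensor": "soil-2", "moisture": 0.38},
]

def is_valid(reading) -> bool:
    """Keep only well-formed readings with a plausible moisture value."""
    if not isinstance(reading, dict):
        return False
    value = reading.get("moisture")
    return isinstance(value, float) and 0.0 <= value <= 1.0

clean = [r for r in readings if is_valid(r)]
print(len(clean))  # 2
```

In AWS IoT Analytics this filtering happens in the pipeline before the time-series store, so that queries like the vineyard irrigation example run only over trustworthy values.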
Capstone Project:
This project provides me with an opportunity to demonstrate the solution design skills that I
have developed throughout this course.
The assignment is to design and deploy a solution for the following case:
Strengths of completing the AWS Virtual Internship:
1) After completing this internship, one can be offered roles such as cloud deployment engineer,
cloud application engineer, or cloud platform engineer.
2) With the help of this internship, we can thoroughly understand the concepts of the cloud.
3) This internship helps us to gain practical knowledge along with theoretical knowledge.
4) This helps us to enhance our employability and flexibility.
Weaknesses:
Opportunities:
Threats:
1) If we do not click End Lab after completing a lab, the charges keep accruing; as a result, we
may need to start the whole course again from the beginning. This is a big threat.
Conclusion
In conclusion, the internship played a critical part in expanding not only my theoretical
knowledge but also my practical knowledge.
By pursuing this internship, I was able to gain data analytics knowledge. As data analytics is a
popular technology, it is both beneficial and promising for the future. Because it builds on
established platforms such as databases and compilers, it is user friendly and simple to use.