Seminar On Data Science

Data science involves extracting meaningful insights from massive amounts of structured and unstructured data using scientific methods, technologies, and algorithms. It is a multidisciplinary field that uses tools to manipulate data to find new and meaningful patterns. Data science components include statistics, domain expertise, data engineering, visualization, advanced computing, mathematics, and machine learning. Popular tools used in data science include R, Python, SQL, and Tableau. The data science lifecycle includes discovery, data preparation, model planning, model building, operationalizing models, and communicating results.

Uploaded by

kebe Aman
Copyright
© All Rights Reserved

SEMINAR-2

Data Science
Introduction
◦ Data science has become one of the most in-demand jobs of the 21st century.
◦ Many organizations are looking for candidates with knowledge of data science.
In this seminar, we introduce data science, including:
◦ Tools for data science,
◦ Components of data science,
◦ Applications, etc.
What is Data Science?
◦ Data science is the in-depth study of massive amounts of data. It involves
extracting meaningful insights from raw, structured, and unstructured data,
which is processed using the scientific method, different technologies, and
algorithms.
◦ It is a multidisciplinary field that uses tools and techniques to manipulate
data so that you can find something new and meaningful.
◦ Data science uses powerful hardware, programming systems, and efficient
algorithms to solve data-related problems. It is often described as the future
of artificial intelligence.
Cont’d…
◦ Data science is all about:
• Asking the correct questions and analyzing the raw data.
• Modeling the data using various complex and efficient algorithms.
• Visualizing the data to get a better perspective.
• Understanding the data to make better decisions and find the final result.
Need for Data Science:
Cont’d…
◦ Some years ago, there was less data, and it was mostly available in a
structured form that could easily be stored in Excel sheets and processed using
BI tools.
◦ Today, data has become vast: approximately 2.5 quintillion bytes of data are
generated every day, which has led to a data explosion. Research estimated that
by 2020, 1.7 MB of data would be created every second for every person on
earth. Every company requires data to work, grow, and improve its business.
◦ Handling such a huge amount of data is a challenging task for every
organization. To handle, process, and analyze it, we require complex, powerful,
and efficient algorithms and technology, and that technology came into
existence as data science.
Reasons for using data science technology:
• With the help of data science technology, we can convert the massive amount of
raw and unstructured data into meaningful insights.
• Data science technology is being adopted by various companies, whether big
brands or startups. Google, Amazon, Netflix, etc., which handle huge amounts
of data, use data science algorithms for a better customer experience.
• Data science is helping to automate transportation, for example by creating
self-driving cars, which are seen as the future of transportation.
• Data science can help with different predictions, such as surveys, elections,
flight ticket confirmation, etc.
Data science Jobs:
Data analyst
A data analyst mines huge amounts of data, models the data, and looks for
patterns, relationships, trends, and so on. In the end, the analyst produces
visualizations and reports that support decision making and problem solving.
Data engineer
A data engineer works with massive amounts of data and is responsible for
building and maintaining the data architecture of a data science project. A
data engineer also creates the data set processes used in modeling, mining,
acquisition, and verification.
Cont’d…
Data scientist
A data scientist is a professional who works with an enormous amount of data
to come up with compelling business insights through the deployment of
various tools, techniques, methodologies, algorithms, etc.
Machine Learning Expert
A machine learning expert works with the various machine learning algorithms
used in data science, such as regression, clustering, classification, decision
trees, random forests, etc.
Difference between BI and Data Science
Criterion | Business intelligence | Data science
Data source | Business intelligence deals with structured data, e.g., a data warehouse. | Data science deals with structured and unstructured data, e.g., weblogs, feedback, etc.
Method | Analytical (historical data) | Scientific (goes deeper to know the reason for the data report)
Skills | Statistics and visualization are the two skills required for business intelligence. | Statistics, visualization, and machine learning are the required skills for data science.
Focus | Business intelligence focuses on both past and present data. | Data science focuses on past data, present data, and also future predictions.
Data Science Components:
Components of Data Science
1. Statistics: Statistics is one of the most important components of data science. It is a
way to collect and analyze large amounts of numerical data and find meaningful
insights in it.
2. Domain expertise: In data science, domain expertise binds data science together. Domain
expertise means specialized knowledge or skills in a particular area. In data science, there
are various areas for which we need domain experts.
3. Data engineering: Data engineering is a part of data science that involves acquiring,
storing, retrieving, and transforming data. Data engineering also includes adding metadata
(data about data) to the data.
4. Visualization: Data visualization means representing data in a visual context so that
people can easily understand its significance. It makes huge amounts of data easy to
access in visual form.
Cont’d…
5. Advanced computing: Advanced computing does the heavy lifting of data
science. It involves designing, writing, debugging, and maintaining the
source code of computer programs.
6. Mathematics: Mathematics is a critical part of data science. It involves
the study of quantity, structure, space, and change. For a data scientist, a
good knowledge of mathematics is essential.
7. Machine learning: Machine learning is the backbone of data science. It is
about training a machine so that it can act like a human brain. In data
science, we use various machine learning algorithms to solve problems.
Tools for Data Science
• Data Analysis tools: R, Python, Statistics, SAS, Jupyter, RStudio, MATLAB, Excel,
RapidMiner.
• Data Warehousing: ETL, SQL, Hadoop, Informatica/Talend, AWS Redshift.
• Data Visualization tools: R, Jupyter, Tableau, Cognos.
• Machine learning tools: Spark, Mahout, Azure ML Studio.
Machine learning in Data Science
◦ To become a data scientist, one should also be aware of machine learning and its algorithms,
because various machine learning algorithms are widely used in data science.
The following are some of the machine learning algorithms used in data science:
• Regression
• Decision tree
• Clustering
• Principal component analysis
• Support vector machines
• Naive Bayes
• Artificial neural network
• Apriori
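As a small illustration of the first algorithm in this list, here is a minimal sketch of simple linear regression fitted by ordinary least squares. The function name `fit_line` and the spend/sales numbers are hypothetical and chosen only for the example; real projects would normally use a library instead.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: advertising spend versus sales.
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(spend, sales)
predicted_sales = slope * 6.0 + intercept  # prediction for an unseen spend value
print(slope, intercept, predicted_sales)
```

The fitted line can then be used to predict the output for input values that were not in the data.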
Data Science Lifecycle
Phases of the data science lifecycle:
1. Discovery: This phase involves acquiring data from all the identified internal and
external sources that help you answer the business question.
◦ The data can be:
• Logs from webservers
• Data gathered from social media
• Census datasets
• Data streamed from online sources using APIs
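As one possible example of the first source, the sketch below turns raw web server log lines into records. It assumes the logs follow the Common Log Format; the pattern, the function name `parse_log_line`, and the sample lines are illustrative assumptions, not part of the seminar.

```python
import re
from collections import Counter

# Assumed layout: host ident user [time] "method path protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log_line(line):
    """Return a dict for a well-formed line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    return record


# Hypothetical sample lines, including one that is not a log entry.
sample_lines = [
    '192.0.2.1 - - [10/Oct/2020:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.7 - - [10/Oct/2020:13:56:01 +0000] "GET /missing HTTP/1.1" 404 512',
    'not a log line',
]

parsed = (parse_log_line(line) for line in sample_lines)
records = [record for record in parsed if record is not None]
status_counts = Counter(record["status"] for record in records)
print(status_counts)
```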
2. Data preparation: Data can have many inconsistencies, such as missing values, blank
columns, and incorrect data formats, which need to be cleaned. You need to process,
explore, and condition the data before modeling. The cleaner your data, the better your
predictions.
In this phase, we need to perform the following tasks:
• Data cleaning
• Data reduction
• Data integration
• Data transformation
◦ After performing all the above tasks, we can easily use this data for further processing.
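A minimal sketch of data cleaning, assuming small records held as Python dictionaries. The rows, the field names, and the choice to fill missing ages with the rounded mean are all assumptions made for illustration; the right treatment of missing values depends on the data.

```python
from statistics import mean

# Hypothetical raw rows with stray spaces, a missing value, a bad format,
# inconsistent capitalization, and a duplicate.
raw_rows = [
    {"name": " Alice ", "age": "34", "city": "Delhi"},
    {"name": "Bob", "age": "", "city": "delhi"},
    {"name": "Alice", "age": "34", "city": "Delhi"},
    {"name": "Chen", "age": "forty", "city": "Pune"},
    {"name": "Dana", "age": "40", "city": " pune"},
]


def to_int_or_none(text):
    """Convert text to int; treat blank or malformed values as missing."""
    try:
        return int(text)
    except ValueError:
        return None


def clean(rows):
    cleaned = []
    seen = set()
    for row in rows:
        record = {
            "name": row["name"].strip(),
            "age": to_int_or_none(row["age"].strip()),
            "city": row["city"].strip().title(),
        }
        key = (record["name"], record["age"], record["city"])
        if key in seen:  # drop exact duplicates after normalization
            continue
        seen.add(key)
        cleaned.append(record)
    # Assumed imputation rule: fill missing ages with the rounded mean.
    known_ages = [r["age"] for r in cleaned if r["age"] is not None]
    fill_value = round(mean(known_ages)) if known_ages else None
    for record in cleaned:
        if record["age"] is None:
            record["age"] = fill_value
    return cleaned


cleaned_rows = clean(raw_rows)
for record in cleaned_rows:
    print(record)
```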
Cont’d…
3. Model planning: In this phase, we need to determine the various methods and techniques to establish
the relationships between input variables. We apply exploratory data analysis (EDA), using various
statistical formulas and visualization tools, to understand the relationships between variables and to
see what the data can tell us. Common tools used for model planning are:
◦ SQL Analysis Services
◦ R
◦ SAS
◦ Python
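A short sketch of what exploratory data analysis can look like in Python, using only the standard library. It computes summary statistics and the Pearson correlation between two variables; the study-hours and exam-score values are hypothetical.

```python
from math import sqrt
from statistics import mean, median, stdev


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = mean(xs), mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: hours studied and exam score for five students.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
score = [52.0, 55.0, 61.0, 64.0, 70.0]

summary = {
    "mean": mean(score),
    "median": median(score),
    "stdev": stdev(score),
}
correlation = pearson(hours, score)
print(summary)
print(correlation)
```

A correlation close to 1 or -1 suggests a strong linear relationship between the two variables, which can guide the choice of model in the next phase.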

4. Model building: In this phase, the process of model building starts. We create datasets for training
and testing purposes. We apply different techniques, such as association, classification, and
clustering, to build the model. Common tools used for model building are:
◦ SAS Enterprise Miner
◦ WEKA
◦ SPSS Modeler
◦ MATLAB
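The idea of training and testing datasets can be sketched as follows. This example uses a very simple classifier (nearest centroid) that stands in for the classification techniques mentioned above; the function names, the two-class data, and the 40% test fraction are assumptions for illustration only.

```python
import random
from math import dist


def stratified_split(samples, test_fraction, seed):
    """Split (features, label) pairs so every label appears in both sets."""
    rng = random.Random(seed)
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append((features, label))
    train, test = [], []
    for group in by_label.values():
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


def fit_centroids(train):
    """Training step: compute the mean point (centroid) of each class."""
    grouped = {}
    for features, label in train:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(axis) / len(points) for axis in zip(*points))
        for label, points in grouped.items()
    }


def predict(centroids, features):
    """Assign the label whose centroid is closest to the given point."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical two-class data with two numeric features per sample.
samples = [
    ((1.0, 1.0), "low"), ((1.0, 2.0), "low"), ((2.0, 1.0), "low"),
    ((2.0, 2.0), "low"), ((1.5, 1.5), "low"),
    ((8.0, 8.0), "high"), ((8.0, 9.0), "high"), ((9.0, 8.0), "high"),
    ((9.0, 9.0), "high"), ((8.5, 8.5), "high"),
]

train, test = stratified_split(samples, test_fraction=0.4, seed=0)
centroids = fit_centroids(train)
correct = sum(predict(centroids, features) == label for features, label in test)
accuracy = correct / len(test)
print(len(train), len(test), accuracy)
```

The model only sees the training set; accuracy is measured on the test set, which it has not seen before.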
Cont’d…
5. Operationalize: In this phase, we deliver the final reports of the project, along with
briefings, code, and technical documents. This phase gives you a clear overview of the
complete project performance and other components on a small scale before full
deployment.
6. Communicate results: In this phase, we check whether we have reached the goal set in
the initial phase. We communicate the findings and final result to the business team.
Applications of Data Science:
Image recognition and speech recognition:
◦ Data science is currently used for image and speech recognition. When you upload an image on Facebook, you start
getting suggestions to tag your friends. This automatic tagging suggestion uses an image recognition algorithm,
which is part of data science.
◦ When you say something to "Ok Google", Siri, Cortana, etc., and these devices respond to your voice, this is made
possible by a speech recognition algorithm.
Internet search or Search recommenders:
◦ When we want to search for something on the internet, we use search engines such as Google, Yahoo, Bing, Ask,
etc. All these search engines use data science technology to make the search experience better, and you can get a
search result within a fraction of a second.
Transport:
◦ The transport industry is also using data science technology to create self-driving cars. Self-driving cars are
expected to make it easier to reduce the number of road accidents.
Cont’d…
Healthcare:
◦ In the healthcare sector, data science provides many benefits. It is being used for tumor
detection, drug discovery, medical image analysis, virtual medical bots, etc.
Recommendation systems:
◦ Most companies, such as Amazon, Netflix, Google Play, etc., use data science technology to
create a better user experience with personalized recommendations. For example, when you search for
something on Amazon and start getting suggestions for similar products, this is because of data
science technology.
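One very simple way to produce "similar product" suggestions is to count which items appear together in the same purchase history. The sketch below shows that idea only; the users, the products, and the function name `recommend` are hypothetical, and the companies named above use their own, far more sophisticated systems.

```python
from collections import Counter

# Hypothetical purchase histories: user -> set of items bought.
purchases = {
    "user_a": {"laptop", "mouse", "keyboard"},
    "user_b": {"laptop", "mouse"},
    "user_c": {"laptop", "monitor"},
    "user_d": {"mouse", "keyboard"},
}


def recommend(item, baskets, top_n=2):
    """Suggest the items most often bought together with the given item."""
    co_counts = Counter()
    for basket in baskets.values():
        if item in basket:
            for other in basket:
                if other != item:
                    co_counts[other] += 1
    # Sort by count (highest first), then by name so ties are deterministic.
    ranked = sorted(co_counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in ranked[:top_n]]


suggestions = recommend("laptop", purchases)
print(suggestions)
```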
Fraud, Risk detection:
◦ The finance industry has always had problems with fraud and the risk of losses, but with the help of data
science, these can be reduced.
◦ Most finance companies are looking for data scientists to avoid risk and losses while increasing
customer satisfaction.
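As a rough sketch of how unusual transactions might be flagged, the example below marks amounts that lie far from the average, measured in standard deviations (a z-score). The transaction amounts and the threshold of 2.0 are assumptions chosen for illustration; real fraud detection relies on much richer data and models.

```python
from statistics import mean, stdev


def flag_outliers(amounts, threshold=2.0):
    """Return amounts whose z-score exceeds the threshold in absolute value."""
    average = mean(amounts)
    spread = stdev(amounts)
    if spread == 0:
        return []  # all values identical, nothing stands out
    return [a for a in amounts if abs(a - average) / spread > threshold]


# Hypothetical card transactions; one amount is far larger than the rest.
amounts = [20.0, 25.0, 22.0, 19.0, 24.0, 21.0, 23.0, 500.0]
flagged = flag_outliers(amounts)
print(flagged)
```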
◦ Digital Advertisement
THANK YOU!
