Intro to Data Science - LVC1 With Markings

Data Science

• Introduction to Data Science


• Artificial Intelligence
• Machine Learning
• Big Data
• Data Science vs Statistics
• Data Science vs Business Intelligence

Data Science

• Data science is the domain of study that deals with vast volumes of data using modern tools
and techniques to find unseen patterns, derive meaningful information, and make business
decisions.
• Data science uses complex machine learning algorithms to build predictive models.
• The data used for analysis can come from many different sources and be presented in various
formats.

Prerequisites for Data Science

• Machine Learning
Machine learning is the backbone of data science.
Data Scientists need to have a solid grasp of ML in addition to basic knowledge of statistics.

• Modeling
Mathematical models enable you to make quick calculations and predictions based on what you
already know about the data.
Modeling is also a part of Machine Learning and involves identifying which algorithm is the most
suitable to solve a given problem and how to train these models.

• Statistics
Statistics is at the core of data science.
A firm grasp of statistics helps you extract more insight and obtain more meaningful
results.

• Programming
Some level of programming is required to execute a successful data science project.
The most common programming languages are Python and R.
Python is especially popular because it’s easy to learn, and it supports multiple
libraries for data science and ML.

• Databases
A capable data scientist needs to understand how databases work, how to manage them, and
how to extract data from them.
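
As a minimal sketch of extracting data from a database, the example below uses Python's built-in sqlite3 module with an in-memory database. The table name, columns, and rows are hypothetical and invented only for illustration.

```python
import sqlite3

# In-memory database so the example leaves no files behind.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, invented purely for illustration.
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 50.0)],
)
conn.commit()

# Extract data: total sales per region, sorted by region name.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

print(rows)  # [('north', 170.0), ('south', 80.0)]
```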

Data Science Lifecycle

• Capture:
Data Acquisition, Data Entry, Signal Reception, Data Extraction.
This stage involves gathering raw structured and unstructured data.

• Maintain:
Data Warehousing, Data Cleansing, Data Staging, Data Processing, Data Architecture.
This stage covers taking the raw data and putting it in a form that can be used.

• Process:
Data Mining, Clustering/Classification, Data Modeling, Data Summarization.
Data scientists take the prepared data and examine its patterns, ranges, and biases to
determine how useful it will be in predictive analysis.

• Analyze:
Exploratory/Confirmatory Analysis, Predictive Analysis, Regression, Text Mining, Qualitative Analysis.
This stage is the core of the lifecycle: it involves performing the various analyses on the data.

• Communicate:
Data Reporting, Data Visualization, Business Intelligence, Decision Making.
In this final step, analysts prepare the analyses in easily readable forms such as charts, graphs,
and reports.
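
The five lifecycle stages can be sketched end to end on a toy dataset. This is only an illustration under stated assumptions: the CSV text, the column names, and the choice of a mean as the "analysis" are all hypothetical, and real projects use far larger data and richer tooling.

```python
import csv
import io
import statistics

# Capture: raw data arrives as text (a hypothetical CSV, invented here).
raw = "city,temp_c\nPune,31\nDelhi,\nPune,29\nDelhi,35\n"

# Maintain: parse the rows and drop records with missing values.
records = [r for r in csv.DictReader(io.StringIO(raw)) if r["temp_c"]]

# Process: group the cleaned values by city.
by_city = {}
for r in records:
    by_city.setdefault(r["city"], []).append(float(r["temp_c"]))

# Analyze: compute a summary statistic per city.
means = {city: statistics.mean(vals) for city, vals in by_city.items()}

# Communicate: present the result in a readable form.
for city, mean in sorted(means.items()):
    print(f"{city}: {mean:.1f} C")
```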

Data Science vs Business Intelligence

1. Concept
Data Science: A field that uses mathematics, statistics, and various other tools to discover
hidden patterns in data.
Business Intelligence: A set of technologies, applications, and processes that enterprises use
for business data analysis.

2. Focus
Data Science: Focuses on the future.
Business Intelligence: Focuses on the past and present.

3. Data
Data Science: Deals with both structured and unstructured data.
Business Intelligence: Deals mainly with structured data.

4. Flexibility
Data Science: More flexible, as data sources can be added as required.
Business Intelligence: Less flexible, as data sources need to be planned in advance.

5. Method
Data Science: Uses the scientific method.
Business Intelligence: Uses the analytic method.

6. Complexity
Data Science: More complex than business intelligence.
Business Intelligence: Much simpler than data science.

7. Expertise
Data Science: Its expert is the data scientist.
Business Intelligence: Its expert is the business user.

8. Questions
Data Science: Deals with the questions of what will happen and what if.
Business Intelligence: Deals with the question of what happened.

9. Storage
Data Science: The data to be used is distributed across real-time clusters.
Business Intelligence: A data warehouse is used to hold the data.

10. Integration of data
Data Science: The ELT (Extract-Load-Transform) process is generally used to integrate data for
data science applications.
Business Intelligence: The ETL (Extract-Transform-Load) process is generally used to integrate
data for business intelligence applications.

11. Tools
Data Science: Its tools include SAS, BigML, MATLAB, Excel, etc.
Business Intelligence: Its tools include InsightSquared Sales Analytics, Klipfolio, ThoughtSpot,
Cyfe, TIBCO Spotfire, etc.

12. Usage
Data Science: Companies can harness their potential by using data science to anticipate future
scenarios, reducing risk and increasing income.
Business Intelligence: Helps in performing root cause analysis on a failure or in understanding
the current status.

13. Business Value
Data Science: Delivers greater business value than business intelligence because it anticipates
future events.
Business Intelligence: Delivers less business value, because value is extracted statically by
plotting charts and KPIs (Key Performance Indicators).

14. Handling data sets
Data Science: Technologies such as Hadoop are available, and others are evolving, for handling
large data sets.
Business Intelligence: Sufficient tools and technologies are not available for handling large
data sets.

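
The difference between the ETL and ELT orderings named under "Integration of data" can be sketched as follows. The records, the transform function, and the two list-based "stores" are hypothetical stand-ins; real pipelines load into a data warehouse or cluster rather than a Python list.

```python
# Hypothetical raw records and stand-in "stores"; real pipelines would
# use a data warehouse or data lake instead of Python lists.
raw_records = [" Alice ", "BOB", " carol"]

def transform(record):
    """Normalize one record: trim whitespace and lowercase it."""
    return record.strip().lower()

# ETL: transform first, then load only the cleaned data.
warehouse = []
warehouse.extend(transform(r) for r in raw_records)

# ELT: load the raw data as-is, then transform it inside the store
# when an analysis needs it.
data_lake = list(raw_records)
analysis_view = [transform(r) for r in data_lake]

print(warehouse)      # ['alice', 'bob', 'carol']
print(data_lake)      # [' Alice ', 'BOB', ' carol']
print(analysis_view)  # ['alice', 'bob', 'carol']
```

In the ELT variant the untouched raw data stays available, so a different transformation can be applied later without re-capturing it.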
Data Science vs Statistics
• Interdisciplinary Approach: Data science is an interdisciplinary field that incorporates
elements of computer science, statistics, and domain expertise to extract insights, patterns,
and knowledge from data.
• Big Data Emphasis: Data science often deals with large and complex datasets, leveraging
technologies like Hadoop, Spark, and distributed computing to process and analyze vast
amounts of information.
• Predictive Modeling and Machine Learning: Data science places a significant emphasis on
predictive modeling using machine learning algorithms. It involves training models on data to
make predictions, classify information, or uncover hidden patterns.
• Focus on Actionable Insights: The goal of data science is to generate actionable insights for
decision-making, often in a business context. It goes beyond statistical analysis to provide
practical recommendations and solutions.
• Full Data Lifecycle: Data science encompasses the entire data lifecycle, from data collection
and cleaning to exploratory data analysis, feature engineering, model building, and
deployment. It involves continuous iteration and improvement.
Artificial Intelligence

• Definition: Artificial Intelligence refers to the development of computer systems capable of
performing tasks that typically require human intelligence, such as visual perception, speech
recognition, decision-making, and language translation.
• Objective: The primary goal of AI is to create machines that can simulate and replicate
human-like intelligence to solve complex problems, learn from experience, and adapt to
changing environments.
• Examples: AI applications include natural language processing (NLP), image and speech
recognition, expert systems, and autonomous vehicles.
• Subfields: AI encompasses subfields such as machine learning, robotics, computer vision,
and natural language processing, contributing to various applications across industries.
• Ethical Considerations: The rapid advancement of AI raises ethical concerns, including
issues related to job displacement, bias in algorithms, privacy concerns, and the potential for
autonomous systems to make morally significant decisions.
Machine Learning

• Definition: Machine Learning is a subset of AI that focuses on the development of algorithms
and statistical models that enable computer systems to improve their performance on a
specific task over time without being explicitly programmed.
• Training Data: ML algorithms learn from data, using patterns and statistical inference to
make predictions or decisions without explicit programming.
• Types: Supervised learning, unsupervised learning, and reinforcement learning are common
types of machine learning, each with its own approach to training and problem-solving.
• Applications: ML is used in various applications, such as recommendation systems, fraud
detection, image and speech recognition, and autonomous vehicles.
• Challenges: Challenges in machine learning include overfitting, bias in training data,
interpretability of models, and the need for large and diverse datasets for effective training.
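
As a minimal sketch of supervised learning, the example below fits a straight line to a handful of labeled points with ordinary least squares and then predicts an unseen value. The data (hours studied versus exam score) is hypothetical, and a one-variable linear model is chosen only because it is the simplest case; the point is that the rule is estimated from data rather than written by hand.

```python
# Hypothetical training data: hours studied (x) and exam score (y).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [52.0, 54.0, 56.0, 58.0]

def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# "Training": estimate the parameters from the labeled examples.
slope, intercept = fit_line(xs, ys)

def predict(x):
    """Apply the learned line to a new input."""
    return slope * x + intercept

print(predict(5.0))  # 60.0
```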
Big Data

• Definition: Big Data refers to extremely large and complex datasets that cannot be easily
processed, managed, or analyzed using traditional data processing tools and methods.
• Volume, Velocity, Variety: Big Data is characterized by the three Vs—volume (large amount
of data), velocity (speed at which data is generated), and variety (diversity of data types and
sources).
• Processing Technologies: Technologies such as Hadoop, Spark, and NoSQL databases are
commonly used to store, process, and analyze big data.
• Applications: Big Data analytics is applied in areas like business intelligence, healthcare,
finance, and scientific research to extract valuable insights, patterns, and trends.
• Challenges: Challenges in dealing with Big Data include data security concerns, scalability
issues, data quality assurance, and the need for advanced analytics tools and skilled
professionals.
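
The split-then-combine idea behind MapReduce-style processing, as used by tools like Hadoop, can be illustrated on a single machine. This is only a sketch: the chunks are hypothetical, everything runs in one process, and no actual Hadoop or Spark API is used.

```python
from collections import Counter
from functools import reduce

# Hypothetical data split into chunks, standing in for blocks of a
# dataset spread across several machines.
chunks = [
    ["spark", "hadoop", "spark"],
    ["nosql", "hadoop"],
    ["spark"],
]

def map_chunk(chunk):
    """Map step: count the words within one chunk independently."""
    return Counter(chunk)

def merge_counts(left, right):
    """Reduce step: combine two partial counts into one."""
    return left + right

partial_counts = [map_chunk(c) for c in chunks]
totals = reduce(merge_counts, partial_counts, Counter())

print(totals.most_common())  # [('spark', 3), ('hadoop', 2), ('nosql', 1)]
```

Because each chunk is counted independently, the map step could run on separate machines, with only the small partial counts sent back to be merged.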
Introduction to Python

• Python is a popular programming language. It was created by Guido van Rossum and first
released in 1991.

• It is used for:
✔ web development (server-side),
✔ software development,
✔ mathematics,
✔ system scripting.

What can Python do?

• Python can be used on a server to create web applications.
• Python can be used alongside software to create workflows.
• Python can connect to database systems. It can also read and modify files.
• Python can be used to handle big data and perform complex mathematics.
• Python can be used for rapid prototyping, or for production-ready software development.
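
Reading and modifying files, mentioned in the list above, can be sketched with the built-in open function. The file name and contents are hypothetical, and a temporary directory is used so nothing is left on disk.

```python
import os
import tempfile

# Work in a temporary directory; the file name is hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.txt")

    # Create the file.
    with open(path, "w", encoding="utf-8") as f:
        f.write("first line\n")

    # Modify it by appending.
    with open(path, "a", encoding="utf-8") as f:
        f.write("second line\n")

    # Read it back.
    with open(path, encoding="utf-8") as f:
        lines = f.read().splitlines()

print(lines)  # ['first line', 'second line']
```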

Why Python?

✔ Python works on different platforms (Windows, Mac, Linux, Raspberry Pi, etc.).
✔ Python has a simple syntax similar to the English language.
✔ Python has syntax that allows developers to write programs with fewer lines than some other
programming languages.
✔ Python runs on an interpreter system, meaning that code can be executed as soon as it is
written. This means that prototyping can be very quick.
✔ Python code can be written in a procedural, object-oriented, or functional style.
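
The three styles named in the last point can be compared on one small task, summing the squares of a list. The class name and variable names below are hypothetical and chosen only for illustration.

```python
from functools import reduce

numbers = [1, 2, 3, 4]

# Procedural: a loop that updates a running total.
total_procedural = 0
for n in numbers:
    total_procedural += n * n

# Object-oriented: state and behavior bundled in a class.
class SquareSummer:
    def __init__(self):
        self.total = 0

    def add(self, n):
        self.total += n * n

summer = SquareSummer()
for n in numbers:
    summer.add(n)

# Functional: compose map and reduce without mutating state.
squares = map(lambda n: n * n, numbers)
total_functional = reduce(lambda acc, sq: acc + sq, squares, 0)

print(total_procedural, summer.total, total_functional)  # 30 30 30
```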

Python IDEs

• An IDE (integrated development environment) combines the tools needed to write a computer
program in one application.

• IDEs increase programmer productivity by providing features such as source-code editing,
building executables, and debugging.

• IDEs and editors commonly used for Python:
✔ IDLE
✔ PyCharm
✔ Visual Studio Code
✔ Jupyter
✔ Spyder
✔ PyDev
✔ Online Editors

Python Variables

Variables
Variables are containers for storing data values.

Creating Variables
A variable is created the moment you first assign a value to it.
Variables do not need to be declared with any particular type, and can even change type after
they have been set.

Example
x = 5
y = "John"
print(x)
print(y)
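
The statement that a variable can change type after it has been set can be checked directly. In this short sketch the same name is first bound to an integer and then to a string.

```python
x = 5
print(type(x).__name__)  # int

# Rebinding the same name to a value of a different type is allowed.
x = "five"
print(type(x).__name__)  # str
```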
Python Operators

• Operators are used to perform operations on variables and values.
• In the example below, the + operator adds two values:

Example
print(10 + 5)

Python divides the operators in the following groups:

✔ Arithmetic operators
✔ Assignment operators
✔ Comparison operators
✔ Logical operators
✔ Identity operators
✔ Membership operators
✔ Bitwise operators
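
Each operator group listed above can be seen in a short sketch. The variable names and values are arbitrary and chosen only for illustration.

```python
a, b = 10, 3

# Arithmetic operators
print(a + b, a - b, a * b)   # 13 7 30
print(a / b, a // b, a % b)  # true division, floor division, remainder
print(a ** 2)                # 100

# Assignment operators
c = a
c += 5                       # c is now 15

# Comparison operators
print(a > b, a == b)         # True False

# Logical operators
print(a > 5 and b > 5)       # False

# Identity operators: both names refer to the same list object
nums = [1, 2, 3]
same = nums
print(same is nums)          # True

# Membership operators
print(b in nums)             # True

# Bitwise operators
print(a & b, a | b, a << 1)  # 2 11 20
```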

Thank You
