LO2b) - Data Scientist Vs Data Engineer New

Uploaded by

Ali Azgar Katha
Copyright
© All Rights Reserved

LO 2

Data Scientist vs Data Engineer

Artificial Intelligence and Data Analytics


Objective
After attending this session, you should be able to describe:
• Tasks of a data scientist
• Responsibilities of a data scientist
• Tasks of a data engineer
• Responsibilities of a data engineer
• The difference between a data scientist and a data engineer

Data Scientist
• Data science is a field that extracts knowledge and insights from raw data. To do
so, it draws on mathematics, statistics, computer science, and programming.
• A person who has all these skills is known as a data scientist. A data scientist is
curious, self-driven, and passionate about finding answers.
• Most data scientists in the industry have advanced training in statistics, math, and
computer science.
• Their experience typically also extends to data visualization, data mining,
and information management.
• The primary job of a data scientist is to ask the right questions. It is about surfacing hidden
insights that help companies make smarter business decisions.

• The job of a data scientist is not bound to a particular domain.
• Apart from scientific research, data scientists work in domains
including shipping, healthcare, e-commerce, aviation, finance, and education.
• They start by understanding the business problem, then proceed with collecting
the data, reading it, transforming it into the required format, visualizing,
modeling, evaluating the model, and finally deployment.
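
The workflow steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a prescribed method: the data set (advertising spend vs. sales), the function names, and the choice of a least-squares line as the model are all assumptions made for the example, and the visualization step is left out to keep it dependency-free.

```python
import csv
import io

# Hypothetical raw data (values invented for illustration): spend vs. sales.
RAW_CSV = """spend,sales
1,3
2,5
3,7
4,9
5,11
"""

def collect(raw_text):
    """Data collection / reading: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(raw_text)))

def transform(rows):
    """Transformation: convert string fields into numeric (x, y) pairs."""
    return [(float(r["spend"]), float(r["sales"])) for r in rows]

def fit_line(pairs):
    """Modeling: ordinary least-squares fit of y = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def evaluate(pairs, slope, intercept):
    """Evaluation: mean absolute error of the fitted line on held-out pairs."""
    return sum(abs(y - (slope * x + intercept)) for x, y in pairs) / len(pairs)

def predict(x, slope, intercept):
    """Deployment stand-in: the function a downstream system would call."""
    return slope * x + intercept

pairs = transform(collect(RAW_CSV))
train, test = pairs[:4], pairs[4:]   # hold out the last row for evaluation
slope, intercept = fit_line(train)
mae = evaluate(test, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, test MAE={mae:.2f}")
```

In practice each step would use dedicated tooling (for example a plotting library for visualization), but the order of the stages is the same.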

Responsibilities of Data Scientists
● Get data that has passed a first round of cleaning and manipulation,
which they feed into sophisticated analytics programs and
machine learning and statistical methods to prepare it for use in
predictive and prescriptive modelling.
○ To build models, they need to research industry and business questions, and
they need to leverage large volumes of data from internal and external sources
to answer business needs. This sometimes also involves exploring and examining
data to find hidden patterns.
● Need to present a clear story to the key stakeholders. Once the
results are accepted, they need to make sure that the work is
automated so that the insights can be delivered to the business
stakeholders on a daily, monthly, or yearly basis.
Source: https://www.youtube.com/watch?v=qrhRfPY4F4w
Responsibilities of Data Engineers
● Develop, construct, test, and maintain architectures, such as databases and
large-scale processing systems.
○ A data scientist, on the other hand, is someone who cleans, massages, and
organizes (big) data.
● Deal with raw data that contains human, machine, or instrument errors.
○ The data might not be validated and may contain suspect records; it may be
unformatted and can contain codes that are system-specific.
● Need to recommend and sometimes implement ways to improve data reliability,
efficiency, and quality.
● Need to ensure that the architecture in place supports the requirements of
the data scientists, the stakeholders, and the business.
● Need to develop data set processes for data modeling, mining, and production.
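
As a rough illustration of what dealing with raw, unvalidated data can look like, the sketch below filters a small batch of records. Everything specific here is hypothetical: the field names, the `N/A` and `ERR` missing-value codes, the `-999` sentinel, and the accepted value range are invented for the example and do not come from any particular system.

```python
from datetime import datetime

# Hypothetical raw export from a sensor system (all values invented).
RAW_RECORDS = [
    {"id": "1", "reading": "20.5", "taken": "2024-01-03"},
    {"id": "2", "reading": "N/A", "taken": "2024-01-04"},    # system-specific code
    {"id": "3", "reading": "-999", "taken": "2024-01-05"},   # instrument sentinel
    {"id": "4", "reading": "21.1", "taken": "05/01/2024"},   # unexpected date format
    {"id": "5", "reading": " 19.8 ", "taken": "2024-01-07"}, # stray whitespace
]

MISSING_CODES = {"N/A", "ERR", ""}   # assumed codes meaning "no value"
VALID_RANGE = (-50.0, 60.0)          # assumed plausible range for a reading

def clean_record(rec):
    """Return (cleaned_record, None) on success or (None, reason) on rejection."""
    raw = rec["reading"].strip()
    if raw in MISSING_CODES:
        return None, "missing reading"
    try:
        value = float(raw)
    except ValueError:
        return None, "non-numeric reading"
    if not VALID_RANGE[0] <= value <= VALID_RANGE[1]:
        return None, "reading out of range"
    try:
        taken = datetime.strptime(rec["taken"], "%Y-%m-%d").date()
    except ValueError:
        return None, "bad date format"
    return {"id": int(rec["id"]), "reading": value, "taken": taken}, None

clean, rejected = [], []
for rec in RAW_RECORDS:
    cleaned, reason = clean_record(rec)
    if cleaned is not None:
        clean.append(cleaned)
    else:
        rejected.append((rec["id"], reason))

print("kept:", [r["id"] for r in clean])
print("rejected:", rejected)
```

Keeping the rejected records together with a reason, rather than silently dropping them, is one way to support the data reliability and quality goals listed above.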

Source: https://www.youtube.com/watch?v=LgSHaOvNodA&feature=emb_title
Teamwork makes the dream work
● Both parties need to work together to wrangle the data
and provide insights to the business.
● There is an overlap in skill sets, but the two roles are gradually becoming more
distinct in the industry. Generally:
○ The data engineer works with database systems, data APIs, and ETL tools,
and is involved in data modeling and setting up data warehouse
solutions.
○ The data scientist needs to know statistics, math, and machine learning to build
predictive models.
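
To make the ETL (extract, transform, load) side of the data engineer's work concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for a data warehouse. The `orders` table, its columns, and the sample rows are hypothetical and chosen only for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical source export (values invented for illustration).
SOURCE_CSV = """order_id,region,amount
1,north,100.0
2,south,250.5
3,north,75.25
"""

def extract(text):
    """Extract: read rows from the source system (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: cast types and normalize the region code to upper case."""
    return [(int(r["order_id"]), r["region"].upper(), float(r["amount"]))
            for r in rows]

def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")   # in-memory database, nothing is written to disk
load(conn, transform(extract(SOURCE_CSV)))

# The data scientist can now query clean, typed data instead of raw files.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
print(totals)
```

The division of labor mirrors the slide: the engineer owns `extract`, `transform`, and `load`, while the scientist starts from the query result.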

Languages, Tools & Software
● Data engineers often work with tools such as SAP, Oracle,
Cassandra, MySQL, Redis, Riak, PostgreSQL, MongoDB, Neo4j,
Hive, and Sqoop.
● Data scientists use languages and packages such as SPSS, R, Python,
SAS, Stata, and Julia to build models. The most popular of these are
Python and R.
○ SAS and SPSS are also common, and other tools such as Tableau, RapidMiner, MATLAB,
Excel, and Gephi also find their way into the data scientist's toolbox.

https://app.datacamp.com/learn/

Quiz
Arrange the items below in order.

Difference of AI, ML, DL, DS
• Artificial intelligence (AI): enables machines to make decisions without human
intervention (e.g., a self-driving car).
• Machine learning (ML): provides statistical tools to explore and analyze data
(supervised, unsupervised, and semi-supervised learning).
• Deep learning (DL): mimics the human brain using multi-layer neural networks
(ANN for numeric data, CNN for images and video, RNN for time-series data).
• Data science (DS): applies all of the above techniques, together with statistics,
probability, and algebra.
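
The difference between supervised and unsupervised learning mentioned above can be shown with a toy example. This is a sketch under stated assumptions: the one-dimensional data is invented, and the two techniques (a nearest-neighbor classifier and a two-cluster k-means) were picked for brevity, not because the slides prescribe them.

```python
# Hypothetical 1-D measurements with labels (invented for illustration).
labeled = [(1.0, "small"), (1.5, "small"), (8.0, "large"), (9.0, "large")]

def nearest_neighbor(x, examples):
    """Supervised: predict the label of the closest labeled example."""
    return min(examples, key=lambda ex: abs(ex[0] - x))[1]

def two_means(values, iterations=10):
    """Unsupervised: split unlabeled values into two clusters (1-D k-means, k=2).

    Assumes the input holds at least two distinct values.
    """
    c_low, c_high = min(values), max(values)   # initial cluster centers
    for _ in range(iterations):
        low = [v for v in values if abs(v - c_low) <= abs(v - c_high)]
        high = [v for v in values if abs(v - c_low) > abs(v - c_high)]
        c_low, c_high = sum(low) / len(low), sum(high) / len(high)
    return sorted(low), sorted(high)

# Supervised: labels are given, the model learns to reproduce them.
print(nearest_neighbor(2.0, labeled))
print(nearest_neighbor(7.0, labeled))

# Unsupervised: no labels, the algorithm finds the grouping on its own.
unlabeled = [1.0, 1.5, 8.0, 9.0, 2.0, 7.5]
print(two_means(unlabeled))
```

Semi-supervised learning sits between the two: a small labeled set such as `labeled` is combined with a larger unlabeled set such as `unlabeled`.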

Bibtex
• BibTeX is both a tool and a file format, used to describe
and process lists of references, mostly in conjunction with LaTeX
documents.

• CSV is semi-structured data.

Data Scientists vs Data Analysts
• Data scientists use a combination of mathematical, statistical, and machine
learning techniques to clean, process, and interpret data and extract insights from
it. They design advanced data modeling processes using prototypes, ML
algorithms, predictive models, and custom analysis.
• Data analysts examine data sets to identify trends and draw conclusions. They
collect large volumes of data, organize it, and analyze it to identify
relevant patterns. After the analysis is done, they present their
findings through data visualization methods such as charts and graphs.
• Data analysts translate complex insights into business language that
both technical and non-technical members of an organization can understand.

Summary

• Tasks and responsibilities of a data scientist
• Tasks and responsibilities of a data engineer
• Difference between a data scientist and a data engineer

Himanshu Patel, Instructor
Saskatchewan Polytechnic
email: [email protected]
Faculty office, Mining building, Saskatoon
