Objective
After attending this session, you should be able to describe:
• Tasks of a data scientist
• Responsibilities of a data scientist
• Tasks of a data engineer
• Responsibilities of a data engineer
• The differences between a data scientist and a data engineer
Artificial Intelligence and Data Analytics
Data Scientist
• Data science is a field that extracts knowledge and insights from raw data. To do so, it draws on mathematics, statistics, computer science, and programming.
• A person who has all these skills is known as a data scientist. A data scientist is curious, self-driven, and passionate about finding answers.
• Most data scientists in the industry have advanced training in statistics, math, and computer science.
• Their experience also extends to data visualization, data mining, and information management.
• The primary job of a data scientist is to ask the right questions and surface hidden insights that help companies make smarter business decisions.
• The job of a data scientist is not bound to a particular domain.
• Apart from scientific research, data scientists work in domains including shipping, healthcare, e-commerce, aviation, finance, and education.
• They start by understanding the business problem, then proceed through data collection, reading the data, transforming it into the required format, visualization, modeling, model evaluation, and deployment.
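The workflow steps listed above (collect, transform, model, evaluate) can be sketched on a toy scale as follows. This is only an illustrative sketch: the data values, the function names, and the choice of a least-squares line as the "model" are all assumptions made for the example, not part of the course material.

```python
# Toy workflow: collect -> transform -> model -> evaluate.
# All data and function names are hypothetical, for illustration only.

def collect():
    # Stand-in for data collection: (spend, sales) pairs arriving as raw strings.
    return [("1", "2.1"), ("2", "3.9"), ("3", "6.2"), ("4", "7.8"), ("5", "10.1")]

def transform(rows):
    # Convert raw strings into numeric features (xs) and targets (ys).
    xs = [float(x) for x, _ in rows]
    ys = [float(y) for _, y in rows]
    return xs, ys

def fit_line(xs, ys):
    # Ordinary least squares for y = slope * x + intercept.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def evaluate(xs, ys, slope, intercept):
    # Mean squared error of the fitted line. For brevity it is measured on
    # the same data used for fitting; a real project would hold data out.
    errors = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)

xs, ys = transform(collect())
slope, intercept = fit_line(xs, ys)
mse = evaluate(xs, ys, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, mse={mse:.4f}")
```

Deployment is left out of the sketch; in practice it would mean packaging `fit_line` and its fitted parameters behind whatever interface the business uses.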
Responsibilities of Data Scientists
● Receive data that has passed a first round of cleaning and manipulation, feed it to sophisticated analytics programs, machine learning, and statistical methods, and prepare it for use in predictive and prescriptive modelling.
○ To build models, they need to research industry and business questions, and they need to leverage large volumes of data from internal and external sources to answer business needs. This sometimes also involves exploring and examining data to find hidden patterns.
● Present a clear story to the key stakeholders. Once the results are accepted, make sure the work is automated so that the insights can be delivered to business stakeholders on a daily, monthly, or yearly basis.
source: https://www.youtube.com/watch?v=qrhRfPY4F4w
Responsibilities of Data Engineers
● Develop, construct, test, and maintain architectures, such as databases and large-scale processing systems.
○ The data scientist, on the other hand, is someone who cleans, massages, and organizes (big) data.
● Deal with raw data that contains human, machine, or instrument errors.
○ The data might not be validated and may contain suspect records; it will be unformatted and can contain system-specific codes.
● Recommend, and sometimes implement, ways to improve data reliability, efficiency, and quality.
● Ensure that the architecture in place supports the requirements of the data scientists, the stakeholders, and the business.
● Develop data set processes for data modeling, mining, and production.
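The point about raw data that is unvalidated and contains suspect records can be made concrete with a small validation pass. This is a minimal sketch under stated assumptions: the record layout (`sensor_id`, `reading`), the plausible range of -50 to 60, and the function name `clean_records` are all hypothetical.

```python
def clean_records(raw_records):
    """Split raw records into validated rows and rejected rows."""
    valid, rejected = [], []
    for record in raw_records:
        sensor = (record.get("sensor_id") or "").strip()
        try:
            value = float(record.get("reading"))
        except (TypeError, ValueError):
            # Missing or non-numeric reading, e.g. None or "N/A".
            rejected.append(record)
            continue
        # Assumed rule: a reading outside [-50, 60] is an instrument error.
        if not sensor or not (-50.0 <= value <= 60.0):
            rejected.append(record)
            continue
        # Normalize the format so downstream users see consistent rows.
        valid.append({"sensor_id": sensor.upper(), "reading": value})
    return valid, rejected

# Hypothetical raw input with typical problems.
raw = [
    {"sensor_id": " a1 ", "reading": "21.5"},  # fine after trimming
    {"sensor_id": "a2", "reading": "N/A"},     # system-specific code
    {"sensor_id": "", "reading": "19.0"},      # missing identifier
    {"sensor_id": "a3", "reading": "999"},     # implausible value
    {"sensor_id": "a4", "reading": None},      # missing reading
]
valid, rejected = clean_records(raw)
print(len(valid), "valid,", len(rejected), "rejected")
```

Keeping the rejected rows instead of silently dropping them is a deliberate choice here: it lets the engineer report on data quality and trace errors back to their origin.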
source:https://www.youtube.com/watch?v=LgSHaOvNodA&feature=emb_title
Team work makes the dream work
● Both parties need to work together to wrangle the data and provide insights to the business.
● There is an overlap in skill sets, but the two roles are gradually becoming more distinct in the industry. Generally:
○ The data engineer works with database systems, data APIs, and ETL (extract, transform, load) tools, and is involved in data modeling and setting up data warehouse solutions.
○ The data scientist needs to know statistics, math, and machine learning to build predictive models.
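To illustrate the ETL work mentioned above, here is a small sketch using only the Python standard library. The CSV content, the `orders` table, and the function names are hypothetical, and an in-memory SQLite database stands in for a real data warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,5.00
3,alice,4.50
"""

def extract(text):
    # Extract: parse the raw CSV into dictionaries (all values are strings).
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    # Transform: cast types and normalize customer names.
    return [(int(r["order_id"]), r["customer"].title(), float(r["amount"]))
            for r in rows]

def load(conn, rows):
    # Load: write the cleaned rows into a table for analysts to query.
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))

# A data scientist or analyst can now query the loaded table.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)  # [('Alice', 15.0), ('Bob', 5.0)]
```

The split mirrors the division of labour described above: the engineer owns `extract`, `transform`, and `load`, while the final query is the kind of question a data scientist would ask of the result.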
Languages, Tools & Software
● Data engineers often work with tools such as SAP, Oracle, Cassandra, MySQL, Redis, Riak, PostgreSQL, MongoDB, neo4j, Hive, and Sqoop.
● Data scientists use languages and packages such as SPSS, R, Python, SAS, Stata, and Julia to build models. The most popular of these are Python and R.
○ Other tools such as Tableau, Rapidminer, Matlab, Excel, and Gephi also find their way into the data scientist's toolbox.
Difference between AI, ML, DL, and DS
• Artificial intelligence (AI): enables machines to make decisions without human intervention (for example, a self-driving car).
• Machine learning (ML): a subset of AI that provides statistical tools to explore and analyze data and learn from it (supervised, unsupervised, and semi-supervised learning).
• Deep learning (DL): a subset of ML that loosely mimics the human brain using multi-layer neural networks (ANN for numeric data, CNN for images and videos, RNN for time series data).
• Data science (DS): applies all of the above techniques, together with statistics, probability, and algebra.
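The difference between supervised and unsupervised learning named above can be shown with two very small functions. This is a sketch, not a recommended method: the data (hours studied and pass/fail labels) is made up, and the two techniques chosen here (a 1-nearest-neighbour classifier and a two-cluster k-means on one-dimensional data) are simply convenient examples of each category.

```python
def predict_1nn(train, x):
    # Supervised: labels are given; predict the label of the closest example.
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

def two_means(points, iterations=10):
    # Unsupervised: no labels; group points around two centroids.
    # Assumes the input contains at least two distinct values.
    c1, c2 = min(points), max(points)
    for _ in range(iterations):
        g1 = [p for p in points if abs(p - c1) <= abs(p - c2)]
        g2 = [p for p in points if abs(p - c1) > abs(p - c2)]
        c1 = sum(g1) / len(g1)
        c2 = sum(g2) / len(g2)
    return sorted(g1), sorted(g2)

# Hypothetical labeled data: (hours studied, exam result).
train = [(1.0, "fail"), (2.0, "fail"), (6.0, "pass"), (7.0, "pass")]
prediction = predict_1nn(train, 5.5)

# The same kind of measurements without labels.
hours = [1.0, 1.5, 2.0, 6.0, 6.5, 7.0]
low_group, high_group = two_means(hours)

print(prediction)             # pass
print(low_group, high_group)  # [1.0, 1.5, 2.0] [6.0, 6.5, 7.0]
```

In the supervised case the algorithm is told what "pass" and "fail" look like; in the unsupervised case it only discovers that the values fall into two groups, and a person still has to interpret what the groups mean.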
BibTeX
• BibTeX is both a tool and a file format used to describe and process lists of references, mostly in conjunction with LaTeX documents.
• CSV is semi-structured data.
Data Scientists vs Data Analysts
• Data scientists use a combination of mathematical, statistical, and machine learning techniques to clean, process, and interpret data and extract insights from it. They design advanced data modeling processes using prototypes, ML algorithms, predictive models, and custom analysis.
• Data analysts examine data sets to identify trends and draw conclusions. They collect large volumes of data, organize it, and analyze it to identify relevant patterns. After the analysis, they present their findings through data visualization methods such as charts and graphs.
• Data analysts translate complex insights into business language that both technical and non-technical members of an organization can understand.
Summary
• Tasks and responsibilities of a data scientist
• Tasks and responsibilities of a data engineer
• Differences between a data scientist and a data engineer
Mastering Data Science with Python: The Ultimate Guide: Unlock the Power of Data Analysis and Visualization with Python's Cutting-Edge Tools and Techniques