
What Is Data Science Module1


IBM DS0101EN

What is Data Science?

• Eng. Sherif Salem

12/01/2024 1
• Course Learning Objectives:
• Define data science and its importance in today’s data-driven world.
• Describe the various paths that can lead to a career in data science.
• Summarize advice given by seasoned data science professionals to data scientists who are just starting out.

• Course Modules:
• Define data science & What Data Scientists Do
• Data Science Topics
• Applications and Careers in Data Science
• Data Literacy for Data Science



Module I
Define data science & What Data Scientists Do
What is Data Science?

Understanding Data Science
• Data Science is a continuous process of utilizing data to gain insights.
• It involves validating hypotheses or models using available data.
• The goal is to uncover trends and insights hidden within datasets.
• Data is transformed into compelling narratives through storytelling.
• These insights drive strategic decision-making for organizations.
• It encompasses extracting and analyzing data in structured and unstructured
forms.



The Essence of Data Science
• Data science is the study of data, just as other sciences study their own subjects.
• It involves exploration, manipulation, and analysis of data to find answers.
• Today, data is abundant, algorithms are available, and tools are accessible.
• The affordability and accessibility of data and tools make data science
relevant.
• It's a time of unprecedented opportunity for those interested in data science.
• Data science thrives on curiosity, exploration, and leveraging available
resources.
Fundamentals of Data Science

Understanding Data Science
• Data science is the analysis of large quantities of data drawn from many sources.
• Typical sources include social media activity and sales records.
• Advancements in computing power enable meaningful analysis and new
discoveries.
• Data science aids organizations in understanding their environments and
uncovering opportunities.
• Data scientists investigate data to add value and insight to the organization's
knowledge.
• The process starts with clarifying the organization's question or problem.
The Data Science Process
• Data scientists identify the necessary data and its sources to solve the problem.
• They analyze structured and unstructured data from various sources using different
methods.
• Employing multiple models helps explore data, revealing patterns and outliers.
• Insights from data analysis sometimes confirm suspicions but can also lead to new
approaches.
• Data scientists play a crucial role as storytellers, communicating results effectively to
stakeholders.
• Powerful data visualization tools aid in conveying insights and recommending actions to
stakeholders.
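
One simple way to surface the outliers mentioned above is the interquartile-range (IQR) rule. The sketch below is an illustration only: the function name `find_outliers`, the conventional 1.5 multiplier, and the sample numbers are assumptions and do not come from the course material.

```python
import statistics


def find_outliers(values):
    """Return the values lying more than 1.5 * IQR outside the quartiles."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


# Made-up measurements for illustration only.
readings = [12, 14, 13, 15, 14, 13, 60]
print(find_outliers(readings))
```

A flagged value is only a candidate for investigation. Whether it is an error or a genuine finding still depends on the question being asked.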
The Many Paths to Data Science

Evolution of Data Science Careers
• Data science was not widely recognized as a distinct field until around 2009-2011.
• DJ Patil and Andrew Gelman are often credited with popularizing the term, although the phrase itself appears in earlier literature.
• Before data science, statistics was a prevalent field.
• Individuals often pursued business or other quantitative analysis disciplines.
• Exposure to data science often occurred during academic or professional
endeavors.
• The term "data science" gained prominence in various industries over time.



Personal Journeys into Data Science
• Many individuals stumbled into data science through academic or professional
paths.
• Backgrounds varied from engineering to business, economics, and analytics.
• Exposure to data science often occurred during higher education or internships.
• Practical applications in fields like transportation engineering introduced
individuals to data science.
• Gradual immersion in data analysis and modeling paved the way for careers in data
science.
• The journey into data science showcases diverse paths and backgrounds
converging into the field.
Advice for New Data Scientists

Essential Qualities of a Data Scientist
• Curiosity is fundamental for exploring and understanding complex data.
• Being judgmental, in the sense of holding initial opinions, helps in forming hypotheses and starting assumptions.
• Argumentativeness aids in advocating for a specific direction and learning
from data.
• Comfort and flexibility with analytics platforms are valuable secondary skills.
• The ability to take positions and modify assumptions based on data is crucial.
• Starting with a strong position and evolving through the learning process is
essential.



Career Development Strategies for Data Scientists
• Identify your competitive advantage and preferred industry focus.
• Tailor your analytical skills to match the needs of your chosen field.
• Acquire proficiency in industry-specific analytics platforms and tools.
• Apply your skills to real-world problems to demonstrate your capabilities.
• Develop storytelling abilities to effectively communicate insights and findings.
• Continuously refine and adapt your skills to stay relevant and competitive in
the field.

Lesson Summary: Defining Data Science

Understanding Data Science
• Data science studies data to understand the world around us.
• It uncovers insights and trends hidden within vast amounts of data.
• Recent advancements in computing power enable deeper analysis and new
knowledge.
• Data scientists play a crucial role in translating data into actionable insights.
• The process involves problem clarification, data collection, analysis, and
visualization.
• Curiosity, argumentation, and judgment are key traits for successful data
scientists.
Developing Skills and Career Paths
• Skilled data scientists possess versatile knowledge beyond statistics and
programming.
• They come from diverse backgrounds such as economics, engineering, or
medicine.
• Mastery of data analysis tools and techniques is essential for success.
• Specialization in a particular field enhances expertise and industry relevance.
• Certification may become necessary as companies prioritize qualified
candidates.
• Future data scientists will adapt to evolving technology and changing job roles
for successful business outcomes.
A Day in the Life of a Data Scientist

Real-Life Applications of Data Science
• Built a recommendation engine for a large organization, providing a simple yet efficient solution.
• Used artificial neural networks to predict algae blooms, helping water treatment companies.
• Analyzed complaints data for the Toronto Transit Commission (TTC), revealing a correlation with weather.



Problem-Solving with Data Science
• The Toronto Transit Commission (TTC) needed to analyze complaints data containing about half a million entries.
• The analysis found a correlation between extreme weather and days with many complaints.
• Environmental data from Environment Canada was used to uncover the relationship between weather and complaints.
• The top complaint days coincided with unexpected rain, extreme temperature drops, and windy conditions.
• The findings were presented to TTC executives, giving them insight into complaint patterns and the impact of weather.
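
To make the idea of a weather and complaints correlation concrete, here is a minimal sketch that computes a Pearson correlation coefficient. The numbers are entirely made up for illustration; they are not TTC or Environment Canada data, and the `pearson` helper is a hypothetical name rather than anything used in the original analysis.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up daily values: rainfall in millimetres and complaints received.
rainfall_mm = [0.0, 2.0, 5.0, 12.0, 20.0]
complaints = [110, 120, 150, 210, 300]

r = pearson(rainfall_mm, complaints)
print(f"correlation: {r:.3f}")
```

A coefficient close to 1 says the two series rise together. On its own it does not show that one causes the other.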
Data Science Skills & Big Data

Data Science Skills
• Data Analysis: Ability to analyze large datasets using statistical methods and machine learning
algorithms.
• Programming Skills: Proficiency in languages like Python, R, or SQL for data manipulation and
analysis.
• Data Visualization: Creating visual representations of data to communicate insights effectively.
• Domain Knowledge: Understanding of the specific industry or domain to interpret data in
context.
• Problem-Solving: Applying analytical skills to solve complex business problems using data-
driven approaches.
• Communication: Effectively communicating findings and insights to stakeholders through
reports and presentations.



Big Data
• Volume: Dealing with large volumes of data that traditional systems cannot handle
efficiently.
• Variety: Handling diverse data types such as structured, semi-structured, and
unstructured data.
• Velocity: Processing data streams in real-time or near real-time to derive timely
insights.
• Veracity: Ensuring data quality and reliability in a big data environment.
• Value: Extracting actionable insights and value from big data for business decisions.
• Tools & Technologies: Using platforms like Hadoop, Spark, and cloud services for
big data processing and analytics.
Q&A

Understanding Different Types of File Formats

Understanding File Formats
• Data professionals work with various file types and formats.
• It is important to understand each format's structure, benefits, and limitations.
• This understanding helps in choosing a format suited to the data and to performance requirements.
• Formats covered: delimited text, XLSX, XML, PDF, and JSON.
• Delimited text files: Rows with values separated by delimiters like comma or
tab.
• CSVs and TSVs are common in this category, suited for straightforward
information.
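
The delimited text formats described above can be read with Python's standard `csv` module. This is a small sketch under stated assumptions: the sample data, the column names, and the `read_delimited` helper are invented for illustration.

```python
import csv
import io

# Hypothetical sample data; the column names are made up for illustration.
CSV_TEXT = "city,temperature_c\nToronto,-5\nCairo,21\n"
TSV_TEXT = "city\ttemperature_c\nToronto\t-5\nCairo\t21\n"


def read_delimited(text, delimiter):
    """Parse delimited text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


csv_rows = read_delimited(CSV_TEXT, ",")   # comma-separated values
tsv_rows = read_delimited(TSV_TEXT, "\t")  # tab-separated values
print(csv_rows[0]["city"], csv_rows[0]["temperature_c"])
```

Only the delimiter differs between the two calls. Every value comes back as a string, so numeric columns need an explicit conversion before analysis.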



Overview of File Formats
• XLSX: Microsoft Excel Open XML format, organized into worksheets.
• XML: Markup language for encoding data, readable by humans and
machines.
• PDF: Developed by Adobe for presenting documents consistently across
devices.
• JSON: Text-based standard for transmitting structured data over the web.
• JSON is language-independent, easy to use, and widely compatible.
• Understanding popular file formats is crucial for effective data handling and
analysis.
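
As a rough illustration of how JSON and XML can carry the same structured record, the sketch below parses both with Python's standard library. The record layout and the tag and key names are assumptions made up for this example.

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical record, encoded once as JSON and once as XML.
JSON_TEXT = '{"course": "DS0101EN", "modules": 4}'
XML_TEXT = "<course><code>DS0101EN</code><modules>4</modules></course>"

# JSON maps directly onto Python dicts, lists, strings, and numbers.
record_from_json = json.loads(JSON_TEXT)

# XML is parsed into a tree of elements; all text arrives as strings.
root = ET.fromstring(XML_TEXT)
record_from_xml = {
    "course": root.findtext("code"),
    "modules": int(root.findtext("modules")),
}

print(record_from_json == record_from_xml)
```

JSON keeps the number as a number, while the XML text has to be converted explicitly. That difference is one practical reason to know a format's structure before choosing it.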
Data Science Topics and Algorithms

Key Concepts in Data Science
• Regression: A fundamental method for estimating how one variable changes as others change.
• Data Visualization: Essential for conveying messages effectively to diverse audiences.
• Artificial Neural Networks: Mimicking biological brain behavior for innovative
applications.
• Data Visualization with R: Utilizing R for powerful and insightful data representation.
• Nearest Neighbor: A simple yet effective algorithm that can sometimes outperform more complex ones.
• Structured vs. Unstructured Data: Tabular vs. non-tabular data formats and their
characteristics.
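
The nearest neighbor idea listed above can be sketched in a few lines: a new point takes the label of the closest known point. The function name, the two-feature points, and the labels below are made up for illustration, and this is the simplest one-neighbor variant rather than a tuned implementation.

```python
import math


def nearest_neighbor(train, query):
    """Return the label of the training point closest to `query` (1-NN)."""
    # Each training item is a (point, label) pair; math.dist is Euclidean distance.
    _point, label = min(train, key=lambda item: math.dist(item[0], query))
    return label


# Made-up training data: (distance_km, duration_min) pairs with labels.
train = [
    ((1.0, 1.0), "short trip"),
    ((1.5, 2.0), "short trip"),
    ((8.0, 9.0), "long trip"),
    ((9.0, 8.5), "long trip"),
]

print(nearest_neighbor(train, (2.0, 1.5)))
print(nearest_neighbor(train, (8.5, 8.0)))
```

No model is fitted in advance. All the work happens at query time, which is part of what makes the method easy to explain.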



Simplifying Regression
• Regression can be explained in simple terms with a cab ride analogy.
• The base fare is the constant element (the intercept) in the regression.
• The fare rises with the distance or time traveled, and regression estimates that relationship.
• Regression estimates the unknown constants, such as the base fare and the per-unit rates, from observed data.
• The analogy offers an intuitive understanding without delving into complex statistical distributions.
• This simplification helps in grasping the practical significance of regression in data analysis.
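
The cab ride analogy can be turned into a small worked example: given a few rides, ordinary least squares recovers the base fare and the per-kilometre rate. The ride data here is made up, generated from an assumed fare rule of 2.50 plus 0.50 per kilometre, and `fit_line` is a hypothetical helper name. Only distance is used; time traveled would enter as a second input in the same way.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up rides that follow fare = 2.50 + 0.50 * distance_km exactly.
distances_km = [1.0, 2.0, 4.0, 8.0]
fares = [3.0, 3.5, 4.5, 6.5]

base_fare, per_km = fit_line(distances_km, fares)
print(f"base fare: {base_fare:.2f}, per km: {per_km:.2f}")
```

The intercept plays the role of the base fare and the slope plays the role of the per-kilometre charge. With real, noisy fares the estimates would only approximate the true rule.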
Q&A

Thank you!
