
1 Statistics

This document discusses statistics and sampling methods in data science. It defines statistics as tools for deriving insights from data through mathematical computations. A population is defined as all individuals relevant to a statistical question, while a sample is a smaller subset of a population. Probability sampling methods covered include simple random sampling, stratified sampling, cluster sampling, systematic sampling, and multi-stage sampling. Simple random sampling involves randomly selecting units from a population with equal probability, while stratified sampling divides a population into subgroups and randomly samples from each.

Uploaded by

Vishal Shivhare

Data Science

Statistics
• Statistics is a collection of tools and methods that you can use to derive
meaningful insights from data by performing mathematical computations on it.
Statistics in Data Science
• “Data Scientist is a person who is better at statistics than any programmer and
better at programming than any statistician.”
Population & Sample
• A set of all individuals relevant to a particular statistical question is called a
population.
• A smaller group selected from a population is called a sample.
Parameter & Statistic
• A parameter is a descriptive measure of a population; a statistic is a
descriptive measure of a sample.
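
The distinction can be illustrated with a small sketch. The population below is synthetic (hypothetical heights in cm, generated only for this example), and the mean is just one possible descriptive measure.

```python
import random
import statistics

# Hypothetical population: heights (cm) of 1,000 individuals, generated
# synthetically for illustration only.
rng = random.Random(42)
population = [rng.gauss(170, 10) for _ in range(1000)]

# Parameter: a descriptive measure computed on the whole population.
population_mean = statistics.mean(population)

# Statistic: the same measure computed on a sample drawn from that population.
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

In practice the parameter is usually unknown, and the statistic is used to estimate it.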
Sampling
Sampling Methods
• Probability sampling or Random Sampling
• Non-probability sampling
Probability sampling
• Simple Random Sampling
• Stratified sampling
• Cluster Sampling
• Systematic sampling
• Multi stage Sampling
Simple Random Sampling
• Every unit of the population has an equal chance of being selected
Simple Random Sampling
• Sampling without replacement
• Sampling with replacement
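
A minimal sketch of the two variants, using Python's standard `random` module. The population of unit IDs 1 to 10 and the sample size of 5 are assumptions made for the example.

```python
import random

rng = random.Random(0)
population = list(range(1, 11))  # hypothetical unit IDs 1..10

# Without replacement: a selected unit is not returned to the population,
# so each unit can appear at most once in the sample.
without_replacement = rng.sample(population, k=5)

# With replacement: a selected unit is "put back" before the next draw,
# so the same unit may be selected more than once.
with_replacement = rng.choices(population, k=5)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```

In both cases every unit has the same chance of being picked on a given draw, which is what makes the sample a simple random sample.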
Simple Random Sampling
• Sampling Error
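
Sampling error is commonly taken to be the difference between a sample statistic and the corresponding population parameter. The sketch below assumes a synthetic population of salaries (hypothetical values) and uses the mean as the measure; the helper name `sampling_error` is illustrative only.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population of 5,000 salaries, generated for illustration only.
population = [rng.uniform(20_000, 120_000) for _ in range(5_000)]
population_mean = statistics.mean(population)


def sampling_error(sample_size):
    """Return (sample mean - population mean) for one simple random sample."""
    sample = rng.sample(population, sample_size)
    return statistics.mean(sample) - population_mean


# The error varies from sample to sample and tends to shrink as the
# sample size grows.
for n in (10, 100, 1000):
    print(n, round(sampling_error(n), 2))
```

A sample that contains the entire population has no sampling error, since the statistic and the parameter are then the same quantity.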
Stratified Sampling

Image Source - https://en.wikipedia.org/wiki/Stratified_sampling


Stratified Sampling
• The population is divided into subgroups (strata) based on a common factor,
and samples are randomly collected from each subgroup.
• It should be used when the sample must include some members from every group.
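
One way to sketch this in Python is shown below. The student records, the `year` field used as the common factor, and the 10% sampling fraction are all assumptions for the example; `stratified_sample` is a hypothetical helper, not a library function.

```python
import random
from collections import defaultdict


def stratified_sample(units, key, fraction, rng):
    """Group units into strata by key(unit), then draw a simple random
    sample (without replacement) of the given fraction from each stratum."""
    strata = defaultdict(list)
    for unit in units:
        strata[key(unit)].append(unit)

    sample = []
    for members in strata.values():
        # Take at least one unit so that every stratum is represented.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


rng = random.Random(7)
# Hypothetical population: 200 students spread evenly over 4 study years.
students = [{"id": i, "year": 1 + i % 4} for i in range(200)]

sample = stratified_sample(students, key=lambda s: s["year"], fraction=0.1, rng=rng)
print(len(sample), sorted({s["year"] for s in sample}))
```

Sampling the same fraction from each stratum keeps the sample proportional to the population; other allocation rules are possible.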
Examples - Stratified Sampling
• Music taste of students
• Average Salary
• Average package offered
Cluster Sampling

Image Source - https://en.wikipedia.org/wiki/Cluster_sampling


Cluster Sampling
• Single Stage Cluster Sampling
• Two Stage Cluster Sampling
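
The two variants can be sketched as follows, under the usual definitions: single-stage sampling keeps every unit of the randomly chosen clusters, while two-stage sampling draws a further random sample inside each chosen cluster. The cluster layout (10 clusters of 30 units) and the function names are assumptions for the example.

```python
import random

rng = random.Random(3)

# Hypothetical population: 10 clusters (for example, schools) of 30 units each.
clusters = {c: [f"c{c}-u{u}" for u in range(30)] for c in range(10)}


def single_stage(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep every unit in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for c in chosen for unit in clusters[c]]


def two_stage(clusters, n_clusters, n_per_cluster, rng):
    """Randomly pick clusters, then randomly sample units within each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for c in chosen for unit in rng.sample(clusters[c], n_per_cluster)]


one = single_stage(clusters, 3, rng)
two = two_stage(clusters, 3, 5, rng)
print(len(one), len(two))
```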
Systematic Sampling

Image Source - https://stepupanalytics.com/wp-content/uploads/2018/08/5-3-300x190.jpg
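
The slide shows this method only as a figure. A common formulation, assumed here, picks a random starting unit and then takes every k-th unit from an ordered list, where k is the population size divided by the sample size. The helper `systematic_sample` and the population of 100 numbered units are hypothetical.

```python
import random


def systematic_sample(units, n, rng):
    """Take every k-th unit after a random start, with k = len(units) // n."""
    if not 0 < n <= len(units):
        raise ValueError("sample size must be between 1 and the population size")
    k = len(units) // n        # sampling interval
    start = rng.randrange(k)   # random starting position in [0, k)
    return units[start::k][:n]


rng = random.Random(5)
population = list(range(100))  # hypothetical ordered list of unit IDs

sample = systematic_sample(population, 10, rng)
print(sample)
```

Only the starting point is random; once it is fixed, the rest of the sample is determined by the interval.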


