Data Science
Statistics
• Statistics is a collection of tools and methods that you can use to derive
meaningful insights from data by performing mathematical computations on it.
Statistics in Data Science
• “Data Scientist is a person who is better at statistics than any programmer and
better at programming than any statistician.”
Population & Sample
• A set of all individuals relevant to a particular statistical question is called a
population.
• A smaller group selected from a population is called a sample.
Parameter & Statistic
• A parameter is a descriptive measure of a population, and a statistic is a
descriptive measure of a sample.
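The distinction can be illustrated with a minimal sketch. The population below (1,000 made-up exam scores) and the sample size of 50 are assumptions chosen only for illustration; the mean stands in for any descriptive measure.

```python
import random
import statistics

# Hypothetical population: exam scores of all 1,000 students in a school.
rng = random.Random(42)
population = [rng.randint(40, 100) for _ in range(1000)]

# Parameter: a descriptive measure computed from the whole population.
population_mean = statistics.mean(population)

# Statistic: the same measure computed from a sample of that population.
sample = rng.sample(population, k=50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

In practice the parameter is usually unknown, and the statistic is used to estimate it.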
Sampling
Sampling Methods
• Probability sampling or Random Sampling
• Non-probability sampling
Probability sampling
• Simple Random Sampling
• Stratified sampling
• Cluster Sampling
• Systematic sampling
• Multi-stage Sampling
Simple Random Sampling
• Every unit of the population has an equal chance of being selected
• Sampling without replacement
• Sampling with replacement
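A small sketch of the two variants, using Python's standard `random` module. The population of unit IDs 1 to 20 and the sample size of 5 are assumptions made for the example.

```python
import random

rng = random.Random(0)
population = list(range(1, 21))  # hypothetical unit IDs 1..20

# Without replacement: a selected unit is not returned to the pool,
# so each unit can appear at most once in the sample.
without_replacement = rng.sample(population, k=5)

# With replacement: every draw is made from the full population,
# so the same unit may be selected more than once.
with_replacement = rng.choices(population, k=5)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```

`random.sample` never repeats a unit, while `random.choices` can; in both cases each unit has the same chance of being picked on any given draw.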
• Sampling Error
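The slide names sampling error without defining it. The sketch below assumes the usual definition: the difference between a sample statistic and the corresponding population parameter. The helper `sampling_error` and the synthetic population are hypothetical and serve only as an illustration.

```python
import random
import statistics

def sampling_error(population, n, rng):
    """Return (sample mean - population mean) for one simple random sample of size n."""
    sample = rng.sample(population, k=n)  # drawn without replacement
    return statistics.mean(sample) - statistics.mean(population)

rng = random.Random(1)
# Hypothetical population of 10,000 measurements.
population = [rng.gauss(50, 10) for _ in range(10_000)]

for n in (10, 100, 1000):
    print(n, round(sampling_error(population, n, rng), 3))
```

The error varies from sample to sample and tends to be smaller for larger samples; it is zero when the sample is the entire population.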
Stratified Sampling
Image Source - https://en.wikipedia.org/wiki/Stratified_sampling
• The population is divided into subgroups (strata) based on a common factor,
and a random sample is drawn from each subgroup.
• Use it when the sample must include some members from every subgroup.
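The procedure described above can be sketched as follows. The helper `stratified_sample`, the student records, and the 10% sampling fraction are all hypothetical; drawing the same fraction from every stratum is one common choice, not the only one.

```python
import random
from collections import defaultdict

def stratified_sample(units, key, fraction, rng):
    """Group units into strata by key(unit), then draw a simple
    random sample (without replacement) from each stratum."""
    strata = defaultdict(list)
    for unit in units:
        strata[key(unit)].append(unit)

    sample = []
    for members in strata.values():
        # Take at least one unit so that every stratum is represented.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

rng = random.Random(7)
# Hypothetical population: 200 students spread evenly over 4 years of study.
students = [{"id": i, "year": 1 + i % 4} for i in range(200)]

sample = stratified_sample(students, key=lambda s: s["year"], fraction=0.1, rng=rng)
print(len(sample), sorted({s["year"] for s in sample}))
```

Here the common factor is the year of study, so the sample contains students from every year.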
Examples - Stratified Sampling
• Music taste of students
• Average Salary
• Average package offered
Cluster Sampling
Image Source - https://en.wikipedia.org/wiki/Cluster_sampling
• Single Stage Cluster Sampling
• Two Stage Cluster Sampling
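A minimal sketch of both variants, assuming the common textbook reading: in single-stage cluster sampling, whole clusters are selected at random and every unit in them is surveyed; in two-stage cluster sampling, a random sample of units is then drawn within each selected cluster. The helper `cluster_sample` and the school data are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, rng, units_per_cluster=None):
    """Randomly choose n_clusters clusters.

    Single stage (units_per_cluster is None): keep every unit of each chosen cluster.
    Two stage: draw a simple random sample of units within each chosen cluster.
    """
    chosen = rng.sample(sorted(clusters), k=n_clusters)
    sample = []
    for name in chosen:
        members = clusters[name]
        if units_per_cluster is None:
            sample.extend(members)
        else:
            k = min(units_per_cluster, len(members))
            sample.extend(rng.sample(members, k))
    return chosen, sample

rng = random.Random(3)
# Hypothetical population: 6 schools (clusters) with 30 students each.
clusters = {
    f"school_{c}": [f"school_{c}_student_{i}" for i in range(30)]
    for c in "ABCDEF"
}

chosen_1, single_stage = cluster_sample(clusters, 2, rng)
chosen_2, two_stage = cluster_sample(clusters, 2, rng, units_per_cluster=5)
print(chosen_1, len(single_stage))
print(chosen_2, len(two_stage))
```

Unlike stratified sampling, only the selected clusters contribute units to the sample.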
Systematic Sampling
Image Source - https://stepupanalytics.com/wp-content/uploads/2018/08/5-3-300x190.jpg
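The slide gives only an image for this method. The sketch below assumes the common textbook form of systematic sampling: order the units, pick a random starting point within the first interval, and then take every k-th unit. The helper `systematic_sample` and the numbered units are hypothetical.

```python
import random

def systematic_sample(units, n, rng):
    """Select n units at a fixed interval k, starting from a random offset."""
    if not 0 < n <= len(units):
        raise ValueError("n must be between 1 and the number of units")
    k = len(units) // n        # sampling interval
    start = rng.randrange(k)   # random start within the first interval
    return [units[start + i * k] for i in range(n)]

rng = random.Random(5)
units = list(range(1, 101))    # hypothetical ordered list of 100 units
sample = systematic_sample(units, 10, rng)
print(sample)
```

With 100 units and a sample of 10, the interval is 10, so consecutive selected units are always 10 positions apart.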