
Sampling with or without Replacement

Last Updated : 09 Jun, 2025

Sampling is a technique used to select a subset of data points from a larger dataset or population in order to make inferences about it. It can be done in two ways: with replacement and without replacement. The choice determines whether the same item can appear more than once in a sample, which in turn affects statistical analysis and modeling.

[Figure: Demonstration of Sampling with and without Replacement]

What is Sampling with Replacement?

Sampling with replacement is the process where an item is selected from a population and then returned to the population before the next selection. This means that the same item can be chosen multiple times in the same sampling process. This method is used in techniques like bootstrap sampling.

Steps for Sampling with Replacement

  1. Select an item randomly from the population.
  2. Record the selected item.
  3. Replace the item into the population.
  4. Repeat the process until the desired sample size is achieved.
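
The four steps above can be sketched in plain Python. This is a minimal illustration, assuming the population fits in a list; the function name `sample_with_replacement` is hypothetical and not part of any library.

```python
import random

def sample_with_replacement(population, k, seed=None):
    """Draw k items from population; every draw sees the full population."""
    rng = random.Random(seed)  # local generator, global random state untouched
    sample = []
    for _ in range(k):
        item = rng.choice(population)  # step 1: select an item at random
        sample.append(item)            # step 2: record it
        # step 3: "replacing" the item simply means the population is unchanged
    return sample                      # step 4: the loop stops after k draws

population = ["A", "B", "C", "D"]
draws = sample_with_replacement(population, k=8, seed=0)
print(draws)
```

Because 8 items are drawn from only 4 candidates, at least one item must repeat, which is only possible when sampling with replacement.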

Different Techniques of Sampling with Replacement

Sampling with replacement can be implemented with several Python libraries. Two common options are:

  1. Using Numpy
  2. Using Pandas

Let's see how these are implemented:

1. Sampling with Replacement using Numpy

Explanation

  • a=10: An integer a is treated like np.arange(10), so you're sampling from the numbers 0 to 9.
  • size=10: You want 10 samples.
  • replace=True: Allows repeated selections.
Python
import numpy as np

np.random.seed(10)
sample = np.random.choice(a=10, size=10, replace=True)
print("NumPy Sample with Replacement:", sample)

Output
NumPy Sample with Replacement: [9 4 0 1 9 0 1 8 9 0]

2. Sampling with Replacement using Pandas

Explanation

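  • The DataFrame has six rows and four columns: ID, Age, Salary and Department.
  • sample(n=6, replace=True) randomly selects 6 rows with replacement.
  • So, the same row may appear multiple times in the result.
  • Because n equals the number of rows in the original, this creates a bootstrapped dataset (same size as the original but with repetition).
  • random_state=5 fixes the seed so the result is reproducible.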
Python
import pandas as pd

d = {
    'ID': [1, 2, 3, 4, 5, 6],
    'Age': [23, 31, 45, 22, 35, 29],
    'Salary': [50000, 62000, 80000, 45000, 70000, 58000],
    'Department': ['HR', 'IT', 'Finance', 'HR', 'IT', 'Finance'],
}
df = pd.DataFrame(d)

# Sample 6 rows with replacement
sample_df = df.sample(n=6, replace=True, random_state=5)
print("\nPandas Sample with Replacement:")
print(sample_df)

Output
Pandas Sample with Replacement:
   ID  Age  Salary Department
3   4   22   45000         HR
5   6   29   58000    Finance
0   1   23   50000         HR
1   2   31   62000         IT
0   1   23   5000...

In the sample, the row with index 0 (ID 1) appears twice. A repeated row is only possible because each row is returned to the population after it is selected.

What is Sampling without Replacement?

Sampling without replacement refers to the process where an item, once selected, is not returned to the population for further selection. This means that once an item is selected, it cannot be chosen again in the same sampling process. It’s commonly used in real-world surveys and randomized splits.

Steps for Sampling without Replacement

  1. Select an item randomly from the population.
  2. Record the selected item.
  3. Remove the selected item from the population.
  4. Repeat the process until the desired sample size is achieved.
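
The steps above can likewise be sketched in plain Python. This is a minimal illustration under the assumption that the population is a list of distinct items; the function name `sample_without_replacement` is hypothetical.

```python
import random

def sample_without_replacement(population, k, seed=None):
    """Draw k items from population; a drawn item cannot be drawn again."""
    if k > len(population):
        raise ValueError("sample size cannot exceed population size")
    rng = random.Random(seed)
    remaining = list(population)  # work on a copy so the caller's list is kept
    sample = []
    for _ in range(k):
        idx = rng.randrange(len(remaining))  # step 1: select a random position
        sample.append(remaining.pop(idx))    # steps 2 and 3: record and remove
    return sample                            # step 4: the loop stops after k draws

population = ["A", "B", "C", "D", "E", "F"]
picked = sample_without_replacement(population, k=4, seed=0)
print(picked)
```

In everyday code, the standard library's `random.sample(population, k)` does the same job; the explicit loop is shown only to mirror the listed steps.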

Different Techniques of Sampling without Replacement

Sampling without replacement can be implemented with the same Python libraries:

  1. Using Numpy
  2. Using Pandas

Let's see how these are implemented:

1. Sampling without Replacement using Numpy

Explanation

  • a=10: Sample from numbers 0 to 9.
  • size=6: Take 6 samples.
  • replace=False: No repetition allowed.
Python
import numpy as np

np.random.seed(20)

# Sample 6 unique values from 0 to 9
sample = np.random.choice(a=10, size=6, replace=False)
print("NumPy Sample without Replacement:", sample)

Output
NumPy Sample without Replacement: [7 1 8 5 0 2]

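2. Sampling without Replacement using Pandas

Explanation

  • sample(n=5, replace=False) randomly selects 5 of the 6 rows, each at most once.
  • random_state=15 fixes the seed so the result is reproducible.
Python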
import pandas as pd

d = {
    'ID': [1, 2, 3, 4, 5, 6],
    'Age': [23, 31, 45, 22, 35, 29],
    'Salary': [50000, 62000, 80000, 45000, 70000, 58000],
    'Department': ['HR', 'IT', 'Finance', 'HR', 'IT', 'Finance'],
}
df = pd.DataFrame(d)

# Sample 5 rows without replacement
sample_df = df.sample(n=5, replace=False, random_state=15)
print("\nPandas Sample without Replacement:")
print(sample_df)

Output
Pandas Sample without Replacement:
   ID  Age  Salary Department
3   4   22   45000         HR
2   3   45   80000    Finance
5   6   29   58000    Finance
1   2   31   62000         IT
4   5   35   7...

In the sample, no row appears more than once, because a selected row is not returned to the population after selection.

Key Differences Between Sampling with and without Replacement

| Aspect | Sampling with Replacement | Sampling without Replacement |
|---|---|---|
| Item Selection | Item can be selected multiple times. | Item can only be selected once. |
| Population Size | Remains constant during sampling. | Decreases as items are selected. |
| Use Case | Bootstrapping, Monte Carlo methods. | Lottery draws, survey sampling. |
| Output Variability | Higher chances of repeated items. | No repeated items in the sample. |
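
One practical consequence of the population-size difference is that a sample drawn without replacement cannot be larger than the population, while a sample drawn with replacement can. The sketch below illustrates this with NumPy's `default_rng` generator; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
population = np.arange(5)  # five items: 0, 1, 2, 3, 4

# With replacement, the sample may be larger than the population.
big_sample = rng.choice(population, size=8, replace=True)
print("With replacement:", big_sample)

# Without replacement, NumPy raises ValueError when size > population size.
try:
    rng.choice(population, size=8, replace=False)
    oversample_failed = False
except ValueError as err:
    oversample_failed = True
    print("Without replacement:", err)
```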

Real-World Applications of Sampling with Replacement

  • Bootstrapping: A resampling method in which samples are drawn with replacement from the observed data to estimate the sampling distribution of a statistic.
  • Monte Carlo Simulation: Repeatedly draws random samples, often with replacement, to model many possible scenarios and approximate the outcome of interest.
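
As a rough illustration of bootstrapping, the sketch below resamples the Age values from the earlier example with replacement and builds a percentile interval for the mean. The number of resamples (2000), the seed and the 95% level are arbitrary assumptions chosen for the example, and six observations are far too few for a reliable interval in practice.

```python
import numpy as np

rng = np.random.default_rng(42)
ages = np.array([23, 31, 45, 22, 35, 29])  # Age column from the example data

n_resamples = 2000  # assumed value; more resamples give a smoother estimate
boot_means = np.empty(n_resamples)
for i in range(n_resamples):
    # Each bootstrap resample has the same size as the data, drawn with replacement.
    resample = rng.choice(ages, size=ages.size, replace=True)
    boot_means[i] = resample.mean()

# Percentile interval: the middle 95% of the bootstrap means.
lower, upper = np.percentile(boot_means, [2.5, 97.5])
print(f"Sample mean: {ages.mean():.2f}")
print(f"Approximate 95% bootstrap interval: ({lower:.2f}, {upper:.2f})")
```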

Real-World Applications of Sampling without Replacement

  • Lottery Draws: Drawing lottery numbers without replacement ensures that no number can appear more than once.
  • Survey Sampling: Selecting participants for a survey where no individual can be chosen more than once.
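
Both applications can be sketched with `replace=False`. The 6-of-49 lottery format and the participant roster below are made-up assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)

# Lottery draw: 6 distinct numbers from 1 to 49 (assumed format).
lottery = rng.choice(np.arange(1, 50), size=6, replace=False)
print("Lottery numbers:", np.sort(lottery))

# Survey sampling: 3 distinct participants from a hypothetical roster.
roster = ["Asha", "Ben", "Chen", "Dara", "Eli", "Fay", "Gus"]
participants = rng.choice(roster, size=3, replace=False)
print("Survey participants:", participants)
```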

You can refer to some related articles: Population vs Samples, Bootstrapping, Methods of Sampling.

