How to Perform Grubbs’ Test in Python
Last Updated :
23 Feb, 2022
Prerequisites: Parametric and Non-Parametric Methods, Hypothesis Testing
This article covers several ways to perform Grubbs’ Test in Python.
Grubbs’ Test, also known as the maximum normalized residual test or the extreme studentized deviate test, detects an outlier in a univariate data set that is assumed to come from a normally distributed population. The test is defined for the following hypotheses:
- Ho: There are no outliers in the data set
- Ha: There is exactly one outlier in the data set
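The hypotheses above are usually decided by comparing the statistic G = max|x_i - mean| / s with a critical value derived from Student's t distribution. The following is a minimal sketch of that comparison, not the outlier_utils implementation. It assumes SciPy is installed, the helper names grubbs_statistic and grubbs_critical_two_sided are hypothetical, and the sample values are the ones this article uses in its first example.

```python
import math
import statistics

from scipy import stats


def grubbs_statistic(values):
    """Return G = max|x_i - mean| / s, with s the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    return max(abs(x - mean) for x in values) / sd


def grubbs_critical_two_sided(n, alpha=0.05):
    """Two-sided critical value based on Student's t with n - 2 degrees of freedom."""
    t = stats.t.ppf(1 - alpha / (2 * n), n - 2)
    return (n - 1) / math.sqrt(n) * math.sqrt(t**2 / (n - 2 + t**2))


sample = [20, 21, 26, 24, 29, 22, 21, 50, 28, 27]
g = grubbs_statistic(sample)
g_crit = grubbs_critical_two_sided(len(sample))
reject_h0 = g > g_crit  # True means the most extreme value is flagged as an outlier

print(f"G = {g:.3f}, critical value = {g_crit:.3f}, reject H0: {reject_h0}")
```

Here G exceeds the critical value, so H0 is rejected and the most extreme value (50) is treated as an outlier.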
Method 1: Performing a two-sided Grubbs’ Test
To run the two-sided Grubbs' test, call the smirnov_grubbs.test() function from the outlier_utils package (imported as outliers), passing the data and the significance level. The function returns the data with any detected outliers removed.
Syntax: smirnov_grubbs.test(data, alpha)
Parameters:
- data: A numeric vector of data values
- alpha: The significance level to use for the test.
Example:
This example runs the two-sided Grubbs' test with smirnov_grubbs.test(), which checks for an outlier at either end of the data set.
Python
import numpy as np
from outliers import smirnov_grubbs as grubbs
# define data
data = np.array([20, 21, 26, 24, 29, 22,
21, 50, 28, 27])
# perform Grubbs' test
grubbs.test(data, alpha=.05)
Output:
array([20, 21, 26, 24, 29, 22, 21, 28, 27])
Method 2: Performing a one-sided Grubbs’ Test
For a one-sided test, call grubbs.min_test() to check whether the minimum value is an outlier, or grubbs.max_test() to check whether the maximum value is an outlier. Both return the data with any detected outlier removed.
Syntax:
grubbs.min_test(data, alpha)
grubbs.max_test(data, alpha)
Example 1:
This example runs a one-sided Grubbs’ Test on the minimum value using grubbs.min_test(). The output is unchanged because the minimum value (5) is not flagged as an outlier at this significance level.
Python
import numpy as np
from outliers import smirnov_grubbs as grubbs
# define data
data = np.array([20, 21, 26, 24, 29,
22, 21, 50, 28, 27, 5])
print("Data after performing min one-sided Grubbs' test: ")
# perform min Grubbs' test
grubbs.min_test(data, alpha=.05)
Output:
Data after performing min one-sided Grubbs' test:
array([20, 21, 26, 24, 29, 22, 21, 50, 28, 27, 5])
Example 2:
This example runs a one-sided Grubbs’ Test on the maximum value using grubbs.max_test(). The maximum value (50) is flagged as an outlier and removed.
Python
import numpy as np
from outliers import smirnov_grubbs as grubbs
# define data
data = np.array([20, 21, 26, 24, 29, 22,
21, 50, 28, 27, 5])
print("Data after performing max one-sided Grubbs' test: ")
# perform max Grubbs' test
grubbs.max_test(data, alpha=.05)
Output:
Data after performing max one-sided Grubbs' test:
array([20, 21, 26, 24, 29, 22, 21, 28, 27, 5])
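To see why the two one-sided tests above behave differently on the same data, it can help to compute the one-sided statistics by hand. The sketch below uses only the standard library and is not how outlier_utils works internally. The constant G_CRIT is an assumption: it is the one-sided critical value for n = 11 at alpha = 0.05 as listed in standard Grubbs tables.

```python
import statistics

data = [20, 21, 26, 24, 29, 22, 21, 50, 28, 27, 5]

mean = statistics.mean(data)
sd = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)

# One-sided statistics: distance of the minimum and the maximum from the mean, in units of sd
g_min = (mean - min(data)) / sd
g_max = (max(data) - mean) / sd

# Assumed value from standard Grubbs tables (one-sided, n = 11, alpha = 0.05)
G_CRIT = 2.234

print(f"G_min = {g_min:.3f}, flagged: {g_min > G_CRIT}")
print(f"G_max = {g_max:.3f}, flagged: {g_max > G_CRIT}")
```

G_min stays below the critical value while G_max exceeds it, which is consistent with min_test() returning the data unchanged and max_test() removing 50.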
Method 3: Extracting the index of the outlier using Grubbs’ Test
To get the position of an outlier instead of the filtered data, use the grubbs.max_test_indices() function. It returns a list containing the index of each value that the one-sided maximum test flags as an outlier.
Syntax: grubbs.max_test_indices(data, alpha)
Python
import numpy as np
from outliers import smirnov_grubbs as grubbs
# define data
data = np.array([20, 21, 26, 24, 29, 22,
21, 50, 28, 27, 5])
grubbs.max_test_indices(data, alpha=.05)
Output:
[7]
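The index reported above can be cross-checked with plain NumPy. The sketch below only locates the candidate values (the largest, the smallest, and the one farthest from the mean). It does not run the significance test, so it is a sanity check and not a replacement for max_test_indices(). The variable names are illustrative.

```python
import numpy as np

data = np.array([20, 21, 26, 24, 29, 22, 21, 50, 28, 27, 5])

# Position of the candidate for a one-sided max test and a one-sided min test
max_idx = int(np.argmax(data))
min_idx = int(np.argmin(data))

# Position of the candidate for a two-sided test: largest absolute deviation from the mean
two_sided_idx = int(np.argmax(np.abs(data - data.mean())))

print(max_idx, min_idx, two_sided_idx)
```

The maximum sits at index 7, the same position that max_test_indices() reports.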
Method 4: Extracting the value of the outlier using Grubbs’ Test
To get the outlying value itself, use the grubbs.max_test_outliers() function. It returns a list containing each value that the one-sided maximum test flags as an outlier.
Syntax: grubbs.max_test_outliers(data, alpha)
Python
import numpy as np
from outliers import smirnov_grubbs as grubbs
# define data
data = np.array([20, 21, 26, 24, 29, 22,
21, 50, 28, 27, 5])
grubbs.max_test_outliers(data, alpha=.05)
Output:
[50]