Contingency Tables in R Programming
Last Updated :
27 Nov, 2024
Prerequisite: Data Structures in R Programming
Contingency tables are very useful to condense a large number of observations into smaller to make it easier to maintain tables. A contingency table shows the distribution of a variable in the rows and another in its columns. Contingency tables are not only useful for condensing data, but they also show the relations between variables. They are a way of summarizing categorical variables. A contingency table that deals with a single table are called a complex or a flat contingency table.
Making Contingency tables
A contingency table is a way to redraw data and assemble it into a table. And, it shows the layout of the original data in a manner that allows the reader to gain an overall summary of the original data. The table() function is used in R to create a contingency table. The table() function is one of the most versatile functions in R. It can take any data structure as an argument and turn it into a table. The more complex the original data, the more complex is the resulting contingency table.
Creating contingency tables from Vectors
In R a vector is an ordered collection of basic data types of a given length. The only key thing here is all the elements of a vector must be of the identical data type e.g homogeneous data structures. Vectors are one-dimensional data structures. It is the simplest data object from which you can create a contingency table.
Example:
R
# R program to illustrate
# Contingency Table
# Creating a vector
vec = c(2, 4, 3, 1, 6, 3, 2, 1, 4, 5)
# Creating contingency table from vec using table()
conTable = table(vec)
print(conTable)
Output:
vec
1 2 3 4 5 6
2 2 2 2 1 1
In the given program what happens is first when we execute table() command on the vector it sorts the vector value and also prints the frequencies of every element given in the vector.
Creating contingency tables from Data
Now we will see a simple example that provides a data frame containing character values in one column and also containing a factor in one of its columns. This one column of factors contains character variables. In order to create our contingency table from data, we will make use of the table(). In the following example, the table() function returns a contingency table. Basically, it returns a tabular result of the categorical variables.
Example:
R
# R program to illustrate
# Contingency Table
# Creating a data frame
df = data.frame(
"Name" = c("Amiya", "Rosy", "Asish"),
"Gender" = c("Male", "Female", "Male")
)
# Creating contingency table from data using table()
conTable = table(df)
print(conTable)
Output:
Gender
Name Female Male
Amiya 0 1
Asish 0 1
Rosy 1 0
Creating custom contingency tables
The contingency table in R can be created using only a part of the data which is in contrast with collecting data from all the rows and columns. We can create a custom contingency table in R using the following ways:
- Using Columns of a Data Frame in a Contingency Table
- Using Rows of a Data Frame in a Contingency Table
- By Rotating Data Frames in R
- Creating Contingency Tables from Matrix Objects in R
- Using Columns of a Data Frame in a Contingency Table: With the help of table() command, we are able to specify the columns with which the contingency tables can be created. In order to do so, you only need to pass the name of vector objects in the parameter of table() command.
Example:
R
# R program to illustrate
# Contingency Table
# Creating a data frame
df = data.frame(
"Name" = c("Amiya", "Rosy", "Asish"),
"Gender" = c("Male", "Female", "Male")
)
# Creating contingency table by selecting a column
conTable = table(df$Name)
print(conTable)
Output:
Amiya Asish Rosy
1 1 1
From the output, you can notice that the table() command sorts the name in alphabetical order along with their frequencies of occurrence.
- Using Rows of a Data Frame in a Contingency Table: We can't create a contingency table using rows of a data frame directly as we did in "using column" part. With the help of the matrix, we can create a contingency table by looking at the rows of a data frame.
Example:
R
# R program to illustrate
# Contingency Table
# Creating a data frame
df = data.frame(
"Name" = c("Amiya", "Rosy", "Asish"),
"Gender" = c("Male", "Female", "Male")
)
# Creating contingency table by selecting rows
conTable = table(as.matrix(df[2:3, ]))
print(conTable)
Output:
Asish Female Male Rosy
1 1 1 1
- By Rotating Data Frames in R: We can also create a contingency table by rotating a data frame in R. We can perform a rotation of the data, that is, transpose of the data using the t() command.
Example:
R
# R program to illustrate
# Contingency Table
# Creating a data frame
df = data.frame(
"Name" = c("Amiya", "Rosy", "Asish"),
"Gender" = c("Male", "Female", "Male")
)
# Rotating the data frame
newDf = t(df)
# Creating contingency table by rotating data frame
conTable = table(newDf)
print(conTable)
Output:
newDf
Amiya Asish Female Male Rosy
1 1 1 2 1
- Creating Contingency Tables from Matrix Objects in R A matrix is a rectangular arrangement of numbers in rows and columns. In a matrix, as we know rows are the ones that run horizontally and columns are the ones that run vertically. Matrices are two-dimensional, homogeneous data structures. We can create a contingency table by using this matrix object.
Example:
R
# R program to illustrate
# Contingency Table
# Creating a matrix
A = matrix(
c(1, 2, 4, 1, 5, 6, 2, 4, 7),
nrow = 3,
ncol = 3
)
# Creating contingency table using matrix object
conTable = table(A)
print(conTable)
Output:
A
1 2 4 5 6 7
2 2 2 1 1 1
Converting Objects into tables
As mentioned above, a table is a special type of data object which is similar to the matrix but also possesses several differences.
- Converting Matrix Objects into tables: We can directly convert a matrix object into table by using the as.table() command. Just pass the matrix object as a parameter to the as.table() command.
Example:
R
# R program to illustrate
# Contingency Table
# Creating a matrix
A = matrix(
c(1, 2, 4, 1, 5, 6, 2, 4, 7),
nrow = 3,
ncol = 3
)
# Converting Matrix Objects into tables
newTable = as.table(A)
print(newTable)
Output:
A B C
A 1 1 2
B 2 5 4
C 4 6 7
- Converting Data frame Objects into tables: We can't directly convert a data frame object into table by using the as.table() command. In the case of a data frame, the object can be converted into the matrix, and then it can be converted into the table using as.table() command.
Example:
R
# R program to illustrate
# Contingency Table
# Creating a data frame
df = data.frame(
"Name" = c("Amiya", "Rosy", "Asish"),
"Gender" = c("Male", "Female", "Male")
)
# Converting data frame object to matrix object
modifiedDf = as.matrix(df)
# Converting Matrix Objects into tables
newTable = as.table(modifiedDf)
print(newTable)
Output:
Name Gender
A Amiya Male
B Rosy Female
C Asish Male
Similar Reads
Dummy Variables in R Programming
Dummy variables are binary variables used to represent categorical data in numerical form. They represents a characteristic of an observation for example, gender can be represented as 1 for male and 0 for female or vice versa. New columns are created to reflect these binary values, such as gender_m
3 min read
How to Code in R programming?
R is a powerful programming language and environment for statistical computing and graphics. Whether you're a data scientist, statistician, researcher, or enthusiast, learning R programming opens up a world of possibilities for data analysis, visualization, and modeling. This comprehensive guide aim
4 min read
Assigning Vectors in R Programming
Vectors are one of the most basic data structure in R. They contain data of same type. Vectors in R is equivalent to arrays in other programming languages. In R, array is a vector of one or more dimensions and every single object created is stored in the form of a vector. The members of a vector are
5 min read
8 Coding Style Tips for R Programming
R is an open-source programming language that is widely used as a statistical software and data analysis tool. R generally comes with the Command-line interface. R is available across widely used platforms like Windows, Linux, and macOS. Also, the R programming language is the latest cutting-edge to
5 min read
Basic Syntax in R Programming
R is the most popular language used for Statistical Computing and Data Analysis with the support of over 10, 000+ free packages in CRAN repository. Like any other programming language, R has a specific syntax which is important to understand if you want to make use of its powerful features. This art
3 min read
Hypothesis Testing in R Programming
A hypothesis is made by the researchers about the data collected for any experiment or data set. A hypothesis is an assumption made by the researchers that are not mandatory true. In simple words, a hypothesis is a decision taken by the researchers based on the data of the population collected. Hypo
6 min read
Bartlettâs Test in R Programming
In statistics, Bartlett's test is used to test if k samples are from populations with equal variances. Equal variances across populations are called homoscedasticity or homogeneity of variances. Some statistical tests, for example, the ANOVA test, assume that variances are equal across groups or sam
5 min read
data.table vs data.frame in R Programming
data.table in R is an enhanced version of the data.frame. Due to its speed of execution and the less code to type it became popular in R. The purpose of data.table is to create tabular data same as a data frame but the syntax varies. In the below example let we can see the syntax for the data table:
3 min read
Stratified Boxplot in R Programming
A boxplot is a graphical representation of groups of numerical data through their quartiles. Box plots are non-parametric that they display variation in samples of a statistical population without making any assumptions of the underlying statistical distribution. The spacings between the different p
5 min read
Absolute and Relative Frequency in R Programming
In statistics, frequency or absolute frequency indicates the number of occurrences of a data value or the number of times a data value occurs. These frequencies are often plotted on bar graphs or histograms to compare the data values. For example, to find out the number of kids, adults, and senior c
2 min read