R - Create Dataframe From Existing Dataframe
Last Updated: 12 Apr, 2024
When working with organized, tabular data in R, we often need to build a new data frame from one that already exists, for example to keep only some of its columns or to combine it with another table. In this article, let's explore various ways to create a data frame from an existing data frame in the R Programming Language.
Ways to Create Dataframe from Existing Dataframe
Using Base R data.frame() Function
In base R, a new data frame can be built from an existing one by referencing its columns directly. In the example below, the data.frame() function takes the 'Name' and 'Age' columns of the original data frame 'df' (written as df$Name and df$Age) and combines them into a new data frame.
R
# Create a dataframe
df <- data.frame(
ID = 1:5,
Name = c("Shravan", "Jeetu", "Lakhan", "Pankaj", "Mihika"),
Age = c(20, 18, 19, 20, 18),
Score = c(80, 75, 85, 90, 95)
)
# Display the original dataframe
print("Original Dataframe:")
print(df)
# Create a new dataframe using direct column referencing
new_df <- data.frame(
Name = df$Name,
Age = df$Age
)
# Display the new dataframe
print("New Dataframe created using Direct Column Referencing:")
print(new_df)
Output:
[1] "Original Dataframe:"
ID Name Age Score
1 1 Shravan 20 80
2 2 Jeetu 18 75
3 3 Lakhan 19 85
4 4 Pankaj 20 90
5 5 Mihika 18 95
[1] "New Dataframe created using Direct Column Referencing:"
Name Age
1 Shravan 20
2 Jeetu 18
3 Lakhan 19
4 Pankaj 20
5 Mihika 18
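Because each argument to data.frame() is just a named vector, the same approach can also rename a column or add a derived one while copying. The following is a minimal sketch under stated assumptions: the column names 'Student' and 'Passed' and the cutoff of 80 are illustrative and not part of the original example.

```r
# Same sample data as in the example above
df <- data.frame(
  ID = 1:5,
  Name = c("Shravan", "Jeetu", "Lakhan", "Pankaj", "Mihika"),
  Age = c(20, 18, 19, 20, 18),
  Score = c(80, 75, 85, 90, 95)
)

# Build a new data frame with a renamed column and a derived logical column.
# 'Student', 'Passed' and the cutoff of 80 are hypothetical choices.
new_df_derived <- data.frame(
  Student = df$Name,
  Score = df$Score,
  Passed = df$Score >= 80
)

print(new_df_derived)
```

The original 'df' is left unchanged; only the new data frame carries the renamed and derived columns.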
Using subset() Function
The subset() function creates a new data frame by keeping only the columns named in its select argument. In this example, the columns 'Name' and 'Score' are extracted from 'df'. Note that the column names are written unquoted inside select.
R
# Create a dataframe
df <- data.frame(
ID = 1:5,
Name = c("Shravan", "Jeetu", "Lakhan", "Pankaj", "Mihika"),
Age = c(20, 18, 19, 20, 18),
Score = c(80, 75, 85, 90, 95)
)
# Display the original dataframe
print("Original Dataframe:")
print(df)
# Using the subset() function to create a new dataframe
new_df_subset_func <- subset(df, select = c(Name, Score))
# Display the new dataframe created using the subset() Function
print("New Dataframe created using subset() Function:")
print(new_df_subset_func)
Output:
[1] "Original Dataframe:"
ID Name Age Score
1 1 Shravan 20 80
2 2 Jeetu 18 75
3 3 Lakhan 19 85
4 4 Pankaj 20 90
5 5 Mihika 18 95
[1] "New Dataframe created using subset() Function:"
Name Score
1 Shravan 80
2 Jeetu 75
3 Lakhan 85
4 Pankaj 90
5 Mihika 95
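subset() also accepts a logical condition as its second argument, so rows and columns can be filtered in one call. Here is a small sketch; the condition Age >= 19 is an assumption chosen only for illustration.

```r
# Same sample data as in the example above
df <- data.frame(
  ID = 1:5,
  Name = c("Shravan", "Jeetu", "Lakhan", "Pankaj", "Mihika"),
  Age = c(20, 18, 19, 20, 18),
  Score = c(80, 75, 85, 90, 95)
)

# Keep only rows where Age is at least 19 (illustrative condition),
# and only the Name and Score columns
new_df_filtered <- subset(df, Age >= 19, select = c(Name, Score))

print(new_df_filtered)
```

The row names of the result are the original row numbers of the rows that were kept, which can be reset with rownames(new_df_filtered) <- NULL if needed.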
Using merge() Function
The merge() function combines two data frames, 'df1' and 'df2', based on their common column 'Name'. By default, merge() keeps only the rows whose key appears in both data frames and sorts the result by that key, which is why the merged output below contains just the three names present in 'df2', in alphabetical order.
R
# Create the first dataframe
df1 <- data.frame(
Name = c("Shravan", "Jeetu", "Lakhan", "Pankaj", "Mihika"),
Age = c(20, 18, 19, 20, 18),
Score = c(80, 75, 85, 90, 95)
)
# Create the second dataframe
df2 <- data.frame(
Name = c("Shravan", "Jeetu", "Mihika"),
Gender = c("Male", "Male", "Female")
)
# Display the first dataframe
cat("First Dataframe (df1):\n")
print(df1)
# Display the second dataframe
cat("\nSecond Dataframe (df2):\n")
print(df2)
# Merge dataframes based on common column 'Name'
new_df <- merge(df1, df2, by = "Name")
# Display the new merged dataframe
cat("\nMerged Dataframe (new_df):\n")
print(new_df)
Output:
First Dataframe (df1):
Name Age Score
1 Shravan 20 80
2 Jeetu 18 75
3 Lakhan 19 85
4 Pankaj 20 90
5 Mihika 18 95
Second Dataframe (df2):
Name Gender
1 Shravan Male
2 Jeetu Male
3 Mihika Female
Merged Dataframe (new_df):
Name Age Score Gender
1 Jeetu 18 75 Male
2 Mihika 18 95 Female
3 Shravan 20 80 Male
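If every row of 'df1' should be kept even when 'df2' has no match, merge() provides the all.x argument. The sketch below reuses the two data frames from the example; the name 'left_df' is chosen here for illustration.

```r
# Same two data frames as in the example above
df1 <- data.frame(
  Name = c("Shravan", "Jeetu", "Lakhan", "Pankaj", "Mihika"),
  Age = c(20, 18, 19, 20, 18),
  Score = c(80, 75, 85, 90, 95)
)
df2 <- data.frame(
  Name = c("Shravan", "Jeetu", "Mihika"),
  Gender = c("Male", "Male", "Female")
)

# all.x = TRUE keeps all rows of df1; unmatched rows get NA in Gender
left_df <- merge(df1, df2, by = "Name", all.x = TRUE)

print(left_df)
```

With this setting, 'Lakhan' and 'Pankaj' remain in the result with NA in the Gender column instead of being dropped.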
Using Subset Method
The Subset Method uses R's square-bracket indexing, df[rows, columns], to create a new data frame from selected columns of an existing one. Leaving the row position empty keeps all rows, and a character vector of column names picks the columns, so the whole selection fits in a single short line of code.
R
# Create a dataframe
df <- data.frame(
ID = 1:5,
Name = c("Shravan", "Jeetu", "Lakhan", "Pankaj", "Mihika"),
Age = c(20, 18, 19, 20, 18),
Score = c(80, 75, 85, 90, 95)
)
# Display the original dataframe
print("Original Dataframe:")
print(df)
# Subsetting the dataframe to select desired columns
new_df_subset <- df[, c("Name", "Age")]
# Display the new dataframe created using the Subset Method
print("New Dataframe created using Subset Method:")
print(new_df_subset)
Output:
[1] "Original Dataframe:"
ID Name Age Score
1 1 Shravan 20 80
2 2 Jeetu 18 75
3 3 Lakhan 19 85
4 4 Pankaj 20 90
5 5 Mihika 18 95
[1] "New Dataframe created using Subset Method:"
Name Age
1 Shravan 20
2 Jeetu 18
3 Lakhan 19
4 Pankaj 20
5 Mihika 18
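One detail worth knowing about bracket indexing is that selecting a single column returns a plain vector unless drop = FALSE is given. The following sketch illustrates this, along with selecting columns by position and filtering rows; the variable names and the Score > 80 condition are assumptions made for the example.

```r
# Same sample data as in the example above
df <- data.frame(
  ID = 1:5,
  Name = c("Shravan", "Jeetu", "Lakhan", "Pankaj", "Mihika"),
  Age = c(20, 18, 19, 20, 18),
  Score = c(80, 75, 85, 90, 95)
)

# A single column is simplified to a vector by default
name_vector <- df[, "Name"]

# drop = FALSE keeps the result as a one-column data frame
name_df <- df[, "Name", drop = FALSE]

# Rows can be filtered and columns picked by position in the same call
# (columns 2 and 4 are Name and Score; the condition is illustrative)
high_scores <- df[df$Score > 80, c(2, 4)]

print(class(name_vector))
print(class(name_df))
print(high_scores)
```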
Using dplyr package
The select() function from the dplyr package creates a new data frame by selecting specific columns from an existing one. Column names are passed unquoted, as in select(df, Name, Score). The dplyr package must be installed before running this code.
R
# Load the dplyr package
library(dplyr)
# Create a dataframe
df <- data.frame(
ID = 1:5,
Name = c("Shravan", "Jeetu", "Lakhan", "Pankaj", "Mihika"),
Age = c(20, 18, 19, 20, 18),
Score = c(80, 75, 85, 90, 95)
)
# Display the original dataframe
cat("Original Dataframe:\n")
print(df)
# Using dplyr package: Selecting specific columns using select() function
new_df_dplyr <- select(df, Name, Score)
# Display the new dataframe created using dplyr package
cat("\nNew Dataframe created using dplyr package:\n")
print(new_df_dplyr)
Output:
Original Dataframe:
ID Name Age Score
1 1 Shravan 20 80
2 2 Jeetu 18 75
3 3 Lakhan 19 85
4 4 Pankaj 20 90
5 5 Mihika 18 95
New Dataframe created using dplyr package:
Name Score
1 Shravan 80
2 Jeetu 75
3 Lakhan 85
4 Pankaj 90
5 Mihika 95
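dplyr verbs can be chained with the pipe operator, which makes it convenient to filter rows, add a column, and select columns in one step. This is a minimal sketch, assuming dplyr is installed; the 'Grade' column and its thresholds are hypothetical and not part of the original example.

```r
library(dplyr)

# Same sample data as in the example above
df <- data.frame(
  ID = 1:5,
  Name = c("Shravan", "Jeetu", "Lakhan", "Pankaj", "Mihika"),
  Age = c(20, 18, 19, 20, 18),
  Score = c(80, 75, 85, 90, 95)
)

# Filter rows, add a derived column, then keep only the needed columns.
# 'Grade' and the 80/90 thresholds are illustrative assumptions.
new_df_pipeline <- df %>%
  filter(Score >= 80) %>%
  mutate(Grade = ifelse(Score >= 90, "A", "B")) %>%
  select(Name, Score, Grade)

print(new_df_pipeline)
```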
Using data.table Package
The data.table package can also be used to create a new data frame from an existing one. The data frame is first converted to a data.table with as.data.table(), the required columns are selected with the .() syntax, and the result is converted back to a data frame with as.data.frame() before it is printed.
Note: Before running this code install the data.table package.
R
# Load the data.table package
library(data.table)
# Create a dataframe
df <- data.frame(
ID = 1:5,
Name = c("Shravan", "Jeetu", "Lakhan", "Pankaj", "Mihika"),
Age = c(20, 18, 19, 20, 18),
Score = c(80, 75, 85, 90, 95)
)
# Convert the dataframe to a data.table
dt <- as.data.table(df)
# Display the original data.table
cat("Original Data.table:\n")
print(dt)
# Using data.table package: Selecting specific columns using data.table syntax
new_dt <- dt[, .(Name, Age)]
# Convert the result back to a dataframe
new_df_data_table <- as.data.frame(new_dt)
# Display the new dataframe created using data.table package
cat("\nNew Dataframe created using data.table package:\n")
print(new_df_data_table)
Output:
Original Data.table:
ID Name Age Score
<int> <char> <num> <num>
1: 1 Shravan 20 80
2: 2 Jeetu 18 75
3: 3 Lakhan 19 85
4: 4 Pankaj 20 90
5: 5 Mihika 18 95
New Dataframe created using data.table package:
Name Age
1 Shravan 20
2 Jeetu 18
3 Lakhan 19
4 Pankaj 20
5 Mihika 18
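The data.table syntax dt[i, j] also allows a row condition in the first position, so filtering and column selection can happen together. Below is a small sketch, assuming data.table is installed; the condition Age == 18 and the variable names are chosen only for illustration.

```r
library(data.table)

# Same sample data as in the example above
df <- data.frame(
  ID = 1:5,
  Name = c("Shravan", "Jeetu", "Lakhan", "Pankaj", "Mihika"),
  Age = c(20, 18, 19, 20, 18),
  Score = c(80, 75, 85, 90, 95)
)

dt <- as.data.table(df)

# i filters rows (illustrative condition), j selects columns
filtered_dt <- dt[Age == 18, .(Name, Score)]

# Convert back to a plain data frame
filtered_df <- as.data.frame(filtered_dt)

print(filtered_df)
print(class(filtered_df))
```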