How to Read Large JSON file in R
Last Updated: 28 Apr, 2025
JSON (JavaScript Object Notation) is a lightweight data-interchange format that is easy for humans to read and write and easy for machines to parse and generate. JSON files are often used for data transmission between a server and a web application and can be quite large.
In this article, we'll cover the basics of using read_json and split to read large JSON files in R. We'll also explore some advanced techniques for optimizing performance and reducing memory usage. Whether you're a seasoned R programmer or a beginner, this article will provide you with the knowledge and skills you need to read large JSON files in R with confidence.
Read Large JSON files in R using read_json()
read_json is a function from the jsonlite package that reads a JSON file and converts it into an R object. It parses the whole file in one call, so both the file and the resulting object must fit in memory. For files too large for that, jsonlite also provides stream_in(), which reads line-delimited JSON (NDJSON) in pages.
Install the jsonlite library and load it
jsonlite is one of the most popular R packages for reading JSON. It provides a simple and efficient way to parse JSON data and convert it into an R object. To install and load jsonlite, use the following commands:
install.packages("jsonlite")
library(jsonlite)
Creating Random Dataset
Here we create our own dataset. You can create your own in the same way, or use any large JSON dataset available online.
R
library(jsonlite)
# generate random id
generate_id <- function() paste0(sample(c(letters, LETTERS, 0:9), 10,
                                        replace = TRUE),
                                 collapse = "")
# real first names of people
first_names <- c("John", "Jane", "Michael",
                 "Emily", "William", "Ashley",
                 "David", "Jessica", "Andrew",
                 "Jennifer",
                 "Matthew", "Sarah", "Daniel",
                 "Amanda", "Christopher", "Elizabeth",
                 "Nicholas", "Megan", "Robert",
                 "Lauren", "Joseph", "Ava", "Jacob",
                 "Sophia", "Jonathan", "Natalie", "Ryan",
                 "Madison", "Adam", "Chloe")
# real last names of people
last_names <- c("Smith", "Johnson", "Williams", "Jones",
                "Brown", "Davis", "Miller", "Wilson",
                "Moore", "Taylor",
                "Anderson", "Thomas", "Jackson", "White",
                "Harris", "Martin", "Thompson", "Garcia",
                "Martinez", "Robinson",
                "Clark", "Rodriguez", "Lewis", "Lee", "Walker",
                "Hall", "Allen", "King", "Wright", "Scott")
# education qualifications
qualifications <- c("Primary Education", "Secondary Education",
                    "High School", "Undergraduate", "Postgraduate")
# create a data frame
df <- data.frame(ID = sapply(1:1000000,
                             function(i) generate_id()),
                 First_Name = sample(first_names,
                                     1000000, replace = TRUE),
                 Last_Name = sample(last_names,
                                    1000000, replace = TRUE),
                 Age = sample(18:30, 1000000,
                              replace = TRUE),
                 Highest_qualification =
                   sample(qualifications, 1000000,
                          replace = TRUE),
                 stringsAsFactors = FALSE)
# write the data frame to a JSON file
write_json(df, "people.json")
You can check the size of the file, in bytes, using the following code.
R
file.info("people.json")$size
Output:
113428352
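The file written above is a single JSON array. jsonlite's streaming functions work with line-delimited JSON (NDJSON) instead, where each line holds one record. As a minimal sketch of that format, assuming a hypothetical three-row data frame people_small in place of the full dataset and a temporary file path, stream_out() can be used like this:

```r
library(jsonlite)

# Hypothetical small data frame standing in for the 1-million-row df
people_small <- data.frame(ID = c("a1", "b2", "c3"),
                           Age = c(18L, 25L, 30L))

# Write one JSON object per line (NDJSON) to a temporary file
ndjson_file <- tempfile(fileext = ".ndjson")
stream_out(people_small, file(ndjson_file), verbose = FALSE)

# Each line of the file is a complete JSON record, with no enclosing array
ndjson_lines <- readLines(ndjson_file)
length(ndjson_lines)
ndjson_lines[1]
```

A file in this layout can be read back in pages with stream_in(), which is not the case for the array written by write_json().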
Read the JSON file into R
The read_json() function parses the JSON file and converts it into an R object. By default it returns a nested list; passing simplifyVector = TRUE makes it simplify an array of records into a data frame. Once you have the data in an R object, you can use all the standard R functions and packages to manipulate and analyze it.
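To illustrate the two return types, here is a minimal sketch on a hypothetical three-record file written to a temporary path (the file and the names tmp, as_list, and as_df are made up for this example):

```r
library(jsonlite)

# Hypothetical three-record file written to a temporary location
tmp <- tempfile(fileext = ".json")
write_json(data.frame(id = 1:3, name = c("a", "b", "c")), tmp)

# Default behaviour: a nested list with one element per record
as_list <- read_json(tmp)

# simplifyVector = TRUE: the array of records becomes a data frame
as_df <- read_json(tmp, simplifyVector = TRUE)

class(as_list)
class(as_df)
```

The data frame form is usually more convenient for tabular data, while the list form preserves the JSON structure as it appears in the file.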
You can use the read_json() function to read a JSON file into R. For example, to read the people.json file created above from your working directory, you would use the following code:
R
data <- jsonlite::read_json("people.json")
head(data, 3)
Split Large JSON files in R using Split
split() is a base R function that divides a vector or data frame into groups defined by a grouping variable. It does not operate on files directly: the data must first be read into R, after which split() can break it into smaller pieces. Each piece can then be written to its own file and processed separately, and the results combined, which keeps the memory needed for any one later step small.
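As a minimal sketch of how base R's split() groups the rows of an in-memory data frame, assume a hypothetical six-row data frame df_small and a chunk size of 2 (both made up for illustration):

```r
# Hypothetical six-row data frame
df_small <- data.frame(id = 1:6, value = letters[1:6])

chunk_size <- 2
# Rows 1-2 get group 1, rows 3-4 get group 2, and so on
group <- ceiling(seq_len(nrow(df_small)) / chunk_size)

# split() returns a list with one data frame per group
parts_small <- split(df_small, group)
length(parts_small)
parts_small[[2]]$id
```

Building the grouping vector with ceiling(seq_len(n) / chunk_size) keeps consecutive rows together, which a recycled 1:n_chunks vector would not.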
In this example project, you can see how to use the split method to read large JSON files in R. The project starts by generating a large dataset of 1 million rows. This dataset is then saved to a JSON file, which serves as the large JSON file that you want to read in R.
Installing and Loading the Required Package
split() is part of base R, so there is no separate package to install for it. The only package required is jsonlite, which is used to read the file and to convert each part back to JSON. You can install and load it using the following code:
install.packages("jsonlite")
library(jsonlite)
Determine the Number of Rows in the File
Next, specify the file path of the large JSON file and read it with stream_in(), which expects line-delimited JSON (NDJSON, one record per line) and reads it in pages of pagesize records. Then determine the number of rows and divide it by the chunk size, using the base R ceiling() function to round the number of chunks up to the nearest integer.
R
file_path <- "S:\\data.json"
# Expected number of rows in each chunk
chunk_size <- 100000
# Read the input file; stream_in() expects NDJSON (one JSON record per line)
data_stream <- stream_in(file(file_path),
                         simplifyDataFrame = TRUE,
                         pagesize = chunk_size)
n_rows <- nrow(data_stream)
n_chunks <- ceiling(n_rows / chunk_size)
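Without a handler, stream_in() still returns the complete data frame, so the whole dataset ends up in memory. When only an aggregate is needed, a handler function can process each page and discard it. The following is a sketch under assumptions: the NDJSON file, the age column, and the page size of 4 are all hypothetical and chosen to keep the example small.

```r
library(jsonlite)

# Hypothetical NDJSON file with ten records, one per line
ndjson_path <- tempfile(fileext = ".ndjson")
stream_out(data.frame(id = 1:10, age = 21:30),
           file(ndjson_path), verbose = FALSE)

total_rows <- 0
age_sum <- 0

# The handler receives one data frame per page of up to `pagesize` records;
# only the running totals are kept between pages
stream_in(
  file(ndjson_path),
  handler = function(page) {
    total_rows <<- total_rows + nrow(page)
    age_sum <<- age_sum + sum(page$age)
  },
  pagesize = 4,
  verbose = FALSE
)

# Mean age computed without holding all records at once
age_sum / total_rows
```

With this pattern the memory needed is bounded by the page size rather than by the size of the file.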
Split the Large JSON File
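Next, use the split() function to divide the data frame into chunks of consecutive rows. The grouping vector assigns rows 1 to 100,000 to chunk 1, the next 100,000 rows to chunk 2, and so on.
R
# split data into parts of consecutive rows
parts <- split(data_stream, ceiling(seq_len(n_rows) / chunk_size))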
Write each part in a Separate File
Each element of parts is a data frame holding one chunk of rows. The loop below converts each chunk back to JSON with toJSON() and writes it to its own file. In this example, the 1-million-row dataset is split into 10 smaller files, each containing 100,000 records.
R
for (i in 1:n_chunks) {
write(toJSON(parts[[i]]), paste0("part_", i, ".json"))
}
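Once the parts exist on disk, each one can be read and processed separately and the results combined. The sketch below assumes a hypothetical temporary directory holding three small part files named in the same part_<i>.json pattern; the setup lines only exist to make the example self-contained.

```r
library(jsonlite)

# Hypothetical setup: write three small part files to a temporary directory
out_dir <- tempfile("parts_")
dir.create(out_dir)
demo <- data.frame(id = 1:6, value = letters[1:6])
demo_parts <- split(demo, ceiling(seq_len(nrow(demo)) / 2))
for (i in seq_along(demo_parts)) {
  write_json(demo_parts[[i]],
             file.path(out_dir, paste0("part_", i, ".json")))
}

# Process one part at a time and keep only a small result per part
part_files <- list.files(out_dir, pattern = "^part_[0-9]+\\.json$",
                         full.names = TRUE)
part_counts <- vapply(part_files, function(f) nrow(fromJSON(f)), integer(1))
sum(part_counts)

# Or rebuild the full data frame when it fits in memory
combined <- do.call(rbind, lapply(part_files, fromJSON))
nrow(combined)
```

Keeping only a per-part summary (here, a row count) is what keeps memory use low; rebuilding the full data frame gives up that advantage.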
Complete Code
With these simple steps, you can split a large JSON file in R into smaller files, making it easier to process the data in R. Whether you are working with large datasets or just want to organize your data more efficiently, this method can be a useful tool in your R programming arsenal.
R
# load data
library(jsonlite)
file_path <- "S:\\data.json"
chunk_size <- 100000 # Expected number of rows in each chunk
# Read the input file; stream_in() expects NDJSON (one JSON record per line)
data_stream <- stream_in(file(file_path),
                         simplifyDataFrame = TRUE,
                         pagesize = chunk_size)
n_rows <- nrow(data_stream)
n_chunks <- ceiling(n_rows / chunk_size)
# split data into parts of consecutive rows
parts <- split(data_stream, ceiling(seq_len(n_rows) / chunk_size))
# write each part to a separate file
for (i in seq_len(n_chunks)) {
  write(toJSON(parts[[i]]), paste0("S:\\part_", i, ".json"))
}
Output:
[Figure: the split part files]