
Pandas - Parsing JSON Dataset

Last Updated : 14 May, 2025

JSON (JavaScript Object Notation) is a popular format for storing and exchanging data, widely used in web APIs and configuration files. Pandas provides tools to parse JSON data and convert it into structured DataFrames for analysis. In this guide we will explore several ways to read, manipulate and normalize JSON datasets in Pandas.

Before working with JSON data, we need to import pandas. If you are fetching JSON from a web URL or API, you will also need requests.

Python
import pandas as pd
import requests

Reading JSON Files

To read a JSON file or URL in pandas, use the read_json function. In the code below, path_or_buf is the file path or web URL of the JSON file.

Python
pd.read_json(path_or_buf)
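
As a minimal sketch of the call above, the following writes a small JSON file to a temporary directory and reads it back with read_json. The records and the file name sample.json are hypothetical and exist only so the example can run on its own.

```python
import json
import os
import tempfile

import pandas as pd

# Hypothetical sample records, used only to have something to read back.
records = [
    {"name": "item A", "score": 91},
    {"name": "item B", "score": 84},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write the records to a temporary JSON file.
    path = os.path.join(tmp_dir, "sample.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f)

    # A list of JSON objects becomes one row per object.
    df_from_file = pd.read_json(path)

print(df_from_file)
```

Passing a URL string instead of a local path follows the same pattern.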

Create a DataFrame and Convert It to JSON

If you don't have a JSON file at hand, you can create a small DataFrame and convert it to JSON using different orientations.

  • orient='split': stores the columns, index and data under separate keys.
  • orient='index': maps each index label to that row's column-value pairs.
Python
df = pd.DataFrame([['a', 'b'], ['c', 'd']],
                  index=['row 1', 'row 2'],
                  columns=['col 1', 'col 2'])

print(df.to_json(orient='split')) 
print(df.to_json(orient='index'))

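Output:

{"columns":["col 1","col 2"],"index":["row 1","row 2"],"data":[["a","b"],["c","d"]]}
{"row 1":{"col 1":"a","col 2":"b"},"row 2":{"col 1":"c","col 2":"d"}}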
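
The same DataFrame can be serialized with other orientations and read back again. The sketch below uses orient='records', which is not covered above, and then round-trips the orient='split' string through read_json; wrapping the string in StringIO is an assumption made here to hand pandas a file-like object rather than a literal string.

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame([['a', 'b'], ['c', 'd']],
                  index=['row 1', 'row 2'],
                  columns=['col 1', 'col 2'])

# orient='records' emits a list of row objects and drops the index.
records_json = df.to_json(orient='records')
print(records_json)

# Round trip: serialize with orient='split', then parse with the same orient
# so that pandas knows how to interpret the keys.
split_json = df.to_json(orient='split')
restored = pd.read_json(StringIO(split_json), orient='split')
print(restored)
```

Using the same orient value on both sides matters: read_json cannot always infer which layout a JSON string was written in.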

Read JSON Directly from Web Data

You can fetch JSON data from online sources using the requests library and then convert it to a DataFrame. The example below requests JSON from an API endpoint, converts it to a DataFrame and displays the first rows.

  • requests.get(url) fetches data from the URL.
  • response.json() converts response to a Python dictionary/list.
  • json_normalize() converts nested JSON into a flat table.
Python
import pandas as pd
import requests

url = 'https://jsonplaceholder.typicode.com/posts'
response = requests.get(url)
response.raise_for_status()  # stop early if the request failed

# Parse the response body as JSON and flatten it into a DataFrame
data = pd.json_normalize(response.json())
data.head()

Output:

[Screenshot: first five rows of the data DataFrame]
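
To see what the conversion step does without a network connection, the sketch below applies json_normalize() to an inline list of dictionaries. The sample_posts list is a hypothetical stand-in for what response.json() returns; its field names are assumed for illustration and are not taken from a live response.

```python
import pandas as pd

# Assumed sample: a list of flat dictionaries standing in for response.json().
sample_posts = [
    {"userId": 1, "id": 1, "title": "first post", "body": "hello"},
    {"userId": 1, "id": 2, "title": "second post", "body": "world"},
]

# Each dictionary becomes a row; each key becomes a column.
posts = pd.json_normalize(sample_posts)
print(posts[["id", "title"]])

# For flat records like these, the DataFrame constructor builds the same columns.
same = pd.DataFrame(sample_posts)
print(list(same.columns))
```

json_normalize() only behaves differently from the plain constructor once the records contain nested dictionaries or lists.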

Handling Nested JSON in Pandas

JSON data often contains lists or dictionaries inside other dictionaries; this is called nested JSON. To flatten nested JSON into a table, use json_normalize() from pandas, which makes the data easier to analyze and manipulate.

  • json.load(f): Loads the raw JSON into a Python dictionary.
  • json_normalize(d['programs']): Extracts the list under the programs key and flattens it into columns.
Python
import json  
import pandas as pd  
from pandas import json_normalize  

with open('/content/raw_nyc_phil.json') as f:
    d = json.load(f)

nycphil = json_normalize(d['programs'])
nycphil.head(3)

Output:

[Screenshot: first three rows of the nycphil DataFrame]

As the output above shows, the result is a readable table with columns such as id, orchestra and season. Working with JSON can seem confusing at first, especially when it is deeply nested, but with pandas and a little practice using json_normalize() you can turn messy JSON into clean, tabular data.
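
The file used above is not included here, so the following sketch uses a small hypothetical dictionary with a programs key to show how json_normalize() treats nested dictionaries and nested lists. All field names and values (venue, works and so on) are invented for illustration and are not claimed to match the real dataset.

```python
import pandas as pd

# Hypothetical data: field names and values are for illustration only.
d = {
    "programs": [
        {
            "id": "p1",
            "season": "season 1",
            "venue": {"city": "city A", "hall": "hall A"},
            "works": [
                {"title": "work 1", "composer": "composer X"},
                {"title": "work 2", "composer": "composer Y"},
            ],
        },
        {
            "id": "p2",
            "season": "season 1",
            "venue": {"city": "city A", "hall": "hall B"},
            "works": [
                {"title": "work 3", "composer": "composer X"},
            ],
        },
    ]
}

# Nested dictionaries become dotted column names such as 'venue.city';
# nested lists (here 'works') stay as Python lists in a single column.
programs = pd.json_normalize(d["programs"])
print(programs.columns.tolist())

# record_path expands the nested list into one row per work, and
# meta copies the selected parent fields onto each of those rows.
works = pd.json_normalize(d["programs"], record_path="works", meta=["id", "season"])
print(works)
```

Choosing record_path is a design decision about row granularity: without it there is one row per program, with it there is one row per item in the nested list.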

