Read And Write Tabular Data using Pandas
Last Updated: 05 Feb, 2024
Pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with “relational” or “labeled” data both easy and intuitive. It aims to be the fundamental, high-level building block for doing practical, real-world data analysis in Python.
The two primary data structures of Pandas, Series (1-dimensional) and DataFrame (2-dimensional), handle the vast majority of typical use cases in finance, statistics, social science, and many areas of engineering. For R users, DataFrame provides everything that R’s data.frame provides, and much more. Pandas is built on top of NumPy and is intended to integrate well within a scientific computing environment with many other third-party libraries.
Data structures

| Dimension | Name | Description |
|---|---|---|
| 1 | Series | 1D labeled, homogeneously typed array |
| 2 | DataFrame | General 2D labeled, size-mutable tabular structure with potentially heterogeneously typed columns |
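The difference between the two structures can be sketched in a few lines. The values below are made up for illustration (loosely modeled on the tips data used later in this article), and the variable names are hypothetical.

```python
import pandas as pd

# A Series: one-dimensional, labeled, with a single dtype for all values
tips = pd.Series([1.01, 1.66, 3.50], index=["a", "b", "c"], name="tip")

# A DataFrame: two-dimensional, and each column may have its own dtype
bills = pd.DataFrame(
    {
        "total_bill": [16.99, 10.34, 21.01],
        "sex": ["Female", "Male", "Male"],
        "size": [2, 3, 3],
    }
)

print(tips.dtype)
print(bills.dtypes)

# Selecting a single column of a DataFrame returns a Series
size_column = bills["size"]
print(type(size_column).__name__)

# A DataFrame is size-mutable: a new column can be added in place
bills["tip"] = tips.to_numpy()
print(bills.shape)
```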
Reading Tabular Data
Pandas provides the read_csv() function to read data stored in a CSV file into a Pandas DataFrame. Pandas supports many different file formats and data sources out of the box (CSV, Excel, SQL, JSON, Parquet, …), each read by a function with the prefix read_*.
Importing necessary libraries
Python3
# Import pandas under its conventional alias, pd
import pandas as pd
1. Reading the CSV file
Dataset link : dataset.csv
Python
# Load the dataset from the 'dataset.csv' file using Pandas
data = pd.read_csv('dataset.csv')
# Display the first few rows of the loaded dataset
print(data.head())
Output:
total_bill tip sex smoker day time size
0 16.99 1.01 Female No Sun Dinner 2
1 10.34 1.66 Male No Sun Dinner 3
2 21.01 3.50 Male No Sun Dinner 3
3 23.68 3.31 Male No Sun Dinner 2
4 24.59 3.61 Female No Sun Dinner 4
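read_csv() also accepts parameters that control which columns and rows are loaded and how values are interpreted. The sketch below reads from an in-memory string instead of dataset.csv so that it runs on its own. The CSV text and the "missing" marker are assumptions made for illustration, while usecols, dtype, nrows, and na_values are standard read_csv() parameters.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for dataset.csv;
# the word "missing" marks an absent tip value
csv_text = """total_bill,tip,sex,smoker,day,time,size
16.99,1.01,Female,No,Sun,Dinner,2
10.34,missing,Male,No,Sun,Dinner,3
21.01,3.50,Male,No,Sun,Dinner,3
23.68,3.31,Male,No,Sun,Dinner,2
"""

data = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["total_bill", "tip", "size"],  # load only these columns
    dtype={"size": "int64"},                # force a dtype for one column
    na_values=["missing"],                  # treat this marker as NaN
    nrows=3,                                # read only the first 3 data rows
)

print(data)
print(data["tip"].isna().sum())
```

With a real file, the same keyword arguments can be passed alongside the file path, for example pd.read_csv('dataset.csv', nrows=3).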
2. Reading an Excel file
Dataset link : data.xlsx
Python
# Load the dataset from the 'data.xlsx' file using Pandas
data = pd.read_excel('data.xlsx')
# Display the first few rows of the loaded dataset
print(data.head())
Output:
Column1 Column2 Column3
0 1 A 10.5
1 2 B 20.3
2 3 C 15.8
3 4 D 8.2
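The other read_* functions follow the same pattern as the two shown above. As a minimal sketch, the example below uses read_json() on an in-memory string; the JSON records are hypothetical and simply mirror the first two rows of the Excel output.

```python
import io

import pandas as pd

# Hypothetical JSON records mirroring the first two rows shown above
json_text = (
    '[{"Column1": 1, "Column2": "A", "Column3": 10.5},'
    ' {"Column1": 2, "Column2": "B", "Column3": 20.3}]'
)

# orient="records" tells pandas the input is a list of row objects
data = pd.read_json(io.StringIO(json_text), orient="records")

print(data.head())
```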
Writing Tabular Data
1. Writing to an Excel file
Python
# Reading the data from a CSV file named 'dataset.csv' into a pandas DataFrame
data = pd.read_csv('dataset.csv')
# Specifying the path for the new Excel file to be created
excel_file_path = 'newDataset.xlsx'
# Writing the DataFrame to an Excel file with the specified path, excluding the index column
data.to_excel(excel_file_path, index=False)
# Displaying a message indicating that the data has been successfully written to the Excel file
print(f'Data written to Excel file: {excel_file_path}')
Output:
Data written to Excel file: newDataset.xlsx
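When more than one table belongs in the same workbook, pd.ExcelWriter can be used to write each DataFrame to its own sheet. The following is a sketch under a few assumptions: the data and sheet names are made up, the file is written to a temporary directory, and an Excel engine such as openpyxl must be installed (the code reports this instead of failing if it is not).

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical data; the summary holds the average tip per day
data = pd.DataFrame({"day": ["Sun", "Sun", "Sat"], "tip": [1.01, 1.66, 3.50]})
summary = data.groupby("day", as_index=False)["tip"].mean()

sheet_names = None  # stays None if no Excel engine is installed

with tempfile.TemporaryDirectory() as tmp_dir:
    excel_file_path = Path(tmp_dir) / "newDataset.xlsx"
    try:
        # Each to_excel call writes one sheet of the same workbook
        with pd.ExcelWriter(excel_file_path) as writer:
            data.to_excel(writer, sheet_name="raw", index=False)
            summary.to_excel(writer, sheet_name="summary", index=False)

        # sheet_name=None reads every sheet into a dict of DataFrames
        sheets = pd.read_excel(excel_file_path, sheet_name=None)
        sheet_names = list(sheets)
    except ImportError:
        print("No Excel engine (for example openpyxl) is installed")

print(sheet_names)
```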
2. Writing to a CSV file
Python
# Reading the data from a CSV file named 'dataset.csv' into a pandas DataFrame
data = pd.read_csv('dataset.csv')
# Specifying the path for the new CSV file to be created
csv_file_path = 'newDataset.csv'
# Writing the DataFrame to a CSV file with the specified path, excluding the index column
data.to_csv(csv_file_path, index=False)
# Displaying a message indicating that the data has been successfully written to the CSV file
print(f'Data written to CSV file: {csv_file_path}')
Output:
Data written to CSV file: newDataset.csv
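to_csv() takes further parameters beyond index. As an illustrative sketch, the example below writes tab-separated text with an explicit marker for missing values and then reads it back. It uses an in-memory buffer and made-up rows so that it runs without any files; sep and na_rep are standard to_csv() parameters.

```python
import io

import pandas as pd

# Hypothetical rows; None becomes a missing value (NaN) in the tip column
data = pd.DataFrame(
    {"total_bill": [16.99, 10.34], "tip": [1.01, None], "day": ["Sun", "Sun"]}
)

buffer = io.StringIO()
# sep="\t" writes tab-separated values; na_rep sets the text used for NaN
data.to_csv(buffer, sep="\t", na_rep="NA", index=False)
tsv_text = buffer.getvalue()
print(tsv_text)

# Reading the text back with the same separator restores the missing value
restored = pd.read_csv(io.StringIO(tsv_text), sep="\t")
print(restored)
```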
Conclusion
In conclusion, Pandas provides the tools needed to read and write tabular data across a range of file formats. The functions covered here, read_csv, read_excel, to_csv, and to_excel, load a file into a DataFrame and export a DataFrame back to a file.
These functions also accept parameters for details such as how missing values are represented and whether the index is written. Whether the data is stored as CSV, Excel, SQL, JSON, or another supported format, Pandas offers a consistent read_* and to_* interface for moving data in and out of a DataFrame.