Data profiling in Pandas using Python

Last Updated : 04 May, 2020
Author: itsanjanikumari

Pandas is one of the most popular Python libraries, mainly used for data manipulation and analysis. When working with large datasets, we often need to perform Exploratory Data Analysis (EDA): getting a detailed description of the available columns and their relationships, checking for null and missing values, inspecting data types, and so on. pandas-profiling is a Python module that performs this EDA and produces a detailed report with just a few lines of code.

Installation:

    pip install pandas-profiling

Example:

    # import the packages
    import pandas as pd
    import pandas_profiling  # importing this adds profile_report() to DataFrame

    # read the file
    df = pd.read_csv('Geeks.csv')

    # run the profile report
    profile = df.profile_report(title='Pandas Profiling Report')

    # save the report as an HTML file
    profile.to_file(output_file="pandas_profiling1.html")

    # save the report as a JSON file
    profile.to_file(output_file="pandas_profiling2.json")

Output: the script writes the report twice, once as an HTML file (pandas_profiling1.html) and once as a JSON file (pandas_profiling2.json).
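To see what kind of checks the report bundles together, here is a rough sketch of a few of them done by hand with plain pandas. It is not how pandas-profiling works internally. The quick_profile helper and the small sample DataFrame are hypothetical and stand in for Geeks.csv, whose contents are not shown in this article.

```python
import pandas as pd


def quick_profile(df: pd.DataFrame) -> dict:
    """Collect a few manual EDA summaries (hypothetical helper, not part of pandas-profiling)."""
    numeric = df.select_dtypes(include="number")
    return {
        # number of rows and columns
        "shape": df.shape,
        # data type of each column, as strings
        "dtypes": df.dtypes.astype(str).to_dict(),
        # count of missing values per column
        "missing": {col: int(n) for col, n in df.isna().sum().items()},
        # number of fully duplicated rows
        "duplicate_rows": int(df.duplicated().sum()),
        # descriptive statistics for numeric columns
        "describe": numeric.describe(),
        # pairwise correlation between numeric columns
        "correlation": numeric.corr(),
    }


# Hypothetical sample data used only for illustration
df = pd.DataFrame({
    "name": ["a", "b", "c", None],
    "age": [21, 35, None, 42],
    "score": [88.5, 92.0, 79.5, 92.0],
})

summary = quick_profile(df)
print(summary["missing"])
print(summary["describe"])
print(summary["correlation"])
```

Each entry covers one of the items listed in the introduction (data types, missing values, column relationships). The profile report gathers these and more into a single HTML or JSON document.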