Check missing dates in Pandas
Last Updated: 15 Sep, 2022
In this article, we will learn how to check missing dates in Pandas.
A data frame is created from a dictionary of lists using pd.DataFrame(), which accepts the data as its parameter. Here the dictionary holds two lists of equal length, named Date and Name, and some dates are missing from the given sequence of dates (from 2021-01-18 to 2021-01-25).
Checking whether the given Date is missing from the data frame
Here we return True if the date is present in the data frame and False if it is missing.
Python3
import pandas as pd
# A dataframe from a dictionary of lists
data = {'Date': ['2021-01-18', '2021-01-20',
'2021-01-23', '2021-01-25'],
'Name': ['Jia', 'Tanya', 'Rohan', 'Sam']}
df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df['Date'])
# Date to look up
d = '2021-01-19'
# True if the date is present in the Date column, False if it is missing
print(pd.to_datetime(d) in df['Date'].tolist())
Output:
False
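The same membership test can be applied to several dates at once. The following is a minimal sketch, assuming the same data frame as above; the names candidates, present and result are hypothetical and only used for illustration. It relies on Series.isin() rather than converting the column to a list.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2021-01-18', '2021-01-20', '2021-01-23', '2021-01-25'],
    'Name': ['Jia', 'Tanya', 'Rohan', 'Sam'],
})
df['Date'] = pd.to_datetime(df['Date'])

# Hypothetical dates to look up (an assumption for this sketch)
candidates = pd.Series(pd.to_datetime(['2021-01-19', '2021-01-20', '2021-01-24']))

# isin() marks each candidate that appears in the Date column
present = candidates.isin(df['Date'])

# Map each candidate to True (present) or False (missing)
result = dict(zip(candidates.dt.strftime('%Y-%m-%d'), present.tolist()))
print(result)
```

Only 2021-01-20 appears in the Date column, so it is the only candidate mapped to True.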
Using date_range() and .difference() function to check missing dates
Example 1:
The df.set_index() method sets the dates as the index for the data frame we created. One can simply print the data frame using print(df) to see it before and after setting the Date as an index. Once the date is the index, we convert it to datetime values. Originally, the dates in our list are strings, so they must be converted before they can be compared with a date range. Pandas provides a method called to_datetime() which converts the date and time in string format to a DateTime object.
Python3
#import pandas
import pandas as pd
# A dataframe from a dictionary of lists
data = {'Date': ['2021-01-18', '2021-01-20',
'2021-01-23', '2021-01-25'],
'Name': ['Jia', 'Tanya', 'Rohan', 'Sam']}
df = pd.DataFrame(data)
# Setting the Date values as index
df = df.set_index('Date')
# to_datetime() method converts string
# format to a DateTime object
df.index = pd.to_datetime(df.index)
# dates which are not in the sequence
# are returned
print(pd.date_range(
start="2021-01-18", end="2021-01-25").difference(df.index))
Output:
DatetimeIndex(['2021-01-19', '2021-01-21', '2021-01-22', '2021-01-24'], dtype='datetime64[ns]', freq=None)
Finally, we get all the dates that are missing between 2021-01-18 and 2021-01-25.
pandas.Index.difference() returns a new Index with the elements of the index that are not in the other index. Therefore, by using pd.date_range(start date, end date).difference(Date), we get all the dates in the range that are not present in our list of dates. The result is a DatetimeIndex, which is an immutable ndarray-like of datetime64 data.
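The start and end dates do not have to be typed by hand. As a minimal sketch, assuming the range of interest is exactly the span covered by the data, the bounds can be taken from the index itself; the names full_range and missing are hypothetical.

```python
import pandas as pd

dates = pd.to_datetime(['2021-01-18', '2021-01-20', '2021-01-23', '2021-01-25'])
df = pd.DataFrame({'Name': ['Jia', 'Tanya', 'Rohan', 'Sam']}, index=dates)

# Build the full daily range from the first and last dates in the data
full_range = pd.date_range(start=df.index.min(), end=df.index.max())

# Dates in the full range that are absent from the index
missing = full_range.difference(df.index)
print(missing)
```

This cannot detect dates missing before the first or after the last recorded date, so explicit bounds are still needed when the expected period is wider than the data.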
Example 2:
Let us consider another example. This time we will not set the date as an index, and we will pass freq='B' (Business Day Frequency) to the pd.date_range() function.
Just like the previous example, we make a data frame from the dictionary of lists. Instead of the date values, we set the column 'Total People' as the index. We then call pd.date_range() with the start date, end date and frequency as parameters, using freq='B' in order to omit weekends. Finally, pandas.Index.difference() takes the Date column as a parameter and returns all the dates in the range that are not in that column.
Python3
#import pandas
import pandas as pd
# A dataframe from a dictionary of lists
d = {'Date': ['2021-01-10', '2021-01-14', '2021-01-18',
'2021-01-25', '2021-01-28', '2021-01-29'],
'Total People': [20, 21, 19, 18, 13, 56]}
df = pd.DataFrame(d)
# Setting the Total People as index
df = df.set_index('Total People')
# to_datetime() method converts string
# format to a DateTime object
df['Date'] = pd.to_datetime(df['Date'])
# dates which are not in the sequence
# are returned
my_range = pd.date_range(
start="2021-01-10", end="2021-01-31", freq='B')
print(my_range.difference(df['Date']))
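Output:
The call returns a DatetimeIndex holding the ten business days that are absent from the Date column: 2021-01-11, 2021-01-12, 2021-01-13, 2021-01-15, 2021-01-19, 2021-01-20, 2021-01-21, 2021-01-22, 2021-01-26 and 2021-01-27.
Note that the weekend dates in the range (2021-01-16, 2021-01-17, 2021-01-23, 2021-01-24, 2021-01-30 and 2021-01-31) are not reported as missing, because we have set freq='B', which omits all weekends from the range being compared.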
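To see exactly which dates freq='B' leaves out, one can compare a plain daily range with the business-day range over the same window. This is only an illustrative sketch; the names all_days, business_days and weekends are hypothetical.

```python
import pandas as pd

# Same window, once with every day and once with business days only
all_days = pd.date_range(start="2021-01-10", end="2021-01-31")
business_days = pd.date_range(start="2021-01-10", end="2021-01-31", freq='B')

# Dates dropped by freq='B'
weekends = all_days.difference(business_days)
print(weekends)

# dayofweek is 5 for Saturday and 6 for Sunday
print(weekends.dayofweek.tolist())
```

Every dropped date is a Saturday or Sunday. The 'B' frequency only removes weekends; public holidays that fall on weekdays stay in the range.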
Using reindex() function to check missing dates
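Here we convert the string dates to the datetime type and set them as the index. With the help of reindex(), we then reindex the data frame on a full date range. Dates that were not in the original data frame get rows filled with NaN, so each date in the range is marked True if it is missing from the data frame and False otherwise.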
Python3
import pandas as pd
# A dataframe from a dictionary of lists
data = {'Date': ['2021-01-18', '2021-01-20',
'2021-01-23', '2021-01-25'],
'Name': ['Jia', 'Tanya', 'Rohan', 'Sam']}
df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df['Date'])
df.set_index('Date', inplace=True)
# Rows added by reindex() are all NaN, so all(axis=1) is True for missing dates
print(df.reindex(pd.date_range('2021-01-17', '2021-01-29')
).isnull().all(axis=1))
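Output:
The result is a boolean Series indexed by every date from 2021-01-17 to 2021-01-29. It is False for the four dates present in the data frame (2021-01-18, 2021-01-20, 2021-01-23 and 2021-01-25) and True for all the other dates.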
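If only the missing dates themselves are wanted, the boolean result can be used to filter its own index. The following is a minimal sketch under the same data as above; mask and missing are hypothetical names.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': pd.to_datetime(['2021-01-18', '2021-01-20', '2021-01-23', '2021-01-25']),
    'Name': ['Jia', 'Tanya', 'Rohan', 'Sam'],
}).set_index('Date')

# True for every date in the range whose row is entirely NaN after reindexing
mask = df.reindex(pd.date_range('2021-01-17', '2021-01-29')).isnull().all(axis=1)

# Keep only the index labels where the mask is True
missing = mask[mask].index
print(missing)
```

One caveat of this approach: a date that exists in the data frame but has NaN in every column would also be flagged as missing.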