Processing time with Pandas DataFrame
Last Updated :
03 Aug, 2023
Pandas was created with regards to financial modeling, so as you may expect, it contains a genuinely ample number of tools for working with dates and times. Sometimes the given format of the date and time in our dataset cannot be directly used for analysis, so we pre-process these time values to obtain features like date, month, year, hours, minutes and seconds. Let's discuss all the different ways to process date and time with Pandas dataframe. Divide date and time into multiple features: Create five dates and time using pd.date_range which generate sequences of fixed-frequency dates and time spans. Then we use pandas.Series.dt to extract the features.
Python3
# Load library
import pandas as pd
# calling DataFrame constructor
df = pd.DataFrame()
# Create 6 dates
df['time'] = pd.date_range('2/5/2019', periods = 6, freq ='2H')
print(df['time']) # print dataframe
# Extract features - year, month, day, hour, and minute
df['year'] = df['time'].dt.year
df['month'] = df['time'].dt.month
df['day'] = df['time'].dt.day
df['hour'] = df['time'].dt.hour
df['minute'] = df['time'].dt.minute
# Show six rows
df.head(6)
Output:
0 2019-02-05 00:00:00
1 2019-02-05 02:00:00
2 2019-02-05 04:00:00
3 2019-02-05 06:00:00
4 2019-02-05 08:00:00
5 2019-02-05 10:00:00
Name: time, dtype: datetime64[ns]
time year month day hour minute
0 2019-02-05 00:00:00 2019 2 5 0 0
1 2019-02-05 02:00:00 2019 2 5 2 0
2 2019-02-05 04:00:00 2019 2 5 4 0
3 2019-02-05 06:00:00 2019 2 5 6 0
4 2019-02-05 08:00:00 2019 2 5 8 0
5 2019-02-05 10:00:00 2019 2 5 10 0
Convert strings to Timestamps: We convert the given strings to datetime format using pd.to_datetime and then we can extract different features from the datetime using first method.
Python3
# Load libraries
import numpy as np
import pandas as pd
# Create time Strings
dt_strings = np.array(['04-03-2019 12:35 PM',
'22-06-2017 11:01 AM',
'05-09-2009 07:09 PM'])
# Convert to datetime format
timestamps = [pd.to_datetime(date, format ="%d-%m-%Y%I:%M %p",
errors ="coerce") for date in dt_strings]
# Convert to datetimes
timestamps = [pd.to_datetime(date, format ="%d-%m-%Y %I:%M %p",
errors ="coerce") for date in dt_strings]
Output:
[Timestamp('2019-03-04 12:35:00'), Timestamp('2017-06-22 11:01:00'), Timestamp('2009-09-05 19:09:00')]
Change the pattern of Timestamps: We can change the pattern of Timestamp using strftime method.
Python3
# Load library
import pandas as pd
# calling DataFrame constructor
df = pd.DataFrame()
# Create 6 dates
df['time'] = pd.date_range('2/5/2019', periods = 6, freq ='2H')
print("Old Pattern")
print(df['time']) #
print("\nNew Pattern")
df['new_time']=df['time'].dt.strftime("%d-%B")
print(df['new_time'])
Output:
Old Pattern
0 2019-02-05 00:00:00
1 2019-02-05 02:00:00
2 2019-02-05 04:00:00
3 2019-02-05 06:00:00
4 2019-02-05 08:00:00
5 2019-02-05 10:00:00
Name: time, dtype: datetime64[ns]
New Pattern
0 05-February
1 05-February
2 05-February
3 05-February
4 05-February
5 05-February
Name: new_time, dtype: object
Note: strftime method converts the timestamp to object datatype.
Extract Days Of the Week from the given Date: We use Series.dt.weekday_name to find name of the day in a week from the given Date.
Python3
# Load library
import pandas as pd
df = pd.DataFrame()
# Create 6 dates
dates = pd.pd.Series(date_range('2/5/2019', periods = 6, freq ='M'))
print(dates)
# Extract days of week and then print
print(dates.dt.weekday_name)
Output:
0 2019-02-28
1 2019-03-31
2 2019-04-30
3 2019-05-31
4 2019-06-30
5 2019-07-31
dtype: datetime64[ns]
0 Thursday
1 Sunday
2 Tuesday
3 Friday
4 Sunday
5 Wednesday
dtype: object
Extract Data in Date and Time Ranges: We can obtain the rows that lie in particular time range from the given dataset. Method #1: If the dataset is not indexed with time.
Python3
# Load library
import pandas as pd
# Create data frame
df = pd.DataFrame()
# Create datetimes
df['date'] = pd.date_range('1/1/2012', periods = 1000, freq ='H')
print(df.head(5))
# Select observations between two datetimes
x = df[(df['date'] > '2012-1-1 01:00:00') &
(df['date'] <= '2012-1-1 11:00:00')]
print(x)
Output:
date
0 2012-01-01 00:00:00
1 2012-01-01 01:00:00 // 5 rows of Timestamps out of 1000
2 2012-01-01 02:00:00
3 2012-01-01 03:00:00
4 2012-01-01 04:00:00
date
2 2012-01-01 02:00:00
3 2012-01-01 03:00:00
4 2012-01-01 04:00:00
5 2012-01-01 05:00:00 //Timestamps in the given range
6 2012-01-01 06:00:00
7 2012-01-01 07:00:00
8 2012-01-01 08:00:00
9 2012-01-01 09:00:00
10 2012-01-01 10:00:00
11 2012-01-01 11:00:00
Method #2: If the dataset is indexed with time
Python3
# Load library
import pandas as pd
# Create data frame
df = pd.DataFrame()
# Create datetimes
df['date'] = pd.date_range('1/1/2012', periods = 1000, freq ='H')
# Set index
df = df.set_index(df['date'])
print(df.head(5))
# Select observations between two datetimes
x = df.loc['2012-1-1 04:00:00':'2012-1-1 12:00:00']
print(x)
Output:
date
date
2012-01-01 00:00:00 2012-01-01 00:00:00
2012-01-01 01:00:00 2012-01-01 01:00:00
2012-01-01 02:00:00 2012-01-01 02:00:00
2012-01-01 03:00:00 2012-01-01 03:00:00 // 5 rows of Timestamps out of 1000
2012-01-01 04:00:00 2012-01-01 04:00:00
date
date
2012-01-01 04:00:00 2012-01-01 04:00:00
2012-01-01 05:00:00 2012-01-01 05:00:00
2012-01-01 06:00:00 2012-01-01 06:00:00
2012-01-01 07:00:00 2012-01-01 07:00:00
2012-01-01 08:00:00 2012-01-01 08:00:00
2012-01-01 09:00:00 2012-01-01 09:00:00 //Timestamps in the given range
2012-01-01 10:00:00 2012-01-01 10:00:00
2012-01-01 11:00:00 2012-01-01 11:00:00
2012-01-01 12:00:00 2012-01-01 12:00:00
Similar Reads
Data Processing with Pandas
Data Processing is an important part of any task that includes data-driven work. It helps us to provide meaningful insights from the data. As we know Python is a widely used programming language, and there are various libraries and tools available for data processing. In this article, we are going t
10 min read
How to preprocess string data within a Pandas DataFrame?
Sometimes, the data which we're working on might be stuffed in a single column, but for us to work on the data, the data should be spread out into different columns and the columns must be of different data types. When all the data is combined in a single string, the string needs to be preprocessed.
3 min read
Python | Pandas dataframe.at_time()
Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric python packages. Pandas is one of those packages and makes importing and analyzing data much easier.Pandasdataframe.at_time() function is used to select all the values in a row correspond
2 min read
Python | Pandas DataFrame.where()
Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric Python packages. Pandas is one of those packages and makes importing and analyzing data much easier. Pandas where() method in Python is used to check a data frame for one or more conditio
2 min read
Sorting rows in pandas DataFrame
Pandas DataFrame is two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). We often need to do certain operations on both rows and column while handling the data. Letâs see how to sort rows in pandas DataFrame. Code #1: Sorting rows by Sc
2 min read
Python | Pandas Dataframe.pop()
The pop() method in Pandas is used to remove a column from a DataFrame and return it as a Series. This is similar in concept to the dictionary pop() method in Python, but specifically designed for use with Pandas DataFrames. It's key features include:Removes a specified column from a DataFrame.Retur
2 min read
Python | Pandas DataFrame.to_records
Pandas DataFrame is a two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. It can be thought of as a dict-like container for Series objects. This is the primary data structure o
2 min read
Python | Pandas dataframe.rmul()
Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric python packages. Pandas is one of those packages and makes importing and analyzing data much easier. Pandas dataframe.rmul() function is used for finding the multiplication of dataframe a
2 min read
Get Size of the Pandas DataFrame
In this article, we will discuss how to get the size of the Pandas Dataframe using Python. Method 1 : Using df.size This will return the size of dataframe  i.e. rows*columns Syntax: dataframe.size where, dataframe is the input dataframe Example: Python code to create a student dataframe and display
2 min read
Python | Pandas DataFrame.values
Pandas DataFrame is a two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. It can be thought of as a dict-like container for Series objects. This is the primary data structure o
2 min read