Select Pandas dataframe rows between two dates


import modules

import pandas as pd
import numpy as np

create dummy dataframe

raw_data = {'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
            'age': [20, 19, 22, 21],
            'favorite_color': ['blue', 'red', 'yellow', 'green'],
            'grade': [88, 92, 95, 70],
            'birth_date': ['01-02-1996', '08-05-1997', '04-28-1996', '12-16-1995']}
df = pd.DataFrame(raw_data, index=['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'])
df
                  age  birth_date  favorite_color  grade              name
Willard Morris     20  01-02-1996            blue     88    Willard Morris
Al Jennings        19  08-05-1997             red     92       Al Jennings
Omar Mullins       22  04-28-1996          yellow     95      Omar Mullins
Spencer McDaniel   21  12-16-1995           green     70  Spencer McDaniel

Select Pandas dataframe rows between two dates

We can do this with a boolean mask.
First, let's make sure the 'birth_date' column is in datetime format

df['birth_date'] = pd.to_datetime(df['birth_date'])
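
The conversion above lets pandas infer the date layout from the strings. As a minimal sketch, assuming the values are always written month-day-year (as '04-28-1996' suggests), the layout can instead be stated explicitly through the format argument of pd.to_datetime. The Series name dates is made up for this illustration.

```python
import pandas as pd

# Hypothetical sample values in the same month-day-year layout as the tutorial data
dates = pd.Series(['01-02-1996', '08-05-1997', '04-28-1996', '12-16-1995'])

# Stating the format removes any ambiguity between month-first and day-first strings
parsed = pd.to_datetime(dates, format='%m-%d-%Y')

# Inspect the parsed months to confirm the first field was read as the month
print(parsed.dt.month.tolist())
```

With an explicit format, a string that does not match the layout raises an error rather than being parsed in an unexpected way.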

Next, set the start date and end date to filter df with.
These can be datetime objects (NumPy or pandas), timestamps, or strings

start_date = '03-01-1996'
end_date = '06-01-1997'
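
As a rough illustration of the note above, the same cutoff can be written as a string, a pandas Timestamp, a Python datetime, or a NumPy datetime64, and each should give the same comparison result against a datetime column. The Series s and the variable names below are made up for this example.

```python
import datetime as dt

import numpy as np
import pandas as pd

# Hypothetical datetime column with one value after and one before the cutoff
s = pd.Series(pd.to_datetime(['1996-04-28', '1995-12-16']))

# The same cutoff (1 March 1996) expressed in four different types
as_string = s > '03-01-1996'
as_timestamp = s > pd.Timestamp('1996-03-01')
as_datetime = s > dt.datetime(1996, 3, 1)
as_numpy = s > np.datetime64('1996-03-01')

# Each comparison yields a boolean Series; compare them side by side
print(as_string.tolist(), as_timestamp.tolist(), as_datetime.tolist(), as_numpy.tolist())
```

A Timestamp or datetime object states the month and day explicitly, which can be safer than a string such as '03-01-1996' that could be read either month-first or day-first.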

Next, build the mask. We can then apply it to df to filter the rows

mask = (df['birth_date'] > start_date) & (df['birth_date'] <= end_date)

Apply the mask to df to return the rows whose birth_date falls after start_date and on or before end_date

df = df.loc[mask]
df
              age  birth_date  favorite_color  grade          name
Omar Mullins   22  1996-04-28          yellow     95  Omar Mullins
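
An alternative to writing the two comparisons by hand is Series.between. The sketch below is not part of the original walkthrough. It assumes pandas 1.3 or later, where the inclusive argument accepts the strings 'both', 'neither', 'left', and 'right'. The names people, in_range, and selected are made up for this example, and inclusive='right' is chosen to mirror the mask above (start excluded, end included).

```python
import pandas as pd

# Hypothetical frame reusing the names and birth dates from the tutorial data
people = pd.DataFrame({
    'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
    'birth_date': pd.to_datetime(
        ['01-02-1996', '08-05-1997', '04-28-1996', '12-16-1995'],
        format='%m-%d-%Y'),
})

start_date = pd.Timestamp('1996-03-01')
end_date = pd.Timestamp('1997-06-01')

# inclusive='right' excludes start_date and includes end_date
in_range = people['birth_date'].between(start_date, end_date, inclusive='right')

# Keep the original frame intact and store the filtered rows separately
selected = people.loc[in_range]
print(selected)
```

Storing the result under a new name, rather than overwriting the original frame, keeps the full data available for other filters.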

