Dropping rows and columns from a pandas DataFrame


import modules

import pandas as pd
import numpy as np
 

create dummy dataframe

raw_data = {'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
            'age': [20, 19, 22, 21],
            'favorite_color': ['blue', 'red', 'yellow', 'green'],
            'grade': [88, 92, 95, 70]}
df = pd.DataFrame(raw_data, index=['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'])
df
                  age  favorite_color  grade              name
Willard Morris     20            blue     88    Willard Morris
Al Jennings        19             red     92       Al Jennings
Omar Mullins       22          yellow     95      Omar Mullins
Spencer McDaniel   21           green     70  Spencer McDaniel
 

drop rows by name (index label); this returns a new dataframe and leaves df unchanged

df.drop(['Willard Morris', 'Spencer McDaniel'])
                  age  favorite_color  grade              name
Al Jennings        19             red     92       Al Jennings
Omar Mullins       22          yellow     95      Omar Mullins
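Because drop() returns a copy unless inplace=True is passed, the original dataframe still holds all four rows after the call above. The sketch below illustrates that, along with the errors='ignore' option for labels that may not exist. The dataframe name `people` and the label 'Nobody Here' are made up for this example.

```python
import pandas as pd

# Small stand-in for the dataframe above (illustrative values only).
people = pd.DataFrame(
    {'age': [20, 19, 22, 21], 'grade': [88, 92, 95, 70]},
    index=['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
)

# drop() returns a new dataframe; `people` itself is left untouched.
kept = people.drop(['Willard Morris', 'Spencer McDaniel'])

# A label missing from the index raises KeyError by default;
# errors='ignore' skips missing labels and drops only the ones that exist.
safe = people.drop(['Al Jennings', 'Nobody Here'], errors='ignore')

print(list(kept.index))  # ['Al Jennings', 'Omar Mullins']
print(len(people))       # 4
print(list(safe.index))  # ['Willard Morris', 'Omar Mullins', 'Spencer McDaniel']
```

Assigning the result back (df = df.drop(...)) is one way to keep the change without using inplace=True.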
 

drop a row by position (df.index[0] is the label of the first row; inplace=True modifies df itself)

df.drop(df.index[0], inplace=True)
df
                  age  favorite_color  grade              name
Al Jennings        19             red     92       Al Jennings
Omar Mullins       22          yellow     95      Omar Mullins
Spencer McDaniel   21           green     70  Spencer McDaniel
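The same idea extends to several positions at once: indexing df.index with a list of positions yields the matching labels, which drop() then removes. This is a minimal sketch on a made-up dataframe called `scores`. Note that drop() works on labels, so if the index contained duplicate labels, every row carrying a dropped label would be removed.

```python
import pandas as pd

# Illustrative dataframe; `scores` is a name chosen for this example.
scores = pd.DataFrame(
    {'grade': [88, 92, 95, 70]},
    index=['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
)

# scores.index[[0, 2]] looks up the labels at positions 0 and 2,
# and drop() removes the rows carrying those labels.
without_first_and_third = scores.drop(scores.index[[0, 2]])

print(list(without_first_and_third.index))  # ['Al Jennings', 'Spencer McDaniel']
```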
 

drop the first 2 rows (use df.index[-2:] instead to drop the last 2 rows)

df.drop(df.index[:2], inplace=True)
df
                  age  favorite_color  grade              name
Spencer McDaniel   21           green     70  Spencer McDaniel
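To make the slice direction concrete, here is a small sketch comparing index[:2] (first two rows) with index[-2:] (last two rows), using a fresh illustrative dataframe named `scores` so the result does not depend on the earlier inplace drops.

```python
import pandas as pd

# Illustrative dataframe; `scores` is a name chosen for this example.
scores = pd.DataFrame(
    {'grade': [88, 92, 95, 70]},
    index=['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
)

# index[:2] selects the first two labels, index[-2:] the last two.
first_two_dropped = scores.drop(scores.index[:2])
last_two_dropped = scores.drop(scores.index[-2:])

print(list(first_two_dropped.index))  # ['Omar Mullins', 'Spencer McDaniel']
print(list(last_two_dropped.index))   # ['Willard Morris', 'Al Jennings']

# With unique index labels, positional slicing with iloc gives the same rows.
print(scores.iloc[2:].equals(first_two_dropped))  # True
```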
 

drop a column by name (axis=1 targets columns instead of rows)

df.drop(['age'], axis=1, inplace=True)
df
                  favorite_color  grade              name
Spencer McDaniel           green     70  Spencer McDaniel
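Recent pandas versions also accept a columns= keyword, which some readers find clearer than axis=1. The sketch below assumes a pandas release that supports that keyword and uses a made-up two-row dataframe named `people`.

```python
import pandas as pd

# Illustrative dataframe; `people` is a name chosen for this example.
people = pd.DataFrame(
    {'age': [20, 19], 'favorite_color': ['blue', 'red'], 'grade': [88, 92]},
    index=['Willard Morris', 'Al Jennings'],
)

# columns=[...] is equivalent to passing the labels with axis=1.
by_keyword = people.drop(columns=['age'])
by_axis = people.drop(['age'], axis=1)

# Several columns can be dropped in one call.
several = people.drop(columns=['age', 'grade'])

print(by_keyword.equals(by_axis))  # True
print(list(by_keyword.columns))    # ['favorite_color', 'grade']
print(list(several.columns))       # ['favorite_color']
```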



