Select rows from a Pandas DataFrame based on values in a column


Import modules

import pandas as pd

Create some dummy data

raw_data = {'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
            'age': [20, 19, 22, 21],
            'favorite_color': ['blue', 'blue', 'yellow', 'green'],
            'grade': [88, 92, 95, 70]}
df = pd.DataFrame(raw_data)
df.head()

   age favorite_color  grade              name
0   20           blue     88    Willard Morris
1   19           blue     92       Al Jennings
2   22         yellow     95      Omar Mullins
3   21          green     70  Spencer McDaniel

Select rows based on column value:

# To select rows whose column value equals a scalar, some_value, use ==:
df.loc[df['favorite_color'] == 'yellow']

   age favorite_color  grade          name
2   22         yellow     95  Omar Mullins
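
It can help to look at what the comparison produces before it is passed to .loc. The sketch below rebuilds the same dummy data and stores the comparison in a variable; the names mask and yellow_rows are only illustrative.

```python
import pandas as pd

# Same dummy data as above.
df = pd.DataFrame({
    'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
    'age': [20, 19, 22, 21],
    'favorite_color': ['blue', 'blue', 'yellow', 'green'],
    'grade': [88, 92, 95, 70],
})

# The comparison returns a boolean Series aligned with df's index.
mask = df['favorite_color'] == 'yellow'
print(mask.tolist())  # [False, False, True, False]

# .loc keeps only the rows where the mask is True.
yellow_rows = df.loc[mask]
print(yellow_rows['name'].tolist())  # ['Omar Mullins']
```

Storing the mask in a variable makes it easy to reuse or combine with other conditions later.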

Select rows whose column value is in an iterable array:

# To select rows whose column value is in an iterable, which we'll define as array, you can use isin:
array = ['yellow', 'green']
df.loc[df['favorite_color'].isin(array)]

   age favorite_color  grade              name
2   22         yellow     95      Omar Mullins
3   21          green     70  Spencer McDaniel
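
isin is not limited to lists of strings. As a small sketch on the same dummy data, the example below passes a set of integers to filter on the age column; the variable names ages and selected are illustrative only.

```python
import pandas as pd

# Same dummy data as above.
df = pd.DataFrame({
    'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
    'age': [20, 19, 22, 21],
    'favorite_color': ['blue', 'blue', 'yellow', 'green'],
    'grade': [88, 92, 95, 70],
})

# isin accepts list-like values, including a set of numbers.
ages = {19, 22}
selected = df.loc[df['age'].isin(ages)]

# Rows come back in their original index order, not the order of the set.
print(selected['name'].tolist())  # ['Al Jennings', 'Omar Mullins']
```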

Select rows based on multiple column conditions:

# To select rows based on multiple conditions, combine them with & (wrap each comparison in parentheses):
array = ['yellow', 'green']
df.loc[(df['age'] == 21) & df['favorite_color'].isin(array)]

   age favorite_color  grade              name
3   21          green     70  Spencer McDaniel
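
Conditions can also be combined with | for "or". The sketch below, using the same dummy data, shows both operators side by side. The parentheses around each comparison matter because & and | bind more tightly than == and > in Python. The names both and either are illustrative.

```python
import pandas as pd

# Same dummy data as above.
df = pd.DataFrame({
    'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
    'age': [20, 19, 22, 21],
    'favorite_color': ['blue', 'blue', 'yellow', 'green'],
    'grade': [88, 92, 95, 70],
})

# AND: both conditions must hold for a row to be kept.
both = df.loc[(df['age'] > 19) & (df['grade'] >= 88)]
print(both['name'].tolist())  # ['Willard Morris', 'Omar Mullins']

# OR: a row is kept if at least one condition holds.
either = df.loc[(df['age'] == 21) | (df['grade'] > 90)]
print(either['name'].tolist())  # ['Al Jennings', 'Omar Mullins', 'Spencer McDaniel']
```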

Select rows where column does not equal a value:

# To select rows where a column value does not equal a value, use !=:
df.loc[df['favorite_color'] != 'yellow']

   age favorite_color  grade              name
0   20           blue     88    Willard Morris
1   19           blue     92       Al Jennings
3   21          green     70  Spencer McDaniel

Select rows whose column value is not in an iterable array:

# To return rows whose column value is not in an iterable, put ~ in front of the isin condition:
array = ['yellow', 'green']
df.loc[~df['favorite_color'].isin(array)]

   age favorite_color  grade            name
0   20           blue     88  Willard Morris
1   19           blue     92     Al Jennings
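
The ~ operator negates any boolean mask, not just the result of isin. As a rough sketch on the same dummy data, the example below negates a single mask and then a combined one; wrapping the combined expression in parentheses makes ~ apply to all of it. The names in_array, not_in_array and neither are illustrative.

```python
import pandas as pd

# Same dummy data as above.
df = pd.DataFrame({
    'name': ['Willard Morris', 'Al Jennings', 'Omar Mullins', 'Spencer McDaniel'],
    'age': [20, 19, 22, 21],
    'favorite_color': ['blue', 'blue', 'yellow', 'green'],
    'grade': [88, 92, 95, 70],
})

array = ['yellow', 'green']
in_array = df['favorite_color'].isin(array)

# ~ flips every True to False and every False to True.
not_in_array = ~in_array
print(not_in_array.tolist())  # [True, True, False, False]

# Negating a combined condition: rows that are neither in array nor aged 20.
neither = df.loc[~(in_array | (df['age'] == 20))]
print(neither['name'].tolist())  # ['Al Jennings']
```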