Creating a quick data pipeline
Question
Suppose you're given the following dataframe:
| | name | gender | age |
|---|---|---|---|
| 0 | Paul | Male | 23 |
| 1 | Sean | Male | 35 |
| 2 | Jenn | Female | 17 |
Using this data, write a data processing pipeline that performs the following actions on the data:
- Group the dataframe by a specified column and return the mean age of each group
- Convert the column names to uppercase
If you're using Python, you can build the dataframe with the code below:
import pandas as pd
# Create empty dataframe
df = pd.DataFrame()
# Create columns
df['name'] = ['Paul', 'Sean', 'Jenn']
df['gender'] = ['Male', 'Male', 'Female']
df['age'] = [23, 35, 17]
# Preview dataframe
df.head()
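
One possible way to approach the question is to write each action as a small function and chain them with pandas' `DataFrame.pipe`. The following is a minimal sketch, not a definitive answer: the function names `mean_age_by_group` and `uppercase_column_names` are hypothetical, and it assumes that "converts the column name to uppercase" refers to the column headers rather than the values in the `name` column.

```python
import pandas as pd

# Rebuild the example dataframe from the question
df = pd.DataFrame({
    "name": ["Paul", "Sean", "Jenn"],
    "gender": ["Male", "Male", "Female"],
    "age": [23, 35, 17],
})


def mean_age_by_group(dataframe: pd.DataFrame, col: str) -> pd.DataFrame:
    """Group by `col` and return the mean age of each group (hypothetical helper)."""
    # as_index=False keeps the grouping column as a regular column
    return dataframe.groupby(col, as_index=False)["age"].mean()


def uppercase_column_names(dataframe: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with every column header uppercased (assumed interpretation)."""
    return dataframe.rename(columns=str.upper)


# Chain the two steps; each step receives the output of the previous one
result = (
    df.pipe(mean_age_by_group, col="gender")
      .pipe(uppercase_column_names)
)
print(result)
```

Grouping by `gender` should give two rows with columns `GENDER` and `AGE`: a mean age of 17.0 for Female and 29.0 for Male. Passing a different value for `col` reuses the same pipeline for another grouping column.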