Creating a quick data pipeline

Question

Suppose you're given the following dataframe:

   name  gender  age
0  Paul    Male   23
1  Sean    Male   35
2  Jenn  Female   17

Using this data, write a data processing pipeline that performs the following actions on the data:

  • Groups the dataframe by a specified column and returns the mean age of each group
  • Converts the column name to uppercase

If you're using Python, you can build the dataframe with the code below:

import pandas as pd
# Create empty dataframe
df = pd.DataFrame()
# Create columns
df['name'] = ['Paul', 'Sean', 'Jenn']
df['gender'] = ['Male', 'Male', 'Female']
df['age'] = [23, 35, 17]
# Preview dataframe
df.head()
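
One possible way to approach the two steps is to write each as a small function and chain them with pandas' DataFrame.pipe. This is only a sketch, not an official solution: the function names mean_age_by_group and uppercase_column_names are hypothetical, and it assumes "the column name" refers to the columns of the grouped result.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame({
    "name": ["Paul", "Sean", "Jenn"],
    "gender": ["Male", "Male", "Female"],
    "age": [23, 35, 17],
})


def mean_age_by_group(dataframe, col):
    """Group by `col` and return the mean age of each group (hypothetical helper)."""
    # Select only 'age' so non-numeric columns such as 'name' are not averaged.
    return dataframe.groupby(col)[["age"]].mean()


def uppercase_column_names(dataframe):
    """Return a copy whose column names are uppercased (hypothetical helper)."""
    # Assumption: only the columns are renamed; the index name is left as-is.
    return dataframe.rename(columns=str.upper)


# Chain the two steps; each function receives the previous step's output.
result = df.pipe(mean_age_by_group, col="gender").pipe(uppercase_column_names)
print(result)
```

Grouping by gender puts the group labels in the index, so only the age column is renamed to AGE. Passing a different value for col groups by another column without changing either function.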
