Continue to develop talent and keep skills current on your team of analysts or data scientists with our real-world, data-centric questions sent straight to your team’s inboxes!

We cover a range of data-oriented topics including:

- Statistics (theory and application to real-world problems)
- Working with Python functions
- Data manipulation and processing using Python Pandas
- Data visualization using Python and Matplotlib/Seaborn
- Advanced application of SQL
- ML topics and application

We have also integrated our questions with Google Colab, allowing your team to execute code and run our solutions directly in the browser (no setup required) to reinforce understanding.

1 We write questions

Your team will receive business-relevant, data-centric questions to sharpen their skills 3x per week, in accordance with the schedule below.

2 Your team solves the questions

Solve the problem before receiving the solution the next morning.

3 We send you the solution

We'll send your team a detailed solution the day after each question.

4 Prosper!

Your team continues to sharpen their data skills and will be able to apply their newly gained knowledge to their role!

Sample question 1: Statistical knowledge

Suppose there are 15 different color crayons in a box. Each time one obtains a crayon, it is equally likely to be any of the 15 types. Compute the expected number of different colors obtained in a set of 5 crayons. (Hint: use indicator variables and linearity of expectation.)

We enumerate the crayon colors from 1 to 15. Let \(X_i\) be an indicator variable equal to 1 if a crayon of type i is among the 5 crayons selected, and 0 otherwise.

So,

\(E(X_i) =\) Pr {at least one type i crayon is in set of 5}

\(E(X_i) =\) 1 - Pr {no type i crayons in set of 5}

\(E(X_i) = 1 - \left(\frac{14}{15}\right)^5\)

Therefore, by linearity of expectation, the expected number of different colors is:

\( = \sum_{i=1}^{15} E(X_i)\)

\( = 15\left[1 - \left(\frac{14}{15}\right)^5\right]\)

\( \approx 4.38\)
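As a sanity check on the derivation above, here is a minimal Python sketch that evaluates the closed form and compares it with a simple simulation. The function names, the trial count, and the random seed are illustrative assumptions and are not part of the original solution.

```python
import random


def expected_distinct_exact(num_colors: int, draws: int) -> float:
    """Closed form from linearity of expectation: n * (1 - ((n - 1) / n) ** k)."""
    return num_colors * (1 - ((num_colors - 1) / num_colors) ** draws)


def expected_distinct_simulated(num_colors: int, draws: int,
                                trials: int = 100_000, seed: int = 0) -> float:
    """Estimate the same quantity by repeatedly drawing crayons at random."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    total = 0
    for _ in range(trials):
        # Each draw is equally likely to be any color; a set keeps distinct colors only.
        seen = {rng.randrange(num_colors) for _ in range(draws)}
        total += len(seen)
    return total / trials


exact = expected_distinct_exact(15, 5)
approx = expected_distinct_simulated(15, 5)
print(f"exact:     {exact:.2f}")   # rounds to 4.38, matching the result above
print(f"simulated: {approx:.2f}")  # should land close to the exact value
```

The simulated estimate will vary slightly with the seed and the number of trials, but it should agree with the closed form to about two decimal places.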

Sample question 2: Coding/computation

Suppose you have a dataframe, df, with the following records:

| | age | favorite_color | grade | name |
|---|---|---|---|---|
| 0 | 20 | blue | 88 | Willard Morris |
| 1 | 19 | blue | 92 | Al Jennings |
| 2 | 22 | yellow | 95 | Omar Mullins |
| 3 | 21 | green | 70 | Spencer McDaniel |

The dataframe is showing information about students. Write code using Python Pandas to select the rows where the students' favorite color is blue or yellow and their grade is at least 90.

Click here to view this problem in an interactive Colab (Jupyter) notebook.

#define array of target colors

fav_color_filter = ['blue', 'yellow']

#To select rows whose column value is in an iterable array, which we defined as fav_color_filter, we can use isin

df = df.loc[df['favorite_color'].isin(fav_color_filter)]

#next, we need to filter on grades of at least 90. here we can use loc on our dataframe:

df = df.loc[(df['grade'] >= 90)]

#preview the dataframe

df.head()

Resultant dataframe:

| | age | favorite_color | grade | name |
|---|---|---|---|---|
| 1 | 19 | blue | 92 | Al Jennings |
| 2 | 22 | yellow | 95 | Omar Mullins |
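To reproduce this result outside the notebook, the following is a minimal self-contained sketch. It rebuilds the sample dataframe from the records shown in the question and applies both conditions in a single boolean mask rather than two successive filters. The names `mask` and `result` are illustrative assumptions; the approach is one possible variant, not necessarily the one used in the linked notebook.

```python
import pandas as pd

# Rebuild the sample dataframe from the records shown in the question.
df = pd.DataFrame(
    {
        "age": [20, 19, 22, 21],
        "favorite_color": ["blue", "blue", "yellow", "green"],
        "grade": [88, 92, 95, 70],
        "name": ["Willard Morris", "Al Jennings", "Omar Mullins", "Spencer McDaniel"],
    }
)

fav_color_filter = ["blue", "yellow"]

# Combine both conditions with & (element-wise AND); each condition
# must be parenthesized or already be a boolean Series.
mask = df["favorite_color"].isin(fav_color_filter) & (df["grade"] >= 90)

# Keep the original df intact and store the filtered rows separately.
result = df.loc[mask]
print(result)
```

Assigning to a separate variable instead of overwriting `df` keeps the full dataset available for further questions.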

Click here to view this solution in an interactive Colab (Jupyter) notebook.

Sample question 3: Coding/computation

A prime number is a natural number greater than 1 that cannot be formed by multiplying two smaller natural numbers. Given a single number, *n*, write a function using Python to return whether or not the number is prime. Additionally, if the inputted number is prime, save it into an array, *a*.

Click here to view this problem in an interactive Colab (Jupyter) notebook.

We'll set up a function below to determine whether or not a given number is prime, using simple if/else statements. Additionally, when a number is identified as prime we'll append it to our array, a.

#First, define an empty array to store prime numbers
a = []

#Define a function to identify whether or not a given number, x, is prime
def is_prime(x):
    if x < 2:
        #if the number is < 2, it's not prime, per definition of prime number
        #(i.e. natural number greater than 1)
        return False
    else:
        #for all other numbers >= 2
        for n in range(2, x):
            #if x is divisible by any n with 2 <= n < x, then it's not prime
            if x % n == 0:
                return False
        #numbers that don't meet the above conditions are prime! save them to our array, a
        a.append(x)
        return True
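For larger inputs, trial division only needs to check divisors up to the square root of x, since any factor above the square root pairs with one below it. The sketch below is an optional variant under that assumption; the names `is_prime_fast` and `primes` are hypothetical and not part of the original solution. It also leaves the collection of primes to the caller, which avoids appending the same number twice if the function is called repeatedly with the same input.

```python
import math


def is_prime_fast(x: int) -> bool:
    """Return True if x is prime, checking divisors only up to sqrt(x)."""
    if x < 2:
        return False
    # math.isqrt returns the integer square root (Python 3.8+).
    for n in range(2, math.isqrt(x) + 1):
        if x % n == 0:
            return False
    return True


# Collect primes outside the function instead of mutating a global list.
primes = [x for x in range(20) if is_prime_fast(x)]
print(primes)  # [2, 3, 5, 7, 11, 13, 17, 19]
```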

Click here to view this solution in an interactive Colab (Jupyter) notebook.