Elevate your team's skills with Data Interview Qs!
Keep developing talent and keep skills current on your team of analysts or data scientists with our real-world, data-centric questions, sent straight to your team’s inboxes!
We cover a range of data-oriented topics including:
- Statistics (theory and application to real-world problems)
- Working with Python functions
- Data manipulation and processing using Python Pandas
- Data visualization using Python and Matplotlib/Seaborn
- Advanced application of SQL
- ML topics and application
We have also integrated our questions with Google Colab, allowing your team to execute code and run our solutions directly in the browser (no setup required) to reinforce understanding.
How it works
1 We write questions
Your team will receive business-relevant, data-centric questions 3x per week to sharpen their skills, following the schedule below.
2 Your team solves the questions
Solve the problem before receiving the solution the next morning.
3 We send you the solution
We'll send your team a detailed solution the day after each question.
4 Prosper!
Your team continues to sharpen their data skills and will be able to apply their newly gained knowledge to their role!
The schedule
Sample questions
Sample question 1: Statistical knowledge
Suppose there are 15 different color crayons in a box. Each time one obtains a crayon, it is equally likely to be any of the 15 types. Compute the expected # of different colors that are obtained in a set of 5 crayons. (Hint: use indicator variables and linearity of expectation)
We number the colors from 1 to 15. Let \(X_i\) be an indicator variable equal to 1 if at least one crayon of color i is among the 5 crayons obtained, and 0 otherwise.
So,
\(E(X_i) =\) Pr {at least one type i crayon is in set of 5}
\(E(X_i) =\) 1 - Pr {no type i crayons in set of 5}
\(E(X_i) = 1 - \left(\frac{14}{15}\right)^5\)
Therefore, by linearity of expectation, the expected # of different colors is:
\(E\left(\sum_{i=1}^{15} X_i\right) = \sum_{i=1}^{15} E(X_i)\)
\( = 15\left[1 - \left(\frac{14}{15}\right)^5\right]\)
\( \approx 4.38\)
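As a quick sanity check on the derivation above, here is a minimal sketch that evaluates the closed form and compares it with a Monte Carlo estimate. The function names, the trial count, and the seed are illustrative assumptions, not part of the original solution.

```python
import random


def expected_distinct_colors(num_colors: int, num_draws: int) -> float:
    """Closed form from linearity of expectation: n * (1 - ((n - 1) / n) ** k)."""
    return num_colors * (1 - ((num_colors - 1) / num_colors) ** num_draws)


def simulate_distinct_colors(num_colors: int, num_draws: int, trials: int, seed: int = 0) -> float:
    """Estimate the same quantity by drawing crayons at random many times."""
    rng = random.Random(seed)  # fixed seed (an arbitrary choice) for repeatability
    total = 0
    for _ in range(trials):
        # Each draw is equally likely to be any of the colors
        draws = [rng.randrange(num_colors) for _ in range(num_draws)]
        total += len(set(draws))  # number of different colors in this set
    return total / trials


exact = expected_distinct_colors(15, 5)
estimate = simulate_distinct_colors(15, 5, trials=100_000)
print(f"closed form: {exact:.2f}")  # closed form: 4.38
print(f"simulation:  {estimate:.2f}")
```

The simulated value should land close to the closed-form answer, with small run-to-run differences that shrink as the number of trials grows.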
Sample question 2: Coding/computation
Suppose you have a dataframe, df, with the following records:
|   | age | favorite_color | grade | name |
|---|-----|----------------|-------|------|
| 0 | 20 | blue | 88 | Willard Morris |
| 1 | 19 | blue | 92 | Al Jennings |
| 2 | 22 | yellow | 95 | Omar Mullins |
| 3 | 21 | green | 70 | Spencer McDaniel |
The dataframe is showing information about students. Write code using Python Pandas to select the rows where the students' favorite color is blue or yellow and their grade is at least 90.
Click here to view this problem in an interactive Colab (Jupyter) notebook.
# Define the list of target colors
fav_color_filter = ['blue', 'yellow']
# Keep only rows whose favorite_color is in fav_color_filter, using isin
df = df.loc[df['favorite_color'].isin(fav_color_filter)]
# Next, keep only rows with a grade of at least 90, again using loc
df = df.loc[df['grade'] >= 90]
# Preview the dataframe
df.head()
Resultant dataframe:
|   | age | favorite_color | grade | name |
|---|-----|----------------|-------|------|
| 1 | 19 | blue | 92 | Al Jennings |
| 2 | 22 | yellow | 95 | Omar Mullins |
Click here to view this solution in an interactive Colab (Jupyter) notebook.
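For readers who want to run this outside the notebook, the sketch below rebuilds the sample dataframe from the table in the question and applies both conditions in a single boolean mask. Combining the two filters with `&` is an alternative to the two-step approach in the solution, and the names `mask` and `result` are illustrative.

```python
import pandas as pd

# Rebuild the sample dataframe from the table in the question
df = pd.DataFrame(
    {
        "age": [20, 19, 22, 21],
        "favorite_color": ["blue", "blue", "yellow", "green"],
        "grade": [88, 92, 95, 70],
        "name": ["Willard Morris", "Al Jennings", "Omar Mullins", "Spencer McDaniel"],
    }
)

fav_color_filter = ["blue", "yellow"]

# Both conditions in one mask; each comparison needs its own parentheses
# because & binds more tightly than >= in Python
mask = df["favorite_color"].isin(fav_color_filter) & (df["grade"] >= 90)

# Store the result in a new variable so the original df stays intact
result = df.loc[mask]
print(result)
```

Keeping the filtered rows in `result` rather than overwriting `df` makes it easier to try other filters against the same data.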
Sample question 3: Coding/computation
A prime number is a natural number greater than 1 that cannot be formed by multiplying two smaller natural numbers. Given a single number, n, write a function using Python to return whether or not the number is prime. Additionally, if the inputted number is prime, save it into an array, a.
Click here to view this problem in an interactive Colab (Jupyter) notebook.
We'll set up a function below to determine whether or not a given number is prime, using simple if/else statements and a loop. Additionally, when a number is identified as prime, we'll append it to our array, a.
# First, define an empty array (a Python list) to store prime numbers
a = []
# Define a function to identify whether or not a given number, x, is prime
def is_prime(x):
    if x < 2:
        # If the number is < 2, it's not prime, per the definition of a prime number
        # (i.e. a natural number greater than 1)
        return False
    else:
        # For all other numbers >= 2
        for n in range(2, x):
            # If x is evenly divisible by any smaller number n >= 2, it's not prime
            if x % n == 0:
                return False
        # Numbers that pass every check above are prime! Save them to our array, a
        a.append(x)
        return True
Click here to view this solution in an interactive Colab (Jupyter) notebook.
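The solution above tests every candidate divisor below x. A common refinement, sketched here as an alternative rather than as part of the original solution, stops at the square root of x, since any factor above the square root pairs with one below it. The name `is_prime_fast` is hypothetical, and `math.isqrt` requires Python 3.8 or later.

```python
import math


def is_prime_fast(x: int) -> bool:
    """Return True if x is prime, checking divisors only up to sqrt(x)."""
    if x < 2:
        # Primes are natural numbers greater than 1
        return False
    for n in range(2, math.isqrt(x) + 1):
        if x % n == 0:
            return False
    return True


# Collect the primes below 20 into the array a
a = [x for x in range(20) if is_prime_fast(x)]
print(a)  # [2, 3, 5, 7, 11, 13, 17, 19]
```

This variant keeps the function free of side effects and builds a outside of it, so calling the function twice on the same number does not store that number twice.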
Pricing
$15/person
Used by thousands of students and industry workers