Ace your next data science interview
Get better at data science interviews by solving a few questions per week
How it works
1 We write questions
Get relevant questions frequently asked at top companies.
2 You solve them
Solve the problem before receiving the solution the next morning.
3 We send you the solution (Premium)
Check your work and get better at interviewing!
The schedule
Sample questions
Sample question: Statistical knowledge
Suppose a box contains crayons in 15 different colors. Each time you draw a crayon, it is equally likely to be any of the 15 colors. Compute the expected number of distinct colors in a set of 5 crayons. (Hint: use indicator variables and linearity of expectation.)
We number the colors from 1 to 15. Let \(X_i\) be the indicator that at least one crayon of color i is among the 5 crayons selected, so the number of distinct colors is \(X_1 + \dots + X_{15}\).
So,
\(E(X_i) =\) Pr {at least one color i crayon is in the set of 5}
\(E(X_i) =\) 1 - Pr {no color i crayons in the set of 5}
\(E(X_i) = 1 - \left(\frac{14}{15}\right)^5\)
Therefore, by linearity of expectation, the expected number of distinct colors is:
\( = \sum_{i=1}^{15} E(X_i)\)
\( = 15\left[1 - \left(\frac{14}{15}\right)^5\right]\)
\( \approx 4.38\)
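As a sanity check on the indicator-variable argument, the closed form can be compared against a quick Monte Carlo simulation. This is a minimal sketch, not part of the original solution: the function names, the trial count, and the seed are illustrative assumptions.

```python
import random


def expected_distinct_exact(n_colors: int = 15, n_draws: int = 5) -> float:
    """Closed form from linearity of expectation: n * (1 - ((n - 1) / n) ** k)."""
    return n_colors * (1 - ((n_colors - 1) / n_colors) ** n_draws)


def expected_distinct_simulated(
    n_colors: int = 15,
    n_draws: int = 5,
    n_trials: int = 200_000,  # assumed trial count, chosen only for illustration
    seed: int = 0,            # fixed seed so repeated runs agree
) -> float:
    """Estimate the expected number of distinct colors by repeated sampling."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_trials):
        # Each draw is equally likely to be any color; a set keeps distinct ones.
        colors_seen = {rng.randrange(n_colors) for _ in range(n_draws)}
        total += len(colors_seen)
    return total / n_trials


exact = expected_distinct_exact()
simulated = expected_distinct_simulated()
print(f"closed form: {exact:.2f}")  # 4.38
print(f"simulated:   {simulated:.2f}")
```

The simulated average should land close to the closed-form value, which is a useful habit in interviews when you want to confirm a probability answer without redoing the algebra.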
Sample question: Coding/computation
Given a DataFrame, df, return only the rows that contain at least one missing value.
For example:
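| Name             | age | favorite_color | grade | name             |
|------------------|-----|----------------|-------|------------------|
| Willard Morris   | 20  | blue           |       | Willard Morris   |
| Al Jennings      | 19  | red            | 92    | Al Jennings      |
|                  | 22  | yellow         | 95    | Omar Mullins     |
| Spencer McDaniel | 21  | green          | 70    | Spencer McDaniel |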
Will return:
| Name             | age | favorite_color | grade | name             |
|------------------|-----|----------------|-------|------------------|
| Willard Morris   | 20  | blue           |       | Willard Morris   |
|                  | 22  | yellow         | 95    | Omar Mullins     |
In Python (pandas):
df[df.isnull().any(axis=1)]
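To try the filter end to end, the example table can be rebuilt as a small DataFrame. This is a sketch under assumptions: the values are copied from the example above, blank cells are represented with `None`, and the variable name `rows_with_missing` is hypothetical.

```python
import pandas as pd

# Rebuild the example table; None marks the blank (missing) cells.
df = pd.DataFrame(
    {
        "Name": ["Willard Morris", "Al Jennings", None, "Spencer McDaniel"],
        "age": [20, 19, 22, 21],
        "favorite_color": ["blue", "red", "yellow", "green"],
        "grade": [None, 92, 95, 70],
        "name": ["Willard Morris", "Al Jennings", "Omar Mullins", "Spencer McDaniel"],
    }
)

# isnull() gives a boolean frame; any(axis=1) flags rows with at least one True.
rows_with_missing = df[df.isnull().any(axis=1)]
print(rows_with_missing)
```

Here `any(axis=1)` reduces across columns, so a row is kept as soon as one of its cells is missing. `df.isna()` is an equivalent spelling of `df.isnull()`.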