A lot of interviews have a take-home case component. This is different from much of the prep we provide because it challenges you to think end-to-end about what is probably a real business problem facing the organization you're interviewing with.
We have provided the data (downloaded from Kaggle, thank you Kaggle!), a mock prompt (similar to ones experienced at top-tier companies), a potential presentation (which could be thought of as a solution), and the work we did to get to the solution. We hope that if you run into a similar scenario or want extra practice, you can use our work to guide you in the right direction!
It was really hard to find a real dataset that we could use as a "practice case." We ended up finding data on Kaggle! You can download the data there, or you can visit the links below. We didn't use all of the data Kaggle provides in this practice set, so below we include only the files we used to get to the solution.
You’re a Data Scientist / Business Analyst working for a new eCommerce company called A&B Co. (similar to Amazon) and you’ve been asked to prepare a presentation for the Vice President of Sales and the Vice President of Operations that summarizes sales and operations thus far. The summary should include (at a minimum) the current state of the business, current customer satisfaction, and a proposal of 2-3 areas where the company can improve.
Here are some facts:
Note that all data was provided by Kaggle. Feel free to read about each file on Kaggle (or by clicking the "Click for more information" button above). You can download the data by clicking on each link below.
This is a potential solution to address the problems outlined in the prompt. Note that this is not the only solution, nor is it necessarily the best. The goal of providing it is to give you an idea of how we would go about solving this problem and how we would present it to an interview panel.
Below are the code from our Jupyter notebook, the work we did in Google Sheets (how we created most of the charts), and some thoughts we had on solving the case.
You can view the Jupyter notebook we used here.
A link to the spreadsheet.
Most of the charts and quick side analyses I did were in Google Sheets. Python charts are nice, but I find Google Sheets generally quicker and more flexible for basic charts. For a case, you typically get 24-48 hours, and you might end up spending a lot of that time trying to make charts look "pretty". I found it's best to use Python for data aggregation and manipulation and to use spreadsheet software for the simpler charts (unless the prompt specifies otherwise).
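To make that split concrete, here is a minimal sketch of the "aggregate in Python, chart in a spreadsheet" workflow. It is not the code from our notebook: the sample rows are made up, the column names (`order_month`, `price`, `review_score`) are hypothetical rather than the Kaggle schema, and it sticks to the standard library so it runs anywhere (in a notebook you would more likely reach for pandas).

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows for illustration only. The column names are
# hypothetical and are NOT taken from the Kaggle files.
SAMPLE_ORDERS = [
    {"order_month": "2018-01", "price": 50.0, "review_score": 5},
    {"order_month": "2018-01", "price": 30.0, "review_score": 3},
    {"order_month": "2018-02", "price": 20.0, "review_score": 4},
]

FIELDNAMES = ["order_month", "orders", "revenue", "avg_review_score"]


def summarize_by_month(orders):
    """Collapse order rows into one summary row per month."""
    totals = defaultdict(lambda: {"orders": 0, "revenue": 0.0, "score_sum": 0})
    for row in orders:
        bucket = totals[row["order_month"]]
        bucket["orders"] += 1
        bucket["revenue"] += row["price"]
        bucket["score_sum"] += row["review_score"]
    # Sort by month so the rows arrive in the spreadsheet in chart order.
    return [
        {
            "order_month": month,
            "orders": b["orders"],
            "revenue": round(b["revenue"], 2),
            "avg_review_score": round(b["score_sum"] / b["orders"], 2),
        }
        for month, b in sorted(totals.items())
    ]


def to_csv_text(summary_rows):
    """Serialize summary rows as CSV text for import into a spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(summary_rows)
    return buffer.getvalue()


summary = summarize_by_month(SAMPLE_ORDERS)
csv_text = to_csv_text(summary)
print(csv_text)
```

The idea is that Python does the heavy lifting once, and the small summary table it produces can be pasted or imported into Google Sheets, where building and restyling a basic chart takes a few clicks.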
Build a framework to answer the questions. If you’re not sure what the questions are, create questions for yourself to answer. It makes the process of digging for data so much easier.
It’s hard to come up with an answer if you don’t know what the questions are. This point seems like a no-brainer, but it’s good to make sure that you’ve created a structure to answer the key points. If you’re not sure what the question at hand is, you might need to play with the data a bit to understand what seems to be the problem at hand.
For this particular case the ask is really clear: we need to create a summary that includes the current state of the business, current customer satisfaction, and a proposal of 2-3 areas where the company can improve. The questions we came up with (and some answers listed below) given the ask are:
We send 3 questions each week to thousands of data scientists and analysts preparing for interviews or just keeping their skills sharp. You can sign up to receive the questions for free on our home page.