Below is a daily table for an active acount at Shopify (an online ecommerce, retail platform). The table is called store_account and the columns are:
||a unique Shopify store id
||Possible values are: [‘open’, 'closed’, ‘fraud’]
||Amount of spend in USD
Here's some additional information about the table:
- The granularity of the table is store_id and day
- Assume “close” and “fraud” are permanent labels
- Active = daily revenue > 0
- Accounts get labeled by Shopify as fraudulent and they no longer can sell product
- Every day of the table has every store_id that has ever used Shopify
Given the above, write code using Python (Pandas library) to show what percent of active stores were fraudulent by day.
(Note: you should be able to write code here given solely the table schema (field names, descriptions above) -- it's fairly common in interviews to be provided the structure of tables and told to write a quick query.)
- We want one value for each day in the month.
- A store can be fraudulent and active on same day. E.g. they could generate revenue until 10AM, then be flagged as fradulent from 10AM onward.
Ace your next data science interview
Get better at data science interviews by solving a few questions per week