Below is a daily table for an active acount at Shopify (an online ecommerce, retail platform). The table is called store_account and the columns are:
||a unique Shopify store id
||Possible values are: [‘open’, 'closed’, ‘fraud’]
||Amount of spend in USD
Here's some additional information about the table:
- The granularity of the table is store_id and day
- Assume “close” and “fraud” are permanent labels
- Active = daily revenue > 0
- Accounts get labeled by Shopify as fraudulent and they no longer can sell product
- Every day of the table has every store_id that has ever used Shopify
What percent of active stores were fraudulent by day, for each day in the last month? Solution will be given using Python (Pandas) for premium users.
- We want one value for each day in the month.
- A store can be fraudulent and active on same day. E.g. they could generate revenue until 10AM, then be flagged as fradulent from 10AM onward.
Ace your next data science interview
Get better at data science interviews by solving a few questions per week