Fraudulent retail accounts
Below is a "daily" table for an active account at Shopify (an online ecommerce, retail platform). The table is called store_account and the columns are:
|Column Name||Data Type||Description|
|store_id||integer||a unique Shopify store id|
|status||string||Possible values are: ['open', 'closed', 'fraud']|
|revenue||double||Amount of spend in USD|
Here's some additional information about the table:
- The granularity of the table is store_id and day
'fraud'are permanent labels
- Active =
daily revenue > 0
- Accounts get labeled by Shopify as fraudulent and they no longer can sell product
- Every day of the table has every store_id that has ever used Shopify
Given the above, write code using Python (Pandas library) to show what percent of active stores were fraudulent by day.
- We want one value for each day in the month.
- A store can be fraudulent and active on same day. E.g. they could generate revenue until 10AM, then be flagged as fradulent from 10AM onward.
(Note: you should be able to write code here given solely the table schema (field names, descriptions above) -- it's fairly common in interviews to be provided the structure of tables and told to write a quick query.)
Subscribe to premium account to see the solution.Get premium now