The dataset below describes online transactions and contains the following fields:
- user_id: unique id of user
- signup_time: time at which the user created the account
- purchase_time: time of the transaction
- purchase_value: amount of the transaction
- device_id: device on which the user conducted the transaction
- source: attribution channel for the transaction
- browser: browser in which the user conducted the transaction
- sex: gender of the user
- age: age of the user
- ip_address: IP address of the purchase
- is_fraud: whether the transaction is flagged as fraudulent
Can you describe the major differences between non-fraud and fraud transactions?
More specifically, can you create histograms for purchase value, time between signup and purchase, and age, with fraud and non-fraud transactions distinguished on the same chart?
The solution will be written in Python.
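One possible approach, sketched with pandas and matplotlib, is shown here. It is not the site's official solution. It assumes the data is already loaded into a DataFrame with the column names listed above, that is_fraud is encoded as 0/1 or boolean, and that the two time columns can be parsed by pandas. The helper names add_hours_to_purchase and plot_fraud_histograms, the derived column hours_to_purchase, and the choice of hours as the time unit are all illustrative assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def add_hours_to_purchase(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with a derived column: hours between signup and purchase."""
    out = df.copy()
    signup = pd.to_datetime(out["signup_time"])
    purchase = pd.to_datetime(out["purchase_time"])
    # Hypothetical column name; hours is an arbitrary but readable unit.
    out["hours_to_purchase"] = (purchase - signup).dt.total_seconds() / 3600
    return out


def plot_fraud_histograms(df: pd.DataFrame, bins: int = 30):
    """Overlay non-fraud and fraud histograms for the three requested features."""
    features = ["purchase_value", "hours_to_purchase", "age"]
    fig, axes = plt.subplots(1, len(features), figsize=(15, 4))
    is_fraud = df["is_fraud"].astype(bool)

    for ax, col in zip(axes, features):
        # Use the same bin edges for both groups so the bars line up.
        edges = np.histogram_bin_edges(df[col].dropna(), bins=bins)
        for label, mask in [("non-fraud", ~is_fraud), ("fraud", is_fraud)]:
            # density=True normalizes each group separately, which keeps the
            # shapes comparable when the two classes differ a lot in size.
            ax.hist(
                df.loc[mask, col].dropna(),
                bins=edges,
                alpha=0.5,
                density=True,
                label=label,
            )
        ax.set_title(col)
        ax.set_xlabel(col)
        ax.set_ylabel("density")
        ax.legend()

    fig.tight_layout()
    return fig
```

To use it, call `plot_fraud_histograms(add_hours_to_purchase(df))` and then `plt.show()`. For the broader question about major differences, a per-group summary such as `df.groupby("is_fraud")[["purchase_value", "hours_to_purchase", "age"]].describe()` is one reasonable starting point before reading the charts.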