Fraudulent transactions

Question

The dataset describes online transactions and contains the following fields:

user_id: unique ID of the user
signup_time: time the user created the account
purchase_time: time of the transaction
purchase_value: amount of the transaction
device_id: device the user made the transaction on
source: attribution channel for the transaction
browser: browser the user made the transaction in
sex: gender of the user
age: age of the user
ip_address: IP address of the purchase
is_fraud: whether the transaction is flagged as fraudulent

Can you describe the major differences between fraudulent and non-fraudulent transactions?

More specifically, can you create histograms for purchase value, time between signup and purchase, and age, with fraud and non-fraud transactions differentiated on the same chart?

The solution will be written in Python.

Solution
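
One possible approach, sketched with pandas and matplotlib. The column names come from the field list in the question; the derived column hours_to_purchase, the helper function names, and the small demo table are assumptions made for illustration (the demo rows are invented and are not the real dataset). Each histogram uses shared bin edges and per-group density so that a small fraud group remains visible next to the non-fraud group.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def add_time_to_purchase(df):
    """Return a copy with the hours elapsed between signup and purchase."""
    out = df.copy()
    out["signup_time"] = pd.to_datetime(out["signup_time"])
    out["purchase_time"] = pd.to_datetime(out["purchase_time"])
    delta = out["purchase_time"] - out["signup_time"]
    out["hours_to_purchase"] = delta.dt.total_seconds() / 3600
    return out


def summarize_by_fraud(df, columns):
    """Mean and median of each column, split by the is_fraud flag."""
    return df.groupby("is_fraud")[columns].agg(["mean", "median"])


def plot_fraud_histograms(df, columns, bins=30):
    """Overlay fraud and non-fraud histograms, one subplot per column."""
    fig, axes = plt.subplots(1, len(columns), figsize=(5 * len(columns), 4))
    axes = np.atleast_1d(axes)
    for ax, col in zip(axes, columns):
        # Shared bin edges keep the two groups comparable on one axis.
        edges = np.histogram_bin_edges(df[col].dropna(), bins=bins)
        for flag, label in [(0, "non-fraud"), (1, "fraud")]:
            values = df.loc[df["is_fraud"] == flag, col].dropna()
            # density=True normalizes each group separately.
            ax.hist(values, bins=edges, alpha=0.5, density=True, label=label)
        ax.set_xlabel(col)
        ax.set_ylabel("density")
        ax.legend()
    fig.tight_layout()
    return fig


# Invented rows purely to exercise the functions; not the real dataset.
demo = pd.DataFrame({
    "signup_time": ["2015-01-01 00:00", "2015-01-02 12:00",
                    "2015-01-05 08:00", "2015-01-07 09:00"],
    "purchase_time": ["2015-01-01 06:00", "2015-01-03 12:00",
                      "2015-01-05 20:00", "2015-01-09 09:00"],
    "purchase_value": [20, 35, 18, 50],
    "age": [25, 40, 31, 52],
    "is_fraud": [0, 1, 1, 0],
})

COLUMNS = ["purchase_value", "hours_to_purchase", "age"]
df = add_time_to_purchase(demo)
summary = summarize_by_fraud(df, COLUMNS)
fig = plot_fraud_histograms(df, COLUMNS)
```

With the real data, replace the demo table with the loaded dataset, print summary to compare the two groups numerically, and call fig.savefig(...) to write the chart to a file. In a notebook, the matplotlib.use("Agg") line can be dropped so the figure renders inline.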
