Fraudulent transactions

Question

The dataset contains the following fields, each describing an online transaction:

  • user_id: unique ID of the user
  • signup_time: time the user created the account
  • purchase_time: time of the transaction
  • purchase_value: amount of the transaction
  • device_id: device the user made the transaction on
  • source: attribution channel for the transaction
  • browser: browser the user made the transaction on
  • sex: gender of the user
  • age: age of the user
  • ip_address: IP address of the purchase
  • is_fraud: whether the transaction is flagged as fraudulent

Can you describe the major differences between non-fraud and fraud transactions?

More specifically, can you create histograms for purchase value, time between signup and purchase, and age, with fraud and non-fraud transactions distinguished on the same chart?

The solution will be written in Python.

Solution
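
One possible approach, offered as a minimal sketch rather than the reference solution: derive the time between signup and purchase, compare summary statistics per class, and overlay the fraud and non-fraud histograms on shared bin edges. The column name hours_to_purchase, the helper function names, and the file name transactions.csv are assumptions made for illustration, not part of the question. The sketch also assumes is_fraud is stored as 0/1 (or boolean) and that the two time columns parse as timestamps.

```python
import numpy as np
import pandas as pd
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

# Columns to compare; "hours_to_purchase" is a derived, hypothetical name.
COLUMNS = ["purchase_value", "hours_to_purchase", "age"]


def add_time_to_purchase(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with the hours elapsed between signup and purchase."""
    out = df.copy()
    out["signup_time"] = pd.to_datetime(out["signup_time"])
    out["purchase_time"] = pd.to_datetime(out["purchase_time"])
    delta = out["purchase_time"] - out["signup_time"]
    out["hours_to_purchase"] = delta.dt.total_seconds() / 3600
    return out


def summarize_by_fraud(df: pd.DataFrame, columns) -> pd.DataFrame:
    """Mean and median of each column, split by the is_fraud flag."""
    return df.groupby("is_fraud")[list(columns)].agg(["mean", "median"])


def plot_fraud_histograms(df: pd.DataFrame, columns, bins: int = 30):
    """Draw one subplot per column with both classes overlaid."""
    fig, axes = plt.subplots(
        1, len(columns), figsize=(5 * len(columns), 4), squeeze=False
    )
    for ax, col in zip(axes[0], columns):
        # Shared bin edges keep the two histograms directly comparable.
        edges = np.histogram_bin_edges(df[col].dropna(), bins=bins)
        for flag, label in [(0, "non-fraud"), (1, "fraud")]:
            values = df.loc[df["is_fraud"] == flag, col].dropna()
            if values.empty:
                continue
            # density=True normalizes each class separately, so a rare
            # class is not dwarfed by the more common one.
            ax.hist(values, bins=edges, alpha=0.5, density=True, label=label)
        ax.set_title(col)
        ax.set_xlabel(col)
        ax.set_ylabel("density")
        ax.legend()
    fig.tight_layout()
    return fig


# Usage (the file name is hypothetical):
#   df = add_time_to_purchase(pd.read_csv("transactions.csv"))
#   print(summarize_by_fraud(df, COLUMNS))
#   plot_fraud_histograms(df, COLUMNS).savefig("fraud_histograms.png")
```

Plotting densities instead of raw counts is a design choice, made on the assumption that fraud is a small share of transactions. If absolute volumes matter more than shape, pass density=False to ax.hist. The summary table from summarize_by_fraud gives a quick numeric check on whatever differences the histograms suggest.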
