The following dataset contains information on loans. Can you do the following to prepare the dataset for analysis?
Create a new column called "loan_status_type" which will categorize "loan_status" into the following:
Current - loans currently outstanding
Closed - loans that are no longer open
Create a new column called "loan_status_standing" which will categorize "loan_status" into the following:
Good - customers who have (so far) successfully met the conditions of their loan (e.g. no missed payments, no late fees accumulated)
Bad - customers who have missed payments / defaulted
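One way to approach the two categorizations is sketched below with pandas. The status labels in `CLOSED_STATUSES` and `BAD_STATUSES` are assumptions, since the question does not list the actual values of "loan_status"; check them against `df["loan_status"].unique()` before relying on the mapping. Whether a late or defaulted loan counts as still open is also a judgment call made here for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical status labels: the real values in "loan_status" may differ.
CLOSED_STATUSES = {"Fully Paid", "Charged Off"}
BAD_STATUSES = {"Charged Off", "Default", "Late (16-30 days)", "Late (31-120 days)"}


def add_status_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with the two derived status columns added."""
    out = df.copy()
    # Anything not explicitly listed as closed is treated as still outstanding.
    out["loan_status_type"] = np.where(
        out["loan_status"].isin(CLOSED_STATUSES), "Closed", "Current"
    )
    # Anything not explicitly listed as bad is treated as in good standing.
    out["loan_status_standing"] = np.where(
        out["loan_status"].isin(BAD_STATUSES), "Bad", "Good"
    )
    return out


# Tiny made-up sample to show the mapping; not taken from the real dataset.
sample = pd.DataFrame(
    {"loan_status": ["Current", "Fully Paid", "Charged Off", "Late (31-120 days)"]}
)
labeled = add_status_columns(sample)
print(labeled)
```

Note that missing or unexpected statuses fall into "Current" and "Good" by default in this sketch, so it is worth counting unmapped values separately on the real data.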
With these 2 new columns, can you plot the sum of the loan amounts by the month and year the loan was issued, broken down by loan_status_type and loan_status_standing?
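A minimal sketch of the aggregation and plot follows. The column names `issue_d` (issue date) and `loan_amnt` (loan amount), and the `"Dec-2015"`-style date format, are assumptions that do not appear in the visible part of the data dictionary; adjust them to match the real columns. The sample rows are made up for illustration.

```python
import pandas as pd


def monthly_loan_totals(df: pd.DataFrame) -> pd.DataFrame:
    """Sum loan amounts per issue month for each status type/standing pair."""
    out = df.copy()
    # Assumes issue_d is a string such as "Jan-2015" (hypothetical format).
    out["issue_month"] = pd.to_datetime(out["issue_d"], format="%b-%Y").dt.to_period("M")
    totals = (
        out.groupby(["issue_month", "loan_status_type", "loan_status_standing"])["loan_amnt"]
        .sum()
        # One column per (type, standing) pair; months with no loans get 0.
        .unstack(["loan_status_type", "loan_status_standing"], fill_value=0)
    )
    return totals.sort_index()


def plot_monthly_totals(totals: pd.DataFrame):
    """Draw one line per (type, standing) pair; requires matplotlib."""
    import matplotlib.pyplot as plt

    plot_df = totals.copy()
    # Flatten the two-level column index into readable legend labels.
    plot_df.columns = [" / ".join(col) for col in plot_df.columns]
    ax = plot_df.plot(kind="line", marker="o", figsize=(10, 5))
    ax.set_xlabel("Issue month")
    ax.set_ylabel("Total loan amount")
    ax.set_title("Loan amount issued per month by status type and standing")
    plt.tight_layout()
    return ax


# Made-up rows that already carry the two derived columns.
sample = pd.DataFrame(
    {
        "issue_d": ["Jan-2015", "Jan-2015", "Feb-2015"],
        "loan_amnt": [1000, 2000, 500],
        "loan_status_type": ["Current", "Current", "Closed"],
        "loan_status_standing": ["Good", "Good", "Bad"],
    }
)
totals = monthly_loan_totals(sample)
print(totals)
# plot_monthly_totals(totals)  # uncomment to draw the chart
```

Keeping the aggregation separate from the plotting makes it easy to inspect the monthly totals as a table before charting them.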
The data provided is a subset of a larger dataset. You can find more information about the larger dataset here.
| LoanStatNew | Description |
|---|---|
| zip_code | The first 3 numbers of the zip code provided by the borrower in the loan application. |
| addr_state | The state provided by the borrower in the loan application |
| annual_inc | The annual income provided by the borrower during registration. |
| collection_recovery_fee | Post-charge-off collection fee |
| collections_12_mths_ex_med | Number of collections in 12 months excluding medical collections |
| delinq_2yrs | The number of 30+ days past-due incidences of delinquency in the borrower’s credit file for the past 2 years |
| desc | Loan description provided by the borrower |
| dti | A ratio calculated using the borrower’s total monthly debt payments on the total debt obligations, excluding mortgage and the requested LC loan, divided by the borrower’s self-reported monthly income. |
| earlie... |