Happiness score correlation

Question

Suppose you are given this dataset, which provides happiness scores across countries based on various factors. The columns include:

Year: The year the country was assessed
Overall rank: Rank of country's (or region's) happiness score for the year
Country or region: Country or region being measured
Score: This is the happiness score
GDP per capita: The extent to which GDP per capita contributes to the calculation of the Happiness Score
Social support: The extent to which social support (family) contributes to the calculation of the Happiness Score
Healthy life expectancy: The extent to which life expectancy contributes to the calculation of the Happiness Score
Freedom: The extent to which freedom contributes to the calculation of the Happiness Score
Generosity: The extent to which generosity contributes to the calculation of the Happiness Score
Perceptions of corruption: The extent to which perceptions of corruption contribute to the calculation of the Happiness Score

Given this, can you identify the factors that have the highest correlation with the happiness score?

To help get you started, below is code to load the dataset into a Pandas dataframe. You can also make a copy of this Google Colab notebook.

#Importing packages.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import math
%matplotlib inline

#Loading in Pearson's r
from scipy.stats import pearsonr

#Reading in data
data = pd.read_csv('https://raw.githubusercontent.com/erood/interviewqs.com_code_snippets/master/Datasets/world_happiness_2015_2019.csv', parse_dates=True) 
data.head()

   Year  Overall rank  Country or region  Score  GDP per capita  Social support  Healthy life expectancy  ...
0  2019             1            Finland  7.769           1.340           1.587                    0.986  ...
1  2019             2            Denmark  7.600           1.383           1.573                    0.996  ...
2  2019             3             Norway  7.554           1.488           1.582                    1.028  ...
3  2019             4            Iceland  7.494           1.380           1.624                    1.026  ...
4  2019             5        Netherlands  7.488           1.396           1.522                    0.999  ...

Solution
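
One way to approach this, offered as a minimal sketch rather than the official solution: compute Pearson's r between each numeric factor column and Score, then sort by absolute value so that strong negative correlations rank alongside strong positive ones. The helper name rank_correlations and its parameters are hypothetical. It uses pandas' built-in corrwith instead of scipy's pearsonr (which would additionally return p-values), and it assumes that Year and Overall rank should be left out, since the rank is derived from the score itself.

```python
import pandas as pd


def rank_correlations(df, target="Score", exclude=("Year", "Overall rank")):
    """Rank numeric columns by the strength of their Pearson correlation with `target`.

    `rank_correlations` is a hypothetical helper written for illustration.
    The default `exclude` is an assumption: Year is not a factor, and
    Overall rank is derived directly from the score.
    """
    # Keep numeric columns only; text columns such as the country name are skipped.
    numeric = df.select_dtypes(include="number")

    # Candidate factors: every numeric column except the target and the exclusions.
    factors = numeric.drop(columns=[target, *exclude], errors="ignore")

    # Pearson's r of each factor against the target (missing values are ignored pairwise).
    r = factors.corrwith(numeric[target])

    # Order by absolute correlation, strongest first, keeping the sign in the result.
    order = r.abs().sort_values(ascending=False).index
    return r.loc[order].rename("pearson_r")
```

With the dataframe loaded above, rank_correlations(data) returns a Series indexed by factor name, strongest correlation first. Because the dataset spans several years, it may also be worth running it per year, for example with data.groupby("Year").apply(rank_correlations), to see whether the ordering is stable over time.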
