Finding duplicate values in a SQL table


The following example uses ipython-sql. To install:
pip install ipython-sql

%sql is a Jupyter magic command and can be ignored when not using Jupyter notebooks.

Configure SQL for Jupyter notebook

#Load sql
%load_ext sql
#connect to sqlite
%sql sqlite://

Create a table

%%sql

CREATE TABLE players (first_name, last_name, age);
%%sql

INSERT INTO players (first_name, last_name, age) VALUES
('Lebron', 'James', 33),
('Steph', 'Curry', 30),
('Mike', 'Jordan', 55),
('Mike', 'Bibby', 40);

Count the # of unique first names in the table

%%sql 

--Select the distinct # of first names in our table, players
SELECT COUNT(DISTINCT(first_name)) FROM players
COUNT(DISTINCT(first_name))
3


Ace your next data science interview

Get better at data science interviews by solving a few questions per week



Find a bug? Submit a suggested change on Github, or message me on Twitter.