SQL Nested Subquery Example Interview Question


Overview of Nested Queries

A SQL nested query, also known as a subquery or subselect, is a SELECT query method embedded within another query. It’s an efficient way to get the desired output that may require multiple query steps or to handle queries that are interdependent.

When to use Nested Subqueries

You can use nested subqueries in a couple different ways:

In order to use a subquery, there are a couple of syntactical rules that need to be followed:


Example SQL interview question using a nested subquery

Suppose you are given the following table showing company sales:

Table: sales_info

date sale_id sale_usd
2020-01-05 1111 93695
2020-01-07 1112 879617
2020-01-07 1113 752878
... ... ...

Calculate the cumulative percent of total sales on a given day. The output of the table should look like the example table below.

date pct_total_sales
2020-01-05 X%
2020-01-07 Y%

You can work through this example using the interactive SQL fiddle here.


Before we start writing SQL, we'll break the question into steps:

  1. Calculate the daily sales total
  2. Calculate the cumulative sum of the daily sales total and total sales for all days
  3. Divide the daily total sales by the cumulative sum

1. Calculate the daily sales total

First, we'll write the base query, which will become a subquery in the next step. The below query calculates the daily sales total and you can interact with the query with this SQL fiddle.

SELECT

#we'll need to sum sale_usd by the date

date,

sum(sale_usd) as total_usd

FROM sales_info

#because we're aggregating sale_usd by date, we need

#need to group by date

GROUP BY date



2. Calculate the cumulative sum of the daily sales total and total sales for all days

The query below calculates the cumulative sum of the daily sales total and total sales for all days. You can interact with the query below with this SQL fiddle.

You'll notice in this step that we build a query around the previous query from the step above. In order to successfully build the subquery, we needed to enclose the subquery in parentheses and ensure that we alias the table (since we're using MySQL, note that this isn't a requirement for some SQL variants). We also need to ensure that we include all fields needed for the outer query (e.g. date and total sales).

SELECT

#in this query, we do not group by since we're using a windowing function

date,

SUM(total_usd) OVER (

ORDER BY date ASC rows

BETWEEN unbounded preceding and current row)

as cum_total, # this is a windowing function to

# calculate the cumulative sum

SUM(total_usd) OVER () as total # this is a windowing function to

# calculate the total

FROM(

SELECT

#we'll need to sum sale_usd by the date

date,

sum(sale_usd) as total_usd

FROM sales_info

#because we're aggregating sale_usd by date, we need

#need to group by date

GROUP BY date

) as q1 #we create an alias for this table as required by MySQL



3. Divide the cumulative total sales by the cumulative sum

The last step is to divide the cum_total by the total. We can accomplish this in the same step as above (just dividing the two windowing functions), or we can build a subquery over the previous step. The query below uses another subquery resulting in the end query having 2 nested subqueries. You can interact with the query below with this SQL fiddle.

SELECT

date,

100 * cum_total / total as

FROM(

#in this query, we do not group by

#since we're using a windowing function

SELECT

date,

SUM(total_usd) OVER (ORDER BY date ASC

rows BETWEEN unbounded preceding and current row)

as cum_total, # this is a windowing function to

# calculate the cumulative sum

SUM(total_usd) OVER () as total # this is a windowing function

# to calculate the total

FROM(

SELECT

#we'll need to sum sale_usd by the date

date,

sum(sale_usd) as total_usd

FROM sales_info

#because we're aggregating sale_usd by date, we need

#need to group by date

GROUP BY date

) as q1 #we create an alias for this table as required by MySQL

) as q2