Table of contents
Open Table of contents
Checklist
4-Step Framework to Solve SQL Problems during an Interview
- Explore Underlying Data
- Identify Required Columns
- Visualize The Output
- Build Solution Step-By-Step And Test
Debug your Code
Follow the 4 steps to debug your code for most common mistakes you may have:
- DISTINCT or not?
- Do you need to return distinct results?
- AGG what? Data Types?
- Which column should you aggregate?
- e.g., in some cases,
COUNT(*)!=COUNT(col)
- e.g., in some cases,
- What Data Type should the result be?
INT,BOOL, orNUMERIC?
- Which column should you aggregate?
- NULL values & Missing values?
- How to handle
NULLvalues? Pay attention to the Primary Keys. - How to handle the missing values, should we leave
0or remove the records?
- How to handle
- WHERE, HAVING, or CASE?
- While they are all filters, think clear before applying the filter.
- Especially when
LEFT JOIN,HAVING(on the right table) is completely different than applying WHERE clause on the entire tables.- When you don’t need the records, apply
WHERE - When you need the records from the left table and leave
NULLor0on the results, applyHAVING
- When you don’t need the records, apply
Complete logical execution order of SQL
- FROM, JOIN
- WHERE
- GROUP BY
- HAVING
- Window functions
- SELECT
- DISTINCT
- ORDER BY
- LIMIT/OFFSET
That’s why you can use alias when ORDER the results since ORDER BY is at the end of the execution sequence. While HAVING is before SELECT, in some database like Postgresql and MySQL, you can still use alias when HAVING.
1. FROM/JOIN ← No aliases exist yet
2. WHERE ← No aliases exist yet ❌
3. GROUP BY ← Aliases may work (DB-dependent)
4. HAVING ← Aliases often work ✅
5. SELECT ← Aliases are CREATED here
6. DISTINCT ← Can use aliases (DB-dependent)
7. ORDER BY ← Aliases always work ✅
8. LIMIT/OFFSET ← Works on final result
The rule: You can only use aliases in clauses that execute AFTER SELECT (or at the same time).
The Rules by Clause
| Clause | Executes | Name Resolution Priority |
|---|---|---|
| WHERE | Before SELECT | Original column only (aliases don’t exist) |
| GROUP BY | Before SELECT | Original column first, then alias |
| HAVING | After GROUP BY | Alias first, then original column |
| SELECT | Creating aliases | N/A (this is where aliases are born) |
| ORDER BY | After SELECT | Alias first, then original column |
While Postgresql can use alias when GROUP BY, HAVING, ORDER BY, you should NOT to use it when GROUP BY since it use original column first! Just explicitly specify the columns for GROUP BY!
An example of GROUP BY - Before SELECT - Original column first, then alias: This code is only standard SQL but runs on MySQL/Postgresql/SQLit
SELECT
CASE
WHEN income < 20000 THEN 'Low Salary'
WHEN income < 50000 THEN 'Average Salary'
ELSE 'High Salary'
END AS category,
COUNT(*) AS accounts_count
FROM Accounts
GROUP BY category
Local Test
VSCode Extensions
- PostgreSQL: Postgresql DB connecter
- SQL Formatter VSCode: format SQL code.
- Data Wrangler: for handling tabular data in Jupyter.
Set up Your Local Test ENV
For better learning LeetCode SQL, you can set up your local SQL ENV.
For example, open your MySQL Workbench and connect to your local SQL server. Run:
create database leetcode;
use leetcode;
For the LeetCode SQL problem, click on SQL Schema under the question name. Then copy the SQL schema from the right window.

Open VSCode or any text editor that supports Regular Exression. Paste the SQL schema to any new file.
Like this:

Press Command + F for OS X users or Control + F for Windows users.
Click on the button to turn on User Regular Expression

For the area Find, input \n
For the area Replace, input ;\n
This command is for replacing every line break with a ; then add a line break.
You must have ; for each SQL statement to make it valid.
Then click on Replace All

Then you will get the valid SQL INSERT statements.

Copy and paste them to your MySQL workbench for local testing.
Don’t forget to drop your test table if you have a table with the same name. Otherwise you cannot run this SQL INSERT statements.
For example, LeetCode usually use table Order to store order data. You may have multiple questions with the same table name but different table schema. So you need to drop it first.
Tools
str_to_sql can convert LeetCode SQL testcase to SQL INSERT statemens for better debugging in your local SQL env.
Source Cdoe:
def str_to_sql(tb_string: str, tb_name: str, data_types=None) -> str:
"""
Convert a string representation of a table to an SQL INSERT statement.
Args:
tb_string: Multiline string with the first line as headers and the rest as data rows.
tb_name: Name of the SQL table to insert data into.
data_types(data type, optional): A set of data types to map each column.
Returns: SQL INSERT statement as a string.
"""
lines = tb_string.strip().split("\n")
headers = lines[0].split("|")[
1:-1
] # Skip the first and last empty splits from the borders
headers = [header.strip() for header in headers]
if data_types and len(data_types) != len(headers):
raise ValueError(
f"Expected {len(headers)} data types, received {len(data_types)} data types!"
)
values = []
# apply str to each column
if not data_types:
for line in lines[2:]: # Skip headers and '----'
row = line.split("|")[
1:-1
] # Skip the first and last empty splits from the borders
row = [i.strip() for i in row]
values.append(f"({', '.join(map(repr, row))})")
# map desired data type to each column
else:
for line in lines[2:]: # Skip headers and '----'
row = line.split("|")[
1:-1
] # Skip the first and last empty splits from the borders
row = [i.strip() for i in row]
i = 0
while i < len(data_types):
row[i] = data_types[i](row[i])
i += 1
values.append(f"({', '.join(map(repr, row))})")
columns = ", ".join(headers)
# multiline output
# values_string = ',\n'.join(values)
# single line output
values_string = ", ".join(values)
# multiline output
# return f"INSERT INTO {tb_name} ({columns}) VALUES\n{values_string};"
# single line output
return f"INSERT INTO {tb_name} ({columns}) VALUES {values_string};"
How get your failed testcase and use this program?
- Under the Input area of your Testcase, select the table and copy it.

- Paste the table string to your notebook and assign the value to variable
tb_string. Assigntb_namewith the same table name as your Testcase. Then run funcitonstr_to_sql. - Notice you can also pass date types in a tuple to the parameter
data_types.
Example 1:
tb_string = """
| player_id | device_id | event_date | games_played |
| --------- | --------- | ---------- | ------------ |
| 1 | 2 | 2016-03-01 | 5 |
| 1 | 2 | 2016-05-02 | 6 |
| 2 | 3 | 2017-06-25 | 1 |
| 3 | 1 | 2016-03-02 | 0 |
| 3 | 4 | 2018-07-03 | 5 |
"""
str_to_sql(tb_string, 'Customer')
Output:
'INSERT INTO Customer (customer_id, product_key) VALUES ('1', '5'), ('2', '6'), ('3', '5'), ('3', '6'), ('1', '6');'
Example 2:
Use print to skip the ''
Pass data types to the parameter data_types to assign data type for each column.
If you skip the para data_types, the default data type for each column is string.
print(str_to_sql(tb_string, 'Activity', (int, int, str, int)))
INSERT INTO Activity (player_id, device_id, event_date, games_played) VALUES (1, 2, '2016-03-01', 5), (1, 2, '2016-05-02', 6), (2, 3, '2017-06-25', 1), (3, 1, '2016-03-02', 0), (3, 4, '2018-07-03', 5);