Skip to content
Go back

LeetCode SQL Hack

Table of contents

Open Table of contents

Checklist

4-Step Framework to Solve SQL Problems during an Interview

  1. Explore Underlying Data
  2. Identify Required Columns
  3. Visualize The Output
  4. Build Solution Step-By-Step And Test

Debug your Code

Follow the 4 steps to debug your code for most common mistakes you may have:


Complete logical execution order of SQL

  1. FROM, JOIN
  2. WHERE
  3. GROUP BY
  4. HAVING
  5. Window functions
  6. SELECT
  7. DISTINCT
  8. ORDER BY
  9. LIMIT/OFFSET

That’s why you can use alias when ORDER the results since ORDER BY is at the end of the execution sequence. While HAVING is before SELECT, in some database like Postgresql and MySQL, you can still use alias when HAVING.

1. FROM/JOINNo aliases exist yet
2. WHERENo aliases exist yet ❌
3. GROUP BY        ← Aliases may work (DB-dependent)
4. HAVING          ← Aliases often work ✅
5. SELECT          ← Aliases are CREATED here
6. DISTINCT        ← Can use aliases (DB-dependent)
7. ORDER BY        ← Aliases always work ✅
8. LIMIT/OFFSET    ← Works on final result

The rule: You can only use aliases in clauses that execute AFTER SELECT (or at the same time).

The Rules by Clause

ClauseExecutesName Resolution Priority
WHEREBefore SELECTOriginal column only (aliases don’t exist)
GROUP BYBefore SELECTOriginal column first, then alias
HAVINGAfter GROUP BYAlias first, then original column
SELECTCreating aliasesN/A (this is where aliases are born)
ORDER BYAfter SELECTAlias first, then original column

While Postgresql can use alias when GROUP BY, HAVING, ORDER BY, you should NOT to use it when GROUP BY since it use original column first! Just explicitly specify the columns for GROUP BY!

An example of GROUP BY - Before SELECT - Original column first, then alias: This code is only standard SQL but runs on MySQL/Postgresql/SQLit

SELECT
	CASE
		WHEN income < 20000 THEN 'Low Salary'
		WHEN income < 50000 THEN 'Average Salary'
		ELSE 'High Salary'
	END AS category,
	COUNT(*) AS accounts_count
FROM Accounts
GROUP BY category

Local Test

VSCode Extensions


Set up Your Local Test ENV

For better learning LeetCode SQL, you can set up your local SQL ENV.

For example, open your MySQL Workbench and connect to your local SQL server. Run:

create database leetcode;
use leetcode;

For the LeetCode SQL problem, click on SQL Schema under the question name. Then copy the SQL schema from the right window.

Example DB Schema

Open VSCode or any text editor that supports Regular Exression. Paste the SQL schema to any new file.

Like this:

Searching on text editor

Press Command + F for OS X users or Control + F for Windows users.

Click on the button to turn on User Regular Expression

Replace by regular expression

For the area Find, input \n For the area Replace, input ;\n This command is for replacing every line break with a ; then add a line break. You must have ; for each SQL statement to make it valid.

Then click on Replace All

Replace All

Then you will get the valid SQL INSERT statements.

Complete the INSERT statement

Copy and paste them to your MySQL workbench for local testing.

Don’t forget to drop your test table if you have a table with the same name. Otherwise you cannot run this SQL INSERT statements.

For example, LeetCode usually use table Order to store order data. You may have multiple questions with the same table name but different table schema. So you need to drop it first.


Tools

str_to_sql can convert LeetCode SQL testcase to SQL INSERT statemens for better debugging in your local SQL env.

Source Cdoe:

def str_to_sql(tb_string: str, tb_name: str, data_types=None) -> str:
    """
    Convert a string representation of a table to an SQL INSERT statement.

    Args:
        tb_string: Multiline string with the first line as headers and the rest as data rows.
        tb_name: Name of the SQL table to insert data into.
        data_types(data type, optional): A set of data types to map each column.

    Returns: SQL INSERT statement as a string.
    """
    lines = tb_string.strip().split("\n")
    headers = lines[0].split("|")[
        1:-1
    ]  # Skip the first and last empty splits from the borders
    headers = [header.strip() for header in headers]

    if data_types and len(data_types) != len(headers):
        raise ValueError(
            f"Expected {len(headers)} data types, received {len(data_types)} data types!"
        )

    values = []
    # apply str to each column
    if not data_types:
        for line in lines[2:]:  # Skip headers and '----'
            row = line.split("|")[
                1:-1
            ]  # Skip the first and last empty splits from the borders
            row = [i.strip() for i in row]
            values.append(f"({', '.join(map(repr, row))})")
    # map desired data type to each column
    else:
        for line in lines[2:]:  # Skip headers and '----'
            row = line.split("|")[
                1:-1
            ]  # Skip the first and last empty splits from the borders
            row = [i.strip() for i in row]
            i = 0
            while i < len(data_types):
                row[i] = data_types[i](row[i])
                i += 1
            values.append(f"({', '.join(map(repr, row))})")

    columns = ", ".join(headers)
    # multiline output
    # values_string = ',\n'.join(values)
    # single line output
    values_string = ", ".join(values)

    # multiline output
    # return f"INSERT INTO {tb_name} ({columns}) VALUES\n{values_string};"
    # single line output
    return f"INSERT INTO {tb_name} ({columns}) VALUES {values_string};"

How get your failed testcase and use this program?

  1. Under the Input area of your Testcase, select the table and copy it.
  2. Paste the table string to your notebook and assign the value to variable tb_string. Assign tb_name with the same table name as your Testcase. Then run funciton str_to_sql.
  3. Notice you can also pass date types in a tuple to the parameter data_types.

Example 1:

tb_string = """
| player_id | device_id | event_date | games_played |
| --------- | --------- | ---------- | ------------ |
| 1         | 2         | 2016-03-01 | 5            |
| 1         | 2         | 2016-05-02 | 6            |
| 2         | 3         | 2017-06-25 | 1            |
| 3         | 1         | 2016-03-02 | 0            |
| 3         | 4         | 2018-07-03 | 5            |
"""

str_to_sql(tb_string, 'Customer')

Output:

'INSERT INTO Customer (customer_id, product_key) VALUES ('1', '5'), ('2', '6'), ('3', '5'), ('3', '6'), ('1', '6');'

Example 2:

Use print to skip the '' Pass data types to the parameter data_types to assign data type for each column. If you skip the para data_types, the default data type for each column is string.

print(str_to_sql(tb_string, 'Activity', (int, int, str, int)))
INSERT INTO Activity (player_id, device_id, event_date, games_played) VALUES (1, 2, '2016-03-01', 5), (1, 2, '2016-05-02', 6), (2, 3, '2017-06-25', 1), (3, 1, '2016-03-02', 0), (3, 4, '2018-07-03', 5);

Share this post on:

Previous Post
Database Transaction Management
Next Post
LeetCode SQL - Scalar Fuction