sql Interview Questions
(5 questions)
Q1. Find the 2nd highest salary from an Employee table
“There are multiple ways to find the second highest salary, but the most reliable and interview-friendly approach is using DENSE_RANK.”
-
Best approach: DENSE_RANK()
- Assigns ranks based on salary in descending order
- Highest salary gets rank 1, second highest gets rank 2
- Correctly handles duplicate salaries
SELECT salary FROM ( SELECT salary, DENSE_RANK() OVER (ORDER BY salary DESC) AS rnk FROM Employee ) t WHERE rnk = 2; -
Alternative approach — MAX with subquery
- First find the highest salary
- Then get the maximum salary less than that
SELECT MAX(salary) FROM Employee WHERE salary < (SELECT MAX(salary) FROM Employee); -
Other option — ORDER BY with OFFSET
- Works, but not always preferred in interviews
“Prefer DENSE_RANK in real-time, because it’s more robust and handles edge cases like duplicate salaries properly.”
Q2. Delete duplicate rows from a table keeping only one
“The goal here is to remove duplicate records while keeping one unique row, and the most preferred approach is using ROW_NUMBER.”
-
Best approach — ROW_NUMBER() with CTE
- Identify duplicates using PARTITION BY
- Assign row numbers to each duplicate group
- Keep the first row and delete the rest
WITH cte AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY name, salary ORDER BY id) AS rn FROM Employee ) DELETE FROM cte WHERE rn > 1; -
Alternative approach — DISTINCT with temp table
- Copy unique records into a new table
- Truncate original table and reinsert data
-
GROUP BY (supporting approach)
- Used to identify duplicates, not directly delete them
Q3. Calculate cumulative sum (running total) of sales by date
“To calculate a running total, we use window functions like SUM() OVER(), which allows us to accumulate values across rows.”
-
Best approach: SUM() OVER()
- Calculates cumulative sum in a defined order
- ORDER BY sale_date ensures values are added sequentially
SELECT sale_date, sales_amount, SUM(sales_amount) OVER (ORDER BY sale_date) AS running_total FROM sales; -
With PARTITION (if needed)
- Used when we want running totals within a group, like per region
SELECT sale_date, region, sales_amount, SUM(sales_amount) OVER (PARTITION BY region ORDER BY sale_date) AS running_total FROM sales;
“In simple terms, a running total means current value plus all previous values, and SUM() OVER with ORDER BY handles this automatically.”
Q4. Explain LEAD and LAG functions with an example
“LEAD and LAG are window functions used to access values from the next or previous row without using joins.”
-
LAG() — Previous row value
- Fetches data from the previous row based on ordering
- First row returns NULL since there is no previous row
SELECT sale_date, sales_amount, LAG(sales_amount) OVER (ORDER BY sale_date) AS prev_day_sales FROM sales; -
LEAD() — Next row value
- Fetches data from the next row
- Last row returns NULL since there is no next row
SELECT sale_date, sales_amount, LEAD(sales_amount) OVER (ORDER BY sale_date) AS next_day_sales FROM sales; -
Real use case
- Compare current value with previous value
SELECT sale_date, sales_amount, sales_amount - LAG(sales_amount) OVER (ORDER BY sale_date) AS diff FROM sales;
“In simple terms, LAG is used to look back to the previous row, and LEAD is used to look ahead to the next row.”
Q5. Explain all Window Functions and their syntax
“Window functions are used to perform calculations across a set of rows related to the current row, without reducing the number of rows like GROUP BY.”
-
Basic Syntax
function_name(column) OVER ( PARTITION BY column ORDER BY column ) - OVER() defines the window - PARTITION BY splits data into groups - ORDER BY defines the sequence within each group
-
Aggregate Window Functions
- Perform calculations like SUM, AVG, COUNT, MIN, MAX without collapsing rows
- Example: running total
SUM(sales_amount) OVER (ORDER BY sale_date) -
Ranking Functions
- Used to assign ranks to rows
ROW_NUMBER()→ unique sequenceRANK()→ same rank with gapsDENSE_RANK()→ same rank without gapsNTILE(n)→ divides rows into groups
-
Value Functions
- Access values from other rows
LAG()→ previous rowLEAD()→ next rowFIRST_VALUE()→ first value in windowLAST_VALUE()→ last value in window
-
Window Frame (Advanced)
- Defines the range of rows used in calculation
SUM(sales) OVER ( ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW ) - Means from first row to current row
“In simple terms, window functions allow us to perform calculations across rows while still keeping each row in the result.”