Top 100 SQL Interview Questions
Basic SQL Questions
What is SQL?
- SQL stands for Structured Query Language, and it is used for managing and manipulating relational databases.
Explain the SELECT statement.
- The SELECT statement is used to retrieve data from a database. It can be used to retrieve specific columns or all columns from a table.
What is the difference between SQL and MySQL?
- SQL is a language used for managing and querying databases, while MySQL is a relational database management system (RDBMS) that uses SQL.
Explain the WHERE clause in SQL.
- The WHERE clause is used to filter records based on a specified condition.
What is a primary key?
- A primary key is a unique identifier for each record in a table. It must contain unique values and cannot be null.
Intermediate SQL Questions
Explain the JOIN clause in SQL.
- JOIN is used to combine rows from two or more tables based on a related column between them.
What is a foreign key?
- A foreign key is a field that refers to the primary key in another table. It establishes a link between two tables.
What is normalization?
- Normalization is the process of organizing data in a database to reduce redundancy and improve data integrity.
Explain GROUP BY and HAVING clauses.
- GROUP BY is used to group rows based on the values of one or more columns. HAVING is used to filter results after grouping.
What is an index?
- An index is a data structure that improves the speed of data retrieval operations on a database table.
Explain the difference between INNER JOIN and LEFT JOIN.
- INNER JOIN returns only matching rows from both tables, while LEFT JOIN returns all rows from the left table and the matching rows from the right table.
What is a subquery?
- A subquery is a query nested inside another query. It can be used to retrieve data that will be used by the main query.
Explain the difference between UNION and UNION ALL.
- UNION combines the result sets of two SELECT statements and removes duplicates, while UNION ALL includes all rows, including duplicates.
What is a stored procedure?
- A stored procedure is a set of SQL statements that can be stored in the database and executed later as a single unit.
Explain the concept of ACID properties in a database.
- ACID stands for Atomicity, Consistency, Isolation, and Durability, which are properties that guarantee database transactions are processed reliably.
Advanced SQL Questions
What is a trigger?
- A trigger is a set of instructions that are automatically executed or fired when a certain event occurs in the database.
Explain the concept of a view.
- A view is a virtual table based on the result of a SELECT statement. It does not store the data itself but provides a way to represent it.
What is the difference between DELETE and TRUNCATE?
- DELETE is used to delete specific rows from a table based on a condition, while TRUNCATE is used to remove all rows from a table.
Explain the concept of a cursor in SQL.
- A cursor is a database object used to process a result set one row at a time.
What is the purpose of the COMMIT and ROLLBACK statements?
- COMMIT is used to save changes made during the current transaction, while ROLLBACK is used to undo changes made during the current transaction.
Explain the concept of a deadlock.
- A deadlock occurs when two or more transactions are blocked indefinitely, each waiting for the other to release a lock.
What is the difference between a clustered and non-clustered index?
- A clustered index determines the physical order of data rows in a table, while a non-clustered index does not.
Explain the concept of normalization forms.
- Normalization forms are rules that define how to organize data in a relational database to reduce redundancy and improve data integrity.
What is the purpose of the CASE statement in SQL?
- The CASE statement is used to perform conditional logic within a SQL query, similar to an IF-THEN-ELSE statement.
Explain the concept of a materialized view.
- A materialized view is a database object that contains the results of a query and is stored as a physical table.
Write a SQL query to find the second highest salary from an Employee table.
SELECT MAX(salary) FROM Employee WHERE salary < (SELECT MAX(salary) FROM Employee);
Write a SQL query to count the number of rows in a table.
SELECT COUNT(*) FROM TableName;
Write a SQL query to find duplicate records in a table.
SELECT column1, column2, COUNT(*) FROM TableName GROUP BY column1, column2 HAVING COUNT(*) > 1;
Write a SQL query to retrieve the top N records from a table.
SELECT * FROM TableName LIMIT N;
Find the second-highest salary from an “employees” table.
SELECT MAX(salary) FROM employees WHERE salary < (SELECT MAX(salary) FROM employees);
- Explanation: This query uses a subquery to find the highest salary and then retrieves the next highest salary by filtering salaries less than the maximum.
Calculate the running total of sales for each month in a “sales” table.
SELECT sale_date, amount, SUM(amount) OVER (ORDER BY sale_date) AS running_total FROM sales;
- Explanation: The query uses the window function
SUM() OVER
to calculate the running total of sales, ordered by the sale date.
Retrieve the top 3 most sold products from a “products” table.
SELECT product_id, product_name, sold_units FROM ( SELECT product_id, product_name, SUM(units_sold) AS sold_units FROM sales JOIN products ON sales.product_id = products.product_id GROUP BY product_id, product_name ORDER BY sold_units DESC LIMIT 3 ) AS top_products;
- Explanation: This query joins the “sales” and “products” tables, calculates the total sold units for each product, orders them in descending order, and selects the top 3.
Identify customers who have made at least three consecutive purchases.
SELECT customer_id, MIN(sale_date) AS start_date, MAX(sale_date) AS end_date, COUNT(*) AS consecutive_purchases FROM ( SELECT customer_id, sale_date, ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY sale_date) - ROW_NUMBER() OVER (PARTITION BY customer_id, DATE_ADD(sale_date, INTERVAL 1 DAY) ORDER BY sale_date) AS grp FROM sales ) AS purchase_groups GROUP BY customer_id, grp HAVING consecutive_purchases >= 3;
- Explanation: This query uses window functions to create groups of consecutive purchases and then identifies customers with at least three consecutive purchases.
Calculate the moving average of the “temperature” column for a weather table over a 7-day window.
SELECT date, temperature, AVG(temperature) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS moving_avg FROM weather;
Explanation: The query calculates the moving average of the “temperature” column using the window function AVG() OVER
with a specified window of the last 7 days.
Update the “rank” column in a “students” table based on their scores, assigning the same rank to students with equal scores.
UPDATE students SET rank = ( SELECT COUNT(DISTINCT score) + 1 FROM students s2 WHERE s2.score > students.score );
- Explanation: This query uses a correlated subquery to update the “rank” column based on the count of distinct scores greater than the current student’s score.
Pivot the “sales” table to show total sales for each product in each month.
SELECT product_id, product_name, COALESCE(SUM(CASE WHEN MONTH(sale_date) = 1 THEN amount END), 0) AS january_sales, COALESCE(SUM(CASE WHEN MONTH(sale_date) = 2 THEN amount END), 0) AS february_sales, -- Repeat for other months COALESCE(SUM(CASE WHEN MONTH(sale_date) = 12 THEN amount END), 0) AS december_sales FROM sales JOIN products ON sales.product_id = products.product_id GROUP BY product_id, product_name;
This query uses conditional aggregation (CASE statements) to pivot the “sales” table and display total sales for each product in each month.
Identify the longest consecutive sequence of login dates for each user in a “logins” table.
SELECT user_id, MIN(login_date) AS start_date, MAX(login_date) AS end_date, COUNT(*) AS consecutive_days FROM ( SELECT user_id, login_date, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY login_date) - ROW_NUMBER() OVER (PARTITION BY user_id, DATE_ADD(login_date, INTERVAL 1 DAY) ORDER BY login_date) AS grp FROM logins ) AS login_groups GROUP BY user_id, grp ORDER BY consecutive_days DESC LIMIT 1;
- Explanation: This query identifies the longest consecutive sequence of login dates for each user by creating groups of consecutive logins using window functions and then selecting the maximum consecutive days.
SQL Query Questions
What is an SQL query?
- Explanation: An SQL query is a statement used to retrieve, insert, update, or delete data from a database. It is written using the SQL language and can be executed against a relational database management system (RDBMS).
Explain the difference between SQL and T-SQL.
- Explanation: SQL (Structured Query Language) is a standard language for managing and manipulating relational databases. T-SQL (Transact-SQL) is Microsoft’s proprietary extension of SQL used with SQL Server and Azure SQL Database.
How do you retrieve all columns from a table named “employees”?
- Explanation: Use the
SELECT * FROM employees;
query to retrieve all columns from the “employees” table.
What is the purpose of the WHERE clause in a SQL query?
- Explanation: The WHERE clause is used to filter the rows returned by a query based on a specified condition. It is used in SELECT, UPDATE, and DELETE statements.
Write a SQL query to find the total number of rows in a table named “orders”.
- Explanation:
SELECT COUNT(*) FROM orders;
Explain the difference between INNER JOIN and LEFT JOIN.
- Explanation: INNER JOIN returns only matching rows from both tables, while LEFT JOIN returns all rows from the left table and the matching rows from the right table.
Write a SQL query to retrieve unique values from a column named “category” in a table named “products”.
- Explanation:
SELECT DISTINCT category FROM products;
What is the purpose of the GROUP BY clause in SQL?
- Explanation: GROUP BY is used to group rows based on the values of one or more columns. It is often used with aggregate functions like COUNT, SUM, AVG, etc.
Explain the HAVING clause in SQL.
- Explanation: HAVING is used in combination with the GROUP BY clause to filter the results of aggregate functions based on a specified condition.
Write a SQL query to find the highest salary from an “employees” table.
- Explanation:
SELECT MAX(salary) FROM employees;
What is the purpose of the ORDER BY clause in SQL?
- Explanation: ORDER BY is used to sort the result set of a query in ascending or descending order based on one or more columns.
Write a SQL query to update the “status” column to ‘Approved’ in a “orders” table where the “amount” is greater than 1000.
- Explanation:
UPDATE orders SET status = 'Approved' WHERE amount > 1000;
Explain the concept of a subquery.
- Explanation: A subquery is a query nested inside another query. It can be used to retrieve data that will be used by the main query.
Write a SQL query to find the average salary for each department in a table named “employees”.
- Explanation:
SELECT department, AVG(salary) FROM employees GROUP BY department;
What is the purpose of the LIMIT clause in SQL?
- Explanation: The LIMIT clause is used to restrict the number of rows returned by a query. It is often used for pagination.
Write a SQL query to delete all records from a table named “customers” where the “last_purchase_date” is older than a year.
- Explanation:
DELETE FROM customers WHERE last_purchase_date < NOW() - INTERVAL 1 YEAR;
Explain the concept of a self-join.
- Explanation: A self-join occurs when a table is joined with itself. It is used to combine rows in a table with related rows in the same table.
Write a SQL query to find the top 5 highest-paid employees from an “employees” table.
- Explanation:
SELECT * FROM employees ORDER BY salary DESC LIMIT 5;
What is the purpose of the DISTINCT keyword in a SELECT statement?
- Explanation: DISTINCT is used to retrieve unique values from a specified column in the result set of a query.
Write a SQL query to calculate the total sales for each month in a table named “sales”.
- Explanation:
SELECT MONTH(sale_date) AS month, SUM(amount) AS total_sales FROM sales GROUP BY MONTH(sale_date);
FQAs
Window Functions:
Q: What are window functions in SQL, and when would you use them?
- A: Window functions perform calculations across a specified range of rows related to the current row. They are used for tasks like ranking, moving averages, and cumulative sums.
Q: Explain the difference between the ROW_NUMBER() and RANK() functions.
- A: ROW_NUMBER() assigns a unique number to each row, while RANK() assigns a rank, with ties receiving the same rank and leaving gaps.
Q: How can you use the LAG() and LEAD() functions in a query?
- A: LAG() accesses data from a previous row, and LEAD() accesses data from a subsequent row, both relative to the current row.
Common Table Expressions (CTEs):
Q: What is a Common Table Expression (CTE), and how is it different from a subquery?
- A: A CTE is a named temporary result set defined within a SELECT, INSERT, UPDATE, or DELETE statement. It enhances readability and can be referenced multiple times.
Q: Provide an example where using a CTE improves the readability and performance of a query.
- A: Using a CTE to calculate aggregates or recursive queries can enhance readability and simplify complex logic.
Stored Procedures and Functions:
Q: Compare and contrast stored procedures and functions. When would you prefer one over the other?
- A: Both are reusable blocks of code, but stored procedures do not have a return value, while functions return a value. Use stored procedures for actions, functions for calculations.
Q: How do you handle errors within a stored procedure?
- A: Use a combination of TRY, CATCH blocks to handle errors gracefully within a stored procedure.
Indexes and Query Optimization:
Q: Explain the importance of indexes in a database. When should you create composite indexes?
- A: Indexes improve query performance by allowing the database engine to quickly locate rows. Composite indexes are beneficial when queries involve multiple columns.
Q: Discuss the concept of covering indexes and their impact on query performance.
- A: Covering indexes include all columns needed for a query, eliminating the need to access the actual table and improving performance.
Transactions and Concurrency Control:
Q: What is a transaction in the context of a relational database?
- A: A transaction is a sequence of one or more SQL statements treated as a single unit, ensuring data consistency and integrity.
Q: How does isolation level affect the behavior of transactions in a multi-user environment?
- A: Isolation level defines the degree to which one transaction must be isolated from the effects of other concurrent transactions, impacting consistency and performance.
Advanced Joins:
Q: Explain the differences between INNER JOIN, LEFT JOIN, and RIGHT JOIN. Provide examples of when to use each.
- A: INNER JOIN returns matching rows, LEFT JOIN returns all from the left table and matching from the right, and RIGHT JOIN is the opposite. Choose based on data requirements.
Q: How can you achieve the same result as a FULL JOIN using other join types?
- A: You can simulate a FULL JOIN using a combination of LEFT JOIN and RIGHT JOIN with UNION.
Materialized Views:
Q: What is a materialized view, and how does it differ from a regular view?
- A: A materialized view is a physical copy of the result set of a query, stored for faster retrieval. It differs from a regular view, which is a virtual table.
Q: In what scenarios would you choose to use a materialized view?
- A: Materialized views are useful for complex queries with large datasets where precomputing and storing results can significantly improve performance.
Recursive Queries:
Q: Describe a scenario where a recursive query is necessary. Provide an example.
- A: Recursive queries are useful for handling hierarchical data, such as organizational charts or category hierarchies.
Q: How does a recursive common table expression work?
- A: A recursive CTE refers to itself in the SELECT statement, allowing iterative processing until a specified condition is met.
Database Security:
Q: What are some best practices for securing a database?
- A: Best practices include using strong passwords, limiting access with roles and permissions, encrypting sensitive data, and regularly updating and patching the database system.
Q: Explain the concept of role-based access control (RBAC) in the context of database security.
- A: RBAC assigns permissions to roles, and users are then assigned to roles. This simplifies permission management and ensures consistent access control.
Advanced Subqueries:
Q: How do correlated subqueries differ from non-correlated subqueries?
- A: Correlated subqueries refer to columns in the outer query, while non-correlated subqueries can run independently of the outer query.
Q: Provide an example where a subquery is more appropriate than a join.
- A: Subqueries are often more suitable when the inner query result depends on the outer query.
JSON in SQL:
Q: How does SQL handle JSON data? What functions can you use to manipulate JSON?
- A: SQL has functions like JSON_VALUE, JSON_QUERY, and JSON_MODIFY to extract and manipulate data within JSON documents.
Q: Demonstrate how to extract specific information from a JSON column.
- A: Using JSON_VALUE or JSON_QUERY functions along with the path to the desired data.
Temporal Tables:
Q: Explain the purpose of temporal tables. How do they simplify historical data tracking?
- A: Temporal tables track changes to data over time, making it easy to retrieve historical versions of records.
Q: Provide a query that retrieves data as of a specific point in time using temporal tables.
- A: Use the AS OF SYSTEM TIME clause to query temporal tables at a specific timestamp.