Removing Duplicates with DISTINCT in SQL
Removing Duplicates with DISTINCT in SQL
When querying data from a database, you may encounter duplicate rows in your result set. The DISTINCT
keyword in SQL is used to eliminate duplicate rows, ensuring that the results returned by your query are unique. This is particularly useful when you want to retrieve a list of unique values from a column or a combination of columns.
1. Understanding DISTINCT
The DISTINCT
keyword is applied to the SELECT
statement to remove duplicate rows from the result set. When you use DISTINCT
, SQL scans the result set and removes any duplicate rows, returning only unique entries.
Syntax:
column1, column2, ...
: The columns you want to retrieve, with duplicates removed.
2. Using DISTINCT with a Single Column
You can use DISTINCT
to retrieve unique values from a single column. This is useful when you want to get a list of all distinct values in that column.
Example:
In this example:
The
SELECT DISTINCT
statement retrieves a list of unique departments from theemployees
table.
Result:
Sales
Marketing
IT
HR
In this output:
The result contains only one row for each unique department, even if multiple employees work in the same department.
3. Using DISTINCT with Multiple Columns
You can also use DISTINCT
to retrieve unique combinations of values from multiple columns. This ensures that the combination of values in the specified columns is unique across the result set.
Example:
In this example:
The
SELECT DISTINCT
statement retrieves unique combinations offirst_name
andlast_name
from theemployees
table.
Result:
Amit
Sharma
Neha
Singh
Ravi
Verma
In this output:
The result contains unique combinations of
first_name
andlast_name
, even if some first names or last names appear multiple times in the table.
4. Combining DISTINCT with Other Clauses
You can combine DISTINCT
with other SQL clauses such as WHERE
, ORDER BY
, and LIMIT
to further refine your query.
Example 1: Using DISTINCT with WHERE
In this example:
The
SELECT DISTINCT
statement retrieves a list of unique departments where employees have a salary greater than 50,000.
Result:
Sales
IT
In this output:
Only the departments with employees earning more than 50,000 are included in the result, and duplicates are removed.
Example 2: Using DISTINCT with ORDER BY
In this example:
The
SELECT DISTINCT
statement retrieves unique departments and sorts them in ascending order.
Result:
HR
IT
Marketing
Sales
In this output:
The unique department names are displayed in alphabetical order.
Example 3: Using DISTINCT with LIMIT
In this example:
The
SELECT DISTINCT
statement retrieves unique departments, sorts them in ascending order, and limits the result to the top 3 rows.
Result:
HR
IT
Marketing
In this output:
The top 3 unique departments, sorted alphabetically, are displayed.
5. Handling NULL Values with DISTINCT
When using DISTINCT
, SQL treats NULL
values as unique. If a column contains NULL
values, the DISTINCT
keyword will include one NULL
in the result set, even if there are multiple NULL
values.
Example:
If the employees
table has the following data:
Sales
Marketing
IT
NULL
Sales
NULL
Result:
Sales
Marketing
IT
NULL
In this output:
The
DISTINCT
keyword removes duplicate values, including multipleNULL
entries, leaving only oneNULL
in the result set.
6. Practical Use Case: Fetching Unique Customer Locations
Consider a scenario where you want to retrieve a list of unique cities where your customers are located.
Example:
In this example:
The
SELECT DISTINCT
statement retrieves a list of unique cities from thecustomers
table, ensuring that each city appears only once.
Result:
New York
Los Angeles
Chicago
Houston
This query helps you identify all the different cities where your customers are based, without duplicate entries.
Conclusion
The DISTINCT
keyword in SQL is a powerful tool for ensuring that your query results contain only unique values. Whether you're working with a single column or multiple columns, DISTINCT
helps you eliminate duplicates and retrieve clean, unique data. By combining DISTINCT
with other SQL clauses, you can create refined and efficient queries that return the exact data you need.
This tutorial is brought to you by codes with pankaj.
Last updated