How to Delete Duplicate Rows in SQL Using a Reliable Solution?

  • Published on December 2nd, 2021

  • Updated on September 4, 2023

Are you looking for a reliable way to delete duplicate rows in SQL? If duplicate rows keep appearing in your SQL tables, you will find a suitable solution in this blog.

The issue of duplicate rows in SQL is not new. SQL users often face this problem, and it hinders their work. Duplicate rows can be avoided by using a primary key, identity columns, unique clustered or non-clustered indexes, constraints, etc. However, there are cases where these rules are not implemented, and duplicates end up in the table.
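As a rough illustration of how a constraint blocks duplicates up front, here is a minimal sketch. The table customer_email, its columns, and the constraint name are hypothetical and used only for this example.

```sql
-- Hypothetical table; all names here are illustrative only
CREATE TABLE customer_email
(
    customer_id INT IDENTITY(1,1) PRIMARY KEY,
    email       VARCHAR(255) NOT NULL,
    CONSTRAINT UQ_customer_email UNIQUE (email)
);

-- The first insert succeeds
INSERT INTO customer_email (email) VALUES ('a@example.com');

-- Inserting the same value again is rejected by the UNIQUE constraint
INSERT INTO customer_email (email) VALUES ('a@example.com');
```

With a rule like this in place, the duplicate never reaches the table, so no cleanup is needed later.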

A well-designed SQL table should not contain duplicate rows. To avoid these issues and delete duplicates in SQL, follow the solutions provided below. But before that, let us take a look at why it is essential to delete duplicate records in SQL Server 2008 and other versions.

Why is it Necessary to Delete Duplicate Rows in SQL?

  • Duplicate rows waste storage space and can eventually affect the database's performance.
  • Duplicate rows make the data inaccurate and skew search results.
  • Users find it difficult to fetch the data they need because duplicate rows add unnecessary data.
  • When users perform tasks on the database, those tasks may start to cause trouble.

All the points listed above show why it is necessary to remove duplicate rows in SQL. So, follow the methods given below to delete duplicate records in SQL without a primary key.

How to Delete Duplicate Records in SQL Query?

Duplicate rows and records in SQL can be removed manually with different clauses. The methods to delete duplicate rows in SQL are described below. First, however, you need to create a sample table and data to run them against.

Creating a Sample Table

Run the commands below in the SQL database to create the sample data. They create a table that we will use to demonstrate how to resolve the duplicate rows issue.

CREATE TABLE original_table (key_value INT);

INSERT INTO original_table VALUES (1);
INSERT INTO original_table VALUES (1);
INSERT INTO original_table VALUES (1);
INSERT INTO original_table VALUES (2);
INSERT INTO original_table VALUES (2);
INSERT INTO original_table VALUES (2);
INSERT INTO original_table VALUES (2);
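Before deleting anything, it can help to confirm which values are duplicated. The following is a minimal sketch against the sample table above; it groups by key_value and keeps only the groups with more than one row. The alias copies is our own choice.

```sql
-- List each duplicated key_value together with how many times it occurs
SELECT key_value,
       COUNT(*) AS copies
FROM original_table
GROUP BY key_value
HAVING COUNT(*) > 1;
-- With the sample data: key_value 1 occurs 3 times and key_value 2 occurs 4 times
```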

Let us now start with the methods to delete duplicate records in SQL Server 2000 and other versions.

  • Deleting Duplicate Rows using Group By and Having Clause

In this method, we first use the GROUP BY clause together with HAVING to identify the duplicates in the table. The data is grouped by the defined columns, and any group with more than one row is a duplicate. Afterward, we delete the identified duplicate rows and insert a single copy of each back into the table. Run the commands below in the SQL database to delete duplicate rows in SQL.

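-- Copy one row per duplicated key_value into a holding table
SELECT DISTINCT *
INTO duplicate_table
FROM original_table
GROUP BY key_value
HAVING COUNT(key_value) > 1;

-- Remove every row whose key_value is duplicated
DELETE FROM original_table
WHERE key_value IN (SELECT key_value FROM duplicate_table);

-- Put a single copy of each duplicated row back
INSERT INTO original_table
SELECT *
FROM duplicate_table;

DROP TABLE duplicate_table;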

Once you complete the above task, the extra copies are gone and a single row remains for each key_value. To run this method without errors, you need enough free space in the database to hold the temporary duplicate_table. If you cannot delete duplicate records in SQL without a primary key using this method, follow the other solution below.
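Between the DELETE and the INSERT in the first method, the duplicated values are missing from the table entirely. One way to reduce that risk is to run the steps inside a transaction. The following is only a sketch under that assumption; the temporary table name #duplicate_keys is our own choice and not part of the original script.

```sql
-- Roll the whole transaction back automatically if any statement fails
SET XACT_ABORT ON;

BEGIN TRANSACTION;

-- Collect each duplicated key_value once in a session-local temp table
SELECT key_value
INTO #duplicate_keys
FROM original_table
GROUP BY key_value
HAVING COUNT(*) > 1;

-- Remove all copies of the duplicated values
DELETE FROM original_table
WHERE key_value IN (SELECT key_value FROM #duplicate_keys);

-- Re-insert exactly one copy of each
INSERT INTO original_table (key_value)
SELECT key_value
FROM #duplicate_keys;

DROP TABLE #duplicate_keys;

COMMIT TRANSACTION;
```

Other sessions then see either the table before the cleanup or the table after it, not the intermediate state.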

  • Delete Duplicate Rows in SQL using Common Table Expressions (CTE)

Common Table Expressions, generally known as CTEs, can be used to delete duplicate records in SQL Server 2008 and other versions. This method relies on the ROW_NUMBER function, which has been available since SQL Server 2005 and makes deleting duplicate rows in SQL Server much easier. The script below applies ROW_NUMBER inside a derived table, which behaves the same way as a CTE here. Run it in the SQL database to complete this method.

DELETE T
FROM
(
    SELECT *
         , DupRank = ROW_NUMBER() OVER (
               PARTITION BY key_value
               ORDER BY (SELECT NULL)
           )
    FROM original_table
) AS T
WHERE DupRank > 1;

When you run the above script, ROW_NUMBER numbers the rows within each key_value partition, and the DELETE then removes every row numbered higher than 1, leaving one row per value.
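The script above uses a derived table. The same logic can also be written as an actual CTE. The following is a sketch, assuming SQL Server 2005 or later; the CTE name numbered_rows is chosen only for illustration.

```sql
-- Number the rows inside each key_value group, then delete through the CTE
WITH numbered_rows AS
(
    SELECT key_value,
           ROW_NUMBER() OVER (
               PARTITION BY key_value
               ORDER BY (SELECT NULL)
           ) AS DupRank
    FROM original_table
)
DELETE FROM numbered_rows
WHERE DupRank > 1;
```

If another statement precedes the WITH keyword in the same batch, that statement must end with a semicolon.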

Now, if you compare the above two methods to delete duplicate rows in SQL, you will find the second method, which uses the ROW_NUMBER function, to be more efficient. The reasons are given below.

  • In the second method, there is no need to temporarily copy the duplicate rows or records to another table.
  • You do not have to join the original table with itself.

The only drawback of this method is that it does not work on versions older than SQL Server 2005, where ROW_NUMBER is not available.

  • Deleting Duplicate Rows using RANK Function

To describe this method effectively, we will guide you by creating a new sample table and data.

CREATE TABLE Employee
(
    [ID] INT IDENTITY(1,1),
    [FirstName] VARCHAR(100),
    [LastName] VARCHAR(100),
    [Country] VARCHAR(100)
);
GO

-- 'XYZ' is a placeholder: the original statement omitted the Country value
INSERT INTO Employee ([FirstName], [LastName], [Country]) VALUES ('ABC', 'DEF', 'XYZ');

The RANK function can be used to delete duplicate rows in SQL. Combined with the PARTITION BY clause, it ranks the rows within each subset of matching data. Because the rows in each subset are ordered by the unique ID column, the first row gets rank 1 and every duplicate gets a higher rank. So, run the commands below in the SQL database to execute the process accurately.

SELECT E.ID,
       E.firstname,
       E.lastname,
       E.country,
       T.rank
FROM [SampleDB].[dbo].[Employee] E
     INNER JOIN
(
    SELECT *,
           RANK() OVER(PARTITION BY firstname,
                                    lastname,
                                    country
           ORDER BY id) rank
    FROM [SampleDB].[dbo].[Employee]
) T ON E.ID = T.ID;

Now that you have ranked and identified the duplicate data using the above query, you can move ahead with the deletion using the statement below.

DELETE E
FROM [SampleDB].[dbo].[Employee] E
     INNER JOIN
(
    SELECT *,
           RANK() OVER(PARTITION BY firstname,
                                    lastname,
                                    country
           ORDER BY id) rank
    FROM [SampleDB].[dbo].[Employee]
) T ON E.ID = T.ID
WHERE rank > 1;

All the above methods to delete duplicate rows in SQL can complete the task. However, it is recommended that you do not apply these methods and clauses directly to production data. It is safer to test them on sample data first.
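One way to try a delete without committing to it is to wrap it in a transaction and roll it back. The following is a sketch using the Employee table from the RANK method, assuming it lives in the SampleDB database as in the queries above. The aliases dup_rank and copies are our own.

```sql
BEGIN TRANSACTION;

-- Delete every row ranked above 1 within its (firstname, lastname, country) group
DELETE E
FROM [SampleDB].[dbo].[Employee] E
     INNER JOIN
(
    SELECT ID,
           RANK() OVER(PARTITION BY firstname, lastname, country
                       ORDER BY ID) AS dup_rank
    FROM [SampleDB].[dbo].[Employee]
) T ON E.ID = T.ID
WHERE T.dup_rank > 1;

-- Check the result: this should return no rows if the duplicates are gone
SELECT firstname, lastname, country, COUNT(*) AS copies
FROM [SampleDB].[dbo].[Employee]
GROUP BY firstname, lastname, country
HAVING COUNT(*) > 1;

-- Undo the delete; switch to COMMIT TRANSACTION once the result looks right
ROLLBACK TRANSACTION;
```

Because the transaction is rolled back, the table is left exactly as it was, so the script can be rerun while you adjust it.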

Nonetheless, if you cannot execute the above methods to delete duplicate records in SQL Server 2000 and other versions, the SQL database file may be corrupted. Corruption in SQL databases can also lead to data inaccuracy and duplication.

To fix this issue, there is an expert solution that recovers and repairs SQL database files. The Cigati SQL Recovery Tool is a utility that can repair corrupt and damaged MDF and NDF files of the SQL database. It resolves errors that occur due to damage in the SQL files, and its features are designed to make the recovery process smooth.

Conclusion

Duplicate rows in SQL create data inaccuracy issues for users, so it is essential to find and fix them. To delete duplicate rows in SQL, use the manual solutions described in this blog. All of the methods can remove duplicates in SQL. However, if the SQL files are corrupt and you are facing issues because of that, consider the SQL Recovery Tool. It recovers and repairs corrupt SQL database files, including objects like tables, views, programmability, triggers, etc.

You May Also Read: Reset SA Password in SQL Server

About The Author:

Rohit Singh is an Email Backup, Recovery & Migration Consultant who has been associated with a software company for the last 3 years. He writes technical updates about MS Outlook, Exchange Server, Office 365, and many other email clients and servers, along with their features.
