How to Remove Duplicate Records Except a Single Record in MySQL?

In MySQL, you can remove duplicate records while preserving one representative entry by using a DELETE statement with a self-join. The join matches records that share identical values in a chosen column, and an extra condition, such as retaining the record with the lowest ID, decides which row in each group survives. The result is a cleaner table with less redundant data.

Managing data efficiently is crucial in any database system, and dealing with duplicate records is a common challenge. In this article, we will explore a practical approach to remove duplicate records except a single record in MySQL. This process is essential for maintaining data integrity and ensuring that your database contains unique and relevant information.

Eliminating Duplicate Records Except One in MySQL

When working with databases, it’s not uncommon to encounter situations where duplicate records need to be cleaned up. The process involves identifying redundant data based on specific criteria and keeping only one instance of each unique record. In MySQL, this can be achieved using a combination of the DELETE statement and self-joins.

Syntax:

The main concept revolves around using a self-join to compare records within the same table and then selectively delete duplicate entries. Here’s the basic syntax:

DELETE t1 FROM table_name t1
JOIN table_name t2 ON t1.column_name = t2.column_name AND t1.primary_key > t2.primary_key;

DELETE: Specifies the action to be taken, which is the removal of records.
table_name: The name of the table containing duplicate records.
t1 and t2: Aliases for the same table, enabling a self-join.
column_name: The column used to identify duplicate records.
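primary_key: The primary key column of the table. The condition t1.primary_key > t2.primary_key deletes every row for which a matching row with a smaller key exists, so only the row with the lowest key in each group of duplicates remains.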

Example: Eliminating Duplicate Records Keeping Lowest ID in MySQL

Duplicate records in a MySQL table can lead to data inconsistencies. This example demonstrates how to remove duplicate records from the employee table, preserving only the record with the lowest ID for each unique name.

Consider a table named employee with columns id, name, and salary. We’ll insert some sample data, including duplicate records.

-- Table creation and data insertion
CREATE TABLE employee (
    id INT PRIMARY KEY,
    name VARCHAR(50),
    salary INT
);

INSERT INTO employee (id, name, salary) VALUES
    (1, 'John', 50000),
    (2, 'Alice', 60000),
    (3, 'Bob', 55000),
    (4, 'Alice', 60000),
    (5, 'Charlie', 70000),
    (6, 'John', 50000);
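Before deleting anything, it can help to confirm which names are actually duplicated. The query below is a minimal sketch, assuming that rows count as duplicates when their name values match; the copies and lowest_id aliases are illustrative names, not part of the table.

```sql
-- List each name that appears more than once, how many rows share it,
-- and the lowest id in the group (the row this example intends to keep).
SELECT name,
       COUNT(*) AS copies,
       MIN(id)  AS lowest_id
FROM employee
GROUP BY name
HAVING COUNT(*) > 1;
```

With the sample data above, this should report Alice and John, each with two copies.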

Now, let’s use the DELETE statement to remove duplicate records, keeping only the one with the lowest id.

-- Remove duplicate records, keeping the one with the lowest id
DELETE e1 FROM employee e1
JOIN employee e2 ON e1.name = e2.name AND e1.id > e2.id;
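Because a DELETE cannot be undone once committed, you may want to preview its effect before running a statement like the one above on real data. The following is a sketch of one cautious workflow, assuming the table uses a transactional storage engine such as InnoDB: the same join is first run as a SELECT, and the DELETE is then wrapped in a transaction.

```sql
-- Dry run: list the rows the DELETE would remove.
-- DISTINCT avoids repeating a row that matches several lower-id copies.
SELECT DISTINCT e1.*
FROM employee e1
JOIN employee e2 ON e1.name = e2.name AND e1.id > e2.id;

-- Run the DELETE inside a transaction so the result can be inspected first.
START TRANSACTION;

DELETE e1 FROM employee e1
JOIN employee e2 ON e1.name = e2.name AND e1.id > e2.id;

SELECT * FROM employee ORDER BY id;

-- Replace ROLLBACK with COMMIT once the remaining rows look right.
ROLLBACK;
```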

Output:

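Before deleting the duplicate records:

+----+---------+--------+
| id | name    | salary |
+----+---------+--------+
|  1 | John    |  50000 |
|  2 | Alice   |  60000 |
|  3 | Bob     |  55000 |
|  4 | Alice   |  60000 |
|  5 | Charlie |  70000 |
|  6 | John    |  50000 |
+----+---------+--------+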

This shows the initial state of the employee table with duplicate records: ‘Alice’ appears twice (id 2 and id 4) and ‘John’ appears twice (id 1 and id 6), each pair with the same salary.

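After deleting the duplicate records:

+----+---------+--------+
| id | name    | salary |
+----+---------+--------+
|  1 | John    |  50000 |
|  2 | Alice   |  60000 |
|  3 | Bob     |  55000 |
|  5 | Charlie |  70000 |
+----+---------+--------+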

Explanation:

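This shows the result after executing the DELETE statement. The duplicate records, the second ‘Alice’ (id 4) and the second ‘John’ (id 6), have been removed, leaving one row per name: the one with the lowest id. Note that the join compares only the name column, so rows are treated as duplicates whenever their names match, regardless of salary.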
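If rows should count as duplicates only when several columns agree, the join condition can list each of those columns. The sketch below assumes that both name and salary must match. It also uses MySQL's NULL-safe equality operator <=>, because a plain = never matches two NULL values and would leave such rows untouched.

```sql
-- Treat rows as duplicates only when both name and salary match.
-- <=> compares two NULL values as equal, unlike plain =.
DELETE e1 FROM employee e1
JOIN employee e2
  ON  e1.name   <=> e2.name
  AND e1.salary <=> e2.salary
  AND e1.id > e2.id;
```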

In short, combining a self-join with the DELETE statement gives you a reliable way to remove redundant rows while keeping one entry per group. Getting it right depends on adapting the join to your table structure and choosing the removal criteria carefully: which columns define a duplicate, and which row in each group should survive.

Conclusion

The DELETE statement in MySQL can be combined with a self-join to remove duplicate records while keeping a single representative record. The join matches records on a column and deletes duplicates based on a condition such as keeping the record with the lowest ID. This keeps data clean and reduces redundancy, especially in tables with many repeated entries.

FAQs on How to Remove Duplicate Records Except a Single Record in MySQL

What is the purpose of removing duplicate records in MySQL?

Removing duplicate records helps maintain data integrity, reduce redundancy, and improve database performance by ensuring that each piece of information is stored only once.

Can I use other criteria besides the primary key to determine which duplicate to keep?

Yes, you can modify the self-join condition to use other criteria, such as keeping the record with the latest timestamp or the highest value in a specific column.
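As an illustration, here is a sketch that keeps the most recently updated row per name. The employee_updates table and its updated_at column are hypothetical and are not part of the example earlier in this article. The column is assumed to be NOT NULL, since comparisons against NULL would never match.

```sql
-- Hypothetical table for illustration only.
CREATE TABLE employee_updates (
    id INT PRIMARY KEY,
    name VARCHAR(50),
    updated_at DATETIME NOT NULL
);

-- Keep the row with the latest updated_at for each name.
-- If two rows share the same updated_at, keep the one with the lower id,
-- so that exactly one row per name survives.
DELETE e1 FROM employee_updates e1
JOIN employee_updates e2
  ON  e1.name = e2.name
  AND (e1.updated_at < e2.updated_at
       OR (e1.updated_at = e2.updated_at AND e1.id > e2.id));
```

The tie-breaker on id matters: without it, rows with identical timestamps would all be kept.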

Is it possible to automate the process of removing duplicate records in MySQL?

Yes, you can create a stored procedure to automate the process of identifying and removing duplicate records based on your specific criteria.
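A minimal sketch of such a procedure follows, assuming the employee table from this article and duplicates defined by name. The procedure name remove_duplicate_employees is hypothetical, and DELIMITER is a command of the mysql command-line client rather than SQL itself, so other clients may not need it.

```sql
DELIMITER //

-- Hypothetical procedure wrapping the self-join DELETE.
CREATE PROCEDURE remove_duplicate_employees()
BEGIN
    -- Remove every row that has a same-name row with a lower id.
    DELETE e1 FROM employee e1
    JOIN employee e2 ON e1.name = e2.name AND e1.id > e2.id;

    -- Report how many rows the DELETE removed.
    SELECT ROW_COUNT() AS rows_deleted;
END //

DELIMITER ;

-- Run the cleanup on demand.
CALL remove_duplicate_employees();
```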

Reference: https://www.geeksforgeeks.org
