Wednesday, 11 July 2012

Finding Duplicate and similar records in Oracle Table

One of the things I have recently learnt is a way to find duplicates in Oracle tables. I have seen ways to do this but have not come across the method I am going to put in this blog so hopefully some of you will find it useful.

The technique is to use aliases rather than counts (which I have seen on other forums) to select from the table twice.

The basic query is


  SELECT t1.column_name,
FROM table t1, table t2
where t1.column_name = t2.column_name,
and t1.column_ID <> t2.column_ID

This means that we are checking if there exist records that share the column value we are testing (column_name) but have a different ID/primary key.

The real problem I faced however was finding records where part of the value was a duplicate. In this case I needed to identify values such as example_xxx and example_yyy with the yyy ending being an incorrect entry.

To do this I used the replace function so the query looks like this


  SELECT t1.column_name,  t2.column_name
FROM table t1, table t2
where replace(t1.column_name, '_xxx', '') = replace(t2.column_name,'_yyy',''),
and t1.column_ID <> t2.column_ID






No comments:

Post a Comment