In sweet memories of my ever loving brother "kutty thambi " ARUN KUMAR

Tuesday, August 11, 2009

delete a duplicate rows in a oracle table

for example here, user HR is having table emp,

HR@prod SQL>select * from emp;

EMPID ENAME SAL DEPTNO
---------- -------------------- ---------- ----------
10005 palani 54544 10
10001 Scott Tiger 1000 40
10002 Frank Naude 500 20
10003 rajesh 21000 140
10004 ramesh 3500 220
10005 suresh 34567 123

6 rows selected.

to delete duplicate rows in empid column
in emp table
SQL>delete from emp where rowid not in
(select max(rowid) from emp
group by empid);

1 row deleted.

SQL>select * from emp;

EMPID ENAME SAL DEPTNO
---------- -------------------- ---------- ----------
10001 Scott Tiger 1000 40
10002 Frank Naude 500 20
10003 rajesh 21000 140
10004 ramesh 3500 220
10005 suresh 34567 123


to delete the old duplicate row from the table
instead of max(rowid) replace min(rowid)
for example,

HR@prod SQL>insert into emp
values(10005,'palani',54544,10);

1 row created.

HR@prod SQL>select * from emp;

EMPID ENAME SAL DEPTNO
---------- -------------------- ---------- ----------
10005 palani 54544 10
10001 Scott Tiger 1000 40
10002 Frank Naude 500 20
10003 rajesh 21000 140
10004 ramesh 3500 220
10005 suresh 34567 123

6 rows selected.

HR@prod SQL>delete from emp where rowid not in
(select min(rowid) from emp
group by empid);

1 row deleted.

HR@prod SQL>select * from emp;

EMPID ENAME SAL DEPTNO
---------- -------------------- ---------- ----------
10005 palani 54544 10
10001 Scott Tiger 1000 40
10002 Frank Naude 500 20
10003 rajesh 21000 140
10004 ramesh 3500 220


thanks,


4 comments:

Anonymous said...

Could you please explain how this gets accomplished ?

select(max(rowid) from emp group by empid;

>> Ok .. this selects the max rowid

Having selected this ; how does it detect the duplicates ? By comparing the index and table ? Could you elaborate ?

Tnx.

Rajeshkumar Govindarajan said...

hi friend
thanks for reading my blog

the internal between duplicate rows is as follows,

HR@prod SQL>select max(rowid),empid from emp
2 group by empid;

MAX(ROWID) EMPID
------------------ ----------
AAAOO/AAEAAABOGAAD 10004
AAAOO/AAEAAABOGAAB 10002
AAAOO/AAEAAABOGAAA 10001
AAAOO/AAEAAABOEAAD 10005
AAAOO/AAEAAABOGAAC 10003

HR@prod SQL>select rowid,empid from emp;

ROWID EMPID
------------------ ----------
AAAOO/AAEAAABOEAAC 10005
AAAOO/AAEAAABOEAAD 10005
AAAOO/AAEAAABOGAAA 10001
AAAOO/AAEAAABOGAAB 10002
AAAOO/AAEAAABOGAAC 10003
AAAOO/AAEAAABOGAAD 10004

6 rows selected.

HR@prod SQL>delete from emp where rowid not in
2 (select max(rowid) from emp
3 group by empid);

1 row deleted.

HR@prod SQL>select rowid,empid from emp;

ROWID EMPID
------------------ ----------
AAAOO/AAEAAABOEAAD 10005
AAAOO/AAEAAABOGAAA 10001
AAAOO/AAEAAABOGAAB 10002
AAAOO/AAEAAABOGAAC 10003
AAAOO/AAEAAABOGAAD 10004

Anonymous said...

Thank you for depicting the flow.
BUT on what basis does the delete know which of the '10005' option to delete ? Its randomly deleting one ?
So indexes are not used in this in any way ?

Dmitry said...

I know easier ways to migrate Oracle database and fix possible errors in the same time. you may try the recovery 2005 sql database program. I periodically use this application in our company to fix database errors

 
Share/Bookmark