关于表中重复数据的处理--MySQL

2018-03-09  本文已影响0人  yulong_vip

表结构及数据如下

SELECT * FROM test!
test.png

一. 根据字段查询表中重复的数据
1.根据字段test1,test2,查询表中重复的数据

SELECT * FROM test a WHERE (a.test1,a.test2) IN (SELECT test1,test2 FROM test GROUP BY test1,test2 HAVING COUNT(*)>1)
结果如下 1.png

2.根据字段test1,test3,查询表中重复的数据

SELECT * FROM test a WHERE (a.test1,a.test3) IN (SELECT test1,test3 FROM test GROUP BY test1,test3 HAVING COUNT(*)>1)
结果如下 2.png

二.删除表中重复数据,只保留一条
1.Mysql使用变量方法实现行号功能 (类似于Oracle中的rowid函数)

SELECT @rowid:=@rowid+1 as rowid,test.* FROM (select @rowid:=0) test,test
结果如下 2.1.png

2.结合以上方法,即可实现删除表中重复数据,只保留rowid最小的一行

根据字段test1,test3,查出需要删除的数据rowid:

SELECT * FROM
(SELECT @rowid:=@rowid+1 as rowid,test.* FROM (select @rowid:=0) test,test) c
WHERE rowid not in 
(select min(rowid) from (SELECT @rowid:=@rowid+1 as rowid,test.* FROM (select @rowid:=0) test,test)b group by b.test1,b.test3 having count(*)>=1) 
结果如下 2.2.png

根据字段test1,test3,删除重复数据

DELETE FROM 
(SELECT @rowid:=@rowid+1 as rowid,test.* FROM (select @rowid:=0) test,test) t 
WHERE rowid not in 
(select min(rowid) from (SELECT @rowid:=@rowid+1 as rowid,test.* FROM (select @rowid:=0) test,test)b group by b.test1,b.test3 having count(*)>=1) 

尴尬!设想中的以上写法为错误写法:原因是:更新数据时使用了查询,而查询的数据又做了更新的条件,mysql不支持这种方式。使用虚拟表的情况下再加一层封装发现仍无法规避这个问题(实体表则可以)?

尝试多次后,无奈使用折中解决办法,直接创建新表,把去除重复后的数据插入新表中,再删除旧表;

create temporary table temp1 as SELECT test1,test2,test3,test4 FROM 
(SELECT @rowid:=@rowid+1 as rowid,test.* FROM (select @rowid:=0) test,test) t 
WHERE rowid in 
(select min(rowid) from (SELECT @rowid:=@rowid+1 as rowid,test.* FROM (select @rowid:=0) test,test)b group by b.test1,b.test3 having count(*)>=1) 
SELECT * FROM temp1
结果如下 2.3.png

显然这个方法存在效率问题,如有更好解决方法请指教;

上一篇下一篇

猜你喜欢

热点阅读