mysql 知识库

记一次MySql去重的经历

2018-07-26  本文已影响141人  始终我是我

整个表target_table大概有14000W条记录,有个主键id,需要去重的字段是mac.

  1. 查看重复的记录数

    SELECT
     min( id ) AS mid,
     mac,
     count ( * ) 
    FROM
     target_table 
    GROUP BY
     mac 
    HAVING
     count( mac ) > 1;
    

    整个表重复的数据2w+

  2. 创建临时表

    CREATE TABLE dup_tmp (
    SELECT
     min( id ) AS mid,
     mac,
     count ( * ) 
    FROM
     target_table 
    GROUP BY
     mac 
    HAVING
     count( mac ) > 1 
     );
    

    将重复的mac筛选到临时表

  3. 删除多余的数据保留最小的id记录

    DELETE target 
    FROM
     target_table AS target
     JOIN dup_tmp AS tmp ON target.mac = tmp.mac 
    WHERE
     target.id > tmp.mid;
    

    刚开始的时候mac字段没有建索引,删除很慢。建了索引删除就很快了。

上一篇下一篇

猜你喜欢

热点阅读