Deduplicating a database table by specified fields

Requirement: remove duplicate rows from a users table, where rows count as duplicates when their name, email, and card_num fields all match.

Approach: a GROUP BY query returns the "deduplicated" data. Store that data in a temporary table, then copy the temporary table's contents into the target table.

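Pitfall and fix: a GROUP BY query can only return the grouped fields (the ones used for deduplication), so it cannot fetch the complete rows in one pass. However, max() can pick one id out of each group, and the full records can then be looked up by that set of ids.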

Testing the approach

  • Query the deduplicated data

select max(id) as id,name,email,card_num FROM users GROUP BY name,email,card_num;

  • Extract the set of ids from the deduplicated data

SELECT ID from (SELECT max(id) as id,name,email,card_num FROM users GROUP BY name,email,card_num) as T;

  • Use that set of ids to fetch the full records from the source table

SELECT * from users where id in (SELECT ID from (SELECT max(id) as id,name,email,card_num FROM users GROUP BY name,email,card_num) as T);
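To see which rows this query keeps, here is a minimal sketch that runs it against a few made-up rows. It assumes an in-memory SQLite database (via Python's standard sqlite3 module) as a stand-in for the real database, and the sample data and column types are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT, card_num TEXT)"
)
# Hypothetical sample rows: ids 1 and 3 agree on name, email and card_num.
conn.executemany(
    "INSERT INTO users (id, name, email, card_num) VALUES (?, ?, ?, ?)",
    [
        (1, "alice", "alice@example.com", "1001"),
        (2, "bob", "bob@example.com", "1002"),
        (3, "alice", "alice@example.com", "1001"),
        (4, "alice", "alice@example.com", "1003"),
    ],
)

# Same shape as the query above: keep the row with the highest id per group.
dedup_sql = """
SELECT * FROM users
WHERE id IN (
    SELECT id FROM (
        SELECT max(id) AS id, name, email, card_num
        FROM users
        GROUP BY name, email, card_num
    ) AS t
)
ORDER BY id
"""
kept_ids = [row[0] for row in conn.execute(dedup_sql)]
print(kept_ids)  # [2, 3, 4]
```

Row 1 is dropped because row 3 has the same three fields and a higher id. Row 4 survives because its card_num differs, so it forms its own group.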

The actual procedure

  • Use the set of ids from the deduplicated data to fetch the full records from the source table, and store those records in a temporary table

create TEMP table tmp_data as SELECT * from users where id in (SELECT ID from (SELECT max(id) as id,name,email,card_num FROM users GROUP BY name,email,card_num) as T);

  • Insert the temporary table's data into the target table, and the job is done

insert into users_copy1 select * from tmp_data;
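The two statements above can be tried end to end with the sketch below. It again assumes SQLite through Python's sqlite3 module rather than the author's actual database, uses hypothetical sample rows, and assumes that users_copy1 already exists with the same columns as users.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the real database
schema = "(id INTEGER PRIMARY KEY, name TEXT, email TEXT, card_num TEXT)"
conn.execute("CREATE TABLE users " + schema)
conn.execute("CREATE TABLE users_copy1 " + schema)  # target table must exist beforehand
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?)",
    [
        (1, "carol", "carol@example.com", "2001"),
        (2, "carol", "carol@example.com", "2001"),  # duplicate of id 1
        (3, "dave", "dave@example.com", "2002"),
    ],
)

# Step 1: materialize the surviving rows in a temporary table.
conn.execute("""
CREATE TEMP TABLE tmp_data AS
SELECT * FROM users WHERE id IN (
    SELECT id FROM (
        SELECT max(id) AS id, name, email, card_num
        FROM users GROUP BY name, email, card_num
    ) AS t
)
""")
# Step 2: copy the temporary table into the target table.
conn.execute("INSERT INTO users_copy1 SELECT * FROM tmp_data")
conn.commit()

copied = conn.execute("SELECT id, name FROM users_copy1 ORDER BY id").fetchall()
print(copied)  # [(2, 'carol'), (3, 'dave')]
```

The temporary table disappears when the connection closes, so only users_copy1 keeps the deduplicated rows.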

Verification

  • Check that the row count matches the number of rows returned by the first deduplication query

select count(*) from users_copy1;
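The count comparison can also be automated. The sketch below is one possible check, not the author's method: the helper name check_copy is hypothetical, SQLite via Python's sqlite3 module is assumed, and for brevity the sample copy is filled with a single INSERT ... SELECT instead of going through a temporary table. Besides comparing counts, it also looks for duplicate groups left in the copy.

```python
import sqlite3

KEY = "name, email, card_num"  # the fields that define a duplicate

def check_copy(conn):
    """Return (expected_rows, actual_rows, leftover_duplicate_groups)."""
    # One row per distinct key combination in the source table.
    expected = conn.execute(
        f"SELECT count(*) FROM (SELECT 1 FROM users GROUP BY {KEY}) AS g"
    ).fetchone()[0]
    actual = conn.execute("SELECT count(*) FROM users_copy1").fetchone()[0]
    # Any key combination that still appears more than once in the copy.
    leftover = conn.execute(
        f"SELECT count(*) FROM (SELECT 1 FROM users_copy1 "
        f"GROUP BY {KEY} HAVING count(*) > 1) AS d"
    ).fetchone()[0]
    return expected, actual, leftover

# Hypothetical setup so the check has something to run against.
conn = sqlite3.connect(":memory:")
schema = "(id INTEGER PRIMARY KEY, name TEXT, email TEXT, card_num TEXT)"
conn.execute("CREATE TABLE users " + schema)
conn.execute("CREATE TABLE users_copy1 " + schema)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?, ?)",
    [
        (1, "erin", "erin@example.com", "3001"),
        (2, "erin", "erin@example.com", "3001"),
        (3, "frank", "frank@example.com", "3002"),
        (4, "erin", "erin@example.com", "3003"),
    ],
)
conn.execute(
    f"INSERT INTO users_copy1 SELECT * FROM users "
    f"WHERE id IN (SELECT max(id) FROM users GROUP BY {KEY})"
)

result = check_copy(conn)
print(result)  # (3, 3, 0)
```

The copy passes when the first two numbers agree and the third is zero.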

Test result: out of roughly 14,000 rows, 2,300 were duplicates. The whole run took 0.7 s, which is good enough for the current requirement.
