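Some site owners run WordPress as an aggregator site, continuously scraping articles from various sources and publishing them automatically. The biggest problem with such sites is that they pick up many duplicate articles, so the scraped posts need to be deduplicated.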

The following one-off SQL methods remove posts with duplicate titles:

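  1. Remove duplicates, keeping one copy of each title

    -- wp_posts also stores pages, revisions, attachments and menu items; rows are grouped by title regardless of type.
    -- Keep the lowest ID for each title and delete every other row.
    CREATE TABLE my_tmp AS SELECT MIN(ID) AS col1 FROM wp_posts GROUP BY post_title;
    DELETE FROM wp_posts WHERE ID NOT IN (SELECT col1 FROM my_tmp);
    DROP TABLE my_tmp;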
  2. Remove duplicates, keeping none of them

    -- Collect the ID of every post whose title appears more than once, then delete them all.
    CREATE TABLE my_tmp AS SELECT ID AS col1 FROM wp_posts WHERE post_title IN (SELECT post_title FROM wp_posts GROUP BY post_title HAVING COUNT(*) > 1);
    DELETE FROM wp_posts WHERE ID IN (SELECT col1 FROM my_tmp);
    DROP TABLE my_tmp;

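Usage is simple: paste the SQL statements above into the SQL box of your site's database and run them. If your installation uses a table prefix other than the default `wp_`, adjust the table name accordingly. (Note: back up your site before doing this.)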
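
Before running the statements on a live database, it may help to rehearse them on throwaway data. The following is a minimal sketch, not part of the original method: it uses Python's standard `sqlite3` module with an in-memory table, and assumes a hypothetical two-column stand-in for `wp_posts` (the real table has many more columns, and the live site runs on MySQL rather than SQLite). The helper name `rehearse` and the sample rows are illustrative only. The keep-none script uses `COUNT(*) > 1`, so a title that appears exactly twice is treated as a duplicate.

```python
import sqlite3

# Keep-one variant: the lowest ID per title survives.
KEEP_ONE = """
CREATE TABLE my_tmp AS SELECT MIN(ID) AS col1 FROM wp_posts GROUP BY post_title;
DELETE FROM wp_posts WHERE ID NOT IN (SELECT col1 FROM my_tmp);
DROP TABLE my_tmp;
"""

# Keep-none variant: every post whose title occurs more than once is deleted.
KEEP_NONE = """
CREATE TABLE my_tmp AS
  SELECT ID AS col1 FROM wp_posts
  WHERE post_title IN (
    SELECT post_title FROM wp_posts GROUP BY post_title HAVING COUNT(*) > 1
  );
DELETE FROM wp_posts WHERE ID IN (SELECT col1 FROM my_tmp);
DROP TABLE my_tmp;
"""

# Illustrative data: "Hello" twice, "World" once, "News" three times.
SAMPLE = [
    (1, "Hello"), (2, "Hello"),
    (3, "World"),
    (4, "News"), (5, "News"), (6, "News"),
]


def rehearse(script, rows):
    """Run `script` against a throwaway in-memory wp_posts; return surviving rows."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical stand-in schema; only the two columns the scripts touch.
    conn.execute("CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_title TEXT)")
    conn.executemany("INSERT INTO wp_posts (ID, post_title) VALUES (?, ?)", rows)
    conn.executescript(script)
    survivors = conn.execute(
        "SELECT ID, post_title FROM wp_posts ORDER BY ID"
    ).fetchall()
    conn.close()
    return survivors


if __name__ == "__main__":
    print("keep one :", rehearse(KEEP_ONE, SAMPLE))
    print("keep none:", rehearse(KEEP_NONE, SAMPLE))
```

With the sample rows, the keep-one script leaves one post per title (IDs 1, 3 and 4), while the keep-none script leaves only the post whose title was unique (ID 3). Swapping in a few titles copied from your own site is a cheap way to check that the result matches what you expect before touching real data.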