<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Pure Performance Inc &#187; duplicate rows</title>
	<atom:link href="http://www.pure-performance.com/tag/duplicate-rows/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.pure-performance.com</link>
	<description>Web and PeopleSoft Consulting</description>
	<lastBuildDate>Mon, 23 Jan 2012 02:22:52 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Removing duplicate rows (SQL)</title>
		<link>http://www.pure-performance.com/2009/02/removing-duplicate-rows-sql/</link>
		<comments>http://www.pure-performance.com/2009/02/removing-duplicate-rows-sql/#comments</comments>
		<pubDate>Fri, 06 Feb 2009 15:34:58 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[MySQL]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[duplicate rows]]></category>

		<guid isPermaLink="false">http://pure-performance.com/?p=50</guid>
		<description><![CDATA[At least once in your career you will have to deal with duplicate rows wreaking havoc on your application. This post will help you get through that dilemma. I have Oracle and MySQL examples below.]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-full wp-image-58" title="Oracle MySql Duplicates" src="http://pure-performance.com/wp-content/uploads/2009/02/oraclemysqlo.jpg" alt="Oracle MySql Duplicates" width="100" height="78" />At least once in your career you will have to deal with duplicate rows wreaking havoc on your application. This post will help you get through that dilemma. I have Oracle and MySQL examples below.</p>
<h2>ORACLE Specific</h2>
<p>With Oracle we have the luxury of <a title="Rowid Defined" href="http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/pseudocolumns008.htm" target="_blank">ROWID</a>, which uniquely identifies a row in an Oracle table. We will use this pseudocolumn to remove the duplicates.</p>
<h4>Simple Method</h4>
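<p>This is the simplest method. For each set of duplicates it keeps the row with the highest ROWID (usually the latest row added, although ROWID reflects where the row is stored rather than when it was inserted) and deletes the rest. If you want to keep the row with the lowest ROWID instead, change MAX to MIN and replace &#8220;less than&#8221; (&lt;) with &#8220;greater than&#8221; (&gt;).</p>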
<blockquote><p>DELETE FROM [TABLE] A<br />
WHERE ROWID &lt;  ( SELECT max(ROWID)<br />
FROM [TABLE] B<br />
WHERE A.[PRIMARY KEY FIELDS] = B.[PRIMARY KEY FIELDS]);</p></blockquote>
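<p>Before running a delete like the one above, it can help to see which key values are actually duplicated. The following is only a sketch: the table CUSTOMERS and its EMAIL column are made-up names standing in for [TABLE] and [PRIMARY KEY FIELDS].</p>
<blockquote><p>-- Hypothetical table and column: list every EMAIL that appears more than once<br />
SELECT EMAIL, COUNT(*) AS COPIES<br />
FROM CUSTOMERS<br />
GROUP BY EMAIL<br />
HAVING COUNT(*) &gt; 1;</p>
<p>-- Same hypothetical table: keep the row with the highest ROWID for each EMAIL<br />
DELETE FROM CUSTOMERS A<br />
WHERE ROWID &lt; ( SELECT MAX(ROWID)<br />
FROM CUSTOMERS B<br />
WHERE A.EMAIL = B.EMAIL);</p></blockquote>
<p>If the delete worked, rerunning the first query should return no rows.</p>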
<h4>Another method with constraints</h4>
<p>This is another method you can use when you want to restrict the delete with an extra condition. It uses the Oracle &#8220;<a title="Exists Oracle function" href="http://download.oracle.com/docs/cd/B14117_01/server.101/b10759/conditions009.htm#sthref943" target="_blank">EXISTS</a>&#8221; condition, which tests whether a sub-query returns at least one row.</p>
<blockquote><p>DELETE FROM [TABLE] A<br />
WHERE [CONSTRAINT FIELD] = [VARIABLE]<br />
AND EXISTS ( SELECT 'X'<br />
FROM [TABLE] B<br />
WHERE A.[PRIMARY KEY FIELDS] = B.[PRIMARY KEY FIELDS]<br />
AND A.ROWID &lt; B.ROWID);</p></blockquote>
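<p>As a concrete illustration of the statement above, here is a sketch with made-up names: a CUSTOMERS table, an EMAIL column as the key field, and a REGION column as the extra condition. None of these come from a real schema.</p>
<blockquote><p>-- Hypothetical names: only rows in the WEST region are candidates for deletion<br />
DELETE FROM CUSTOMERS A<br />
WHERE A.REGION = 'WEST'<br />
AND EXISTS ( SELECT 'X'<br />
FROM CUSTOMERS B<br />
WHERE A.EMAIL = B.EMAIL<br />
AND A.ROWID &lt; B.ROWID);</p></blockquote>
<p>Note that the condition applies only to the rows being deleted (alias A). The sub-query still looks at every row in the table, so a WEST row is removed whenever any row with the same EMAIL and a higher ROWID exists, whatever its region. Add the same condition on B inside the sub-query if that is not what you want.</p>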
<h2>MYSQL <small> (The same approach works in Oracle, but the syntax below is MySQL-specific)</small></h2>
<p>MySQL has no equivalent of Oracle's ROWID to help us find the duplicates, but most MySQL tables have an auto-increment ID field that we can use to tell the duplicate rows apart.</p>
<h4>Simple Method</h4>
<p>As with the Oracle method, the comparison decides which row survives. The &#8220;greater than&#8221; sign below keeps the earliest row entered (the lowest ID). To keep the latest row instead, replace &#8220;greater than&#8221; with &#8220;less than&#8221;.</p>
<blockquote><p>DELETE A FROM [TABLE] as A, [TABLE] as B<br />
WHERE A.[UNIQUE FIELD(S)] = B.[UNIQUE FIELD(S)]<br />
AND A.ID &gt; B.ID;</p></blockquote>
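<p>Here is a sketch of the same statement with the placeholders filled in. The table customers and its columns id and email are made-up names used only for illustration.</p>
<blockquote><p>-- Hypothetical table: preview the duplicated values first<br />
SELECT email, COUNT(*) AS copies<br />
FROM customers<br />
GROUP BY email<br />
HAVING COUNT(*) &gt; 1;</p>
<p>-- Keep the lowest id for each email and delete the others<br />
DELETE A FROM customers AS A, customers AS B<br />
WHERE A.email = B.email<br />
AND A.id &gt; B.id;</p></blockquote>
<p>Because the join compares with &#8220;=&#8221;, rows where email is NULL never match each other and are left alone.</p>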
<h4>Without the ID field</h4>
<p>Without an ID field it gets more complicated in MySQL. We need to copy the data into a table that has a unique ID and remove the duplicates there. Once the dups have been removed, we can put the data back into the original table.</p>
<blockquote><p><em>Drop the dups table if it exists.</em></p>
<p>DROP TABLE IF EXISTS [TABLE]_dups;</p>
<p><em>Create the dups table. This table has the same columns as the base table (the table with dups), plus the ID.</em></p>
<p>CREATE TABLE [TABLE]_dups (<br />
id INT(11) NOT NULL auto_increment,<br />
[ALL TABLE COLUMNS],<br />
PRIMARY KEY (id)<br />
);</p>
<p><em>Insert into the dups table from the base table. The NULL lets MySQL generate the ID.</em></p>
<p>INSERT INTO [TABLE]_dups<br />
SELECT NULL,[ALL TABLE COLUMNS]<br />
FROM [TABLE];</p>
<p><em>Delete the dups using the SQL we used in the prior example</em></p>
<p>DELETE A FROM [TABLE]_dups as A, [TABLE]_dups as B<br />
WHERE A.[UNIQUE FIELD(S)] = B.[UNIQUE FIELD(S)]<br />
AND A.ID &gt; B.ID;</p>
<p><em>Delete all rows from the base table</em></p>
<p>DELETE FROM [TABLE];</p>
<p><em>Insert into the base table from the dups table.</em></p>
<p>INSERT INTO [TABLE]<br />
SELECT [ALL TABLE COLUMNS, WITHOUT THE ID FIELD]<br />
FROM [TABLE]_dups;</p>
<p><em>Remove the dups table</em></p>
<p>DROP TABLE [TABLE]_dups;</p></blockquote>
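<p>To make the steps above concrete, here is a sketch of the whole sequence for a made-up table subscribers with two columns, email and name, where email is the field that should be unique. The table, the columns and their types are assumptions for illustration only.</p>
<blockquote><p>-- Hypothetical schema: subscribers(email, name) with no ID column<br />
DROP TABLE IF EXISTS subscribers_dups;</p>
<p>CREATE TABLE subscribers_dups (<br />
id INT NOT NULL AUTO_INCREMENT,<br />
email VARCHAR(255),<br />
name VARCHAR(100),<br />
PRIMARY KEY (id)<br />
);</p>
<p>-- Copy the data; the id is generated automatically<br />
INSERT INTO subscribers_dups (email, name)<br />
SELECT email, name<br />
FROM subscribers;</p>
<p>-- Keep the lowest id for each email<br />
DELETE A FROM subscribers_dups AS A, subscribers_dups AS B<br />
WHERE A.email = B.email<br />
AND A.id &gt; B.id;</p>
<p>-- Empty the base table and reload it without the dups<br />
DELETE FROM subscribers;</p>
<p>INSERT INTO subscribers (email, name)<br />
SELECT email, name<br />
FROM subscribers_dups;</p>
<p>DROP TABLE subscribers_dups;</p></blockquote>
<p>Keep in mind that CREATE TABLE and DROP TABLE cause an implicit commit in MySQL, so the whole sequence cannot be rolled back as one transaction. It is worth trying it on a copy of the table first.</p>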
<p>I hope this post helps you when you run into this problem&#8230; and we all run into this problem.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.pure-performance.com/2009/02/removing-duplicate-rows-sql/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

