Sometimes we don’t want to have reptitions of certain values across multiple tuples in our dataset.
For instance, we might not want to see an email address repeated more than once and simply using
DISTINCT is not an option.
In this case, I will invariably use a CTE to remove the offending repetitions.
This solution will use
int ROW_NUMBER(), which returns an integer than increments (from 1, by 1) as it traverses the dataset.
It is also a windowed function such that we can partition our dataset along identical values if we so wish.
Suppose we include
ROW_NUMBER() like so:
SELECT Email , ROW_NUMBER() OVER (PARTITION BY Email ORDER BY Email) AS rn FROM dbo.Account;
This means we can delete any row where
RN > 1.
However, we cannot use
Therefore we use a CTE.
WITH cteDedupe AS ( SELECT ROW_NUMER() OVER (PARTITION BY Email ORDER BY Email) AS rn FROM dbo.Account ) DELETE cteDedupe WHERE rn > 1;