SQL Server Instead Of trigger to prevent duplicates

You might face a scenario where a REST web service or Web API receives the same request several times from the user interface, causing duplicate inserts. Because the requests reach the API at practically the same instant, the API's duplicate check against the database passes for each of them: the record doesn't exist in the table yet at that point in time.

To prevent the duplicates from being inserted, there are several options at the database level; one of them is an INSTEAD OF trigger.

Using an INSTEAD OF trigger, you can conditionally choose to INSERT into the table or take some other action, as the requirement dictates.

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
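ALTER TRIGGER [dbo].[PREVENT_DUP_INSERT] ON [dbo].[tbl_submission]
INSTEAD OF INSERT
AS
BEGIN
	-- Row counts stay on so that the caller can see how many rows were affected.
	SET NOCOUNT OFF;
	-- Insert only if no existing row matches an incoming row on (q_uid, user_inst_id).
	-- Joining to inserted keeps the check valid when several rows arrive in one statement.
	IF NOT EXISTS(
		SELECT 1 FROM dbo.tbl_submission AS s
		INNER JOIN inserted AS i
		ON s.q_uid=i.q_uid
		AND s.user_inst_id=i.user_inst_id
	)
	BEGIN
		INSERT INTO dbo.tbl_submission(user_id,q_uid,[value],[group],user_inst_id,created_at,updated_at)
		SELECT user_id,q_uid,[value],[group],user_inst_id,created_at,updated_at FROM inserted;
	END
	-- Return the generated id, but only when a row was actually inserted.
	SELECT [id] FROM [dbo].[tbl_submission] WHERE @@ROWCOUNT > 0 AND [id] = scope_identity();
END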

An id must be returned by the body of the INSTEAD OF trigger. This is required especially if you’re using an ORM like Entity Framework, which may otherwise throw a concurrency-related exception.
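Keep in mind that an existence check inside a trigger can still race under the default READ COMMITTED isolation level: two concurrent inserts may both find no matching row. One common way to narrow that gap is to add the UPDLOCK and HOLDLOCK table hints to the check. The sketch below is an assumption, not part of the original trigger; the table dbo.demo_submission, its index, and the trigger name are hypothetical and cut down to the columns needed for the demonstration.

```sql
-- Hypothetical demo table; the real tbl_submission has more columns.
CREATE TABLE dbo.demo_submission (
	id INT IDENTITY(1,1) PRIMARY KEY,
	q_uid INT NOT NULL,
	user_inst_id INT NOT NULL,
	[value] VARCHAR(MAX) NULL
);
GO
-- An index on the checked columns gives HOLDLOCK a narrow key range to lock;
-- without one, a much larger part of the table may end up locked.
CREATE INDEX IX_demo_submission_q_inst ON dbo.demo_submission (q_uid, user_inst_id);
GO
CREATE TRIGGER dbo.demo_prevent_dup_insert ON dbo.demo_submission
INSTEAD OF INSERT
AS
BEGIN
	SET NOCOUNT OFF;
	-- UPDLOCK + HOLDLOCK keep the checked range locked until the statement's
	-- transaction ends, so a concurrent insert of the same key has to wait.
	IF NOT EXISTS(
		SELECT 1
		FROM dbo.demo_submission AS s WITH (UPDLOCK, HOLDLOCK)
		INNER JOIN inserted AS i
			ON s.q_uid = i.q_uid
			AND s.user_inst_id = i.user_inst_id
	)
	BEGIN
		INSERT INTO dbo.demo_submission (q_uid, user_inst_id, [value])
		SELECT q_uid, user_inst_id, [value] FROM inserted;
	END
	SELECT [id] FROM dbo.demo_submission WHERE @@ROWCOUNT > 0 AND [id] = SCOPE_IDENTITY();
END
GO
-- The second insert has the same (q_uid, user_inst_id) and is silently skipped.
INSERT INTO dbo.demo_submission (q_uid, user_inst_id, [value]) VALUES (1, 100, 'first');
INSERT INTO dbo.demo_submission (q_uid, user_inst_id, [value]) VALUES (1, 100, 'second');
SELECT COUNT(*) AS row_count FROM dbo.demo_submission; -- 1
```

The hints trade some concurrency for correctness, so it is worth testing them under a realistic load before relying on them.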

You can also choose to apply a UNIQUE constraint on selected columns to prevent duplicates and handle the insert exceptions in the API or in the trigger itself. In my case that was not an option: the table already contained a few duplicate rows, and one of the columns was VARCHAR(MAX), which cannot be a key column of a UNIQUE index.
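Where the key columns are indexable and the existing data is already free of duplicates, the UNIQUE constraint route might look like the following minimal sketch. The table dbo.demo_submission_uq and the constraint name are hypothetical; error numbers 2627 and 2601 are the ones SQL Server raises for a unique constraint violation and a duplicate key in a unique index, respectively.

```sql
-- Hypothetical demo table with only the columns needed for the example.
CREATE TABLE dbo.demo_submission_uq (
	id INT IDENTITY(1,1) PRIMARY KEY,
	q_uid INT NOT NULL,
	user_inst_id INT NOT NULL,
	CONSTRAINT UQ_demo_submission_uq UNIQUE (q_uid, user_inst_id)
);
GO
INSERT INTO dbo.demo_submission_uq (q_uid, user_inst_id) VALUES (1, 100);

BEGIN TRY
	-- Same key as the row above, so the constraint rejects this insert.
	INSERT INTO dbo.demo_submission_uq (q_uid, user_inst_id) VALUES (1, 100);
END TRY
BEGIN CATCH
	-- Re-raise anything that is not a duplicate-key error.
	IF ERROR_NUMBER() NOT IN (2601, 2627)
		THROW;
	PRINT 'Duplicate submission ignored';
END CATCH
```

Unlike a check-then-insert, the constraint is enforced by the engine itself, so it also holds when two requests arrive at the same moment.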

Find word in SQL Server Varchar variable

I’ll be discussing two ways by which you can search for a string in a SQL Server variable:

Using the CHARINDEX() function:

This function returns the position of a substring within a string. Positions start at 1, so it returns 0 if the substring is not found.
The start position of the word can be useful for other calculations as well.

DECLARE @strWord nvarchar(50)='My Test String'

IF CHARINDEX('Test String',@strWord) > 0
	PRINT 'Found'
ELSE
	PRINT 'Not Found'
	
Output:
Found

To reuse the position in other calculations, store the CHARINDEX result in a variable:

DECLARE @iDex INT
SELECT @iDex=CHARINDEX('Test String',@strWord)
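As one illustration of such a calculation, the stored position can be used to split the string around the match. This is a minimal sketch; the column aliases are made up for the example.

```sql
DECLARE @strWord nvarchar(50) = 'My Test String';
DECLARE @iDex INT = CHARINDEX('Test', @strWord);   -- 4

-- Only split when the substring was actually found (position > 0).
IF @iDex > 0
BEGIN
	SELECT
		@iDex AS StartPosition,                                  -- 4
		LEFT(@strWord, @iDex - 1) AS BeforeMatch,                -- 'My '
		SUBSTRING(@strWord, @iDex, LEN(@strWord)) AS FromMatch;  -- 'Test String'
END
```

Whether the search is case-sensitive depends on the collation in effect.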

Using the LIKE operator:

Another way is to use the LIKE operator. This operator is most often seen in the WHERE clause of a query, e.g.

SELECT * FROM Tbl_Name WHERE mainStringCol LIKE '%Test String%'

The above query will fetch all rows where the mainStringCol contains the text “Test String”.

The LIKE operator works the same way on a variable, for example:

DECLARE @txtMailBody nvarchar(MAX)='  Hello Mr Anderson...'
IF @txtMailBody LIKE '%Hello Mr%'
	PRINT 'Hello'
ELSE IF @txtMailBody LIKE '%Wad Up%'
	PRINT 'Wad Up'
ELSE
	PRINT 'Goof Up!'

Output:
Hello
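Because % and _ are wildcards in a LIKE pattern, searching for them as literal characters needs extra care. A small sketch, with a made-up variable, showing the two usual forms: a bracketed character class and an ESCAPE character.

```sql
DECLARE @txtOffer nvarchar(100) = 'Discount: 50% off';

-- [%] matches a literal percent sign.
IF @txtOffer LIKE '%50[%]%'
	PRINT 'Bracket form: found';

-- With ESCAPE '!', the sequence !% also matches a literal percent sign.
IF @txtOffer LIKE '%50!%%' ESCAPE '!'
	PRINT 'ESCAPE form: found';
```

Both conditions are true for this string, so both messages are printed.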

Finding duplicates in a table by grouping columns SQL Server

The GROUP BY clause is used in a SELECT query to group rows that share the same values in certain columns.
It is often used with aggregate functions (COUNT, MAX, MIN, SUM, AVG) to produce one result row per group.

To find duplicates in a SQL Server table based on multiple columns, say name and email, group by those columns and keep only the groups with more than one row. Given the table below:

Id  NAME   EMAIL
1   John   John.Smith@abc.com
2   James  James.Connor@abc.com
3   Bob    Bob.Smith@abc.com
4   Robin  Robin.Sharma@abc.com
5   Bob    Bob.Smith@abc.com
6   John   John.Smith@abc.com

SELECT name, email, COUNT(*) AS cnt FROM userdetails
GROUP BY name, email
HAVING COUNT(*)>1

Output:

NAME  EMAIL               cnt
John  John.Smith@abc.com  2
Bob   Bob.Smith@abc.com   2

SQL Server (as of SQL Server 2017) still requires every non-aggregated column in the SELECT list to appear in the GROUP BY clause.
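One consequence is that the Id column cannot simply be added to the SELECT list of the query above. If the Ids of the extra rows are needed, a window function is one possible alternative. The sketch below uses a table variable named @userdetails as a stand-in for the real table and loads it with the sample rows shown earlier.

```sql
-- Table variable standing in for the userdetails table (hypothetical column sizes).
DECLARE @userdetails TABLE (Id INT, name VARCHAR(50), email VARCHAR(100));

INSERT INTO @userdetails (Id, name, email) VALUES
	(1, 'John',  'John.Smith@abc.com'),
	(2, 'James', 'James.Connor@abc.com'),
	(3, 'Bob',   'Bob.Smith@abc.com'),
	(4, 'Robin', 'Robin.Sharma@abc.com'),
	(5, 'Bob',   'Bob.Smith@abc.com'),
	(6, 'John',  'John.Smith@abc.com');

-- Number the rows within each (name, email) group; rn = 1 is the first occurrence.
WITH numbered AS (
	SELECT Id, name, email,
		ROW_NUMBER() OVER (PARTITION BY name, email ORDER BY Id) AS rn
	FROM @userdetails
)
SELECT Id, name, email
FROM numbered
WHERE rn > 1          -- every row after the first one in its group
ORDER BY Id;          -- returns Id 5 (Bob) and Id 6 (John)
```

The same CTE can feed a DELETE when the goal is to remove the extra rows rather than list them.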