How To Synchronise Data Across Two SQL Server Databases – Part 1. SQL Code And Application For Individual Objects Processing

September 17th, 2013 / No Comments » / by admin

Note: This series is comprised of two posts – this one and another one which can be viewed HERE.

Lately I have been working on a project which involves synchronising data across two different databases without using any proprietary solution. As you know, there are many great tool out there which can automate such process hassle-free, however, given the requirements set out by the client i.e. having to use existing SQL Server infrastructure with no third-party tools, the project posed some interesting challenges. To cut the long story short, solution architect in consultation with the client decided to settle on utilising log shipping but I also experimented with other approaches, one of those described below. Given the fact that the data required to be synchronised was heavily de-normalised and that all of the tables had a primary key constraint on them I took advantage of the SQL code encapsulated in a stored procedure, which was preliminarily developed by my colleague – Fredy. The code builds MERGE SQL statement on the fly based on database objects’ metadata and as long the table has a primary key constraint present, it automatically handles INSERT and UPDATE based on its content.

Let’s take a closer look at the possible scenario where such dynamic MERGE statement could be used. Suppose we have two tables, one created on a source database and one on target database. The source table has just been updated with some new data and also some of the old data has been modified. Target table, on the other hand, has some overlapping data, however, given that the source has is more up-to-date, target object requires to be synchronised to replicate the changes – both inserts and updates. To materialise this scenario on a database level, let’s execute the following SQL and create the above databases, objects and some dummy data.

USE [master]
GO
IF EXISTS (SELECT name FROM sys.databases WHERE name = N'Source_DB')
BEGIN
-- Close connections to the DW_Sample database
ALTER DATABASE [Source_DB] SET SINGLE_USER WITH ROLLBACK IMMEDIATE
DROP DATABASE [Source_DB]
END
GO
CREATE DATABASE [Source_DB] 

IF EXISTS (SELECT name FROM sys.databases WHERE name = N'Target_DB')
BEGIN
-- Close connections to the DW_Sample database
ALTER DATABASE [Target_DB] SET SINGLE_USER WITH ROLLBACK IMMEDIATE
DROP DATABASE [Target_DB]
END
GO
CREATE DATABASE [Target_DB]

USE Source_DB
CREATE TABLE Source_Tbl (
ID int NOT NULL,
Sample_Data_Col1 varchar (50) NOT NULL,
Sample_Data_Col2 varchar (50) NOT NULL,
Sample_Data_Col3 varchar (50) NOT NULL)
GO

USE Target_DB
CREATE TABLE Target_Tbl (
ID int NOT NULL,
Sample_Data_Col1 varchar (50) NOT NULL,
Sample_Data_Col2 varchar (50) NOT NULL,
Sample_Data_Col3 varchar (50) NOT NULL)
GO

USE Source_DB
DECLARE @rowcount int = 0
WHILE @rowcount < 1000
	BEGIN
		SET NOCOUNT ON
		INSERT INTO Source_Tbl
		(ID, Sample_Data_Col1, Sample_Data_Col2, Sample_Data_Col3)
		SELECT 
		@rowcount, 
		'Sample_Data' + CAST(@rowcount as varchar(10)), 
		'Sample_Data' + CAST(@rowcount as varchar(10)), 
		'Sample_Data' + CAST(@rowcount as varchar(10))
		SET @rowcount = @rowcount + 1
	END
GO

USE Target_DB
DECLARE @rowcount int = 0
WHILE @rowcount < 1000
	BEGIN
		SET NOCOUNT ON
		INSERT INTO Target_Tbl
		(ID, Sample_Data_Col1, Sample_Data_Col2, Sample_Data_Col3)
		SELECT 
		@rowcount, 
		'Sample_Data' + CAST(@rowcount as varchar(10)), 
		'Sample_Data' + CAST(@rowcount as varchar(10)), 
		'Sample_Data' + CAST(@rowcount as varchar(10))
		SET @rowcount = @rowcount + 1
	END
GO

DELETE FROM Target_DB.dbo.Target_Tbl
WHERE ID >= 500

UPDATE Source_DB.dbo.Source_Tbl
SET Sample_Data_Col1 = 'Changed_Data'
WHERE ID < 10
UPDATE Source_DB.dbo.Source_Tbl
SET Sample_Data_Col2 = 'Changed_Data'
WHERE ID < 10
UPDATE Source_DB.dbo.Source_Tbl
SET Sample_Data_Col3 = 'Changed_Data'
WHERE ID < 10

CREATE UNIQUE CLUSTERED INDEX [Clustered_Idx_Id] ON Target_DB.dbo.Target_Tbl
([ID] ASC)
GO
CREATE UNIQUE CLUSTERED INDEX [Clustered_Idx_Id] ON Source_DB.dbo.Source_Tbl
([ID] ASC)
GO

SELECT COUNT(1) AS Target_Count FROM Target_DB.dbo.Target_Tbl
SELECT COUNT(1) AS Source_Count FROM Source_DB.dbo.Source_Tbl

SELECT COUNT(1) AS Records_Count_Difference FROM
(SELECT ID FROM Target_DB.dbo.Target_Tbl
INTERSECT
SELECT ID FROM Source_DB.dbo.Source_Tbl) AS Records_Count_Difference
 
SELECT COUNT(1) AS Data_Records_Count_Difference FROM
(SELECT * FROM Target_DB.dbo.Target_Tbl
INTERSECT
SELECT * FROM Source_DB.dbo.Source_Tbl) AS Data_Records_Count_Difference

When the two databases and tables have been created and populated, the Results pane should display the difference status as per image below.

DBMergeSync_Environment_SetUp1

You will notice that the difference for record count between the source and the target is 500 and that out of the data which should be overlapping, 10 records have different values i.e. 490 out of 500 records remained the same whereas 10 records have been changed in the source. This is a very rudimentary scenario and its simplicity is only for demonstration purposes.

Next, let’s assume that we want to synchronise the data between those two tables and databases to account for changed and new records in the target table. Below is the SQL for a stored procedure which accomplishes that, you can also download the SQL for this short series from HERE. As mentioned previously, the code builds MERGE SQL statement on the fly based on database objects’ metadata and as long the table has a primary key constraint present, it automatically handles INSERT and UPDATE based on its content.

USE [Source_DB]
GO
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[usp_DBSync]')
AND type IN (N'P',N'PC'))
DROP PROCEDURE [dbo].[usp_DBSync]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER OFF
GO

CREATE PROCEDURE [dbo].[usp_DBSync] 
(@Src_DB			varchar		(100), 
 @Tgt_DB			varchar		(100),
 @Src_Schema_Name	varchar		(10),
 @Tgt_Schema_Name	varchar		(10),
 @Src_Object_Name	varchar		(256),
 @Tgt_Object_Name	varchar		(256)) 
 AS

SET NOCOUNT ON

DECLARE
  @IsDebugMode bit,
  @ExecSQL nvarchar(max),
  @Err_Msg nvarchar (1000)

SET
  @IsDebugMode = 0


/*====================================================================================
								CREATE TEMP TABLES							
======================================================================================*/

CREATE TABLE #Src_Tgt_Tables
(
  [Data_Obj_Id] [int] NOT NULL,
  [Src_Tgt_Flag] [varchar](1) NOT NULL,
  [Object_Id] [int] NOT NULL,
  [Object_Name] [sysname] NOT NULL,
  [Schema_Name] [sysname] NOT NULL,
  [Schema_Object_Name] [varchar](260) NOT NULL,
  [Column_Id] [smallint] NULL,
  [Column_Name] [varchar](200) NULL,
  [IsIdentity] [tinyint] NULL,
  [IsComputed] [tinyint] NULL,
  [IsNullable] [tinyint] NULL,
  [Default] [varchar](max) NULL,
  [DataType] [varchar](152) NULL,
  [DataType_CastGroup] [varchar](134) NOT NULL,
  [Collation_Name] [sysname] NULL
)

CREATE TABLE #Tgt_NK_Cols
(
  [Data_Obj_Id] [int] NOT NULL,
  [Schema_Object_Name] [varchar](260) NOT NULL,
  [Where_Clause] [varchar](max) NULL,
  S_Schema_Object_Name [varchar](260) NOT NULL
)
IF @IsDebugMode = 1
  SELECT 
     [Source DB] = @Src_DB,
     [Target DB] = @Tgt_DB,
	 [Source_Schema_Name] = @Src_Schema_Name,
	 [Target_Schema_Name] = @Tgt_Schema_Name,
	 [Source_Objects_Name] = @Src_Object_Name,
	 [Target_Objects_Name] = @Tgt_Object_Name


/*====================================================================================
				PERFORM DATABASES, SCHEMAS AND OBJECTS CHECKS							
======================================================================================*/

DECLARE @CMD varchar (1024)
CREATE TABLE #Objects_List(
DatabaseName	sysname,
SchemaName		sysname,
ObjectName		sysname)

SET @CMD = 'USE [?]; SELECT DB_NAME() DATABASE_NAME, SCHEMA_NAME(schema_id),
NAME FROM sys.tables'

INSERT INTO #Objects_List
EXEC SP_MSFOREACHDB @CMD

DELETE FROM #Objects_List
WHERE DatabaseName IN ('master', 'msdb', 'tempdb', 'model')

IF NOT EXISTS (SELECT 1 FROM #Objects_List a WHERE a.databasename = @Src_DB)
BEGIN
  SET 
    @Err_Msg = 'Source database cannot be found. You nominated "' + @Src_DB + '". 
				Check that the database of that name exists on the instance' 
    RAISERROR (
        @Err_Msg  -- Message text.
        ,16 -- Severity.
            ,1 -- State.
  )
END    

IF NOT EXISTS (SELECT 1 FROM #Objects_List a WHERE a.databasename = @Tgt_DB)
BEGIN
  SET 
    @Err_Msg = 'Target database cannot be found. You nominated "' + @Tgt_DB + '". 
				Check that the database of that name exists on the instance' 
    RAISERROR (
        @Err_Msg  -- Message text.
        ,16 -- Severity.
            ,1 -- State.
  )
END    

IF NOT EXISTS (SELECT 1 FROM #Objects_List a WHERE a.SchemaName = @Src_Schema_Name)
BEGIN
  SET 
    @Err_Msg = 'Source schema cannot be found. You nominated "' + @Src_Schema_Name + '". 
				Check that the schema of that name exists on the database' 
    RAISERROR (
        @Err_Msg  -- Message text.
        ,16 -- Severity.
            ,1 -- State.
  )
END   

IF NOT EXISTS (SELECT 1 FROM #Objects_List a WHERE a.SchemaName = @Tgt_Schema_Name)
BEGIN
  SET 
    @Err_Msg = 'Target schema cannot be found. You nominated "' + @Tgt_Schema_Name + '". 
				Check that the schema of that name exists on the database' 
    RAISERROR (
        @Err_Msg  -- Message text.
        ,16 -- Severity.
            ,1 -- State.
  )
END  

IF NOT EXISTS (SELECT 1 FROM #Objects_List a WHERE a.ObjectName = @Src_Object_Name)
BEGIN
  SET 
    @Err_Msg = 'Source object cannot be found. You nominated "' + @Src_Object_Name + '". 
				Check that the object of that name exists on the database' 
    RAISERROR (
        @Err_Msg  -- Message text.
        ,16 -- Severity.
            ,1 -- State.
  )
END 

IF NOT EXISTS (SELECT 1 FROM #Objects_List a WHERE a.ObjectName = @Tgt_Object_Name)
BEGIN
  SET 
    @Err_Msg = 'Target object cannot be found. You nominated "' + @Tgt_Object_Name + '". 
				Check that the object of that name exists on the database' 
    RAISERROR (
        @Err_Msg  -- Message text.
        ,16 -- Severity.
            ,1 -- State.
  )
END 


/*====================================================================================
						EXTRACT SOURCE AND TARGET DATA							
======================================================================================*/
SET
   @ExecSQL =
   'INSERT INTO
	#Src_Tgt_Tables
	SELECT
	[Data_Object_Id]		=	so.id,
	[Src_Trg_Flag]			=	src_tgt.Src_Trg_Flag,
	[Object_Id]				=	so.id,
	[Object_Name]			=	so.name,
	[Schema_Name]			=	sh.name,
	[Schema_Object_Name]	=	''[' +@Src_DB+ '].[''+sh.name+''].[''+so.name+'']'',
	[Column_Id]				=	sc.column_id,
	[Column_Name]			=	''[''+sc.name+'']'',
	[IsIdentity]			=	sc.is_identity,
	[IsComputed]			=	sc.is_computed,
	[IsNullable]			=	sc.is_nullable,
	[Default]				=	dc.definition,
	[DataType]				=	(
								CASE
								WHEN T.system_type_id IN (167, 175, 231, 239) AND SC.max_length > 0 
								THEN T.Name + ''('' + CAST(SC.max_length AS varchar(10)) + '')''
								WHEN T.system_type_id IN (167, 175, 231, 239) AND SC.max_length = -1 THEN T.Name + ''(MAX)''
								-- For Numeric and Decimal data types
								WHEN T.system_type_id IN (106, 108) THEN T.Name + ''('' + CAST(SC.precision AS varchar(10)) + '', '' + 
								CAST(SC.scale AS varchar(10)) + '')''
								ELSE T.Name
								END
								),
	[DataType_CastGroup]	=	(
								CASE
								WHEN T.system_type_id IN (167, 175, 231, 239) THEN ''String''
								-- For Numeric and Decimal data types
								WHEN T.system_type_id IN (106, 108) THEN ''Numeric''
								ELSE ''Other''
								END
								),
	[Collation_Name]		=	SC.collation_name

	FROM
	'+@Src_DB+'.sys.sysobjects so (NOLOCK)
	INNER JOIN '+@Src_DB+'.sys.columns sc (NOLOCK) ON
    sc.object_id = so.id
    LEFT JOIN '+@Src_DB+'.sys.default_constraints dc (NOLOCK) ON
    dc.parent_object_id = so.id
    AND dc.parent_column_id = sc.column_id
	INNER JOIN '+@Src_DB+'.sys.types t (NOLOCK) ON
    t.user_type_id = sc.user_type_id
	INNER JOIN '+@Src_DB+'.sys.schemas sh (NOLOCK) ON
    sh.schema_id = so.uid
	INNER  JOIN (
				select  
				Data_Obj_Id = t.object_id, 
				t.name as phisical_name, 
				s.name as s_name  ,
				Src_trg_Flag = ''S'',
				Object_Type = ''U''
				FROM   '+@Src_DB+'.sys.tables t
				JOIN sys.schemas s ON t.schema_id = s.schema_id
				WHERE s.name = '''+@Src_Schema_Name+'''
				and t.name = '''+@Src_Object_Name+'''
				) src_tgt ON
	src_tgt.phisical_name = so.name and
    src_tgt.s_name = sh.name
    WHERE
	so.xtype = src_tgt.Object_Type'
	
IF @IsDebugMode = 1
  PRINT @ExecSQL
  
EXEC sp_executesql @ExecSQL
  
  SET
  @ExecSQL =
	'INSERT INTO
	#Src_Tgt_Tables
	SELECT
	[Data_Obj_Id]			=	src_tgt.Data_Obj_Id,
	[Src_Trg_Flag]			=	src_tgt.Src_Trg_Flag,
	[Object_Id]				=	so.id,
	[Object_Name]			=	so.name,
	[Schema_Name]			=	sh.name,
	[Schema_Object_Name]	=	''[' +@Tgt_DB+ '].[''+sh.name+''].[''+so.name+'']'',
	[Column_Id]				=	sc.column_id,
	[Column_Name]			=	''[''+sc.name+'']'',
	[IsIdentity]			=	sc.is_identity,
	[IsComputed]			=	sc.is_computed,
	[IsNullable]			=	sc.is_nullable,
	[Default]				=	dc.definition,
	[DataType]				=	(
								CASE
								WHEN T.system_type_id IN (167, 175, 231, 239) AND SC.max_length > 0 
								THEN T.Name + ''('' + CAST(SC.max_length AS varchar(10)) + '')''
								WHEN T.system_type_id IN (167, 175, 231, 239) AND SC.max_length = -1 THEN T.Name + ''(MAX)''
								-- For Numeric and Decimal data types
								WHEN T.system_type_id IN (106, 108) THEN T.Name + ''('' + CAST(SC.precision AS varchar(10)) + '', '' 
								+ CAST(SC.scale AS varchar(10)) + '')''
								ELSE T.Name
								END
								),
	[DataType_CastGroup]	=	(
								CASE
								WHEN T.system_type_id IN (167, 175, 231, 239) THEN ''String''
								-- For Numeric and Decimal data types
								WHEN T.system_type_id IN (106, 108) THEN ''Numeric''
								ELSE ''Other''
								END
								),
	[Collation_Name]		=	SC.collation_name
	FROM
	'+@Tgt_DB+'.sys.sysobjects so
	INNER JOIN '+@Tgt_DB+'.sys.columns sc ON
    sc.object_id = so.id
	LEFT JOIN '+@Tgt_DB+'.sys.default_constraints dc ON
    dc.parent_object_id = so.id
    AND dc.parent_column_id = sc.column_id
	INNER JOIN '+@Tgt_DB+'.sys.types t ON
    t.user_type_id = sc.user_type_id
	INNER JOIN '+@Tgt_DB+'.sys.schemas sh ON
    sh.schema_id = so.uid
	INNER JOIN (
				select  
				Data_Obj_Id = t.object_id, 
				t.name as phisical_name, 
				s.name as s_name,
				Src_trg_Flag = ''T'',
				Object_Type = ''U''
				FROM   '+@tgt_DB+'.sys.tables t
				JOIN sys.schemas s ON t.schema_id = s.schema_id
				WHERE S.name = '''+@Tgt_Schema_Name+'''
				and t.name = '''+@Tgt_Object_Name+'''
				) src_tgt ON
	src_tgt.phisical_name = so.name and
    src_tgt.s_name = sh.name
	WHERE
	so.xtype = ''U'' -- Table
	ORDER BY
	so.name,
	sc.column_id'
	
IF @IsDebugMode = 1
  PRINT @ExecSQL
 
EXEC sp_executesql @ExecSQL


/*====================================================================================
		ENSURE THAT SOURCE AND TARGET DETAILS ARE PRESENT IN TEMP TABLE							
======================================================================================*/

IF @IsDebugMode = 1
  SELECT [Table] = '#Src_Tgt_Tables', * FROM #Src_Tgt_Tables
 
IF (SELECT COUNT(*) FROM #Src_Tgt_Tables ST WHERE ST.Src_Tgt_Flag = 'S') < 1
BEGIN
	SET @Err_Msg = 'No Source table details found. Configured Source Database is "' + @Src_DB + '".'
    RAISERROR (
        @Err_Msg  -- Message text.
        ,16 -- Severity.
            ,1 -- State.
  )
END

IF (SELECT COUNT(*) FROM #Src_Tgt_Tables ST WHERE ST.Src_Tgt_Flag = 'T') < 1
BEGIN
	SET @Err_Msg = 'No Target table details found. Configured Source Database is "' + @Tgt_DB + '".'
    RAISERROR (
        @Err_Msg  -- Message text.
        ,16 -- Severity.
            ,1 -- State.
  )
END   


/*====================================================================================
				PREPARE 'WHERE' CLAUSE FOR THE MERGE STATEMENT							
======================================================================================*/

SET
  @ExecSQL = 
	'INSERT INTO
	#Tgt_NK_Cols
	SELECT
	TOP 1
	Data_Obj_Id				=	tgt.Data_Obj_Id,
	[Schema_Object_Name]	=	tgt.[Schema_Object_Name],
	[Where_Clause]			=	STUFF(REPLACE((SELECT
								''  AND'' + '' TGT.[''+ sc.name +''] =
								SRC.[''+ REPLACE(REPLACE(sc.name, ''<'', ''~''), ''>'', ''!'') + '']'' + CHAR(10)
								FROM
								'+@Tgt_DB+'.sys.sysindexkeys sik
								INNER JOIN '+@Tgt_DB+'.sys.syscolumns sc on
								sc.id = sik.id
								and sc.colid = sik.colid
								WHERE
								sik.id = si.object_id
								AND sik.indid = si.index_id
								ORDER BY
								sik.keyno
								FOR XML PATH('''')
								), ''&#x0D;'', ''''), 1, 5, ''''),
	[S_Schema_Object_Name]	=	(
								SELECT	Top 1
								S.Schema_Object_Name
								FROM
								#Src_Tgt_Tables S
								WHERE
								S.Src_Tgt_Flag = ''S'')
	FROM
	'+@Tgt_DB+'.sys.indexes si
	INNER JOIN '+@Tgt_DB+'.sys.sysobjects so ON
    so.id = si.object_id
	INNER JOIN '+@Tgt_DB+'.sys.schemas sh ON
    sh.schema_id = so.uid
	INNER JOIN (
				SELECT
				[Data_Obj_Id],
				[Object_Id],
				[Object_Name],
				[Schema_Name],
				[Schema_Object_Name]
				FROM
				#Src_Tgt_Tables
				WHERE
				Src_Tgt_Flag = ''T''
				) tgt ON
				tgt.[Object_Id] = so.id
	WHERE 
	si.is_unique = 1 /*Only Unique Index*/'
 
IF @IsDebugMode = 1
 PRINT @ExecSQL
 
EXEC sp_executesql @ExecSQL
 
IF @IsDebugMode = 1
  SELECT [Table] = '#Tgt_NK_Cols', * FROM #Tgt_NK_Cols
 

/*====================================================================================
				ENSURE THAT UNIQUE KEY INDEX IS PRESENT						
======================================================================================*/

IF EXISTS(SELECT 1 FROM #Tgt_NK_Cols NK WHERE NK.Where_Clause IS NULL)
BEGIN
      SET
        @Err_Msg = 'No Unique Key Index is found. 
					Configured Source Database is "' + @Tgt_DB + '".'
 
      RAISERROR (
        @Err_Msg  -- Message text.
        ,16 -- Severity.
            ,1 -- State.
  )
END


/*====================================================================================
							PREPARE MERGE STATEMENT							
======================================================================================*/

DECLARE
@MergeSQL		nvarchar(max),
@UpdateColSet	nvarchar(max),
@TargetColSet	nvarchar(max),
@SourceColSet	nvarchar(max),
@ValueColSet	nvarchar(max)

SELECT
@MergeSQL = '
MERGE '+NK.Schema_Object_Name+' TGT
USING (
  SELECT{SOURCE_COLUMN_SET}
  FROM
    '+NK.S_Schema_Object_Name+' SRC (NOLOCK)
  ) SRC ON
 '+ NK.Where_Clause+'
WHEN MATCHED THEN
  UPDATE SET{UPDATE_COLUMN_SET}
WHEN NOT MATCHED THEN
  INSERT({TARGET_COLUMN_SET}
      )
  VALUES ({VALUE_COLUMN_SET}
  )
OUTPUT $action INTO #SummaryOfChanges(Action_Name);
'
FROM
  #Tgt_NK_Cols NK
  SELECT
    @TargetColSet =	REPLACE(STUFF((SELECT ','+CHAR(10) 
					+ '    ' + CAST(TC.Column_Name as varchar(100))
					FROM
					#Src_Tgt_Tables TC (NOLOCK)
					LEFT JOIN (
								SELECT
								T.[Data_Obj_Id]
								,T.[Column_Name]
								,T.[IsNullable]
								,T.[Default]
								,T.[DataType]
								,T.[DataType_CastGroup]
								FROM
								#Src_Tgt_Tables T (NOLOCK)
								WHERE
								T.Src_Tgt_Flag = 'S'
								AND T.Data_Obj_Id = TS.S_Data_Obj_Id
								) SC 
					ON SC.Column_Name = TC.Column_Name
					WHERE
					TC.Src_Tgt_Flag = 'T'
					AND TC.Data_Obj_Id = TS.T_Data_Obj_Id
					--AND TC.IsIdentity <> 1 -- Ignore identity
					AND TC.IsComputed <> 1 -- and computed columns
					ORDER BY
					TC.Column_Id
					FOR XML PATH('')
					), 1, 1, ''), '&#x0D;', ''),

@SourceColSet =		REPLACE(STUFF((SELECT ','+CHAR(10) + '    ' + ISNULL('SRC.' + CAST(SC.Column_Name as varchar(100)), 
					CAST(TC.Column_Name as varchar(100))+' = '+ REPLACE(REPLACE('{'+TC.Column_Name+'}', '{[', '{'), ']}', '}'))
					FROM #Src_Tgt_Tables TC (NOLOCK)
					LEFT JOIN (
								SELECT
								T.[Data_Obj_Id]
								,T.[Column_Name]
								,T.[IsNullable]
								,T.[Default]
								,T.[DataType]
								,T.[DataType_CastGroup]
								FROM
								#Src_Tgt_Tables T (NOLOCK)
								WHERE
								T.Src_Tgt_Flag = 'S'
								AND T.Data_Obj_Id = TS.S_Data_Obj_Id
								) SC 
					ON SC.Column_Name = TC.Column_Name
					WHERE
					TC.Src_Tgt_Flag = 'T'
					AND TC.Data_Obj_Id = TS.T_Data_Obj_Id
					AND TC.IsIdentity <> 1 -- Ignore identity
					AND TC.IsComputed <> 1 -- and computed columns
					ORDER BY
					TC.Column_Id
					FOR XML PATH('')
					), 1, 1, ''), '&#x0D;', ''),

@UpdateColSet =		REPLACE(STUFF((SELECT ','+CHAR(10) + '    ' + CAST(TC.Column_Name as varchar(100))+' = SRC.' + ISNULL(SC.Column_Name, TC.Column_Name)
					FROM
					#Src_Tgt_Tables TC (NOLOCK)
					LEFT JOIN (
								SELECT
								T.[Data_Obj_Id]
								,T.[Column_Name]
								,T.[IsNullable]
								,T.[Default]
								,T.[DataType]
								,T.[DataType_CastGroup]
								FROM
								#Src_Tgt_Tables T (NOLOCK)
								WHERE
								T.Src_Tgt_Flag = 'S'
								AND T.Data_Obj_Id = TS.S_Data_Obj_Id
								) SC ON
					SC.Column_Name = TC.Column_Name
					WHERE
				    TC.Src_Tgt_Flag = 'T'
					AND TC.Data_Obj_Id = TS.T_Data_Obj_Id
					AND TC.IsIdentity <> 1 -- Ignore identity
					AND TC.IsComputed <> 1 -- and computed columns
					ORDER BY
					TC.Column_Id
					FOR XML PATH('')
					), 1, 1, ''), '&#x0D;', ''),
						

@ValueColSet =		REPLACE(STUFF((SELECT ','+CHAR(10) + '    ' + ISNULL('SRC.' + CAST(SC.Column_Name as varchar(100)), 
					'SRC.' + CAST(TC.Column_Name as varchar(100)))
					FROM
					#Src_Tgt_Tables TC (NOLOCK)
					LEFT JOIN (
								SELECT
								T.[Data_Obj_Id]
						        ,T.[Column_Name]
								,T.[IsNullable]
					            ,T.[Default]
								,T.[DataType]
								,T.[DataType_CastGroup]
							    FROM
							    #Src_Tgt_Tables T (NOLOCK)
						        WHERE
							    T.Src_Tgt_Flag = 'S'
							    AND T.Data_Obj_Id = TS.S_Data_Obj_Id
						        ) SC ON
			         SC.Column_Name = TC.Column_Name
				     WHERE
				     TC.Src_Tgt_Flag = 'T'
				     AND TC.Data_Obj_Id = TS.T_Data_Obj_Id
				     AND TC.IsIdentity <> 1 -- Ignore identity
				     AND TC.IsComputed <> 1 -- and computed columns
				     ORDER BY
				     TC.Column_Id
					 FOR XML PATH('')
					), 1, 1, ''), '&#x0D;', '')
FROM
	    (SELECT [T_Data_Obj_Id] = T.Data_Obj_Id,
        [T_Schema_Object_Name] = T.Schema_Object_Name,
        [S_Schema_Object_Name] = (
									SELECT Top 1 S.Schema_Object_Name
									FROM
						            #Src_Tgt_Tables S
							        WHERE
						            S.Src_Tgt_Flag = 'S'
								   ),
        [S_Data_Obj_Id]		= (
						          SELECT Top 1 S.Data_Obj_Id FROM
									#Src_Tgt_Tables S
									WHERE
						            S.Src_Tgt_Flag = 'S'
						          )
	      FROM #Src_Tgt_Tables T
	      WHERE
	      T.Src_Tgt_Flag = 'T'
		) TS

SELECT @MergeSQL = REPLACE(@MergeSQL, '{UPDATE_COLUMN_SET}', @UpdateColSet)
SELECT @MergeSQL = REPLACE(@MergeSQL, '{TARGET_COLUMN_SET}', @TargetColSet)
SELECT @MergeSQL = REPLACE(@MergeSQL, '{SOURCE_COLUMN_SET}', @SourceColSet)
SELECT @MergeSQL = REPLACE(@MergeSQL, '{VALUE_COLUMN_SET}', @ValueColSet)


/*====================================================================================
			EXECUTE MERGE STATEMENT AND CHECK FOR EXECUTION RESULTS							
======================================================================================*/

DECLARE
@UpdatedCount	int,
@InsertedCount	int,
@DeletedCount	int,
@StartTime		datetime,
@EndTime		datetime

CREATE TABLE #SummaryOfChanges (Action_Name VARCHAR(50));
SET @StartTime = GETDATE()

BEGIN TRY
	IF @IsDebugMode = 1
	BEGIN
		PRINT @MergeSQL
	END

		EXEC sp_executesql @MergeSQL

		SELECT
		@UpdatedCount	=	SUM(CASE WHEN Action_Name = 'UPDATE' THEN 1 ELSE 0 END),
		@InsertedCount	=	SUM(CASE WHEN Action_Name = 'INSERT' THEN 1 ELSE 0 END),
		@DeletedCount	=	SUM(CASE WHEN Action_Name = 'DELETE' THEN 1 ELSE 0 END)
		FROM
		#SummaryOfChanges
	
	IF @IsDebugMode = 1
	BEGIN
		SELECT @UpdatedCount as Records_Updated
		SELECT @InsertedCount as Records_Inserted
		SELECT @DeletedCount as Records_Deleted
	END
END TRY
BEGIN CATCH
END CATCH
SET NOCOUNT OFF;

In order to synchronise the data between the two previously created objects let’s run the usp_DBSync stored procedure (assuming you have already executed the above SQL) either from the context menu in Management Studio or by running the following code.

USE [Source_DB]
GO
DECLARE	@return_value int
EXEC	@return_value = [dbo].[usp_DBSync]
		@Src_DB = N'Source_DB',
		@Tgt_DB = N'Target_DB',
		@Src_Schema_Name = N'dbo',
		@Tgt_Schema_Name = N'dbo',
		@Src_Object_Name = N'Source_Tbl',
		@Tgt_Object_Name = N'Target_Tbl'
SELECT	'Return Value' = @return_value
GO

Finally, to check if the data has been synchronised successfully, let’s run the last few lines from the first SQL code batch used to create the databases and objects and compare the results. If everything executed as expected and the source has been merged with the target, the output should be as per below.

DBMergeSync_Merge_AfterExec_Results

One thing to note here is that MERGE does not seem to work across two different SQL Server instances e.g. between linked servers. For that you will need to break/alter the above MERGE SQL into INSERT and UPDATE statements. In the NEXT POST to this series I will show how to synchronise multiple objects across two databases. Rather than executing our MERGE stored procedure for each database object individually, we can provide a looping functionality by means of using another stored procedure or an SSIS package to pick up all relevant object and execute it as many times as there is tables to merge together.

Just in case you have any issues coping and pasting, all SQL code as well as any additional files can be downloaded from HERE.

Tags:

How To Build A Data Mart Using Microsoft BI Stack Part 8 – Creating Sample SSRS Report

September 16th, 2013 / 12 Comments » / by admin

In this PREVIOUS POST I focused on the concepts of SSAS solution deployment and cube creation as well as went through some basic validation techniques to ensure that our data mart has correct data in it before it can be reported out of. This final post deals with reporting out of our data mart using SQL Server Reporting Services (SSRS) – another application being part of the comprehensive Microsoft BI stack (I have already covered SSIS and SSAS in previous posts).

There are many different ways to report on data stored in the data mart. For impromptu and ad hoc data analysis Excel cannot be beaten on user friendliness and ease of development. Having said that, I have seen many Excel reports which were more like stand-alone applications, with thousands of lines of VBA code mixed with SQL and MDX so given this tool’s flexibility, I would not dismiss it as lightweight and simple. Also, given that Excel seems to be getting all the attention as far as the direction Microsoft wants to take its reporting tools in, with many bells and whistles, particularly in version 2013, it is expected that Excel, not other applications will gain traction over the coming years. For more ‘pixel-perfect’ reports however, SSRS is still a great and powerful tool to use and it will be the focus of this post.

There are two applications that SSRS reports can build in: Report Builder and Visual Studio. Report Builder is designed to be user-friendly, and its GUI interface is designed to have the look and feel of Microsoft Office so that developers accustomed to building reports in either Microsoft Access or Excel will feel right at home. Unlike Visual Studio, Report Builder is not automatically installed when you install SQL Server 2012. It must be downloaded from the Internet and installed separately. Report Builder is designed for casual developers. It provides most, but not all, of the features that come with Visual Studio. For more feature-rich development, Visual Studio’s Reporting Services project provides full-featured development. With it you can accomplish everything that can be done in Report Builder, plus you have the ability to manage multiple files concurrently, work directly with source control, and include the Reporting Services project in the same solution as SSIS and SSAS projects.

There are many aspects of Reporting Services that an astute developer would have to get acquainted with e.g. SSRS administrative services, SSRS configuration manager, deployment and scheduling process etc. They won’t be covered here but there is copious amount of information on the internet relating to the management side of SSRS specifically. Without further ado, let’s jump into creating our first SSRS report based on the data in our data mart. Let’s open up Visual Studio and create a new project selecting Report Server Project from the options provided.

Create_SSRS_project_HTDM

Once the project has been created, you can see it in Solution Explorer. Next, let’s create one or more data sources. Data sources provide the connection information for your reports. To create a shared data source, right-click the folder called Shared Data Sources and select the Add New Data Source option from the context menu. The Shared Data Source Properties dialog window will appear. In the Shared Data Source Properties dialog window, you enter a name for your new data source, select the type of connection to create, and provide a connection string as per image below.

Set_db_connection_SSRS_HTDM

After we create our data sources, we then make one or more datasets. Datasets contain query string used to generate report data. The query string language is dependent on which type of connection is used. For example, if you are connecting to a SQL Server database, use the SQL language. If you are connecting to an Analysis Server database, use a language such as MDX. To create a shared dataset, right-click the Shared Datasets folder in Solution Explorer and choose Add New Dataset from the context menu. A Shared Dataset Properties dialog window appears where you can identify the name, data source, query type, and query for the dataset as per image below. Let’s not populate the query pane with any SQL just yet and leave it blank.

Datasource_properties_SSRS_HTDM

Once our data source and initial dataset have been defined, right-clicking the Reports folder in Solution Explorer and choosing Add New Report. This will trigger the wizard which we can use to define our report data source and the query which will drive the reports data. Next, on the Design the Query pane, enter the following SQL.

SELECT SUM(FactSales.SalesQuantity) AS SalesQuantity, DimStores.StoreName, DimDates.DateName
FROM  FactSales INNER JOIN
DimStores ON FactSales.StoreKey = DimStores.StoreKey INNER JOIN
DimDates ON FactSales.OrderDateKey = DimDates.DateKey
WHERE (DimStores.IsCurrent = 1)
GROUP BY DimStores.StoreName, DimDates.DateName, DimDates.Date
ORDER BY DimDates.Date

 

At this stage the wizard should look as per image blow.

SQL_query_SSRS_HTDM

When on Select the Report Type step, click on Matrix radio button. Click Next to proceed to Design the Matrix step and adjust the properties as per image below.

Columns_assignment_SSRS_HTDM

Continuing on, you can choose to finish the wizard or go to next step to adjust the theme you wish to apply and assign the report name. When completed, click on Finish to conclude the wizard. At this stage all we have is a very simple sales matrix with sales amount measures partitioned by store name and particular month. You can also see a tree-view of individual objects which make up this report on the left-hand side with data sources, data sets and individual attributes coming from data mart tables. We can run the report clicking on Preview button in the top left corner of the design pane.

Basic_report_SSRS_HTDM

At this point the report looks quite simplistic and there is no level of interactivity. It may be OK for someone who would only run it to export the results to Excel and do further manipulation on the data or for someone who gets it emailed every day/month to casually look at the figures for any deviations from the norm. However, as it was the case with SSIS in POST 4, there is much more you can do in order to make it more functional, easy on the eye and interactive. SSRS allows the concept of parameters which you can use to slice/filter data before it gets outputted onto the report. There is a large number of components available (either default ones from the Visual Studio project toolbar or third party) which can really bring your reports to life so extensibility is not a problem. Below is a sample image of the same report with a few extra components added to make it look more dashboard-like (click on the image to enlarge it). This took only a few minutes to put together but the result is a report which is more visually appealing and a little bit more interactive (I included one parameter as a drop-down list with multiple values allowed to be selected in the top, left corner which permits end-users to filter the data based on the StoreName attribute).

Design_changes_SSRS_HTDM

All additional components in this report are included in SSRS (2012 version) by default so you can really come a long way before running out of customisation options with just out-of-the-box functionality e.g. gauges, Bing maps, sparklines and of course .Net extensibility. If you want something more interactive e.g. more analytical reporting with real-time, dynamically driven interface, there is lots of options out there. Excel should not disappoint if you already have a solid investment in Microsoft stack. Version 2013, with the inclusion of PowerView, PowerPivot, PowerMap etc. is a terrific option for self-service, dynamic data analysis. This, coupled with SharePoint deployment and SQL Server database(s) should provide an enterprise ready solution for most of your needs. Otherwise, vendors such as Tableau, ClickView or SAP with its Lumira are pretty good but the price may prove to be prohibitive for many.

This concludes this post and this series (eight posts altogether). If you stumble upon this or any other post from the ‘How To Build A Data Mart’ collection on my blog don’t hesitate to leave me a comment – any feedback is appreciated, good or bad! As usual, solution files for this SSRS project as well as all other posts for this series can be found and downloaded from HERE.

Please also check other posts from this series:

Tags: ,