Implementing SCD Type 4 in Azure SQL Database: A Step-by-Step Guide

Introduction

In my previous article , I explained the five types of Slowly Changing Dimensions (SCDs)—Type 1, Type 2, Type 3, Type 4, and Type 6—showing how data looks before and after applying each type.

In this article, I will focus on implementing SCD Type 4 in Azure SQL Database.

Link to my previous article - https://www.csharp.com/article/handling-slowly-changing-dimensions-scd-in-azure-data-warehouse/

What is SCD Type 4?

SCD (Slowly Changing Dimension) is a technique used to track historical data changes. SCD Type 4 involves maintaining a current data table while storing historical data in a separate history table. This approach ensures the operational table remains lightweight, while history is preserved for archival and auditing purposes.

Real-World Use Cases
 

1. Customer Profile History in Banking

  • Scenario: A bank maintains a current customer profile table but needs to track historical address and employment changes.
  • Implementation
    • A current table stores only the latest customer details.
    • A history table logs all changes with timestamps.
  • Why SCD Type 4?
    • Reduces the size of the operational table.
    • Allows fast access to current data while keeping history separately.

2. Product Pricing in E-Commerce

  • Scenario: An e-commerce platform tracks current product prices but must also store historical pricing for analytics and compliance.
  • Implementation
    • A current table holds only the latest price.
    • A history table logs previous prices with timestamps.
  • Why SCD Type 4?
    • Enables fast lookups for real-time pricing.
    • Historical data is preserved for reporting.

SCD Type 4 Implementation in Azure SQL Database

Schema Design

For this demo, I am implementing customer details tracking in Azure SQL Database.

I created three schemas.

  • Staging: Stores raw data before processing.
  • Dimension: Holds only the latest records.
  • History: Stores historical data separately.
CREATE SCHEMA [Staging];
CREATE SCHEMA [Dimension];
CREATE SCHEMA [History];

Tables & Stored Procedure

I created three tables and a stored procedure to implement SCD Type 4 logic.

1. Staging Table ([Staging].[Customer])

Stores incoming customer data before processing.

CREATE TABLE [Staging].[Customer] (
    [CustomerID]                [int]                 NOT NULL,
    [CustomerName]              [varchar](255)            NULL,
    [Address]                   [varchar](255)            NULL,
)

2. Dimension Table ([Dimension].[Customer])

Contains only current customer records.

CREATE TABLE [Dimension].[Customer] (
    [CustomerID]                [int]                 NOT NULL,
    [CustomerName]              [varchar](255)            NULL,
    [Address]                   [varchar](255)            NULL,
    [IsDeleted]                 [bit]                     NULL,
    [InsertedBy]                [varchar](128)        NOT NULL,
    [InsertDate]                [datetime]            NOT NULL,
    [UpdatedBy]                 [varchar](128)        NOT NULL,
    [UpdatedDate]               [datetime]            NOT NULL
)

3. History Table ([History].[Customer])

Stores all historical customer data.

CREATE TABLE [History].[Customer] (
    [CustomerID]                [int]                 NOT NULL,
    [CustomerName]              [varchar](255)            NULL,
    [Address]                   [varchar](255)            NULL,
    [IsDeleted]                 [bit]                     NULL,
    [StartDate]                 [datetime]            NOT NULL,
    [EndDate]                   [datetime]            NOT NULL,
    [InsertedBy]                [varchar](128)        NOT NULL,
    [UpdatedBy]                 [varchar](128)        NOT NULL,
    [UpdatedDate]               [datetime]            NOT NULL
)

4. Stored Procedure ([Sproc].[Customer])

A stored procedure is a precompiled collection of SQL statements that can be executed as a single unit. It is commonly used for data processing, automation, and performance optimization in databases.

In this implementation, the stored procedure has SCD Type 4 logic.

  • Inserts new records into the Dimension table if they don’t exist.
  • Moves updated or deleted records to the History table while maintaining only the latest data in the Dimension table.
CREATE PROC [Sproc].[Customer] AS
BEGIN
    -- Local variables declaration
    DECLARE @CurrentDate DATETIME = GETDATE();
    DECLARE @CurrentUser VARCHAR(128) = SYSTEM_USER;

    -- Create a new table 'DeleteCTAS' and populate it with records from 'Dimension.Customer' 
    -- that are NOT present in 'Staging.Customer'.
    SELECT
        Cust.[CustomerID],    
        Cust.[CustomerName],  
        Cust.[Address],       
        Cust.[IsDeleted],    
        Cust.[InsertedBy],     
        Cust.[InsertDate],     
        Cust.[UpdatedBy],    
        Cust.[UpdatedDate]    
    INTO DeleteCTAS
    FROM [Dimension].[Customer] Cust
    WHERE NOT EXISTS (
        SELECT 1 FROM [Staging].[Customer] Stg 
        WHERE Cust.[CustomerID] = Stg.[CustomerID]
    );

    -- Create HistoryCTAS table with updated records from Dimension.Customer
    SELECT
        Cust.[CustomerID],    
        Cust.[CustomerName],  
        Cust.[Address],       
        Cust.[IsDeleted],    
        Cust.[InsertedBy],     
        Cust.[InsertDate]  
    INTO HistoryCTAS
    FROM [Dimension].[Customer] Cust
    INNER JOIN [Staging].[Customer] Stg
    ON Cust.[CustomerID] = Stg.[CustomerID] 
    EXCEPT
    SELECT
        Stg.[CustomerID],    
        Stg.[CustomerName],  
        Stg.[Address], 
        0 AS [IsDeleted],
        Cust.[InsertedBy],
        Cust.[InsertDate]
    FROM [Staging].[Customer] Stg
    INNER JOIN [Dimension].[Customer] Cust
    ON Cust.[CustomerID] = Stg.[CustomerID];
 
    -- Create DimensionCTAS table combining Staging and DeleteCTAS tables
    SELECT
        Stg.[CustomerID],    
        Stg.[CustomerName],  
        Stg.[Address], 
        0 AS [IsDeleted],
        ISNULL(T.[InsertedBy], @CurrentUser) AS [InsertedBy],             
        ISNULL(T.[InsertDate], @CurrentDate) AS [InsertDate],             
        ISNULL(T.[UpdatedBy], @CurrentUser) AS [UpdatedBy],             
        ISNULL(T.[UpdatedDate], @CurrentDate) AS [UpdatedDate]
    INTO DimensionCTAS
    FROM [Staging].[Customer] Stg  
    LEFT JOIN (
        SELECT [CustomerID], [InsertedBy], [InsertDate], [UpdatedBy], [UpdatedDate] 
        FROM [Dimension].[Customer] Cust
        WHERE NOT EXISTS (
            SELECT 1 FROM HistoryCTAS Hist 
            WHERE Hist.[CustomerID] = Cust.[CustomerID]
        )
    ) T
    ON T.[CustomerID] = Stg.[CustomerID]
    UNION
    SELECT
        [CustomerID],   
        [CustomerName], 
        [Address],      
        1 AS [IsDeleted],
        CASE WHEN IsDeleted = 1 THEN [InsertedBy] ELSE @CurrentUser END AS [InsertedBy],
        CASE WHEN IsDeleted = 1 THEN [InsertDate] ELSE @CurrentDate END AS [InsertDate],
        CASE WHEN IsDeleted = 1 THEN [UpdatedBy] ELSE @CurrentUser END AS [UpdatedBy],
        CASE WHEN IsDeleted = 1 THEN [UpdatedDate] ELSE @CurrentDate END AS [UpdatedDate]
    FROM DeleteCTAS;

    BEGIN TRANSACTION;

    -- Delete records from the Dimension table
    DELETE FROM [Dimension].[Customer];

    -- Insert into Dimension table from DimensionCTAS
    INSERT INTO [Dimension].[Customer] (
        [CustomerID],    
        [CustomerName],  
        [Address],       
        [IsDeleted],    
        [InsertedBy],     
        [InsertDate],     
        [UpdatedBy],    
        [UpdatedDate]        
    )
    SELECT 
        [CustomerID],    
        [CustomerName],  
        [Address],       
        [IsDeleted],    
        [InsertedBy],     
        [InsertDate],     
        [UpdatedBy],    
        [UpdatedDate]  
    FROM DimensionCTAS;
    
    -- Insert into History table from HistoryCTAS and DeleteCTAS
    INSERT INTO [History].[Customer] (
        [CustomerID],       
        [CustomerName],     
        [Address],                    
        [IsDeleted], 
        [StartDate],        
        [EndDate],       
        [InsertedBy],        
        [UpdatedBy],       
        [UpdatedDate]                
    )
    SELECT
        [CustomerID],       
        [CustomerName],     
        [Address],                    
        [IsDeleted],
        InsertDate AS [StartDate],
        @CurrentDate AS [EndDate],
        InsertedBy AS [InsertedBy],
        @CurrentUser AS [UpdatedBy],
        @CurrentDate AS [UpdatedDate]
    FROM HistoryCTAS    
    UNION
    SELECT
        [CustomerID],       
        [CustomerName],     
        [Address],   
        IsDeleted,
        InsertDate AS [StartDate],
        @CurrentDate AS [EndDate],
        InsertedBy AS [InsertedBy],
        @CurrentUser AS [UpdatedBy],
        @CurrentDate AS [UpdatedDate]
    FROM DeleteCTAS
    WHERE IsDeleted = 0;

    COMMIT TRANSACTION;

    -- Drop CTAS Tables
    DROP TABLE DeleteCTAS;
    DROP TABLE DimensionCTAS;
    DROP TABLE HistoryCTAS;
END

Note. The stored procedure does not include error handling (e.g., duplicate keys), which can be added later.

Below query to list the entities.

select * from INFORMATION_SCHEMA.TABLES;
select * from INFORMATION_SCHEMA.ROUTINES;

Schema table

Execution Scenarios & Outcomes

Scenario 1. New Data Load

Staging Data

INSERT INTO [Staging].[Customer] ([CustomerID], [CustomerName], [Address]) 
VALUES 
    (1, 'John Doe', '123 Main St, New York, NY'), 
    (2, 'Jane Smith', '456 Elm St, Los Angeles, CA'), 
    (3, 'Michael Johnson', '789 Oak St, Chicago, IL'), 
    (4, 'Emily Davis', '101 Pine St, Houston, TX'), 
    (5, 'Robert Brown', '202 Maple St, San Francisco, CA');

Staging Data

SP Execution

EXEC [Sproc].[Customer];

First Load

Outcome

Since the Dimension & History tables are empty, all records from Staging are inserted into the Dimension table.

Dimension table

Scenario 2. Updating & Deleting Records

Staging Data

  • Delete CustomerID = 5
  • Update CustomerID = 1 (Address changed to 'India')

Execution

Execution

Outcome

  • Dimension Table
    • CustomerID = 5 is marked as IsDeleted = 1 instead of physical deletion.
    • CustomerID = 1 has its address updated.
      2DimValidation
  • History Table
    • CustomerID = 1 with previous address is stored with StartDate & EndDate.
    • CustomerID = 5 is recorded with IsDeleted = 0 with its timestamps period on when it was active.
      2HistoryValidation

Scenario 3. Reinsert Deleted Record & Insert New Record

Staging Data Update

  • Reinsert CustomerID = 5
  • Insert CustomerID = 6 (New customer: John Wick, New York)
    Insert

Execution

Load

Outcome

  • Dimension Table
    • New customer (John Wick) is inserted.
    • CustomerID = 5 (Robert Brown) is reactivated (IsDeleted = 0).
      3Validation
  • History Table: Tracks Robert Brown's deletion and reactivation timestamps.
    Robert

Conclusion

In this article, I demonstrated the SCD Type 4 implementation in Azure SQL Database using a staging layer, dimension table, and history table.

I executed a stored procedure manually to cover various scenarios.

  • New inserts
  • Updates
  • Deletions
  • Reinsertions

For any other scenarios or improvements, please leave a comment!

In the next article, I will cover,

  • Creating an Azure SQL Database
  • Automating the stored procedure using ADF

Stay tuned!

Up Next
    Ebook Download
    View all
    Learn
    View all