top of page

Week Five: Part 2 - Working Smart with Very Large Tables in SQL Server 2022

  • May 30, 2025
  • 9 min read

Updated: Jun 1, 2025

Not too long after we reconfigured SQL Server's memory, we expected to see the pressure drop. We did see some improvement, fewer flushes, slightly better Page Life Expectancy and a bit less IO thrashing. But the underlying issue didn’t fully go away.

PAGEIOLATCH_xx waits continued to show up. It wasn’t as constant, but it was still there.. Lurking, flaring up whenever the workload spiked.


Before committing to a bigger instance, I didn’t want to just throw hardware at the problem blindly. So I went back and reviewed all the reads, inserts and updates involving the largest tables. I looked at query structures, examined execution plans and tried to spot patterns or inefficiencies. Indexes? Honestly, not practical at this scale. Maintaining them adds overhead that often outweighs the benefit when you're dealing with very large, heavily written tables. But even without touching the indexes, just by understanding how those tables were used and trimming unnecessary overhead in some of the queries, I managed to squeeze out a little optimization. More importantly, it gave me a baseline, a clearer picture of what really needed tuning in those critical tables.


At that point, we took another step: upgrading our instance from AWS EBS r5dn.2xlarge (8 vCPU, 64 GB RAM) to r5dn.4xlarge (16 vCPU, 128 GB RAM). Bigger machine, more resources, better IOPS. It helped, YES. but still, not a fix. Something else was holding things back. I dug deeper and followed Microsoft’s own documented suggestions for resolving PAGELATCH_EX contention.


Out of all the proposed solutions, only two really made sense for our case:


This is a feature added in SQL Server 2019 and carried over into 2022. It's a simple index-level option designed to reduce contention on the last page of a B-tree for indexes that use monotonically increasing keys (e.g., identity columns, datetime values).

When many sessions try to insert new rows into the tail of a clustered index, they often target the same page, the infamous "hot spot." OPTIMIZE_FOR_SEQUENTIAL_KEY introduces a small random delay (based on contention level) to help stagger those insert attempts. Turning on OPTIMIZE_FOR_SEQUENTIAL_KEY doesn’t require you to rebuild an index. That’s cool, because a whole lot of changes to indexes require you to rebuild them.


ALTER INDEX PK_Table1 
    ON dbo.Table1 SET(OPTIMIZE_FOR_SEQUENTIAL_KEY  = ON);

In practice, this reduces blocking and smooths out concurrent inserts.

Just a side note:

When using this option, you may not see a decrease in PAGELATCH waits, in fact you may even see an increase in waits with a new wait type called BTREE_INSERT_FLOW_CONTROL

I applied this setting to all our busiest tables. It helped.. Less blocking, more stability. But still, not enough.


Method 6: Table Partitioning with Computed Column and Hybrid Data Compression

Partitioning can be a bit of a puzzle to get right. But in cases like ours (high insert volume, latch contention and IO bottlenecks) it becomes a powerful tool.


In this setup, I moved the table to another database and designed it from the ground up with scalability in mind. This time, the database had more data files to help with IO distribution. I added a persisted computed column (CreatedDate AS (CONVERT([date],[CreatedTime])) PERSISTED NOT NULL to act as the partitioning key. The goal was to slice the table by day.


The structure of this new table wasn’t random. I based it on the baseline I had built from the old table. That review of queries, usage patterns and access paths gave me a starting point. With the help of Brent Ozar’s sp_BlitzIndex, I was able to validate what mattered and what didn’t. I ran all key reads, inserts and updates in a Staging environment (mirroring Production as closely as possible) and measured execution time, STATISTICS IO and execution plans.


That testing gave me confidence. It wasn’t just about moving to a newer table, it was about making sure the overall performance was actually better, not just different.


Here’s a trimmed-down version of how the table was defined:

CREATE TABLE [dbo].[Table1](
	[ID] [uniqueidentifier] ROWGUIDCOL  NOT NULL,
	... NOT NULL,
	[CreatedDate]  AS (CONVERT([date],[CreatedTime])) PERSISTED NOT NULL,
	...,
 CONSTRAINT [PK_Table1] PRIMARY KEY CLUSTERED 
(
	[ID] ASC,
	[CreatedDate] ASC
)WITH (FILLFACTOR = 90, OPTIMIZE_FOR_SEQUENTIAL_KEY = ON, DATA_COMPRESSION = ROW ON PARTITIONS (1, 60 TO 91),
DATA_COMPRESSION = PAGE ON PARTITIONS (2 TO 59)) ON [PS_BYDAY90]([CreatedDate])
) ON [PS_BYDAY90]([CreatedDate])
Why this helps:
  • Insert load is spread out. By partitioning the table on CreatedDate, new rows go into separate partitions, which reduces the chance that multiple sessions will hit the same data page at once.

  • Each partition has its own filegroup. That spreads IO across different physical files, lightening the pressure on any one drive.

  • Less index fragmentation. Because inserts are isolated to their own partitions, the kind of page splits we were seeing before basically disappear.


A note on compression strategy:

The table uses hybrid compression: ROW compression on some partitions and PAGE compression on others. There’s a reason for this mix.

  • Newer partitions (which take the brunt of inserts) stay on ROW compression. It's lighter on CPU and faster for write-heavy activity.

  • Older partitions, which aren’t being modified anymore, get PAGE compression. It squeezes the data more aggressively, saving space and improving scan performance.


This strikes a good balance. We avoid the CPU overhead that comes with compressing “hot” data, while still reclaiming space and improving read performance on colder partitions.


Before applying any compression, I wanted to estimate the space savings. SQL Server provides a built-in procedure (sp_estimate_data_compression_savings) but it can blow up tempdb pretty quickly when run against large tables. So instead, I used Eitan Blumin’s Ultimate Compression Savings Estimation Script. It samples data intelligently and avoids the tempdb pressure, giving reliable estimates without putting the server at risk.

This approach helped me decide exactly which partitions to compress and how, without guessing or making assumptions.


Dynamically Handling Partition Function and Scheme

Manually updating partition boundaries every few days isn’t practical, especially when you're working with time-based data like logs or events. So I built a dynamic approach.

--Dynamic script to create Partition Function: last 90 days + 1 for pre-history + 1 for tomorrow (total 92 boundaries = 91 partitions)

DECLARE @DatesTable TABLE (DateValue DATE NOT NULL);

INSERT INTO @DatesTable (DateValue)
SELECT TOP (92)
    CAST(DATEADD(DAY, 1 - ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), SYSDATETIME()) AS DATE)
FROM sys.all_columns;

-- Build dynamic SQL for creating the partition function
DECLARE @Statement NVARCHAR(MAX) = N'
CREATE PARTITION FUNCTION PF_BYDAY90 (DATE)
AS RANGE RIGHT FOR VALUES
(
';
SELECT @Statement += N'''' + CONVERT(NCHAR(10), DateValue, 120) + N''', '
FROM @DatesTable
ORDER BY DateValue;
SET @Statement = LEFT(@Statement, LEN(@Statement) - 2) + N'
);
';

-- Execute the dynamic SQL
EXEC sys.sp_executesql @Statement;

For the partition function, I generate 92 boundaries, one for each of the last 90 days, plus one for a “pre-history” partition and one for tomorrow. That gives me 91 partitions total, covering both active and edge-case scenarios.


The script uses a simple ROW_NUMBER() trick to build the date list relative to today and then constructs the CREATE PARTITION FUNCTION statement dynamically. This way, boundaries shift automatically with time and the structure always stays current.


--Dynamic script to create Partition Scheme: Based on the number of filegroups and defined Partition Function

DECLARE @sql NVARCHAR(MAX) = '';
DECLARE @partitionCount INT = 91; -- Number of partitions = number of ranges in PF_BYDAY90
DECLARE @filegroupCount INT = 12;
DECLARE @i INT = 1;

WHILE @i <= @partitionCount
BEGIN
    DECLARE @fgIndex INT = CASE
                           WHEN @i = 1 THEN 1 -- First partition to PRIMARY
                           ELSE ((@i - 2) % (@filegroupCount - 1)) + 2 -- FG_02 to FG_12
                           END;

    SET @sql += CASE 
                    WHEN @i > 1 THEN ', ' 
                    ELSE '' 
                END 
                + CASE 
                    WHEN @fgIndex = 1 THEN '[PRIMARY]' 
                    ELSE '[FG_' + RIGHT('0' + CAST(@fgIndex AS VARCHAR), 2) + ']' 
                  END;
    SET @i += 1;
END

DECLARE @partitionSchemeScript NVARCHAR(MAX) = '
CREATE PARTITION SCHEME PS_BYDAY90
AS
PARTITION PF_BYDAY90
TO (' + @sql + ');';

-- Review before running
PRINT @partitionSchemeScript;

-- Uncomment to execute
--EXEC sp_executesql @partitionSchemeScript;

The partition scheme is also built dynamically, using a round-robin approach to distribute partitions across 12 filegroups. The first partition goes to PRIMARY, and the rest rotate evenly across FG_02 to FG_12. This spreads the I/O load more evenly and keeps partition management clean. I keep these scripts flexible so that adding new boundaries or shifting date ranges doesn’t require manual intervention. You define the logic once, and let it roll forward with time.


It’s one of those things that takes a bit of effort up front but saves you from a lot of pain later.

Side note: Another advantage of using partitioned filegroups is tiered storage: older, rarely accessed partitions can be moved to cold storage, while recent, high-traffic partitions stay on hot storage. It’s a quiet but effective optimization that helps balance performance and cost over time.

Implementing SQL Server Partitioning Sliding Window

Partitioning really shines when you start managing rolling windows of time-based data. Instead of relying on DELETE FROM with a date filter (which is slow, generates a ton of transaction log activity and resource intensive) you can just truncate a single partition. It's cleaner, faster and far less invasive.

To make this sliding window logic repeatable and hands-off, I encapsulated everything into stored procedures and scheduled them in SQL Agent job. Once set up, it basically runs itself.

1. PartitionSplitManagement – Prepping for the Next Day
CREATE PROCEDURE [dbo].[PartitionSplitManagement]
AS
BEGIN
    DECLARE @NextFilegroupName NVARCHAR(100)
        , @ErrorMessage NVARCHAR(MAX);
    
	BEGIN TRY
        -- Get the list of available filegroups in random order
        DECLARE @Filegroups TABLE (FilegroupName NVARCHAR(100));

        INSERT INTO @Filegroups (FilegroupName)
        SELECT name
        FROM sys.filegroups
        WHERE type = 'FG';

        -- Select the next filegroup randomly for PF_BYDAY90
        SELECT TOP 1 @NextFilegroupName = FilegroupName
        FROM @Filegroups
        ORDER BY NEWID();

        -- Dynamic SQL to alter partition scheme for PF_BYDAY90
        DECLARE @AlterPartitionScheme90Days NVARCHAR(MAX);

        SET @AlterPartitionScheme90Days = 'ALTER PARTITION SCHEME PS_BYDAY90 NEXT USED [' + @NextFilegroupName + '];';

        EXEC sp_executesql @AlterPartitionScheme90Days;

        -- Dynamic SQL to alter partition function and split range for PF_BYDAY90
        DECLARE @NextBoundary90 AS DATE;

        SET @NextBoundary90 = DATEADD(DAY, 1, (
                    SELECT MAX(CAST(prv.[value] AS DATE))
                    FROM sys.partition_range_values AS prv
                    INNER JOIN sys.partition_functions AS pf
                        ON prv.function_id = pf.function_id
                    WHERE pf.[name] = 'PF_BYDAY90'
                    ));

        DECLARE @AlterPartitionFunction90Days NVARCHAR(MAX);

        SET @AlterPartitionFunction90Days = 'ALTER PARTITION FUNCTION PF_BYDAY90() SPLIT RANGE (''' + CONVERT(NVARCHAR, @NextBoundary90, 120) + ''');';

        EXEC sp_executesql @AlterPartitionFunction90Days;

    END TRY
    BEGIN CATCH
        SET @ErrorMessage = ERROR_MESSAGE();

        RAISERROR (
                @ErrorMessage
                , 16
                , 1
                );
    END CATCH;
END;

This stored procedure handles two things:

  • Assigns a filegroup for the next partition dynamically. It picks from the existing filegroups using a random selection (for demonstration—though you can plug in your own logic if you prefer round-robin or weighted options).

  • Splits the partition function at the next day’s boundary. This expands the table’s range by one more day, keeping it aligned with current time.

This forward-looking step ensures the table always has a place ready for tomorrow’s data. It’s like unrolling the carpet just before someone walks in.


2. TRUNCATE TABLE ... WITH (PARTITIONS (2)) – Fast Cleanup
TRUNCATE TABLE [dbo].[Table1] WITH (PARTITIONS (2));

Starting with SQL Server 2016, you can truncate a specific partition directly. No WHERE clauses, no logging mess. Just clean and instant removal of stale data.

This is what makes partitioning a game-changer for log or event-based data. If partition 2 holds data from 90+ days ago, you can purge it in milliseconds without touching the rest of the table.


3. PartitionMergeManagement – Closing Out the Old
CREATE PROCEDURE [dbo].[PartitionMergeManagement]
AS
BEGIN
    DECLARE @CurrentFirstBoundary AS DATE;

    -- Merge the range for 'PF_BYDAY90'
    SELECT @CurrentFirstBoundary = MIN(CAST(prv.[value] AS DATE))
    FROM sys.partition_range_values AS prv
    INNER JOIN sys.partition_functions AS pf
        ON prv.function_id = pf.function_id
    WHERE pf.name = N'PF_BYDAY90';

    ALTER PARTITION FUNCTION PF_BYDAY90 ()
    MERGE RANGE (@CurrentFirstBoundary);

END;

After truncating the oldest partition, you don’t want to leave that empty shell lying around. This procedure merges the first boundary, effectively removing the oldest (now empty) partition from the structure. This keeps your partition count in check and ensures storage isn’t wasted managing partitions that no longer serve a purpose.


Once all of this is wrapped into scheduled job (split, truncate and merge) it just runs. No manual intervention, no bloated indexes, no slow deletes. It’s the kind of setup that quietly saves you hours every week and keeps your data tier lean and responsive.


PARTITIONING MAGIC!

Throughput issues (last page insert contention) has significantly decreased after the table swap. Happy days!


Partitioning: A More Permanent Fix

To really push past this bottleneck, I had to rethink the way data was stored. Partitioning really helped:

  • Spread data out across multiple partitions and filegroups

  • Minimize blocking due to reduced latch contention

  • Improve throughput since partitions can be assigned to separate physical storage

  • Virtually eliminate index fragmentation in new partitions


A Few Bonus Tips!

To squeeze even more out of partitioned tables, there are a few settings and practices worth enabling:

  • Auto Create Incremental Statistics: This allows SQL Server to generate stats per partition, making large tables more responsive without the overhead of full-table stats.

  • Auto Update Statistics Asynchronously: Keeps the query plan from waiting on stats updates, especially useful during ETL or frequent inserts.

    • ASYNC_STATS_UPDATE_WAIT_AT_LOW_PRIORITY: Set this at the database level to reduce blocking even further during stats updates:

ALTER DATABASE SCOPED CONFIGURATION SET ASYNC_STATS_UPDATE_WAIT_AT_LOW_PRIORITY = ON;
  • Filtered Statistics: If certain partitions or values are queried more than others, filtered stats can help SQL Server make smarter decisions.

Partitioning is already powerful, but paired with these settings, it’s much easier to maintain both performance and predictability as the data grows.


Comments


Leave a Reply

Your email address will not be published. Required fields are marked *

© 2025 by Renz Bagasbas. All rights reserved.

bottom of page