Performance Impact from Page Splits
Recently I was brought in to help troubleshoot performance issues on a database server. Going through my typical check list I noticed memory pressure on the box and made the recommendationof more memory which was simple to fix. What caught my attention though was that blocking was occurring on a transaction that was trying to do a simple delete of a few records.
In a discussion with the vendor I learned that this system does massive inserts and deletes all day long. This system had only been in place for a few months and one table was nearly 30 GB in size. This was the particular table that was being blocked with a delete of a few records. We use a standard index maintenance script that reorg’s or rebuilds based on fragmentation so I was able to quickly rule out fragmentation as the issue however I still felt like I should dig into the indexes to see what was happening.
In looking at the indexes I found a clustered index that was the primary key nvarchar(255) and 13 non clustered indexes all with the default fill factor of 100%. In digging into looking at the number of page splits per second on this instance they were in the several hundred. Taking a look at the top 10 indexes with page splits, this particular table was 8 of the 10. With this information in hand I went back to the vendor. They basically told me that this level of trouble shooting and tuning was beyond their knowledge but that what I stated sounds good and to implement the changes. All I asked them was if they had an issue with me rebuilding the index and dropping the fill factor rate. 🙂
In dropping the fill factor from 100% to 80%, I decreased the page splits drastically. For the overall instance I am no longer in the 100’s per second, they are still active in the 20+ range and for my top 10 indexes this particular table only has one entry. With that being said, I still have more work to do, however this system is in much better shape. The table grew quite a bit but the tradeoff is well worth it.
8 Comments
Great post, Tim. Out of curiosity – if the box was under memory pressure, and you lowered fill factor, did you also get the additional memory installed? If not, did you notice the impact of lower fill factor on the memory metrics? You’re caching empty space on each page now, so effectively caching even less data.
Thanks for the comment Brent. I first increased the memory on the server and things drastically improved performance wise for the users, however I was still noticing the extremely high page splits. Knowing the table was very volatile with inserts and deletes I knew that a fill factor of 100% was not optimal. That is when I started working on the indexes to lower the fill factor a bit on the ones experiencing the most page splits. I haven’t noticed any negative impact on memory since adjusting the fill factor but have noticed a drastic decrease in splits.
I would love any pointers on additional things to check out and any lessons learned you may have from similiar issues.
Sure. The general starter metric for memory is Page Life Expectancy – the number of seconds a page stays in memory after it’s been read from storage. It’s a tough metric to read since it spikes all over the place, so you end up trending it. If pages are now 20% blank and we were already under memory pressure, I’d expect to see a drop in PLE. (If the database can fit completely in memory, then I wouldn’t expect to see a drop.)
A more detailed metric would be to query sys.dm_os_buffer_descriptors. It’ll give you the database_id, plus how many pages are in memory for that database, and how much free_space_in_bytes is stored on each page. Now 20% of each page will be blank, so we’re caching less data per page. The number of pages per database probably won’t change (unless we’re tweaking one database alone on an instance while the rest remain at 100% fill factor), but you can use these numbers to extrapolate how much data is actually getting cached per database. The queries in the BOL page for this DMV are a good start: http://msdn.microsoft.com/en-us/library/ms173442.aspx
About lessons learned – when you’re under memory pressure, lowering fill factor is one of the worst things you can do. It means caching even less data, and putting more pressure on storage. If you’re seeing symptoms of slow page splits and low memory metrics, then I’d start by stuffing the server with memory. The page splits might be a problem because SQL Server doesn’t have the necessary pages in cache that it needs to update, so it has to fetch them off disk – and the real root cause might just be not enough memory.
Thanks, sounds like I did good on this issue then. When I first got on the server and did my health check, PLE was 5 minutes and would drop down to seconds consistently. When I identified the tables that was being hammered we identified that way to much data base being stored. Fortunately this was on a Friday afternoon. That evening we increased the memory by 300% and was able to purge around 20% of the data. The following week things were much better for the consumer however they were still experiencing the occassional issue with blocking on deletes and inserts. PLE had then increased from minutes to averaging two hours with the occasional drop to 30 minutes. After the index rebuild dropping the fill factor from 100 to 80 PLE is still maintaining high values. Matter of fact it is 121 mins or 7313 seconds right now. I will dig into the sys.dm_os_buffer_descriptors and educate myself on that DMV.
Hi Brent I have a doubt in this part, because Im not sure if I catch it right… You said that when you have problems with memory pressure, the worst thing you could have done is lowering the fill factor, is that correct?? Ok, I guess because the life spectancy of the page decreess and that makes more buffer reading after all, but if you change the page file size is not making the same, but in the HD?? what do you recommend in this cases because I guess I had the same issue…
The page file size doesn’t matter because as soon as SQL Server starts swapping to disk, it’s going to be slow. If you have memory problems, you need to add more memory (or do a better job of packing IN the data you need, like via better indexing or a higher fill factor.)
Hi Tim, I don’t know if this is an option for you but have you considered compression (row or page)? If your clustered index is on a nvarchar column this could result in some significant performance gains and better utilise the additional memory you’ve added. It might not necessarily result in a significant reduction in page splits, but at least it will better use the space available on a page! It might then give you the option of increasing the fill factor again and that will help in reducing page splits.
Of course you’d want to test all this before implementing in production! 🙂
Below is a script that you can use to track page splits, using the extended event feature of SQL Server 2008 and later.
Copy all code into a Management Studio window, then create and start the event collection by executing Step 1.
The creation, start and stop events are commented out since you just want to execute the entire window in SSMS. So just select the parts in comments and “Execute” them.
Now, runt the entire window. That will give you what queries that are making pagesplits.
Keep it running for a while and execute/run the script to see what happens.
When you are done, select the code after “Select when done…” and execute. That will stop and remove the counter.
Then all pagesplits will be reported and
CODE –>
/*
— Step 1. Create and start event
CREATE EVENT session pagesplits ON SERVER
ADD EVENT sqlserver.page_split
(ACTION(sqlserver.plan_handle))
ADD TARGET package0.ring_buffer (SET max_memory = 2048)
WITH (max_dispatch_latency = 1 seconds)
GO
ALTER EVENT session pagesplits ON SERVER STATE = START
GO
— Step when done. Stop and delete event
ALTER EVENT session pagesplits ON SERVER STATE = STOP
GO
DROP EVENT session pagesplits ON SERVER
GO
SELECT
CAST(xet.target_data AS XML)
FROM
sys.dm_xe_session_targets xet
JOIN sys.dm_xe_sessions xe
ON (xe.address = xet.event_session_address)
WHERE
xe.name = ‘pagesplits’
*/
DECLARE @x XML
SELECT
@x = CAST(xet.target_data AS XML)
FROM
sys.dm_xe_session_targets xet
JOIN sys.dm_xe_sessions xe
ON (xe.address = xet.event_session_address)
WHERE
xe.name = ‘pagesplits’
;
WITH cte
AS
(
SELECT
*
,COUNT(*) cnt
FROM
(
SELECT
SUBSTRING(CONVERT(NVARCHAR(200) ,e.query(‘text()’)) ,18 ,50) plan_handle
FROM
@x.nodes(‘/RingBufferTarget/event/action[@name=”plan_handle”]/value’) T(e)
) T
GROUP BY
plan_handle
)
SELECT
cte.*
,sql_text.[text]
FROM
cte
CROSS APPLY sys.dm_exec_sql_text(CONVERT(VARBINARY(64) ,cte.plan_handle, 1)) sql_text
ORDER BY cte.cnt DESC