Troubleshooting SQL Server with High Memory Grants
Very often, specially on mixed workload enviroment where server deals with oltp & olap queries together, concurrent sessions allocated with High Memory Grants cause entire SQL Server workload to suffer.
To detect such a situation, I like to watch my critical metrics dashboards. For example, dashboard Monitoring – Live -All Servers of free SQLMonitor tool highlights all these metrics.
Once I know my server is having issues from Monitoring – Live -All Servers, I check to see individual & historical metrics for the server on Monitoring – Live – Distributed dashboard. This dashboard is specifically designed for all troubleshooting metrics in a single place for one server.
Once I know that a server is facing memory pressure due to some queries granted with high memory, the next step I take is to find out those sessions, and if allowed kill them. The unseen problem at this point is, even DBA diagnostics queries also might now work if asking for some significant memory grant.
So in this scenario, we need a diagnostic query, which is lightweight in terms of memory footprint. Below is the query I wrote to get the sessions/queries consuming high memory –
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
declare @start_time datetime2 = sysdatetime(); declare @my_login varchar(100) = ORIGINAL_LOGIN(); declare @kill_string nvarchar(200); declare @memory_threshold_mb int = 500; declare @include_open_tran bit = 1; declare @sql nvarchar(max); declare @params nvarchar(2000); set @params = N'@memory_threshold_mb int, @include_open_tran bit, @my_login varchar(100)'; set quoted_identifier off; set @sql = " select [memory_gb] = der.granted_query_memory*8.0/1024/1024, [--kill query--] = 'kill '+convert(varchar,der.session_id)+char(10) ,[--find-query--] = 'dbcc inputbuffer('+convert(varchar,der.session_id)+')' ,[session_tran_count] = des.open_transaction_count ,[request_tran_count] = der.open_transaction_count ,[elapsed_time] = convert(varchar,getdate()-der.start_time,108) ,der.session_id, der.status, der.command ,[db_name] = db_name(der.database_id) ,der.blocking_session_id, der.wait_type, [wait_time] = dateadd(ms,wait_time,'1900-01-01 00:00:00.000') ,der.wait_resource, der.percent_complete, der.cpu_time, der.total_elapsed_time ,der.logical_reads, der.writes, der.row_count, query_hash /* , dop, parallel_worker_count */ from sys.dm_exec_requests der join sys.dm_exec_sessions des on des.session_id = der.session_id where der.granted_query_memory >= (@memory_threshold_mb*1024/8) and ( @include_open_tran = 1 or (der.open_transaction_count = 0 and des.open_transaction_count = 0) ) --and des.login_name <> @my_login order by granted_query_memory desc; " set quoted_identifier on; exec sp_executesql @sql, @params, @memory_threshold_mb, @include_open_tran, @my_login; |
From the above result, usually whatever session/db_name/login/program I am interested in debugging, I can get deeper. But at the moment since the server is suffering because of memory pressure, I prefer to kill the top memory consumers after the consent of the Server Owner. This is the point where the below kill query is used –
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 |
-- Parameters declare @include_open_tran bit = 0, @skip_my_login bit = 0, @debug bit = 0, @memory_threshold_mb int = 500; -- Variables declare @start_time datetime2 = sysdatetime(), @my_login varchar(100) = ORIGINAL_LOGIN(), @kill_string nvarchar(200), @session_id int, @memory_grant_mb int; --while (dateadd(MINUTE,30,@start_time) >= SYSDATETIME()) while (dateadd(MINUTE,30,@start_time) >= SYSDATETIME() and (select cntr_value from sys.dm_os_performance_counters pc where pc.counter_name like 'Memory Grants Pending%') > 0 ) begin declare cur_connections cursor static local for select [--kill query--] = 'kill '+convert(varchar,der.session_id), der.session_id, [memory_grant_mb] = der.granted_query_memory*8/1024 from sys.dm_exec_requests der join sys.dm_exec_sessions des on des.session_id = der.session_id where der.granted_query_memory >= (@memory_threshold_mb*1024/8) and ( @include_open_tran = 1 or (der.open_transaction_count = 0 and des.open_transaction_count = 0) ) and (@skip_my_login = 0 or (@skip_my_login = 1 and des.login_name <> @my_login)) order by granted_query_memory desc; open cur_connections; fetch next from cur_connections into @kill_string, @session_id, @memory_grant_mb; while @@FETCH_STATUS = 0 begin begin try if @debug = 0 begin print @kill_string exec (@kill_string); end else begin select [@session_id] = @session_id, [@memory_grant_mb] = case when @memory_grant_mb >= 1024 then convert(varchar,convert(numeric(20,1),@memory_grant_mb*1.0/1024))+' gb' else convert(varchar,@memory_grant_mb)+' mb' end exec sp_WhoIsActive @filter = @session_id, @get_outer_command = 1, @get_plans = 1; end end try begin catch end catch fetch next from cur_connections into @kill_string, @session_id, @memory_grant_mb; end close cur_connections deallocate cur_connections if @debug = 1 waitfor delay '00:01:00' else waitfor delay '00:00:02' end go |
In above query, @memory_threshold_mb can be used to control the Memory Grant threshold for query. With @debug = 1, we don’t kill the memory consumer sessions, rather just display the session details using sp_WhoIsActive. Below is a sample result with @debug = 1, With @include_open_tran = 1, we kill even the sessions having an open transaction in them if they happen to be memory consumers above the threshold.
Above appropriate parameter values, we can find & kill the top memory consumers using the above script and release the server from memory pressure.
Once the memory pressure is gone, we need to provide RCA. For this purpose, in other circumstances, I fetch captured history from sp_WhoIsActive capturing that we do as part of our free & open source SQLMonitor tool. But when memory pressure is to such a scale where queries get stuck with RESOURCE_SEMAPHORE waits, even WhoIsActive capturing might get stuck because of memory issues.
So at this point, I go back to my next favorite tool sp_BlitzCache. I use the below query –
1 |
EXEC master..sp_BlitzCache @Top = 30, @SortOrder = 'memory grant' -- Queries using maximum memory grant |
Sample result on my lab server having memory pressure –
I hope this will be helpful to DBAs dealing with Memory Pressure on server due to bursty memory hungry workload. If you like the article, do like and share.
Nice Article Ajay. Well done