Live Troubleshooting High CPU on SQL Server – Part 1
Very often, we DBAs inherit environments that were not maintained as per best practices by “Previous accidental DBA”. One such poor practice could be application teams having full access on SQL Server hosts. Similarly, there could be other applications running on the same SQL Server host apart from SQL Server.
Since SQL Server is not a cheap application and would cost a lot of money if we need to scale it horizontally. So it is not a good practice to allow any other resource-consuming processes to run alongside with SQL Server.
In this blog, we are going to troubleshoot a high CPU scenario when the CPU issue is caused by non-SQL Server process. Let’s say, I receive a call from the monitoring team where the application team is complaining about a “slow” SQL Server.
If my SQL Server is baselined with the SQLMonitor tool, then I check the Monitoring – Live – All Servers dashboard which displays all the metrics of specific SQL Servers that need DBA help.
Here under the 3rd panel “
All Servers - Health Metrics - Require ATTENTION“, I notice that there is more than 25% gap between OS CPU and
SQL CPU utilization. This is an indicator that there is some process running at OS level consuming significant CPU and leading SQL Server process to starve of CPU.
From here, I open “Monitoring – Live – Distributed” dashboard which helps me check a lot more health metrics for a SQL Server instance. Here also, I see a big gap in CPU b/w OS & SQL.
In SQLMonitor, we capture OS processes that consume some significant bit of resources into table dbo.os_task_list using SQL Agent job ‘(dba) Collect-OSProcesses‘ . Here, we go to row “OS Tasks (Processes) – CPU Utilization“, and explore the panels under it.
In this “CPU Time – Histogram” panel, we can see that data capturing of every 2 minutes, and there are 2 non-SQL Server processes consuming about 4.84 (3.40+1.43) minutes of cpu time in an interval of 2 minutes. So now I know what processes are causing this high CPU issue. To dig deeper, I can query the table
dbo.os_task_list for time interval of issue.
declare @start_time_utc datetime2 = '2022-10-02T08:10:54Z';
declare @end_time_utc datetime2 = '2022-10-02T08:49:54Z';
declare @sql nvarchar(max);
declare @params nvarchar(max);
set @params = N'@start_time_utc datetime2, @end_time_utc datetime2';
set @sql = '
;with total_all_instances as (
select collection_time_utc, task_name, session_name, user_name
,[memory_mb] = sum(memory_kb)/1024
,[cpu_time_minutes] = sum(cpu_time_seconds)/60
from dbo.os_task_list otl
where otl.collection_time_utc between @start_time_utc and @end_time_utc
and otl.task_name not in (''System Idle Process'')
group by collection_time_utc, task_name, session_name, user_name
select top 10 task_name, session_name, user_name,
memory_mb_AVG = avg(memory_mb),
cpu_time_minutes_MAX = max(cpu_time_minutes)
group by task_name, session_name, user_name
order by cpu_time_minutes_MAX desc
exec sp_executesql @sql, @params, @start_time_utc, @end_time_utc;
Now that we know the application/login that is consuming high CPU on SQL Server host, we can reach out to appropriate teams to stop the application. Or if server stability is a high priority on SQL Server boxes, then you may even stop the process using PowerShell or manually using Task Manager.
In case I don’t have the SQL Server in my SQLMonitor tool, then I can utilize 2 other tools.
First, my SQL Server Live Health Check query. In 2nd resultset of this health check query, I get a CPU Utilization Trend for a default duration of the last 30 minutes. In this result, we can notice a difference in CPU utilization at OS vs SQL.
The second tool I use is sp_BlitzFirst. The result of sp_BlitzFirst query exposes a finding ‘High CPU Utilization – Not SQL‘ along with finding ‘High CPU Utilization‘.
These either of these query level tools, I have to RDP the SQL Server host to figure out the processes causing this CPU issue.
I hope this will be helpful to DBAs dealing with slow server due to CPU pressure caused by non-sql server processes. If you like the article, do like and share.