Live Troubleshooting High CPU on SQL Server – Part 1

Spread the love

Very often, we DBAs inherit environments that were not maintained as per best practices by “Previous accidental DBA”. One such poor practice could be application teams having full access on SQL Server hosts. Similarly, there could be other applications running on the same SQL Server host apart from SQL Server.

Since SQL Server is not a cheap application and would cost a lot of money if we need to scale it horizontally. So it is not a good practice to allow any other resource-consuming processes to run alongside with SQL Server.

In this blog, we are going to troubleshoot a high CPU scenario when the CPU issue is caused by non-SQL Server process. Let’s say, I receive a call from the monitoring team where the application team is complaining about a “slow” SQL Server.

If my SQL Server is baselined with the SQLMonitor tool, then I check the Monitoring – Live – All Servers dashboard which displays all the metrics of specific SQL Servers that need DBA help.

SQLMonitor exposing all SQL Server Health Metrics

Here under the 3rd panel “All Servers - Health Metrics - Require ATTENTION“, I notice that there is more than 25% gap between OS CPU and SQL CPU utilization. This is an indicator that there is some process running at OS level consuming significant CPU and leading SQL Server process to starve of CPU.

From here, I open “Monitoring – Live – Distributed” dashboard which helps me check a lot more health metrics for a SQL Server instance. Here also, I see a big gap in CPU b/w OS & SQL.

SQLMonitor exposing High CPU consumers

In SQLMonitor, we capture OS processes that consume some significant bit of resources into table dbo.os_task_list using SQL Agent job ‘(dba) Collect-OSProcesses‘ . Here, we go to row “OS Tasks (Processes) – CPU Utilization“, and explore the panels under it.

SQLMonitor exposing High CPU consumers

In this “CPU Time – Histogram” panel, we can see that data capturing of every 2 minutes, and there are 2 non-SQL Server processes consuming about 4.84 (3.40+1.43) minutes of cpu time in an interval of 2 minutes. So now I know what processes are causing this high CPU issue. To dig deeper, I can query the table dbo.os_task_list for time interval of issue.

CPU Consumers on SQL Server host using SQLMonitor

Now that we know the application/login that is consuming high CPU on SQL Server host, we can reach out to appropriate teams to stop the application. Or if server stability is a high priority on SQL Server boxes, then you may even stop the process using PowerShell or manually using Task Manager.

In case I don’t have the SQL Server in my SQLMonitor tool, then I can utilize 2 other tools.

First, my SQL Server Live Health Check query. In 2nd resultset of this health check query, I get a CPU Utilization Trend for a default duration of the last 30 minutes. In this result, we can notice a difference in CPU utilization at OS vs SQL.

The second tool I use is sp_BlitzFirst. The result of sp_BlitzFirst query exposes a finding ‘High CPU Utilization – Not SQL‘ along with finding ‘High CPU Utilization‘.

sp_BlitzFirst exposing High CPU Issue

These either of these query level tools, I have to RDP the SQL Server host to figure out the processes causing this CPU issue.

I hope this will be helpful to DBAs dealing with slow server due to CPU pressure caused by non-sql server processes. If you like the article, do like and share.

3 Comments

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.