Ensure the CPU issue is not because of non-SQL Processes
If CPU is all being consumed by SQL Server process, then use tools like sp_BlitzFirst or open source SQLMonitor to validate any deviation in workload (batch requests/sec), or high compilation or recompilations (> 20% of batch requests), high forwarded fetches compared to previous times.
In rare cases, a very high-frequency non-optimized query can also cause high CPU issues. This happens when this high-frequency query has issues like string manipulations, implicit conversion, cursors, functions etc.
In the above dashboards, I could clearly notice PAGELATCH_** wait at the top along with SOS_SCHEDULER_YIELD. The presence of the above 2 waits is indicative of high CPU issues due to contention on the access of data file pages. I validated and found that this PAGELATCH_** wait is present almost all the time on the server. So decided to check the data of dbo.WhoIsActive that stores captured data of sp_WhoIsActive in SQLMonitor tool.
In the wait_info column of the above dbo.WhoIsActive result screenshot, I clearly noticed that due to sub-optimized queries creating/dropping temp tables in the tempdb database executing thousands of times per minute is leading to tempdb contention on the server.
From the result of the above query, found that the tempdb database had only a single data file. So to partially reduce the issue, I went ahead and added 7 additional tempdb files ensuring the initial size & auto growth are equal for all the data files of tempdb.
Conclusion:- The SQL Server was dealing with tempdb contention & presenting it in the form of high CPU. It was showing SOS_SCHEDULER_YIELD & PATCHLATCH_UP as the top waits. Any diagnostics tsql tool/query was getting stuck (not completing) if it was trying to drop/create #temptables. So only diagnostics queries that would avoid tempdb usage were able to help me figure out the issues. Adding additional tempdb files helped me get rid of the high CPU issue to large extent.
To reduce the CPU issue to even further, later I executed Pssdiag & SqlNexus, found the top 2 batches/queries by CPU usage, and optimized them by getting rid of parameter sniffing & Kitchen Sink design. Probably will write blogs on these issues in future. XEvent infra of SQLMonitor did not help in this cause as it capture queries only with a duration equal to or above 5 seconds, and would not help to catch high-frequency queries finishing within 5 seconds.
I hope this would be helpful to you DBAs out there.
Kindly like, subscribe & share if this alert helped you in any way.
I am Microsoft Certified Professional having 10+ years of experience in SQL Server Querying, Database Design, and Administration. I am fond of Query Tuning and like to automate things using TSQL & PowerShell. I also have experience of implementing end-to-end Data Warehouse solution, and Data Migration using ETL tools SQL Server Integration Services (SSIS), Pentaho Business Analytics, and have designed Database Inventory through PowerShell, Python, and Django etc.