SQL Server Live Health Check

Spread the love

Working as a DBA, I often get pulled into issues where application teams complain about “database server is slow”.

This general statement “server is slow” requires an equally robust approach that can help us figure out if there is an issue with CPU, memory, io, or regressed query. At this point, I prefer to use my SQLMonitor dashboard. This tool is entirely free and open source, and can be deployed on SQL Server 2014+ irrespective of any environment or edition.

But there are times when I don’t have SQLMonitor deployed on a server for which I am being brought to look at. This is the point where I use my below general Server Health Check query https://ajaydwivedi.com/query/health mentioned below –

SQL Server Health Check Query Result 1

In the above script, lines 8-26 control the output of the health check query.

Typical result on a busy server –

ResultSet 1 ->

RunningQuery Memory-Status contains static & dynamic memory related information. Static metrics include domain, ip, server name, current time, cpu count & ram on server, Dynamic memory related information includes available memory on box, memory usage by SQL Server, Memory Grant Pending etc.

ResultSet 2 ->

This provides how CPU utilization looks on the server over time. By default, we look for the last 30 minutes of CPU trend which is controlled by the parameter @cpu_trend_minutes. Key point to note is the difference b/w OS CPU & SQL CPU. If there is a high gap, then non-SQL processes could be the reason for the high CPU utilization on the server.

ResultSet 3 ->

RunningQuery Schedulers provides how the distribution of CPU scheding looks like among various numa nodes. Uneven distribution of workers/tasks among various numa nodes indicates misconfiguration on server, or some bug.

ResultSet 4 ->

RunningQuery ALL-POOL/Requests resultset shows the distribution of cpu related metrics per program & login. If resource governor is configured with various resource pools, then it would segregate this info by pool. This resultset can be controlled to provide info only for a specific resource pool using parameter @pool_name. Output from this can indicate the top program/login causing HIGH CPU issues on box. For example, in the above screenshot, there are 100 active requests utilizing all 100% schedulers on my server which is the main cause of high CPU.

ResultSet 5 ->

RunningQuery Concurrent-Session-Queries result is useful when there is a high CPU issue on the server, and this might be happening to to one rougue application launch same concurrent query on multiple sessions. This output will inform you about the host_name, program, database, login having multiple sessions running similar query.

ResultSet 6 ->

RunningQuery Memory-Consumers (>=500.00MB) resultset is really useful in case the server is facing RESOURCE_SEMAPHORE waits, which means, there are some sessions having high memory grants, leading other sessions to suffer. Other sessions not even able to start due to memory not available inside sql server. This result has top memory consumers that needs to be killed in order to release server from memory pressure. The memory threshold can be controlled by parameter @granted_memory_threshold_mb.

ResultSet 7 ->

Shows the Blocking Tree including all the details of blocker & blocked sessions.

ResultSet 8 ->

RunningQuery Running over 10 minutes shows all requests/sessions that have been executing over @long_running_query_threshold_minutes. This result as usual includes all the session details.

ResultSet 9 ->

RunningQuery All-Sessions-SortedBy-cpu shows all the active requests/sessions sorted by CPU consumption. The sorting of this result again can be controlled by parameter @sort_order_all_sessions.

ResultSet 10 ->

RunningQuery 5-Seconds-IO-Stats shows IO Latency for database files having at least 20 MB of throughput. The snapshot duration for this result can be controlled by the parameter @io_snapshot_durations_seconds.

I hope this would be helpful to you DBAs out there.

Kindly like, subscribe & share if this alert helped you in any way.


Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.