Performance Tuning With Predicate Logic
I have often seen developers write predicate logic without thinking about performance. That works most of the time. But what about queries that return huge result sets? Is there a general rule for writing predicates?
Let's start with a simple definition of predicate logic. A predicate is a property or an expression that either holds or does not hold; in other words, it evaluates to either true or false. In general terms, we should design our database so that every condition in the WHERE clause of a query evaluates to either TRUE or FALSE. So the question is: how can we do that? We can do that through solid database design with no nullable columns in it. But is it always possible to have this NOT NULL design for every column of every table in a database? Realistically, no.
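As a quick illustration of what such a design looks like (this table and its names are hypothetical, not part of the [StackOverflow] database), declaring columns NOT NULL and supplying a default for values that may be unknown keeps every predicate two-valued:

-- Hypothetical example: every column is declared NOT NULL,
-- so any predicate against these columns is either TRUE or FALSE, never UNKNOWN.
CREATE TABLE dbo.Customers
(
    Id          INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    DisplayName NVARCHAR(100)     NOT NULL,
    Location    NVARCHAR(100)     NOT NULL
        CONSTRAINT DF_Customers_Location DEFAULT (N'Unknown')
);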
Skipping the theory for now, let's start with the dbo.Users table of the [StackOverflow] database. This table has 4,491,134 records with a NULL value in the Location column. Below are the queries and the counts they return.
SELECT COUNT(*) AS Total_Counts
FROM dbo.Users
-- 5277831

SELECT COUNT(*) AS Total_Counts
FROM dbo.Users u
WHERE u.Location IS NULL
-- 4491134
Now let's execute the 2 queries below.
-- Query 01 - Direct Comparison of Location
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS

SELECT u.Id, u.DisplayName, u.Location
FROM dbo.Users AS u
WHERE u.Location = 'India'
-- Query 02 - Check for NOT NULL first
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS

SELECT u.Id, u.DisplayName, u.Location
FROM dbo.Users AS u
WHERE u.Location IS NOT NULL
  AND u.Location = 'India'
Now, let's check the stats for both queries.
Stats for Query 01 - Direct Comparison of Location
Stats for Query 02 - Check for NOT NULL first
We can see that although there is no difference in logical and physical reads between the 2 queries, the query with the NOT NULL predicate is still about 2 seconds faster than the other one, even for a small result set of 24,503 rows. So we can conclude that if a column is nullable and we add the predicate <column_name> IS NOT NULL, we can expect a small performance benefit from it.
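If you are unsure whether a column allows NULLs before adding such a predicate, the catalog views can tell you. A minimal sketch, assuming the [StackOverflow] database is the current database:

-- List the columns of dbo.Users and whether each one allows NULLs
SELECT COLUMN_NAME, IS_NULLABLE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'dbo'
  AND TABLE_NAME = 'Users'
ORDER BY ORDINAL_POSITION;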
Now, let’s analyze another scenario. Below are the 2 queries.
DECLARE @myNumber TINYINT = 0

-- Query 03: Zero value check at first
SELECT *
FROM (VALUES (12),(45),(NULL)) AS MyNumbers(mynumber)
WHERE @myNumber = 0
   OR MyNumbers.mynumber/@myNumber > 1;
DECLARE @myNumber TINYINT = 0

-- Query 04: Zero value check at last
SELECT *
FROM (VALUES (12),(45),(NULL)) AS MyNumbers(mynumber)
WHERE MyNumbers.mynumber/@myNumber > 1
   OR @myNumber = 0;
Query 03 executed successfully, returning 3 rows, whereas Query 04 failed with a divide-by-zero error.
Output of Query 03: Zero value check at first
Output of Query 04: Zero value check at last
We see that even though Query 03 and Query 04 have the same predicates in the WHERE clause, SQL Server executes the 2 queries differently because the predicates appear in a different order.
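One way to avoid relying on predicate order at all is to make the division itself safe. The rewrite below is a sketch of one possible approach (it is not from the original queries): NULLIF(@myNumber, 0) turns a zero divisor into NULL, so the division returns NULL instead of raising an error, and the query returns the same rows no matter which predicate SQL Server evaluates first.

DECLARE @myNumber TINYINT = 0

-- Query 05 (illustrative): safe regardless of predicate order,
-- because dividing by NULLIF(@myNumber, 0) yields NULL rather than an error.
SELECT *
FROM (VALUES (12),(45),(NULL)) AS MyNumbers(mynumber)
WHERE MyNumbers.mynumber / NULLIF(@myNumber, 0) > 1
   OR @myNumber = 0;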
Conclusion: SQL Server is smart enough to analyze your query and rearrange the predicate order in the WHERE clause in most cases. That is why our queries run efficiently most of the time. But it will not reorder predicates when doing so would produce a different result. So it is better for us as developers not to assume anything and to decide the predicate order ourselves. Happy Coding.
Please feel free to correct me in the comments. If you like the article, do Like & Share.